The problem: current code models for Qiskit only process text, ignoring visual representations such as circuit diagrams, Bloch spheres, and histograms.
What I built:
- A synthetic data generation pipeline that extracts content from Qiskit documentation, papers, and code; transcribes images via a VLM; generates validated input/output pairs; and validates all code through automated unit tests
- The first public multimodal dataset for quantum computing: 8,366 samples (45% with images) across function completion, code generation, and Q&A tasks
- Fine-tuned Qwen3-VL-8B using LoRA (rsLoRA, r=32), achieving +11pp on Qiskit HumanEval (32.45% → 43.71%) and +17.9pp on multimodal samples vs. text-only
- An interactive demo with a chat interface and code challenges
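The unit-test validation step in the pipeline above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline code: the function name `validate_sample` and the two-string interface (generated code plus its unit test) are assumptions, and a real pipeline would add sandboxing and timeouts.

```python
def validate_sample(code: str, unit_test: str) -> bool:
    """Run a generated code sample and then its unit test in a shared
    namespace; the sample passes iff neither raises an exception.
    (Hypothetical sketch of the pipeline's validation step.)"""
    namespace: dict = {}
    try:
        exec(code, namespace)       # define functions/variables from the sample
        exec(unit_test, namespace)  # assertions exercising those definitions
        return True
    except Exception:
        return False

# Example: a correct sample passes, a buggy one is filtered out.
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
test = "assert add(2, 3) == 5"
```

Filtering on execution like this is a common way to keep only verifiably correct synthetic samples before fine-tuning.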
Results: The model achieves 63.39% Pass@1 on visual samples; it learned to extract circuit topology from diagrams and infer parameters from visual annotations.
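For context on the Pass@1 numbers reported above, here is the standard unbiased pass@k estimator from the HumanEval line of work (Chen et al., 2021), which reduces to the fraction of first-sample successes at k=1. The function name is my own; the post does not show its evaluation code.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k: the probability that at least one of
    k samples drawn (without replacement) from n generations is correct,
    given that c of the n generations pass their unit tests."""
    if n - c < k:
        # Fewer failures than draws: at least one success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this is simply c/n, e.g. 5 passing out of 10 gives 0.5.
```

At k=1 the estimator collapses to the success rate of a single sample per problem, which is what the 63.39% figure measures.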