A browser-based interface showing a Three.js 3D camera orbiting a flat image plane on the left, with a generated re-angled version of the same subject on the right.
Upload a photo of anything. Orbit a little Three.js camera around it. Click generate. Out comes the same subject, from your new angle, hallucinated into plausibility by a diffusion model that has no right being this coherent.
That's Qwen Image Multiple Angles 3D Camera, a Hugging Face Space built by Apolinário, a machine learning engineer at Hugging Face with a habit of turning freshly released models into polished, playable demos before most people have finished reading the model card.

The interactive 3D camera control is a self-contained WebGL scene: a Three.js viewport embedded inside Gradio, complete with draggable orbit handles for azimuth, elevation, and distance. You manipulate the virtual camera, and the widget snaps your input to one of 96 discrete poses before passing a structured prompt to the model. It's a small, elegant bit of real-time 3D doing a job that would otherwise fall to three confusing sliders.

The underlying model is Qwen-Image-Edit-2511, running with two stacked LoRAs: a Lightning adapter for fast 4-step inference, and the Multiple Angles LoRA built by Lovis Odin, a creative engineer at fal.ai and Gobelins graduate with a long history of WebGL work. Lovis trained the LoRA on 3,000+ Gaussian Splatting renders covering 4 elevations, 8 azimuths, and 3 distances: 4 × 8 × 3 = 96 poses. The camera prompt format is terse and structured: <sks> front-right quarter view low-angle shot close-up. That specificity, trained against 3D-consistent synthetic data, is what keeps the outputs from collapsing into vague artistic reinterpretation.
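The snap-and-prompt step can be sketched in a few lines of Python. Everything below is illustrative rather than lifted from app.py: the function names are hypothetical, and of the label vocabulary only "front-right quarter view", "low-angle shot", and "close-up" are confirmed by the example prompt above — the real 8 × 4 × 3 label set is on the LoRA model card.

```python
# Hypothetical label tables. Only "front-right quarter view",
# "low-angle shot", and "close-up" are confirmed labels; the rest
# are guesses standing in for the model card's real vocabulary.
AZIMUTHS = {  # degrees -> label (8 azimuths)
    0: "front view", 45: "front-right quarter view", 90: "right side view",
    135: "back-right quarter view", 180: "back view",
    225: "back-left quarter view", 270: "left side view",
    315: "front-left quarter view",
}
ELEVATIONS = {  # degrees -> label (4 elevations)
    -30: "low-angle shot", 0: "eye-level shot",
    30: "high-angle shot", 60: "overhead shot",
}
DISTANCES = {  # camera radius -> label (3 distances)
    1.0: "close-up", 2.0: "medium shot", 3.5: "wide shot",
}

def snap(value, table):
    """Return the label whose key is nearest to the input value."""
    return table[min(table, key=lambda k: abs(k - value))]

def build_prompt(azimuth_deg, elevation_deg, distance):
    """Snap a free camera pose to the 96-pose grid and emit the prompt."""
    azimuth_deg %= 360
    # Azimuth is circular: 359 degrees should snap to 0, not 315.
    az = min(AZIMUTHS,
             key=lambda k: min(abs(k - azimuth_deg), 360 - abs(k - azimuth_deg)))
    return "<sks> {} {} {}".format(
        AZIMUTHS[az],
        snap(elevation_deg, ELEVATIONS),
        snap(distance, DISTANCES),
    )

print(build_prompt(40, -25, 1.1))
# -> <sks> front-right quarter view low-angle shot close-up
```

The interesting design choice is the snapping itself: by quantizing the continuous orbit to the exact 96 poses the LoRA was trained on, the widget never asks the model for a camera angle it has never seen.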
The whole Space is a single app.py. Go read it. The Three.js scene setup, the snap-to-nearest logic, the prompt builder, the dual LoRA loading: it's all right there, cleanly organized and short enough to absorb in one sitting. If you want to push the camera system further, the LoRA model card lists all 96 prompt combinations. Try the low-angle shots; that's where the training data from Gaussian Splatting pays off most visibly. And follow both Apolinário and Lovis: they ship faster than you can bookmark.
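If you'd rather script against the full pose grid than click through it, enumerating all 96 combinations is a short exercise. The label lists here are stand-ins (the authoritative vocabulary is on the LoRA model card); only the structure — 8 azimuths × 4 elevations × 3 distances — comes from the training setup described above.

```python
from itertools import product

# Stand-in vocabularies; the real labels are on the LoRA model card.
azimuths = ["front view", "front-right quarter view", "right side view",
            "back-right quarter view", "back view", "back-left quarter view",
            "left side view", "front-left quarter view"]
elevations = ["low-angle shot", "eye-level shot",
              "high-angle shot", "overhead shot"]
distances = ["close-up", "medium shot", "wide shot"]

# Cartesian product of the three axes: 8 * 4 * 3 = 96 prompts.
prompts = ["<sks> {} {} {}".format(a, e, d)
           for a, e, d in product(azimuths, elevations, distances)]
print(len(prompts))  # 96
```

Sweeping a single subject through all 96 prompts is also a quick way to see where the LoRA is strongest and where the synthetic training data thins out.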