
Camera-Controlled AI Image Editing with Qwen Image Edit (FastAPI + Web UI)

January 16, 2026 · 3 min read

Text-only prompts are often insufficient for precise image editing when camera perspective matters. To solve this, I built a camera-controlled AI image editing system using Qwen Image Edit 2511 with a Multiple-Angles LoRA adapter, backed by a FastAPI inference server and a lightweight browser UI.

The system lets users upload a reference image, select the camera angle, lighting, and shot type, and have a diffusion-optimized prompt generated automatically, with everything running locally.

Why Camera Control Matters in AI Image Editing

  • Text prompts alone are ambiguous for viewpoint changes

  • Camera angle consistency preserves subject identity

  • LoRA-based camera control improves edit accuracy

  • Local inference ensures privacy and predictability

System Architecture

The project is split into two independent repositories:

  • Frontend UI – Camera selection + prompt generation

  • Backend API – Qwen Image Edit inference server

Browser UI (HTML/JS)
   ↓
Prompt Generator
   ↓
FastAPI Backend
   ↓
Qwen Image Edit 2511 + LoRA
   ↓
Edited Image Output

[Figure: Camera control UI screenshot]

Frontend: Camera Prompt Generator (Browser-Based)

The frontend is intentionally simple: no framework and no cloud dependencies. Its only job is to turn the selected camera options into a diffusion-safe prompt.

Camera Prompt Builder (JavaScript)


function generatePrompt() {
  // Read the current selection from each camera control.
  const angle = document.getElementById('cameraAngle').value;
  const height = document.getElementById('cameraHeight').value;
  const shot = document.getElementById('shotType').value;
  const lighting = document.getElementById('lighting').value;

  // The fixed suffix asks the model to keep the subject and perspective consistent.
  return `${angle}, ${height}, ${shot}, ${lighting}, realistic perspective, same subject, consistent identity`;
}
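
The same prompt assembly can be mirrored in Python, for example to build or test prompts without the browser. This is a minimal sketch, not the project's code: the function name `build_camera_prompt` and the rule of skipping empty selections are assumptions, and only the fixed suffix comes from the JavaScript builder above.

```python
from typing import Iterable

# Fixed suffix taken from the JavaScript builder above.
IDENTITY_SUFFIX = ("realistic perspective", "same subject", "consistent identity")


def build_camera_prompt(angle: str, height: str, shot: str, lighting: str) -> str:
    """Join the selected camera options into one comma-separated prompt."""
    selections: Iterable[str] = (angle, height, shot, lighting)
    # Assumption: unset dropdowns are skipped so the prompt has no empty ", ," segments.
    parts = [value.strip() for value in selections if value and value.strip()]
    return ", ".join(parts + list(IDENTITY_SUFFIX))


if __name__ == "__main__":
    # The option strings are illustrative, not the UI's real dropdown values.
    print(build_camera_prompt("front view", "eye level", "", "soft daylight"))
```

Skipping empty values is a design choice for this sketch. The JavaScript version interpolates every field as-is, so an unset control would leave a stray comma in the prompt.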

Sending Image + Prompt to Backend


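// Send the prompt and the base64-encoded reference image to the local backend.
const response = await fetch('http://localhost:8000/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: generatePrompt(),
    reference_image: uploadedImageBase64,
    guidance_scale: 1.0,
    num_inference_steps: 4,
    height: 768,
    width: 768
  })
});

// fetch() only rejects on network failure, so check the HTTP status explicitly.
if (!response.ok) {
  throw new Error(`Generation failed with HTTP ${response.status}`);
}

const data = await response.json();
displayOutputImage(data.image);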

[Figure: Generated image]
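
For scripted or batch use, the same request can be issued without the browser. The sketch below uses only the Python standard library and mirrors the JSON body shown above. The helper names `build_generate_payload` and `request_edit` are hypothetical, and it assumes the backend expects plain base64 text (no `data:` URL prefix) in `reference_image`.

```python
import base64
import json
import urllib.request


def build_generate_payload(image_bytes: bytes, prompt: str) -> dict:
    """Mirror the JSON body that the browser sends to /generate."""
    return {
        "prompt": prompt,
        # The image travels as base64 text because JSON cannot carry raw bytes.
        "reference_image": base64.b64encode(image_bytes).decode("ascii"),
        "guidance_scale": 1.0,
        "num_inference_steps": 4,
        "height": 768,
        "width": 768,
    }


def request_edit(payload: dict, url: str = "http://localhost:8000/generate") -> dict:
    """POST the payload and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `request_edit(build_generate_payload(open("input.png", "rb").read(), prompt))` would return the same `image` and `seed` fields the browser receives. The file name here is only an example.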

Backend: FastAPI + Qwen Image Edit

The backend is a FastAPI server tuned for GPUs with 8 GB of VRAM. It loads Qwen Image Edit once and reuses the same pipeline object across requests.

FastAPI Entry Point


from fastapi import FastAPI
from app.schemas import GenerateRequest
from app.inference import generate_image

app = FastAPI()

# Declared with plain `def` so FastAPI runs the blocking GPU call in its
# thread pool instead of stalling the event loop.
@app.post("/generate")
def generate(req: GenerateRequest):
    image, seed = generate_image(req)
    return {
        "success": True,
        "image": image,
        "seed": seed
    }
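
The contents of `app.schemas` are not shown here. As a rough sketch, the request model needs to carry the six fields that the frontend sends. The version below uses a standard-library dataclass so it runs without extra dependencies. In a FastAPI project this would more typically be a Pydantic model, and the validation rules and defaults are assumptions based on the values used in this article.

```python
from dataclasses import dataclass

MAX_SIDE = 768  # matches the 768x768 limit mentioned in this article


@dataclass
class GenerateRequest:
    """Assumed shape of the /generate request body (illustrative only)."""

    prompt: str
    reference_image: str  # base64-encoded image data
    guidance_scale: float = 1.0
    num_inference_steps: int = 4
    height: int = 768
    width: int = 768

    def __post_init__(self) -> None:
        # Reject obviously unusable input before it reaches the GPU.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not self.reference_image:
            raise ValueError("reference_image must not be empty")
        if self.num_inference_steps < 1:
            raise ValueError("num_inference_steps must be at least 1")
        for name in ("height", "width"):
            side = getattr(self, name)
            if not 1 <= side <= MAX_SIDE:
                raise ValueError(f"{name} must be between 1 and {MAX_SIDE}")
```

Validating sizes and step counts at the edge keeps a single oversized request from exhausting VRAM on a small GPU.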

Inference Pipeline (Qwen Image Edit)


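import torch
from diffusers import QwenImageEditPipeline

# Load the model once; MODEL_ID is defined elsewhere in the module.
pipe = QwenImageEditPipeline.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16
).to("cuda")

# Trade some speed for lower peak VRAM usage.
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()

# prompt, reference_image, steps, and guidance come from the incoming request.
result = pipe(
    prompt=prompt,
    image=reference_image,
    num_inference_steps=steps,
    guidance_scale=guidance
)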
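
The pipeline call works on image objects, while the API exchanges images as text inside JSON, so a conversion step is needed on both sides of it. The project's own conversion code is not shown. The sketch below is one possible standard-library approach, and the helper names and the handling of a `data:` URL prefix are assumptions.

```python
import base64
import binascii


def decode_reference_image(encoded: str) -> bytes:
    """Turn the request's base64 string into raw image bytes."""
    # Browsers often produce "data:image/png;base64,<data>"; keep only <data>.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    try:
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("reference_image is not valid base64") from exc


def encode_output_image(image_bytes: bytes) -> str:
    """Encode output bytes as ASCII text so they fit in a JSON response."""
    return base64.b64encode(image_bytes).decode("ascii")
```

The decoded bytes would then be opened with an imaging library before being passed to the pipeline, and the generated image would be serialized to bytes (for example as PNG) before `encode_output_image` is applied.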

Health Check Endpoint


@app.get("/health")
def health():
    return {
        "status": "ok",
        "gpu_available": torch.cuda.is_available(),
        "model_loaded": pipe is not None
    }

[Figure: FastAPI Swagger UI screenshot]

Memory & Performance Optimization

  • FP16 inference

  • Attention slicing

  • VAE slicing

  • 768×768 resolution limit

  • Stateless request handling

These optimizations allow stable inference on GPUs such as the RTX 3060 or the RTX 4070 Laptop.
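
The 768×768 limit from the list above can be enforced with a small helper that shrinks oversized requests rather than rejecting them. This is a sketch under assumptions: the function name is hypothetical, and snapping each side to a multiple of 16 is a common convention for diffusion models that has not been confirmed for this one.

```python
MAX_SIDE = 768  # resolution cap from the list above
MULTIPLE = 16   # assumed alignment; check the model's actual requirement


def clamp_dimensions(width: int, height: int) -> tuple[int, int]:
    """Shrink (width, height) to fit MAX_SIDE, roughly keeping the aspect ratio."""
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > MAX_SIDE:
        # Integer math avoids floating-point rounding surprises.
        width = width * MAX_SIDE // longest
        height = height * MAX_SIDE // longest

    def snap(side: int) -> int:
        # Round down to the alignment, but never below one aligned block.
        return max(MULTIPLE, side // MULTIPLE * MULTIPLE)

    return snap(width), snap(height)
```

For example, a 1920×1080 request would be reduced to 768×432, which keeps the 16:9 framing while staying inside the memory budget.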

Privacy-First by Design

Unlike cloud-based AI tools:

  • No image uploads to third-party servers

  • No prompt logging

  • No telemetry

  • Full local execution

This makes the system suitable for internal tools, R&D, and sensitive workflows.

Open Source Repositories

Final Thoughts

Camera-aware prompt engineering is the next step in controllable AI image editing. By separating UI, prompt logic, and inference, this architecture remains scalable, privacy-friendly, and production-ready.

If you’re exploring advanced diffusion workflows, Qwen Image Edit with camera control offers an excellent balance of power and efficiency.

