What is OpenDirector?

OpenDirector is an open-source AI video production platform in which 9 specialized AI agents collaborate to produce a complete video from a single-sentence prompt. Built on LangGraph for orchestration, Next.js for the frontend, and FFCreator for rendering, it automates the entire pipeline, from scriptwriting and storyboarding to voiceover, background music, and the final video render. It supports both an interactive creative mode and automated batch production.

How does OpenDirector work?

OpenDirector uses a 9-agent pipeline orchestrated by LangGraph. The Research Agent gathers context, the Script Agent writes the narrative, the Art Style Agent selects from 34 visual styles, the Storyboard Agent plans shots, Character and Location Agents design visuals, the Voice Agent generates voiceovers, the BGM Agent composes music, and the Media Agent renders the final video. Each agent communicates through a state graph, enabling real-time collaboration and manual adjustments at any stage.
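The hand-off between agents can be pictured with a small plain-Python sketch. It is not LangGraph's API and not OpenDirector's actual code: the stage names, the `make_agent` stub, the `run_pipeline` function, and the `overrides` parameter are all hypothetical, chosen only to illustrate a shared state passing through nine stages with room for a manual adjustment after any of them.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Shared state handed from one agent to the next (hypothetical shape).
State = Dict[str, Any]
Agent = Callable[[State], State]

# Stage order as described in the text above; the identifiers are illustrative.
STAGES: List[str] = [
    "research", "script", "art_style", "storyboard",
    "character", "location", "voice", "bgm", "media",
]


def make_agent(name: str) -> Agent:
    """Return a stub agent that only records that it ran."""
    def run(state: State) -> State:
        completed = list(state.get("completed", []))
        completed.append(name)
        return {**state, "completed": completed}
    return run


PIPELINE: List[Tuple[str, Agent]] = [(name, make_agent(name)) for name in STAGES]


def run_pipeline(prompt: str,
                 overrides: Optional[Dict[str, State]] = None) -> State:
    """Run every stage in order, merging manual overrides after a stage."""
    overrides = overrides or {}
    state: State = {"prompt": prompt, "completed": []}
    for name, agent in PIPELINE:
        state = agent(state)
        # A manual adjustment for this stage, if supplied, wins over agent output.
        state = {**state, **overrides.get(name, {})}
    return state


if __name__ == "__main__":
    result = run_pipeline(
        "A fox explores a city at night",
        overrides={"art_style": {"style": "watercolor"}},
    )
    print(result["completed"])
    print(result["style"])
```

In a real state graph the stages would not need to be strictly linear; the list here is only the simplest shape that matches the order given above.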

Is OpenDirector free?

Yes, OpenDirector is completely free and open-source under the MIT license. You can use it for personal and commercial projects. The only costs are for the AI API keys you configure (such as OpenAI, Anthropic, or other LLM providers) and any TTS or image generation services you connect.

How do I deploy OpenDirector?

OpenDirector is Docker-first. Clone the GitHub repository, configure your API credentials in the .env file, then run docker compose up to boot MySQL, Redis, MinIO, and the app. Open your browser to start creating. The entire setup takes about 5 minutes on any machine with Docker installed.

What video formats and aspect ratios does OpenDirector support?

OpenDirector supports three aspect ratios: 16:9 cinematic widescreen, 9:16 vertical portrait, and 1:1 square format. Videos can be exported at up to 1080p Full HD resolution. The platform includes 34 built-in art styles across 9 categories, from cinematic and anime to watercolor and neon noir.
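As a rough illustration of how the three aspect ratios map to pixel dimensions, the sketch below assumes that "1080p" means the shorter side of the frame is 1080 pixels for every ratio. That interpretation, and the names `ASPECT_RATIOS` and `frame_size`, are assumptions made for this example rather than documented OpenDirector behavior.

```python
from typing import Dict, Tuple

# The three aspect ratios named above, as (width units, height units).
ASPECT_RATIOS: Dict[str, Tuple[int, int]] = {
    "16:9": (16, 9),   # cinematic widescreen
    "9:16": (9, 16),   # vertical portrait
    "1:1": (1, 1),     # square
}


def frame_size(ratio: str, short_side: int = 1080) -> Tuple[int, int]:
    """Return (width, height) so that the shorter side equals short_side."""
    w, h = ASPECT_RATIOS[ratio]
    unit = min(w, h)
    # Integer arithmetic keeps the result exact for these ratios.
    return short_side * w // unit, short_side * h // unit


if __name__ == "__main__":
    for name in ASPECT_RATIOS:
        print(name, frame_size(name))
```

Under that assumption, 16:9 gives 1920 x 1080, 9:16 gives 1080 x 1920, and 1:1 gives 1080 x 1080.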

Is my data private with OpenDirector?

Yes. Because OpenDirector is self-hosted via Docker, all your scripts, voiceovers, character designs, and rendered videos are stored on your own machine. The only data that leaves it is what is sent to the AI APIs you explicitly configure for generation tasks.

Step 7 of 9: TTS Orchestrator

Voice Agent

Generates expressive, multi-character voiceovers with fine-tuned gender, accent, and emotional matching.

Overview

The Voice Agent takes the voiceover script and character voice profiles to generate high-quality text-to-speech audio. Using the local Edge TTS engine, it produces voiceovers with appropriate emotional tone, pacing, and emphasis for each character. It handles multi-character dialogues by switching between voice profiles, and adjusts speech rate and pitch to match the scene's mood: slower and lower for dramatic moments, faster and higher for exciting sequences.

The agent supports multiple TTS providers: Edge TTS (free, local, 300+ voices across 70+ languages), OpenAI TTS (premium quality), and custom voice cloning. It processes each scene's narration independently, applies SSML markup for fine-grained prosody control, and outputs audio files synchronized to the storyboard timing. Audio normalization ensures consistent volume levels across all scenes and characters.

The agent also handles pronunciation corrections for proper nouns and technical terms, manages breathing pauses for natural speech rhythm, and generates separate audio tracks for narration, dialogue, and sound effects, giving the Media Agent full control over audio mixing and spatial positioning in the final video.
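The mood-based rate and pitch adjustment described above could be expressed in SSML roughly as follows. This is a minimal sketch using only the standard `speak`, `prosody`, and `break` elements. The preset values, the `MOOD_PROSODY` table, and the `to_ssml` function are hypothetical and are not taken from OpenDirector. Support for relative percentage values also varies between TTS engines, so treat the attribute values as placeholders.

```python
from xml.sax.saxutils import escape

# Hypothetical mood presets: slower and lower for drama, faster and higher
# for excitement. The numbers are illustrative only.
MOOD_PROSODY = {
    "dramatic": {"rate": "-15%", "pitch": "-10%"},
    "neutral": {"rate": "+0%", "pitch": "+0%"},
    "exciting": {"rate": "+15%", "pitch": "+10%"},
}


def to_ssml(text: str, mood: str = "neutral", pause_ms: int = 0,
            lang: str = "en-US") -> str:
    """Wrap one line of narration in SSML with mood-dependent prosody."""
    preset = MOOD_PROSODY[mood]
    rate = preset["rate"]
    pitch = preset["pitch"]
    # Escape &, < and > so the narration cannot break the markup.
    body = escape(text)
    if pause_ms > 0:
        # Optional trailing pause, e.g. a breath between sentences.
        body += f'<break time="{pause_ms}ms"/>'
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<prosody rate="{rate}" pitch="{pitch}">{body}</prosody>'
        "</speak>"
    )


if __name__ == "__main__":
    print(to_ssml("The door creaked open.", mood="dramatic", pause_ms=300))
```

Keeping the presets in a table makes it easy to tune one mood without touching the markup code, and each scene's line can be converted independently before it is handed to whichever TTS provider is configured.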

Input

Voiceover script, character voice profiles

Output

Audio files per scene with character-specific voices

Tools

Edge TTS engine, audio processing
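One simple form the audio processing step could take is peak normalization, which scales each clip so that its loudest sample reaches the same level. The sketch below works on plain lists of floating-point samples in the range -1.0 to 1.0. The function name `peak_normalize` and the default target are assumptions for illustration, and the real agent may use a different method, such as loudness-based normalization.

```python
from typing import List


def peak_normalize(samples: List[float], target_peak: float = 0.9) -> List[float]:
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence (or an empty clip) has nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet_clip = [0.05, -0.10, 0.08]
    loud_clip = [0.60, -0.95, 0.40]
    for clip in (quiet_clip, loud_clip):
        normalized = peak_normalize(clip)
        print(max(abs(s) for s in normalized))
```

After this step both clips peak at the same level, which is one way to keep volume consistent across scenes and characters.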

Related Agents

Step 6—Set Designer

Location Agent

Designs environment key art and background plates, ensuring consistency for all actions in a scene.

Step 8—Sound Designer

BGM Agent

Analyzes the script's emotional curve to compose or select matching background soundtracks and audio transitions.
