Media Agent
Synthesizes images, voices, and audio assets, feeding them to the render worker to compile the final video.
Overview
The Media Agent is the final stage of the pipeline. It collects character images, location backgrounds, voiceover audio, background music, and sound effects from upstream agents, then renders them into a single video using FFCreator and FFmpeg. During rendering it composites characters onto scene backgrounds with layering and parallax depth, overlays subtitles with configurable fonts and positioning, synchronizes the audio tracks, and encodes the video at the target aspect ratio (16:9, 9:16, or 1:1). Output resolution goes up to 1080p, and the frame rate is configurable. Before finishing, the agent runs quality checks: audio levels are normalized, transitions between scenes are smooth, and subtitle timing aligns with the voiceover. The result is a fully rendered, publication-ready video file.
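The quality checks on audio levels and subtitle timing could look roughly like the sketch below. This is an illustration only, not the agent's actual implementation: the names `Cue`, `peak_dbfs`, and `misaligned_cues`, the 0.15 second tolerance, and the assumption that each subtitle pairs one-to-one with a voiceover clip are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Cue:
    """A timed span in seconds (a subtitle line or a voiceover clip)."""
    start: float
    end: float


def peak_dbfs(samples: Sequence[float]) -> float:
    """Peak level in dBFS for samples normalized to the range [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    # Silence has no finite level, so report negative infinity.
    return -math.inf if peak == 0.0 else 20.0 * math.log10(peak)


def misaligned_cues(subtitles: List[Cue], voice: List[Cue],
                    tolerance: float = 0.15) -> List[int]:
    """Indexes of subtitles whose start or end drifts from the paired
    voiceover clip by more than `tolerance` seconds (assumed threshold)."""
    if len(subtitles) != len(voice):
        raise ValueError("expected one subtitle per voiceover clip")
    return [
        i for i, (s, v) in enumerate(zip(subtitles, voice))
        if abs(s.start - v.start) > tolerance or abs(s.end - v.end) > tolerance
    ]
```

A render could be flagged for review when `misaligned_cues` returns a non-empty list or when `peak_dbfs` for a track falls outside whatever level range the pipeline targets.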
Input
All generated assets (images, audio, subtitles)
Output
Final rendered video file (up to 1080p)
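To show how the three supported aspect ratios relate to the 1080p ceiling, the sketch below derives frame dimensions. It assumes that "1080p" means the shorter side of the frame is at most 1080 pixels, which this description does not state. `SUPPORTED_RATIOS` and `target_dimensions` are hypothetical names, not part of the agent.

```python
from fractions import Fraction
from typing import Tuple

# Aspect ratios named in the overview, as width / height.
SUPPORTED_RATIOS = {
    "16:9": Fraction(16, 9),
    "9:16": Fraction(9, 16),
    "1:1": Fraction(1, 1),
}

MAX_SHORT_SIDE = 1080  # assumed meaning of "up to 1080p"


def target_dimensions(aspect_ratio: str, short_side: int = 1080) -> Tuple[int, int]:
    """Return (width, height) for a supported aspect ratio.

    `short_side` is the length in pixels of the shorter frame edge.
    """
    if aspect_ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if not 0 < short_side <= MAX_SHORT_SIDE:
        raise ValueError("short_side must be between 1 and 1080")

    ratio = SUPPORTED_RATIOS[aspect_ratio]
    if ratio >= 1:
        # Landscape or square: height is the shorter edge.
        width, height = round(short_side * ratio), short_side
    else:
        # Portrait: width is the shorter edge.
        width, height = short_side, round(short_side / ratio)

    # Round odd values up to even, since common H.264 pixel formats
    # such as yuv420p require even dimensions.
    width += width % 2
    height += height % 2
    return width, height
```

Under that assumption, `target_dimensions("9:16")` gives a 1080 by 1920 portrait frame, and `target_dimensions("16:9", 720)` gives 1280 by 720.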
Tools
FFCreator renderer, FFmpeg, image composition engine