The Story Video feature processes scripts through several sequential stages. First, the AI storyboard generator breaks the script into scenes and generates visual descriptions. Then the system maintains character consistency across frames by tracking visual attributes and feeding them forward through the generation pipeline. The voiceover component synthesizes audio from the script text, synchronizing it with the generated visuals. Users get production control at each stage rather than receiving a single black-box output.
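The stage that carries character attributes forward can be sketched roughly as below. This is a minimal illustration, not the platform's actual implementation: the `Scene` schema, the paragraph-based scene splitting, and the attribute dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    description: str
    # Tracked character attributes, fed forward so later scenes stay consistent
    # (hypothetical schema for illustration)
    character_attrs: dict = field(default_factory=dict)

def storyboard(script: str) -> list[Scene]:
    """Stand-in for the AI storyboard generator: split a script into scenes."""
    return [Scene(description=chunk.strip())
            for chunk in script.split("\n\n") if chunk.strip()]

def propagate_attributes(scenes: list[Scene], attrs: dict) -> list[Scene]:
    """Feed the same tracked attributes into every scene's generation input."""
    for scene in scenes:
        scene.character_attrs = dict(attrs)
    return scenes

script = "A knight rides at dawn.\n\nThe knight reaches the castle."
scenes = propagate_attributes(storyboard(script),
                              {"armor": "silver", "hair": "red"})
```

The point of the sketch is the feed-forward step: every scene receives the same attribute record, which is what keeps a character's look stable from frame to frame.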
Vibe Video takes a different approach. It ingests static images and applies motion generation models to create short-form content with music and effects layered on top. The technical goal is producing social media clips optimized for platforms that prioritize movement and audio.
The Workflow Canvas operates as a node-based editor where users visually construct AI pipelines. You connect image nodes to video nodes to text nodes, creating custom processing chains. Each node represents a specific model operation. The system executes the graph in order, passing outputs from one node as inputs to the next. Users can save these workflow configurations and share them, which amounts to sharing the pipeline architecture itself. Undo and redo support suggests the system maintains a state history of node connections and parameters.
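A node graph with chained execution and snapshot-based undo could look like the sketch below. The class names, the snapshot approach to history, and the string-transforming node operations are illustrative assumptions; the actual canvas may store state differently.

```python
class Node:
    """One model operation in the pipeline (op is a placeholder callable)."""
    def __init__(self, name, op):
        self.name, self.op = name, op

class Workflow:
    """Minimal node-graph executor with snapshot-based undo/redo."""
    def __init__(self):
        self.nodes = {}        # name -> Node
        self.edges = {}        # name -> list of downstream node names
        self.history = []      # edge snapshots for undo
        self.redo_stack = []   # edge snapshots for redo

    def add_node(self, node):
        self.nodes[node.name] = node
        self.edges.setdefault(node.name, [])

    def connect(self, src, dst):
        # Snapshot current connections so the change can be undone
        self.history.append({k: list(v) for k, v in self.edges.items()})
        self.redo_stack.clear()
        self.edges[src].append(dst)

    def undo(self):
        if self.history:
            self.redo_stack.append(self.edges)
            self.edges = self.history.pop()

    def redo(self):
        if self.redo_stack:
            self.history.append(self.edges)
            self.edges = self.redo_stack.pop()

    def run(self, name, value):
        """Execute a node, then pass its output to each downstream node."""
        out = self.nodes[name].op(value)
        for nxt in self.edges[name]:
            out = self.run(nxt, out)
        return out
```

Connecting an image node to a video node and calling `run` from the image node threads the prompt through both operations in order; saving and sharing a workflow amounts to serializing `nodes` and `edges`.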
Beyond video, the system includes standalone generators. The Image Generator creates visuals from text prompts using models like Qwen Image 2.0. The Audio Generator produces music and sound effects from descriptions. The ASMR Generator processes existing videos to extract or enhance specific audio frequencies associated with ASMR content. These tools feed into the broader video creation pipeline or work independently.
Technical output varies considerably. Results depend entirely on which underlying model the system routes to. Some models excel at photorealism while others handle animation better. Character consistency works through facial recognition and attribute tracking, but maintaining perfect consistency across long sequences remains technically challenging for most generative models. The quality ceiling is determined by the capabilities of the specific model being used at that moment.
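The routing behavior described above can be pictured as a simple style-to-model lookup. The model names and the fallback rule here are invented for illustration; nothing about the platform's actual routing logic is documented.

```python
# Hypothetical routing table: requested style -> model identifier
MODELS = {
    "photorealistic": "realism-model-v2",
    "animation": "anim-model-v1",
}

def route(style: str) -> str:
    """Pick the model suited to the requested style, with a generic fallback."""
    return MODELS.get(style, "general-model")
```

Under a scheme like this, the quality ceiling for any single request is simply whatever the selected entry can deliver, which is why results vary by style.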
The system uses a credit system where different operations consume different amounts. A single video generation costs roughly 20 credits based on the provided numbers, and more complex operations or higher-quality model selections likely consume more credits per generation. The free plan offers 60 lifetime credits, which translates to roughly 3 video generations in total. That is extremely limited for testing workflows.
Paid plans start at $20 monthly for 1000 credits, approximately 50 video generations. The $40 monthly Pro plan provides 2800 credits for around 140 generations. Additional credits cost $10 for 500 units. Yearly plans include 2 months of free credits. Paid tiers remove watermarks and provide priority support plus early access to new model integrations.
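The per-plan arithmetic above follows directly from the stated 20-credit cost per video:

```python
COST_PER_VIDEO = 20  # credits per generation, per the stated pricing

def generations(credits: int, cost: int = COST_PER_VIDEO) -> int:
    """Whole video generations a credit balance covers."""
    return credits // cost

plan_credits = {"free": 60, "starter": 1000, "pro": 2800}
per_plan = {name: generations(c) for name, c in plan_credits.items()}
# free -> 3, starter -> 50, pro -> 140
```

The $10 top-up of 500 credits works out to the same $0.02 per credit as the $20 starter plan, so topping up carries no per-credit premium.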
The free tier's 60 lifetime credits create a major constraint for any serious evaluation. Three video generations isn't enough to test character consistency across multiple scenes or explore different model options. Standard support on the free plan means slower response times when troubleshooting pipeline issues. Watermarks on free outputs limit commercial use cases.
The system grants full commercial rights to generated content, which matters legally since AI-generated media exists in uncertain copyright territory. One-click export simplifies the final delivery step. No integrations are documented, so outputs need manual transfer to other platforms or tools.