Why this matters
Good lip-sync starts with good input. Clean media improves tracking, timing, and overall output quality.
Recommended input quality
For the best results:
use clear audio with intelligible speech,
avoid overlapping speakers,
keep background noise as low as possible,
use footage where the active speaker is clearly visible,
prefer frontal or near-frontal face angles, which usually work best.
Media requirements
Please follow the supported input formats and limits described in our platform documentation and technical API docs.
If your media uses an uncommon codec or container, convert it before sending it to the API.
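As a rough illustration of the conversion step, the sketch below re-encodes a file with ffmpeg before upload. It assumes ffmpeg is installed locally and that MP4 with H.264 video and AAC audio is an accepted input; that target is an assumption made for this example, so confirm the supported formats in the technical API docs first. The helper names (converted_path, build_convert_command, convert) are hypothetical.

```python
import subprocess
from pathlib import Path


def converted_path(src: str) -> Path:
    """Return a sibling output path so the source file is never overwritten."""
    source = Path(src)
    return source.with_name(source.stem + "_converted.mp4")


def build_convert_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes src to H.264 video and AAC audio."""
    return [
        "ffmpeg",
        "-y",                       # overwrite dst if it already exists
        "-i", src,                  # input file in its original container
        "-c:v", "libx264",          # assumed target video codec
        "-pix_fmt", "yuv420p",      # pixel format most players and decoders accept
        "-c:a", "aac",              # assumed target audio codec
        "-movflags", "+faststart",  # move the index to the front of the MP4
        dst,
    ]


def convert(src: str) -> Path:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    dst = converted_path(src)
    subprocess.run(build_convert_command(src, str(dst)), check=True)
    return dst
```

Calling convert("interview.mkv") would write interview_converted.mp4 next to the source. Keeping the command builder separate from the call that runs it makes the exact flags easy to review or adjust.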
Tips for best results
Start with a short test clip before processing larger batches.
Validate your workflow with one representative example first.
Use stable, production-ready file delivery in your own pipeline.
Monitor token usage before large runs.
If quality matters more than speed, test several example clips before scaling.
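The test-first tips above can be sketched as a small batch runner. This is a minimal illustration, not part of the Lip-Sync API: run_test_first and its process argument are hypothetical names, and process stands in for whatever function in your own pipeline submits one clip and reports whether the result was acceptable.

```python
from typing import Callable, Iterable


def run_test_first(
    clips: Iterable[str],
    process: Callable[[str], bool],
    sample_size: int = 1,
) -> dict:
    """Process a small sample first and continue only if every sample succeeds."""
    all_clips = list(clips)
    sample, rest = all_clips[:sample_size], all_clips[sample_size:]

    processed, failed = [], []
    for clip in sample:
        (processed if process(clip) else failed).append(clip)

    # Stop before the large run if any representative clip failed.
    if failed:
        return {"processed": processed, "failed": failed, "skipped": rest}

    for clip in rest:
        (processed if process(clip) else failed).append(clip)
    return {"processed": processed, "failed": failed, "skipped": []}
```

If the sample fails, the remaining clips are returned under "skipped" untouched, so no tokens are spent on a batch that is likely to produce weak results.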
Common reasons for failed or weak results
corrupted or unsupported source files,
poor face visibility,
very noisy or distorted audio,
speakers talking over each other,
clips that were not validated before large-scale processing.
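The first item in the list above, corrupted or unsupported source files, can often be caught locally before upload. The sketch below assumes ffprobe (shipped with ffmpeg) is installed, and that each input should contain both a video and an audio stream; that requirement is an assumption for this example, so adjust the required argument to match what the API docs specify. The function names are hypothetical, and the check does not assess face visibility or audio noise.

```python
import json
import subprocess


def probe(path: str) -> dict:
    """Return ffprobe's JSON stream report for a file.

    ffprobe typically exits with an error on unreadable files, which
    surfaces here as subprocess.CalledProcessError.
    """
    result = subprocess.run(
        [
            "ffprobe",
            "-v", "error",
            "-show_entries", "stream=codec_type,codec_name",
            "-of", "json",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def preflight_issues(report: dict, required=("video", "audio")) -> list[str]:
    """List the required stream types that are missing from an ffprobe report."""
    found = {stream.get("codec_type") for stream in report.get("streams", [])}
    return [f"no {kind} stream" for kind in required if kind not in found]
```

A typical use would be preflight_issues(probe("clip.mp4")), where an empty list means the file opened and contains the expected stream types.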
Related articles
Lip-Sync API: Overview & first steps
Make your first API request
Token Usage, Plans & Billing for the API
Done!
That is it. Clean inputs and a short test-first workflow will give you the most reliable Lip-Sync API results.
