
Fixing speaker diarization & assigning speakers

Assign the right cloned voice to every line in CHAMELAION's video translation, fixing speaker diarization so each translated video sounds natural and clear.

Written by Konstantin Dorndorf
Updated over a month ago

What this is, in short

CHAMELAION detects speakers automatically when you upload a video, then splits the source into speaking blocks. If a line is assigned to the wrong speaker, or a speaker is missed, you can fix it in the Dubbing Studio by adjusting the Speaker number on each block, regenerating the audio for that block, and then regenerating the entire file so your edits become part of the media. Each speaker uses a separate cloned voice, so getting the speaker diarization right matters.

Where to make changes

  • Left pane, blocks: Each source block shows a Speaker badge in the top left. Click it to set which speaker should say that line.

  • Bottom timeline: Blocks also appear as clips on Source and Target tracks. Use the Source/Target monitor switch to compare the original with the dub while you work. Zoom in for precise timing.

Fix a wrong speaker label, step by step

  1. Open your file, select the target language, and click Edit in Studio.

  2. Play the video and listen to Source versus Target using the monitor switch.

  3. For any mislabeled block, change the Speaker number in the block header. The first person who speaks in the video is Speaker 1, the second is Speaker 2, and so on.

  4. For that block, run the steps in order: if you changed the source transcript, click Translate (the arrow) first; then click Generate Audio so the target clip is rebuilt in the correct cloned voice.

  5. When all blocks look correct, click Generate in the lower-right corner. In the dialog that opens, you can also choose whether to include Background Sounds and whether to apply Lip-Sync. This render makes your changes permanent 💯.

If a speaker was missed or two people were merged

  • Split a block at the playhead using the scissors icon ✂️ so each person has their own clip, then set the Speaker number on each, translate, and Generate Audio for both.

  • Add a new line: Switch to Add Clip, click where the new line should start, paste or type the text, assign the correct Speaker, translate, then Generate Audio. The new block is created at the position marked by the green block.

  • If many lines need text changes, it can be faster to export the SRT, edit it, and then reupload it. Updating the source SRT realigns all selected target languages; to change only one language, upload a target SRT for that language and regenerate.
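
For bulk text fixes in an exported SRT, a small script can be quicker than editing by hand. The following is a minimal sketch, not a CHAMELAION tool: it assumes the export follows the standard SubRip layout (cue number, a timing line such as 00:00:01,000 --> 00:00:03,000, text lines, blank line), and the function name fix_srt_text and the sample cues are made up for illustration. It replaces whole words in the subtitle text only and leaves cue numbers and timings untouched.

```python
import re

# Matches a standard SubRip timing line, e.g. "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def fix_srt_text(srt: str, replacements: dict) -> str:
    """Apply whole-word replacements to subtitle text lines only.

    Cue numbers, timing lines, and blank separator lines pass through unchanged.
    """
    out = []
    in_text = False  # True once we are past a cue's timing line
    for line in srt.splitlines():
        if TIMING.match(line):
            in_text = True
        elif not line.strip():
            in_text = False  # a blank line ends the current cue
        elif in_text:
            for wrong, right in replacements.items():
                line = re.sub(rf"\b{re.escape(wrong)}\b", right, line)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample cues, standing in for an exported source SRT.
sample = (
    "1\n"
    "00:00:01,000 --> 00:00:03,000\n"
    "Welcome to my Mouse.\n"
    "\n"
    "2\n"
    "00:00:03,500 --> 00:00:05,000\n"
    "The Mouse is old.\n"
)

print(fix_srt_text(sample, {"Mouse": "House"}))
```

To use it on a real export, read the file's contents into a string, pass them through the function, write the result to a new .srt file, and reupload that file as described above. Review the output before uploading, since a whole-word replacement can still change a line where the original word was correct.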

Tips

  • Fix obvious transcription mix-ups first, such as the classic “House” versus “Mouse,” then regenerate the target audio. Correcting the source early keeps all later languages clean.

  • Use Zoom to align speaker turns precisely and trim clip edges in Selection mode for natural pacing.

    • For more on adjusting pacing and other parameters, see our articles on speaking styles, timing & pacing 👍.

  • Turn on Lip-Sync only for the final pass; it will align the lips to your updated audio.

  • If you don’t know which segment in the timeline corresponds to which block, just click it, and CHAMELAION will highlight the correct block for you in the top-left pane 😜.

Done!

That is it. Review blocks, assign the correct speaker numbers, regenerate audio per block, then regenerate the file. Your downloads will match exactly what you generated, with the right voice speaking every line 🎬.
