Deleting and Adjusting Sequences, Speaking Style, Timing & Pacing

What this is, in short

Edit how each segment behaves in time and sound. Remove blocks you do not need, cut or merge where helpful, nudge timing on the timeline, compress or stretch pacing, and fine-tune the speaking style with Voice Settings. When you are done, click Generate to bake your edits into your final video without needing extra tokens 👍.

Deleting, cutting, and merging blocks

Delete a block: Select the block, then click the trash icon. The block is removed from both the timeline and the transcript list.

Cut with scissors: Place the playhead where you want the split, then click scissors to divide one block into two for finer control ✂️.

Merge with fuse: Select adjacent blocks, then click fuse to combine them into one cleaner block.

Add a new line: Switch to Add Clip, click where the new line should start, paste or type the text, assign the correct Speaker, translate the text, then click Generate Audio. The new block is created at the position marked by the green block.
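
If it helps to picture what cut and merge do to a block, here is a minimal sketch in Python. It is only an illustration, not how the Studio works internally: the Block type, the cut and merge functions, and the idea of splitting the text by word count are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical model of one timeline block (times in seconds)."""
    start: float
    end: float
    text: str


def cut(block: Block, at: float, words_in_first: int) -> tuple[Block, Block]:
    """Split one block into two at the playhead position `at`.

    `words_in_first` says how many words stay in the first half; this is an
    assumption of the sketch, not a Studio setting.
    """
    if not block.start < at < block.end:
        raise ValueError("the playhead must sit inside the block")
    words = block.text.split()
    first = " ".join(words[:words_in_first])
    second = " ".join(words[words_in_first:])
    return Block(block.start, at, first), Block(at, block.end, second)


def merge(a: Block, b: Block) -> Block:
    """Fuse two adjacent blocks into one that spans both."""
    first, second = sorted((a, b), key=lambda blk: blk.start)
    return Block(first.start, second.end, f"{first.text} {second.text}".strip())


block = Block(2.0, 6.0, "Hello there. How are you?")
left, right = cut(block, at=3.5, words_in_first=2)
restored = merge(left, right)
```

Cutting gives you two shorter blocks you can move or pace separately; merging them again returns a single block covering the same span.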

Timing, when the voice starts

Move a block on the timeline to change when the speech begins.
- Drag toward the beginning and the voice starts earlier.
- Drag toward the end and the voice starts later.
Use Zoom to place blocks more precisely, especially around fast exchanges.

Pacing, how fast the voice speaks

To change pacing, do not move the whole block; drag one of its handles instead.
- Drag the start or end handle inward to shorten the duration, making the voice speak faster.
- Drag the start or end handle outward to lengthen the duration, making the voice speak more slowly.
If you made significant changes and things drift, use Auto speed alignment to let our AI realign the block to the correct place automatically 🔄.
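
The difference between timing and pacing can be sketched with a little arithmetic. This example assumes the audio is fitted to the block's duration, so the speaking speed is roughly the natural length of the line divided by the block length. The speed_factor function and the numbers are hypothetical and only meant to illustrate the idea.

```python
def speed_factor(natural_duration: float, block_start: float, block_end: float) -> float:
    """Approximate speed-up needed to fit a line into its block.

    Assumption of this sketch: a value above 1.0 means faster speech,
    a value below 1.0 means slower speech.
    """
    block_duration = block_end - block_start
    if block_duration <= 0:
        raise ValueError("the block must end after it starts")
    return natural_duration / block_duration


natural = 4.0  # seconds the line takes at normal speed (made-up value)

unchanged = speed_factor(natural, 10.0, 14.0)  # block as generated
moved = speed_factor(natural, 12.0, 16.0)      # whole block dragged 2 s later
shortened = speed_factor(natural, 10.0, 13.0)  # end handle dragged inward
lengthened = speed_factor(natural, 10.0, 15.0) # end handle dragged outward

print(f"unchanged:  {unchanged:.2f}x")
print(f"moved:      {moved:.2f}x")
print(f"shortened:  {shortened:.2f}x")
print(f"lengthened: {lengthened:.2f}x")
```

Moving the block leaves the factor at 1.0 because only the start time changes, while dragging a handle changes the duration and therefore the pace.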

Auto speed alignment

When a block has become noticeably longer or shorter than usual, click Auto speed alignment 🔄. The Studio analyses the surrounding context and repositions or time-warps the block so it matches the video again. This is the quickest way to fix gaps, overlaps, or late entries after bigger edits.

Voice Settings, speaking style for the target audio

Open Voice Settings by clicking the settings icon next to the target audio track. You will see three sliders:

Stability: Higher values make the voice more consistent, but it can sound monotone. Lower this for longer texts.
Similarity: Higher values improve clarity and speaker similarity. Very high values may cause artefacts.
Style: Higher values exaggerate the style relative to the speaker's audio but can cause instability. The default, 0.0, is the fastest.

Clicking Apply only saves the settings; it does not change any existing audio. To hear the new settings, click “Generate Audio” on specific blocks, or regenerate everything in one go by clicking “Regenerate all target audios” 👍.
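
One way to think about the three sliders is as three numbers that you change in small steps. The sketch below is purely illustrative: the VoiceSettings type, the nudge function, the 0.0 to 1.0 range, and the starting values for the first two sliders are assumptions for this example. Only the Style default of 0.0 comes from the description above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical model of the three sliders (assumed range 0.0 to 1.0)."""
    stability: float = 0.5    # assumed placeholder, not a documented default
    similarity: float = 0.75  # assumed placeholder, not a documented default
    style: float = 0.0        # default mentioned in the slider description

    def __post_init__(self) -> None:
        for name in ("stability", "similarity", "style"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


def nudge(settings: VoiceSettings, name: str, delta: float) -> VoiceSettings:
    """Return a copy with one slider moved by `delta`, kept inside the range."""
    current = getattr(settings, name)
    new_value = min(1.0, max(0.0, round(current + delta, 2)))
    return replace(settings, **{name: new_value})


settings = VoiceSettings()
less_flat = nudge(settings, "stability", -0.1)   # line sounds too flat
more_style = nudge(less_flat, "style", 0.05)     # add a little character
print(more_style)
```

Each nudge returns a new set of values, which mirrors the workflow of saving the settings first and then regenerating the audio to hear the result.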

Practical tips

After edits, always click Generate so your final video matches your latest work.
Combine cut, merge, and pacing trims to fit fast dialogues cleanly.
If a line sounds too flat, try lowering Stability a little. If clarity drops, raise Similarity slightly. If delivery needs more character, add Style carefully; small increments work best 👍.

Done!

That is it. With these tools, you can clean up the structure, nail the sync, and shape the delivery so the dub sounds natural and on time 🗣️.