
Token Consumption by Feature

See exactly how CHAMELAION's video translation features spend tokens, with clear per-minute and per-character rates, formulas, and examples for planning the cost of translated videos.

Written by Konstantin Dorndorf
Updated over a month ago

What it is

Each CHAMELAION feature consumes a predictable number of tokens. Video and audio features are billed by duration, and text features are billed by character count. Start with the quick table, then use the formulas and examples to estimate your run.

Token rates

| Feature             | Token use                                                  |
|---------------------|------------------------------------------------------------|
| Video translation   | 12,000 tokens per minute, or 200 tokens per second         |
| Lip-sync            | 12,000 tokens per minute, or 200 tokens per second         |
| Text-to-Speech      | 3,000 tokens per 1,000 characters, or 3 tokens per character |
| Text translation    | 1,000 tokens per 1,000 characters, or 1 token per character  |
| Audio transcription | 600 tokens per minute, or 10 tokens per second             |

Notes

  • For the most up-to-date information on our prices and token consumption, check out our pricing page🤝.

  • Charges apply per language. One minute translated into three languages counts as three minutes for billing.

  • Lip-sync is an add-on that you enable on top of video translation; its tokens are charged in addition to the translation tokens.

  • Regenerations in the Dubbing Studio consume tokens again, but only for the seconds you regenerate.

Fast formulas

  • Video translation, tokens = duration in seconds × 200 × number of target languages

  • Lip-sync, tokens = duration in seconds × 200 × number of target languages

  • Text translation, tokens = characters × 1

  • Text-to-Speech, tokens = characters × 3

  • Audio transcription, tokens = duration in seconds × 10
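The formulas above can be sketched as a small estimator. This is a minimal illustration, not an official CHAMELAION tool: the rates are copied from the table in this article, while the function and dictionary names are hypothetical and chosen only for this example.

```python
# Token rates copied from the rate table in this article.
TOKENS_PER_SECOND = {
    "video_translation": 200,
    "lip_sync": 200,
    "audio_transcription": 10,
}
TOKENS_PER_CHARACTER = {
    "text_translation": 1,
    "text_to_speech": 3,
}


def time_based_tokens(feature: str, seconds: int, languages: int = 1) -> int:
    """Tokens for a time-billed feature: seconds x rate x target languages.

    The article's audio transcription formula has no language factor,
    so leave `languages` at its default of 1 for that feature.
    """
    return seconds * TOKENS_PER_SECOND[feature] * languages


def character_based_tokens(feature: str, characters: int) -> int:
    """Tokens for a character-billed feature: characters x rate."""
    return characters * TOKENS_PER_CHARACTER[feature]


if __name__ == "__main__":
    # A 3-minute video (180 seconds) translated into two languages.
    print(time_based_tokens("video_translation", 180, languages=2))  # 72000
    # An 8,000-character script converted with Text-to-Speech.
    print(character_based_tokens("text_to_speech", 8_000))  # 24000
```

Working in seconds rather than minutes keeps the same function usable for short regenerated segments as well as full videos.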

Examples

A. Full run, 3-minute video, two languages, Lip-sync on

  • Translation, 3 × 12,000 × 2 = 72,000 tokens

  • Lip-sync, 3 × 12,000 × 2 = 72,000 tokens

  • Total, 72,000 + 72,000 = 144,000 tokens

B. Segment regeneration, 12 seconds, one language

  • Translation only, 12 × 200 = 2,400 tokens

  • If you tick Lip-sync when you click Generate, Lip-sync is applied to the entire video, not just the edited section.

    For a 10-minute video, Lip-sync adds 10 × 12,000 = 120,000 tokens per language.
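Example B can be sketched in code to show how much the Lip-sync checkbox changes the cost of a small edit. This is a rough illustration under one assumption: that the Lip-sync charge for the whole video is added on top of the translation charge for the regenerated segment, which is how the wording "adds" reads here. The function name and parameters are hypothetical.

```python
TRANSLATION_TOKENS_PER_SECOND = 200  # rate from this article
LIP_SYNC_TOKENS_PER_SECOND = 200     # rate from this article


def regeneration_tokens(
    segment_seconds: int,
    video_seconds: int,
    languages: int = 1,
    lip_sync: bool = False,
) -> int:
    """Estimate tokens for regenerating one segment in the Dubbing Studio.

    Translation is charged only for the regenerated seconds. Lip-sync,
    if ticked, is charged for the entire video (assumed to be additive).
    """
    tokens = segment_seconds * TRANSLATION_TOKENS_PER_SECOND * languages
    if lip_sync:
        tokens += video_seconds * LIP_SYNC_TOKENS_PER_SECOND * languages
    return tokens


if __name__ == "__main__":
    # 12-second fix in a 10-minute (600-second) video, one language.
    print(regeneration_tokens(12, 600))                 # 2400
    print(regeneration_tokens(12, 600, lip_sync=True))  # 122400
```

Under that assumption, ticking Lip-sync for a 12-second fix costs about 50 times more than the fix alone, so it can be worth finishing all segment edits before running Lip-sync.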

C. Text translation, 2,500 characters

  • 2,500 × 1 = 2,500 tokens

D. Text-to-Speech for a script, 8,000 characters

  • 8,000 × 3 = 24,000 tokens

E. Audio transcription, 10-minute podcast

  • 10 × 600 = 6,000 tokens

Good to know

  • Features inside team projects use the project creator’s tokens for any paid action in that project.

  • Text tools are billed by character, while video and audio tools are billed by time, so you can estimate a mixed workflow by adding up the cost of each step.

  • Your euro price per minute depends on your plan; the token rates above stay the same. See the Plans page for pricing.
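As a rough sketch of how a mixed workflow adds up, the snippet below totals a time-based step and two character-based steps. The workflow itself (transcribe audio, translate the text, then voice it with Text-to-Speech) is a hypothetical example for one target language, and the function name is made up for illustration; only the rates come from this article.

```python
def mixed_workflow_tokens(
    audio_seconds: int,
    translated_characters: int,
    spoken_characters: int,
) -> dict:
    """Return per-step token estimates and their total for one language."""
    line_items = {
        "audio_transcription": audio_seconds * 10,     # 10 tokens per second
        "text_translation": translated_characters * 1,  # 1 token per character
        "text_to_speech": spoken_characters * 3,        # 3 tokens per character
    }
    # The sum is computed before the "total" key is added to the dict.
    line_items["total"] = sum(line_items.values())
    return line_items


if __name__ == "__main__":
    # 10-minute podcast (600 seconds) with an 8,000-character transcript.
    estimate = mixed_workflow_tokens(600, 8_000, 8_000)
    print(estimate["total"])  # 38000
```

Converting the token total into euros depends on your plan, so that step is left out here.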

For a broader overview of tokens and more information on the monthly resets, take a look at our Tokens article. For plan sizes and pricing, open Plans in the platform and try the calculator. This way, you know precisely which plan is right for you😉.
