Segment filters

Mocked event-time filters. User type is recorded at the moment of usage, so a learner's trial usage and later paid-yearly usage are counted separately.

Device platform: All devices
App version: All versions
Device country: All countries
User type: All user types
Last 30 days, All devices, All versions, All countries, All user types: 100% usage slice
Planned cost: $611 (for selected filters)
Actual cost: $733 (for selected filters)
Cost delta: +$122 (+20% vs plan; actual minus planned)
Most over-plan feature: Roleplay / Conversation Practice (+$118)
14 costing rows

Monthly cost shape

Visual from actual row totals. The tables are the source.

[Chart: actual cost by feature. Categories: Guided, Guided, Drill, Roleplay, Flashcard]

Guided Lesson - Vocab

Word card -> learner speaks -> pronunciation score -> result shown

Actual vs planned: $41.18 (-$3.83)
Columns: Costing sub-feature, Pricing and source, Planned usage, Actual usage, Planned cost, Actual cost, Delta, Note
Vocabulary pronunciation scoring (Google Cloud Speech-to-Text V2, chirp_3)

Learner audio -> speech scoring -> result. Cost = sessions x attempts x seconds / 60 x rate per minute.

Pricing and source: $0.016 / processed minute, first 500k minutes (Google STT)
Planned usage: 18,500 sessions x 2.4 attempts x 3.8 sec / 60 x $0.016 = $44.99
Actual usage:
  Monthly sessions: 16,260 sessions
  Attempts / session: 2.302 attempts
  Seconds / attempt: 4.124 sec
  Rate / min: $0.016
  16,260 sessions x 2.302 attempts x 4.124 sec / 60 x $0.016 = $41.16
Planned cost: $44.99
Actual cost: $41.16
Delta: -$3.83 (-8.517%)
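
The per-minute formula used by this row can be sketched in Python as follows. This is a minimal sketch, not the dashboard's implementation: the function name `stt_cost` and its parameter names are hypothetical, and it applies the flat $0.016 rate without modelling any pricing tier beyond the first 500k minutes.

```python
def stt_cost(sessions, attempts_per_session, seconds_per_attempt, rate_per_minute=0.016):
    """Estimate speech-scoring cost in dollars (hypothetical helper)."""
    # Total learner audio sent for scoring, converted from seconds to minutes.
    processed_minutes = sessions * attempts_per_session * seconds_per_attempt / 60
    return processed_minutes * rate_per_minute


planned = stt_cost(18_500, 2.4, 3.8)       # planned inputs from the row above
actual = stt_cost(16_260, 2.302, 4.124)    # displayed actual inputs
delta = actual - planned                   # actual minus planned

print(f"planned=${planned:.2f} actual=${actual:.2f} delta={delta:+.2f}")
```

The displayed actual inputs are rounded, so a percentage recomputed from them can differ from the dashboard's figure in the last digits.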
Guided lesson prompt audio storage and delivery (AWS S3 + CDN audio)

CloudFront -> guided-lessons assets -> local cache. Live defaults: 878,416,628 stored bytes and 1,820,705,213 bytes downloaded month-to-date.

Pricing and source: S3 Standard ap-southeast-2 storage plus CloudFront egress for dqztncf0oxx31.cloudfront.net/guided-lessons (AWS S3 + CloudFront)
Planned usage: 0.818 GB x $0.025 + max(0, 1.696 GB - 1,024 GB) x $0.12 = $0.02
Actual usage:
  Stored assets: 0.778 GB
  Storage / GB: $0.025
  Month-to-date CDN egress: 1.849 GB
  CloudFront free egress: 1,024 GB
  APAC egress / GB: $0.12
  0.778 GB x $0.025 + max(0, 1.849 GB - 1,024 GB) x $0.12 = $0.02
Planned cost: $0.02
Actual cost: $0.02
Delta: -$0.00 (-4.936%)
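
The storage-plus-egress formula, including the free egress allowance, might look like the sketch below. It assumes the dashboard's "GB" is 1,073,741,824 bytes, since that conversion reproduces 0.818 GB and 1.696 GB from the live default byte counts. The function name and parameter names are hypothetical.

```python
GIB = 1024 ** 3  # assumption: the dashboard's "GB" is 2**30 bytes


def storage_and_egress_cost(stored_bytes, egress_bytes,
                            storage_rate=0.025, free_egress_gb=1024, egress_rate=0.12):
    """Monthly storage cost plus CDN egress above the free allowance (hypothetical helper)."""
    stored_gb = stored_bytes / GIB
    egress_gb = egress_bytes / GIB
    # Only egress beyond the free allowance is billed; below it this term is zero.
    billable_egress_gb = max(0.0, egress_gb - free_egress_gb)
    return stored_gb * storage_rate + billable_egress_gb * egress_rate


# Live default byte counts quoted in the row above.
planned_cost = storage_and_egress_cost(878_416_628, 1_820_705_213)
print(f"${planned_cost:.2f}")
```

At the current volumes egress sits far below the 1,024 GB allowance, so the row is driven entirely by the storage term.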

Guided Lesson - Dialogue

Dialogue line -> learner speaks -> pronunciation score -> result shown

Actual vs planned: $57.45 (+$1.11)
Columns: Costing sub-feature, Pricing and source, Planned usage, Actual usage, Planned cost, Actual cost, Delta, Note
Dialogue sentence pronunciation scoring (Google Cloud Speech-to-Text V2, chirp_3)

Learner sentence -> speech scoring -> result. Cost = sessions x attempts x seconds / 60 x rate per minute.

Pricing and source: $0.016 / processed minute, first 500k minutes (Google STT)
Planned usage: 14,200 sessions x 3.1 attempts x 4.8 sec / 60 x $0.016 = $56.35
Actual usage:
  Monthly sessions: 14,122 sessions
  Attempts / session: 3.331 attempts
  Seconds / attempt: 4.58 sec
  Rate / min: $0.016
  14,122 sessions x 3.331 attempts x 4.58 sec / 60 x $0.016 = $57.45
Planned cost: $56.35
Actual cost: $57.45
Delta: +$1.11 (+1.967%)

Drill

Drill prompt -> learner speaks -> pronunciation score; optional word lookup

Actual vs planned: $41.64 (-$3.78)
Columns: Costing sub-feature, Pricing and source, Planned usage, Actual usage, Planned cost, Actual cost, Delta, Note
Speaking drill pronunciation scoring (Google Cloud Speech-to-Text V2, chirp_3)

Learner audio -> speech scoring -> result. Cost = sessions x attempts x seconds / 60 x rate per minute.

Pricing and source: $0.016 / processed minute, first 500k minutes (Google STT)
Planned usage: 12,400 sessions x 3.9 attempts x 3.4 sec / 60 x $0.016 = $43.85
Actual usage:
  Monthly sessions: 14,697 sessions
  Attempts / session: 3.452 attempts
  Seconds / attempt: 2.973 sec
  Rate / min: $0.016
  14,697 sessions x 3.452 attempts x 2.973 sec / 60 x $0.016 = $40.21
Planned cost: $43.85
Actual cost: $40.21
Delta: -$3.63 (-8.282%)
Drill dictionary lookup (Cerebras gpt-oss-120b)

Word tap -> AI explanation -> learner reads. Cost = lookup taps x (input tokens x input rate + output tokens x output rate).

Pricing and source: $0.35/M input tokens + $0.75/M output tokens (Cerebras)
Planned usage: 3,200 calls x (850 input / 1M x $0.35 + 260 output / 1M x $0.75) = $1.58
Actual usage:
  Monthly lookups: 2,965 calls
  Input / lookup: 891 tokens
  Output / lookup: 224 tokens
  Input / 1M: $0.35
  Output / 1M: $0.75
  2,965 calls x (891 input / 1M x $0.35 + 224 output / 1M x $0.75) = $1.42
Planned cost: $1.58
Actual cost: $1.42
Delta: -$0.15 (-9.693%)
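
Token-priced rows such as this lookup follow one pattern: calls multiplied by a per-call cost built from input and output tokens. A minimal sketch, with a hypothetical function name and parameter names:

```python
def token_cost(calls, input_tokens, output_tokens,
               input_rate_per_million, output_rate_per_million):
    """Cost in dollars of `calls` LLM requests priced per 1M tokens (hypothetical helper)."""
    per_call = (input_tokens * input_rate_per_million
                + output_tokens * output_rate_per_million) / 1_000_000
    return calls * per_call


planned = token_cost(3_200, 850, 260, 0.35, 0.75)   # planned inputs from the row above
actual = token_cost(2_965, 891, 224, 0.35, 0.75)    # displayed actual inputs

print(f"planned=${planned:.2f} actual=${actual:.2f} delta={actual - planned:+.2f}")
```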

Roleplay / Conversation Practice

AI speaks -> learner replies -> AI responds -> optional help, review, pronunciation, translation

Actual vs planned: $545 (+$118)
Columns: Costing sub-feature, Pricing and source, Planned usage, Actual usage, Planned cost, Actual cost, Delta, Note
Shared assumptions
Edit these once. Cost rows below reference them without repeating the inputs.

Session volume (Usage model)

Users -> sessions -> API calls. No direct cost.

Pricing and source: No vendor charge; drives rows below (Model input)
Input groups: Usage volume, Session call count, Gemini STT + LLM assumptions, Google TTS assumptions
Planned usage: 2,900 users x 3.3 sessions/user = 9,570 sessions; x 10 calls/session = 95,700 calls
Actual usage:
  Usage volume
    Monthly active users: 3,228 users
    Sessions / user: 3.913 sessions
  Session call count
    STT+LLM calls / session: 11.4 calls
    TTS calls / session: 10.4 calls
  Gemini STT + LLM assumptions
    Gemini text tokens / call: 19.1 tokens
    Gemini audio sec / call: 6.258 sec
    Gemini audio tokens / sec: 37.5 tokens
    Gemini output tokens / call: 93.8 tokens
  Google TTS assumptions
    Google TTS chars / call: 129 chars
    Google TTS rate / 1M chars: $30
  3,228 users x 3.913 sessions/user = 12,631 sessions; x 11.4 calls/session = 143,785 calls
Cost columns: Assumption, Assumption, Input only
Session costs
Open session -> AI generates first reply (LLM) -> AI speaks (TTS) -> learner records -> AI understands and replies (STT+LLM) -> AI speaks again (TTS) -> repeat until goals are done.
STT + LLM conversation calls (Gemini 2.5 Flash-Lite)

Session -> 8-12 STT+LLM calls. Includes text, audio, and output tokens.

Pricing and source: $0.10/M text input + $0.30/M audio input + $0.40/M output (Gemini)
Planned usage: 9,570 sessions x 10 calls/session x (20 text tokens / 1M x $0.1 + 6 sec x 32 audio tokens/sec / 1M x $0.3 + 105 output tokens / 1M x $0.4) = $9.72 (95,700 calls)
Actual usage: 12,631 sessions x 11.4 calls/session x (19.1 text tokens / 1M x $0.1 + 6.258 sec x 37.5 audio tokens/sec / 1M x $0.3 + 93.8 output tokens / 1M x $0.4) = $15.78 (143,785 calls)
Planned cost: $9.72
Actual cost: $15.78
Delta: +$6.06 (+62.3%)
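
The conversation-call formula combines the shared session volume with a three-part per-call price. The sketch below reproduces the planned figure under the stated planned inputs. The function and variable names are hypothetical, and converting audio to tokens as seconds x tokens per second follows the formula shown in the row, not a confirmed vendor rule.

```python
def gemini_call_cost(text_tokens, audio_seconds, audio_tokens_per_second, output_tokens,
                     text_rate=0.10, audio_rate=0.30, output_rate=0.40):
    """Cost in dollars of one STT+LLM call; rates are dollars per 1M tokens."""
    # Audio is billed as tokens: seconds of speech times tokens per second.
    audio_tokens = audio_seconds * audio_tokens_per_second
    micro_dollars = (text_tokens * text_rate
                     + audio_tokens * audio_rate
                     + output_tokens * output_rate)
    return micro_dollars / 1_000_000


# Shared planned assumptions: users -> sessions -> calls.
sessions = 2_900 * 3.3
calls = sessions * 10

planned = calls * gemini_call_cost(20, 6, 32, 105)
print(f"{calls:,.0f} calls -> ${planned:.2f}")
```

With the planned inputs, audio input contributes 57.6 of the 101.6 millionths of a dollar per call, so seconds of learner audio per call is the most sensitive input in this row.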
Assistant speech output (Google Cloud Text-to-Speech, Chirp 3 HD)

Session -> AI replies -> speech audio.

Pricing and source: $30 / 1M synthesized characters (Google TTS)
Planned usage: 9,570 sessions x 10 TTS calls/session x 140 chars/call / 1M x $30 = $402
Actual usage: 12,631 sessions x 10.4 TTS calls/session x 129 chars/call / 1M x $30 = $507
Planned cost: $402
Actual cost: $507
Delta: +$105 (+26.2%)
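
Speech output is priced per synthesized character, which can be sketched as below. The function name and parameter names are hypothetical.

```python
def tts_cost(sessions, tts_calls_per_session, chars_per_call, rate_per_million_chars=30.0):
    """Cost in dollars of synthesized speech, priced per 1M characters (hypothetical helper)."""
    synthesized_chars = sessions * tts_calls_per_session * chars_per_call
    return synthesized_chars / 1_000_000 * rate_per_million_chars


planned = tts_cost(9_570, 10, 140)           # planned inputs from the row above
actual_from_displayed = tts_cost(12_631, 10.4, 129)

print(f"planned=${planned:.0f}")
```

Recomputing the actual figure from the displayed inputs gives roughly $508 rather than $507, presumably because the displayed inputs are rounded.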
Goal completion checks (Gemini 2.5 Flash-Lite)

One check per goal. Default assumes 4 goals/session.

Pricing and source: $0.10/M text input + $0.40/M output (Gemini)
Planned usage: 2,900 users x 3.3 sessions/user x 4 events/session x (650 input / 1M x $0.1 + 80 output / 1M x $0.4) = $3.71
Actual usage:
  Checks / session: 3.943 checks
  Input / check: 557 tokens
  Output / check: 81.8 tokens
  3,228 users x 3.913 sessions/user x 3.943 events/session x (557 input / 1M x $0.1 + 81.8 output / 1M x $0.4) = $4.40
Planned cost: $3.71
Actual cost: $4.40
Delta: +$0.69 (+18.6%)
Optional action costs
Triggered only when the learner asks for extra help.
Hint generation (Gemini 2.5 Flash-Lite)

Hint tap -> short suggestion.

Pricing and source: $0.10/M text input + $0.40/M output (Gemini)
Planned usage: 2,900 users x 3.3 sessions/user x 0.75 events/session x (500 input / 1M x $0.1 + 120 output / 1M x $0.4) = $0.70
Actual usage:
  Hints / session: 0.721 hints
  Input / hint: 420 tokens
  Output / hint: 106 tokens
  3,228 users x 3.913 sessions/user x 0.721 events/session x (420 input / 1M x $0.1 + 106 output / 1M x $0.4) = $0.77
Planned cost: $0.70
Actual cost: $0.77
Delta: +$0.07 (+9.303%)
Azure pronunciation check (Azure Pronunciation Assessment)

Pronunciation tab -> audio check -> score.

Pricing and source: Modeled as $1.00 / audio hour (Azure Speech)
Planned usage: 2,900 users x 3.3 sessions/user x 0.2 checks/session x 8 sec / 3600 x $1 = $4.25
Actual usage:
  Checks / session: 0.226 checks
  Audio sec / check: 9.684 sec
  Rate / hour: $1
  3,228 users x 3.913 sessions/user x 0.226 checks/session x 9.684 sec / 3600 x $1 = $7.69
Planned cost: $4.25
Actual cost: $7.69
Delta: +$3.43 (+80.7%)
Translation (Groq, openai/gpt-oss-120b)

Translate tap -> English support text.

Pricing and source: $0.15/M input + $0.60/M output (Groq)
Planned usage: 2,900 users x 3.3 sessions/user x 1.2 events/session x (650 input / 1M x $0.15 + 180 output / 1M x $0.6) = $2.36
Actual usage:
  Translations / session: 1.147 translations
  Input / translation: 555 tokens
  Output / translation: 169 tokens
  3,228 users x 3.913 sessions/user x 1.147 events/session x (555 input / 1M x $0.15 + 169 output / 1M x $0.6) = $2.68
Planned cost: $2.36
Actual cost: $2.68
Delta: +$0.32 (+13.5%)
Final feedback generation (Gemini / Vertex fallback)

Session ends -> AI summarizes performance.

Pricing and source: $0.10/M text input + $0.40/M output (Gemini)
Planned usage: 2,900 users x 3.3 sessions/user x 1 events/session x (2,200 input / 1M x $0.1 + 700 output / 1M x $0.4) = $4.79
Actual usage:
  Feedback / session: 0.902 events
  Input / feedback: 2,674 tokens
  Output / feedback: 768 tokens
  3,228 users x 3.913 sessions/user x 0.902 events/session x (2,674 input / 1M x $0.1 + 768 output / 1M x $0.4) = $6.55
Planned cost: $4.79
Actual cost: $6.55
Delta: +$1.76 (+36.9%)
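
The token-priced Roleplay rows (goal checks, hints, translation, final feedback) share one shape: events per session scaled by the shared user and session assumptions. A table-driven sketch using the planned inputs is shown below. The `TokenRow` class and `row_cost` function are hypothetical names for illustration, not the dashboard's own model.

```python
from dataclasses import dataclass


@dataclass
class TokenRow:
    name: str
    events_per_session: float
    input_tokens: float
    output_tokens: float
    input_rate: float    # dollars per 1M input tokens
    output_rate: float   # dollars per 1M output tokens


def row_cost(row, users, sessions_per_user):
    """Cost in dollars: users x sessions/user x events/session x per-event token cost."""
    events = users * sessions_per_user * row.events_per_session
    per_event = (row.input_tokens * row.input_rate
                 + row.output_tokens * row.output_rate) / 1_000_000
    return events * per_event


# Planned inputs copied from the rows above.
PLANNED_ROWS = [
    TokenRow("Goal completion checks", 4, 650, 80, 0.10, 0.40),
    TokenRow("Hint generation", 0.75, 500, 120, 0.10, 0.40),
    TokenRow("Translation", 1.2, 650, 180, 0.15, 0.60),
    TokenRow("Final feedback generation", 1, 2_200, 700, 0.10, 0.40),
]

planned = {row.name: row_cost(row, users=2_900, sessions_per_user=3.3)
           for row in PLANNED_ROWS}

for name, cost in planned.items():
    print(f"{name}: ${cost:.3f}")
```

Keeping the shared inputs as arguments mirrors the "edit these once" idea: changing users or sessions per user moves every row together.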

Flashcard Drill

Saved card -> learner speaks -> pronunciation score -> result shown

Actual vs planned: $48.08 (+$11.05)
Columns: Costing sub-feature, Pricing and source, Planned usage, Actual usage, Planned cost, Actual cost, Delta, Note
Flashcard pronunciation scoring (Google Cloud Speech-to-Text V2, chirp_3)

Card prompt -> learner audio -> score. Cost = sessions x cards x seconds / 60 x rate per minute. The formulas below label cards per session as attempts.

Pricing and source: $0.016 / processed minute, first 500k minutes (Google STT)
Planned usage: 6,200 sessions x 8 attempts x 2.8 sec / 60 x $0.016 = $37.03
Actual usage:
  Monthly drill sessions: 6,973 sessions
  Cards / session: 9.638 cards
  Seconds / card: 2.683 sec
  Rate / min: $0.016
  6,973 sessions x 9.638 attempts x 2.683 sec / 60 x $0.016 = $48.08
Planned cost: $37.03
Actual cost: $48.08
Delta: +$11.05 (+29.8%)
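
The summary cards at the top can be cross-checked by rolling up the per-feature "Actual vs planned" figures. The sketch below is only an illustrative check, with hypothetical variable names. It recovers planned cost as actual minus delta, which is an assumption based on the "actual minus planned" label.

```python
# (actual cost, delta vs plan) per feature, copied from the "Actual vs planned" lines.
FEATURES = {
    "Guided Lesson - Vocab": (41.18, -3.83),
    "Guided Lesson - Dialogue": (57.45, 1.11),
    "Drill": (41.64, -3.78),
    "Roleplay / Conversation Practice": (545.0, 118.0),
    "Flashcard Drill": (48.08, 11.05),
}

total_actual = sum(actual for actual, _ in FEATURES.values())
total_delta = sum(delta for _, delta in FEATURES.values())
total_planned = total_actual - total_delta      # delta is actual minus planned
delta_pct = total_delta / total_planned * 100

# Feature with the largest positive delta.
most_over_plan = max(FEATURES, key=lambda name: FEATURES[name][1])

print(f"planned=${total_planned:.0f} actual=${total_actual:.0f} "
      f"delta={delta_pct:+.0f}% worst={most_over_plan}")
```

Because the Roleplay figures are shown rounded to whole dollars, the summed delta can differ from the +$122 card by about a dollar.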