If you're building an AI product, every user interaction costs you real money. This changes everything about pricing.
If you're building an AI product, this is the chapter that determines whether your business works. Every conversation, every voice generation, every AI evaluation costs you real money. Traditional SaaS has near-zero marginal cost per user. AI products don't.
RESTAURANT: In a normal restaurant, once you've paid for ingredients and staff, serving one more plate barely changes your costs. An AI restaurant is different — every plate requires a fresh delivery of premium ingredients. Your cost per plate is real and it never goes away.
The first step is calculating your actual cost per unit. For a conversation product, the unit is one minute of conversation. Add up every service that gets called during that minute:
Voice AI (ElevenLabs): The biggest cost. Real-time conversation uses speech-to-text, processing, and text-to-speech. The per-minute cost varies by plan and usage tier.
LLM (Claude API): Conversation memory, structured extraction, assessment scoring. Per-token pricing means longer conversations cost more.
Database operations: Reading and writing user data, conversation logs, memory updates. Negligible at small scale, material at large scale.
Infrastructure: Supabase, Webflow, domain, etc. Fixed monthly costs divided by total users, then spread across each user's conversation minutes to get a per-minute share.
COST: For Project Fluency, the blended cost per conversation minute is approximately $0.08-0.12. This is the number that determines every pricing decision. If you don't know your equivalent number, you're guessing.
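As a rough illustration of how a blended number like this can be assembled, the sketch below sums the per-minute cost of each service. Every figure in it is a placeholder chosen for illustration, not an actual vendor price or Project Fluency's real data, and the names (MinuteCosts, blended_cost_per_minute) are hypothetical. It also assumes fixed costs are spread over total monthly conversation minutes.

```python
from dataclasses import dataclass


@dataclass
class MinuteCosts:
    """Cost inputs in USD. All values used below are illustrative placeholders."""
    voice_per_minute: float         # speech-to-text plus text-to-speech
    llm_tokens_per_minute: int      # tokens processed per conversation minute
    llm_price_per_1k_tokens: float  # blended input/output token price
    db_per_minute: float            # reads and writes for logs and memory
    fixed_monthly: float            # hosting, website, domain, etc.
    total_monthly_minutes: int      # conversation minutes across all users


def blended_cost_per_minute(c: MinuteCosts) -> float:
    """Sum every per-minute cost, including a share of fixed monthly costs."""
    llm = c.llm_tokens_per_minute / 1000 * c.llm_price_per_1k_tokens
    infra = c.fixed_monthly / c.total_monthly_minutes
    return c.voice_per_minute + llm + c.db_per_minute + infra


# Hypothetical inputs; substitute the numbers from your own invoices.
example = MinuteCosts(
    voice_per_minute=0.07,
    llm_tokens_per_minute=1500,
    llm_price_per_1k_tokens=0.01,
    db_per_minute=0.001,
    fixed_monthly=100.0,
    total_monthly_minutes=20_000,
)
print(f"${blended_cost_per_minute(example):.3f} per minute")
```

Re-run the calculation whenever a vendor changes its pricing tier or your usage mix shifts, since the voice and LLM terms dominate the total.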
Your price needs to cover three things: direct costs, overhead, and margin. Direct costs are what each user interaction costs you. Overhead is your fixed monthly expenses divided by users. Margin is what's left — the actual business.
A healthy AI product targets 60-70% gross margin after direct costs. Below 50%, your business can't sustain marketing spend and growth. Above 80%, you're probably overcharging and vulnerable to a competitor.
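The margin targets translate into prices with simple arithmetic: if gross margin is (price - direct cost) / price, then price = direct cost / (1 - margin). The sketch below applies that formula to a hypothetical user. The 200 minutes per month and $0.10 per minute are assumed values for illustration only, and the function names are made up for this example.

```python
def price_for_target_margin(direct_cost: float, target_margin: float) -> float:
    """Price at which (price - direct_cost) / price equals target_margin."""
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return direct_cost / (1 - target_margin)


def gross_margin(price: float, direct_cost: float) -> float:
    """Fraction of the price left over after direct costs."""
    return (price - direct_cost) / price


# Hypothetical user: 200 conversation minutes per month at $0.10 per minute.
monthly_direct_cost = 200 * 0.10

for target in (0.50, 0.60, 0.70, 0.80):
    price = price_for_target_margin(monthly_direct_cost, target)
    print(f"{target:.0%} margin -> ${price:.2f}/month")
```

Running gross_margin on your current price and measured direct cost shows which side of the 50% and 80% boundaries you are on today.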
The V9 pricing philosophy for Project Fluency uses invisible caps — pedagogical framing rather than usage meters. Instead of showing users "you have 47 minutes left," the system naturally paces the learning experience. Users don't feel capped. The product just structures their session around optimal learning duration.
This is harder to build than a simple usage meter. It's also dramatically better for retention. Nobody wants to watch a countdown timer while they're learning.
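NOTE: If your margins are thin, the instinct is to raise prices. The better instinct is to reduce cost per interaction. Optimize your prompts. Cache common responses. Use cheaper models for non-critical tasks. A cost reduction raises profit per interaction without risking customers, though not one-for-one with price: because cost is only part of your price, a 20% cost cut at a 60% gross margin adds the same profit per unit as roughly an 8% price increase.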
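Caching is the simplest of these to sketch. The example below is a minimal in-memory cache keyed by a normalized prompt, so trivially different phrasings of the same request trigger only one paid call. The class name ResponseCache and the fake_model function are hypothetical stand-ins, not part of any real SDK; a production version would need expiry and persistent storage.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Caches responses keyed by a normalized prompt (illustrative only)."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._store: Dict[str, str] = {}
        self.paid_calls = 0  # how many times the underlying model was called

    @staticmethod
    def _key(prompt: str) -> str:
        # Lowercase and collapse whitespace so near-identical prompts match.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self._store:
            self.paid_calls += 1
            self._store[key] = self._generate(prompt)
        return self._store[key]


def fake_model(prompt: str) -> str:
    # Stand-in for a paid API call; replace with your real client.
    return f"response to: {prompt}"


cache = ResponseCache(fake_model)
cache.get("How do I say hello?")
cache.get("how do I  say HELLO?")  # same request after normalization
print(cache.paid_calls)
```

This only suits responses that do not depend on user-specific context, such as fixed explanations or common phrase lookups. Personalized conversation turns still need a live call.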