Building a Cost-Efficient PDF Chat App with OpenRouter
Jan 12, 2026 · 9 min read
I built a PDF chat app: upload a document, ask questions about it, and get answers from an LLM. The hard part wasn't the architecture. It was keeping the monthly bill around $20 while the app had zero revenue.
Here's every cost optimization that made it viable.
The Baseline
A naive setup would look like this:
- Embed every PDF chunk with OpenAI's embedding model
- Store vectors in a dedicated vector database
- Query a premium LLM (GPT-4) for every question
- Process documents on a dedicated worker server
Monthly estimate: $85+. At $0 revenue, that's not sustainable.
Optimization 1: Database Choice
Use Postgres with pgvector instead of a separate vector database. Supabase provides this in one service. No extra connection, no extra bill.
pgvector's ivfflat index does approximate nearest-neighbor search, trading a little recall for speed. For a personal project under 100K chunks, the accuracy loss is negligible next to the cost savings.
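A minimal sketch of what that setup could look like, written as SQL strings in TypeScript so they can be sent through any Postgres client. The table name `chunks`, its columns, and the 1536-dimension embedding size are assumptions for illustration, not the app's actual schema. The `lists` heuristic (rows / 1000 up to 1M rows, square root of rows beyond that) follows the starting point suggested in pgvector's README.

```typescript
// Hypothetical schema: table and column names are illustrative only.
const EMBEDDING_DIMENSIONS = 1536; // assumed; must match your embedding model

// Starting value for ivfflat `lists`: rows / 1000 up to 1M rows, sqrt(rows) above.
function ivfflatLists(expectedRows: number): number {
  const lists = expectedRows <= 1000000
    ? expectedRows / 1000
    : Math.sqrt(expectedRows);
  return Math.max(1, Math.round(lists));
}

// Returns the statements needed to enable pgvector and index the chunks table.
function buildSchemaSql(expectedRows: number): string[] {
  return [
    "create extension if not exists vector;",
    `create table if not exists chunks (
  id bigserial primary key,
  document_id uuid not null,
  chunk_index integer not null,
  content text not null,
  embedding vector(${EMBEDDING_DIMENSIONS}),
  unique (document_id, chunk_index)
);`,
    `create index if not exists chunks_embedding_idx on chunks
  using ivfflat (embedding vector_cosine_ops)
  with (lists = ${ivfflatLists(expectedRows)});`,
  ];
}

// Nearest-neighbor lookup; `<=>` is pgvector's cosine-distance operator.
const SEARCH_SQL = `select content from chunks
where document_id = $1
order by embedding <=> $2
limit 5;`;
```

For the 100K-chunk ceiling mentioned above, `ivfflatLists(100000)` gives 100 lists. An ivfflat index is best created after the table already holds data, since the lists are derived from the rows present at build time.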
Optimization 2: OpenRouter Tiered Model Strategy
Instead of using one model for everything, I route queries by complexity (prices are per million input tokens; output tokens are billed separately):
- Simple lookup questions → Mistral 7B ($0.07/M tokens)
- Summarization → Claude 3 Haiku ($0.25/M tokens)
- Complex reasoning → GPT-4o mini ($0.15/M tokens)
A simple classifier (under 50 lines) picks the tier based on the question.
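A minimal sketch of such a classifier, assuming plain keyword and length heuristics. The keyword lists, the 40-word threshold, and the model slugs are all assumptions for illustration (check OpenRouter's model list for current IDs); the original classifier may work differently. The request body follows the OpenAI-compatible chat format that OpenRouter accepts at `https://openrouter.ai/api/v1/chat/completions`.

```typescript
type Tier = "lookup" | "summarize" | "reason";

// Assumed model slugs, one per tier.
const MODEL_BY_TIER: Record<Tier, string> = {
  lookup: "mistralai/mistral-7b-instruct",
  summarize: "anthropic/claude-3-haiku",
  reason: "openai/gpt-4o-mini",
};

// Hypothetical keyword heuristics; tune these against real questions.
const SUMMARY_HINTS = /\b(summar(y|ize|ise)|overview|key points|main points)\b/i;
const REASONING_HINTS = /\b(why|compare|contrast|explain|implications?|trade-?offs?|analy[sz]e|evaluate)\b/i;

function classifyQuestion(question: string): Tier {
  const text = question.trim();
  if (SUMMARY_HINTS.test(text)) return "summarize";
  // Long questions are treated as reasoning even without a keyword match.
  const wordCount = text.split(/\s+/).length;
  if (REASONING_HINTS.test(text) || wordCount > 40) return "reason";
  return "lookup";
}

// Builds the JSON body for a chat completion request (no network call here).
function buildChatRequest(question: string, context: string) {
  return {
    model: MODEL_BY_TIER[classifyQuestion(question)],
    messages: [
      { role: "system", content: `Answer using only this excerpt:\n${context}` },
      { role: "user", content: question },
    ],
  };
}
```

Defaulting to the cheapest tier when nothing matches keeps costs low, at the risk of under-serving a hard question that happens to contain no trigger word.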
The majority of questions hit the cheapest tier. Average cost per query dropped from $0.02 to $0.003.
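To put those two averages in perspective, here is a small worked calculation of the saving and of how many queries a fixed API budget buys at each rate. The per-query costs come from the figures above; the helper names are illustrative, and real costs will vary with prompt and answer length.

```typescript
// Costs in micro-dollars (millionths of a dollar) so the arithmetic stays in integers.
const NAIVE_COST_PER_QUERY = 20000;  // $0.02
const TIERED_COST_PER_QUERY = 3000;  // $0.003

// Whole number of queries a monthly budget (in dollars) covers at a given average cost.
function queriesPerBudget(budgetUsd: number, costPerQueryMicroUsd: number): number {
  return Math.floor((budgetUsd * 1000000) / costPerQueryMicroUsd);
}

// Percentage reduction going from `before` to `after`, rounded to a whole percent.
function percentSaved(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

const saved = percentSaved(NAIVE_COST_PER_QUERY, TIERED_COST_PER_QUERY);  // 85
const naiveQueries = queriesPerBudget(10, NAIVE_COST_PER_QUERY);          // 500
const tieredQueries = queriesPerBudget(10, TIERED_COST_PER_QUERY);        // 3333
```

So the tiered setup cuts the average cost by 85%, and a $10 API budget stretches from 500 queries to roughly 3,300.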
Optimization 3: Embedding Caching
Each document chunk gets embedded only once. I cache embeddings in a separate table keyed by (document_id, chunk_index). If a document is re-uploaded with changes, only the new or changed chunks get embedded.
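A minimal sketch of the cache check, using an in-memory `Map` in place of the database table. The key is the (document_id, chunk_index) pair described above; storing a content hash next to each embedding is an assumption added here so that an edited chunk can be told apart from an unchanged one. FNV-1a is used only to keep the example dependency-free; a stronger hash such as SHA-256 would be the safer choice in practice.

```typescript
interface CachedEmbedding {
  contentHash: string;
  embedding: number[];
}

type EmbeddingCache = Map<string, CachedEmbedding>;

// 32-bit FNV-1a hash, a stand-in for a stronger content hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

function cacheKey(documentId: string, chunkIndex: number): string {
  return `${documentId}:${chunkIndex}`;
}

// Returns the indices of chunks that are missing from the cache or whose text changed.
function chunksToEmbed(cache: EmbeddingCache, documentId: string, chunks: string[]): number[] {
  const stale: number[] = [];
  chunks.forEach((text, index) => {
    const cached = cache.get(cacheKey(documentId, index));
    if (!cached || cached.contentHash !== fnv1a(text)) stale.push(index);
  });
  return stale;
}

// Records an embedding together with the hash of the text it was computed from.
function storeEmbedding(
  cache: EmbeddingCache,
  documentId: string,
  chunkIndex: number,
  text: string,
  embedding: number[],
): void {
  cache.set(cacheKey(documentId, chunkIndex), { contentHash: fnv1a(text), embedding });
}
```

On re-upload, only the indices returned by `chunksToEmbed` need a call to the embedding API. One caveat of position-based keys: inserting text near the start of a document shifts every later chunk, so all of them will look changed.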
Optimization 4: Serverless Processing
Instead of a dedicated worker, process PDFs via Supabase Edge Functions (Deno). Triggered on upload, billed per execution, no idle cost.
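A rough sketch of the logic such a function might run, kept as plain functions so it can be tested outside Deno. The webhook payload shape, the `documents` bucket name, and the chunk size and overlap are all assumptions; PDF text extraction is left out because it depends on the library chosen.

```typescript
// Assumed shape of a database-webhook payload for an insert on storage.objects.
interface StorageInsertPayload {
  type: string;   // e.g. "INSERT"
  table: string;  // e.g. "objects"
  record: { bucket_id: string; name: string } | null;
}

const PDF_BUCKET = "documents"; // hypothetical bucket name

// True only for newly inserted PDF files in the expected bucket.
function isPdfUpload(payload: StorageInsertPayload): boolean {
  return payload.type === "INSERT"
    && payload.table === "objects"
    && payload.record !== null
    && payload.record.bucket_id === PDF_BUCKET
    && /\.pdf$/i.test(payload.record.name);
}

// Splits extracted text into fixed-size chunks that overlap by `overlap` characters.
function splitIntoChunks(text: string, size = 1000, overlap = 200): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last chunk reached the end
  }
  return chunks;
}

// Inside an Edge Function the wiring would look roughly like this:
//   Deno.serve(async (req) => {
//     const payload = await req.json();
//     if (!isPdfUpload(payload)) return new Response("skipped");
//     // download the file, extract text, splitIntoChunks, embed, store
//     return new Response("ok");
//   });
```

Returning early for anything that is not a PDF upload keeps billed execution time short, which matters when the pricing is per invocation.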
The Total
| Component | Monthly Cost |
|---|---|
| VPS (small) | $5-15 |
| Database | $0-5 |
| OpenRouter API | $10-50 |
| Total | $15-70 |
Real-world average for light usage: ~$22/month. The app works, costs are manageable, and the low per-query cost leaves room for a profit margin if usage grows.