The Solo Founder's Cloudflare Stack: How I Ship 12 Products Without a Team
A deep technical walkthrough of how I use Cloudflare Workers, Pages, R2, D1, KV, Queues, and AI to run a 12-product venture studio with zero DevOps, zero servers, and zero employees. Real architecture, real costs, real tradeoffs.
Every month, someone DMs me: “How are you running 12 products by yourself? What’s your infrastructure? How much does it cost? Do you sleep?”
The answer to the last question is “sometimes.” The answer to the rest is Cloudflare. Not as a sponsor, as a survival mechanism.
I run 12 live products across 6 categories. Zero servers. Zero DevOps. Zero employees. The entire portfolio runs on Cloudflare’s developer platform, and my total infrastructure bill last month was $47.
This is the complete technical walkthrough.
12 live products · $47/month infrastructure cost · 300+ edge locations · 0 servers managed
Why Cloudflare? The Decision That Changed Everything
I used to be an AWS person. EC2, RDS, Lambda, S3, CloudFront, the whole alphabet soup. For a solo founder running multiple products, AWS is death by a thousand cuts. Not because it’s bad. It’s designed for teams.
The math is simple. If you’re a solo founder, every hour you spend on infrastructure is an hour you’re not spending on product or distribution. Cloudflare eliminates the infrastructure tax almost entirely.
The key insight: Cloudflare isn’t “serverless” in the Lambda sense (containers that cold-start). It’s V8 isolates that run at the edge, meaning your code literally executes in the city closest to your user, with zero cold starts. This isn’t an optimization. It’s a fundamentally different architecture.
The Full Stack: Every Layer Explained
Here’s every Cloudflare primitive I use and exactly how it fits into the portfolio.
| Layer | Service | What I Use It For | Products Using It |
|---|---|---|---|
| Compute | Workers | API routes, backend logic, cron jobs, webhooks | All 12 |
| Frontend | Pages | Static sites, SSR with Next.js/Astro | All 12 |
| Database | D1 | User data, product data, analytics, metadata | 10 of 12 |
| Cache / State | KV | Session tokens, feature flags, config, rate limiting | All 12 |
| Storage | R2 | File uploads, audio files, images, exports, backups | 7 of 12 |
| Async Jobs | Queues | Email sends, webhook delivery, AI pipeline stages | 5 of 12 |
| AI | Workers AI | Whisper, Llama, embeddings, classification, summarization | 8 of 12 |
| Real-time | Durable Objects | WebSocket connections, collaborative state, counters | 3 of 12 |
| Vector Search | Vectorize | Semantic search, RAG retrieval, similarity matching | 4 of 12 |
That’s 9 services replacing what would traditionally be 15-20 AWS services, 3 third-party tools, and a DevOps engineer.
Architecture Deep Dive: How Each Product Is Built
Let me walk through the actual architecture of four products to show how these pieces fit together.
AudioPod AI: the most complex
AudioPod AI is an AI audio workstation: upload audio, get diarization (who spoke when), noise reduction, stem splitting, voice cloning, and translation. This is the most infrastructure-intensive product in the portfolio.
AudioPod AI Architecture Flow
The magic is in the Queue. Without it, the Workers request would time out (Workers have a 30-second CPU limit). The Queue decouples the upload from the processing, so the user gets an instant “upload confirmed” response while the heavy AI work happens asynchronously.
Go2.gg: the simplest
Go2.gg is a URL shortener with analytics. It’s the simplest product in the portfolio, but it handles the most traffic (millions of redirects per month).
Go2.gg Architecture (Entire Backend)
Total infrastructure for a URL shortener handling millions of redirects: one Worker, one KV namespace, one D1 database, one Queue. Monthly cost: approximately $3.
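That whole backend fits in a few lines. A sketch, with illustrative binding names (`LINKS` for the KV namespace, `CLICKS` for the analytics Queue):

```typescript
// Sketch of the redirect path. Binding names are illustrative.

interface Env {
  LINKS: { get(key: string): Promise<string | null> };
  CLICKS: { send(msg: unknown): Promise<void> };
}
interface Ctx { waitUntil(p: Promise<unknown>): void }

const worker = {
  async fetch(req: Request, env: Env, ctx: Ctx): Promise<Response> {
    const slug = new URL(req.url).pathname.slice(1);
    // KV reads are served from the edge: typically sub-millisecond.
    const dest = await env.LINKS.get(slug);
    if (!dest) return new Response("Not found", { status: 404 });
    // Record the click without delaying the redirect: waitUntil lets the
    // queue send complete after the response has already been returned.
    ctx.waitUntil(env.CLICKS.send({ slug, ts: Date.now(), ref: req.headers.get("referer") }));
    return Response.redirect(dest, 302);
  },
};

export default worker;
```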
Findable: semantic search at the edge
Findable is an AI-powered semantic search platform. Users upload documents, and we build a searchable index that understands meaning, not just keywords.
1. Ingest: Document uploaded to R2. Worker extracts text, chunks it into 512-token segments, stores chunks in D1 with metadata.
2. Embed: Workers AI generates embeddings (bge-base-en-v1.5) for each chunk. Embeddings are stored in Vectorize for similarity search.
3. Search: The user query gets embedded. Vectorize finds the top-10 similar chunks, which are fed as context to Llama 3 for a synthesized answer.
4. Respond: Streamed response via Workers + SSE. The user sees the answer forming in real time, with source citations linked to the original documents.
This is a full RAG pipeline (document ingestion, embedding, vector search, LLM synthesis) running entirely on Cloudflare. No OpenAI. No Pinecone. No separate vector database. Everything at the edge.
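The query path (the Embed, Search, and Respond steps) can be sketched roughly as follows. This assumes chunk text is stored in Vectorize metadata at ingest time; binding names are illustrative:

```typescript
// Sketch of the RAG query path. Assumes chunk text lives in Vectorize
// metadata; AI and SEARCH_INDEX are illustrative binding names.

interface Env {
  AI: { run(model: string, input: unknown): Promise<any> };
  SEARCH_INDEX: {
    query(vector: number[], opts: { topK: number; returnMetadata: boolean }): Promise<{
      matches: { id: string; metadata?: { text?: string } }[];
    }>;
  };
}

export async function answer(query: string, env: Env): Promise<string> {
  // 1. Embed the query with the same model used for the document chunks.
  const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
  // 2. Retrieve the 10 most similar chunks from Vectorize.
  const { matches } = await env.SEARCH_INDEX.query(data[0], { topK: 10, returnMetadata: true });
  const context = matches.map((m) => m.metadata?.text ?? "").join("\n---\n");
  // 3. Have Llama 3 synthesize an answer grounded in those chunks.
  const result = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: query },
    ],
  });
  return result.response;
}
```

(The production version streams the response over SSE instead of returning a string.)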
AgentDrive: AI agents on Workers
AgentDrive is the most architecturally interesting product. It’s a framework for deploying AI agents that can use tools, maintain state, and execute multi-step tasks, all running as Cloudflare Workers.
| Agent Component | Cloudflare Service | Why This Service |
|---|---|---|
| Agent runtime | Workers | Stateless compute, global deployment, sub-ms cold starts |
| Conversation memory | Durable Objects | Persistent state per agent session, survives restarts |
| Long-term memory | D1 + Vectorize | Structured facts in D1, semantic retrieval via Vectorize |
| Tool execution | Service Bindings | Agent calls other Workers as “tools” with zero network hop |
| Background tasks | Queues | Agent kicks off async work (send email, scrape URL, etc.) |
The key insight with AgentDrive is Service Bindings. When one Worker calls another Worker in the same account, there’s no HTTP overhead. It’s a direct function call. This means an AI agent can “use tools” (which are just other Workers) with near-zero latency. No API gateway. No network hop. Just a function call at the edge.
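Tool dispatch over a service binding can be sketched like this (`SCRAPER` and `SUMMARIZER` are hypothetical tool Workers, not AgentDrive's real bindings):

```typescript
// Sketch of agent tool dispatch via Service Bindings. SCRAPER and
// SUMMARIZER are hypothetical tool Workers.

interface ToolWorker { fetch(req: Request): Promise<Response> }
interface Env { SCRAPER: ToolWorker; SUMMARIZER: ToolWorker }

export async function useTool(env: Env, tool: keyof Env, input: unknown): Promise<unknown> {
  // This looks like HTTP, but with a service binding the request never
  // leaves the process: it resolves to a direct call into the other
  // Worker's fetch handler. The hostname is not routed anywhere.
  const res = await env[tool].fetch(
    new Request("https://internal/", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(input),
      // @ts-ignore -- some runtimes require duplex when a body is set
      duplex: "half",
    }),
  );
  if (!res.ok) throw new Error(`tool ${String(tool)} failed: ${res.status}`);
  return res.json();
}
```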
The Cost Breakdown: How $47/Month Runs 12 Products
People don’t believe me when I say the number. So here’s the actual breakdown.
Important Caveat: The $250K Credit
Cloudflare gave me $250K in startup credits through their Workers Launchpad program. This covers Workers AI inference (which would otherwise be my biggest cost) and gives me generous free tiers on everything else. Without credits, my bill would be closer to $200-400/month, still absurdly cheap for 12 products. The R2 zero-egress model alone saves hundreds compared to S3.
How This Compares to Traditional Infrastructure
| Item | AWS / Traditional | Cloudflare | Savings |
|---|---|---|---|
| Compute (12 services) | $240/mo | $5/mo | 98% |
| Database (14 DBs) | $350/mo | $2.50/mo | 99% |
| Storage + CDN (120GB) | $80/mo | $1.80/mo | 98% |
| AI inference | $500/mo | $0 (credits) | 100% |
| DevOps engineer (part-time) | $3,000/mo | $0 | 100% |
| Total | ~$4,170/mo | ~$47/mo | 99% |
The Shared Boilerplate: Ship a New Product in 3 Hours
The real superpower isn’t any single Cloudflare service. It’s the shared boilerplate I’ve built across all 12 products. Every product starts from the same template, which means spinning up product #13 takes hours, not weeks.
| Boilerplate Layer | What’s Included | Time Saved |
|---|---|---|
| Auth | OAuth (Google, GitHub), magic links, session management via KV | ~8 hours |
| Payments | Stripe Checkout, webhooks, subscription management, usage metering | ~12 hours |
| Email | Transactional emails, welcome sequences, billing notifications | ~4 hours |
| API layer | Rate limiting (KV), API key management, CORS, error handling | ~6 hours |
| Frontend | Landing page, dashboard shell, settings, billing portal | ~16 hours |
| Analytics | Event tracking, user activity logging, basic dashboards | ~4 hours |
| Total | | ~50 hours |
50 hours of work that I never have to redo. This is the real moat of the venture studio model: each product makes the next one faster to build. This is also what became ShipQuest, my SaaS boilerplate product. I’m literally selling my own internal tooling.
The Tradeoffs: What Cloudflare Can’t Do (Yet)
I’m not going to pretend this stack is perfect. There are real limitations, and understanding them is critical for deciding if this approach works for your use case.
D1 is not Postgres
D1 is SQLite at the edge. It’s phenomenal for reads and simple writes, but it doesn’t have full Postgres features: no JSONB operators, no materialized views, no row-level security. For 80% of SaaS use cases, SQLite is more than enough. For the other 20%, you’ll need to supplement with Turso or Neon.
Workers have CPU time limits
Workers get 30 seconds of CPU time on the paid plan. This means heavy computation (image processing, large file parsing, complex data transforms) must be offloaded to Queues or Durable Objects. You can’t just “run a long script.” You have to architect for async.
Workers AI model selection is limited
Workers AI supports Llama, Mistral, Whisper, FLUX, and a handful of other models, but not GPT-4, Claude, or Gemini. For commodity tasks (summarization, classification, transcription), Workers AI is perfect. For frontier intelligence, I still call OpenAI or Anthropic APIs from the Worker.
Real-time means Durable Objects
If you need real-time features (live updates, collaborative editing), you need Durable Objects, which have a learning curve. It’s not as simple as spinning up a Socket.io server. The mental model is different: you’re creating globally unique objects that hold state and accept connections.
My Rule of Thumb
When to use Cloudflare vs. When to reach elsewhere
Use Cloudflare when: your app is request-response (APIs, web apps, webhooks), your data model fits SQLite, your AI needs are commodity (transcription, summarization, embeddings), and you value speed-to-deploy over feature completeness.
Reach elsewhere when: you need complex relational queries (Postgres), GPU-intensive computation (Replicate, Modal), frontier AI models (OpenAI, Anthropic APIs), or real-time multiplayer state (consider Liveblocks or PartyKit on top of Durable Objects).
The Deployment Pipeline: How I Actually Ship
Every product follows the same deployment process. No CI/CD configuration. No Docker. No Kubernetes.
Code (push to GitHub) → Build (Pages auto-builds) → Preview (branch deploy URL) → Merge (PR to main) → Live (global in ~30s)
That’s it. Push code, it’s live globally in 30 seconds. Every push to a non-main branch creates a preview URL that I can share with beta users. Merging to main deploys to production. No staging environment needed: the preview URL is the staging environment.
For Workers (backend APIs), I use wrangler deploy which pushes the Worker to all 300+ edge locations in under 15 seconds. No rolling deploys. No blue-green. It’s just… instant.
The Wrangler.toml Pattern: One Config to Rule Them All
Every product has a wrangler.toml that defines its entire infrastructure. Here’s a simplified version of what AudioPod AI’s looks like:
wrangler.toml (simplified)

```toml
# Compute
name = "audiopod-api"
compatibility_date = "2025-12-01"

# Database
[[d1_databases]]
binding = "DB"
database_name = "audiopod-prod"

# Key-Value
[[kv_namespaces]]
binding = "SESSIONS"

# Object Storage
[[r2_buckets]]
binding = "AUDIO_FILES"
bucket_name = "audiopod-audio"

# Async Processing
[[queues.producers]]
binding = "PROCESSING_QUEUE"
queue = "audiopod-jobs"

# AI
[ai]
binding = "AI"

# Vector Search
[[vectorize]]
binding = "SEARCH_INDEX"
index_name = "audiopod-transcripts"
```
This single file declares the entire backend infrastructure. The database, the storage, the queue, the AI, the vector index, all in 30 lines. When I create a new product, I duplicate this file, change the names, run wrangler deploy, and I have a fully provisioned backend in under a minute.
Compare this to setting up equivalent infrastructure on AWS. You’d need CloudFormation templates, IAM policies, VPC configurations, security groups, and a weekend of your life you’ll never get back.
Patterns I’ve Learned Running 12 Products on This Stack
Pattern 1: Use KV as Your First Database
When prototyping a new product, skip D1 entirely. Use KV. It’s key-value, it’s global, it’s fast, and the data model forces you to think about access patterns upfront. When you need relational queries, migrate to D1. But for MVPs (user profiles, feature flags, session tokens), KV is perfect.
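A sketch of the pattern, with illustrative key prefixes (`user:`, `flag:`) acting as the schema:

```typescript
// KV-as-first-database sketch. The key prefixes ARE the schema; names
// here are illustrative.

interface KV {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

interface Profile { email: string; plan: "free" | "pro" }

export async function saveUser(kv: KV, id: string, profile: Profile): Promise<void> {
  await kv.put(`user:${id}`, JSON.stringify(profile));
}

export async function getUser(kv: KV, id: string): Promise<Profile | null> {
  const raw = await kv.get(`user:${id}`);
  return raw ? (JSON.parse(raw) as Profile) : null;
}

export async function setFlag(kv: KV, name: string, on: boolean): Promise<void> {
  // Feature flags get a TTL so stale experiments clean themselves up.
  await kv.put(`flag:${name}`, String(on), { expirationTtl: 60 * 60 * 24 * 30 });
}
```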
Pattern 2: Queue Everything That Can Wait
If the user doesn’t need the result immediately, put it in a Queue. Email notifications, analytics events, AI processing, webhook deliveries, all queued. This keeps your API response times under 50ms and makes your system naturally resilient to downstream failures.
Pattern 3: One Worker Per Domain, Not Per Route
I used to create separate Workers for each API endpoint. Terrible idea. Now, each product gets one Worker that handles all routes with a router (I use Hono). This keeps the deployment unit simple and lets routes share middleware (auth, rate limiting, logging).
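Hono handles the routing in my codebase; here is a framework-free sketch of the same pattern so the mechanics are visible (`SESSIONS` is an illustrative KV binding):

```typescript
// One Worker, many routes, shared middleware. A hand-rolled sketch of
// what a router like Hono provides. SESSIONS is an illustrative binding.

interface Env { SESSIONS: { get(key: string): Promise<string | null> } }
type Handler = (req: Request, env: Env) => Response | Promise<Response>;

const routes: Record<string, Handler> = {
  "GET /api/health": () => Response.json({ ok: true }),
  "GET /api/me": async (req, env) => {
    const token = req.headers.get("authorization")?.replace("Bearer ", "") ?? "";
    const user = await env.SESSIONS.get(`session:${token}`);
    return Response.json({ user });
  },
};

const worker = {
  async fetch(req: Request, env: Env): Promise<Response> {
    const path = new URL(req.url).pathname;
    // Shared middleware lives here once: auth on everything but /api/health.
    if (path !== "/api/health") {
      const token = req.headers.get("authorization")?.replace("Bearer ", "");
      if (!token || !(await env.SESSIONS.get(`session:${token}`))) {
        return new Response("Unauthorized", { status: 401 });
      }
    }
    const handler = routes[`${req.method} ${path}`];
    return handler ? handler(req, env) : new Response("Not found", { status: 404 });
  },
};

export default worker;
```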
Pattern 4: Cache AI Responses Aggressively
If someone asks Findable “What is your refund policy?” and another user asks the same question 5 minutes later, there’s no reason to run the LLM again. I cache AI responses in KV with a hash of the query + context as the key. Cache hit rate on commodity AI queries is 30-40%, which directly translates to cost savings and speed.
Pattern 5: Use R2 as Your Data Lake
Every product dumps raw event data into R2 as newline-delimited JSON. It’s my poor man’s data warehouse. When I need to analyze cross-product metrics on Sunday ops day, I pull the files and process them locally. Zero cost for storage, zero egress fees for retrieval. R2’s S3-compatible API means any tool that reads from S3 works automatically.
The Philosophy: Infrastructure as a Weapon
The solo founder’s biggest enemy isn’t competition. It’s operational drag.
Every minute you spend configuring Nginx, debugging Docker networking, patching security vulnerabilities, or scaling databases is a minute you’re not spending on the two things that actually matter: building the product and getting it in front of users. Cloudflare eliminates operational drag almost entirely. Not because it’s the best at any single thing, but because it’s good enough at everything, and it all works together without glue code, without config files, and without a DevOps team.
I’m not saying Cloudflare is the right choice for every company. If you’re building a database company, use AWS. If you’re training ML models, use GCP. If you need enterprise compliance, use Azure.
But if you’re a solo founder who needs to ship 12 products, iterate fast, keep costs near zero, and never think about infrastructure, there is nothing else that comes close.
The stack isn’t the product. The stack is what lets you build the product.
Building on Cloudflare? I share architecture decisions weekly.