
The Solo Founder's Cloudflare Stack: How I Ship 12 Products Without a Team

A deep technical walkthrough of how I use Cloudflare Workers, Pages, R2, D1, KV, Queues, and AI to run a 12-product venture studio with zero DevOps, zero servers, and zero employees. Real architecture, real costs, real tradeoffs.

Every month, someone DMs me: “How are you running 12 products by yourself? What’s your infrastructure? How much does it cost? Do you sleep?”

The answer to the last question is “sometimes.” The answer to the rest is Cloudflare. Not as a sponsor, as a survival mechanism.

I run 12 live products across 6 categories. Zero servers. Zero DevOps. Zero employees. The entire portfolio runs on Cloudflare’s developer platform, and my total infrastructure bill last month was $47.

This is the complete technical walkthrough.

12 live products · $47 monthly infra cost · 300+ edge locations · 0 servers managed

Why Cloudflare? The Decision That Changed Everything

I used to be an AWS person: EC2, RDS, Lambda, S3, CloudFront, the whole alphabet soup. For a solo founder running multiple products, AWS is death by a thousand cuts. Not because it's bad, but because it's designed for teams.

| | AWS (The Old Way) | Cloudflare (The New Way) |
|---|---|---|
| Deploy a new product | 2-3 days | 2-3 hours |
| Monthly cost (12 products) | $800-2,000 | ~$47 |
| Scaling config | Manual / auto-scaling rules | Automatic (edge) |
| Cold starts | 300-1,000ms (Lambda) | 0ms (V8 isolates) |
| SSL/CDN setup | Per-product config | Automatic |
| Database management | RDS patching, backups, sizing | D1: zero-config SQLite |
| Monitoring | CloudWatch (extra cost) | Built-in analytics |

The math is simple. If you’re a solo founder, every hour you spend on infrastructure is an hour you’re not spending on product or distribution. Cloudflare eliminates the infrastructure tax almost entirely.

The key insight: Cloudflare isn’t “serverless” in the Lambda sense (containers that cold-start). It’s V8 isolates that run at the edge, meaning your code literally executes in the city closest to your user, with zero cold starts. This isn’t an optimization. It’s a fundamentally different architecture.


The Full Stack: Every Layer Explained

Here’s every Cloudflare primitive I use and exactly how it fits into the portfolio.

| Layer | Service | What I Use It For | Products Using It |
|---|---|---|---|
| Compute | Workers | API routes, backend logic, cron jobs, webhooks | All 12 |
| Frontend | Pages | Static sites, SSR with Next.js/Astro | All 12 |
| Database | D1 | User data, product data, analytics, metadata | 10 of 12 |
| Cache / State | KV | Session tokens, feature flags, config, rate limiting | All 12 |
| Storage | R2 | File uploads, audio files, images, exports, backups | 7 of 12 |
| Async Jobs | Queues | Email sends, webhook delivery, AI pipeline stages | 5 of 12 |
| AI | Workers AI | Whisper, Llama, embeddings, classification, summarization | 8 of 12 |
| Real-time | Durable Objects | WebSocket connections, collaborative state, counters | 3 of 12 |
| Vector Search | Vectorize | Semantic search, RAG retrieval, similarity matching | 4 of 12 |

That’s 9 services replacing what would traditionally be 15-20 AWS services, 3 third-party tools, and a DevOps engineer.


Architecture Deep Dive: How Each Product Is Built

Let me walk through the actual architecture of four products to show how these pieces fit together.

AudioPod AI: the most complex

AudioPod AI is an AI audio workstation: upload audio, get diarization (who spoke when), noise reduction, stem splitting, voice cloning, and translation. This is the most infrastructure-intensive product in the portfolio.

AudioPod AI Architecture Flow

1. User uploads audio file (browser)
2. Workers API validates the upload and generates a presigned URL
3. R2 stores the raw audio (zero egress fees)
4. Queue message triggers the processing pipeline
5. Workers AI: Whisper (transcription)
6. Workers AI: speaker diarization
7. External API: stem splitting
8. D1 stores transcript, segments, metadata
9. User gets results in real-time via WebSocket (Durable Objects)

The magic is in the Queue. Without it, the Workers request would time out (Workers have a 30-second CPU limit). The Queue decouples the upload from the processing, so the user gets an instant “upload confirmed” response while the heavy AI work happens asynchronously.
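
In code, the pattern looks roughly like this. It's a minimal sketch, not AudioPod's actual code: the binding names, the payload shape, and the transcripts table are all illustrative, and the Whisper model ID is the one Workers AI publishes.

```ts
// worker.ts — hypothetical enqueue-then-process sketch
export interface Env {
  AUDIO_FILES: R2Bucket;
  PROCESSING_QUEUE: Queue;
  DB: D1Database;
  AI: Ai;
}

export default {
  // Upload handler: confirm instantly, defer the heavy work
  async fetch(req: Request, env: Env): Promise<Response> {
    const { key } = await req.json<{ key: string }>(); // R2 key of the uploaded file
    await env.PROCESSING_QUEUE.send({ key });          // enqueue the job
    return Response.json({ status: "upload confirmed" }); // returns in milliseconds
  },

  // Queue consumer: runs asynchronously, outside the request's CPU budget
  async queue(batch: MessageBatch<{ key: string }>, env: Env) {
    for (const msg of batch.messages) {
      const object = await env.AUDIO_FILES.get(msg.body.key);
      if (!object) { msg.ack(); continue; }

      // e.g. transcription via Workers AI
      const audio = await object.arrayBuffer();
      const result = await env.AI.run("@cf/openai/whisper", {
        audio: [...new Uint8Array(audio)],
      });

      await env.DB.prepare("INSERT INTO transcripts (r2_key, text) VALUES (?, ?)")
        .bind(msg.body.key, result.text)
        .run();
      msg.ack();
    }
  },
};
```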

Go2.gg: the simplest

Go2.gg is a URL shortener with analytics. It’s the simplest product in the portfolio, but it handles the most traffic (millions of redirects per month).

Go2.gg Architecture (Entire Backend)

1. User hits go2.gg/abc123
2. Worker looks up "abc123" in KV (< 1ms globally)
3. 302 redirect to destination URL
4. Queue: log click (geo, device, referrer) to D1

Total infrastructure for a URL shortener handling millions of redirects: one Worker, one KV namespace, one D1 database, one Queue. Monthly cost: approximately $3.
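
A sketch of what that Worker could look like (binding names and the click payload are assumptions, not Go2.gg's real code):

```ts
// redirect-worker.ts — hypothetical KV-lookup-and-redirect sketch
export interface Env {
  LINKS: KVNamespace; // slug -> destination URL
  CLICKS: Queue;      // async click logging, drained into D1 by a consumer
}

export default {
  async fetch(req: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const slug = new URL(req.url).pathname.slice(1);   // "abc123"
    const destination = await env.LINKS.get(slug);     // served from the edge cache
    if (!destination) return new Response("Not found", { status: 404 });

    // Log the click after the response is sent, so the redirect isn't delayed
    ctx.waitUntil(
      env.CLICKS.send({
        slug,
        country: req.cf?.country,
        referrer: req.headers.get("referer"),
        userAgent: req.headers.get("user-agent"),
        ts: Date.now(),
      })
    );

    return Response.redirect(destination, 302);
  },
};
```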

Findable: semantic search at the edge

Findable is an AI-powered semantic search platform. Users upload documents, and we build a searchable index that understands meaning, not just keywords.

01. Ingest: Document uploaded to R2. Worker extracts text, chunks it into 512-token segments, stores chunks in D1 with metadata.

02. Embed: Workers AI generates embeddings (bge-base-en-v1.5) for each chunk. Embeddings stored in Vectorize for similarity search.

03. Search: User query gets embedded. Vectorize finds top-10 similar chunks. Chunks fed as context to Llama 3 for synthesized answer.

04. Respond: Streamed response via Workers + SSE. User sees answer forming in real-time with source citations linked to original documents.

This is a full RAG pipeline (document ingestion, embedding, vector search, LLM synthesis) running entirely on Cloudflare. No OpenAI. No Pinecone. No separate vector database. Everything at the edge.
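
Here's a compressed sketch of the search step. The model IDs are the ones Workers AI publishes for bge and Llama 3; the binding names and the chunks table are illustrative, not Findable's actual schema.

```ts
// search.ts — hypothetical query side of the RAG pipeline
export interface Env {
  AI: Ai;
  SEARCH_INDEX: VectorizeIndex;
  DB: D1Database;
}

export async function answer(query: string, env: Env): Promise<string> {
  // 1. Embed the query
  const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

  // 2. Find the top-10 most similar chunks in Vectorize
  const matches = await env.SEARCH_INDEX.query(embedding.data[0], { topK: 10 });

  // 3. Pull the chunk text back out of D1
  const ids = matches.matches.map((m) => m.id);
  const placeholders = ids.map(() => "?").join(",");
  const { results } = await env.DB
    .prepare(`SELECT text FROM chunks WHERE id IN (${placeholders})`)
    .bind(...ids)
    .all<{ text: string }>();

  // 4. Ask Llama 3 to synthesize an answer from the retrieved context
  const context = results.map((r) => r.text).join("\n---\n");
  const completion = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: query },
    ],
  });
  return completion.response ?? "";
}
```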

AgentDrive: AI agents on Workers

AgentDrive is the most architecturally interesting product. It’s a framework for deploying AI agents that can use tools, maintain state, and execute multi-step tasks, all running as Cloudflare Workers.

| Agent Component | Cloudflare Service | Why This Service |
|---|---|---|
| Agent runtime | Workers | Stateless compute, global deployment, sub-ms cold starts |
| Conversation memory | Durable Objects | Persistent state per agent session, survives restarts |
| Long-term memory | D1 + Vectorize | Structured facts in D1, semantic retrieval via Vectorize |
| Tool execution | Service Bindings | Agent calls other Workers as "tools" with zero network hop |
| Background tasks | Queues | Agent kicks off async work (send email, scrape URL, etc.) |

The key insight with AgentDrive is Service Bindings. When one Worker calls another Worker in the same account, there’s no HTTP overhead. It’s a direct function call. This means an AI agent can “use tools” (which are just other Workers) with near-zero latency. No API gateway. No network hop. Just a function call at the edge.
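
For illustration, a tool call over a service binding could look like this. The binding and tool names are hypothetical; the wrangler.toml services entry (shown in the comment) is what wires the two Workers together.

```ts
// agent-worker.ts — hypothetical sketch of calling a "tool" Worker via a service binding
// wrangler.toml (agent side):
//   [[services]]
//   binding = "SCRAPER_TOOL"
//   service = "scraper-worker"

export interface Env {
  SCRAPER_TOOL: Fetcher; // another Worker in the same account, exposed as a binding
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { url } = await req.json<{ url: string }>();

    // Looks like an HTTP call, but it's routed directly to the other Worker:
    // no public hostname, no extra network hop.
    const toolResponse = await env.SCRAPER_TOOL.fetch("https://tool/scrape", {
      method: "POST",
      body: JSON.stringify({ url }),
      headers: { "content-type": "application/json" },
    });

    const page = await toolResponse.json();
    return Response.json({ tool: "scraper", result: page });
  },
};
```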


The Cost Breakdown: How $47/Month Runs 12 Products

People don’t believe me when I say the number. So here’s the actual breakdown.

| Item | Monthly Cost |
|---|---|
| Workers (compute across all products) | $5.00 |
| D1 (databases, 14 databases total) | $2.50 |
| KV (key-value reads/writes) | $1.80 |
| R2 (storage, ~120GB across products) | $1.80 |
| Queues (async job processing) | $0.90 |
| Workers AI (inference, with credits) | $0.00 |
| Pages (static hosting) | $0.00 |
| Vectorize (vector search) | $5.00 |
| Domains (12 .com/.gg domains) | $16.00 |
| External APIs (Stripe, email, etc.) | $14.00 |
| Total Monthly Infrastructure | ~$47.00 |

Important Caveat: The $250K Credit

Cloudflare gave me $250K in startup credits through their Workers Launchpad program. This covers Workers AI inference (which would otherwise be my biggest cost) and gives me generous free tiers on everything else. Without credits, my bill would be closer to $200-400/month, still absurdly cheap for 12 products. The R2 zero-egress model alone saves hundreds compared to S3.

How This Compares to Traditional Infrastructure

| Item | AWS / Traditional | Cloudflare | Savings |
|---|---|---|---|
| Compute (12 services) | $240/mo | $5/mo | 98% |
| Database (14 DBs) | $350/mo | $2.50/mo | 99% |
| Storage + CDN (120GB) | $80/mo | $1.80/mo | 98% |
| AI inference | $500/mo | $0 (credits) | 100% |
| DevOps engineer (part-time) | $3,000/mo | $0 | 100% |
| Total | ~$4,170/mo | ~$47/mo | 99% |

The Shared Boilerplate: Ship a New Product in 3 Hours

The real superpower isn’t any single Cloudflare service. It’s the shared boilerplate I’ve built across all 12 products. Every product starts from the same template, which means spinning up product #13 takes hours, not weeks.

| Boilerplate Layer | What's Included | Time Saved |
|---|---|---|
| Auth | OAuth (Google, GitHub), magic links, session management via KV | ~8 hours |
| Payments | Stripe Checkout, webhooks, subscription management, usage metering | ~12 hours |
| Email | Transactional emails, welcome sequences, billing notifications | ~4 hours |
| API layer | Rate limiting (KV), API key management, CORS, error handling | ~6 hours |
| Frontend | Landing page, dashboard shell, settings, billing portal | ~16 hours |
| Analytics | Event tracking, user activity logging, basic dashboards | ~4 hours |
| Total | | ~50 hours |

50 hours of work that I never have to redo. This is the real moat of the venture studio model: each product makes the next one faster to build. This is also what became ShipQuest, my SaaS boilerplate product. I’m literally selling my own internal tooling.


The Tradeoffs: What Cloudflare Can’t Do (Yet)

I’m not going to pretend this stack is perfect. There are real limitations, and understanding them is critical for deciding if this approach works for your use case.

D1 is not Postgres

D1 is SQLite at the edge. It’s phenomenal for reads and simple writes, but it doesn’t have full Postgres features: no JSONB operators, no materialized views, no row-level security. For 80% of SaaS use cases, SQLite is more than enough. For the other 20%, you’ll need to supplement with Turso or Neon.

Workers have CPU time limits

Workers get 30 seconds of CPU time on the paid plan. This means heavy computation (image processing, large file parsing, complex data transforms) must be offloaded to Queues or Durable Objects. You can’t just “run a long script.” You have to architect for async.

Workers AI model selection is limited

Workers AI supports Llama, Mistral, Whisper, FLUX, and a handful of other models, but not GPT-4, Claude, or Gemini. For commodity tasks (summarization, classification, transcription), Workers AI is perfect. For frontier intelligence, I still call OpenAI or Anthropic APIs from the Worker.

No native WebSocket support on Pages

If you need real-time features (live updates, collaborative editing), you need Durable Objects, which have a learning curve. It’s not as simple as spinning up a Socket.io server. The mental model is different: you’re creating globally unique objects that hold state and accept connections.
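
To give a feel for that mental model, here's a minimal Durable Object sketch. The class, binding, and message handling are made up for illustration; the point is that one globally unique object holds the connections and the state.

```ts
// room.ts — hypothetical Durable Object holding WebSocket connections for one "room"
// A Worker routes to it with env.ROOM.idFromName(roomId) and forwards the request to the stub.
export class Room {
  private sessions: WebSocket[] = [];

  constructor(private state: DurableObjectState) {}

  async fetch(req: Request): Promise<Response> {
    // Each connecting client upgrades to a WebSocket held by this one object
    const { 0: client, 1: server } = new WebSocketPair();
    server.accept();
    this.sessions.push(server);

    // Broadcast every message to all clients connected to this room
    server.addEventListener("message", (event) => {
      for (const ws of this.sessions) ws.send(event.data as string);
    });
    server.addEventListener("close", () => {
      this.sessions = this.sessions.filter((ws) => ws !== server);
    });

    return new Response(null, { status: 101, webSocket: client });
  }
}
```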

My Rule of Thumb

When to use Cloudflare vs. When to reach elsewhere

Use Cloudflare when: your app is request-response (APIs, web apps, webhooks), your data model fits SQLite, your AI needs are commodity (transcription, summarization, embeddings), and you value speed-to-deploy over feature completeness.

Reach elsewhere when: you need complex relational queries (Postgres), GPU-intensive computation (Replicate, Modal), frontier AI models (OpenAI, Anthropic APIs), or real-time multiplayer state (consider Liveblocks or PartyKit on top of Durable Objects).


The Deployment Pipeline: How I Actually Ship

Every product follows the same deployment process. No CI/CD configuration. No Docker. No Kubernetes.

Code (push to GitHub) → Build (Pages auto-builds) → Preview (branch deploy URL) → Merge (PR to main) → Live (global in ~30s)

That’s it. Push code, it’s live globally in 30 seconds. Every push to a non-main branch creates a preview URL that I can share with beta users. Merging to main deploys to production. No staging environment needed: the preview URL is the staging environment.

For Workers (backend APIs), I use wrangler deploy, which pushes the Worker to all 300+ edge locations in under 15 seconds. No rolling deploys. No blue-green. It’s just… instant.


The Wrangler.toml Pattern: One Config to Rule Them All

Every product has a wrangler.toml that defines its entire infrastructure. Here’s a simplified version of what AudioPod AI’s looks like:

wrangler.toml (simplified)

# Compute
name = "audiopod-api"
compatibility_date = "2025-12-01"

# Database
[[d1_databases]]
binding = "DB"
database_name = "audiopod-prod"

# Key-Value
[[kv_namespaces]]
binding = "SESSIONS"

# Object Storage
[[r2_buckets]]
binding = "AUDIO_FILES"
bucket_name = "audiopod-audio"

# Async Processing
[[queues.producers]]
binding = "PROCESSING_QUEUE"
queue = "audiopod-jobs"

# AI
[ai]
binding = "AI"

# Vector Search
[[vectorize]]
binding = "SEARCH_INDEX"
index_name = "audiopod-transcripts"

This single file declares the entire backend infrastructure. The database, the storage, the queue, the AI, the vector index, all in 30 lines. When I create a new product, I duplicate this file, change the names, run wrangler deploy, and I have a fully provisioned backend in under a minute.
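
Those bindings surface as plain properties on the Worker's env object. A hypothetical handler using them might look like this (table and key names are illustrative):

```ts
// Hypothetical handler showing how wrangler.toml bindings surface in code
export interface Env {
  DB: D1Database;               // [[d1_databases]]     binding = "DB"
  SESSIONS: KVNamespace;        // [[kv_namespaces]]    binding = "SESSIONS"
  AUDIO_FILES: R2Bucket;        // [[r2_buckets]]       binding = "AUDIO_FILES"
  PROCESSING_QUEUE: Queue;      // [[queues.producers]] binding = "PROCESSING_QUEUE"
  AI: Ai;                       // [ai]                 binding = "AI"
  SEARCH_INDEX: VectorizeIndex; // [[vectorize]]        binding = "SEARCH_INDEX"
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const session = await env.SESSIONS.get("session:abc");             // KV read
    const user = await env.DB.prepare("SELECT * FROM users WHERE id = ?")
      .bind(session)
      .first();                                                         // D1 query
    return Response.json({ user });
  },
};
```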

Compare this to setting up equivalent infrastructure on AWS. You’d need CloudFormation templates, IAM policies, VPC configurations, security groups, and a weekend of your life you’ll never get back.


Patterns I’ve Learned Running 12 Products on This Stack

Pattern 1: Use KV as Your First Database

When prototyping a new product, skip D1 entirely. Use KV. It’s key-value, it’s global, it’s fast, and the data model forces you to think about access patterns upfront. When you need relational queries, migrate to D1. But for MVPs (user profiles, feature flags, session tokens), KV is perfect.
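
For example, a prototype's entire "user database" can be a handful of KV calls. This is a sketch; the key naming scheme is just a convention I'd pick, not anything KV enforces.

```ts
// Hypothetical KV-as-first-database sketch
export interface Env { USERS: KVNamespace }

export async function saveProfile(env: Env, userId: string, profile: object) {
  // One key per user; the value is just JSON
  await env.USERS.put(`user:${userId}`, JSON.stringify(profile));
}

export async function getProfile(env: Env, userId: string) {
  return env.USERS.get<{ email: string; plan: string }>(`user:${userId}`, "json");
}

export async function setFlag(env: Env, flag: string, enabled: boolean) {
  // Feature flags fit the same model, with a TTL if you want them to expire
  await env.USERS.put(`flag:${flag}`, String(enabled), { expirationTtl: 86_400 });
}
```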

Pattern 2: Queue Everything That Can Wait

If the user doesn’t need the result immediately, put it in a Queue. Email notifications, analytics events, AI processing, webhook deliveries, all queued. This keeps your API response times under 50ms and makes your system naturally resilient to downstream failures.

Pattern 3: One Worker Per Domain, Not Per Route

I used to create separate Workers for each API endpoint. Terrible idea. Now, each product gets one Worker that handles all routes with a router (I use Hono). This keeps the deployment unit simple and lets routes share middleware (auth, rate limiting, logging).
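
A sketch of that shape with Hono, with made-up routes and middleware:

```ts
// Hypothetical single-Worker router using Hono
import { Hono } from "hono";
import { cors } from "hono/cors";

type Env = { Bindings: { DB: D1Database; SESSIONS: KVNamespace } };

const app = new Hono<Env>();

// Shared middleware: every route gets CORS and auth for free
app.use("*", cors());
app.use("/api/*", async (c, next) => {
  const token = c.req.header("authorization");
  if (!token || !(await c.env.SESSIONS.get(`session:${token}`))) {
    return c.json({ error: "unauthorized" }, 401);
  }
  await next();
});

app.get("/api/projects", async (c) => {
  const { results } = await c.env.DB.prepare("SELECT * FROM projects").all();
  return c.json(results);
});

app.post("/api/projects", async (c) => {
  const body = await c.req.json<{ name: string }>();
  await c.env.DB.prepare("INSERT INTO projects (name) VALUES (?)").bind(body.name).run();
  return c.json({ ok: true }, 201);
});

export default app; // the Hono app is the Worker's fetch handler
```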

Pattern 4: Cache AI Responses Aggressively

If someone asks Findable “What is your refund policy?” and another user asks the same question 5 minutes later, there’s no reason to run the LLM again. I cache AI responses in KV with a hash of the query + context as the key. Cache hit rate on commodity AI queries is 30-40%, which directly translates to cost savings and speed.
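
Roughly like this; the hash scheme, TTL, and model ID are my own illustrative choices, not a prescribed pattern:

```ts
// Hypothetical KV cache keyed on a hash of query + context
export interface Env { AI_CACHE: KVNamespace; AI: Ai }

async function sha256(text: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(text));
  return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, "0")).join("");
}

export async function cachedAnswer(env: Env, query: string, context: string) {
  const key = `ai:${await sha256(query + context)}`;

  const hit = await env.AI_CACHE.get(key);
  if (hit) return hit; // cache hit: no LLM call at all

  const result = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
    messages: [
      { role: "system", content: context },
      { role: "user", content: query },
    ],
  });
  const answer = result.response ?? "";

  // Cache for a day; stale answers are fine for commodity questions
  await env.AI_CACHE.put(key, answer, { expirationTtl: 86_400 });
  return answer;
}
```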

Pattern 5: Use R2 as Your Data Lake

Every product dumps raw event data into R2 as newline-delimited JSON. It’s my poor man’s data warehouse. When I need to analyze cross-product metrics on Sunday ops day, I pull the files and process them locally. Zero cost for storage, zero egress fees for retrieval. R2’s S3-compatible API means any tool that reads from S3 works automatically.
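
The write side is tiny. A sketch, with the bucket binding and key layout as assumptions:

```ts
// Hypothetical event sink: append newline-delimited JSON into R2, partitioned by day
export interface Env { EVENTS: R2Bucket }

export async function dumpEvents(env: Env, product: string, events: object[]) {
  const day = new Date().toISOString().slice(0, 10);             // e.g. "2025-12-01"
  const key = `${product}/${day}/${crypto.randomUUID()}.ndjson`; // one object per batch
  const body = events.map((e) => JSON.stringify(e)).join("\n");
  await env.EVENTS.put(key, body, {
    httpMetadata: { contentType: "application/x-ndjson" },
  });
}
// Later, any S3-compatible tool can read these files back out via R2's S3 API.
```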


The Philosophy: Infrastructure as a Weapon

The solo founder’s biggest enemy isn’t competition. It’s operational drag.

Every minute you spend configuring Nginx, debugging Docker networking, patching security vulnerabilities, or scaling databases is a minute you’re not spending on the two things that actually matter: building the product and getting it in front of users. Cloudflare eliminates operational drag almost entirely. Not because it’s the best at any single thing, but because it’s good enough at everything, and it all works together without glue code, without config files, and without a DevOps team.

I’m not saying Cloudflare is the right choice for every company. If you’re building a database company, use AWS. If you’re training ML models, use GCP. If you need enterprise compliance, use Azure.

But if you’re a solo founder who needs to ship 12 products, iterate fast, keep costs near zero, and never think about infrastructure, there is nothing else that comes close.

The stack isn’t the product. The stack is what lets you build the product.

Building on Cloudflare? I share architecture decisions weekly.
