Slug: self-hosted-ai-stack-under-500
Series: Self-Hosted Everything #1
Tags: Self-Hosted, AI Infrastructure, Coolify, Supabase, vLLM
Target length: 3,500 words
CTA: Gumroad — Self-Hosted AI Infrastructure Kit
Publish: Mar 25, 2026 — 9:00 AM IST
Platform: mrgulshanyadav.misar.blog/articles/self-hosted-ai-stack-under-500
A full production AI stack — LLM inference, vector database, auth, cache, email, CI/CD — runs for under $500/month on self-hosted VPSes
Managed equivalents (Supabase.com, Redis Cloud, Vercel, Resend, GitHub Actions) would cost $1,800–$2,400/month for the same workload
The crossover point is roughly 3 products or $800–1,000+/month in managed spend — below that, managed wins on time savings
vLLM beats Ollama for concurrent production traffic; Coolify removes most of the DevOps overhead
Self-hosting requires 4–8 hours/week of ops time — budget this honestly before committing
I run five AI products — MisarIO, MisarMail, MisarDev, MisarBlog, and Assisters — on two Hetzner VPS instances. The main VPS (€38/month, 8-core, 32GB RAM) runs Coolify as the deployment platform, Forgejo for Git hosting, Redis for caching and queues, and Mailcow for transactional email. A second Hetzner VPS (€54/month) runs five separate Supabase stacks — one per product — each with its own PostgreSQL instance, GoTrue auth, PostgREST API, and S3-compatible storage. For LLM inference, I use a GPU instance (Hetzner GPU Cloud, pay-per-use, ~$80–120/month depending on workload) running vLLM with Llama 3.3 70B. Total monthly cost: approximately $210–250 for fixed infrastructure, $80–120 for GPU inference = $290–370/month total.
That's the complete picture. Everything below explains the why, the how, and what you'd actually need to replicate it.
Most self-hosting arguments start with "you own your data" or "no vendor lock-in." Those are real benefits but they're not why I made the decision. The actual reasons are more concrete.
1. The managed-service math stops adding up at multiple products.
When you run one product, managed services are clearly cheaper. Supabase.com at $25/month, Vercel Pro at $20/month, Redis Cloud at $15/month, Resend at $20/month — you're at $80/month before you've thought about it. That's fine for one product.
Now add a second product. Same costs again. Third product. The managed services scale per-project, not per-company. By the time I had three AI products with real workloads, managed was going to cost $400–500/month — just for the basic infrastructure layer, before LLM API costs.
Self-hosted infrastructure amortises across all products. The Supabase VPS running five stacks costs €54/month total. Five managed Supabase Pro plans would cost $125/month.
2. Predictability over elasticity.
Managed services bill on usage. That's great when you're growing and don't know your volumes yet. It's less great when a bug makes an LLM feature call the OpenAI API 100,000 times and you get a $900 bill for it.
Self-hosted infrastructure has a fixed cost. My Redis instance doesn't bill me more when I run 500,000 cache operations. My PostgreSQL instance doesn't charge per query. The ceiling is defined by VPS capacity, not by a pricing page I need to re-read every month.
3. Data sovereignty is a real constraint for AI products, not a philosophy.
If you're building AI products that process customer data, you need to think about where that data lives. For EU customers, GDPR applies. For Indian users, India's DPDP Act (2023) has specific provisions about cross-border data transfers. For healthcare or legal use cases, it gets stricter.
Running Supabase on a Hetzner VPS in Helsinki means I know exactly where every byte is stored. I can point to a specific server in a specific data centre. That's not possible when you're using Supabase.com — you're trusting their infrastructure choices.
Hetzner is the best value compute I've found for European-hosted infrastructure. The pricing is roughly 30–40% cheaper than DigitalOcean or Linode for equivalent specs, and the network performance is solid.
Main VPS (CX41, €38/month):
8 vCPUs, 32GB RAM, 240GB NVMe
Runs: Coolify, Forgejo, act_runner (CI), Redis, Mailcow, Nginx/Traefik
Suitable for: up to ~5 apps with moderate traffic, Git hosting, background jobs
Supabase VPS (CCX22, €54/month):
4 dedicated vCPUs, 16GB RAM, 240GB SSD
Runs: 5× Supabase stacks (one per product) in Docker
Port mapping: 5433–5437 for PostgreSQL, 8001–8005 for Kong API
Storage: separate Hetzner Volume (€4.80/month per 100GB) mounted for each product
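The port scheme is mechanical, so it can be generated rather than tracked by hand. A minimal sketch — the product names here are illustrative:

```python
# Assign each product its own PostgreSQL and Kong port, following the
# 5433-5437 / 8001-8005 layout described above. Product names are examples.
PRODUCTS = ["misario", "misarmail", "misardev", "misarblog", "assisters"]

def port_map(products, pg_base=5433, kong_base=8001):
    """Return {product: {"postgres": port, "kong": port}} with no collisions."""
    return {
        name: {"postgres": pg_base + i, "kong": kong_base + i}
        for i, name in enumerate(products)
    }

ports = port_map(PRODUCTS)
# e.g. ports["misario"] == {"postgres": 5433, "kong": 8001}
```

Generating the map once and feeding it into each stack's compose file keeps the per-product offsets from drifting as products are added.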
The two-VPS architecture separates concerns cleanly. If the main VPS has an issue, the databases are unaffected. If I need to scale compute for apps, I don't touch the database VPS.
Coolify is a self-hosted PaaS that handles Docker deployment, reverse proxy (Traefik), SSL certificate management (Let's Encrypt), environment variable management, and basic monitoring. Think Vercel or Railway — but running on your own server.
What Coolify gets right:
One-click deployment from Git repository (Forgejo or GitHub)
Zero-downtime deployments with health checks
Automatic HTTPS for every app
Built-in deployment logs
Environment variable management per app
What Coolify doesn't replace:
Application-level monitoring (you still need something like Prometheus + Grafana, or a managed APM)
Advanced CI/CD pipelines (Forgejo + act_runner handles this separately)
Database management (Supabase is separate, not managed by Coolify)
Setup time for Coolify: about 30 minutes on a fresh VPS. Ongoing maintenance: near zero.
Supabase open-source is the same software as Supabase.com, packaged as a Docker Compose stack. You get PostgreSQL, GoTrue (auth), PostgREST (auto-generated REST API), Realtime (websockets), Storage (S3-compatible), and Supabase Studio (the dashboard).
The main consideration is running multiple Supabase stacks on one VPS. Each stack consumes about 1–1.5GB RAM at idle. With 16GB RAM, you can comfortably run 8–10 stacks. Five stacks use about 7GB of RAM, leaving headroom for queries and connections.
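As a rough capacity check, the idle-RAM figure above turns into a one-line planning calculation. A sketch — the 4GB reserve for the OS and query headroom is an assumption, not a measured number:

```python
# Rough capacity check for N Supabase stacks on one VPS.
# The 1-1.5GB-per-stack idle figure is from the text; the reserve is a
# planning assumption, not a measurement.
def stacks_that_fit(total_ram_gb, per_stack_gb=1.5, reserve_gb=4):
    """How many stacks fit, keeping `reserve_gb` free for queries and the OS."""
    return int((total_ram_gb - reserve_gb) // per_stack_gb)

print(stacks_that_fit(16))  # 8 at the pessimistic 1.5GB-per-stack figure
```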
Running a Supabase stack per product means:
Complete database isolation — a bug in one product can't corrupt another's database
Separate auth instances — no shared user tables
Independent backup schedules
Separate storage buckets per product
The tradeoff: you manage five PostgreSQL instances instead of one. Backup scripts run per-instance. Schema migrations run per-product. It's more to manage, but the isolation is worth it at the product level.
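The per-instance backup chore is scriptable. A sketch that emits one pg_dump command per stack — the container names and backup path are hypothetical, so adapt them to your actual compose project names:

```python
# Generate one pg_dump invocation per Supabase stack.
# Container names ("supabase-db-<product>") and the backup path are
# hypothetical examples, not the article's actual values.
from datetime import date

def backup_commands(products, backup_dir="/mnt/backups"):
    stamp = date.today().isoformat()
    return [
        f"docker exec supabase-db-{p} pg_dump -U postgres postgres "
        f"| gzip > {backup_dir}/{p}-{stamp}.sql.gz"
        for p in products
    ]

for cmd in backup_commands(["misario", "misarmail"]):
    print(cmd)
```

Dropping the generated commands into a nightly cron keeps the five backup schedules independent without five separate scripts.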
PostgreSQL configuration I use across all stacks:
max_connections = 100
shared_buffers = 256MB
work_mem = 16MB
maintenance_work_mem = 128MB
This is conservative — the defaults work fine for most early-stage products.
A single Redis instance (the misar-redis container) runs on the main VPS and serves all five products via Docker network. All apps connect to it via redis://:<password>@misar-redis:6379 — internal Docker network, never exposed publicly.
Using Redis for:
Session storage (faster than PostgreSQL for auth checks)
API rate limiting (sliding window, per-user)
Job queues for background tasks (email sends, LLM calls that shouldn't block the request cycle)
Temporary caching for expensive database queries
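The rate-limiting pattern is worth sketching. In production the per-user timestamps live in a Redis sorted set (ZADD / ZREMRANGEBYSCORE / ZCARD); an in-memory dict stands in here so the logic runs without a server:

```python
# Sliding-window rate limiter logic. In production the timestamp list lives
# in a Redis sorted set (ZADD / ZREMRANGEBYSCORE / ZCARD); an in-memory dict
# stands in here so the sketch is runnable standalone.
import time
from collections import defaultdict

class SlidingWindowLimiter:
    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(list)  # user_id -> [timestamps]

    def allow(self, user_id):
        now = self.clock()
        cutoff = now - self.window
        # Drop timestamps outside the window (ZREMRANGEBYSCORE in Redis).
        self.hits[user_id] = [t for t in self.hits[user_id] if t > cutoff]
        if len(self.hits[user_id]) >= self.limit:
            return False
        self.hits[user_id].append(now)  # ZADD in Redis
        return True
```

The Redis version is atomic when wrapped in a pipeline or Lua script; this sketch only shows the window logic itself.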
Redis is simple to run self-hosted. No managed Redis service I've seen is worth the cost at moderate volumes. Redis Cloud's cheapest paid plan is $15/month for 250MB. The Redis instance on my VPS uses about 180MB for five products combined.
Mailcow is a self-hosted email server that handles SMTP, IMAP, and DKIM/DMARC/SPF configuration. I use it for transactional email via the MisarMail API layer (which is a thin API wrapper over the SMTP connection).
Running your own mail server in 2026 is viable if you do the DNS configuration correctly. The three requirements:
PTR record (reverse DNS) set on your server IP — Hetzner allows this
SPF, DKIM, and DMARC records configured in Cloudflare
IP not on any major blocklists (check on mxtoolbox.com before deploying)
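The DNS requirements above can be sanity-checked in code before flipping traffic to the new mail server. A sketch of shape checks — the record strings are illustrative examples, not values to copy:

```python
# Quick shape checks for the email DNS records described above.
# The record strings in the prints are illustrative, not values to copy.
def looks_like_spf(txt):
    return txt.startswith("v=spf1") and txt.rstrip().endswith(("~all", "-all"))

def looks_like_dmarc(txt):
    return txt.startswith("v=DMARC1") and "p=" in txt

def looks_like_dkim(txt):
    return txt.startswith("v=DKIM1") and "p=" in txt  # p= carries the public key

print(looks_like_spf("v=spf1 ip4:203.0.113.10 -all"))                            # True
print(looks_like_dmarc("v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"))  # True
```

These checks catch malformed records; actual deliverability still needs the live lookups mxtoolbox.com performs.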
I haven't had deliverability issues with this setup for transactional email. For high-volume marketing email, I'd reconsider.
Forgejo is a self-hosted Git platform (forked from Gitea). It runs on the main VPS, handles all source code, and integrates with act_runner for CI/CD.
act_runner executes Forgejo Actions workflows — the same YAML format as GitHub Actions. All CI runs on the VPS instead of GitHub's cloud runners.
Why this matters for AI products: LLM application tests often involve API keys, model configs, and infrastructure-specific environment variables that you don't want to expose to a third-party CI system. Running CI on your own infra keeps those credentials internal.
This is the variable-cost part of the stack. Hetzner GPU Cloud (A100 40GB, €2.49/hour) runs vLLM serving Llama 3.3 70B for inference.
I don't run a persistent GPU instance. Instead, I use a startup script that provisions the instance when a product feature needs LLM inference capacity, and scales it down during low-traffic periods. Average utilisation across the month works out to about 35–45 hours of GPU time — roughly €87–112/month at €2.49/hour.
Why vLLM, not Ollama: Ollama is the right choice for local development and single-user scenarios. For concurrent production traffic — multiple users hitting your API simultaneously — vLLM's continuous batching and PagedAttention algorithms give significantly better throughput. At 10 concurrent requests, vLLM handles the queue in roughly the same time it takes Ollama to process 3. The difference compounds under real load.
vLLM deployment is more complex than Ollama (requires CUDA drivers, more memory management config), but it's the production-grade choice.
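One thing that softens the complexity: vLLM exposes an OpenAI-compatible API, so switching an app from a hosted model is mostly a base-URL change. A sketch of the request shape — the host, port, and model name are assumptions for illustration:

```python
# Build a request for vLLM's OpenAI-compatible endpoint. vLLM serves
# /v1/chat/completions, so any OpenAI-style client works against it.
# The host/port and model name below are illustrative assumptions.
import json

VLLM_URL = "http://gpu-host:8000/v1/chat/completions"  # hypothetical host

def chat_payload(prompt, model="meta-llama/Llama-3.3-70B-Instruct",
                 max_tokens=512, temperature=0.2):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

body = json.dumps(chat_payload("Summarise this ticket."))
# POST `body` to VLLM_URL with Content-Type: application/json once the
# GPU instance is up — e.g. via requests.post(VLLM_URL, data=body, ...).
```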
| Service | Self-Hosted Cost | Managed Equivalent | Managed Cost |
|---|---|---|---|
| App hosting (5 apps) | Coolify on CX41 — €38/mo | Vercel Pro × 5 | $100/mo |
| Database + Auth (5 products) | Supabase VPS (CCX22) — €54/mo | Supabase Pro × 5 | $125/mo |
| Redis | Included in main VPS | Redis Cloud Essentials × 5 | $75/mo |
| Email (transactional) | Mailcow on main VPS | Resend Pro | $89/mo |
| Git hosting | Forgejo on main VPS | GitHub Team | $16/mo |
| CI/CD | act_runner on main VPS | GitHub Actions minutes | $30–50/mo |
| LLM inference | vLLM on GPU Cloud — ~$100/mo | OpenAI GPT-4o (equiv. volume) | $800–1,200/mo |
| Storage (500GB total) | Hetzner Volumes — €24/mo | S3/Supabase Storage | $60/mo |
| **Total** | ~$310–370/mo | — | ~$1,295–1,655/mo |
The LLM inference column deserves attention. At equivalent inference volume, self-hosted Llama 3.3 70B costs roughly $0.0008–0.001 per 1K tokens (GPU time amortised), while GPT-4o runs $0.005–0.015 per 1K tokens input/output. At 50M tokens/month, that's roughly $40–50 self-hosted versus $250–750 on GPT-4o — on that one service alone.
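Working through those per-1K-token rates at 50M tokens/month:

```python
# Monthly inference cost from the per-1K-token rates quoted above.
def monthly_cost(tokens, rate_per_1k):
    return tokens / 1000 * rate_per_1k

TOKENS = 50_000_000
self_hosted = tuple(round(monthly_cost(TOKENS, r)) for r in (0.0008, 0.001))
gpt4o = tuple(round(monthly_cost(TOKENS, r)) for r in (0.005, 0.015))

print(self_hosted)  # (40, 50)
print(gpt4o)        # (250, 750)
```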
Self-hosting isn't free. The costs that don't appear in the table above:
Engineering time: 4–8 hours/week ongoing. This includes monitoring alerts, package/security updates, occasional incident response, and infrastructure changes. If your time is worth $100/hour, that's $400–800/week of implicit cost — far more than the managed alternative. Self-hosting only makes financial sense if you have the engineering capacity to absorb this.
On-call responsibility. If Redis goes down at 2am, it's your problem. If Supabase.com has an outage at 2am, it's their problem. Managed services include a support and SRE team in the price. Self-hosted means you are the SRE.
Managed services move faster. Supabase.com ships new features — branching, edge functions, realtime improvements — faster than the open-source release cycle. If you need those features immediately, managed has an advantage.
Cold-start complexity. Setting up this stack from scratch takes about 2–3 weeks of focused work. Supabase.com + Vercel + Redis Cloud takes about 2 hours.
Self-hosting starts making sense when two conditions are met:
Condition 1: You're running 3+ products or services on the same infrastructure. Fixed overhead amortises across all products. Below three products, the engineering time cost exceeds the managed service savings for most teams.
Condition 2: Your managed service spend exceeds $800–1,000/month. At that level, self-hosted infrastructure pays for itself within 3–4 months including setup time. Below that, the time investment rarely pencils out unless you have a specific compliance or data sovereignty requirement.
If you're at one product, under $500/month in managed costs, and don't have a specific technical or regulatory reason to self-host — use managed services. They're genuinely better for that stage.
If you're at three or more products, spending $1,000+/month on infrastructure, and have the engineering bandwidth to maintain a VPS — self-hosting is the right call.
This is the order I'd follow if I were setting this up again:
Week 1: Compute and networking
Provision Hetzner CX41 (main VPS) + CCX22 (Supabase VPS)
Set DNS: point all product domains to main VPS via Cloudflare
Install Coolify on main VPS
Configure SSH keys and firewall rules (ufw: 22, 80, 443 only public; all internal via Docker network)
Week 2: Core services
Install Forgejo and push all source code repositories
Configure act_runner for CI
Set up Mailcow — configure DKIM, DMARC, SPF, PTR record
Deploy Redis container on main VPS
Test email deliverability (mxtoolbox.com SPF/DKIM checks)
Week 3: Supabase and apps
Deploy Supabase stacks on Supabase VPS — one per product
Run database migrations per product
Deploy apps via Coolify from Forgejo repositories
Configure environment variables per app
Verify health checks pass
Ongoing:
Configure GPU Cloud instance with vLLM startup script
Set up basic monitoring (Coolify has built-in; add Uptime Kuma for external checks)
Automate Supabase backups to Hetzner Object Storage
The biggest operational risk with self-hosted infrastructure isn't the setup — it's not knowing something is broken until a user tells you.
Minimum monitoring stack:
Uptime Kuma (self-hosted, free) — external HTTP checks every 60s for all apps
Coolify built-in metrics — CPU, memory, network per container
PostgreSQL query logs — enabled per Supabase stack, stored 7 days
Redis INFO — check memory usage daily via cron
Alerts I actually act on:
Any app returning non-200 status for > 5 minutes → Telegram notification
Supabase VPS memory > 85% → Slack alert (means a runaway query)
Main VPS disk > 80% → immediate action (Mailcow logs grow quickly)
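Those three rules reduce to simple threshold checks. A sketch — the thresholds come from above, but the metrics-dict shape is an assumption:

```python
# Evaluate the alert rules described above against a metrics snapshot.
# Threshold values come from the text; the metric names are assumptions.
def fired_alerts(m):
    alerts = []
    if m.get("non_200_minutes", 0) > 5:
        alerts.append("app-down: notify Telegram")
    if m.get("supabase_mem_pct", 0) > 85:
        alerts.append("supabase-memory: notify Slack (runaway query?)")
    if m.get("main_disk_pct", 0) > 80:
        alerts.append("main-disk: act now (check Mailcow logs)")
    return alerts

print(fired_alerts({"supabase_mem_pct": 91}))
# ['supabase-memory: notify Slack (runaway query?)']
```

Uptime Kuma and a small cron job feeding this kind of check cover everything in the list above.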
Monitoring setup takes about 4 hours. Skipping it costs you hours of debugging the first time something silently fails.
This article covered the full stack overview. Upcoming in Self-Hosted Everything:
Part 2: Replace Vercel + Supabase.com + Redis Cloud — the exact migration path, with specific commands
Part 3: vLLM vs Ollama in production — benchmarks, configuration, and when each is right
Part 4: Mailcow self-hosted email — end-to-end setup with deliverability verification
Part 5: Forgejo over GitHub — what the migration looks like and what you lose
Full AI infrastructure stack: ~$310–370/month self-hosted vs $1,300–1,650/month managed equivalents
The crossover point: 3+ products and $800+/month managed spend
Coolify handles the deployment layer; Supabase handles data; vLLM handles inference
Engineering overhead is 4–8 hours/week — this is the real cost, not the VPS bill
Don't self-host if you're at one product with manageable managed costs. Do self-host if you're running multiple products and have the bandwidth to maintain it.
If you want to skip the 3-week setup and start from a working template, I've packaged the full infrastructure configuration — Docker Compose files for every service, Coolify configuration, Supabase VPS setup scripts, vLLM deployment scripts, and the monitoring stack — into the Self-Hosted AI Infrastructure Kit.
It's the exact setup documented in this article, ready to deploy.
→ Self-Hosted AI Infrastructure Kit — $19
AI systems builder · 7 years in production. RAG, self-hosted infra, agent architecture. 📬 Deep-dives → mrgulshanyadav.substack.com