
Learn how to replace expensive background-removal APIs with a low-cost home GPU rig, using open-source models and Cloudflare R2 for privacy-first image processing.
How I Built a Budget-Friendly Background Removal Service on a Home GPU Rig
TL;DR
- High-cost cloud APIs (≈$0.20/image) can be replaced by a self-hosted rig for ≈$0.01/image.
- A nine-GPU P106 cluster in Belarus costs ≈$26/month in electricity, far cheaper than cloud rates.
- Use FastAPI, Redis (or a simple task queue), Cloudflare R2, and open-source models (BiRefNet, Real-ESRGAN, LaMa) for a scalable, privacy-first pipeline.
- Cooling, power supply, and DDoS protection are the main hardware risks.
- The next step? Scale to a dedicated GPU server or expose an API for developers.
Published by Brav
Why this matters
When I first started a side project that required background removal, I hit two walls: the API cost was bleeding money, and every image I sent out for processing left my apartment's network, with no guarantee that the third party would delete the originals. The numbers were brutal: remove.bg charges about $0.20 per image (remove.bg — Pricing (2026)), so a thousand images a month would cost me $200.
I also discovered that moving to the cloud (a DigitalOcean droplet, an AWS GPU instance, or a managed GPU service) did not improve the price. A $20/month DigitalOcean droplet costs a fraction of a cloud GPU instance (DigitalOcean — Droplet Pricing (2026)), but it still added $20 to my bill, and I was still paying the same $0.20 per image because the images were still going to a third party.
The real comparison came down to two numbers: the API bill and the electricity bill of a home server. Electricity in Belarus costs only $0.06–$0.09 per kWh (PVKnowhow — Belarus Electricity Price (2026)), and my rig pulls about 600 W on average, or roughly 432 kWh per month (0.6 kW × 24 h × 30 days). At the low end of that price range this works out to about $26 a month in electricity, and about $39 at the high end, a far cry from a $200 monthly API bill.
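The electricity arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation using the figures quoted above; the function names and the 30-day month are my own assumptions.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, rig running around the clock


def monthly_energy_kwh(avg_watts: float) -> float:
    """Energy used in a month at a constant average draw, in kWh."""
    return avg_watts * HOURS_PER_MONTH / 1000


def monthly_cost_usd(avg_watts: float, usd_per_kwh: float) -> float:
    """Monthly electricity cost for a given average draw and tariff."""
    return monthly_energy_kwh(avg_watts) * usd_per_kwh


if __name__ == "__main__":
    kwh = monthly_energy_kwh(600)          # 600 W average draw from the article
    low = monthly_cost_usd(600, 0.06)      # low end of the quoted tariff range
    high = monthly_cost_usd(600, 0.09)     # high end of the quoted tariff range
    print(f"{kwh:.0f} kWh/month, ${low:.2f} to ${high:.2f} per month")
```

The $26 figure corresponds to the $0.06/kWh end of the range; at $0.09/kWh the same load costs closer to $39.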
Core concepts
The GPU economy
I started with an inexpensive NVIDIA P106, a 6 GB Pascal card that was originally sold as a mining GPU. A used card costs roughly $22 on eBay (Nvidia P106 — eBay listing (2026)). Each card draws about 100 W under load (Nvidia P106 — TechPowerUp spec (2026)), so nine cards can peak near 900 W. Averaged over a month the rig pulls about 600 W, which translates to roughly $26/month in Belarusian electricity (PVKnowhow — Belarus Electricity Price (2026)).
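To put the hardware cost in context, here is a rough payback estimate. It assumes the 1,000 images/month workload from the motivating example, counts only the GPUs (no PSU, case, or host machine), and uses helper names I made up for illustration.

```python
def hardware_cost_usd(cards: int, usd_per_card: float) -> float:
    """Upfront cost of the GPUs alone (assumption: other parts excluded)."""
    return cards * usd_per_card


def peak_draw_watts(cards: int, watts_per_card: float) -> float:
    """Worst-case draw if every card is under load at once."""
    return cards * watts_per_card


def payback_months(hardware_usd: float, api_usd_per_month: float,
                   electricity_usd_per_month: float) -> float:
    """Months until the saved API fees cover the hardware purchase."""
    savings = api_usd_per_month - electricity_usd_per_month
    if savings <= 0:
        raise ValueError("no monthly savings, the rig never pays for itself")
    return hardware_usd / savings


if __name__ == "__main__":
    gpus = hardware_cost_usd(9, 22)            # nine used P106 cards at about $22
    months = payback_months(gpus, 200, 26)     # $200 API bill vs $26 electricity
    print(f"GPUs: ${gpus:.0f}, peak draw: {peak_draw_watts(9, 100):.0f} W, "
          f"payback: {months:.1f} months")
```

Under those assumptions the cards pay for themselves in a little over a month; a real build will take longer once the rest of the hardware is included.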
Software stack
- FastAPI: I chose FastAPI for its asynchronous request handling and its clean Python API surface. The docs are straightforward (FastAPI — Docs (2026)).
- Task queue: I kept it simple, using a small Redis-like queue stored in memory.
- Storage: Cloudflare R2 is a drop-in, S3-compatible store with a free tier of 10 GB-month of storage and 1 million Class A requests per month (Cloudflare R2 — Pricing (2026)).
- Open-source models:
  - BiRefNet for fast, bidirectional matting (BiRefNet — GitHub Repo (2026)).
  - Real-ESRGAN for upscaling the background-removed image to a higher resolution (Real-ESRGAN — GitHub Repo (2026)).
  - LaMa for inpainting or object removal (LaMa — GitHub Repo (2026)).
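The in-memory task queue mentioned in the stack could look something like the sketch below. It is not the author's implementation: `run_jobs`, the `(job_id, payload)` tuple shape, and the sentinel-based shutdown are all assumptions, built only on Python's standard `queue` and `threading` modules.

```python
import queue
import threading


def run_jobs(jobs, handler, num_workers=3):
    """Process (job_id, payload) pairs with worker threads fed by a queue."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work for this worker
                tasks.task_done()
                break
            job_id, payload = item
            try:
                outcome = ("ok", handler(payload))
            except Exception as exc:  # record the failure instead of killing the worker
                outcome = ("error", str(exc))
            with lock:
                results[job_id] = outcome
            tasks.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:                 # one sentinel per worker
        tasks.put(None)
    tasks.join()
    for thread in threads:
        thread.join()
    return results
```

In a real service the handler would run a model on a GPU and the queue would live for the lifetime of the process rather than a single call. An in-memory queue also loses pending jobs on restart, which is the trade-off against running Redis.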
Data privacy
All original images are deleted immediately after processing. No image data or logs of it persist on the server, and the Cloudflare R2 bucket is private, accessed only via pre-signed URLs that are discarded immediately after use.
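One way to make the "delete immediately" guarantee hard to forget is to tie the file's lifetime to a context manager. The sketch below is an assumption about how this could be done, not the author's code; `ephemeral_upload` is a hypothetical helper.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def ephemeral_upload(data: bytes, suffix: str = ".png"):
    """Write an uploaded image to a temp file and always remove it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path                    # caller runs the model on this path
    finally:
        # Runs on success and on exceptions, so a crashed job leaves no original behind.
        if os.path.exists(path):
            os.remove(path)
```

Usage would be `with ephemeral_upload(image_bytes) as path: ...`. Note that `os.remove` only unlinks the file; it does not overwrite the underlying blocks, so keeping the temp directory on a RAM-backed filesystem is the safer choice if that matters to you.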
How to apply it
- Gather GPUs – Scrape eBay or local mining hardware resellers for used P106s. Each card is $20–$30 used.
- Build the rig – Mount nine cards in a 4U case and connect a PSU rated for at least 900 W (the 600 W average draw × 1.5 headroom). Install active cooling (USB fans or small inline fans).
- Install drivers – Install the official NVIDIA driver with CUDA 11, which supports Pascal GPUs.
- Set up the OS – A lightweight Ubuntu 22.04 server (CPU: Intel Pentium, 2 cores, 20 GB RAM, 128 GB SSD).
- Deploy FastAPI – Run it under Uvicorn with async handlers; expose two endpoints: upload (accepts a pre-signed URL for R2) and process (launches a background task).
- Queue & workers – Spin up a small thread pool that pulls tasks from a simple in-memory queue and assigns each one to a GPU that has enough free VRAM (BiRefNet needs 5.5 GB, Real-ESRGAN 4 GB, LaMa 3 GB).
- Use tmpfs – Mount /tmp as tmpfs (a RAM disk) to speed up intermediate I/O.
- Handle power – Monitor wattage with a USB meter and export the readings as a Prometheus metric; set a threshold to pause jobs if voltage dips.
- Secure the API – Use a JWT plus a client fingerprint for each request; store the token in an HttpOnly cookie.
- Expose via Cloudflare – Point DNS to Cloudflare; enable the free tier for DDoS protection and WAF.
- Delete originals – The pipeline deletes the source image from the local filesystem after the R2 upload is confirmed.
- Scale – If traffic increases, replace the 9-GPU rack with a dedicated server, or lease a GPU droplet on DigitalOcean for higher throughput.
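The VRAM-aware assignment in the "Queue & workers" step can be sketched as a small best-fit picker. The VRAM requirements come from the step above; the function name, the model keys, and the best-fit policy are my own assumptions.

```python
from typing import Dict, Optional

# VRAM needed per model in GB, as quoted in the step list.
MODEL_VRAM_GB = {"birefnet": 5.5, "real-esrgan": 4.0, "lama": 3.0}


def pick_gpu(free_vram_gb: Dict[int, float], model: str) -> Optional[int]:
    """Return the index of the GPU with the least free VRAM that still fits the model.

    Best fit keeps emptier cards available for the larger models.
    Returns None when no card has enough free memory.
    """
    needed = MODEL_VRAM_GB[model]
    candidates = [(free, index) for index, free in free_vram_gb.items()
                  if free >= needed]
    if not candidates:
        return None
    return min(candidates)[1]


if __name__ == "__main__":
    free = {0: 6.0, 1: 4.5, 2: 3.2}   # hypothetical free VRAM per GPU index
    for name in MODEL_VRAM_GB:
        print(name, "->", pick_gpu(free, name))
```

When `pick_gpu` returns `None`, the worker would leave the task on the queue and try again once a card frees up.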
Metrics you should monitor
| Metric | Target | Tool |
|---|---|---|
| CPU load | < 50 % | top/htop |
| GPU memory | < 90 % | nvidia-smi |
| Power consumption | < 650 W | USB meter |
| Latency per image | < 5 s | Prometheus |
| Error rate | 0 % | Sentry |
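The GPU memory row can be checked programmatically by parsing `nvidia-smi` CSV output. This is a sketch rather than a finished monitor: the helper names are hypothetical, the 90 % memory limit comes from the table above, and the 90 °C limit is the auto-shutdown temperature from the pitfalls table.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text: str):
    """Turn nvidia-smi CSV lines into dicts with memory percentage and temperature."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, used, total, temp = [field.strip() for field in line.split(",")]
        stats.append({
            "index": int(index),
            "mem_pct": 100 * int(used) / int(total),
            "temp_c": int(temp),
        })
    return stats


def over_limits(stats, max_mem_pct=90.0, max_temp_c=90):
    """Indices of GPUs at or above the memory or temperature limit."""
    return [s["index"] for s in stats
            if s["mem_pct"] >= max_mem_pct or s["temp_c"] >= max_temp_c]


def read_gpu_stats():
    """Query the local driver; requires nvidia-smi on the PATH."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)
```

A cron job or background thread could call `over_limits(read_gpu_stats())` every few seconds and pause the queue for any card it returns.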
Pitfalls & edge cases
| Issue | Why it matters | Mitigation |
|---|---|---|
| Cooling failure | GPUs overheat, throttling or hardware failure | Add redundant fans, monitor temps, set auto-shutdown at 90 °C |
| Power outages | Service crashes, data loss | UPS with 30 min backup, write-through caching |
| DDoS attacks | Free Cloudflare tier can still be abused | Enable rate limits, use Cloudflare WAF |
| GPU failure | Single card failure reduces capacity | Spare GPU, health checks, job retries |
| Storage limits | R2 free tier caps at 10 GB-month | Upgrade to paid plan or add a local cache |
| Data privacy | Accidental leakage via logs | No logging of image payloads, secure cookies |
| Scaling limits | 9 GPUs only handle ~200 requests/hour | Move to GPU server or add more rigs |
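The "health checks, job retries" mitigation for a failed card could be approximated as below. This is a minimal sketch under my own assumptions: `run_on_gpu` is a hypothetical callable that raises `RuntimeError` when a card misbehaves, and a failed card is simply skipped for later jobs.

```python
def run_with_failover(job, gpu_ids, run_on_gpu, unhealthy=None):
    """Try a job on each healthy GPU in turn, marking cards that fail.

    Returns (gpu_index, result) from the first card that succeeds.
    Raises RuntimeError if every card is unhealthy or fails.
    """
    unhealthy = set() if unhealthy is None else unhealthy
    errors = []
    for gpu in gpu_ids:
        if gpu in unhealthy:
            continue                  # skip cards that already failed a job
        try:
            return gpu, run_on_gpu(job, gpu)
        except RuntimeError as exc:
            unhealthy.add(gpu)        # remember the failure for later jobs
            errors.append(f"gpu {gpu}: {exc}")
    raise RuntimeError("all GPUs failed: " + "; ".join(errors))
```

Passing the same `unhealthy` set to every call lets later jobs avoid a dead card. A fuller version would re-test marked cards periodically, since a single failed job does not prove the hardware is gone.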
Quick FAQ
What if I need more than 9 GPUs?
You can add more P106 cards if you have a larger chassis, but the power supply and cooling must scale. Alternatively, use a cloud GPU instance for burst traffic.
Is the P106 still suitable for modern deep-learning models?
Yes, for models like BiRefNet and Real-ESRGAN that fit in 6 GB VRAM, the P106 works fine. For larger models you'll need 12 GB or more.
How do I keep the API private?
Use a short-lived JWT stored in an HttpOnly cookie. The token is validated on every request.
Can I use the same pipeline for other image tasks?
Absolutely. Just swap in a different model (e.g., Stable Diffusion for generative art) and adjust the VRAM thresholds.
Do I need a dedicated server?
No. A home server works as long as you manage power, cooling, and backup.
What about the cost of the GPUs over time?
GPUs depreciate, but used P106s are cheap enough that even with 10 % annual depreciation your total monthly cost stays well below the $200/month API bill.
How do I handle large file uploads?
Use Cloudflare R2 pre-signed URLs for direct upload; this bypasses the server and reduces load.
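For the FAQ answer on keeping the API private, the sketch below shows what issuing and validating a short-lived HS256 JWT involves, using only the standard library. It is illustrative: the function names, the `sub` claim, and the 5-minute lifetime are assumptions, and a maintained library such as PyJWT is the safer choice in production.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> str:
    digest = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(secret: bytes, subject: str, ttl_s: int = 300, now=None) -> str:
    """Create an HS256-signed token that expires ttl_s seconds from now."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": subject, "exp": now + ttl_s}).encode())
    return f"{header}.{payload}.{_sign(secret, header + '.' + payload)}"


def verify_token(secret: bytes, token: str, now=None):
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None                   # not three dot-separated parts
    expected = _sign(secret, header + "." + payload)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None                   # wrong key or tampered token
    claims = json.loads(_b64url_decode(payload))
    if claims.get("exp", 0) <= now:
        return None                   # expired
    return claims
```

The server would call `verify_token` on every request and reject anything that returns `None`. Because the signature is always recomputed as HMAC-SHA256, the `alg` field in the header is never trusted.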
Conclusion
If you’re a startup founder or hobbyist developer looking to keep costs low while maintaining full data privacy, a home GPU rig built around used P106 cards is a viable path. The key takeaways:
- Cut API bills by moving to your own GPUs.
- Cap electricity by running in a low-cost region (Belarus).
- Simplify storage with Cloudflare R2’s free tier.
- Build a resilient pipeline with FastAPI, a lightweight queue, and open-source models.
Who should use this? Anyone who cares about privacy and can tolerate the upfront hardware cost and the extra maintenance overhead. Who should not? Those who need instant horizontal scaling, or who lack the physical space and power supply to run a GPU rig.
Ready to jump in? Grab a few P106s, set up FastAPI, and let the GPU do the heavy lifting.
References
- remove.bg — Pricing (2026) https://www.deviantart.com/josephmoyers/journal/Easy-Ways-to-Remove-Background-from-Images-856541241
- DigitalOcean — Droplet Pricing (2026) https://www.digitalocean.com/pricing/droplets
- Cloudflare R2 — Pricing (2026) https://developers.cloudflare.com/r2/pricing/
- FastAPI — Docs (2026) https://fastapi.tiangolo.com/
- Nvidia P106 — eBay listing (2026) https://www.ebay.com.au/itm/156284589117
- Nvidia P106 — TechPowerUp spec (2026) https://www.techpowerup.com/gpu-specs/p106-100.c2980
- PVKnowhow — Belarus Electricity Price (2026) https://www.pvknowhow.com/solar-report/belarus/
- BiRefNet — GitHub Repo (2026) https://github.com/ZhengPeng7/BiRefNet
- Real-ESRGAN — GitHub Repo (2026) https://github.com/xinntao/Real-ESRGAN
- LaMa — GitHub Repo (2026) https://github.com/advimman/lama





