How I Built a Budget-Friendly Background Removal Service on a Home GPU Rig | Brav

Learn how to replace expensive background-removal APIs with a low-cost home GPU rig, using open-source models and Cloudflare R2 for privacy-first image processing.

TL;DR

  • High-cost cloud APIs (≈$0.20/image) can be replaced by a self-hosted rig for ≈$0.01/image.
  • A nine-GPU P106 cluster in Belarus costs ≈$26/month in electricity, far cheaper than cloud rates.
  • Use FastAPI, Redis (or a simple task queue), Cloudflare R2, and open-source models (BiRefNet, Real-ESRGAN, LaMa) for a scalable, privacy-first pipeline.
  • Cooling, power supply, and DDoS protection are the main hardware risks.
  • The next step? Scale to a dedicated GPU server or expose an API for developers.

Published by Brav

Why this matters

When I first started a side project that required background removal, I hit two walls: the API cost was bleeding money, and every image I sent out for processing left my apartment’s Wi-Fi for a third party I didn’t trust to delete the originals. The numbers were brutal. remove.bg charges about $0.20 per image [remove.bg, Pricing (2026)], so a thousand images a month would cost me $200.

I also looked at running on a cloud GPU instance (a DigitalOcean droplet, an AWS GPU, or a managed GPU service), but it was no better price-wise. A $20/mo DigitalOcean droplet costs a fraction of a full cloud GPU instance [DigitalOcean, Droplet Pricing (2026)], but it still added an extra $20 to my bill, and as long as I kept sending images to a third-party API I was still paying the same $0.20 per image.

The real cost was twofold: the API bill and the electricity bill of a home server. Electricity in Belarus costs only $0.06–$0.09 per kWh [PVKnowhow, Belarus Electricity Price (2026)], and my rig pulls about 600 W on average, or roughly 432 kWh per month (0.6 kW × 24 h × 30 days). At the low end of that price range this works out to about $26 a month in electricity (about $39 at the high end), a far cry from a $200 monthly API bill.
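The electricity arithmetic above is easy to check. This is a minimal sketch, assuming a 30-day month and a constant 600 W average draw; the function names are illustrative, not part of the original setup.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_kwh(avg_watts: float) -> float:
    """Energy used in one month at a constant average draw, in kWh."""
    return avg_watts * HOURS_PER_DAY * DAYS_PER_MONTH / 1000


def monthly_cost(avg_watts: float, price_per_kwh: float) -> float:
    """Monthly electricity cost in dollars for a given price per kWh."""
    return monthly_kwh(avg_watts) * price_per_kwh


if __name__ == "__main__":
    print(f"{monthly_kwh(600):.0f} kWh/month")
    # Low and high end of the quoted Belarus price range.
    for price in (0.06, 0.09):
        print(f"${price:.2f}/kWh -> ${monthly_cost(600, price):.2f}/month")
```

Plugging in your own tariff and measured wattage is the quickest way to see whether the comparison with a per-image API still holds for your region.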

Core concepts

The GPU economy

I started with an inexpensive NVIDIA P106, a 6 GB Pascal card that was originally sold as a mining GPU. Used cards go for roughly $22 each on eBay [Nvidia P106, eBay listing (2026)]. Each card draws about 100 W under load [Nvidia P106, TechPowerUp spec (2026)], so nine cards could peak near 900 W if all of them were loaded at once. In practice the rig averages about 600 W, which translates to roughly $26/month in Belarusian electricity [PVKnowhow, Belarus Electricity Price (2026)].

Software stack

Data privacy

All original images are deleted immediately after processing. Nothing is logged or stored persistently on the server, and the Cloudflare R2 bucket is private: it is reachable only through short-lived pre-signed URLs, which the server issues per request and the client discards after use.
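One way to make "deleted immediately after processing" hard to get wrong is to tie the file's lifetime to a context manager. The following is a minimal sketch, assuming originals are staged as temporary files on the server; `ephemeral_original` and `process` are hypothetical names, and `process` is only a stand-in for the real model call.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def ephemeral_original(data: bytes, suffix: str = ".png") -> Iterator[str]:
    """Write uploaded bytes to a temp file and always delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        yield path
    finally:
        # Runs on success and on error, so the original never lingers on disk.
        if os.path.exists(path):
            os.remove(path)


def process(path: str) -> int:
    """Hypothetical stand-in for the model call; returns the input size in bytes."""
    return os.path.getsize(path)


if __name__ == "__main__":
    with ephemeral_original(b"fake image bytes") as original_path:
        print("processing", process(original_path), "bytes")
    print("still on disk?", os.path.exists(original_path))
```

Because the cleanup sits in a `finally` block, a crash inside the model call still removes the source file.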

How to apply it

  1. Gather GPUs – Scrape eBay or local mining hardware resellers for used P106s. Each card is $20–$30 used.
  2. Build the rig – Mount nine cards in a 4U case and connect a PSU rated for about 900 W (600 W average draw × 1.5 headroom). Install a small active cooling system (USB fans or small inline fans).
  3. Install drivers – Use the official NVIDIA driver with CUDA 11, which supports Pascal cards such as the P106.
  4. Set up the OS – A lightweight Ubuntu 22.04 server (CPU: Intel Pentium, 2 cores, 20 GB RAM, 128 GB SSD).
  5. Deploy FastAPI – Use Uvicorn with async workers; expose two endpoints: upload (accepts a pre-signed URL for R2) and process (launches a background task).
  6. Queue & workers – Spin up a small thread pool that pulls tasks from a simple in-memory queue and assigns each one to a GPU with enough free VRAM (BiRefNet needs 5.5 GB, Real-ESRGAN 4 GB, LaMa 3 GB).
  7. Use TMPFS – Mount /tmp as tmpfs (RAM disk) to speed up intermediate I/O.
  8. Handle power – Monitor wattage with a USB meter and log to a simple Prometheus endpoint; set a threshold to pause jobs if voltage dips.
  9. Secure the API – Use a JWT plus a client fingerprint for each request; store the token in an HttpOnly cookie.
  10. Expose Cloudflare – Point DNS to Cloudflare; enable the free tier for DDoS protection and WAF.
  11. Delete originals – The pipeline deletes the source image from the local filesystem after the R2 upload is confirmed.
  12. Scale – If traffic increases, replace the 9-GPU rack with a dedicated server, or lease a GPU droplet on DigitalOcean for higher throughput.
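The VRAM-aware assignment in step 6 can be sketched as below. This is an illustration under assumptions, not the original scheduler: the `Gpu` class, `pick_gpu`, and `release` are hypothetical names, VRAM is tracked by simple bookkeeping rather than by querying the driver, and the best-fit choice (smallest free slot that still fits) is one reasonable policy among several.

```python
from dataclasses import dataclass
from typing import Optional

# VRAM needs in GB, taken from step 6 above.
MODEL_VRAM_GB = {"birefnet": 5.5, "real-esrgan": 4.0, "lama": 3.0}


@dataclass
class Gpu:
    index: int
    total_gb: float = 6.0  # a P106 has 6 GB of VRAM
    used_gb: float = 0.0

    @property
    def free_gb(self) -> float:
        return self.total_gb - self.used_gb


def pick_gpu(gpus: list[Gpu], model: str) -> Optional[Gpu]:
    """Reserve VRAM on the tightest-fitting GPU, or return None if none fits."""
    need = MODEL_VRAM_GB[model]
    candidates = [g for g in gpus if g.free_gb >= need]
    if not candidates:
        return None  # caller should leave the task in the queue
    gpu = min(candidates, key=lambda g: g.free_gb)  # best fit
    gpu.used_gb += need
    return gpu


def release(gpu: Gpu, model: str) -> None:
    """Give the reserved VRAM back once the job finishes."""
    gpu.used_gb -= MODEL_VRAM_GB[model]


if __name__ == "__main__":
    rig = [Gpu(i) for i in range(9)]
    for model in ("birefnet", "lama", "lama", "real-esrgan"):
        gpu = pick_gpu(rig, model)
        print(model, "->", None if gpu is None else f"GPU {gpu.index}")
```

Best fit packs two 3 GB LaMa jobs onto one card and keeps whole cards free for BiRefNet, which needs almost all of a P106's 6 GB.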

Metrics you should monitor

Metric | Target | Tool
CPU load | < 50 % | top/htop
GPU memory | < 90 % | nvidia-smi
Power consumption | < 650 W | USB meter
Latency per image | < 5 s | Prometheus
Error rate | 0 % | Sentry
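These targets can be turned into a simple alert check. The sketch below assumes readings arrive as a plain dictionary; the key names and the `violations` function are hypothetical, and collecting the readings (from nvidia-smi, the USB meter, and so on) is left out.

```python
# Upper limits from the table above. A reading must stay strictly below its
# limit; the error rate is handled separately because its target is exactly 0 %.
LIMITS = {
    "cpu_load_pct": 50.0,
    "gpu_memory_pct": 90.0,
    "power_w": 650.0,
    "latency_s": 5.0,
}


def violations(readings: dict[str, float]) -> list[str]:
    """Return the names of metrics that miss their target.

    Metrics missing from `readings` are treated as healthy (assumption).
    """
    out = [name for name, limit in LIMITS.items()
           if readings.get(name, 0.0) >= limit]
    if readings.get("error_rate_pct", 0.0) > 0.0:
        out.append("error_rate_pct")
    return out


if __name__ == "__main__":
    sample = {"cpu_load_pct": 30, "gpu_memory_pct": 95, "power_w": 600,
              "latency_s": 2.1, "error_rate_pct": 0}
    print(violations(sample))
```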

Pitfalls & edge cases

Issue | Why it matters | Mitigation
Cooling failure | GPUs overheat, throttling or hardware failure | Add redundant fans, monitor temps, set auto-shutdown at 90 °C
Power outages | Service crashes, data loss | UPS with 30 min backup, write-through caching
DDoS attacks | Free Cloudflare tier can still be abused | Enable rate limits, use Cloudflare WAF
GPU failure | Single card failure reduces capacity | Spare GPU, health checks, job retries
Storage limits | R2 free tier caps at 10 GB-month | Upgrade to paid plan or add a local cache
Data privacy | Accidental leakage via logs | No logging of image payloads, secure cookies
Scaling limits | 9 GPUs only handle ~200 requests/hour | Move to GPU server or add more rigs
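The "health checks, job retries" mitigation for a failed card could look roughly like this. It is a sketch under assumptions: `GpuError` and `run_with_failover` are hypothetical names, and a real job would raise whatever error your inference framework produces when a card drops out.

```python
from typing import Callable


class GpuError(RuntimeError):
    """Hypothetical error raised when a job fails on a specific card."""


def run_with_failover(job: Callable[[int], str],
                      gpu_ids: list[int],
                      unhealthy: set[int]) -> str:
    """Try the job on each healthy GPU in turn; mark a card unhealthy on failure."""
    last_error = None
    for gpu_id in gpu_ids:
        if gpu_id in unhealthy:
            continue  # skip cards that already failed a job
        try:
            return job(gpu_id)
        except GpuError as exc:
            unhealthy.add(gpu_id)
            last_error = exc
    raise RuntimeError("no healthy GPU could run the job") from last_error


if __name__ == "__main__":
    def flaky_job(gpu_id: int) -> str:
        if gpu_id == 0:
            raise GpuError("card 0 stopped responding")
        return f"done on GPU {gpu_id}"

    bad_cards: set[int] = set()
    print(run_with_failover(flaky_job, [0, 1, 2], bad_cards))
    print("unhealthy:", bad_cards)
```

A periodic health check can clear entries from the unhealthy set once a card has been reseated or replaced with the spare.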

Quick FAQ

  1. What if I need more than 9 GPUs?
    You can add more P106 cards if you have a larger chassis, but the power supply and cooling must scale. Alternatively, use a cloud GPU instance for burst traffic.

  2. Is the P106 still suitable for modern deep-learning models?
    Yes, for models like BiRefNet and Real-ESRGAN that fit in 6 GB VRAM, the P106 works fine. For larger models you’ll need 12 GB or more.

  3. How do I keep the API private?
    Use a short-lived JWT stored in an HttpOnly cookie. The token is validated on every request.

  4. Can I use the same pipeline for other image tasks?
    Absolutely. Just swap in a different model (e.g., Stable Diffusion for generative art) and adjust the VRAM thresholds.

  5. Do I need a dedicated server?
    No. A home server works as long as you manage power, cooling, and backup.

  6. What about the cost of the GPUs over time?
    GPUs depreciate, but used P106s are cheap: nine cards at about $22 each come to roughly $200 in total. Even with a 10 % annual depreciation, the running cost stays far below a $200/month API bill.

  7. How do I handle large file uploads?
    Use Cloudflare R2 pre-signed URLs for direct upload; this bypasses the server and reduces load.
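For question 3, the shape of a short-lived signed token can be shown with the standard library alone. This is a minimal HS256 sketch for illustration, assuming a shared secret kept on the server; `issue_token` and `verify_token` are hypothetical helpers, and in production a maintained JWT library is the safer choice.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> str:
    return _b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())


def issue_token(secret: bytes, subject: str, ttl_s: int,
                now: Optional[float] = None) -> str:
    """Create a compact HS256 token that expires ttl_s seconds from now."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": subject, "exp": int(now) + ttl_s}).encode())
    return f"{header}.{payload}.{_sign(secret, header + '.' + payload)}"


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    # Always verify with HS256, whatever the header claims.
    expected = _sign(secret, header + "." + payload)
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    claims = json.loads(_b64url_decode(payload))
    if claims.get("exp", 0) <= now:
        return None
    return claims


if __name__ == "__main__":
    token = issue_token(b"change-me", "user-1", ttl_s=300)
    print(verify_token(b"change-me", token))
```

The server would set the token in an HttpOnly cookie on login and call the verify step at the start of every request handler.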

Conclusion

If you’re a startup founder or hobbyist developer looking to keep costs low while maintaining full data privacy, a home GPU rig built around used P106 cards is a viable path. The key takeaways:

  • Cut API bills by moving to your own GPUs.
  • Cap electricity by running in a low-cost region (Belarus).
  • Simplify storage with Cloudflare R2’s free tier.
  • Build a resilient pipeline with FastAPI, a lightweight queue, and open-source models.

Who should use this? Anyone who cares about privacy and can tolerate the upfront hardware cost and the extra maintenance overhead. Who should not? Those who need instant horizontal scaling, or who lack the physical space and power supply to run a GPU rig.

Ready to jump in? Grab a few P106s, set up FastAPI, and let the GPU do the heavy lifting.

Last updated: January 6, 2026
