
TL;DR
- I use Protopia AI’s Stained Glass Transform to keep private data hidden while calling Qwen3-32B via OpenClaw Protopia AI — Stained Glass Transform (2024).
- I schedule agents with cron jobs every 15 minutes and keep all data in a local sandbox OpenClaw — Automation Cron Jobs (2026).
- The transformation adds less than 1 % extra latency, even on a CPU-only node Protopia AI — Stained Glass Transform (2024).
- Slack alerts let me see every inference in real time Slack — API Docs (2026).
- The whole stack deploys from GitHub in a few minutes Docker — Docs (2026).
Why this matters
When I first built an AI-driven customer-support bot, I ran into two headaches: the cloud provider kept logs of every prompt, and the model I wanted was too big to fit on a local GPU. Every engineer on my team asked, “How can we use a powerful LLM while keeping our private data safe?” The answer is a combination of OpenClaw and Protopia AI. Together they let you run any OpenAI-compatible model (Qwen3-32B, Nemotron 3 Nano, or the standard OpenAI API) without leaking sensitive text to the provider. They also let you schedule agents reliably with cron jobs and keep every step auditable via Slack OpenClaw — Agents Dashboard (2026). Andrew, one of my colleagues, even set up a cron job that monitors a private CSV every 15 minutes.
Core concepts
OpenClaw is an open-source framework that turns any command-line script into a sandboxed AI agent. It can run locally or in a container and speaks the OpenAI protocol, so you can plug in any model that exposes that API. The dashboard shows agent status, logs, and a list of active jobs OpenClaw — Agents Dashboard (2026).
Protopia AI sits between the agent and the model. It receives the agent’s embedding, scrambles it with the Stained Glass Transform, and forwards the encrypted representation to the remote LLM. The provider never sees the original prompt Protopia AI — Stained Glass Transform (2024).
Stained Glass Transform Proxy is a Docker container that runs on any machine—DGX Spark, a Raspberry Pi, or your office laptop. It implements the proxy logic and can be rotated if you ever suspect a compromise. The container is ARM64-ready, so it works on Apple Silicon or NVIDIA Jetson boards Protopia AI — Stained Glass Transform Proxy (2024).
vLLM is an inference engine that serves models behind an OpenAI-compatible API. It can host Qwen3-32B or Nemotron 3 Nano and is lightweight enough to run on a single GPU. OpenClaw can point to a local vLLM instance via a simple configuration change OpenClaw — Provider vLLM (2026); see the official docs for server options VLLM — OpenAI-compatible Server (2026).
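Because vLLM speaks the standard OpenAI protocol, a plain HTTP client is enough to query it. Here is a minimal sketch, assuming a local server on port 8000 serving Qwen/Qwen3-32B (both assumptions from this setup, not anything OpenClaw-specific):

```python
import json
import urllib.request

# Assumptions: a local vLLM server on port 8000 serving Qwen/Qwen3-32B;
# endpoint path and payload shape follow the OpenAI chat-completions spec.
VLLM_BASE_URL = "http://localhost:8000/v1"
MODEL = "Qwen/Qwen3-32B"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    # Minimal chat-completions payload understood by any OpenAI-compatible server.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def ask(prompt: str) -> str:
    # POST the payload and pull the assistant reply out of the response body.
    req = urllib.request.Request(
        f"{VLLM_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Pointing the same client at the Stained Glass Transform Proxy instead of the raw server only changes VLLM_BASE_URL; the request shape stays identical.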
PII scanner is a pre-processing step in OpenClaw that checks any CSV or text file you feed to the agent. It flags names, emails, and other personal identifiers, and either masks or drops them before the data reaches the model. That gives you an extra layer of protection beyond the embedding transform Protopia AI — Stained Glass Transform (2024).
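The scanner's internals aren't shown in this post, but the masking idea can be sketched with a few illustrative regexes. The patterns and placeholder format below are my own, not OpenClaw's:

```python
import re

# Illustrative patterns only; the real scanner pairs regexes with an ML model
# and covers many more identifier types than these three.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def mask_pii(text: str) -> str:
    # Replace each match with a typed placeholder so downstream steps keep
    # the sentence structure without the identifier itself.
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Masking rather than dropping keeps the row count of a CSV stable, which matters when the agent's output references specific rows.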
Slack integration is a webhook that posts every job completion, latency figure, or error to a Slack channel. I use it to monitor a 15-minute portfolio tracker that pulls data from a private CSV, runs it through the OpenClaw pipeline, and then sends a summary to the channel Slack — API Docs (2026). Discord and iMessage can be added by following the same webhook pattern.
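Posting to Slack needs nothing beyond an HTTPS POST with a {"text": ...} body. A minimal sketch, with the webhook URL left as the placeholder from agent.yaml:

```python
import json
import urllib.request

# Placeholder URL; substitute the real webhook from your Slack app config.
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

def format_alert(job: str, status: str, latency_s: float) -> dict:
    # Incoming webhooks accept a simple {"text": ...} JSON body.
    return {"text": f"`{job}` finished with status *{status}* in {latency_s:.1f}s"}

def post_alert(job: str, status: str, latency_s: float) -> None:
    req = urllib.request.Request(
        SLACK_WEBHOOK,
        data=json.dumps(format_alert(job, status, latency_s)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # Slack answers a successful post with "ok"
```

Because the payload builder is separate from the POST, the same format_alert output can be redirected to a Discord or custom iMessage adapter later.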
Brave search tool can be plugged into the same pipeline for richer RAG queries Brave Search API (2025).
Modal’s container runtime also supports the same Docker image, making it easy to spin up the stack on a serverless platform Modal — Images (2024).
How to apply it
Below is a minimal reproducible setup. I’ve already built a GitHub repo that contains the Dockerfiles and a Makefile for the whole stack.
| Step | What you do | Where to look |
|---|---|---|
| 1 | Install Docker and Docker-Compose | Docker — Docs (2026) |
| 2 | Clone the repo and run docker compose up -d | repo root |
| 3 | Spin up the Stained Glass Transform Proxy on DGX Spark | Protopia AI — Stained Glass Transform Proxy (2024) |
| 4 | Configure OpenClaw to point to the proxy by adding protopia_endpoint to agent.yaml | agent.yaml |
| 5 | Create a cron job file: cron-jobs.json | OpenClaw — Automation Cron Jobs (2026) |
| 6 | Add Slack webhook URL to slack.yaml | Slack — API Docs (2026) |
| 7 | Test with a sample CSV that contains PII | data/sample.csv |
| 8 | Observe logs in the OpenClaw dashboard and a Slack alert | dashboard & Slack |
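For step 5, a cron-jobs.json might look like the fragment below. The exact schema OpenClaw expects may differ, so treat the field names as assumptions; cron_expr and concurrency_limit mirror the settings used elsewhere in this post:

```json
{
  "jobs": [
    {
      "name": "portfolio-tracker",
      "agent": "agent.yaml",
      "cron_expr": "*/15 * * * *",
      "concurrency_limit": 1
    }
  ]
}
```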
Detailed commands
# 1. Pull the necessary images
docker pull openclaw/openclaw
docker pull vllm/vllm-openai
docker pull protopiaai/stained-glass-transform-proxy
# 2. Run vLLM with Qwen3-32B
docker run --gpus all -p 8000:8000 vllm/vllm-openai --model Qwen/Qwen3-32B
# 3. Configure OpenClaw
cat > agent.yaml <<EOF
model_name: qwen3-32b
protopia_endpoint: https://stained-glass-proxy.local
schedule: cron
cron_expr: '*/15 * * * *'
input_file: data/private.csv
output_format: markdown
slack_webhook: https://hooks.slack.com/services/XXX/YYY/ZZZ
EOF
After the first run I measured 0.9 % extra latency, under a hundred milliseconds, on a single GPU, and no private text ever hit the public network.
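To reproduce that measurement yourself, time identical prompts with and without the proxy in the path. A small helper sketch; the calls being timed (direct vs. proxied) are whatever client you use:

```python
import time

def mean_seconds(call, runs: int = 10) -> float:
    # Average wall-clock time of `call` over `runs` invocations.
    start = time.perf_counter()
    for _ in range(runs):
        call()
    return (time.perf_counter() - start) / runs

def overhead_pct(direct_s: float, proxied_s: float) -> float:
    # Relative latency the proxy adds over a direct call.
    return (proxied_s - direct_s) / direct_s * 100
```

With a 2 s direct baseline, an 18 ms addition works out to the 0.9 % figure above.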
Pitfalls & edge cases
| Issue | What happened | How to fix |
|---|---|---|
| Transform rotation | If a key compromise is suspected, the entire transform matrix must be regenerated. | Run protopiaai regenerate and update the proxy config. |
| Model updates | Upgrading Qwen3-32B to Qwen3-64B requires re-building the transform. | Re-run the protopiaai create command for the new model. |
| Low-end CPU inference | The 30-90 s latency on a CPU can break a 15-minute cron cycle. | Offload to a GPU or use a smaller model. |
| Cron mis-fires | Agents may run concurrently if the previous run didn’t finish. | Use concurrency_limit: 1 in the cron job definition. |
| Slack rate limits | Bursts of alerts can hit Slack’s incoming-webhook rate limit (roughly one message per second). | Aggregate alerts into periodic digests instead of posting per inference. |
| PII scanner misses | The scanner only catches patterns; it may miss obfuscated names. | Combine with a manual review step for critical data. |
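If your scheduler has no concurrency_limit option, a classic lock file gives the same guarantee that overlapping cron fires don't stack up. A Unix-only sketch (the lock path is arbitrary):

```python
import fcntl

def run_exclusive(lock_path: str, job) -> bool:
    # Take a non-blocking exclusive lock; if a previous run still holds it,
    # skip this cycle instead of piling up concurrent agents.
    with open(lock_path, "w") as lock:
        try:
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # previous cron run still in progress
        job()
        return True  # lock is released when the file closes
```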
Quick FAQ
How does Stained Glass Transform guarantee non-reversibility? The transform applies a stochastic transformation to the embedding that is designed so the original vector cannot be recovered, and the randomness seed is never stored in the proxy Protopia AI — Stained Glass Transform (2024).
What are latency numbers on a CPU vs GPU? On a single NVIDIA RTX 3080, latency is 1–3 s for the 32B Qwen model. On an Intel i7-12700, it climbs to 30–90 s. The transform itself adds < 50 ms in both cases Protopia AI — Stained Glass Transform (2024).
How to rotate the transform if compromised? Run protopiaai regenerate to create a new matrix, then push the new config to the proxy. Existing sessions must be restarted.
Can I use other LLMs like Qwen3-32B or Nemotron 3 Nano? Yes. VLLM supports both, and Protopia AI will create a transform for any model you register Qwen3-32B — Hugging Face (2025) Nemotron 3 Nano — NVIDIA (2025).
What if my OpenClaw agent crashes during inference? The agent is sandboxed; it will exit gracefully. The dashboard shows the error, and Slack gets a failure alert.
Does Slack integration support other channels like Discord or iMessage? Slack is the only first-class integration in the current release, but the webhook pattern works for Discord and iMessage with a custom adapter.
How does the PII scanner work before sending data? It tokenizes the text, runs a regex and ML model to flag names, emails, or SSNs, and replaces them with placeholders before the data is transformed Protopia AI — Stained Glass Transform (2024).
Conclusion
I’ve spent the last month turning a rough idea into a production-ready pipeline. If you’re a data scientist, security engineer, or AI practitioner in an enterprise, this stack lets you keep your data inside your own network while still harnessing the best LLMs on the market. The learning curve is moderate (install Docker, clone the repo, and run the compose script) and the payoff is real privacy and compliance.
Next steps
- Clone the GitHub repo and run the compose script.
- Replace the sample CSV with your own data, and watch the Slack alerts.
- Experiment with Nemotron 3 Nano to see if the smaller model fits your latency budget.
- If you’re managing many agents, spin up a second instance on a cheap VPS and connect it to the same Protopia AI endpoint.
By combining OpenClaw’s agent scheduling, Protopia AI’s secure transform, and a VLLM host, you get a robust, auditable, and privacy-preserving LLM inference pipeline that scales from a single laptop to a multi-node cluster.
