Papercup is a Discord voice bot that calls a Claude Code session running on your own box. Press /pickup, talk like it's a phone call, get spoken answers. No cloud STT/TTS, no audio leaves your network.
bash <(curl -fsSL https://raw.githubusercontent.com/powder-nomad/papercup/main/install.sh)

Three things, all load-bearing.
Silero VAD → faster-whisper STT → Kokoro/MeloTTS TTS, all running in Python sidecars on your hardware. Audio never leaves your LAN.
~3–8s loop on 4 cores

Speak, pause, get a spoken reply. Hang up and resume by name later. Multilingual (English, Korean, JP, ZH, ES, FR, …) auto-routed per utterance.
9 languages today

The speaker delegates to sandboxed background Claude Code instances via an embedded MCP server. You hang up; they keep coding.
spawn → check → list

Same core, different distribution shape. Pick whichever fits your setup.
Paste, answer three Discord-token questions, done. Engine selection via flags or the wizard below.
Claude Code plugin
Drop into ~/.claude/plugins. Drives setup via /papercup:setup / start / status slash commands.
Adds Papercup's voice stack to OpenClaw's Discord channel adapter as a SpeechProviderPlugin.
bash <(curl -fsSL https://raw.githubusercontent.com/powder-nomad/papercup/main/install.sh)

Paste into your homelab terminal. You will be prompted interactively for the Discord token, client ID, and guild ID. Re-run with different flags any time to reconfigure.
┌─ Discord (phone / desktop) ─┐         ┌─────────── Homelab ───────────┐
│                             │  voice  │                               │
│  /pickup → speak → /hangup  │ ──────► │  Silero VAD → Whisper STT     │
│                             │         │           ↓                   │
│  bot speaks back            │ ◄────── │  Speaker agent (Haiku)        │
│                             │ Kokoro  │           ↓                   │
└─────────────────────────────┘         │  Kokoro / MeloTTS → audio     │
                                        │                               │
                                        │  spawn_extension(task) ───►   │
                                        │     Claude Code subagent      │
                                        │     in sandboxed dir          │
                                        └───────────────────────────────┘

The speaker handles the call directly. For anything bigger than a quick file read, it spawns a background extension (a full Claude Code instance in its own dir) and narrates progress while it works. You can hang up and resume the session by name later (/resume name:foo).
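The spawn → check → list flow can be sketched with a small in-memory registry. This is an illustration only, not Papercup's implementation: `ExtensionRegistry`, its method names, and the stand-in worker command are all hypothetical, and a scratch directory on its own is not a real sandbox.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass
class Extension:
    name: str
    workdir: str
    proc: subprocess.Popen


class ExtensionRegistry:
    """Hypothetical registry of background workers, keyed by name."""

    def __init__(self):
        self._extensions = {}

    def spawn(self, name, argv):
        # Each worker gets its own scratch directory as its cwd.
        # Assumption: real sandboxing (permissions, mounts) is out of scope here.
        workdir = tempfile.mkdtemp(prefix=f"ext-{name}-")
        proc = subprocess.Popen(
            argv,
            cwd=workdir,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        ext = Extension(name, workdir, proc)
        self._extensions[name] = ext
        return ext

    def check(self, name):
        # poll() returns None while the process is still running.
        code = self._extensions[name].proc.poll()
        if code is None:
            return "running"
        return "done" if code == 0 else f"failed ({code})"

    def list_all(self):
        return {name: self.check(name) for name in self._extensions}


registry = ExtensionRegistry()
# Stand-in command; a real worker would be an agent CLI invocation.
ext = registry.spawn("refactor", [sys.executable, "-c", "print('working')"])
ext.proc.wait()
print(registry.list_all())
```

Because the registry only holds process handles, the caller can disconnect and later ask for a status by name, which mirrors the "hang up, they keep coding" behaviour described above.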
Tested on a 4-core Linux homelab. macOS works for the base path; the MeloTTS (Korean) path is Linux-tested only.
| Requirement | Minimum | Recommended |
|---|---|---|
| OS | Linux x86_64, macOS (Intel or Apple Silicon) | Ubuntu 22.04+ |
| Python | 3.10 | 3.12 |
| Node | 20 | 20+ |
| Disk (English-only, Kokoro) | 2 GB free | 4 GB free |
| Disk (with Korean / MeloTTS) | 4 GB free | 8 GB free |
| RAM | 2 GB free | 4 GB free |
| CPU | 2 cores | 4+ cores (real-time STT/TTS) |
| Network | Outbound HTTPS for model downloads | — |
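A quick preflight check against the minimums in the table might look like the sketch below. It is not part of Papercup's installer; `preflight` and the constant names are hypothetical, and it only covers what the Python standard library can measure portably (Python version, CPU cores, free disk). RAM and the Node version are not checked.

```python
import os
import shutil
import sys

# Minimums copied from the requirements table (English-only, Kokoro install).
MIN_PYTHON = (3, 10)
MIN_CORES = 2
MIN_FREE_DISK_GB = 2


def preflight(path=".", python=None, cores=None, free_gb=None):
    """Return a list of human-readable problems; an empty list means OK.

    The keyword arguments exist so the checks can be exercised with
    explicit values; by default they are read from the running system.
    """
    if python is None:
        python = sys.version_info[:2]
    if cores is None:
        cores = os.cpu_count() or 1
    if free_gb is None:
        free_gb = shutil.disk_usage(path).free / 1e9

    problems = []
    if tuple(python) < MIN_PYTHON:
        problems.append(f"Python {python[0]}.{python[1]} is older than 3.10")
    if cores < MIN_CORES:
        problems.append(f"{cores} CPU core(s), need at least {MIN_CORES}")
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"{free_gb:.1f} GB free disk, need at least {MIN_FREE_DISK_GB}"
        )
    return problems


for problem in preflight():
    print("FAIL:", problem)
```

For the Korean / MeloTTS path, the disk threshold would be raised to 4 GB to match the table.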
# Linux (apt): base install (Kokoro TTS only)
sudo apt-get install -y espeak-ng python3-venv
# + Korean / MeloTTS path
sudo apt-get install -y libmecab-dev mecab-ipadic-utf8 libssl-dev pkg-config

# macOS (Homebrew): base install
brew install espeak-ng node python@3.12
# Korean path also needs:
brew install mecab mecab-ipadic openssl pkg-config

You also need one agent backend: the claude CLI (Claude Code), the codex CLI (ChatGPT), or an Anthropic API key. The wizard above lets you pick.

| Component | Today | Notes |
|---|---|---|
| VAD | Silero | Only option |
| STT | Whisper | small (multilingual, default) auto-detects 99 languages; base / base.en / small.en available |
| TTS | Kokoro + MeloTTS + XTTS-v2 (auto) | Kokoro: en/ja/zh/es/fr/hi/it/pt. Korean → MeloTTS (light, monotone) or XTTS-v2 (~58 voices, voice cloning). Set via TTS_KO_ENGINE |
| Agent | 10 backends (7 CLI agents + 3 HTTP APIs) | claude-code · codex · aider · gemini-cli · opencode · crush · amp · anthropic-api · openai-compat · gemini-api. Switch via /backend at runtime. |
| Per-session config | model · effort · permissions · backend · streaming · reactivity · notify · mode | Set via /pickup flags or hot-swap mid-session via individual slash commands |
| Modes | Voice (phone-call prompt) + Text (vibecoding) | /pickup mode:voice or mode:text. Text mode drops the system prompt → normal Claude Code behavior |
| Reasoning effort | minimal · low · medium · high · xhigh · max | xhigh / max are Opus-only |
| Live progress | sticky message, optional event log | Text mode + /streaming summary\|full. Anti-bomb: edit-throttled, auto-skips short turns |
| Budget tracking | per-day USD + tokens, daily cap | BOT_DAILY_BUDGET_USD or /budget set_usd:<n>; live on bot's rich-presence |
| Process hygiene | detached spawn, group-kill cancel, boot-time reaper | Each agent turn tracked in data/process-registry.json; orphans cleaned up on restart |
| Multi-bot | loop cap, reactivity modes, in-band roster | Multiple operators can co-host bots in one channel; cap prevents bot-to-bot loops |
| Transport | Discord voice + text | Bind a single channel via /bind, or @-mention anywhere |
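
The per-utterance TTS routing in the table can be pictured as a small lookup. The sketch below is an assumption-heavy illustration rather than Papercup's code: the function name, the use of two-letter language codes, the accepted `TTS_KO_ENGINE` values (`melotts`, `xtts-v2`), and the MeloTTS default are all guesses made for the example.

```python
import os

# Languages the table lists for Kokoro.
KOKORO_LANGS = {"en", "ja", "zh", "es", "fr", "hi", "it", "pt"}
# Assumed spellings for the Korean engine setting; not confirmed by the docs.
KOREAN_ENGINES = {"melotts", "xtts-v2"}


def pick_tts_engine(lang, env=None):
    """Map a detected language code to a TTS engine name."""
    env = os.environ if env is None else env
    if lang in KOKORO_LANGS:
        return "kokoro"
    if lang == "ko":
        # Assumption: MeloTTS is the fallback when TTS_KO_ENGINE is unset.
        engine = env.get("TTS_KO_ENGINE", "melotts").lower()
        if engine not in KOREAN_ENGINES:
            raise ValueError(f"unknown TTS_KO_ENGINE value: {engine!r}")
        return engine
    raise ValueError(f"no TTS engine for language {lang!r}")


print(pick_tts_engine("en", {}))
print(pick_tts_engine("ko", {"TTS_KO_ENGINE": "xtts-v2"}))
```

Passing the environment as a parameter keeps the routing decision easy to exercise without touching the real process environment.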
See Slash commands for the runtime surface and Components for the deep dive.
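
For readers unfamiliar with the "detached spawn, group-kill cancel" pattern named in the Process hygiene row, here is a minimal POSIX-only sketch of the general technique. It is not Papercup's implementation: `spawn_detached` and `cancel` are hypothetical names, the sleeping child is a stand-in for an agent turn, and the registry file and boot-time reaper are not shown.

```python
import os
import signal
import subprocess
import sys


def spawn_detached(argv):
    # start_new_session=True runs setsid() in the child, so the child leads
    # a fresh process group whose id equals its pid (POSIX only).
    return subprocess.Popen(argv, start_new_session=True)


def cancel(proc, grace_seconds=2.0):
    """SIGTERM the whole group, escalating to SIGKILL after a grace period."""
    pgid = proc.pid  # session leader, so the group id equals the pid
    try:
        os.killpg(pgid, signal.SIGTERM)
        proc.wait(timeout=grace_seconds)
    except ProcessLookupError:
        pass  # the group has already exited and been reaped
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)
    return proc.wait()


# Stand-in for a long-running agent turn.
proc = spawn_detached([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = cancel(proc)
print("exit code:", exit_code)
```

Signalling the group rather than the single pid matters because an agent CLI may start helpers of its own; killing only the parent would leave those running as orphans.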