Configuration reference

Two surfaces:

  1. Static config lives in packages/bot/.env — Discord credentials, voice pipeline tunables, TTS engine knobs, default agent backend / model. Copy .env.example and edit.
  2. Per-session config lives in data/sessions.json, mutated at runtime by slash commands (/pickup, /model, /effort, /permissions, /notify). Each session can have its own model, reasoning effort, permission policy, and notification opt-in. See Slash commands for the full surface.

Static-vs-session precedence: per-session settings override env defaults when the agent starts. Clearing a per-session override (/model name: blank, /effort level:default, /permissions mode:default-for-mode) falls back to the env value.
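For example, with an env default of (value illustrative):

```ini
AGENT_MODEL=haiku
```

`/model name:sonnet` pins that one session to `sonnet` while other sessions keep `haiku`; `/model name:` with a blank value clears the override and the session falls back to `haiku`.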

Discord

| Var | Required | Notes |
| --- | --- | --- |
| `DISCORD_TOKEN` | yes | Bot token |
| `DISCORD_CLIENT_ID` | yes | Application ID |
| `DISCORD_GUILD_ID` | yes | Server ID where slash commands register |
| `BOT_TEXT_CHANNEL_ID` | no | Global fallback bound channel; per-guild `/bind` wins |
| `BOT_ALLOWED_USERS` | no | Comma-separated Discord user IDs. When set, only those users can drive the bot. Set this before any deployment that exposes the bot to other users. See Security |
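A minimal Discord block in `packages/bot/.env` might look like this (all IDs and the token are placeholders):

```ini
# Discord credentials (placeholder values)
DISCORD_TOKEN=MTA...your-bot-token
DISCORD_CLIENT_ID=123456789012345678
DISCORD_GUILD_ID=234567890123456789
# Lock the bot down before exposing it to other users
BOT_ALLOWED_USERS=345678901234567890,456789012345678901
```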

Voice pipeline

| Var | Default | Notes |
| --- | --- | --- |
| `SILENCE_MS` | 600 | End-of-utterance silence (ms). Lower = snappier, higher = fewer false stops |
| `VAD_THRESHOLD` | 0.4 | Speech probability cutoff |
| `VAD_MIN_SPEECH_WINDOWS` | 3 | Minimum number of 32 ms speech windows to count as an utterance |
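For a snappier feel at the cost of more false end-of-utterance cuts, one plausible tuning (values illustrative, not recommended defaults):

```ini
SILENCE_MS=400            # end the turn after 400 ms of silence
VAD_THRESHOLD=0.5         # require higher speech probability
VAD_MIN_SPEECH_WINDOWS=3
```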

Whisper STT

| Var | Default | Notes |
| --- | --- | --- |
| `WHISPER_MODEL` | small | `small` (multilingual, default) / `small.en` (English) / `base` (multilingual, lighter) / `base.en` (English, lightest) |
| `WHISPER_DEVICE` | cpu | `cpu` or `cuda` |
| `WHISPER_COMPUTE` | int8 | `int8` (CPU), `float16` / `float32` (GPU) |
| `WHISPER_BEAM` | 1 | Beam search width. Higher = more accurate, slower |
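On a CUDA machine, a typical pairing (values illustrative) is the GPU device with a float16 compute type and a wider beam:

```ini
WHISPER_MODEL=small
WHISPER_DEVICE=cuda
WHISPER_COMPUTE=float16
WHISPER_BEAM=5   # wider beam: more accurate, slower
```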

TTS

The TTS layer auto-routes by detected language. The default, auto, runs Kokoro for English/JP/ZH/ES/FR/HI/IT/PT and hands Korean to MeloTTS or XTTS. You can pin a single engine via TTS_ENGINE=kokoro|melotts|xtts, or stay on auto and choose which engine handles Korean via TTS_KO_ENGINE.

Top-level routing

| Var | Default | Notes |
| --- | --- | --- |
| `TTS_ENGINE` | auto | `auto` (Kokoro + Korean engine) / `kokoro` / `melotts` / `xtts` |
| `TTS_KO_ENGINE` | melotts | When `TTS_ENGINE=auto`, picks the Korean engine: `melotts` (faster, monotone) or `xtts` (heavier, expressive) |
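For example, to keep auto-routing but use the expressive Korean engine:

```ini
TTS_ENGINE=auto
TTS_KO_ENGINE=xtts
```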

Kokoro (English + 7 other langs)

| Var | Default | Notes |
| --- | --- | --- |
| `KOKORO_VOICE` | af_heart | Any of the 54 loaded voices |
| `KOKORO_SPEED` | 1.0 | 0.5–2.0 range |
| `KOKORO_LANG` | en-us | `en-us`, `en-gb`, `ja`, `zh`, `es`, `fr`, `hi`, `it`, `pt-br` |
| `KOKORO_MODEL` | (resolved) | Override model file path |
| `KOKORO_VOICES` | (resolved) | Override voices file path |
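Pinning everything to Kokoro with a slightly faster voice, for instance (values illustrative):

```ini
TTS_ENGINE=kokoro
KOKORO_VOICE=af_heart
KOKORO_LANG=en-us
KOKORO_SPEED=1.1
```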

MeloTTS (Korean — lightweight, monotone)

| Var | Default | Notes |
| --- | --- | --- |
| `MELOTTS_LANG` | KR | `KR` / `EN` / `ES` / `FR` / `JP` / `ZH` (uppercase) |
| `MELOTTS_DEVICE` | cpu | `cpu` or `cuda` |
| `MELOTTS_SPEED` | 1.3 | 0.8–1.5. 1.3 keeps the voice from sounding too leaden |
| `MELOTTS_PREWARM` | 1 | Set to 0 to defer the ~17 s PyTorch+BERT load until the first KR call |

XTTS-v2 (Korean — heavier, ~58 speakers, voice cloning)

| Var | Default | Notes |
| --- | --- | --- |
| `XTTS_LANG` | ko | `ko` / `en` / `ja` / `zh-cn` / `es` / `fr` / `de` / `it` / `pt` / `pl` / `tr` / `ru` / `nl` / `cs` / `ar` / `hu` |
| `XTTS_DEVICE` | cpu | `cpu` or `cuda` |
| `XTTS_SPEED` | 1.0 | 0.8–1.3 |
| `XTTS_SPEAKER` | Daisy Studious | One of ~58 built-in Coqui speakers (Claribel Dervla, Gracie Wise, Tammie Ema, Damien Black, Andrew Chipper, Royston Min, Alma María, Lilya Stainthorpe, …) |
| `XTTS_REFERENCE_WAV` | | Path to a 6 s+ WAV. Overrides `XTTS_SPEAKER` to clone that voice instead |
| `XTTS_MODEL` | tts_models/multilingual/multi-dataset/xtts_v2 | Override Coqui model id |
| `XTTS_PREWARM` | 1 | Set to 0 to defer the ~30 s model load until the first KR call |
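A voice-cloning setup might read as follows (the WAV path is a placeholder):

```ini
TTS_KO_ENGINE=xtts
XTTS_LANG=ko
XTTS_REFERENCE_WAV=/path/to/reference-6s.wav   # overrides XTTS_SPEAKER
XTTS_PREWARM=1
```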

Speaker agent

Env-level (apply to every session unless overridden)

| Var | Default | Notes |
| --- | --- | --- |
| `AGENT_BACKEND` | claude-code | One of 10 registered backends. CLI agents: `claude-code` · `codex` · `aider-cli` · `gemini-cli` · `opencode-cli` · `crush-cli` · `amp-cli`. HTTP APIs: `anthropic-api` · `openai-compat` · `gemini-api`. Switch at runtime with `/backend` |
| `AGENT_MODEL` | haiku | Default model. Per-session override via `/model name:<id>` or `/pickup model:<id>` |
| `AGENT_MAX_TOKENS` | 200 | For HTTP API backends (`anthropic-api` / `openai-compat` / `gemini-api`) |
| `ANTHROPIC_API_KEY` | | Required if `AGENT_BACKEND=anthropic-api`; also used to populate the model catalog |
| `CODEX_SANDBOX` | read-only | `read-only` / `workspace-write` / `danger-full-access` |
| `SPEAKER_TOOLS` | Read Glob Grep | Built-in CC tools the speaker can use inline |
| `PROJECT_DIRS` | | Comma-separated absolute paths the speaker can read |
| `PAPERCUP_TURN_TIMEOUT_S` | 0 (off) | Per-turn hard cap (seconds). On timeout, SIGTERMs the spawned CLI's process group and rejects with `turn timed out after Ns`. Disabled by default so long extension turns aren't interrupted |
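Putting the env-level defaults together, a sketch (paths are placeholders):

```ini
AGENT_BACKEND=claude-code
AGENT_MODEL=haiku
SPEAKER_TOOLS=Read Glob Grep
PROJECT_DIRS=/home/me/projects/app,/home/me/projects/docs
PAPERCUP_TURN_TIMEOUT_S=300   # hard-cap each turn at 5 minutes
```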

openai-compat backend

One adapter targeting any /v1/chat/completions endpoint. Unlocks OpenAI, Groq, Together.ai, Fireworks, DeepSeek, OpenRouter, LiteLLM, Ollama (local), LM Studio, vLLM via base-URL config.

| Var | Default | Notes |
| --- | --- | --- |
| `OPENAI_COMPAT_BASE_URL` | | Required. e.g. `https://api.openai.com/v1`, `https://api.groq.com/openai/v1`, `http://localhost:11434/v1` (Ollama) |
| `OPENAI_COMPAT_API_KEY` | | Optional (local providers like Ollama/LM Studio don't need one) |
| `OPENAI_COMPAT_MODEL_DEFAULT` | gpt-4o-mini | Fallback when `AgentBackendOpts.model` is unset |
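For instance, pointing the adapter at a local Ollama server (the model name is illustrative):

```ini
OPENAI_COMPAT_BASE_URL=http://localhost:11434/v1
# no OPENAI_COMPAT_API_KEY needed for local Ollama
OPENAI_COMPAT_MODEL_DEFAULT=llama3.1
```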

gemini-api backend (native, not the OpenAI shim)

| Var | Default | Notes |
| --- | --- | --- |
| `GEMINI_API_KEY` | | Required. From aistudio.google.com |
| `GEMINI_API_BASE_URL` | https://generativelanguage.googleapis.com/v1beta | Override for proxy / regional endpoints |
| `GEMINI_API_DEFAULT_MODEL` | gemini-2.5-flash | Fallback when `AgentBackendOpts.model` is unset |
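A minimal native-Gemini setup (the key is a placeholder):

```ini
AGENT_BACKEND=gemini-api
GEMINI_API_KEY=AIza...your-key
GEMINI_API_DEFAULT_MODEL=gemini-2.5-flash
```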

CLI agent backends

Every CLI backend follows the same pattern: binary path override, isolated workdir, optional default model, optional extra-args slot.

| Backend | Vars |
| --- | --- |
| `aider-cli` | `AIDER_BINARY` · `AIDER_WORKDIR` · `AIDER_EXTRA_ARGS` |
| `gemini-cli` | `GEMINI_BINARY` · `GEMINI_WORKDIR` · `GEMINI_DEFAULT_MODEL` · `GEMINI_EXTRA_ARGS` |
| `opencode-cli` | `OPENCODE_BINARY` · `OPENCODE_WORKDIR` · `OPENCODE_DEFAULT_MODEL` · `OPENCODE_EXTRA_ARGS` |
| `crush-cli` | `CRUSH_BINARY` · `CRUSH_WORKDIR` · `CRUSH_DEFAULT_MODEL` · `CRUSH_YOLO` · `CRUSH_EXTRA_ARGS` |
| `amp-cli` | `AMP_BINARY` · `AMP_WORKDIR` · `AMP_DEFAULT_MODEL` · `AMP_THREAD` · `AMP_EXTRA_ARGS` |

Each CLI is detached-spawned, process-group-tracked in data/process-registry.json, and cancelable via /cancel. See process management.

Per-session (set via slash commands)

These attach to the session record in data/sessions.json and survive /hangup/resume.

| Field | Set via | Notes |
| --- | --- | --- |
| `model` | `/pickup model:<id>` or `/model name:<id>` | Per-session model override (e.g. `claude-opus-4-7`). Hot-swap preserves history |
| `effort` | `/pickup effort:<level>` or `/effort level:<level>` | `minimal` / `low` / `medium` / `high` / `xhigh` (Opus only) / `max` (Opus only). Maps to `--effort` on the CLI; `thinking.budget_tokens` on the Anthropic API |
| `mode` | `/pickup mode:voice\|text` | |
| `permissionMode` | `/pickup permission-mode:<mode>` or `/permissions mode:<mode>` | Tool permission policy. Mode-aware default: text → `bypassPermissions` (vibecoding), voice → `default`. Choices: `default` / `acceptEdits` / `auto` / `bypassPermissions` / `plan` |
| `notify` | `/notify state:on\|off` | |
| `backend` | `/backend name:<x>` | Which agent backend this session uses (e.g. `claude-code`, `openai-compat`). Persisted across restarts |
| `backendId` | (auto) | Backend's native session id (Claude Code UUID, Codex thread id) for resume |
| `streaming` | `/streaming mode:off\|summary` | |
| `reactivity` | `/reactivity mode:strict\|loose` | |
| `channelId` | (auto) | For text sessions: the Discord channel id this session is bound to. Used to auto-resume on the first message after a bot restart |
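A session record carrying these fields might look like the following. The shape is inferred from the fields above, not a verbatim dump of data/sessions.json, and every value is a placeholder:

```json
{
  "model": "claude-opus-4-7",
  "effort": "high",
  "mode": "text",
  "permissionMode": "bypassPermissions",
  "notify": "on",
  "backend": "claude-code",
  "backendId": "3f2b1c9e-0000-0000-0000-000000000000",
  "streaming": "summary",
  "reactivity": "strict",
  "channelId": "345678901234567890"
}
```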

Extensions

Sandbox dirs at data/extensions/<id>/. MCP server picks an ephemeral localhost port. Permission policy is env-driven — see Security for the full hardening guide.

| Var | Default | Notes |
| --- | --- | --- |
| `EXTENSION_PERMISSION_MODE` | bypassPermissions | `default` / `acceptEdits` / `auto` / `bypassPermissions` / `plan`. Tighten before public deployment |
| `EXTENSION_ALLOWED_TOOLS` | default | Whitelist of tools the extension can use (e.g. `"Read Edit Write Bash(npm *)"`) |
| `EXTENSION_DISALLOWED_TOOLS` | | Explicit denies (e.g. `"WebFetch Bash(rm -rf *)"`) |
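A tightened extension policy before a public deployment might read:

```ini
EXTENSION_PERMISSION_MODE=acceptEdits
EXTENSION_ALLOWED_TOOLS="Read Edit Write Bash(npm *)"
EXTENSION_DISALLOWED_TOOLS="WebFetch Bash(rm -rf *)"
```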

Process management

Each CLI-agent turn (every backend except anthropic-api / openai-compat / gemini-api) spawns as a detached process group, tracked in data/process-registry.json. On bot restart, the boot-time reaper SIGTERMs any orphans whose botPid doesn't match the current process. Safe by construction: only PIDs the bot explicitly recorded are signaled.

| Var | Default | Notes |
| --- | --- | --- |
| `PAPERCUP_TURN_TIMEOUT_S` | 0 (off) | Same variable as under Speaker agent; repeated here for cross-reference |

See process management for the full lifecycle.

Budget tracking

Per-day USD + token tracking, persisted to data/budget.json. Pricing table covers Claude Opus/Sonnet/Haiku 4.x, GPT-5/4o/o3, Gemini 2.5 — unknown models record tokens only.

| Var | Default | Notes |
| --- | --- | --- |
| `BOT_DAILY_BUDGET_USD` | 0 (unlimited) | Daily hard cap in USD. When today's cost ≥ cap, papercup refuses to respond — humans get a "budget spent" message, bots are silently ignored. The cap resets at UTC midnight. Override at runtime with `/budget set_usd:<n>` |

The bot's Discord rich presence reflects today's budget percentage live (e.g. 46% of $10/day); humans see it on hover, and /budget shows the detailed breakdown.
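To cap spend at $10/day, for example:

```ini
BOT_DAILY_BUDGET_USD=10   # refuse responses once today's spend reaches $10
```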

Multi-bot orchestration

Multiple papercup bots can co-exist in one Discord server, each with its own credentials/model/budget. Several guardrails are configured here.

| Var | Default | Notes |
| --- | --- | --- |
| `BOT_BOT_MAX_TURNS` | 3 | Per-channel cap on papercup's consecutive replies since the last human message. Hitting the cap silences papercup in that channel until a human chimes in. Prevents bot-to-bot runaway loops |
| `BOT_ROSTER_CHANNEL_ID` | | Designated #roster channel (Discord channel id). On boot, papercup scrapes recent messages for `papercup-roster v1` announcements from other bots. Used by `/announce` and `/refresh-roster` |
| `BOT_WORKDIR` | process.cwd() | This bot's declared filesystem root. Compared against other bots' workdirs on boot; overlaps produce a log-only warning |
| `BOT_OWNER_DISCORD_ID` | | Operator's Discord user id, embedded in `/announce` so other operators know who owns this bot |
| `BOT_DEFAULT_REACTIVITY` | strict | Default reactivity to other bots for newly created sessions. Per-session override via `/reactivity` |
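Each co-existing bot might declare something like this (IDs and paths are placeholders):

```ini
BOT_BOT_MAX_TURNS=3
BOT_ROSTER_CHANNEL_ID=456789012345678901
BOT_WORKDIR=/home/me/projects/app
BOT_OWNER_DISCORD_ID=345678901234567890
BOT_DEFAULT_REACTIVITY=strict
```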

See multi-bot orchestration for the full picture.

Diagnostic

| Var | Default | Notes |
| --- | --- | --- |
| `DUMP_PCM` | | Set to 1 to dump the first significant utterance to `/tmp/papercup-*.f32` for offline VAD/STT debugging |

Released under the MIT License.