Architecture overview

ollim-bot is a single-process Python application that bridges Discord with Claude via the Agent SDK. All modules live under src/ollim_bot/, with two sub-packages (google/ and scheduling/) for domain-specific functionality. The core agent module delegates to three extracted helpers — agent_context.py (timestamps, pending updates, ThinkingConfig), agent_streaming.py (stream consumption and auto-compaction retry), and fork_state.py (contextvars, dataclasses, idle timeouts).

Data flow

A message from Discord travels through four stages before a response appears.

Discord event

bot.py receives a DM. It extracts text and image attachments, resolves reply context (including fork session resumption), and acquires the agent lock.

Agent processing

agent.py injects the message into the active ClaudeSDKClient session. agent_context.py prepends a timestamp and any pending background updates. agent_streaming.py consumes the SDK response, yielding text deltas and StreamStatus signals through an AsyncGenerator — including transparent auto-compaction retry.
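The retry behaviour can be pictured with a small async generator. This is only a sketch under assumptions: `ContextTooLong`, the `COMPACTING` member, and the `run_query` / `compact` callables are hypothetical stand-ins, not the SDK's or agent_streaming.py's actual names, and it assumes the overflow is detected before any text has been yielded.

```python
from collections.abc import AsyncGenerator, Callable
from enum import Enum
from typing import Union


class StreamStatus(Enum):
    """Hypothetical status signals; the real StreamStatus may differ."""

    COMPACTING = "compacting"


class ContextTooLong(Exception):
    """Hypothetical stand-in for however a context overflow is reported."""


async def stream_with_retry(
    run_query: Callable[[], AsyncGenerator[str, None]],
    compact: Callable[[], None],
) -> AsyncGenerator[Union[str, StreamStatus], None]:
    """Yield text deltas; on overflow, signal, compact once, and retry."""
    for attempt in range(2):
        try:
            async for delta in run_query():
                yield delta
            return
        except ContextTooLong:
            if attempt == 1:
                raise  # a second overflow is not retried in this sketch
            yield StreamStatus.COMPACTING
            compact()
```

The consumer sees a single stream either way, which is what makes the retry transparent to the caller.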

Streaming to Discord

streamer.py consumes the generator, buffering deltas and progressively editing a Discord message. When the text exceeds Discord's 2000-character message limit, it finalizes the current message and starts a new one.
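The overflow rule can be sketched as a pure function. `split_for_discord` is a hypothetical helper used for illustration; the real streamer.py edits messages progressively and throttles those edits, which this sketch leaves out.

```python
DISCORD_MAX_CHARS = 2000  # Discord's per-message character limit


def split_for_discord(deltas: list[str], limit: int = DISCORD_MAX_CHARS) -> list[str]:
    """Accumulate text deltas into messages no longer than `limit` characters."""
    messages: list[str] = []
    current = ""
    for delta in deltas:
        current += delta
        # Finalize a full message each time the buffer overflows the limit.
        while len(current) > limit:
            messages.append(current[:limit])
            current = current[limit:]
    if current:
        messages.append(current)
    return messages
```

A production splitter would probably prefer to break on a newline or word boundary rather than at an exact character offset.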

Post-stream transitions

bot.py checks for fork transitions — if the agent called enter_fork or exit_fork during the response, the bot handles the state change (creating fork embeds, swapping clients, or discarding the fork).

Background fork execution

Scheduled routines, reminders, and webhooks run on disposable forked sessions that execute in parallel without blocking the main conversation.

Forked mode

The default. run_agent_background creates a client forked from the main session, so the fork inherits full conversation history. Output is discarded unless the agent calls report_updates (which writes to pending_updates.json) or ping_user/discord_embed (which message the user directly, subject to ping budget).

Isolated mode

When isolated: true is set in the routine or reminder YAML, create_isolated_client creates a standalone session with no conversation history. Used for tasks that don’t need prior context, like email triage.

Background forks communicate back to the main session through pending updates — summaries written to ~/.ollim-bot/state/pending_updates.json. The main session pops these updates and prepends them to the next user message. Forks peek at updates (read-only) to avoid consuming another fork’s output.
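The difference between popping and peeking can be sketched as follows. The on-disk format (a JSON list of strings) and the function names are assumptions made for illustration; forks.py may store richer records and guard against concurrent writers.

```python
import json
from pathlib import Path


def peek_updates(path: Path) -> list[str]:
    """Read pending updates without consuming them (what a fork would do)."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def pop_updates(path: Path) -> list[str]:
    """Read pending updates and clear the file (what the main session would do)."""
    updates = peek_updates(path)
    if updates:
        path.write_text("[]", encoding="utf-8")
    return updates


def append_update(path: Path, summary: str) -> None:
    """Add one summary, roughly what a report_updates call would record."""
    updates = peek_updates(path)
    updates.append(summary)
    path.write_text(json.dumps(updates), encoding="utf-8")
```

Because only the main session pops, a summary written by one fork stays visible to every other fork until the user's next message consumes it.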

Background forks run without the agent lock. Fork state, busy state, chain context, background tracking, and fork config are all scoped via contextvars so concurrent forks don’t interfere with each other or the main session. The DM channel is a module-level global set once at startup — safe to share because it never changes.

Key architectural patterns

Session persistence

The bot maintains a single ClaudeSDKClient with a session ID persisted to ~/.ollim-bot/state/sessions.json. On restart, it resumes the existing session. All session lifecycle events (created, compacted, swapped, cleared, interactive_fork, bg_fork, isolated_bg, restarting) are logged to session_history.jsonl.
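A minimal sketch of this persistence, assuming a `session_id` key in sessions.json, a `ts`/`session_id`/`event` record shape, and that session_history.jsonl sits in the same state directory. None of these details are confirmed by the text above, and the function names are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Optional


def save_session_id(state_dir: Path, session_id: str) -> None:
    """Persist the current session ID so a restart can resume it."""
    state_dir.mkdir(parents=True, exist_ok=True)
    (state_dir / "sessions.json").write_text(json.dumps({"session_id": session_id}))


def load_session_id(state_dir: Path) -> Optional[str]:
    """Return the saved session ID, or None on first run."""
    path = state_dir / "sessions.json"
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("session_id")


def log_session_event(state_dir: Path, session_id: str, event: str) -> None:
    """Append one lifecycle event (e.g. "created", "compacted") as a JSONL line."""
    state_dir.mkdir(parents=True, exist_ok=True)
    record = {"ts": time.time(), "session_id": session_id, "event": event}
    with (state_dir / "session_history.jsonl").open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Appending one JSON object per line keeps the history log cheap to write and easy to scan without parsing the whole file.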

Contextvar isolation

Background forks use ContextVar instances to scope mutable state. All fork-related contextvars and dataclasses live in fork_state.py — key variables include _in_fork_var, _busy_var, _bg_tracking (a BgForkTracking dataclass holding output_sent, reported, and ping_count), and _bg_fork_config_var. _chain_context_var in agent_tools.py and _msg_collector in sessions.py follow the same pattern. This lets multiple forks run concurrently while the main session uses module-level globals for the same values.

File-based storage

All persistent data lives in ~/.ollim-bot/ as files:

Markdown with YAML frontmatter for human-editable data (routines, reminders, webhooks) — the agent reads and writes these
JSONL for append-only logs (session history)
JSON for small state files (session ID, ping budget, inquiries)

storage.py provides generic I/O with atomic writes (temp file + rename) and optional git auto-commit.
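The temp-file-plus-rename technique looks roughly like this. `atomic_write` is a hypothetical name and the sketch omits the git auto-commit step; it shows the general pattern, not storage.py's actual code.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, text: str) -> None:
    """Write `text` to `path` so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # swaps the file in as a single step
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash at any point leaves either the old file or the new one on disk, never a truncated mix of the two.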

Dual state for tools

MCP tools in agent_tools.py maintain two parallel references — a module-level global for the main session and a ContextVar for background forks. Functions like set_chain_context / set_fork_chain_context set the appropriate reference based on execution context.

Persistent buttons

Discord buttons survive bot restarts through two mechanisms: DynamicItem[Button] in views.py reconstructs button handlers from custom_id patterns on startup, and inquiries.py persists agent-generated button prompts to disk with a 7-day TTL.
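The 7-day TTL on persisted inquiries can be sketched as a pruning pass over the stored entries. The `created_at` field, the dictionary layout, and the `prune_expired` name are assumptions; inquiries.py may track expiry differently.

```python
import time
from typing import Optional

INQUIRY_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


def prune_expired(
    inquiries: dict[str, dict], now: Optional[float] = None
) -> dict[str, dict]:
    """Drop inquiries older than the TTL.

    Each entry is assumed to carry a `created_at` epoch timestamp.
    """
    now = time.time() if now is None else now
    return {
        key: entry
        for key, entry in inquiries.items()
        if now - entry["created_at"] < INQUIRY_TTL_SECONDS
    }
```

Pruning when the file is loaded means a button pressed after its prompt has expired finds no matching inquiry.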

Module map

The codebase has 42 modules organized into five layers.


Core loop

The main path from a Discord message to a streamed response.

Module	Role
`main.py`	CLI entry point — dispatches to bot or subcommands, sets up the SDK layout at startup
`auth.py`	Claude Code auth — headless login via bundled CLI, startup auth check
`bot.py`	Discord interface — DMs, slash commands, reaction acks
`agent.py`	Agent SDK wrapper — sessions, MCP servers (`discord` + `docs`), slash routing; delegates context prep to `agent_context.py` and streaming to `agent_streaming.py`
`agent_context.py`	Message context helpers — timestamps, duration formatting, pending update assembly, `ThinkingConfig` builder
`agent_streaming.py`	Stream response consumer — SDK message loop, auto-compaction retry, fork interrupt, fallback tiers
`streamer.py`	Streams text deltas to Discord — throttled edits, 2000-char overflow
`prompts.py`	System prompt for the main agent and fork prompt helpers
`subagents.py`	Bundled agent installation (`install_agents`) and tool-set extraction (`load_agent_tool_sets`) for policy validation
`subagents/`	Subagent specs as markdown files (ollim-bot-guide, gmail-reader, history-reviewer, responsiveness-reviewer, user-proxy)
`channel.py`	DM channel reference — set once at startup, read everywhere
`profile.py`	User profile files: IDENTITY.md (bot persona) and USER.md (user context), bootstrap and loading
`updater.py`	Git-based auto-update: fetch, compare, pull (`--ff-only`), `uv tool upgrade`, restart via `os.execv`
`doctor.py`	Health diagnostics — checks data dir, SDK layout, credentials, and reports issues

Tool system

MCP tools and the external trigger server the agent uses to interact with Discord and the outside world.

Module	Role
`agent_tools.py`	MCP tools for Discord embeds, pings, forks, and chains
`reminder_tools.py`	MCP tool implementations for reminder management (add, list, cancel)
`hooks.py`	Agent SDK hooks: `state_dir_guard` (blocks writes to `state/`) and `auto_commit_hook` (auto-commits `.md` changes)
`webhook.py`	HTTP server for external triggers — auth, validation, Haiku screening
`fork_state.py`	Fork state — contextvars, dataclasses (`BgForkTracking`, `BgForkConfig`), interactive fork globals
`forks.py`	Pending updates I/O and background fork execution (`run_agent_background`)
`tool_policy.py`	Tool pattern validation, per-job tool restrictions, and YAML tool policy config
`views.py`	Persistent button handlers via `DynamicItem` — delegates to google/, forks, and streamer

Storage and state

Persistence, configuration, and cross-cutting concerns.

Module	Role
`storage.py`	Shared JSONL and markdown I/O, git auto-commit for `~/.ollim-bot/`
`sessions.py`	Persists Agent SDK session ID + session history JSONL log
`permissions.py`	Tool approval — `canUseTool` callback, reaction-based approval
`config.py`	Env vars: `OLLIM_USER_NAME`, `OLLIM_BOT_NAME` (from `.env`)
`embeds.py`	Embed/button types and builders shared by `agent_tools` and `views`
`inquiries.py`	Persists button inquiry prompts to disk (7-day TTL, survives restarts)
`ping_budget.py`	Refill-on-read ping budget — capacity 5, refills 1 per 90 min
`runtime_config.py`	Persistent runtime config — model/thinking per context, timeouts, permission mode
`skills.py`	Skill permission helpers for background fork dispatch (SDK handles skill loading natively)
`formatting.py`	Tool-label formatting helpers shared by agent and permissions
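The refill-on-read behaviour listed for `ping_budget.py` (capacity 5, one ping refilled per 90 minutes) can be sketched without a background timer: the budget is topped up lazily each time it is read. The class and method names are hypothetical, and the choice to restart the clock when the budget is full is an assumption rather than the module's documented behaviour.

```python
from dataclasses import dataclass

CAPACITY = 5
REFILL_SECONDS = 90 * 60


@dataclass
class PingBudget:
    available: int = CAPACITY
    last_refill: float = 0.0  # epoch seconds of the last credited refill

    def refill(self, now: float) -> None:
        """Credit one ping per elapsed 90-minute interval, capped at capacity."""
        earned = int((now - self.last_refill) // REFILL_SECONDS)
        if earned <= 0:
            return
        if self.available + earned >= CAPACITY:
            self.available = CAPACITY
            self.last_refill = now  # budget is full, so restart the clock
        else:
            self.available += earned
            # Advance by whole intervals so partial progress is not lost.
            self.last_refill += earned * REFILL_SECONDS

    def try_spend(self, now: float) -> bool:
        """Refill lazily, then spend one ping if any remain."""
        self.refill(now)
        if self.available == 0:
            return False
        self.available -= 1
        return True
```

Because the state is just a count and a timestamp, it fits in a small JSON file and survives restarts without a running timer.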

Google integration (`google/`)

OAuth2-based integrations with Google services.

Module	Role
`auth.py`	Shared OAuth2 credentials for Tasks + Calendar + Gmail
`tasks.py`	Google Tasks CLI + API helpers (`complete_task`, `delete_task`)
`calendar.py`	Google Calendar CLI + API helpers (`delete_event`)
`gmail.py`	Gmail CLI (`ollim-bot gmail`) — read-only access

Scheduling (`scheduling/`)

Proactive routines and reminders via APScheduler.

Module	Role
`scheduler.py`	APScheduler integration — polls files every 10s, registers triggers
`routines.py`	Routine dataclass and markdown I/O — recurring crons in `routines/*.md`
`reminders.py`	Reminder dataclass and markdown I/O — one-shot + chainable
`preamble.py`	Background preamble builder — ping budget, schedule, and config
`routine_cmd.py`	CLI handler for `ollim-bot routine` (add, list, cancel)
`reminder_cmd.py`	CLI handler for `ollim-bot reminder` (add, list, cancel)

Find what you need

I want to…	Go to
Understand the full message-to-response pipeline	How ollim-bot works
See how sessions persist and recover across restarts	Session management
Trace context through forks and pending updates	Context flow
Learn how responses stream to Discord	Streaming & Discord
Understand the design decisions and tradeoffs	Design philosophy
See the data directory layout and env vars	Configuration reference

Next steps

Session management

How sessions persist, compact, and recover across restarts.

Context flow

How context flows between main sessions, forks, and pending updates.

Streaming

How agent responses stream to Discord with throttled edits.

Configuration reference

All environment variables and data directory structure.
