Persistent memory for AI coding agents — every session builds on the last.
Imprint is a plugin for Claude Code that gives your AI agent persistent memory across sessions. Every tool call, decision, and discovery is captured, compressed via LLM, indexed for search, and injected as context into future sessions.
No Docker. No external databases. Single Go binary + SQLite.
Inspired by agentmemory (Node.js + Docker) and MemPalace (Python + ChromaDB), rebuilt from scratch in Go with ideas from both: agentmemory's observation pipeline and UI, MemPalace's 4-layer memory stack, query sanitization, and write-ahead log.
| Category | What you get |
|---|---|
| Automatic Capture | 11 hooks capture every tool use, prompt, error, and decision — zero manual effort |
| LLM Compression | Raw observations compressed into structured memories with concepts, files, and importance scores |
| Hybrid Extraction (v1.2.0) | Regex pre-pass extracts files, PascalCase concepts, URLs, error markers, and git refs deterministically. The LLM only writes title/narrative/importance — fewer tokens, faster, half the Haiku spend. Toggle with IMPRINT_EXTRACTION_MODE=llm-only to revert. |
| Prompt-Injection Defense (v1.3.0) | Tool outputs are scrubbed for ~12 known injection patterns ("ignore previous instructions", role-hijack, system-tag spoofs, exfil prompts, fence breakouts) before storage. Suspicious spans become [FLAGGED:reason] markers in the audit log. |
| Memory Decay (v1.3.0) | Low-strength memories (≤3) older than 30 days are soft-archived every 6 hours so retrieval stays focused on signal. Strong memories survive forever. |
| Backlink-Boosted Ranking (v1.3.0) | Optional graph in-degree multiplier on hybrid search. A memory referenced by N other nodes ranks higher; activates once a graph provider is attached. |
| Eval Capture (v1.3.0, opt-in) | With IMPRINT_EVAL_CAPTURE=1, every search/recall captures (query, returned ids) into an eval_candidates table after PII scrubbing. Export via GET /imprint/eval/export (NDJSON). Replay tooling lands in a follow-up. |
| Live Actions Kanban (v1.4.0) | The Actions tab now streams updates via Server-Sent Events on GET /imprint/actions/stream. Cards animate between Pending/In Progress/Done columns the moment the agent moves them, no 5-second poll wait. Each card carries a session badge so you can tell which Claude Code session produced it. Polling stays as a fallback when SSE is unavailable. |
| Pinned Memories (v1.5.0) | Toggle the ★ in any memory card or the modal header to immunize it against the decay sweep. Critical decisions and architecture notes survive forever even when rarely reinforced. |
| Time-Travel Slider (v1.5.0) | Drag the slider on the Memories tab to ask "what did I know N days ago?" — backend filters by created_at <= cutoff. Refresh-resistant: state survives in ?days=N. |
| Inline Concept Editor (v1.5.0) | Click any concept pill in the Memory modal to remove, or use the "+" input to add. Persists immediately via POST /memories/concepts. |
| Why-This-Memory Score Breakdown (v1.5.0) | Hit "Why was this retrieved?" inside any Memory modal — runs a hybrid search by the title and shows the BM25/Vec/Rank scores for the top 5 results. Debugging the retrieval is no longer a black box. |
| Cluster Summarizer (v1.5.0) | Hover a node in the Graph for ~600ms; the LLM names the cluster topic in 4-7 words ("Spring Boot · transaction expiration"). In-memory cache means the same hover doesn't burn Haiku tokens twice. |
| Conversation Playback (v1.5.0) | Click "Playback" in any session card to open a unified chronological timeline of observations + memories created + actions, with rail-style visualization. Useful for retro reviews. |
| Pipeline Health Dashboard (v1.5.0) | Header badge shows HEALTHY / BACKLOG / IDLE / STALLED derived from real-time pipeline state. Sparkline of LLM calls/minute (last 30 datapoints) + auto-detect of circuit breakers + 8s polling so you can see throughput live. |
| Live Knowledge Graph (v1.5.0+) | Force-directed memory graph with light/dark theme, percentile-based fit-to-bounds (no more outliers shrinking your cluster), zoom controls, label-propagation community detection (colored halos = topology clusters), idle breath animation, hover ripples, and synaptic pulses streaming along edges like a live neural net. |
| Memory Decay (Configurable) (v1.5.0) | Settings UI exposes min_strength_to_archive and min_age_days sliders; the scheduler reads from the live config so changes apply without restart. Pinned memories always exempt. |
| Color-Coded Concepts (v1.5.0) | Concept tags use a deterministic hash → fixed palette so the same concept ("auth", "decay") always paints the same color across Memories, Lessons, and Graph tooltips. Click any tag in Lessons to filter. |
| Knowledge Graph Pipeline | LLM extracts entities (files, functions, concepts) and relations from compressed observations. Since v1.5.0, extraction runs incrementally on the periodic scheduler; the graph_extracted_at column marks processed rows so they aren't re-extracted every tick. |
| Background Pipeline | Scheduler runs summarize + consolidate + action extraction + graph extract every N minutes during active sessions (configurable) |
| Heartbeat-Based Finalize (v1.5.4) | The Claude Code SessionEnd event doesn't fire reliably on /exit, so finalize is triggered by absence of activity instead. The Stop hook posts /imprint/session/heartbeat every turn; after 15min without one, the scheduler runs RunFinalize (final summarize + consolidate + graph + actions + reflect) and marks the session completed. Sessions that get finalized but receive a heartbeat later (long pause + return) are auto-resurrected back to active, preserving the audit trail. |
| Hybrid Search | BM25 (Bleve) + vector cosine similarity with Reciprocal Rank Fusion. Search responses now include per-result bm25Score / vecScore / rank (v1.5.0). |
| Context Injection | Relevant memories injected at session start and before context compaction. (v1.5.0) Injected items now carry inline metadata suffixes, (★N · Xd) on memories and (iN · Xd) on observations, so the assistant sees strength and age without a DB query. |
| Smart Hooks | User prompts captured as intent anchors, task completions sync to kanban, failures tracked for error learning. (v1.5.0) heuristic filter prevents short user prompts ("continua", "ok", task-notification XML) from polluting the Actions kanban. |
| Multi-Provider LLM | Anthropic (API key + Claude Code OAuth auto-detect), OpenRouter, llama.cpp with circuit breaker + fallback |
| MCP Server | 8 tools for explicit memory recall, save, search, and graph queries |
| 12-Tab Web UI | Dashboard, Recall, Sessions, Timeline, Memories, Graph, Actions, Lessons, Activity, Audit, Profile, Settings |
| Global Topbar Search | Modal search overlay on every page — query the Bleve index from anywhere with title, type, score, narrative, concepts and files. (v1.5.0) keyboard shortcut: press / from anywhere to focus. |
| URL-Synced UI State (v1.5.0) | Memories slider and Lessons tag filters reflect in the query string (?days=7&tags=auth,decay). Refresh-resistant + shareable links. |
| Settings UI | Select LLM provider/model, configure API keys, tune search weights, pipeline interval, decay thresholds — all from the browser |
| 4-Layer Memory Stack | L0 Identity, L1 Essential Story, L2 Session Context, L3 On-Demand Search — each with token budgets |
| Actions Kanban | Pending = waiting on user (permission prompts), In Progress = current prompt being worked on, Done = completed tasks. Older in-progress entries auto-graduate to done when a new one starts. |
| Lessons & Insights | Two-column split layout with independent scrolling — see lessons and insights side-by-side without scrolling the page |
| Query Sanitizer | Detects and strips system prompt contamination from search queries |
| Write-Ahead Log | Append-only JSONL audit of every write operation for crash recovery and poisoning detection |
| Index Self-Heal | If the BM25 index is empty on startup but the DB has compressed observations, a background goroutine reindexes everything automatically |
| Build Provenance (v1.5.1) | /imprint/health exposes version + git commit injected via build-time ldflags. Dashboard System Health shows which binary is actually running. |
| Transcript Mining | go run ./cmd/mine imports historical Claude Code JSONL sessions retroactively |
| Auto-Start | Server launches automatically on first Claude Code session with retry + error logging |
| Privacy | All data stays local in ~/.imprint/. Secrets are scrubbed with 16 regex patterns before storage |
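The Hybrid Search row above merges BM25 and vector rankings with Reciprocal Rank Fusion. The following is a minimal sketch of that technique, not Imprint's actual code: the function name `fuseRRF`, the constant k = 60 (a commonly used default), and the sample memory IDs are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// fuseRRF merges several ranked ID lists with Reciprocal Rank Fusion:
// score(id) = sum over lists of 1 / (k + rank), where rank starts at 1.
// The name and signature are hypothetical, chosen for this example.
func fuseRRF(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Sort by descending fused score; break ties by ID for determinism.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"m1", "m2", "m3"} // sample full-text ranking
	vec := []string{"m2", "m4", "m1"}  // sample vector ranking
	fmt.Println(fuseRRF(60, bm25, vec))
}
```

Because RRF only looks at rank positions, it sidesteps the problem that BM25 scores and cosine similarities live on different scales. An item that appears in both lists (here `m2` and `m1`) outranks items found by only one ranker.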
graph TD
HOOK["Claude Code Hook<br/><i>SessionStart, PostToolUse, etc.</i>"]
ENSURE["ensure-server<br/><i>auto-start, 10s timeout, logs errors</i>"]
OBSERVE["/imprint/observe<br/><i>rate-limit, dedup, privacy scrub</i>"]
COMPRESS["LLM Compress<br/><i>type, title, narrative, importance</i>"]
INDEX["BM25 + Vector Index<br/><i>Bleve full-text + cosine similarity</i>"]
GRAPH["Knowledge Graph<br/><i>entities, relations, BFS traversal</i>"]
SCHEDULER["Background Scheduler<br/><i>summarize + consolidate every 5min</i>"]
CONTEXT["Context Builder<br/><i>token-budgeted XML injection</i>"]
SESSION["New Session<br/><i>SessionStart hook fires</i>"]
MCP["MCP Tools<br/><i>memory_recall, memory_save, etc.</i>"]
UI["Web UI<br/><i>http://localhost:3111</i>"]
HOOK --> ENSURE
ENSURE --> OBSERVE
OBSERVE --> COMPRESS
COMPRESS --> INDEX
COMPRESS --> GRAPH
INDEX --> SCHEDULER
SCHEDULER -->|"memories, lessons, actions"| INDEX
SESSION --> CONTEXT
CONTEXT -->|"stdout injection"| HOOK
MCP --> INDEX
UI --> INDEX
style HOOK fill:#F97316,color:#fff,stroke:none,rx:12
style ENSURE fill:#2D8E5E,color:#fff,stroke:none,rx:12
style OBSERVE fill:#2B7BB5,color:#fff,stroke:none,rx:12
style COMPRESS fill:#7E44A8,color:#fff,stroke:none,rx:12
style INDEX fill:#2B7BB5,color:#fff,stroke:none,rx:12
style GRAPH fill:#7E44A8,color:#fff,stroke:none,rx:12
style SCHEDULER fill:#B8860B,color:#fff,stroke:none,rx:12
style CONTEXT fill:#2D8E5E,color:#fff,stroke:none,rx:12
style SESSION fill:#F97316,color:#fff,stroke:none,rx:12
style MCP fill:#1A1612,color:#fff,stroke:none,rx:12
style UI fill:#1A1612,color:#fff,stroke:none,rx:12
Graph extraction is incremental (v1.5.0): only observations without graph_extracted_at are processed, capped per tick to control Haiku cost.
Injected items carry inline metadata suffixes (★N · Xd or iN · Xd) so the assistant has strength + age signal without a roundtrip to the DB.

| Hook | Trigger | What it does |
|---|---|---|
| session-start | Claude Code opens | Creates session, injects context (retry 3x with backoff) |
| prompt-submit | User sends message | Captures user intent as high-importance observation; opens an in_progress entry on the Actions kanban (older in-progress in the same session graduate to done) |
| pre-tool-use | Before Read/Edit/Grep | Enriches context with relevant memories for the files being touched |
| post-tool-use | After any tool | Records observation, auto-compresses via LLM, indexes the result into BM25 |
| post-tool-failure | Tool fails | Records error with distinct type for pattern detection |
| pre-compact | Context compaction | Saves snapshot before context is lost, injects recovered context |
| notification | Permission prompt | Surfaces the prompt as a pending action so the user sees Claude is waiting on them |
| subagent-start / subagent-stop | Task agent spawned/finished | Records subagent lifecycle for the activity feed |
| stop | End of every assistant turn | Closes the kanban turn, processes transcript for missed observations, posts /imprint/session/heartbeat to keep the session alive (v1.5.4). |
| session-end | Session finalizes | Best-effort fast-path: posts /imprint/session/end + /imprint/finalize. The Claude Code event doesn't fire reliably on /exit, so the canonical finalize path is the scheduler's heartbeat-idle sweep (v1.5.4) — see "Heartbeat-Based Finalize" above. |
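The stop and session-end rows describe finalizing a session by absence of heartbeats rather than by an explicit end event. A rough sketch of that idea follows. The `session` struct, the `sweepIdle` and `heartbeat` helpers, and the status strings are hypothetical; only the 15-minute window and the resurrect-on-late-heartbeat behavior come from the text above.

```go
package main

import (
	"fmt"
	"time"
)

// session is a hypothetical, simplified record; the real schema may differ.
type session struct {
	ID            string
	Status        string // "active" or "completed"
	LastHeartbeat time.Time
}

// idleTimeout mirrors the 15-minute window described above.
const idleTimeout = 15 * time.Minute

// sweepIdle marks active sessions with no heartbeat for longer than
// idleTimeout as completed and returns their IDs. A real scheduler would
// run the finalize pipeline for each of them at this point.
func sweepIdle(sessions []*session, now time.Time) []string {
	var finalized []string
	for _, s := range sessions {
		if s.Status == "active" && now.Sub(s.LastHeartbeat) > idleTimeout {
			s.Status = "completed"
			finalized = append(finalized, s.ID)
		}
	}
	return finalized
}

// heartbeat records activity and resurrects a session that was already
// finalized (a long pause, then the user returns).
func heartbeat(s *session, now time.Time) {
	s.LastHeartbeat = now
	if s.Status == "completed" {
		s.Status = "active"
	}
}

// demo runs one sweep and one late heartbeat against fixed timestamps.
func demo() (finalized []string, statusAfterReturn string) {
	now := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	idle := &session{ID: "s1", Status: "active", LastHeartbeat: now.Add(-20 * time.Minute)}
	busy := &session{ID: "s2", Status: "active", LastHeartbeat: now.Add(-2 * time.Minute)}
	finalized = sweepIdle([]*session{idle, busy}, now)
	heartbeat(idle, now.Add(time.Minute)) // the user comes back
	return finalized, idle.Status
}

func main() {
	finalized, status := demo()
	fmt.Println(finalized, status)
}
```

Keying finalization on elapsed time makes the outcome independent of whether the client ever sends an end event, at the cost of a delay of up to one timeout window before a session is marked completed.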
# Add the marketplace (one time)
/plugin marketplace add JohnPitter/imprint
# Install the plugin
/plugin install imprint@imprint-tools
Restart Claude Code once. The SessionStart hook downloads the prebuilt
binaries for your platform (darwin-arm64, darwin-amd64, linux-amd64,
linux-arm64, windows-amd64) from the matching GitHub release on first
run. Subsequent sessions reuse the cached binaries (~ms).
No Go or Node toolchain required. Releases are signed with sha256 checksums.
For development, plugin authoring, or unsupported platforms:
git clone https://github.com/JohnPitter/imprint.git
cd imprint
go run ./cmd/install # build all binaries into plugin/bin/
Add the local plugin directly:
claude --plugin-dir $(pwd)/plugin
If you prefer the pre-marketplace flow that writes hooks/MCP straight into
~/.claude/settings.json (no marketplace required), pass --register:
go run ./cmd/install --register
This is kept for backwards compatibility. New users should use the marketplace install.
The MCP tools (memory_recall, memory_save, etc.) are available to Claude.
cd imprint
go run ./cmd/install --uninstall
| Layer | Technology |
|---|---|
| Language | Go 1.25 (pure Go, no CGO) |
| Database | SQLite with WAL mode (modernc.org/sqlite) |
| Search | Bleve (BM25 full-text) + in-memory vector (cosine similarity) |
| HTTP | Chi router + embedded Svelte SPA |
| Frontend | Svelte 3 + TypeScript + Vite |
| LLM | Anthropic, OpenRouter, llama.cpp (configurable fallback chain) |
| Protocol | MCP (JSON-RPC over stdio) |
| Hooks | 11 compiled Go binaries + ensure-server launcher (~6MB each, <50ms startup) |
| Testing | Go testing + httptest |
| CI/CD | GitHub Actions (lint, test, build, security) |
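The Search row pairs Bleve with an in-memory vector index scored by cosine similarity. As a minimal sketch of that scoring step, assuming embeddings are plain `[]float64` slices held in memory (the `cosine` and `nearest` helpers are illustrative names, not Imprint's API):

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ, the vectors are empty, or either has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest does a linear scan and returns the index of the stored vector
// most similar to query, or -1 when nothing is stored.
func nearest(query []float64, stored [][]float64) int {
	best, bestScore := -1, math.Inf(-1)
	for i, v := range stored {
		if s := cosine(query, v); s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	stored := [][]float64{{1, 0}, {0, 1}, {0.7, 0.7}}
	query := []float64{0.9, 0.1}
	i := nearest(query, stored)
	fmt.Printf("nearest index %d, similarity %.3f\n", i, cosine(query, stored[i]))
}
```

A linear scan like this is a reasonable fit for a local, single-user store; larger collections would typically call for an approximate nearest-neighbor index instead.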
git clone https://github.com/JohnPitter/imprint.git
cd imprint
# Install frontend deps
cd frontend && npm install && cd ..
# Run dev server
go run .
# Run tests
go test ./... -count=1
# Build production
cd frontend && npm run build && cd ..
go build -ldflags="-s -w" -o imprint.exe .
# Build hooks + MCP
go run ./cmd/install --build-only
imprint/
main.go # HTTP server + scheduler entrypoint
internal/
config/ # Config loader + user settings
store/ # SQLite stores (17 stores, 28 tables)
search/ # BM25 + vector + hybrid search
llm/ # Provider interface + Anthropic/OpenRouter/llama.cpp
pipeline/ # Compress, summarize, consolidate, graph extract
privacy/ # Secret scrubbing (16 regex patterns)
service/ # Business logic layer + scheduler
server/ # HTTP handlers + Chi router
mcp/ # MCP JSON-RPC server
hooks/ # Shared hook library
cmd/
hooks/ # 11 hook binaries
ensure-server/ # cross-platform auto-start launcher invoked by SessionStart
mcp-server/ # Standalone MCP binary
install/ # One-command installer
frontend/ # Svelte 3 + TypeScript UI
plugin/ # Claude Code plugin structure
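The privacy/ package in the tree above scrubs secrets with regex patterns before anything is stored. Here is a small sketch of how such a pass could look. The three patterns below are hypothetical stand-ins chosen for illustration; they are not the 16 patterns Imprint ships, and the `[REDACTED]` placeholder is likewise an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPatterns holds hypothetical example patterns, not Imprint's real list.
var secretPatterns = []*regexp.Regexp{
	regexp.MustCompile(`sk-[A-Za-z0-9_-]{20,}`),               // API-key-like tokens
	regexp.MustCompile(`(?i)bearer\s+[A-Za-z0-9._~+/-]{16,}`), // Authorization headers
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),                    // AWS-style access key IDs
}

// scrub replaces every match of every pattern with a fixed placeholder.
func scrub(s string) string {
	for _, re := range secretPatterns {
		s = re.ReplaceAllString(s, "[REDACTED]")
	}
	return s
}

func main() {
	raw := "curl -H 'Authorization: Bearer abcdefghijklmnop1234' https://example.com"
	fmt.Println(scrub(raw))
}
```

Compiling the patterns once at package level keeps the per-observation cost to a handful of linear-time matches, which matters when every tool call passes through the scrubber.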
| Variable | Default | Description |
|---|---|---|
| IMPRINT_PORT | 3111 | HTTP server port |
| IMPRINT_DATA_DIR | ~/.imprint | Data directory |
| IMPRINT_SECRET | — | Optional Bearer token for API auth |
| ANTHROPIC_API_KEY | auto-detect | API key or Claude Code OAuth |
| OPENROUTER_API_KEY | — | OpenRouter API key |
| LLAMACPP_URL | http://localhost:8080 | llama.cpp server URL |
| PIPELINE_INTERVAL_MIN | 5 | Background pipeline interval (0 = disabled) |
| DECAY_MIN_STRENGTH | 3 | Memories with strength ≤ this become decay candidates (v1.5.0) |
| DECAY_MAX_AGE_DAYS | 30 | Min age before a weak memory is archived (v1.5.0) |
| IMPRINT_EVAL_CAPTURE | 0 | Set 1 to capture (query, returned ids) for retrieval regression testing |
| IMPRINT_EXTRACTION_MODE | hybrid | hybrid (regex pre-pass + LLM) or llm-only (legacy) |
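To make the defaults in the table concrete, here is a sketch of an environment-driven loader. The variable names and default values come from the table; the `config` struct, the helper functions, and the fallback behavior for malformed values are assumptions, since the actual loader in internal/config may work differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// config is a hypothetical subset of settings, for illustration only.
type config struct {
	Port             int
	DataDir          string // "~" is left unexpanded in this sketch
	PipelineInterval int    // minutes; 0 disables the background pipeline
	DecayMinStrength int
	DecayMaxAgeDays  int
	EvalCapture      bool
	ExtractionMode   string // "hybrid" or "llm-only"
}

// intOr parses the variable as an integer, falling back to def when the
// variable is unset or malformed (an assumed policy).
func intOr(getenv func(string) string, key string, def int) int {
	if v, err := strconv.Atoi(getenv(key)); err == nil {
		return v
	}
	return def
}

// strOr returns the variable's value, or def when it is empty.
func strOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// loadConfig takes the lookup function as a parameter so it can be fed
// os.Getenv in production and a map-backed stub in tests.
func loadConfig(getenv func(string) string) config {
	mode := strOr(getenv, "IMPRINT_EXTRACTION_MODE", "hybrid")
	if mode != "hybrid" && mode != "llm-only" {
		mode = "hybrid" // unknown values fall back to the default
	}
	return config{
		Port:             intOr(getenv, "IMPRINT_PORT", 3111),
		DataDir:          strOr(getenv, "IMPRINT_DATA_DIR", "~/.imprint"),
		PipelineInterval: intOr(getenv, "PIPELINE_INTERVAL_MIN", 5),
		DecayMinStrength: intOr(getenv, "DECAY_MIN_STRENGTH", 3),
		DecayMaxAgeDays:  intOr(getenv, "DECAY_MAX_AGE_DAYS", 30),
		EvalCapture:      getenv("IMPRINT_EVAL_CAPTURE") == "1",
		ExtractionMode:   mode,
	}
}

func main() {
	fmt.Printf("%+v\n", loadConfig(os.Getenv))
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader free of global state and easy to exercise with different environments.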
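~/.imprint/

Imprint stands on the shoulders of two excellent projects:
- agentmemory (Node.js + Docker): the observation pipeline and UI
- MemPalace (Python + ChromaDB): the 4-layer memory stack, query sanitization, and write-ahead log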
MIT License — use freely.