rohitg00/agentmemory is a TypeScript-based persistent memory server that captures coding agent activity across sessions and injects relevant context when the next session starts. It appeared at GitHub Trending rank two for the daily window on 2026-05-13 with over 5,800 total stars and an Apache-2.0 license. Built on the iii engine, it supports sixteen or more agent clients including Claude Code, Cursor, Codex CLI, Hermes, and OpenClaw. This deep dive evaluates the repository's architecture, benchmarks, install path, operational risks, and alternatives based entirely on README, source structure, and GitHub metadata evidence.

Section 01

One-paragraph verdict

agentmemory solves a concrete problem that every developer using AI coding agents encounters: context loss between sessions. Every time you close Claude Code, Cursor, or Codex CLI, the agent forgets your architecture decisions, bug patterns, and preferences. Built-in memory mechanisms like CLAUDE.md and .cursorrules cap at roughly two hundred lines and go stale. Agentmemory silently captures tool use via hooks, compresses it into searchable memory, and injects the right context at session start. The repository is under active development (last push 2026-05-12), carries a permissive Apache-2.0 license, and ships with npm, Docker, and source-build install paths. The main adoption risk is the iii engine dependency — a relatively young runtime that agentmemory pins to version 0.11.2. For developers running multiple coding-agent sessions per day, this is worth evaluating. For teams requiring database-backed persistence with external vector stores, the local-first SQLite model may be a constraint.

Section 02

agentmemory at a glance

Concept diagram summarizing agentmemory as a persistent memory layer between coding agents and their sessions.
Explanatory visual: a compact map of the agentmemory thesis covering hook-based capture, triple-stream search, four-tier consolidation, and session-start injection.

Section 03

What the project does, based on README evidence

  • Automatic capture via hooks — twelve hooks (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure, SubagentStart, SubagentStop, Stop, SessionEnd, and others) silently capture agent activity without manual effort from the developer.
  • Four-tier memory consolidation — working memory (raw observations), episodic memory (compressed session summaries), semantic memory (extracted facts and patterns), and procedural memory (workflows and decision patterns). Memories decay following an Ebbinghaus curve and strengthen with access.
  • Triple-stream hybrid search — BM25 keyword search (always on, no external deps), vector similarity (when an embedding provider is configured), and graph traversal (when entities are detected). Results are fused via Reciprocal Rank Fusion with a diversity cap of three results per session.
  • Broad agent support — the README lists sixteen or more agent clients: Claude Code, Cursor, Codex CLI, Hermes, OpenClaw, Gemini CLI, OpenCode, Cline, Goose, Kilo Code, Aider, Claude Desktop, Windsurf, Roo Code, Claude SDK, and any MCP or REST client.
  • Fifty-one MCP tools — eight core tools visible by default, expandable to fifty-one via the AGENTMEMORY_TOOLS environment variable. Includes memory_smart_search, memory_save, memory_sessions, and memory_governance_delete among others.
  • Privacy-first design — API keys, Bearer tokens, and content inside private tags are stripped before storage. All data stays local (SQLite via iii engine, binds to 127.0.0.1 by default).
  • Real-time viewer — a web interface on port 3113 provides live observation streaming, session explorer, memory browser, and knowledge graph visualization.
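
The decay-and-strengthen behavior described for the memory tiers can be sketched as a simple forgetting curve. This is an illustrative model, assuming exponential decay with a stability term that grows on access; the interface, function names, and constants are hypothetical, not agentmemory's actual fields.

```typescript
// Illustrative sketch of Ebbinghaus-style decay: retention falls off
// exponentially with time since the last access, and each access resets
// the clock and raises the stability term so future decay is slower.
// Names and constants are hypothetical, not taken from agentmemory.
interface MemoryRecord {
  lastAccessMs: number; // epoch ms of the most recent access
  stability: number;    // grows with each access; slows future decay
}

const HOUR_MS = 3_600_000;

// Retention in [0, 1]: exp(-t / S), the classic forgetting-curve shape.
function retention(mem: MemoryRecord, nowMs: number): number {
  const hoursElapsed = (nowMs - mem.lastAccessMs) / HOUR_MS;
  return Math.exp(-hoursElapsed / mem.stability);
}

// Accessing a memory strengthens it: reset the timer, boost stability.
function access(mem: MemoryRecord, nowMs: number): MemoryRecord {
  return { lastAccessMs: nowMs, stability: mem.stability * 1.5 };
}
```

Under this model, frequently retrieved memories survive indefinitely while untouched observations fade toward eviction, which matches the auto-forget and retention-scoring modules listed in the source tree.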
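
The Reciprocal Rank Fusion step with a per-session diversity cap can also be sketched. Only the cap of three results per session comes from the README; the k constant of 60 is the conventional RRF default, and the function shape is an assumption, not agentmemory's API.

```typescript
// Illustrative Reciprocal Rank Fusion over ranked result streams
// (BM25, vector, graph), with a per-session diversity cap.
interface Hit { id: string; sessionId: string; }

const RRF_K = 60;          // conventional RRF smoothing constant (assumed)
const PER_SESSION_CAP = 3; // diversity cap stated in the README

function rrfFuse(streams: Hit[][], limit: number): Hit[] {
  // Accumulate 1 / (k + rank) contributions per result id across streams.
  const scores = new Map<string, { hit: Hit; score: number }>();
  for (const ranked of streams) {
    ranked.forEach((hit, rank) => {
      const entry = scores.get(hit.id) ?? { hit, score: 0 };
      entry.score += 1 / (RRF_K + rank + 1);
      scores.set(hit.id, entry);
    });
  }
  // Emit by fused score, allowing at most PER_SESSION_CAP hits per session.
  const bySession = new Map<string, number>();
  const fused: Hit[] = [];
  const ordered = Array.from(scores.values()).sort((a, b) => b.score - a.score);
  for (const { hit } of ordered) {
    const used = bySession.get(hit.sessionId) ?? 0;
    if (used >= PER_SESSION_CAP) continue; // enforce diversity cap
    bySession.set(hit.sessionId, used + 1);
    fused.push(hit);
    if (fused.length === limit) break;
  }
  return fused;
}
```

The design point worth noting is that RRF needs no score normalization across streams, which is why it suits fusing BM25, cosine similarity, and graph-walk results that live on incompatible scales.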

Section 04

Architecture and workflow interpretation

This section is based on README headings, the AGENTS.md contributor guide, the source directory structure, and the package.json dependency list. No internal runtime behavior was observed directly.

The agentmemory architecture is built on three iii engine primitives: Worker (long-lived process supervision), Function (registered business logic), and Trigger (HTTP endpoints and event handlers). Everything flows through iii engine's SDK — there is no direct Express, Fastify, or database driver dependency. The package.json lists six runtime dependencies, with iii-sdk at version 0.11.2 as the core engine, Anthropic's Claude agent SDK for optional LLM calls, and zod for schema validation.

The memory pipeline operates in four phases. First, capture: the twelve hooks fire on agent lifecycle events and route observations through a SHA-256 deduplication layer with a five-minute window, then a privacy filter that strips secrets and API keys. Second, process: raw observations are stored in iii KV state, then compressed — either by an LLM provider (Anthropic, Gemini, OpenRouter, or agent-sdk) or by synthetic zero-token compression (the default, no API key required). Compressed observations are embedded via one of eight embedding providers (local all-MiniLM-L6-v2 is free and runs without an API key) and indexed in BM25 plus vector plus graph stores.
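
The capture-phase deduplication can be illustrated with a minimal content-hash window. The class and method names are hypothetical; only the SHA-256 hashing and the five-minute window come from the evidence above.

```typescript
// Sketch of content-hash deduplication with a time window: an observation
// is stored only if the same content has not been seen within the window.
// This illustrates the technique, not agentmemory's internal code.
import { createHash } from "node:crypto";

const WINDOW_MS = 5 * 60_000; // five-minute dedup window

class Deduper {
  private seen = new Map<string, number>(); // content hash -> last-seen epoch ms

  // Returns true if the observation is new or its last copy has aged out.
  shouldStore(content: string, nowMs: number): boolean {
    const hash = createHash("sha256").update(content).digest("hex");
    const last = this.seen.get(hash);
    this.seen.set(hash, nowMs);
    return last === undefined || nowMs - last > WINDOW_MS;
  }
}
```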

Third, consolidate: sessions with five or more observations and an importance score of five or above are grouped by shared concepts, then an LLM merges them into structured Memory objects with type classification (pattern, preference, architecture, bug, workflow, fact). Fourth, retrieve: at SessionStart, the system loads the project profile, runs hybrid search across all tiers, enforces a token budget (default two thousand tokens), and injects results into the conversation context.
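
The retrieval-phase token budget can be sketched as a greedy cut over ranked results. The four-characters-per-token estimate is a rough common heuristic, not agentmemory's tokenizer, and the function name is illustrative.

```typescript
// Greedy token-budget cut at injection time: walk results in ranked
// order, skip anything that would overflow the budget, and keep trying
// smaller later items. The default budget of 2,000 tokens matches the
// README; the token estimator is an assumed heuristic.
interface RankedMemory { text: string; score: number; }

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function fitToBudget(ranked: RankedMemory[], budget = 2000): RankedMemory[] {
  const kept: RankedMemory[] = [];
  let used = 0;
  for (const mem of ranked) {
    const cost = estimateTokens(mem.text);
    if (used + cost > budget) continue; // skip; a smaller later item may fit
    kept.push(mem);
    used += cost;
  }
  return kept;
}
```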

The source tree contains over one hundred and twenty functions organized into sixty modules under src/functions/, covering access tracking, auto-forget, branch awareness, checkpoints, Claude bridge, compression, consolidation, context generation, deduplication, diagnostics, eviction, export-import, governance, graph retrieval, image references, leases, mesh coordination, pattern mining, privacy filtering, query expansion, reflection, retention scoring, routine scheduling, semantic search, session replay, skill extraction, snapshot management, summarization, team memory, temporal graphs, timeline tracking, and vision search among others.

Section 05

Repository review map

Architecture review map for evaluating the agentmemory repository surface, memory pipeline, integrations, and risks.
Explanatory visual: a reader-facing checklist for evaluating agentmemory, covering project surface, memory pipeline tiers, integration points, and operational risks.

Section 06

Benchmarks from README evidence

The README includes a comparison table citing retrieval benchmarks against LongMemEval-S (ICLR 2025, five hundred questions). Agentmemory reports 95.2 percent recall at five (R@5), 98.6 percent recall at ten (R@10), and 88.2 percent mean reciprocal rank (MRR). The BM25-only fallback achieves 86.2 percent R@5. The same table compares agentmemory against mem0 (68.5 percent R@5) and Letta/MemGPT (83.2 percent R@5).
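
For readers unfamiliar with the metrics, recall-at-k and mean reciprocal rank reduce to short definitions. The sketch below assumes one gold answer per query, which is a simplification of the actual LongMemEval protocol.

```typescript
// Standard retrieval-metric definitions, shown for clarity only.
// R@k: fraction of queries whose gold id appears in the top k results.
// MRR: average of 1/rank of the first correct hit (0 if absent).
interface QueryResult { gold: string; ranked: string[]; }

function recallAtK(results: QueryResult[], k: number): number {
  const hits = results.filter(r => r.ranked.slice(0, k).includes(r.gold));
  return hits.length / results.length;
}

function meanReciprocalRank(results: QueryResult[]): number {
  const total = results.reduce((sum, r) => {
    const rank = r.ranked.indexOf(r.gold); // 0-based; -1 if missing
    return sum + (rank === -1 ? 0 : 1 / (rank + 1));
  }, 0);
  return total / results.length;
}
```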

Token savings are reported as approximately 170,000 tokens per year (roughly ten dollars per year at current API pricing) versus over 19.5 million tokens for pasting full context every session, representing a claimed 92 percent reduction compared to built-in approaches.

These figures come from the repository's own README and benchmark directory. They were not independently verified by SignalForges. LongMemEval-S is a published academic benchmark, which lends some credibility, but the test conditions, model versions, and hardware configurations are not fully documented in the available excerpt. Treat these as directional indicators rather than performance guarantees.

Section 07

Workflow evidence map

Workflow evidence map showing README evidence, install path, activity and license review, risks, and best-practice verdict.
Evidence visual: a source-grounded workflow for turning README, install, activity, license, and risk evidence into a practical adoption verdict.

Section 08

Install and first-test path

The README documents several install paths. The quickest start for evaluation:

One-command server start: npx @agentmemory/agentmemory — this starts the memory server on localhost:3111 with a local iii engine. The first run may pull a Docker image for the iii engine if Docker is available, or attempt a local cargo install of the iii engine binary.

Verify the server is running: curl http://localhost:3111/agentmemory/health — should return a health status response.

Seed sample data and test recall: npx @agentmemory/agentmemory demo — seeds sample observations and demonstrates the search pipeline.

Claude Code integration: run /plugin marketplace add rohitg00/agentmemory and /plugin install agentmemory in Claude Code. The plugin registers twelve hooks, four skills, and auto-wires the MCP stdio server via its .mcp.json, providing fifty-one MCP tools without extra configuration.

Docker alternative: the bundled docker-compose.yml pulls iiidev/iii:0.11.2 for containerized setups.

No reproducible_tests block was provided in the research brief, so this article does not make first-person claims about having executed these steps. The install commands are quoted directly from the README.

Section 09

Best practices and operational guardrails

For teams evaluating agentmemory as a persistent memory layer, several practical considerations emerge from the README and source evidence:

Start with the no-op LLM default. Agentmemory ships with synthetic (zero-token) compression as the default. This means you can evaluate the capture and search pipeline without providing an API key. Enable LLM-based compression only after verifying that the capture hooks work correctly in your environment.

Evaluate the iii engine dependency early. The iii engine is a relatively young runtime pinned at version 0.11.2. The README notes that versions 0.11.6 and above introduce a sandbox model not yet supported by agentmemory. This means you are locked to a specific engine version until agentmemory updates. Assess whether this pin creates operational risk for your deployment.
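
One way to surface the version-lock risk early is a preflight check against the pin and the sandbox cutoff. This is a hypothetical guard, not something agentmemory ships; only the 0.11.2 pin and the 0.11.6 cutoff come from the README.

```typescript
// Hypothetical CI preflight: classify a resolved iii engine version
// against the supported exact pin and the known-incompatible cutoff.
const SUPPORTED_PIN = "0.11.2";
const SANDBOX_CUTOFF = "0.11.6"; // sandbox model lands here per the README

const parse = (v: string): number[] => v.split(".").map(Number);

// Returns negative, zero, or positive like a comparator.
function compareVersions(a: string, b: string): number {
  const [pa, pb] = [parse(a), parse(b)];
  for (let i = 0; i < 3; i++) {
    if ((pa[i] ?? 0) !== (pb[i] ?? 0)) return (pa[i] ?? 0) - (pb[i] ?? 0);
  }
  return 0;
}

function engineStatus(installed: string): "supported" | "untested" | "incompatible" {
  if (installed === SUPPORTED_PIN) return "supported";
  if (compareVersions(installed, SANDBOX_CUTOFF) >= 0) return "incompatible";
  return "untested";
}
```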

Check the local storage model. All data is stored in a local SQLite file via iii engine's StateModule. There is no built-in replication, backup, or multi-machine sync. If you run coding agents across multiple machines, each will have its own memory store unless you configure external synchronization. The REST API binds to 127.0.0.1 by default, which means it is not exposed to the network without explicit configuration changes.

Monitor token injection budgets. The default context injection budget is two thousand tokens. If your agent sessions involve large codebases or complex multi-file refactoring, this budget may be too small to capture all relevant context. The token budget is configurable, but increasing it means more tokens consumed per session.

Review the privacy filter scope. The README states that API keys, Bearer tokens, and content inside private tags are stripped. However, the filter's scope and edge cases (for example, custom authentication headers, environment variable values in shell output) are not fully documented in the README. Test with your actual agent output before relying on the filter in sensitive environments.
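
A redaction pass of the kind the README describes can be sketched with a few conservative patterns. These regexes are illustrative assumptions, not agentmemory's actual filter, which is exactly why testing against real agent output matters.

```typescript
// Illustrative redaction pass: strip <private> blocks, Bearer tokens,
// and one obvious API-key shape before storage. The patterns are
// hypothetical and deliberately narrow; real filters need more cases
// (custom auth headers, env values in shell output, etc.).
const REDACTIONS: Array<[RegExp, string]> = [
  [/<private>[\s\S]*?<\/private>/g, "[REDACTED:private]"],
  [/Bearer\s+[A-Za-z0-9._~+/-]+=*/g, "Bearer [REDACTED]"],
  [/sk-[A-Za-z0-9]{16,}/g, "[REDACTED:api-key]"], // OpenAI-style key shape
];

function redact(text: string): string {
  return REDACTIONS.reduce((acc, [pattern, repl]) => acc.replace(pattern, repl), text);
}
```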

Use the real-time viewer for debugging. The web viewer on port 3113 provides a live stream of captured observations and a memory browser. This is the fastest way to verify that hooks are firing correctly and that search results match expectations.


Section 11

Alternatives and when to avoid it

  • mem0 (53K+ stars) — a memory layer with vector and graph search. More mature and widely adopted, but requires external vector databases (Qdrant or pgvector) and manual add() calls rather than hook-based auto-capture. Better for teams that want database-backed persistence with managed infrastructure.
  • Letta/MemGPT (22K+ stars) — an agent framework with built-in memory management. More opinionated than agentmemory; it owns the agent runtime rather than layering on top of existing agents. Choose this if you want a full agent framework, not just a memory plugin.
  • Built-in agent memory (CLAUDE.md, .cursorrules, .windsurfrules) — zero-dependency, always available, but capped at roughly two hundred lines and does not support semantic search or automatic capture. Adequate for small projects with simple context needs.
  • Karpathy's LLM Wiki pattern — a manual note-taking approach where the developer maintains a structured markdown file with project knowledge. No tooling required, but requires discipline and does not scale across sessions automatically.

Section 12

Risks and known issues

  • iii engine version lock — pinned to v0.11.2; newer engine versions introduce a sandbox model that agentmemory does not yet support. Until the pin is updated, you cannot upgrade the engine independently.
  • Viewer behind reverse proxy — open issue #299 reports that the viewer UI hardcodes port 3113 in its JavaScript, breaking when served behind a reverse proxy on standard ports 80 or 443.
  • MCP tool count mismatch — open issue #234 reports that the standalone MCP shim with AGENTMEMORY_TOOLS=all returns only seven tools instead of the advertised fifty-one.
  • OpenClaw compatibility — open issue #262 notes that plugin hook signatures do not match OpenClaw version 2026.5.7.
  • Historical token burn issues — several closed issues (#138, #143, #149, #181) document past incidents where auto-compress or agent-sdk providers caused unexpected token consumption or infinite loops. These have been fixed, but the pattern suggests that new provider integrations may introduce similar regressions.
  • Local-first storage — no built-in replication or multi-machine sync. Each machine maintains its own memory store.

Section 13

Source ledger and methodology

Sources cited in this article:

1. rohitg00/agentmemory GitHub repository (primary evidence for purpose, architecture, install, and benchmarks) — https://github.com/rohitg00/agentmemory

2. README.md (full content, 51K characters) — https://github.com/rohitg00/agentmemory/blob/main/README.md

3. package.json (dependency and version evidence) — https://github.com/rohitg00/agentmemory/blob/main/package.json

4. AGENTS.md (contributor architecture guide) — https://github.com/rohitg00/agentmemory/blob/main/AGENTS.md

5. LICENSE (Apache-2.0 full text) — https://github.com/rohitg00/agentmemory/blob/main/LICENSE

Methodology: This article was drafted using repository metadata, README content, source directory structure, package.json dependencies, and GitHub issue data retrieved via the GitHub API and the zai-zread-repo MCP tool. No first-person testing was performed. All claims are grounded in the cited primary sources. AI assistance was used for drafting; the final text was reviewed against the SignalForges editorial policy.

Refresh-sensitive notes: Star counts, issue counts, and version numbers reflect repository state as of the 2026-05-13 enrichment window. The benchmarks reference LongMemEval-S (ICLR 2025); test conditions should be verified against the benchmark directory in the repository before citing specific percentages.

Section 14

Editorial conclusion

agentmemory addresses a real and growing pain point for developers who rely on AI coding agents across multiple sessions and repositories. The hook-based auto-capture, four-tier memory consolidation, and triple-stream hybrid search represent a thoughtful approach to the context-persistence problem. The Apache-2.0 license and npm install path lower the barrier to evaluation. However, the iii engine dependency lock, the local-first storage model, and the history of token-burn regressions in closed issues mean that this is a project to evaluate carefully before committing to production use. Start with the no-op LLM default, test the capture hooks in your environment, and verify that the memory retrieval quality meets your needs before enabling LLM-based compression. For teams with existing vector database infrastructure, mem0 may be a more natural fit. For individual developers seeking a zero-config memory layer for Claude Code or Cursor, agentmemory is worth the inspection time.

Editorial Conclusion

agentmemory is worth evaluating for individual developers using Claude Code or Cursor who need zero-config persistent memory across sessions. Teams with existing vector database infrastructure may prefer mem0.

Best for

Developers running multiple coding-agent sessions per day who want automatic context capture and retrieval without manual configuration.

Avoid when

You need multi-machine sync, database-backed persistence, or cannot accept a pinned iii engine dependency.

Refresh-sensitive details

  • Star counts, issue counts, and version numbers reflect repository state as of the 2026-05-13 enrichment window.
  • Benchmark figures reference LongMemEval-S (ICLR 2025); test conditions should be verified independently.
  • The iii engine dependency is pinned to a specific version; evaluate version-lock risk before production adoption.
  • Some claims are refresh-sensitive; verify the primary source before citing specific numbers.
  • Automation-assisted publication; SignalForges editors review audit reports after publication.

Evidence

Source Ledger

These are the primary references used to keep the article grounded. Pricing, limits, benchmark results, and model names are rechecked against the source type shown below.

Source, type, and how it is used:

  • rohitg00/agentmemory GitHub repository (ecosystem reference): primary evidence for purpose, architecture, install, benchmarks, license, and activity.
  • agentmemory README.md (ecosystem reference): full README content (51K characters) covering hooks, memory pipeline, benchmarks, competitor comparison, and install instructions.
  • agentmemory package.json (ecosystem reference): dependency evidence, including six runtime deps such as iii-sdk v0.11.2, the Anthropic Claude agent SDK, and zod.
  • agentmemory AGENTS.md (ecosystem reference): contributor architecture guide documenting iii engine primitives, function registration, and state management.
  • agentmemory LICENSE (ecosystem reference): Apache License 2.0 verification.
Fact Pack

What This Article Actually Claims

  • [High confidence] rohitg00/agentmemory appeared at GitHub Trending rank two for the daily period on 2026-05-13. Source: https://github.com/trending
  • [High confidence] The repository license is Apache-2.0. Source: GitHub API repository metadata and LICENSE file.
  • [High confidence] The latest detected push time is 2026-05-12T23:22:32Z. Source: GitHub API repository metadata.
  • [High confidence] Total stars: 5,822. Forks: 545. Open issues: 53. Source: GitHub API repository metadata at enrichment time.
  • [High confidence] The repository is TypeScript with six runtime dependencies: iii-sdk, @anthropic-ai/claude-agent-sdk, @anthropic-ai/sdk, @clack/prompts, dotenv, zod. Source: package.json via GitHub API.
  • [High confidence] agentmemory supports sixteen or more agent clients including Claude Code, Cursor, Codex CLI, Hermes, OpenClaw, Gemini CLI, OpenCode, Cline, Goose, Kilo Code, Aider, Claude Desktop, Windsurf, Roo Code, Claude SDK, and any MCP or REST client. Source: README.md headings and usage sections.
  • [High confidence] The memory pipeline uses four tiers: working, episodic, semantic, procedural. Source: README.md four-tier memory consolidation section.
  • [Medium confidence] Retrieval benchmark: 95.2 percent R@5 on LongMemEval-S (ICLR 2025, five hundred questions). Source: README.md comparison table and benchmark directory.
  • [Medium confidence] Token savings reported as approximately 170,000 tokens per year versus over 19.5 million tokens for pasting full context. Source: README.md token savings section.
  • [High confidence] Twelve hooks, four skills, fifty-one MCP tools, and one hundred and seven REST endpoints. Source: README.md feature sections and AGENTS.md.
  • [High confidence] iii engine pinned to version 0.11.2; newer versions not supported. Source: README.md and package.json iii-sdk version.
  • [High confidence] Open issues include viewer reverse proxy breakage (#299), MCP tool count mismatch (#234), and OpenClaw compatibility (#262). Source: GitHub Issues via API and web search.
  • [High confidence] Closed issues document historical token burn incidents (#138, #143, #149, #181) that were fixed in subsequent releases. Source: GitHub Issues via API and CHANGELOG.md.
  • [Medium confidence] mem0 has 53K+ stars. Letta/MemGPT has 22K+ stars. Source: README.md comparison table citing competitor star counts.

Methodology

  1. Draft composed by the Hermes Writer agent using repository metadata, README content, source structure, package.json, and GitHub issue data.
  2. Evidence gathered via zai-zread-repo MCP tool and GitHub API.
  3. No first-person testing was performed. All claims are grounded in cited primary sources.
  4. AI assistance was used; no private data or unreleased sources were referenced.

Frequently asked

Questions readers ask

What does this briefing recommend developers do first?

Evaluate agentmemory by running npx @agentmemory/agentmemory in a separate terminal, then verify with curl http://localhost:3111/agentmemory/health. Test the capture and search pipeline with the no-op LLM default before enabling any API-key-based compression.

Where can readers verify the figures cited in this article?

Every precise figure must be verified against the primary URL. The first listed source is https://github.com/rohitg00/agentmemory. Benchmark figures reference the benchmark/ directory in the repository.

Is this article human-authored or AI-assisted?

The draft was composed with AI assistance by the Hermes Writer agent, then reviewed against the SignalForges editorial policy and the Autonomous Publishing Safety Contract before publication.