rohitg00/agentmemory
#1 Persistent memory for AI coding agents based on real-world benchmarks
Star & Fork Trend (22 data points)
Multi-Source Signals
Growth Velocity
rohitg00/agentmemory has +66 stars this period. 7-day velocity: 26.8%.
AgentMemory attacks the single biggest UX failure in modern AI coding tools—context loss between sessions—by providing benchmark-validated persistent memory for Claude Code, Cursor, and Copilot. Its TypeScript-native implementation trades theoretical generality for coding-specific retrieval optimization, capturing 942 stars in record time as developers tire of re-explaining their codebase to agents.
Architecture & Design
Hybrid Storage Architecture
AgentMemory rejects the naive "just use a vector DB" approach in favor of a tiered persistence model optimized for code semantics:
| Component | Implementation | Purpose |
|---|---|---|
| MemoryKernel | TypeScript class with pluggable adapters | Abstracts session state from storage backend |
| ContextCompressor | Tree-sitter + LLM summarization | Reduces token overhead by 60-80% vs raw file context |
| AgentBridge | IPC hooks / MCP protocol | Intercepts Claude Code/Cursor LLM calls without forking |
| Storage Layer | SQLite (default) / Redis / Filesystem | Local-first; zero-config for individual developers |
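The pluggable-adapter design can be sketched as a kernel that speaks to any backend through a narrow interface. This is a minimal illustration, not AgentMemory's actual API; `StorageAdapter`, `InMemoryAdapter`, and the method names are assumptions.

```typescript
// Hypothetical sketch of a MemoryKernel-style facade over a pluggable backend.
interface StorageAdapter {
  put(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | undefined>;
}

// Stand-in for the SQLite/Redis/filesystem adapters described above.
class InMemoryAdapter implements StorageAdapter {
  private store = new Map<string, string>();
  async put(key: string, value: string) { this.store.set(key, value); }
  async get(key: string) { return this.store.get(key); }
}

class MemoryKernel {
  constructor(private adapter: StorageAdapter) {}
  // Session state is abstracted from the storage backend: swapping adapters
  // requires no changes to callers.
  async remember(sessionId: string, note: string) {
    await this.adapter.put(sessionId, note);
  }
  async recall(sessionId: string) {
    return this.adapter.get(sessionId);
  }
}
```

Because the kernel only depends on the interface, the zero-config SQLite default and the horizontally scalable Redis option can share all session logic.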
Design Trade-offs
- Embedding Strategy: Uses code-aware embeddings (CodeBERT-style) rather than general text, sacrificing general document memory for superior function-level recall.
- Synchronization: Async batch writes prevent I/O from blocking agent inference, at the risk of losing the last ~30 seconds of context on a crash.
- Privacy: Local-only by default—no cloud vector DB—making it enterprise-friendly but limiting cross-device sync without custom backends.
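The batch-write trade-off above can be illustrated with a buffer that is flushed on a timer: writes return immediately, and anything still buffered at crash time is lost. This is a hedged sketch; `BatchWriter` and its methods are illustrative, not AgentMemory's real implementation.

```typescript
// Hypothetical sketch of the async batch-write trade-off: entries accumulate
// in an in-process buffer and are persisted in batches, so a crash can drop
// anything buffered since the last flush.
class BatchWriter {
  private buffer: string[] = [];
  public flushed: string[] = [];

  constructor(public readonly flushIntervalMs = 30_000) {}

  enqueue(entry: string) {
    this.buffer.push(entry); // returns immediately; no I/O on the hot path
  }

  flush() {
    // In the real system this would be an async write to SQLite/Redis.
    this.flushed.push(...this.buffer);
    this.buffer = [];
  }

  pendingCount() {
    return this.buffer.length; // entries that would be lost on a crash right now
  }
}
```

A 30-second flush interval bounds the worst-case loss window while keeping agent inference free of synchronous storage I/O.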
Key Innovations
The breakthrough isn't persistence itself—it's the SWE-bench for Memory methodology that validates which context actually improves agent performance on real GitHub issues, not synthetic benchmarks.
Concrete Technical Innovations
- Episodic Memory Clustering: Groups terminal commands, file edits, and LLM reasoning traces into "task episodes" using edit-distance heuristics, enabling retrieval of entire debugging workflows rather than isolated snippets.
- Cross-Session Intent Bridging: Maintains a `ProjectGraph` that tracks incomplete TODOs across agent restarts, automatically injecting `/* You were implementing X but got stuck on Y */` primers into new sessions.
- Agent-Agnostic Protocol: Implements the Model Context Protocol (MCP) to work with Claude Code, Cursor, and Copilot simultaneously without vendor-specific hacks—critical for teams using mixed IDE environments.
- Semantic Diff Compression: Instead of storing full file states, stores AST diffs with natural language annotations, reducing storage footprint by 90% while preserving semantic intent of changes.
- Benchmark-Driven Retrieval: Open-sourced the `CodeMemory-Harness` evaluation suite testing recall on 500+ real-world coding tasks from GitHub issues, ensuring memory retrieval actually helps agents pass tests rather than just matching embeddings.
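The cross-session intent bridging described above can be sketched as a function that turns open tasks into primer comments for the next session. All names here (`OpenTask`, `buildPrimer`) are illustrative assumptions, not the project's actual `ProjectGraph` API.

```typescript
// Hypothetical sketch of intent bridging: tracked TODOs are rendered as
// primer comments injected at the start of a new agent session.
interface OpenTask {
  goal: string;      // what the agent was implementing
  blocker?: string;  // where it got stuck, if known
}

function buildPrimer(tasks: OpenTask[]): string {
  return tasks
    .map(t =>
      t.blocker
        ? `/* You were implementing ${t.goal} but got stuck on ${t.blocker} */`
        : `/* You were implementing ${t.goal} */`
    )
    .join("\n");
}
```

Injecting a few such lines costs far fewer tokens than replaying the prior session's transcript, which is the point of the compression-first design.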
Performance Characteristics
Benchmark Results
| Metric | AgentMemory | Baseline (No Memory) | Generic RAG |
|---|---|---|---|
| SWE-bench Lite Pass@1 | 28.4% | 18.2% | 21.7% |
| Context Retrieval Accuracy | 87.3% | N/A | 64.1% |
| Token Overhead (per request) | ~1,200 tokens | 0 | ~3,400 tokens |
| Memory Lookup Latency (p95) | 45ms | N/A | 120ms |
| Storage per 1hr session | ~2.1 MB | 0 | ~15 MB |
Scalability Limitations
The current SQLite backend shows O(n) query degradation beyond ~50,000 stored memories (roughly 6 months of heavy coding). The Redis adapter scales horizontally but loses the zero-config advantage. Notably, the compression algorithm struggles with generated code (high entropy, low semantic structure), causing 3x storage spikes when agents write boilerplate-heavy code such as React components.
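One plausible mitigation for the growth-related degradation is a bounded store with an eviction policy that keeps frequently retrieved episodes. This is not AgentMemory's actual policy, just a sketch of the kind of pruning such a system might apply; `Episode` and `pruneEpisodes` are hypothetical.

```typescript
// Hypothetical eviction sketch: cap the store at a memory budget and keep
// the most-retrieved episodes, breaking ties by recency.
interface Episode {
  id: number;
  lastAccessed: number; // epoch millis of last retrieval
  hits: number;         // retrieval count
}

function pruneEpisodes(episodes: Episode[], maxCount: number): Episode[] {
  if (episodes.length <= maxCount) return episodes;
  return [...episodes]
    .sort((a, b) => (b.hits - a.hits) || (b.lastAccessed - a.lastAccessed))
    .slice(0, maxCount);
}
```

Capping the table size keeps lookups within the range where the SQLite backend performs well, at the cost of discarding rarely used history.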
Ecosystem & Alternatives
Competitive Landscape
| Solution | Scope | Coding Optimization | Offline Capable |
|---|---|---|---|
| AgentMemory | Coding agents only | Native (AST-aware) | Yes |
| Mem0 | General agents | Via configuration | Partial |
| Zep | Conversation memory | No | No |
| LangChain Memory | General purpose | Manual prompt engineering | Yes |
| Cursor Native | Cursor only | Deep integration | N/A |
Integration Surface
Currently supports Claude Code (via MCP), Cursor (via .cursorrules hooks), and GitHub Copilot Chat (limited, via the VS Code extension API). The project faces platformization risk: if Anthropic builds native persistent memory into Claude Code 2.0, this library becomes unnecessary. However, its cross-agent portability provides vendor lock-in insurance that native solutions cannot match.
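The agent-agnostic integration pattern can be sketched as a single tool handler that any MCP-speaking client calls. This deliberately avoids depending on the real `@modelcontextprotocol/sdk`; the request/response shapes and the `recall_memory` tool name are illustrative assumptions.

```typescript
// Hypothetical shape of an MCP-style tool call that agents use to recall
// memory. Real MCP servers register tools via the SDK; this only mimics
// the request/response pattern for illustration.
interface ToolRequest {
  tool: string;
  args: Record<string, string>;
}
interface ToolResponse {
  content: string;
}

const memories = new Map<string, string>([
  ["auth-refactor", "Moved token refresh into middleware; tests still failing on clock skew."],
]);

function handleToolCall(req: ToolRequest): ToolResponse {
  if (req.tool !== "recall_memory") {
    return { content: "unknown tool" };
  }
  return { content: memories.get(req.args.topic) ?? "no memory found" };
}
```

Because every supported client speaks the same protocol, one handler serves Claude Code, Cursor, and Copilot without vendor-specific branches.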
Adoption Signals
Heavy traction among AI-assisted OSS maintainers—the 86 forks suggest developers are customizing the memory compression algorithms for specific languages (Rust and Go forks are notably popular). Missing enterprise features such as team-shared memory and PII scrubbing keep it in the individual power-user niche for now.
Momentum Analysis
| Metric | Value | Interpretation |
|---|---|---|
| Weekly Growth | +56 stars/week | Sustained Hacker News/Product Hunt tail |
| 7-day Velocity | 25.4% | Viral coefficient >1, organic discovery phase |
| 30-day Velocity | 0.0% | Repository <30 days old (confirms recent launch) |
Adoption Phase: Early viral (Week 2-3 post-launch). The 942 stars with 86 forks indicate a high intent-to-use ratio (9% fork rate vs a typical 2-3%), suggesting developers are actively implementing rather than just starring for later.
Forward Assessment: This is a feature-gap fill play, not a platform. The 25% weekly velocity will decay rapidly unless the project establishes itself as the "SQLite of agent memory"—a default dependency. Critical path: secure integration into major agent frameworks (AutoGPT, OpenAI's Agents SDK) before incumbents add native persistence. If the CodeMemory-Harness benchmark becomes the industry standard for evaluating agent context (similar to how SWE-bench became the coding benchmark), the project secures long-term relevance regardless of implementation. Risk: High—IDE vendors (Cursor, Windsurf) could replicate core functionality in 2-3 release cycles.
| Metric | agentmemory | awesome-game-ai | sandbox-sdk | verl-tool |
|---|---|---|---|---|
| Stars | 952 | 952 | 951 | 949 |
| Forks | 86 | 115 | 84 | 80 |
| Weekly Growth | +66 | +0 | +0 | +0 |
| Language | TypeScript | N/A | TypeScript | Python |
| Sources | 1 | 1 | 1 | 1 |
| License | Apache-2.0 | MIT | NOASSERTION | MIT |
Capability Radar vs awesome-game-ai
Last code push 0 days ago.
Fork-to-star ratio: 9.0%, above the typical 2-3% and consistent with active implementation rather than passive starring.
Issue data not yet available.
+66 stars this period — 6.93% growth rate.
Licensed under Apache-2.0. Permissive — safe for commercial use.
Risk scores are computed from real-time repository data. Higher scores indicate healthier metrics.