engram
Persistent memory for AI agents — organized by time and space. Important memories get promoted, noise decays naturally, and related knowledge clusters into a browsable topic tree. Fully automatic.
Quick Start
# Install and start
curl -fsSL https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.sh | bash
# Store a memory
curl -X POST http://localhost:3917/memories \
-d '{"content": "Always run tests before deploying", "tags": ["deploy"]}'
# Recall by meaning
curl -X POST http://localhost:3917/recall \
-d '{"query": "deployment checklist"}'
# Restore full context (session start)
curl http://localhost:3917/resume
Core Features
Most agent memory tools provide a vector store with search. engram adds a lifecycle — memories are not just stored, they are continuously managed.
LLM Quality Gate
New memories enter the Buffer layer. Promotion to Working or Core requires passing an LLM quality gate — the LLM evaluates each memory in context and determines whether it warrants long-term retention.
Buffer → [LLM gate: "Is this a decision, lesson, or preference?"] → Working
Working → [sustained access + LLM gate] → Core
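The gate criteria favor durable content. For illustration, the first store below is a strong promotion candidate while the second is transient noise that should stay in Buffer and decay (the payloads are invented for this example):
# Explicit decision: likely to pass the gate
curl -X POST http://localhost:3917/memories \
  -d '{"content": "DECISION: pin the embedding model version in CI", "tags": ["ci"]}'
# Transient chatter: likely to be evicted from Buffer
curl -X POST http://localhost:3917/memories \
  -d '{"content": "ran the test suite at 14:32"}'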
Semantic Dedup & Merge
When two memories express the same concept in different words, engram detects and merges them:
Memory A: "use PostgreSQL for auth"
Memory B: "auth service runs on Postgres"
→ After consolidation: single merged memory preserving both contexts
Merging is LLM-powered — based on semantic understanding, not string similarity.
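You can observe a merge end to end using only the documented endpoints; the merge itself happens in the background (see Background Maintenance below):
# Store two paraphrases of the same fact
curl -X POST http://localhost:3917/memories \
  -d '{"content": "use PostgreSQL for auth"}'
curl -X POST http://localhost:3917/memories \
  -d '{"content": "auth service runs on Postgres"}'
# After the next consolidation cycle, recall returns a single merged memory
curl -X POST http://localhost:3917/recall \
  -d '{"query": "auth database"}'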
Automatic Decay
Decay is epoch-based — it only occurs during active consolidation cycles, not by wall-clock time. If the agent is idle for a week, memories remain intact.
| Kind | Decay rate | Use case |
|---|---|---|
| episodic | Fast | Events, experiences, time-bound context |
| semantic | Slow | Knowledge, preferences, lessons (default) |
| procedural | Slowest | Workflows, instructions, how-to |
Working and Core memories are never deleted. In the Working layer, importance decreases gradually but memories remain searchable. Buffer serves as a temporary staging area where all kinds may be evicted.
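The kind is chosen at write time. A minimal sketch, assuming the store endpoint accepts a kind field alongside content and tags (check the setup guide for the exact schema):
# "kind" is assumed here: one of episodic / semantic / procedural
curl -X POST http://localhost:3917/memories \
  -d '{"content": "deploy: test → build → stop → start", "kind": "procedural"}'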
Self-Organizing Topic Tree
Vector clustering automatically groups related memories. The tree is hierarchical, with LLM-generated names:
Memory Architecture
├── Three-layer lifecycle [4]
├── Embedding pipeline [3]
└── Consolidation logic [5]
Deploy & Ops
├── CI/CD procedures [3]
└── Production incidents [2]
User Preferences [6]
The tree rebuilds automatically when memories change. At session start, the agent receives a topic index as a table of contents. Use POST /topic {"ids": ["kb3"]} to retrieve all memories within a specific cluster.
Triggers
Tag a memory with trigger:deploy, and it surfaces automatically when the agent queries /triggers/deploy before executing a deployment.
# Store a lesson
curl -X POST http://localhost:3917/memories \
-d '{"content": "LESSON: always backup DB before migration", "tags": ["trigger:deploy", "lesson"]}'
# Pre-deployment check
curl http://localhost:3917/triggers/deploy
# → returns all memories tagged trigger:deploy, ranked by access count
Architecture
Memory is organized along two dimensions — time and space:
Time (lifecycle) Space (topic tree)
┌─────────────────────────────┐ ┌──────────────────────────────┐
│ │ │ Auth Architecture │
│ Buffer → Working → Core │ │ ├── OAuth2 migration [3] │
│ ↓ ↓ ↑ │ │ └── Token handling [2] │
│ evict decay gate │ │ Deploy & Ops │
│ │ │ ├── CI/CD procedures [3] │
│ │ │ └── Rollback lessons [2] │
└─────────────────────────────┘ │ User Preferences [6] │
└──────────────────────────────┘
Time — a three-layer lifecycle inspired by the Atkinson–Shiffrin memory model:
| Layer | Role | Behavior |
|---|---|---|
| Buffer | Short-term staging | All new memories enter here. Evicted when they fall below threshold |
| Working | Active knowledge | Promoted by consolidation. Never deleted — importance decays at different rates by kind |
| Core | Long-term identity | Promoted through LLM quality gate. Never deleted |
Space — a self-organizing topic tree built from embedding vectors. Related memories cluster by semantic similarity, with LLM-generated names for each cluster:
| Mechanism | Description |
|---|---|
| Vector clustering | Groups semantically similar memories into topics via cosine similarity |
| Hierarchy | Related topics nest under shared parent nodes, forming a multi-level tree |
| LLM naming | Generates human-readable names for each cluster automatically |
| Auto-rebuild | Tree updates when memories change — no manual maintenance required |
Topic trees address a fundamental limitation of vector search: it requires the right query to find the right memory. Topic trees allow the agent to browse by subject — scan the directory, then drill into the relevant branch.
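The two modes compose: query when you know the words, browse when you only know the subject.
# Query-based: needs the right phrasing
curl -X POST http://localhost:3917/recall \
  -d '{"query": "what did we learn about rollbacks"}'
# Browse-based: open a whole branch from the topic index
curl -X POST http://localhost:3917/topic \
  -d '{"ids": ["kb1"]}'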
Session Recovery
A single call restores full context, intended for session start or post-compaction recovery:
GET /resume →
=== Core (24) ===
deploy: test → build → stop → start (procedural)
LESSON: never force-push to main
...
=== Recent ===
[02-27 14:15] switched auth to OAuth2
[02-27 11:01] published API docs
=== Topics (Core: 24, Working: 57, Buffer: 7) ===
kb1: "Deploy Procedures" [5]
kb2: "Auth Architecture" [3]
kb3: "Memory Design" [8]
...
=== Triggers ===
deploy, git-push, database-migration
Four sections, each serving a distinct purpose:
| Section | Content | Budget |
|---|---|---|
| Core | Full text of permanent rules and identity — never truncated | ~2k tokens |
| Recent | Memories changed within the last consolidation window, for short-term continuity | ~1k tokens |
| Topics | Named topic index — structured directory of all memories | Leaf list |
| Triggers | Pre-action safety tags for automatic lesson recall | Tag list |
The agent reads the topic index, identifies relevant topics, and drills in via POST /topic on demand. This avoids loading the entire memory store into context.
Search & Retrieval
Hybrid retrieval combining semantic embeddings and BM25 keyword search (with jieba for CJK tokenization). Results are ranked by relevance, memory importance, and recency.
# Semantic search with budget control
curl -X POST http://localhost:3917/recall \
-d '{"query": "how do we handle auth", "budget_tokens": 2000}'
# Pre-action safety check
curl http://localhost:3917/triggers/deploy
# Topic drill-down
curl -X POST http://localhost:3917/topic \
-d '{"ids": ["kb3"]}'
Background Maintenance
Fully autonomous and activity-driven — cycles are skipped when there has been no write activity:
Consolidation (every 30 minutes)
Each cycle executes the following steps in order:
- Decay — reduce importance of unaccessed memories
- Dedup — detect and merge near-identical memories (cosine > 0.78)
- Triage — LLM categorizes new Buffer memories for promotion
- Gate — LLM evaluates promotion candidates (batched, single call)
- Reconcile — LLM resolves ambiguous similar pairs; results are cached to avoid redundant calls
- Topic tree rebuild — re-cluster and name new or changed topics
Topic Distillation
When a topic cluster grows too large (10+ memories), engram condenses overlapping memories into fewer, richer entries — preserving all specific details while reducing redundancy. Up to 2 topics are distilled per consolidation cycle.
Multi-Agent & Namespace Isolation
A single engram instance can serve multiple agents concurrently. SQLite WAL mode, a connection pool, and an RwLock-protected vector index make concurrent reads and writes safe out of the box.
Use the X-Namespace header to give each agent (or project) its own isolated memory space:
# Project-specific memories
curl -X POST http://localhost:3917/memories \
-H "X-Namespace: my-project" \
-d '{"content": "API uses OAuth2 bearer tokens"}'
# Cross-project knowledge in the default namespace
curl -X POST http://localhost:3917/memories \
-d '{"content": "Always use UTC for timestamps"}'
Install
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.sh | bash
# Windows (PowerShell)
irm https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.ps1 | iex
Docker
docker run -d --name engram \
-p 3917:3917 \
-v engram-data:/data \
-e ENGRAM_LLM_URL=https://api.openai.com/v1 \
-e ENGRAM_LLM_KEY=sk-... \
ghcr.io/kael-bit/engram-rs:latest
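Once the container is up, any documented endpoint confirms it is serving, e.g. the session-resume call:
curl http://localhost:3917/resume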
LLM Configuration
engram works without an LLM, falling back to keyword search and a rule-based lifecycle. Adding an LLM enables the intelligent features (triage, gate, merge, topic naming, audit):
export ENGRAM_LLM_URL=https://api.openai.com/v1
export ENGRAM_LLM_KEY=sk-...
# Any OpenAI-compatible API (Ollama, LM Studio, etc.)
export ENGRAM_LLM_URL=http://localhost:11434/v1
# Anthropic native
export ENGRAM_LLM_PROVIDER=anthropic
export ENGRAM_LLM_KEY=sk-ant-...
Two-tier model configuration — strong model for judgment tasks, lightweight model for text processing:
ENGRAM_GATE_MODEL=gpt-4o # quality decisions
ENGRAM_LLM_MODEL=gpt-4o-mini # text processing
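For a fully local setup, both tiers can point at the same Ollama server. The model names below are illustrative, not defaults:
export ENGRAM_LLM_URL=http://localhost:11434/v1
ENGRAM_GATE_MODEL=qwen2.5:14b    # judgment tasks
ENGRAM_LLM_MODEL=qwen2.5:3b      # text processing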
For AI Agents
Add this to your agent's system prompt or session:
Set up engram (persistent memory) by following the guide at:
https://raw.githubusercontent.com/kael-bit/engram-rs/main/docs/SETUP.md
Integration
Compatible with Claude Code, Cursor, Windsurf, OpenClaw, and any MCP-compatible tool.
17 MCP tools — see MCP docs. Full HTTP API — see Setup guide.
# MCP (Claude Code)
npx engram-rs-mcp
# MCP (Cursor / Windsurf / generic)
# Add to your MCP config:
{"mcpServers": {"engram": {"command": "npx", "args": ["-y", "engram-rs-mcp"]}}}
Web Dashboard
Built-in web UI at http://localhost:3917 for browsing memories, viewing the topic tree, monitoring LLM usage, and inspecting consolidation history.
Specs
| Spec | Value |
|---|---|
| Binary size | ~10 MB |
| Memory usage | ~100 MB RSS in production |
| Storage | SQLite, no external database |
| Language | Rust |
| Platforms | Linux, macOS, Windows (x86_64 + aarch64) |
| License | MIT |
License
MIT