MemClaw — persistent memory for AI agent fleets (OSS)

Fleet memory for AI agents — governed, shared, self-improving.

Quick Start · Features · Performance · MCP · API Reference · Plugin Docs · Contributing · Discord

MemClaw is open-source memory for multi-tenant, multi-agent AI fleets. Your agents store what they learn, find what the fleet knows, and get smarter with every interaction — learning from each other instead of repeating mistakes.

Agents write plain text. MemClaw turns it into searchable, governed, self-improving memory.

One loop, three pillars: write, recall, compound — every interaction makes the next one smarter.

Built for fleets, not single agents. Public agent-memory benchmarks (LoCoMo, LongMemEval) measure one agent, one user, one long conversation — the single-chatbot shape. The deployment shape we see in production is the opposite: dozens or thousands of agents working on behalf of a company, sharing what they learn under governance. MemClaw is architected around that shape from day one — scoped memory, cross-agent outcome propagation, fleet-wide trust tiers — and competes on the axes that compound with agent count: latency, token efficiency, and governance. See Performance for the numbers, or read the benchmarks write-up.

Quick Start

Three paths — pick the one that matches your setup:

Path | When | Time to first memory
Managed platform | Quickest. We host the DB + scaling. | ~2 min
Self-hosted (Docker) | Privacy / on-prem / air-gapped. | ~5 min
OpenClaw plugin | You already run an OpenClaw fleet; install MemClaw as a plugin against any of the above. | ~3 min

Managed Platform

Get up and running in minutes — no infrastructure, automatic updates, usage analytics, and enterprise-grade security included.

  1. Sign up free on memclaw.net
  2. Grab your API key from the dashboard
  3. Connect via MCP or REST:
{
  "mcpServers": {
    "memclaw": {
      "url": "https://memclaw.net/mcp",
      "headers": { "X-API-Key": "mc_your_api_key_here" }
    }
  }
}

Production / team use: the quickstart key above is a tenant-scoped credential — fine for personal use, but a fleet of agents should bind each one to its own agent-scoped credential for trust gating, fleet membership, and per-agent keystones. Provision agent-scoped credentials atomically via POST /api/v1/admin/agent-keys/provision, or through the dashboard at /settings/organization/api-credentials. Both kinds use the mc_ prefix on the wire — scope is bound at mint time on the credential itself. The MCP server accepts the credential on either X-API-Key: mc_… or Authorization: Bearer mc_…. (Pre-existing mca_… and mci_… keys continue to authenticate via back-compat.)

Using a tenant-scoped credential? Pass an explicit agent_id on every MCP tool call — the gateway refuses the reserved default (mcp-agent) on the tenant-scoped path.

Self-Hosted (Open Source)

The fastest path is Docker Compose — one command brings up Postgres + pgvector + Redis + the API.

Prefer not to use Docker? Skip to Manual deployment (Python + Postgres) below for the bare-Python path.

No cloud API key, no external calls? v2.0+ supports a self-hosted local embedder (BAAI/bge-m3 via HuggingFace TEI) — see docs/local-embedder.md. The setup below walks through the OpenAI default; the local-embedder doc walks through the alternative.

Prerequisites
  • Docker Engine 24+ (Linux) or Docker Desktop (macOS / Windows). Confirm with docker --version.
  • Docker Compose v2 (built into modern Docker). Confirm with docker compose version.
  • Git for cloning.
  • ~2 GB free disk for images + Postgres data volume.
1. Clone and configure
git clone https://github.com/caura-ai/caura-memclaw.git
cd caura-memclaw
cp .env.example .env

Set your AI provider in .env — minimal setup with OpenAI:

EMBEDDING_PROVIDER=openai
ENTITY_EXTRACTION_PROVIDER=openai
USE_LLM_FOR_MEMORY_CREATION=true
OPENAI_API_KEY=sk-...

Without any AI keys the stack still starts — dummy providers return non-semantic embeddings, useful for testing the API surface.

💡 Want zero cloud API calls? v2.0+ ships a self-hosted embedder profile (BAAI/bge-m3 on a HuggingFace TEI sidecar). Bring up the stack with docker compose --profile embed-local up -d and set the four OPENAI_EMBEDDING_* envs from .env.example; see docs/local-embedder.md for the full setup. Combined with IS_STANDALONE=true (below) this is a fully self-contained deployment with no external API calls.

Other providers (Gemini, Anthropic, OpenRouter, self-hosted)
Provider | .env settings | Required key
OpenAI (default) | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=openai | OPENAI_API_KEY
Google Gemini | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=gemini | GEMINI_API_KEY + OPENAI_API_KEY
Anthropic | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=anthropic | ANTHROPIC_API_KEY + OPENAI_API_KEY
OpenRouter | EMBEDDING_PROVIDER=openai, ENTITY_EXTRACTION_PROVIDER=openrouter | OPENROUTER_API_KEY + OPENAI_API_KEY
Self-hosted (TEI / bge-m3) | --profile embed-local + OPENAI_EMBEDDING_BASE_URL=http://tei:80/v1 + OPENAI_EMBEDDING_MODEL=BAAI/bge-m3 + OPENAI_EMBEDDING_SEND_DIMENSIONS=false | none (runs locally)

Anthropic, Gemini, and OpenRouter don't offer embedding APIs here — pair them with OpenAI (or with TEI) for embeddings. You can mix providers freely. Gemini uses the Google AI Studio key-auth Developer API (no GCP project/ADC required). The self-hosted TEI row keeps EMBEDDING_PROVIDER=openai because TEI speaks the same OpenAI-compatible API; see docs/local-embedder.md for hardware sizing, GPU setup, and model swapping.

2. Start the stack
docker compose up -d

By default this pulls the multi-arch images from ghcr.io (linux/amd64 + linux/arm64) on first run — takes ~30 seconds. Subsequent up commands re-use the cached image (no registry round-trip, works offline). To pin a specific version, set MEMCLAW_VERSION=v1.2.3 in your .env. To build from local source instead (e.g. when iterating on a fork), run docker compose up --build --no-pull.

To upgrade to a newer image at the same tag (e.g. :latest after we cut a new release), run docker compose pull && docker compose up -d. Without an explicit pull, the local cache wins — there's no silent version drift.

Offline / air-gapped operation depends on whether the image is already cached locally:

  • Image cached, no network: docker compose up -d works as-is — pull_policy: missing doesn't try to pull when the image is present. Use docker compose up --no-pull if you want to be explicit.
  • No local image, no network: docker compose up --build --no-pull (build from source, don't try to pull).
  • Strict no-network guarantee (e.g. an air-gapped pipeline that should never reach ghcr.io): drop a docker-compose.override.yml setting pull_policy: never for both services — Compose then fails fast if the image is absent rather than attempting a pull.
Service | URL
Core API (REST + MCP) | http://localhost:8000
Core Storage API | http://localhost:8002
PostgreSQL (pgvector) | localhost:5432
Redis | localhost:6379
3. Verify
curl http://localhost:8000/api/v1/health
# {"status":"ok","database":"connected",...}
4. Write and search
# Write a memory (standalone mode — no API key needed)
curl -X POST http://localhost:8000/api/v1/memories \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "content": "Our auth service uses JWT with 15-minute expiry."}'

# Search for it
curl -X POST http://localhost:8000/api/v1/search \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "query": "authentication token lifetime"}'

The write response includes LLM-inferred type, title, summary, tags, status, and importance_score — all from a single content field.
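
For callers that prefer Python over curl, the same two calls can be prepared with the standard library. This is a sketch under the assumptions of the curl example above (standalone mode, tenant_id "default", API on localhost:8000); `build_request` and `send` are illustrative helper names, not part of MemClaw.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # Core API address from the self-hosted stack


def build_request(path: str, payload: dict, api_key: str = "standalone") -> urllib.request.Request:
    """Prepare a JSON POST without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response (needs a running stack)."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


# Same payloads as the curl commands above.
write_req = build_request(
    "/api/v1/memories",
    {"tenant_id": "default", "content": "Our auth service uses JWT with 15-minute expiry."},
)
search_req = build_request(
    "/api/v1/search",
    {"tenant_id": "default", "query": "authentication token lifetime"},
)
```

Calling `send(write_req)` followed by `send(search_req)` against a running stack mirrors the two curl commands.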

Auth modes

OSS supports three auth paths. Pick one and add it to your .env, then docker compose up -d to restart.

Standalone — single-tenant (tenant_id="default"), simplest for local / self-install:

IS_STANDALONE=true

No API key required for REST. MCP still expects a non-empty X-API-Key header — any value works.

Pair Standalone mode with --profile embed-local (see docs/local-embedder.md) for a fully self-contained deployment: no admin keys, no external API calls, all embeddings computed locally. Useful for offline / air-gapped environments and personal-laptop installs.

Admin key — multi-tenant with full access:

ADMIN_API_KEY=your-long-random-admin-key

Pass X-API-Key: your-long-random-admin-key and include tenant_id in request bodies / query params.

Shared gate — for network-exposed OSS deployments:

MEMCLAW_API_KEY=your-shared-key

Clients send X-API-Key: your-shared-key plus X-Tenant-ID: <tenant>.
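
The three modes differ only in which headers a client sends. A small sketch of that mapping follows; the function name and the mode labels ("standalone", "admin", "shared") are hypothetical, while the header names and the placeholder "standalone" value come from this README.

```python
def oss_auth_headers(mode: str, key: str = "", tenant: str = "") -> dict[str, str]:
    """Return the request headers a client sends under each OSS auth mode."""
    if mode == "standalone":
        # REST needs no key; MCP only requires a non-empty X-API-Key value.
        return {"X-API-Key": key or "standalone"}
    if mode == "admin":
        if not key:
            raise ValueError("admin mode needs ADMIN_API_KEY")
        # tenant_id travels in the request body or query params, not in a header.
        return {"X-API-Key": key}
    if mode == "shared":
        if not key or not tenant:
            raise ValueError("shared gate needs MEMCLAW_API_KEY and a tenant")
        return {"X-API-Key": key, "X-Tenant-ID": tenant}
    raise ValueError(f"unknown auth mode: {mode}")
```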

See AGENT-INSTALL.md for the full agent self-install walkthrough.

Running tests
# Unit tests (no DB needed)
pytest tests/ -m "unit"

# All tests (requires PostgreSQL)
docker compose up -d db
pytest tests/ -m "not benchmark"

# Smoke test against live API (~30s, auto-cleanup)
python scripts/smoke_test.py --url http://localhost:8000 --api-key <admin-key>

OpenClaw Plugin

Already running an OpenClaw fleet? Install MemClaw as a plugin against either the managed platform or your self-hosted stack:

# Point at whichever URL hosts your MemClaw API
export MEMCLAW_URL=https://memclaw.net          # managed
# or:  export MEMCLAW_URL=http://localhost:8000  # self-hosted
export MEMCLAW_KEY=your-key                      # `standalone` works in self-hosted standalone mode
export MEMCLAW_FLEET=my-fleet

curl -sf -H "X-API-Key: $MEMCLAW_KEY" \
  "$MEMCLAW_URL/api/v1/install-plugin?fleet_id=$MEMCLAW_FLEET&api_url=$MEMCLAW_URL" | bash

# Restart the gateway to load the plugin
openclaw gateway restart

The plugin claims the OpenClaw memory slot (replacing memory-core) and exposes the same 12 MCP tools. Full setup, agent prompts, and trust levels: static/docs/integration-guide.md.

Features

Governance

  • Tenant isolation — row-level database separation per tenant; PII auto-detected and quarantined before it can cross fleet boundaries
  • Visibility scopes — every memory is stamped at write time: scope_agent (private), scope_team (fleet-wide, default), or scope_org (cross-fleet). Cross-fleet recall is permissioned, not open
  • Agent trust tiers — four levels control cross-fleet reads, writes, and deletes. Agents are either provisioned atomically via POST /admin/agent-keys/provision (recommended — mints key + row + trust + fleet in one call) or auto-registered on first write (legacy fallback)
  • Full audit log — every write, delete, and transition logged with tenant and scope context

Memory Pipeline

  • Single-pass LLM enrichment — every write auto-classifies into one of 14 memory types, generates title/summary/tags, scores importance, flags PII, and extracts entities — from a single content field
  • Hybrid search — pgvector semantic similarity + full-text keyword matching + knowledge graph expansion (up to 2 hops), ranked by composite score of similarity, importance, freshness, and graph boost
  • Live knowledge graph — people, orgs, locations, and concepts extracted into entities and relations on every write. Semantic entity resolution (>0.85 cosine) auto-merges duplicates
  • Contradiction detection — RDF triple comparison + LLM semantic analysis detects conflicting memories and automatically supersedes them, with full contradiction chain tracking
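
To make the entity-resolution rule concrete, the sketch below applies the >0.85 cosine threshold to two embedding vectors. It illustrates the comparison only; how MemClaw actually stores, indexes, and merges entity embeddings is not shown here, and the function names are hypothetical.

```python
import math

MERGE_THRESHOLD = 0.85  # cosine cut-off quoted in the feature list


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_merge(a: list[float], b: list[float]) -> bool:
    """True when two entity-name embeddings are similar enough to be merged."""
    return cosine_similarity(a, b) > MERGE_THRESHOLD
```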

Self-Improving Memory

  • Outcome-based learning (Karpathy Loop) — agents report success/failure after acting on recalled memories; the system reinforces what works and auto-generates preventive rule-type memories on failure
  • Crystallization — LLM merges near-duplicate memories into canonical atomic facts with full provenance; 8-status lifecycle automation retires stale data
  • Per-agent retrieval tuning — each agent optimizes its own retrieval profile (top_k, min_similarity, graph_max_hops, blend weights) from feedback, so search quality compounds with every interaction

Integrations

  • MCP server — built-in Model Context Protocol at /mcp (Streamable HTTP). Connect Claude Desktop, Claude Code, Cursor, Windsurf, or any MCP client with a URL and API key
  • Multi-provider LLM — primary + fallback provider chain per tenant (OpenAI, Gemini, Anthropic, OpenRouter) with platform defaults for zero-config tenants
  • Document store — structured JSONB collections alongside semantic memories for exact-field lookups (customer records, config, task lists)

Performance

Benchmarked against the two most-cited public agent-memory benchmarks. Methodology and operator-scale context live in docs/performance.md; the full write-up is on the blog.

Metric | LoCoMo | LongMemEval | Search latency
Accuracy (LLM-judge) | 77.6% | 72.5% |
Token savings vs full context | 96.6% | 98.2% |
Latency | | | 23 ms p50 · 27 ms p95

Accuracy sits inside the leading cluster across the field (Mem0, Zep, MemClaw — scores cluster in a narrow band). The axes we push hardest are latency and token efficiency, because those are the ones that compound as agent count grows — a few hundred ms of search latency disappears behind one LLM call, but bills millions of times a day across a fleet.

Single-agent benchmarks can't measure cross-agent recall, outcome propagation between agents, fleet-scoped visibility, or governance-aware retrieval. Those are the questions that decide whether a memory system is deployable inside a company. See docs/performance.md.

Source: Fast, Token-Efficient, and Built for Fleets (2026-04-19).

MCP (Model Context Protocol)

Add MemClaw to any MCP client with one config block.

Self-hosted (localhost):

{
  "mcpServers": {
    "memclaw": {
      "url": "http://localhost:8000/mcp",
      "headers": { "X-API-Key": "standalone" }
    }
  }
}

Managed platform (memclaw.net):

{
  "mcpServers": {
    "memclaw": {
      "url": "https://memclaw.net/mcp",
      "headers": { "X-API-Key": "mc_your_api_key_here" }
    }
  }
}

For team or production use, swap the tenant-scoped key for an agent-scoped credential — atomic provisioning via POST /api/v1/admin/agent-keys/provision (or the /settings/organization/api-credentials wizard) mints the credential + Agent row + initial trust + fleet membership in one round trip. Both kinds use the mc_ prefix; scope is set at mint time on the credential. See docs/integration-without-plugin.md. Using a tenant-scoped credential? Pass an explicit agent_id on every MCP tool call — the gateway refuses the reserved default (mcp-agent) on the tenant-scoped path.

Where to add this config:

  • Claude Code: ~/.claude/settings.json under "mcpServers"
  • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)
  • Cursor: Settings > MCP Servers > Add Server

The client discovers 12 tools automatically:

Tool | Purpose
memclaw_write | Single or batch write (up to 100 items). LLM infers type, title, summary, tags, embedding
memclaw_recall | Hybrid semantic + keyword recall with graph-enhanced retrieval; optional LLM brief
memclaw_manage | Per-memory lifecycle: read, update, transition, delete, bulk_delete, lineage
memclaw_list | Filter by type/status/agent/weight/date, sort, cursor-paginate
memclaw_doc | Document CRUD: write, read, query, delete, list_collections, search (semantic) on named JSON collections
memclaw_entity_get | Look up an entity with linked memories and relations
memclaw_tune | Tune per-agent retrieval parameters (top_k, min_similarity, graph_max_hops, etc.)
memclaw_insights | Analyze the memory store across 6 focus modes. Findings persist as insight memories
memclaw_evolve | Report outcomes against recalled memories; adjusts weights, generates rules (Karpathy Loop)
memclaw_stats | Aggregate counts: total + breakdowns by type, agent, status. Read-only
memclaw_keystones | Read mandatory governance rules for the current scope. Call once per session; the result overrides conflicting user instructions
memclaw_keystones_set | Author or remove keystone rules (op=set or op=delete). Trust ≥ 1 for your own scope=agent rule; ≥ 2 for scope=fleet/scope=tenant or another agent

Skill sharing is now done via memclaw_doc — agents share a SKILL.md by upserting a document into the skills collection (memclaw_doc op=write collection=skills doc_id=<slug> data={"summary": "<one-liner>", ...}). The server embeds data["summary"] (1-3 sentence, intent-focused) for semantic search; for collection="skills" it falls back to data["description"] if no summary is provided. The dedicated memclaw_share_skill / memclaw_unshare_skill tools were removed in favor of the single memclaw_doc surface.
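
As an illustration of the skill-sharing call described above, the sketch below assembles the arguments for a memclaw_doc write into the skills collection. The helper name is hypothetical, and the exact argument schema should be confirmed against /api/v1/tool-descriptions; only op, collection, doc_id, data["summary"], and agent_id are taken from this README.

```python
def share_skill_args(slug: str, summary: str, agent_id: str, **extra_fields) -> dict:
    """Build memclaw_doc arguments that upsert a skill document."""
    if not summary.strip():
        # The server embeds data["summary"] for semantic search, so send one.
        raise ValueError("summary should be a 1-3 sentence, intent-focused description")
    return {
        "op": "write",
        "collection": "skills",
        "doc_id": slug,
        "data": {"summary": summary, **extra_fields},
        "agent_id": agent_id,  # explicit agent_id, as tenant-scoped credentials require
    }
```

A call such as `share_skill_args("pdf-extract", "Extract tables from PDF invoices.", "ops-agent")` yields a payload an MCP client can pass to memclaw_doc.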

Install the skill (Claude Code & Codex)

Install MemClaw's usage guide as a skill so your agent knows when and how to use the 12 tools: the memory/doc mental model, the three rules (recall, write, supersede), trust levels, common patterns, and anti-patterns. The skill is loaded on-demand (not per-turn), so it costs nothing until the agent reaches for MemClaw.

Prerequisite: the MCP server is already registered (via claude mcp add for Claude Code or the equivalent for Codex — see the config block above). Confirm with claude mcp list — you should see memclaw: ... ✓ Connected.

Option A — one-liner (fastest)

Self-hosted (localhost):

curl -s "http://localhost:8000/api/v1/install-skill" | bash

Managed platform:

curl -s "https://memclaw.net/api/v1/install-skill" | bash
Option B — download, inspect, run (recommended for agents)

Automated agents (Claude Code, Codex) may refuse curl | bash for safety. A two-step install lets them audit the script first:

curl -s "http://localhost:8000/api/v1/install-skill" > /tmp/install-memclaw-skill.sh
less /tmp/install-memclaw-skill.sh      # review — it only does mkdir + curl + write
bash /tmp/install-memclaw-skill.sh
Options
Query param | Effect
(none) | Install for both Claude Code and Codex (default)
?agent=claude-code | Only Claude Code → ~/.claude/skills/memclaw/SKILL.md
?agent=codex | Only Codex → ~/.agents/skills/memclaw/SKILL.md
Verify
ls -la ~/.claude/skills/memclaw/SKILL.md       # Claude Code
ls -la ~/.agents/skills/memclaw/SKILL.md       # Codex

Restart your agent after installing; skills are loaded at startup. Re-run the installer any time to pull the latest version.

OpenClaw-plugin users get the skill automatically when the plugin installs; skip this step.

Deployment

The recommended way to run MemClaw is via Docker Compose (see Quick Start). This gives you a production-ready PostgreSQL + pgvector + Redis + API stack with a single command.

Published container images

Each release publishes multi-arch (linux/amd64, linux/arm64) images to GitHub Container Registry:

ghcr.io/caura-ai/caura-memclaw-core-api:v2.5.0
ghcr.io/caura-ai/caura-memclaw-core-storage-api:v2.5.0

Tags follow SemVer with floating aliases — :v1, :v1.0, :v1.0.0, plus :latest for the latest stable release. Pull them in your own compose file or Kubernetes manifests instead of building from source.
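
The floating-alias scheme can be expressed as a small function. This sketch assumes every release follows the same pattern as the :v1, :v1.0, :v1.0.0 example above; the function name is hypothetical, and :latest is left out because it depends on which release is currently the newest stable one.

```python
import re


def floating_aliases(version: str) -> list[str]:
    """Expand a full SemVer tag into its major, minor, and exact aliases."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected a tag like v1.2.3, got {version!r}")
    major, minor, patch = match.groups()
    return [f":v{major}", f":v{major}.{minor}", f":v{major}.{minor}.{patch}"]
```

Pinning the exact tag gives reproducible deploys, while the major or minor alias follows new releases within that line.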

Manual deployment (without Docker)

The core-api/ service is a standard FastAPI app that runs under any ASGI server (uvicorn, hypercorn). Requirements:

  • Python 3.12+
  • PostgreSQL 16+ with the pgvector extension
  • Redis (optional — falls back to in-memory cache if unavailable)
uvicorn core_api.app:app --host 0.0.0.0 --port 8000 --workers 2

Deployment topologies

MemClaw ships with two operational modes for the storage layer. Single-node (default) is what you get from Docker Compose, pip install, or any fresh deploy — one core-storage-api instance serves both reads and writes. This is the right choice for any deployment that isn't seeing sustained 100+ writes/sec.

The reader/writer split is an opt-in topology for high-write-rate deploys that want to scale reads independently of writes — e.g. by pointing read traffic at a Postgres streaming replica. Enabling it means running two core-storage-api services with different roles and pointing core-api at both:

  • Set CORE_STORAGE_ROLE=writer on the write-serving instance; =reader on the read-serving instance(s).
  • Set CORE_STORAGE_READ_URL on core-api to the reader service URL. Leave CORE_STORAGE_API_URL pointing at the writer.
  • READ_DATABASE_URL on each core-storage-api can point at a read replica if you have one.

Defaults: CORE_STORAGE_ROLE=hybrid and CORE_STORAGE_READ_URL="" — both null-safe, so single-node deploys need zero configuration to get the legacy single-service behavior.
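
The null-safe default can be pictured as a simple URL choice on the core-api side. The sketch below is an assumption about the behavior described above, not the actual routing code: reads go to CORE_STORAGE_READ_URL when it is set, and everything else goes to CORE_STORAGE_API_URL. The function name and the `is_read` flag are hypothetical.

```python
def storage_url_for(is_read: bool, env: dict[str, str]) -> str:
    """Pick the core-storage-api base URL for one request."""
    writer_url = env["CORE_STORAGE_API_URL"]
    reader_url = env.get("CORE_STORAGE_READ_URL", "")
    # An empty read URL means single-node mode: one service handles everything.
    if is_read and reader_url:
        return reader_url
    return writer_url
```

With only CORE_STORAGE_API_URL set, both reads and writes resolve to the same service, which matches the single-node default.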

Upgrading from v1.x

⚠️ v2.0.0 ships a destructive schema migration. If your installation is on v1.x and has any memories already stored, follow this procedure carefully: the migration NULLs every existing embedding to widen the pgvector column from 768 → 1024 dim. The application is designed to refuse the migration automatically; you must opt in.

What changes

  • Default embedder model: BAAI/bge-m3 (was: OpenAI text-embedding-3-small). Self-hosted via the new tei profile in docker-compose; documented in docs/local-embedder.md.
  • pgvector schema dim: vector(1024) (was: vector(768)).
  • Existing embeddings on memories.embedding, entities.name_embedding, and documents.embedding are NULLed by alembic revision 012_vector_dim_1024. Re-embedding is required; until rows are re-embedded, semantic search returns no results for those rows.

Procedure (OSS, docker-compose)

  1. Stop the stack so no writes happen during migration:

    docker compose down
    
  2. Snapshot the database. A pg_dump is the safest fallback. Replace <container> with the running PostgreSQL container name (typically caura-memclaw-db-1):

    docker compose up -d db    # bring just the DB back
    docker exec <container> pg_dump -U memclaw memclaw > backup-pre-v2.sql
    docker compose down
    
  3. Pull the new image and start with the migration opt-in env set. The gate enforces an explicit opt-in because the migration is destructive on a populated DB:

    docker compose pull
    MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS=true docker compose up -d
    

    The core-storage-api container will run alembic upgrade head on startup. The migration runs in seconds-to-minutes for typical OSS workloads.

  4. Verify migration completed. Look for the line Database initialization complete in the core-storage-api logs:

    docker compose logs core-storage-api | grep -i "alembic\|migration\|initialization complete"
    
  5. Re-embed your data. Two paths:

    • Lazy (zero action): the application re-embeds rows on next read or write that touches them. Search will return empty results for cold rows until they are touched. Acceptable for low-traffic personal deployments.
    • Eager (recommended): run the bundled backfill CLI. It walks every memory and entity with a NULL embedding and re-embeds via the configured provider. It is idempotent, so safe to re-run. First do a dry-run to estimate scope:
      docker compose run --rm core-storage-api \
        python -m core_storage_api.scripts.backfill_embeddings --dry-run
      
      Then the real run:
      docker compose run --rm core-storage-api \
        python -m core_storage_api.scripts.backfill_embeddings
      
      Optional knobs: --tenant-id <id> (per-tenant cutover), --batch-size N, --max-inflight N, --only-table memories|entities. Documents are NOT covered (their embed-source field is per-row JSON, not a fixed column); re-write any embedded documents to refresh them.
    • Eager (event-driven, recommended for multi-tenant production): if you run the core-worker service, drive the existing EMBED_REQUESTED consumer instead. The CLI scans WHERE embedding IS NULL and publishes one event per row, inheriting per-tenant concurrency + retry + DLQ:
      docker compose run --rm core-worker \
        python -m core_worker.cli backfill-embeddings --dry-run
      docker compose run --rm core-worker \
        python -m core_worker.cli backfill-embeddings
      
      Same knobs as the standalone script (--tenant-id, --batch-size, --max-inflight, --dry-run). Currently covers memories only.
  6. Once stable, unset MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS so subsequent up commands don't carry the opt-in:

    unset MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS  # if exported in the shell
    # or remove the line from your .env file
    

What if I skip the opt-in?

core-storage-api will refuse to start, with a clear error message reporting how many rows would be NULLed. The container exits non-zero; the rest of the stack stays healthy. No data is touched. Set the env var and retry.

Rolling back

The migration has a symmetric downgrade(). To revert, set the env var and explicitly downgrade:

docker compose run --rm \
  -e MEMCLAW_RUN_DESTRUCTIVE_MIGRATIONS=true \
  core-storage-api alembic downgrade 011

This NULLs any 1024-dim embeddings written since the upgrade and narrows the columns back to vector(768). The same data-loss tradeoff applies in reverse. For untouched-since-upgrade installations, the simpler recovery is to restore the snapshot from step 2.

v1.x → v2.x compatibility for client code

No public API changes. Code that reads memory embeddings via the search/recall endpoints is unaffected. Client code that hardcodes 768 for vector lengths should be updated to read VECTOR_DIM from common.constants.

API Reference

All routes are versioned under /api/v1/. Interactive Swagger docs at /api/docs.

Memory endpoints
Endpoint | Method | Description
/memories | POST | Write a memory. LLM enrichment + embedding + entity extraction + contradiction detection. "persist": false for extract-only preview
/memories/bulk | POST | Write up to 100 memories. Batches embeddings, parallelizes enrichment, single transaction. Requires X-Bulk-Attempt-Id header (per-attempt idempotency); a retry with the same id resolves committed rows as duplicate_attempt instead of duplicating. Returns 200 (clean / all-error) or 207 Multi-Status (mixed); read per-item status
/memories | GET | List memories (filter by type, status, agent; paginate)
/memories/{id} | GET | Full memory detail (embedding stats, entity links, RDF triple, temporal bounds)
/memories/{id} | PATCH | Update content or metadata. Re-embeds if content changes
/memories/{id} | DELETE | Soft delete (sets status to deleted)
/memories/{id}/status | PATCH | Update lifecycle status
/memories/{id}/contradictions | GET | View contradiction chain
/memories | DELETE | Bulk soft-delete
/memories/stats | GET | Counts by type, agent, and status
/search | POST | Hybrid semantic + keyword search with graph-enhanced retrieval
/recall | POST | Search + LLM summarization; returns context paragraph + source memories
/ingest/preview | POST | Extract 5-20 atomic facts from a URL or text (no writes)
/ingest/commit | POST | Write previewed facts as memories
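
For the bulk endpoint, the main client-side concern is reusing one X-Bulk-Attempt-Id across retries of the same batch. The sketch below shows that bookkeeping. The helper names are hypothetical, and using a UUID as the attempt id is an assumption: the README only states that the header is required.

```python
import uuid


def new_attempt_id() -> str:
    """Create one id per logical batch; reuse it for every retry of that batch."""
    return str(uuid.uuid4())


def bulk_headers(api_key: str, attempt_id: str) -> dict[str, str]:
    """Headers for POST /api/v1/memories/bulk."""
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
        "X-Bulk-Attempt-Id": attempt_id,
    }


def classify_bulk_status(status_code: int) -> str:
    """Map the documented status codes to how the per-item results should be read."""
    if status_code == 200:
        return "uniform"  # every item succeeded or every item failed
    if status_code == 207:
        return "mixed"  # some items succeeded, some failed
    raise ValueError(f"unexpected bulk status {status_code}")
```

Because a 200 can also mean that every item failed, the per-item status is worth reading in both cases.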
Knowledge graph endpoints
Endpoint | Method | Description
/entities | GET | List entities (filter by type, search)
/entities/upsert | POST | Create or update entity
/entities/{id} | GET | Entity detail with relations and linked memories
/relations/upsert | POST | Create or update relation
/graph | GET | Full knowledge graph (entities + relations)
Evolve, Insights, Agents, Crystallizer, Documents, Fleet, Admin

Karpathy Loop / Evolve

Endpoint | Method | Description
/evolve/report | POST | Report an outcome (success/failure/partial) against recalled memories

Insights

Endpoint | Method | Description
/insights/generate | POST | LLM-powered analysis. Focus: contradictions, failures, stale, divergence, patterns, discover

Agents

Endpoint | Method | Description
/agents | GET | List registered agents with trust levels
/agents/{id} | GET | Single agent detail
/agents/{id}/trust | PATCH | Set trust level (0-3)

Memory Crystallizer

Endpoint | Method | Description
/crystallize | POST | Trigger crystallization for a tenant
/crystallize/all | POST | Trigger for all tenants (admin key, nightly)
/crystallize/reports | GET | List crystallization reports
/crystallize/latest | GET | Most recent completed report

Documents

Endpoint | Method | Description
/documents | POST | Store or update a structured JSON document
/documents/{id} | GET | Retrieve document by ID
/documents/query | POST | Query by field equality filters
/documents/{id} | DELETE | Delete a document

Fleet

Endpoint | Method | Description
/fleet/heartbeat | POST | Plugin heartbeat: upserts node status, returns pending commands
/fleet/nodes | GET | List fleet nodes with status (online/stale/offline)
/fleet/commands | POST | Queue a command for a node
/fleet/commands | GET | List command history

Admin + System

Endpoint | Method | Description
/health | GET | Liveness check
/version | GET | Current version
/tool-descriptions | GET | Canonical MCP tool descriptions
/admin/tenants | GET | List all tenants (admin key)
/admin/fleets | GET | List fleets across all tenants (admin key)
/admin/memories | GET | List memories across all tenants with filters (admin key)
/admin/memories/stats | GET | Memory counts by tenant/type/status (admin key)
/settings | GET / PUT | Per-tenant configuration
/audit-log | GET | Audit log entries
/mcp | POST | MCP Streamable HTTP endpoint (mounted at app root, NOT under /api/v1)

Auth: X-API-Key header for all endpoints. Admin endpoints require the admin key. Public (no auth): /api/v1/health, /api/v1/version, /api/v1/tool-descriptions.

Gateway-injected headers (trusted only behind the enterprise gateway):

Header | Effect
X-Agent-ID | Scopes the request to this agent
X-Org-Read-Only: true | Read-only mode; creates/updates return 403
X-Tenant-ID | Tenant identity when using the shared MEMCLAW_API_KEY gate

These headers are honored unconditionally — core-api must not be network-exposed without a gateway that strips them from untrusted callers.
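
What "strips them from untrusted callers" means in practice can be sketched as a header filter that a gateway or reverse proxy runs before forwarding a request. This is an illustrative sketch, not the enterprise gateway's code; the function name is hypothetical and the header list is the one from the table above.

```python
GATEWAY_HEADERS = frozenset({"x-agent-id", "x-org-read-only", "x-tenant-id"})


def strip_gateway_headers(headers: dict[str, str]) -> dict[str, str]:
    """Drop trusted-only headers from an inbound request (case-insensitive)."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in GATEWAY_HEADERS
    }
```

After stripping, the gateway can add its own verified values for these headers, so core-api only ever sees identities the gateway has vouched for.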

Rate limiting (managed platform)

These limits apply to the managed platform at memclaw.net. In the OSS edition, rate limiting is a no-op — see the Rate limiting section below.

Scope | Limit
Memory writes | 60 req/min per API key
Memory searches | 120 req/min per API key
General reads | 300 req/min per API key
Auth endpoints | 10 req/min per IP
Global DDoS floor | 1000 req/min per IP

Exceeded limits return HTTP 429 with a Retry-After header.

Configuration

All configuration is via environment variables or .env. See .env.example for the full list.

Migrating from a pre-1.0 deploy? The legacy ALLOYDB_* env var names are still accepted as aliases — POSTGRES_HOST falls back to ALLOYDB_HOST, etc. Aliases will be dropped in a future major release.

Variable | Default | Description
POSTGRES_HOST | 127.0.0.1 | Database host
POSTGRES_PORT | 5432 | Database port
POSTGRES_USER | memclaw | Database user
POSTGRES_PASSWORD | changeme | Database password
POSTGRES_DB | memclaw | Database name
POSTGRES_USE_IAM_AUTH | false | Use GCP IAM for DB auth (managed Postgres on GCP only)
ADMIN_API_KEY | (empty) | Admin API key; bypasses tenant enforcement
EMBEDDING_PROVIDER | openai | openai, local, or fake
ENTITY_EXTRACTION_PROVIDER | openai | openai, gemini, anthropic, openrouter, fake, or none
ENTITY_EXTRACTION_MODEL | gpt-5.4-nano | LLM model for enrichment and entity extraction
OPENAI_API_KEY | | Required for OpenAI embeddings and enrichment
USE_LLM_FOR_MEMORY_CREATION | true | LLM auto-classifies type, weight, title, summary, tags on write
ANTHROPIC_API_KEY | | Required for Anthropic
OPENROUTER_API_KEY | | Required for OpenRouter
GEMINI_API_KEY | | Required for Gemini (Developer API, from AI Studio)
CORS_ORIGINS | http://localhost:3000 | Comma-separated allowed CORS origins
ENVIRONMENT | development | development or production
SETTINGS_ENCRYPTION_KEY | | Fernet key for encrypting tenant settings. Required in production
PLATFORM_LLM_PROVIDER | (empty) | Platform-default LLM: openai, vertex, or empty to disable
PLATFORM_LLM_MODEL | (empty) | Model override (e.g. gpt-5.4-nano, gemini-3.1-flash-lite-preview)
PLATFORM_LLM_API_KEY | | OpenAI API key for the platform LLM singleton
PLATFORM_LLM_GCP_PROJECT_ID | | GCP project for platform Vertex LLM
PLATFORM_LLM_GCP_LOCATION | us-central1 | GCP region for platform Vertex LLM
PLATFORM_EMBEDDING_PROVIDER | (empty) | Platform-default embeddings: openai or empty to disable
PLATFORM_EMBEDDING_MODEL | (empty) | Embedding model override (e.g. text-embedding-3-small)
PLATFORM_EMBEDDING_API_KEY | | OpenAI API key for platform embeddings
Project structure
memclaw/
├── core-api/                      # Main FastAPI service
│   └── src/core_api/
│       ├── app.py                 # FastAPI app, lifespan, middleware
│       ├── mcp_server.py          # MCP server (Streamable HTTP, 12 tools)
│       ├── constants.py           # Tool descriptions, limits, ranking params
│       ├── config.py              # Settings (env vars)
│       ├── auth.py                # API key + JWT auth, tenant enforcement
│       ├── routes/                # Route handlers
│       ├── services/              # Business logic
│       ├── providers/             # LLM/embedding abstraction + fallback
│       ├── pipeline/              # Composable write/search pipelines
│       └── tools/                 # MCP tool implementations
│
├── core-storage-api/              # PostgreSQL CRUD microservice
│   └── src/core_storage_api/
│       ├── routers/               # Memory, entity, document, fleet CRUD
│       ├── services/              # ORM operations
│       └── database/              # SQLAlchemy models, Alembic migrations
│
├── plugin/                        # OpenClaw plugin (TypeScript)
│   └── src/
│       ├── tools.ts               # Tool implementations
│       ├── agent-auth.ts          # Per-agent credentials (agent-scoped mc_ keys)
│       ├── context-engine.ts      # Auto-read/write lifecycle
│       ├── heartbeat.ts           # 60s heartbeat → MemClaw API
│       └── educate.ts             # Agent education delivery
│
├── common/                        # Shared SQLAlchemy ORM models and constants
├── tests/                         # Test suite
├── scripts/                       # Smoke tests, benchmarks, export tools
├── docker-compose.yml             # Production-like stack
├── docker-compose.dev.yml         # Dev stack
└── .env.example                   # Full configuration reference
Latency benchmarks

Typical results on a single-instance deployment (OpenAI embeddings + GPT-5.4 Nano):

| Operation | Mean | P50 | P95 |
|---|---|---|---|
| memclaw_write | ~2000ms | ~2000ms | ~2300ms |
| memclaw_recall | ~650ms | ~640ms | ~670ms |
| memclaw_recall (with include_brief=true) | ~1300ms | ~1200ms | ~2100ms |

Write latency is dominated by LLM enrichment; recall latency is dominated by the embedding API call.

python scripts/latency_test.py --url http://localhost:8000 --api-key <admin-key> --runs 20
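For readers who want to reproduce the Mean / P50 / P95 columns against their own client code, the sketch below shows one common way to derive them from repeated timings. It is not the contents of `scripts/latency_test.py`; the helper names and the nearest-rank percentile method are assumptions chosen for illustration.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(fn: Callable[[], object], runs: int) -> List[float]:
    """Call fn `runs` times and return per-call wall-clock latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Mean, P50 and P95 of the samples, using the nearest-rank percentile method."""
    ordered = sorted(samples_ms)

    def percentile(p: int) -> float:
        # Nearest rank: ceil(p * n / 100), computed in integers to avoid float drift.
        rank = max(1, (p * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(50),
        "p95": percentile(95),
    }


# Stand-in workload; replace the lambda with a real write or recall call.
print(summarize(time_calls(lambda: sum(range(1000)), runs=20)))
```

With only 20 runs, P95 is effectively the second-slowest sample, so a single slow provider response can move it noticeably.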

Public API & Stability

MemClaw v1.x follows SemVer 2.0.0. The surfaces below are stable; everything else is internal and may change in any release.

Stable surfaces

MCP tools (12)

The MCP server is mounted at /mcp. Tool names, parameter names, and the documented op-dispatch values are stable.

Tool Purpose
memclaw_recall Hybrid semantic + keyword search over memories, with optional LLM-summarised brief.
memclaw_write Single or batch (≤100) memory write; auto-enriched with type, title, summary, tags.
memclaw_manage Per-memory lifecycle, op-dispatched: read | update | transition | delete | bulk_delete | lineage.
memclaw_list Non-semantic enumeration with filters, sort, cursor pagination.
memclaw_doc Structured-document CRUD, op-dispatched: write | read | query | delete | list_collections | search.
memclaw_entity_get Look up a knowledge-graph entity by UUID.
memclaw_tune Read/update an agent's per-search profile (top_k, fts_weight, freshness, blend, …).
memclaw_insights Karpathy-Loop reflection: contradictions, failures, stale, divergence, patterns, discover.
memclaw_evolve Karpathy-Loop feedback: record an outcome (success | failure | partial) against memories.
memclaw_stats Aggregate counts: total + breakdowns by type / agent / status. Read-only.
memclaw_keystones Read mandatory governance rules for the current scope (tenant + fleet + agent merged). Call once per session.
memclaw_keystones_set Author/remove keystone rules, op-dispatched: set | delete. Trust ≥ 1 for self-authored scope=agent; ≥ 2 otherwise.

Skill sharing uses the generic memclaw_doc surface — write/read/query/search/delete on collection="skills". The server validates the slug and embeds data["summary"] for semantic discovery (with a back-compat fallback to data["description"] for skills).
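As a rough illustration of sharing a skill through memclaw_doc, the sketch below builds the JSON-RPC `tools/call` envelope that MCP clients send. The envelope shape comes from the MCP specification; the argument names `op` and `doc_id` and the slug value are assumptions, since only `collection` and `data["summary"]` are named above. Check GET /tool-descriptions for the authoritative parameter list.

```python
import json
from typing import Any, Dict


def tools_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request as defined by the MCP specification."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


share_skill = tools_call(1, "memclaw_doc", {
    "op": "write",                   # assumed parameter name for the op-dispatch value
    "collection": "skills",          # from the paragraph above
    "doc_id": "retry-with-backoff",  # hypothetical field name and slug
    "data": {"summary": "Retry flaky HTTP calls with exponential backoff."},
})
print(json.dumps(share_skill, indent=2))
```

In practice an MCP client library handles the session handshake and transport for you; the envelope is shown only to make the op-dispatch idea concrete.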

REST endpoints

All paths are prefixed with /api/v1 unless noted. Request and response shapes documented in the OpenAPI schema at /openapi.json are part of the contract.

| Area | Endpoints |
|---|---|
| Memory | GET/POST /memories, PATCH /memories/{id}, DELETE /memories/{id}, PATCH /memories/{id}/status, POST /memories/bulk, POST /memories/bulk-delete, GET /memories/stats, GET /memories/{id}, GET /memories/{id}/contradictions, POST /search, POST /recall, POST /ingest/preview, POST /ingest/commit |
| Knowledge graph | GET /entities, GET /entities/{id}, POST /entities/upsert, GET /graph, POST /relations/upsert |
| Documents | POST /documents, GET /documents, GET /documents/{id}, POST /documents/query, DELETE /documents/{id} |
| Keystones | GET /memclaw/keystones, POST /memclaw/keystones, DELETE /memclaw/keystones/{doc_id} |
| Fleet | POST /fleet/heartbeat, GET /fleet/nodes, POST /fleet/commands, GET /fleet/commands |
| Agents | GET /agents, GET /agents/{id}, PATCH /agents/{id}/trust, POST /admin/agent-keys/provision (atomic key + row + trust + fleet), GET /whoami (identity probe) |
| Insights | POST /insights/generate |
| Evolve | POST /evolve/report |
| Crystallizer | POST /crystallize, POST /crystallize/all, GET /crystallize/reports, GET /crystallize/latest |
| Settings | GET/PUT /settings |
| System | GET /health, GET /version, GET /tool-descriptions, GET /audit-log |
| MCP | POST /mcp (Streamable HTTP transport, mounted at app root) |
| Bootstrap (plugin) | GET /plugin-source, GET /plugin-source-hash, GET/POST /install-plugin, GET /install-skill, GET /skill/memclaw. Aliased under /api (no /v1) for one-line installers. |
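To make the prefix rule concrete, here is a minimal standard-library sketch that builds a request for one of the endpoints listed above. The `build_request` helper is hypothetical. The `X-API-Key` header is the one the plugin is documented to send, and `http://localhost:8000` is the local address used in the latency example; response shapes are defined by /openapi.json and are not assumed here.

```python
import urllib.request

API_PREFIX = "/api/v1"


def build_request(
    base_url: str, path: str, api_key: str, method: str = "GET"
) -> urllib.request.Request:
    """Build (but do not send) a request for an /api/v1 REST endpoint."""
    url = base_url.rstrip("/") + API_PREFIX + path
    return urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})


req = build_request("http://localhost:8000", "/memories/stats", "<api-key>")
print(req.get_method(), req.full_url)
# To send it against a running server: urllib.request.urlopen(req)
```

Endpoints that the table marks as mounted at the app root or aliased under /api (such as POST /mcp) would need a different prefix than this helper applies.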
Plugin environment variables

Read by the OpenClaw plugin. The plugin's published name (memclaw) and these variables are the public contract; the plugin's TypeScript module structure is internal.

| Var | Purpose |
|---|---|
| MEMCLAW_API_URL | Base URL of the core-api server. |
| MEMCLAW_API_KEY | Tenant or admin API key sent in X-API-Key. |
| MEMCLAW_TENANT_ID | Optional pre-resolved tenant id; bypasses lookup. |
| MEMCLAW_FLEET_ID | Default fleet id for writes/heartbeat. |
| MEMCLAW_NODE_NAME | Fleet node identifier reported on heartbeat. |
| MEMCLAW_AUTO_WRITE_TURNS | Auto-write turn summaries (default true). |
Server environment variables

These mirror the Configuration table above. See it for defaults.

| Group | Vars |
|---|---|
| Database | POSTGRES_HOST, POSTGRES_PORT, POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB, POSTGRES_USE_IAM_AUTH, POSTGRES_REQUIRE_SSL |
| Auth | ADMIN_API_KEY, MEMCLAW_API_KEY, IS_STANDALONE |
| Providers | EMBEDDING_PROVIDER, ENTITY_EXTRACTION_PROVIDER, OPENAI_API_KEY, ANTHROPIC_API_KEY, OPENROUTER_API_KEY, GEMINI_API_KEY, USE_LLM_FOR_MEMORY_CREATION |
| Runtime | CORS_ORIGINS, ENVIRONMENT, SETTINGS_ENCRYPTION_KEY, REDIS_URL |
Auth modes
| Mode | Activated by | Use case |
|---|---|---|
| Standalone | IS_STANDALONE=true | Single-tenant self-host; auth bypassed. |
| Multi-tenant admin | ADMIN_API_KEY=… | Operator key for multi-tenant deployments. |
| Shared gate | MEMCLAW_API_KEY=… | Optional shared secret required on every non-admin request. |

See AGENT-INSTALL.md for installation flows that exercise each mode.
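A quick way to sanity-check which of the modes above a given environment would activate is a helper like the one below. It is a sketch that mirrors the "Activated by" column only: `active_auth_modes` is a hypothetical name, the case-insensitive parsing of `true` is an assumption, and the sketch says nothing about how the server prioritises modes when several are set.

```python
from typing import List, Mapping


def active_auth_modes(env: Mapping[str, str]) -> List[str]:
    """List the auth modes whose activating variable is set in env."""
    modes = []
    if env.get("IS_STANDALONE", "").strip().lower() == "true":
        modes.append("standalone")
    if env.get("ADMIN_API_KEY", "").strip():
        modes.append("multi-tenant admin")
    if env.get("MEMCLAW_API_KEY", "").strip():
        modes.append("shared gate")
    return modes


print(active_auth_modes({"ADMIN_API_KEY": "<admin-key>", "MEMCLAW_API_KEY": "<gate>"}))
```

Running it against `os.environ` before starting the stack can catch a deploy that accidentally leaves IS_STANDALONE=true on an internet-facing host.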

Internal (not covered by SemVer)

Anything not listed above is internal and may change in any release without a major version bump:

  • Python module layout (core_api.middleware.*, core_api.providers.*, core_api.pipeline.*, core_api.services.*, common/*)
  • Database schema, table names, migration paths
  • Gateway-injected HTTP headers (X-Memclaw-Gateway, X-Tenant-ID, X-Agent-ID, X-Org-Read-Only)
  • Most /api/v1/admin/* and all /api/v1/testing/* routes (the documented exception is POST /admin/agent-keys/provision, which is part of the stable identity-bootstrap surface — see the Agents row above)
  • The core-storage-api microservice (internal, not user-facing)
  • The plugin's TypeScript module structure
  • API-key prefix formats — currently unified on mc_… (with legacy mca_… / mci_… aliases still accepted via back-compat); formats may continue to evolve

Reporting breaking changes

Contributors who introduce a breaking change to a stable surface must:

  • Add a BREAKING CHANGE: trailer to the commit message describing the impact and any migration steps.
  • Apply the kind/breaking label to the pull request.

Reviewers will block merges to dev that touch a stable surface without these markers. If you are unsure whether a change is breaking, open the PR with the label and let review decide — better to over-label than ship a silent break.
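Contributors who want to check the trailer locally before pushing could use something like the following. It is a hypothetical helper, not part of the project's CI, and it assumes the trailer must start a line and be followed by a non-empty description.

```python
import re

# A line that starts with the trailer token and carries some description text.
_TRAILER = re.compile(r"^BREAKING CHANGE: \S.*$", re.MULTILINE)


def has_breaking_trailer(commit_message: str) -> bool:
    """True if some line is 'BREAKING CHANGE: ' followed by a description."""
    return _TRAILER.search(commit_message) is not None


message = (
    "feat(api): rename a recall parameter\n"
    "\n"
    "BREAKING CHANGE: callers must send the new parameter name.\n"
)
print(has_breaking_trailer(message))
```

The check covers only the commit message; the kind/breaking label still has to be applied on the pull request itself.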

Telemetry and error tracking

MemClaw supports optional Sentry integration for error tracking and performance monitoring:

  • Opt-in only — set the SENTRY_DSN environment variable to enable. No errors are reported unless you explicitly configure a DSN.
  • No usage analytics — MemClaw does not collect usage statistics, feature flags, or behavioral data.
  • No phone-home — the application makes zero outbound calls unless you configure a Sentry DSN or an LLM/embedding provider.

OpenClaw Plugin

See static/docs/integration-guide.md for full plugin setup, agent system prompts, and usage examples.

Rate limiting

Rate limiting in the OSS edition is a no-op: all rate-limit decorators are identity functions that accept every request without throttling. For production deployments exposed to the internet, add rate limiting at your reverse proxy (nginx, Caddy, Cloudflare) or implement application-level limiting in core-api/src/core_api/middleware/rate_limit.py.
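To illustrate the difference between an identity decorator and a real limiter, here is a minimal sketch. It does not reproduce the contents of `rate_limit.py`: the names `no_op_limit`, `fixed_window_limit`, and `RateLimitExceeded` are hypothetical, and the limiter is a simple process-local fixed window for synchronous functions rather than a per-client, production-grade implementation.

```python
import functools
import time
from typing import Callable


def no_op_limit(_rule: str) -> Callable:
    """Identity decorator factory: returns the wrapped function unchanged."""
    def decorator(fn: Callable) -> Callable:
        return fn
    return decorator


class RateLimitExceeded(Exception):
    """Raised when a call exceeds the configured budget."""


def fixed_window_limit(
    max_calls: int, window_s: float, clock: Callable[[], float] = time.monotonic
) -> Callable:
    """Allow at most max_calls per window_s seconds for the decorated function."""
    def decorator(fn: Callable) -> Callable:
        state = {"window_start": None, "count": 0}

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            now = clock()
            # Start a fresh window on first use or once the old one has elapsed.
            if state["window_start"] is None or now - state["window_start"] >= window_s:
                state["window_start"], state["count"] = now, 0
            if state["count"] >= max_calls:
                raise RateLimitExceeded(f"more than {max_calls} calls in {window_s}s")
            state["count"] += 1
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@fixed_window_limit(max_calls=2, window_s=60.0)
def handle_request() -> str:
    return "ok"
```

The counter here is shared by all callers and is not thread-safe, so treat it as a starting point. A reverse-proxy limiter is usually the simpler option for internet-facing deployments.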

Telemetry

MemClaw does not phone home by default. No usage data, analytics, or tracking of any kind.

If you set the SENTRY_DSN environment variable, Sentry error tracking is enabled: crash reports and performance traces are sent to your configured Sentry project. When SENTRY_DSN is empty (the default), Sentry is not initialized and no data leaves the server.
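The opt-in behaviour can be pictured as a guard around SDK initialisation. The sketch below is an illustration and not MemClaw's startup code: `init_error_tracking` is a hypothetical name, and it assumes the `sentry-sdk` package is installed whenever a DSN is configured.

```python
from typing import Mapping


def init_error_tracking(env: Mapping[str, str]) -> bool:
    """Initialise Sentry only when SENTRY_DSN is set; return whether it was enabled."""
    dsn = env.get("SENTRY_DSN", "").strip()
    if not dsn:
        return False  # default: the SDK is never imported and nothing is sent
    import sentry_sdk  # third-party; only needed when a DSN is configured

    sentry_sdk.init(dsn=dsn)
    return True


print(init_error_tracking({}))
```

Importing the SDK lazily keeps the default path free of any Sentry code, which matches the "not initialized" wording above.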

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines, development setup, and how to submit PRs.

License

MemClaw is licensed under the Apache License, Version 2.0.

See NOTICE for copyright and third-party attributions.

Trademarks

"MemClaw" and "Caura" are trademarks of Caura. The Apache License 2.0 grants permission to use the source code but does not grant permission to use these names, logos, or branding in a way that suggests endorsement of, or affiliation with, any derivative work. See Apache License 2.0 §6 for the full legal terms.
