retavyn

Persistent memory for Claude across sessions.

Every Claude session starts cold — no memory of what you worked on yesterday, what decisions you made, what you learned. Retavyn fixes that. It stores what matters and injects it back into Claude's context automatically at the start of every session.

You talk to Claude normally. It remembers.

Features

Automatic context injection — a SessionStart hook dumps all memories to a local cache and injects them into context before the first message
Hybrid search — full-text (tsvector/tsquery) and semantic similarity (pgvector) combined for recall that works on exact words or general concepts
Two transport modes — stdio for Claude Code, HTTP/SSE for claude.ai remote access
Category tagging — store memories with categories (ci-cd, journal, project, etc.) for filtered recall
Bulk ingestion — ingest_path walks a file or directory tree and stores each file as a memory, with automatic embedding backfill
Live cache refresh — a PostToolUse hook refreshes the local cache immediately after every remember call
OAuth-secured remote access — custom OAuth 2.0 + JWT flow required by the MCP spec for HTTP transport, served behind a Cloudflare Tunnel

How it works

Retavyn runs as an MCP server alongside Claude. When a session starts, a hook fires automatically — it dumps all stored memories to a local cache file and injects them into Claude's context before the first message. A second hook refreshes that cache after every remember call, so new memories are available in the next session immediately.

Search is hybrid: full-text (tsvector/tsquery) for exact matches and semantic similarity (pgvector cosine distance) for concept-level recall. Results from both passes are merged and ranked.

The server supports two transports. In stdio mode, Claude Code spawns it as a local subprocess — zero network exposure. In HTTP/SSE mode, it runs on a server behind a Cloudflare Tunnel with OAuth 2.0 + JWT auth, and claude.ai connects to it as a remote MCP server. That same HTTP endpoint is also what lets multiple machines share one memory pool — every Claude Code install can point its MCP config at the remote database, so your memories follow you across machines.

Architecture

Claude Code (local, stdio)

┌──────────────────────────────────────────────────────────┐
│                      Claude Code                          │
│   SessionStart hook ──► inject retavyn-cache.md          │
│   PostToolUse hook  ──► refresh cache after remember      │
└───────────────────────────┬──────────────────────────────┘
                            │ stdio  (MCP protocol)
                   ┌────────▼────────┐
                   │    retavyn      │  Python + FastMCP
                   │   MCP server   │
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  PostgreSQL 18  │  Docker · port 5433
                   │  + pgvector     │  tsvector + pgvector
                   └─────────────────┘

claude.ai (remote, HTTP/SSE)

claude.ai  →  https://mcp.retavyn.com  →  Cloudflare edge (TLS)
           →  cloudflared tunnel  →  retavyn :8765  →  PostgreSQL :5433

OAuth flow: claude.ai opens /authorize, user authenticates, server issues a JWT, claude.ai uses it as a Bearer token on all subsequent MCP calls.

Search internals

When you call recall("billing pipeline"), retavyn runs two passes and merges the results:

Full-text search — tsvector @@ to_tsquery('billing & pipeline'), ranked by ts_rank
Semantic search — cosine distance between the query embedding and stored embeddings via pgvector (embedding <=> $1 < threshold)
Results are deduplicated and returned ranked by combined score

Embeddings are generated via OpenAI text-embedding-3-small or Cohere embed-english-v3.0 (configurable). Memories without embeddings fall back to full-text only.

MCP tools

Tool	Description
`remember`	Store a memory with optional category tag
`recall`	Hybrid full-text + semantic search across memories
`update_memory`	Edit an existing memory by ID
`forget`	Delete a memory by ID
`forget_path`	Delete all memories ingested from a file or directory path
`ingest_path`	Bulk-import a file or directory tree as memories
`backfill_embeddings`	Generate embeddings for memories that don't have them
`ask_infra`	Ask a DevOps question — runs a full agent loop (memory search + live gcloud) and returns a synthesized answer

ask_infra

ask_infra is an agent embedded inside retavyn. When called, it spins up its own Claude tool-use loop with two tools — recall_memory (hybrid search over your retavyn memories) and run_gcloud (read-only live GCP queries) — iterates until it has a complete answer, then returns it as a single response.

From Claude Code's perspective it's one tool call. Under the hood it's a full agent making multiple passes across memory and live infrastructure state before synthesizing an answer.

Example questions:

"What load balancer setup do we use for Cloud Run services?"

"Which GKE clusters are running in prod right now?"

"How do we handle Cloud SQL private service connect?"

The agent is also available as a standalone CLI — see infra-agent/README.md.

Setup

Guide	What it covers
INSTALL.md	Local setup — run retavyn on your machine with Claude Code
SERVER.md	Remote server — deploy to a VM for claude.ai and cross-machine access

Environment variables

Variable	Default	Description
`MEMORY_DB_HOST`	`localhost`	PostgreSQL host
`MEMORY_DB_PORT`	`5433`	PostgreSQL port
`MEMORY_DB_NAME`	`retavyn`	Database name
`MEMORY_DB_USER`	`claude`	Database user
`MEMORY_DB_PASSWORD`	`claude`	Database password
`MEMORY_TRANSPORT`	`stdio`	`stdio` or `streamable-http`
`MEMORY_HOST`	`0.0.0.0`	Bind address (HTTP mode)
`MEMORY_PORT`	`8765`	Port (HTTP mode)
`OAUTH_SECRET`	—	JWT signing secret (HTTP mode)
`OAUTH_PASSWORD`	—	Auth password for browser flow (HTTP mode)
`OPENAI_API_KEY`	—	For OpenAI embeddings (optional)
`COHERE_API_KEY`	—	For Cohere embeddings (optional)

Documentation

File	Contents
INSTALL.md	Local install: setup.sh, MCP config, hooks
SERVER.md	Remote deploy: GCE VM, Cloudflare Tunnel, OAuth, claude.ai
TUTORIAL.md	First memory → first recall → journaling
API.md	Complete tool reference, search internals, advanced usage

CLI commands

python main.py          # start MCP server (stdio)
python main.py dump     # export all memories to ~/.claude/retavyn-cache.md
python main.py remember <content> [category]  # store a memory from the CLI
python main.py health   # check DB connection and memory count
python main.py ingest <path> [category]  # bulk ingest a file or directory

retavyn

retavyn

Features

How it works

Architecture

Claude Code (local, stdio)

claude.ai (remote, HTTP/SSE)

Search internals

MCP tools

ask_infra

Setup

Environment variables

Documentation

CLI commands

MCP Server · Populars

🦞 OpenClaw — Personal AI Assistant

MarkItDown-MCP

MarkItDown

Awesome MCP Servers

mcp-server-sentry: A Sentry MCP server

MCP Server · New

jurisd

sosumi.ai

Wazuh MCP Server

抖音视频上传 Skills

codesurface