§ IV — Operator manual
Documentation
Set up Engram, connect LLM clients, inspect memory behavior, and choose between proxy-based automatic memory and MCP-based tool memory.
§ 01
Quick start
Engram runs as a FastAPI service backed by PostgreSQL and pgvector. The dashboard stores one Engram API key in the browser, while provider keys stay on the server.
Use the same ek_... key across Claude Desktop, VS Code Agent Mode, Cursor-style clients, and the dashboard when you want one shared memory store.
cp .env.example .env
docker compose up -d
curl -X POST http://localhost:8000/users \
  -H "Content-Type: application/json" \
  -d '{"external_id":"test_user_1"}'
§ 02
Dashboard workflow
Home
Land on the overview: total memories, pending reviews, entity count, recent activity timeline, top connected entities, and latest retrievals.
Memories
The full ledger at /memories. Search, add, edit, approve pending extractions, merge duplicates, export, import, and decay confidence.
Graph
/graph renders entities extracted from memories as a force-directed graph. Hover to highlight neighborhoods, click for memory lists, filter by type.
Chat
/chat talks to your configured provider through the proxy. Memories are injected automatically, and new ones are extracted in the background.
Logs
Every retrieval event: which memories were surfaced, their scores, and the conversation that caused it.
Settings
Engram API key, provider config, encrypted provider keys, retrieval thresholds, retrieval mode (vector / hybrid / graph), and dedup tuning.
§ 03
Connect MCP clients
MCP clients use Engram over stdio. Claude Desktop, VS Code Agent Mode, Cursor, Windsurf, and similar clients can load the same tool server and write to the same memory account.
Build the MCP package once, then point the client at mcp/dist/index.js with --transport stdio.
{
  "servers": {
    "engram": {
      "type": "stdio",
      "command": "C:\\nvm4w\\nodejs\\node.exe",
      "args": [
        "F:\\Engram\\mcp\\dist\\index.js",
        "--transport",
        "stdio"
      ],
      "env": {
        "ENGRAM_API_URL": "http://localhost:8000",
        "ENGRAM_API_KEY": "ek_your_key_here"
      }
    }
  }
}
§ 04
Automatic memory capture
The MCP tool capture_conversation lets clients store memories without the user typing "store this". The assistant sends the latest user message and assistant response, then Engram extracts only durable facts.
Memory rules
- Durable facts about the user
- Preferences, projects, skills, and corrections
- Long-lived context useful across future sessions
- No greetings, throwaway questions, or assistant-only claims
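For clients that call the REST endpoint POST /memories/capture instead of the MCP tool, the request body might be assembled as in the sketch below. The JSON field names (user_message, assistant_response, source, session_id) are assumptions made for illustration and are not confirmed by this manual; check the API schema before relying on them. The helper name build_capture_payload is also hypothetical.

```python
import json


def build_capture_payload(user_message, assistant_response, source, session_id):
    """Assemble a capture request body.

    NOTE: every field name below is an assumption, not the documented schema.
    """
    fields = {
        "user_message": user_message,
        "assistant_response": assistant_response,
        "source": source,          # e.g. the client name
        "session_id": session_id,
    }
    # Reject empty values early so a blank exchange is never sent for extraction.
    empty = [name for name, value in fields.items() if not str(value).strip()]
    if empty:
        raise ValueError(f"empty capture fields: {', '.join(empty)}")
    return json.dumps(fields).encode("utf-8")


body = build_capture_payload(
    "I prefer PostgreSQL over MySQL for new projects.",
    "Noted, I will default to PostgreSQL.",
    "vscode-agent",
    "session-001",
)
```

The resulting bytes can be sent as the body of a POST with the X-Engram-Key header; deciding which facts are durable remains Engram's job, not the client's.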
Always use Engram memory. Before answering when user context may matter, search Engram for relevant memories. After each meaningful exchange, call capture_conversation with the user message, assistant response, source client name, and session id. Store durable user facts, preferences, project context, and corrections. Do not store greetings, one-off questions, temporary details, or assistant-only claims.
§ 05
Proxy mode
Proxy mode is the fully automatic path. Your app sends OpenAI-style chat requests to Engram; Engram injects relevant memory, forwards the request, returns the provider response, and extracts new facts in the background.
Set stream: true in the body for SSE streaming — tokens flow to the client as they arrive and extraction fires once the stream closes.
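On the client side, an SSE stream arrives as text lines, and the payloads sit on lines that start with data:. The sketch below shows one way to pull them out. It assumes the stream ends with the OpenAI-style [DONE] sentinel, which this manual does not confirm, and it does not join multi-line data events. The name iter_sse_data is hypothetical.

```python
def iter_sse_data(lines):
    """Yield the payload of each `data:` line from an SSE stream.

    Assumption: a literal "[DONE]" payload marks the end of the stream.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            # Blank separators, comments (": ..."), and other fields are skipped.
            continue
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # SSE strips one leading space after the colon
        if payload == "[DONE]":
            break
        yield payload


sample = [
    'data: {"delta":"Use"}\n',
    "\n",
    ": keepalive\n",
    'data: {"delta":" FastAPI"}\n',
    "data: [DONE]\n",
    "data: never reached\n",
]
chunks = list(iter_sse_data(sample))
```

Any iterable of decoded lines works as input, for example the lines read from an HTTP response body.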
Add X-Engram-Namespace to scope retrieval and storage to a sub-store (e.g. work vs personal). When the header is omitted, the namespace is default.
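From application code, a proxy request could be assembled as in the following sketch. The header names and the /v1/chat body shape come from this section; the helper name build_proxy_request and its parameters are hypothetical, and nothing is sent over the network here.

```python
import json


def build_proxy_request(api_key, user_id, provider, model, content,
                        namespace="default", stream=False):
    """Return (headers, body_bytes) for a POST to /v1/chat."""
    headers = {
        "Content-Type": "application/json",
        "X-Engram-Key": api_key,
        "X-Engram-User-ID": user_id,
        "X-Engram-Provider": provider,
        "X-Engram-Namespace": namespace,  # scopes retrieval and storage
    }
    body = {
        "model": model,
        "stream": stream,
        "messages": [{"role": "user", "content": content}],
    }
    return headers, json.dumps(body).encode("utf-8")


headers, body = build_proxy_request(
    "ek_your_key_here", "test_user_1", "openai", "gpt-4o-mini",
    "What stack should I use?", namespace="work", stream=True,
)
```

The pair can be handed to any HTTP client, for example urllib.request.Request(url, data=body, headers=headers, method="POST").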
curl -N -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -H "X-Engram-Key: ek_your_key_here" \
  -H "X-Engram-User-ID: test_user_1" \
  -H "X-Engram-Provider: openai" \
  -H "X-Engram-Namespace: work" \
  -d '{"model":"gpt-4o-mini","stream":true,
       "messages":[{"role":"user","content":"What stack should I use?"}]}'
§ 06
Retrieval modes
Each user can pick the retrieval strategy used when memories get injected into the system prompt. Set it via PATCH /users/me/config with a retrieval_mode field, or from the Settings page.
vector (default)
Pure pgvector cosine similarity over the 384-dim embeddings. Fast, single SQL round trip. Best when queries paraphrase memory content.
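To make the ranking concrete, the comparison that vector mode relies on can be sketched in plain Python. This is an illustration only: in Engram the work happens inside Postgres through pgvector, the toy vectors here have 2 dimensions instead of 384, and the names cosine_similarity and top_k are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query, memories, k=5):
    """Return the ids of the k memories most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), mem_id)
              for mem_id, vec in memories.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [mem_id for _, mem_id in scored[:k]]


memories = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
best = top_k([1.0, 0.0], memories, k=2)
```

pgvector exposes cosine distance, which is 1 minus this similarity, so ordering by ascending distance gives the same ranking.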
hybrid
Runs vector and Postgres full-text (tsvector) searches in parallel, then merges with Reciprocal Rank Fusion (k=60). Better recall when queries name specific terms the embedding may not weight highly.
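Reciprocal Rank Fusion scores each memory by summing 1 / (k + rank) over every result list it appears in. A minimal sketch follows, assuming 1-based ranks and the k=60 constant named above; the function name rrf_merge is hypothetical, and Engram's own merge may differ in tie-breaking or truncation.

```python
def rrf_merge(rankings, k=60):
    """Fuse several ranked id lists into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, mem_id in enumerate(ranking, start=1):  # ranks start at 1
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for stable output.
    return sorted(scores, key=lambda mem_id: (-scores[mem_id], mem_id))


vector_hits = ["a", "b", "c"]     # order returned by the vector search
fulltext_hits = ["c", "a", "d"]   # order returned by the full-text search
merged = rrf_merge([vector_hits, fulltext_hits])
```

Memories that appear in both lists rise to the top even when neither search ranks them first, which is why hybrid mode helps with queries that name exact terms.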
graph
Vector seed plus 1-hop expansion through the entity graph (memories sharing entities with the seed are pulled in). Requires ENABLE_GRAPH=true and a backfilled entity set.
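The 1-hop expansion can be pictured as a set operation: collect the entities attached to the seed memories, then pull in every other memory that shares at least one of them. The sketch below works on an in-memory mapping and is an assumption about the general shape of the step, not Engram's SQL; expand_one_hop is a hypothetical name.

```python
def expand_one_hop(seed_ids, memory_entities):
    """Return the seeds followed by memories that share an entity with them."""
    seed_set = set(seed_ids)
    seed_entities = set()
    for mem_id in seed_ids:
        seed_entities |= set(memory_entities.get(mem_id, ()))
    neighbors = sorted(
        mem_id
        for mem_id, entities in memory_entities.items()
        if mem_id not in seed_set and seed_entities & set(entities)
    )
    return list(seed_ids) + neighbors


memory_entities = {
    "m1": {"python", "engram"},
    "m2": {"engram"},
    "m3": {"rust"},
    "m4": {"python"},
}
expanded = expand_one_hop(["m1"], memory_entities)
```

Here m2 and m4 join the result because each shares an entity with the seed m1, while m3 stays out.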
Reranker (optional)
Set ENABLE_RERANKER=true on the API to load a cross-encoder (cross-encoder/ms-marco-MiniLM-L-6-v2) at startup. When loaded, all retrieval modes re-rank their candidate set before returning. Adds ~400 MB to API memory and a small per-query CPU cost.
§ 07
Graph memory
Engram extracts named entities (people, projects, skills, technologies, preferences, topics, organizations) from each memory and stores them as a graph in Postgres — no Neo4j required.
Set ENABLE_GRAPH=true on the API. New memories trigger entity extraction asynchronously (uses your configured extraction provider). Call POST /graph/extract to backfill entities on memories that pre-dated the flag.
The dashboard /graph page renders this as a force-directed graph (Obsidian-style): dots sized by mention count, edges between entities that co-occur in memories, hover to highlight 1-hop neighborhood, click for memory list.
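The two quantities the graph page draws, mention counts for dot size and co-occurrence edges, can be derived from a memory-to-entities mapping. The following is a sketch under that assumption; the function name build_graph and the input shape are hypothetical and say nothing about how Engram stores the graph in Postgres.

```python
from collections import Counter
from itertools import combinations


def build_graph(memory_entities):
    """Return (mentions, edges) counters for an entity co-occurrence graph."""
    mentions = Counter()
    edges = Counter()
    for entities in memory_entities.values():
        unique = sorted(set(entities))          # count each entity once per memory
        mentions.update(unique)
        edges.update(combinations(unique, 2))   # one edge per co-occurring pair
    return mentions, edges


memory_entities = {
    "m1": ["Engram", "Postgres"],
    "m2": ["Engram", "Postgres", "FastAPI"],
    "m3": ["FastAPI"],
}
mentions, edges = build_graph(memory_entities)
```

Edge keys are alphabetically ordered pairs, so ("Engram", "Postgres") and ("Postgres", "Engram") never appear as separate edges.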
# 1. Enable on the API
ENABLE_GRAPH=true
# 2. Backfill entities for existing memories
curl -X POST http://localhost:8000/graph/extract \
  -H "X-Engram-Key: ek_..."
# 3. List entities with mention counts
curl http://localhost:8000/graph/entities \
  -H "X-Engram-Key: ek_..."
# 4. Co-occurrence edges (entities sharing memories)
curl http://localhost:8000/graph/edges \
  -H "X-Engram-Key: ek_..."
# 5. Switch retrieval to graph mode
curl -X PATCH http://localhost:8000/users/me/config \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"retrieval_mode":"graph"}'
§ 08
Namespaces & organizations
Namespaces partition a single user's memories. Pass X-Engram-Namespace: <name> on proxy requests; pass ?namespace=<name> when listing memories. When neither is given, the namespace is default. This is useful for separating work and personal contexts under one Engram key.
Organizations group users with role-based access (owner / admin / member). Memories tagged with org_id are visible to all org members.
# Create an org (you become owner)
curl -X POST http://localhost:8000/orgs \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"name":"Acme"}'
# Add a member by external_id
curl -X POST http://localhost:8000/orgs/<org_id>/members \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"external_id":"teammate_42","role":"member"}'
# List the orgs you belong to
curl http://localhost:8000/orgs \
  -H "X-Engram-Key: ek_..."
§ 09
REST surface
Users
POST /users, GET /users/me, PATCH /users/me/config
Memories
GET /memories, POST /memories, PATCH /memories/{id}, DELETE /memories/{id}
Search
POST /memories/search
Capture
POST /memories/capture
Logs
GET /logs, GET /logs/{id}
Proxy
POST /v1/chat (supports stream: true)
Graph
GET /graph/entities, GET /graph/edges, GET /graph/memories/{id}/neighbors, POST /graph/extract
Orgs
POST /orgs, GET /orgs, POST /orgs/{id}/members, DELETE /orgs/{id}/members/{ext_id}
§ 10
Deployment checklist
API
Deploy the FastAPI service on a container host (Azure Container Apps, Fly.io, Render, Railway). Set DATABASE_URL, provider keys, CORS_ORIGINS, ENGRAM_SERVICE_KEY, ENGRAM_PROVIDER_KEY_ENCRYPTION_KEY.
Database
PostgreSQL with the vector extension enabled. Local Docker Postgres and Supabase Postgres are both supported. The schema migrates on API startup.
Dashboard
Vercel with Root Directory dashboard and NEXT_PUBLIC_API_URL pointing at the API. Add the Vercel origin to API CORS_ORIGINS.
MCP
Run MCP locally for desktop clients, or expose SSE from a reachable host for clients that support remote MCP.
Feature flags
Optional API env vars: ENABLE_GRAPH=true for entity extraction + graph retrieval, ENABLE_RERANKER=true for the cross-encoder reranker (adds ~400 MB memory).
GHCR image
Pushing to main auto-builds ghcr.io/<owner>/engram-api:latest. Container Apps caches by tag — create a new revision (or pin to :<sha>) to pull a fresh image.
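Before a deploy, the variables named in this checklist can be verified with a small preflight script. The sketch below is an assumption about how one might do that: the variable names come from the checklist, provider key names are omitted because the manual does not list them, and treating only the literal value true as enabled is a guess about how the API parses its flags. The function name check_env is hypothetical.

```python
import os

REQUIRED = (
    "DATABASE_URL",
    "CORS_ORIGINS",
    "ENGRAM_SERVICE_KEY",
    "ENGRAM_PROVIDER_KEY_ENCRYPTION_KEY",
)
FLAGS = ("ENABLE_GRAPH", "ENABLE_RERANKER")


def check_env(env):
    """Return (missing_required_names, flag_states) for an environment mapping."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    # Assumption: a flag counts as on only when its value is "true".
    flags = {name: env.get(name, "").strip().lower() == "true" for name in FLAGS}
    return missing, flags


if __name__ == "__main__":
    missing, flags = check_env(os.environ)
    print("missing:", ", ".join(missing) or "none")
    print("flags:", flags)
```

Passing a plain dict instead of os.environ makes the check easy to exercise in CI without touching real secrets.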
§ 11
Troubleshooting
Dashboard says invalid key
Confirm the browser saved the same ek_... key as your MCP client and that API CORS includes the dashboard origin.
MCP server does not show tools
Run npm --prefix mcp run build, restart the client, and check that the command path points to Node and the built index file.
No memories appear
Add one manually, call capture_conversation, or route a chat through /v1/chat. Empty memory stores are valid for new keys.
Proxy fails provider requests
Check provider keys in server environment variables and verify X-Engram-Provider matches the configured provider.
Graph page shows Not Found
The deployed API doesn't have the /graph/* routes yet. Update the container to the latest image and create a new revision so it actually pulls.
Backfill returns 0 entity links
ENABLE_GRAPH isn't set on the API, your extraction provider key is missing, or there are no approved memories yet. Check API logs during the call.
Streaming requests hang
Behind a reverse proxy, ensure SSE buffering is disabled. The proxy already sends X-Accel-Buffering: no for nginx; for Cloudflare enable streaming on the route.
Hybrid retrieval returns empty
Verify the memories.content_tsv column exists (auto-generated by the schema). Run apply_schema against the database if you upgraded from an older version.
Start with Settings to save a key. Land on the home dashboard for system state, dive into the memory ledger to browse entries, or open the entity graph to see how they connect.