§ IV — Operator manual

Documentation

Set up Engram, connect LLM clients, inspect memory behavior, and choose between proxy-based automatic memory and MCP-based tool memory.

Read path

Quick start, Dashboard, MCP clients, Auto capture, Proxy API, Retrieval modes, Graph memory, Namespaces & orgs, REST API, Deploy, Troubleshooting

§ 01

Quick start

Engram runs as a FastAPI service backed by PostgreSQL and pgvector. The dashboard stores one Engram API key in the browser, while provider keys stay on the server.

Use the same ek_... key across Claude Desktop, VS Code Agent Mode, Cursor-style clients, and the dashboard when you want one shared memory store.

cp .env.example .env
docker compose up -d
curl -X POST http://localhost:8000/users \
  -H "Content-Type: application/json" \
  -d '{"external_id":"test_user_1"}'
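
The same user-creation call can be made from Python. This is a minimal sketch using only the standard library; the helper names build_create_user_request and send are illustrative, and the response shape is not documented here, so it is returned as raw text rather than parsed.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # same local address as the curl example


def build_create_user_request(external_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST /users request from the quick start."""
    body = json.dumps({"external_id": external_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}/users",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 10.0) -> tuple:
    """Send a prepared request and return (status, raw body text).

    Needs the Docker stack to be running; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Call send(build_create_user_request("test_user_1")) once docker compose reports the API as up.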

§ 02

Dashboard workflow

Home

Land on the overview: total memories, pending reviews, entity count, recent activity timeline, top connected entities, and latest retrievals.

Memories

The full ledger at /memories. Search, add, edit, approve pending extractions, merge duplicates, export, import, and decay confidence.

Graph

/graph renders entities extracted from memories as a force-directed graph. Hover to highlight neighborhoods, click for memory lists, filter by type.

Chat

/chat talks to your configured provider through the proxy. Memories are injected automatically, and new ones are extracted in the background.

Logs

Every retrieval event: which memories were surfaced, their scores, and the conversation that caused it.

Settings

Engram API key, provider config, encrypted provider keys, retrieval thresholds, retrieval mode (vector / hybrid / graph), and dedup tuning.

§ 03

Connect MCP clients

MCP clients use Engram over stdio. Claude Desktop, VS Code Agent Mode, Cursor, Windsurf, and similar clients can load the same tool server and write to the same memory account.

Build the MCP package once, then point the client at mcp/dist/index.js with --transport stdio.

{
  "servers": {
    "engram": {
      "type": "stdio",
      "command": "C:\\nvm4w\\nodejs\\node.exe",
      "args": [
        "F:\\Engram\\mcp\\dist\\index.js",
        "--transport",
        "stdio"
      ],
      "env": {
        "ENGRAM_API_URL": "http://localhost:8000",
        "ENGRAM_API_KEY": "ek_your_key_here"
      }
    }
  }
}
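
The config above hard-codes Windows paths. As a sketch, the same structure can be generated for whatever machine you are on; build_mcp_config is a hypothetical helper, not part of Engram. Some clients expect a different top-level key (Claude Desktop uses mcpServers rather than servers, for example), so that key is a parameter here. Check the client's own documentation for the exact file location and shape.

```python
import json
import shutil
from pathlib import Path


def build_mcp_config(node_path: str, index_js: str, api_url: str, api_key: str,
                     top_level_key: str = "servers") -> dict:
    """Return a stdio MCP server entry shaped like the example config."""
    return {
        top_level_key: {
            "engram": {
                "type": "stdio",
                "command": node_path,
                "args": [index_js, "--transport", "stdio"],
                "env": {
                    "ENGRAM_API_URL": api_url,
                    "ENGRAM_API_KEY": api_key,
                },
            }
        }
    }


config = build_mcp_config(
    # Fall back to plain "node" if it is not found on PATH.
    node_path=shutil.which("node") or "node",
    # Absolute path to the built MCP entry point, relative to the repo root.
    index_js=str(Path("mcp/dist/index.js").resolve()),
    api_url="http://localhost:8000",
    api_key="ek_your_key_here",  # placeholder, replace with your real key
)
print(json.dumps(config, indent=2))
```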

§ 04

Automatic memory capture

The MCP tool capture_conversation lets clients store memories without the user typing "store this". The assistant sends the latest user message and assistant response, then Engram extracts only durable facts.

Memory rules

Durable facts about the user
Preferences, projects, skills, and corrections
Long-lived context useful across future sessions
No greetings, throwaway questions, or assistant-only claims

Suggested client instruction (paste into the client's system prompt or custom instructions):

Always use Engram memory. Before answering when user context may matter, search Engram for relevant memories. After each meaningful exchange, call capture_conversation with the user message, assistant response, source client name, and session id. Store durable user facts, preferences, project context, and corrections. Do not store greetings, one-off questions, temporary details, or assistant-only claims.
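
For clients that call the REST endpoint instead of the MCP tool, a capture request might look like the sketch below. The four inputs mirror what capture_conversation receives, but the JSON field names (user_message, assistant_response, source, session_id) are assumptions, as is the use of X-Engram-Key on this route; confirm both against the API before relying on them.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"


def build_capture_request(api_key: str, user_message: str, assistant_response: str,
                          source: str, session_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST /memories/capture request.

    Field names in the payload are hypothetical, chosen for illustration.
    """
    payload = {
        "user_message": user_message,
        "assistant_response": assistant_response,
        "source": source,          # e.g. the client name
        "session_id": session_id,
    }
    return urllib.request.Request(
        f"{API_URL}/memories/capture",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Engram-Key": api_key,
        },
        method="POST",
    )
```

Extraction of durable facts happens on the Engram side, so the client sends the exchange as-is and does not pre-filter it.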

§ 05

Proxy mode

Proxy mode is the fully automatic path. Your app sends OpenAI-style chat requests to Engram. Engram then injects relevant memory, forwards the request, returns the provider response, and extracts new facts in the background.

Set stream: true in the body for SSE streaming — tokens flow to the client as they arrive and extraction fires once the stream closes.
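
On the client side, an SSE stream arrives as text lines grouped into events separated by blank lines. The parser below is a minimal sketch of the generic SSE framing only; the JSON shape of each data payload depends on the provider and is not assumed here. iter_sse_data is an illustrative name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient: flush a trailing event even if the stream ended
        # without the closing blank line.
        yield "\n".join(buffer)


# Made-up sample lines, standing in for a decoded HTTP response body.
sample = ["data: first\n", "\n", ": keep-alive\n", "data: a\n", "data: b\n", "\n"]
for payload in iter_sse_data(sample):
    print(repr(payload))
```

Multiple data lines inside one event are joined with newlines, which is why the second event in the sample comes out as a single two-line payload.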

Add X-Engram-Namespace to scope retrieval and storage to a sub-store (e.g. work vs personal). When the header is omitted, the namespace named default is used.

curl -N -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -H "X-Engram-Key: ek_your_key_here" \
  -H "X-Engram-User-ID: test_user_1" \
  -H "X-Engram-Provider: openai" \
  -H "X-Engram-Namespace: work" \
  -d '{"model":"gpt-4o-mini","stream":true,
       "messages":[{"role":"user","content":"What stack should I use?"}]}'

§ 06

Retrieval modes

Each user can pick the retrieval strategy used when memories get injected into the system prompt. Set it via PATCH /users/me/config with a retrieval_mode field, or from the Settings page.

vector (default)

Pure pgvector cosine similarity over the 384-dim embeddings. Fast, single SQL round trip. Best when queries paraphrase memory content.
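
To make the scoring concrete, here is a pure-Python sketch of cosine-similarity ranking. It uses 3-dimensional toy vectors instead of 384-dim embeddings, and it is not the query Engram runs: in the database, pgvector computes cosine distance (1 minus this similarity) with its <=> operator.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float],
                       memories: Dict[str, Sequence[float]],
                       limit: int = 5) -> List[str]:
    """Return memory ids ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), mid) for mid, vec in memories.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [mid for _, mid in scored[:limit]]


toy_memories = {
    "m1": [0.9, 0.1, 0.0],
    "m2": [0.0, 1.0, 0.0],
    "m3": [0.5, 0.5, 0.0],
}
print(rank_by_similarity([1.0, 0.0, 0.0], toy_memories))  # ['m1', 'm3', 'm2']
```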

hybrid

Runs vector and Postgres full-text (tsvector) searches in parallel, then merges with Reciprocal Rank Fusion (k=60). Better recall when queries name specific terms the embedding may not weight highly.
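
Reciprocal Rank Fusion scores each item by summing 1 / (k + rank) over every list it appears in, so an item that ranks well in both searches rises above one that ranks well in only one. The sketch below shows the standard formula with k=60; it illustrates the technique and is not Engram's own merge code.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[Hashable]],
                           k: int = 60) -> List[Hashable]:
    """Merge several ranked lists into one, best first.

    score(item) = sum over lists of 1 / (k + rank), with rank starting at 1.
    Items missing from a list contribute nothing for that list.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


vector_hits = ["m1", "m2", "m3"]  # toy ids, ordered by vector similarity
text_hits = ["m3", "m1", "m4"]    # toy ids, ordered by full-text rank
print(reciprocal_rank_fusion([vector_hits, text_hits]))
# ['m1', 'm3', 'm2', 'm4']
```

Here m1 (ranks 1 and 2) edges out m3 (ranks 3 and 1), and both beat m2 and m4, which each appear in only one list.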

graph

Vector seed plus 1-hop expansion through the entity graph (memories sharing entities with the seed are pulled in). Requires ENABLE_GRAPH=true and a backfilled entity set.
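
The 1-hop expansion can be pictured as a set operation: collect the entities attached to the seed memories, then pull in every other memory that shares at least one of them. A minimal in-memory sketch follows, assuming a plain dict from memory id to entity names; Engram itself keeps this data in Postgres, and expand_one_hop is an illustrative name.

```python
from typing import Dict, List, Sequence, Set


def expand_one_hop(seed_ids: Sequence[str],
                   memory_entities: Dict[str, Set[str]]) -> List[str]:
    """Return the seeds followed by memories sharing an entity with any seed."""
    seed_entities: Set[str] = set()
    for mid in seed_ids:
        seed_entities |= memory_entities.get(mid, set())

    expanded = list(seed_ids)
    seen = set(seed_ids)
    for mid, entities in memory_entities.items():
        # A non-empty intersection means the memory is one hop from a seed.
        if mid not in seen and entities & seed_entities:
            expanded.append(mid)
            seen.add(mid)
    return expanded


toy_graph = {
    "m1": {"Postgres", "Engram"},
    "m2": {"Engram"},
    "m3": {"Rust"},
    "m4": {"Postgres"},
}
print(expand_one_hop(["m1"], toy_graph))  # ['m1', 'm2', 'm4']
```

Memories with no extracted entities can never be reached this way, which is why graph mode depends on a backfilled entity set.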

Reranker (optional)

Set ENABLE_RERANKER=true on the API to load a cross-encoder (cross-encoder/ms-marco-MiniLM-L-6-v2) at startup. When loaded, all retrieval modes re-rank their candidate set before returning. Adds ~400 MB to API memory and a small per-query CPU cost.
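
Re-ranking is a second pass over an already retrieved candidate set: each (query, memory) pair gets a fresh relevance score and the list is re-sorted. The sketch below keeps the scorer pluggable and uses a toy word-overlap function so it runs without a model download. Both rerank and word_overlap are illustrative names, not Engram internals.

```python
from typing import Callable, List, Optional, Sequence


def rerank(query: str,
           candidates: Sequence[str],
           score_fn: Callable[[str, str], float],
           top_k: Optional[int] = None) -> List[str]:
    """Re-order candidates by score_fn(query, candidate), highest first."""
    scored = [(score_fn(query, text), text) for text in candidates]
    # sort is stable, so equal scores keep their original retrieval order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [text for _, text in scored]
    return ranked if top_k is None else ranked[:top_k]


def word_overlap(query: str, text: str) -> float:
    """Toy scorer: number of lowercase words shared by query and text."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


candidates = [
    "User likes hiking",
    "The preferred database is Postgres",
]
print(rerank("preferred database", candidates, word_overlap))
```

With a real cross-encoder, score_fn would wrap the model call, for example the predict method of the CrossEncoder class in the sentence-transformers package.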

§ 07

Graph memory

Engram extracts named entities (people, projects, skills, technologies, preferences, topics, organizations) from each memory and stores them as a graph in Postgres — no Neo4j required.

Set ENABLE_GRAPH=true on the API. New memories trigger entity extraction asynchronously (using your configured extraction provider). Call POST /graph/extract to backfill entities for memories created before the flag was enabled.

The dashboard /graph page renders this as a force-directed graph (Obsidian-style): dots sized by mention count, edges between entities that co-occur in memories, hover to highlight 1-hop neighborhood, click for memory list.

# 1. Enable on the API
ENABLE_GRAPH=true

# 2. Backfill entities for existing memories
curl -X POST http://localhost:8000/graph/extract \
  -H "X-Engram-Key: ek_..."

# 3. List entities with mention counts
curl http://localhost:8000/graph/entities \
  -H "X-Engram-Key: ek_..."

# 4. Co-occurrence edges (entities sharing memories)
curl http://localhost:8000/graph/edges \
  -H "X-Engram-Key: ek_..."

# 5. Switch retrieval to graph mode
curl -X PATCH http://localhost:8000/users/me/config \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"retrieval_mode":"graph"}'
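
The two quantities the graph page draws, dot size (mention count) and edges (co-occurrence), can both be derived from a mapping of memories to their entities. The following is a sketch under that assumption, using toy data; the actual response shapes of /graph/entities and /graph/edges are not shown in this manual, so nothing here parses them.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Tuple


def build_graph(memory_entities: Dict[str, Iterable[str]]) -> Tuple[Counter, Counter]:
    """Return (mention_counts, edge_weights) for an entity co-occurrence graph.

    mention_counts[entity]  = number of memories mentioning the entity
    edge_weights[(a, b)]    = number of memories mentioning both a and b (a < b)
    """
    mention_counts: Counter = Counter()
    edge_weights: Counter = Counter()
    for entities in memory_entities.values():
        unique = sorted(set(entities))  # count each entity once per memory
        mention_counts.update(unique)
        for a, b in combinations(unique, 2):
            edge_weights[(a, b)] += 1
    return mention_counts, edge_weights


toy = {
    "m1": ["Engram", "Postgres"],
    "m2": ["Engram", "Postgres", "FastAPI"],
    "m3": ["Rust"],
}
mentions, edges = build_graph(toy)
print(mentions["Engram"], edges[("Engram", "Postgres")])  # 2 2
```

Sorting the entity names before pairing keeps each undirected edge under a single key, so (Engram, Postgres) and (Postgres, Engram) are never counted separately.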

§ 08

Namespaces & organizations

Namespaces partition a single user's memories. Pass X-Engram-Namespace: <name> on proxy requests; pass ?namespace=<name> when listing memories. When neither is given, the namespace named default is used. Namespaces are useful for separating work and personal contexts under one Engram key.
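
The two ways of passing a namespace can be sketched as follows: a query parameter for listing, a header for proxy calls. The helper names are illustrative, and authenticating GET /memories with X-Engram-Key is an assumption based on the other authenticated routes in this manual.

```python
import urllib.parse
import urllib.request
from typing import Dict

API_URL = "http://localhost:8000"


def list_memories_request(api_key: str, namespace: str = "default") -> urllib.request.Request:
    """Build (but do not send) GET /memories scoped to one namespace."""
    query = urllib.parse.urlencode({"namespace": namespace})
    return urllib.request.Request(
        f"{API_URL}/memories?{query}",
        headers={"X-Engram-Key": api_key},
        method="GET",
    )


def proxy_headers(api_key: str, user_id: str, provider: str,
                  namespace: str = "default") -> Dict[str, str]:
    """Headers for POST /v1/chat, matching the proxy curl example."""
    return {
        "Content-Type": "application/json",
        "X-Engram-Key": api_key,
        "X-Engram-User-ID": user_id,
        "X-Engram-Provider": provider,
        "X-Engram-Namespace": namespace,
    }


req = list_memories_request("ek_your_key_here", namespace="work")
print(req.full_url)  # http://localhost:8000/memories?namespace=work
```

urlencode also escapes namespace names that contain spaces or other reserved characters, which string concatenation would not.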

Organizations group users with role-based access (owner / admin / member). Memories tagged with org_id are visible to all org members.

# Create an org (you become owner)
curl -X POST http://localhost:8000/orgs \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"name":"Acme"}'

# Add a member by external_id
curl -X POST http://localhost:8000/orgs/<org_id>/members \
  -H "X-Engram-Key: ek_..." \
  -H "Content-Type: application/json" \
  -d '{"external_id":"teammate_42","role":"member"}'

# List the orgs you belong to
curl http://localhost:8000/orgs \
  -H "X-Engram-Key: ek_..."

§ 09

REST surface

Users

POST /users, GET /users/me, PATCH /users/me/config

Memories

GET /memories, POST /memories, PATCH /memories/{id}, DELETE /memories/{id}

POST /memories/search

Capture

POST /memories/capture

Logs

GET /logs, GET /logs/{id}

Proxy

POST /v1/chat (supports stream: true)

Graph

GET /graph/entities, GET /graph/edges, GET /graph/memories/{id}/neighbors, POST /graph/extract

Orgs

POST /orgs, GET /orgs, POST /orgs/{id}/members, DELETE /orgs/{id}/members/{ext_id}

§ 10

Deployment checklist

API

Deploy the FastAPI service on a container host (Azure Container Apps, Fly.io, Render, Railway). Set DATABASE_URL, provider keys, CORS_ORIGINS, ENGRAM_SERVICE_KEY, ENGRAM_PROVIDER_KEY_ENCRYPTION_KEY.

Database

PostgreSQL with the vector extension enabled. Local Docker Postgres and Supabase Postgres both supported. Schema migrates on API startup.

Dashboard

Deploy on Vercel with Root Directory set to dashboard and NEXT_PUBLIC_API_URL pointing at the API. Add the Vercel origin to the API's CORS_ORIGINS.

MCP

Run MCP locally for desktop clients, or expose SSE from a reachable host for clients that support remote MCP.

Feature flags

Optional API env vars: ENABLE_GRAPH=true for entity extraction + graph retrieval, ENABLE_RERANKER=true for the cross-encoder reranker (adds ~400 MB memory).

GHCR image

Pushing to main auto-builds ghcr.io/<owner>/engram-api:latest. Container Apps caches by tag — create a new revision (or pin to :<sha>) to pull a fresh image.

§ 11

Troubleshooting

Dashboard says invalid key

Confirm the browser saved the same ek_... key as your MCP client and that API CORS includes the dashboard origin.

MCP server does not show tools

Run npm --prefix mcp run build, restart the client, and check that the command path points to Node and the built index file.

No memories appear

Add one manually, call capture_conversation, or route a chat through /v1/chat. Empty memory stores are valid for new keys.

Proxy fails provider requests

Check provider keys in server environment variables and verify X-Engram-Provider matches the configured provider.

Graph page shows Not Found

The deployed API doesn't have the /graph/* routes yet. Update the container to the latest image and create a new revision so that the new image is pulled.

Backfill returns 0 entity links

One of three things is wrong: ENABLE_GRAPH isn't set on the API, the extraction provider key is missing, or there are no approved memories yet. Check the API logs during the call.

Streaming requests hang

Behind a reverse proxy, ensure response buffering is disabled for SSE. Engram's proxy endpoint already sends X-Accel-Buffering: no for nginx; for Cloudflare, enable streaming on the route.

Hybrid retrieval returns empty

Verify the memories.content_tsv column exists (auto-generated by the schema). Run apply_schema against the database if you upgraded from an older version.

Start with Settings to save a key. Land on the home dashboard for system state, dive into the memory ledger to browse entries, or open the entity graph to see how they connect.