Pick your mode. Install in 2 minutes. Works with any MCP-compatible IDE.
code --install-extension memoryai.memoryai-vsx · then run MemoryAI: Connect from the command palette.
— or configure manually below (Claude Code / Codex CLI / curl) —
Full brain: persistent memory + context guard. AI remembers everything and never loses context.
Persistent memory across sessions. No context window monitoring. You manage context yourself.
Just context overflow protection. No persistent memory. Prevents data loss in long sessions.
Full brain — persistent memory + automatic context protection.
memoryai-claude installs 3 lifecycle hooks — memory recalls before every prompt and saves after every turn, with zero agent decisions.
npx memoryai-claude install
Restart Claude Code once. Done. The CLI auto-provisions a free key if you don't have one.
Other commands:
npx memoryai-claude doctor # verify hooks + ping server
npx memoryai-claude status # show hook scope (user/project)
npx memoryai-claude uninstall # remove hooks, keep server data
Flags: --project (project scope) · --endpoint URL (self-host) · --key KEY (skip prompt) · MEMORYAI_NONINTERACTIVE=1 (CI)
Run:
claude mcp add memoryai -s user \
-e HM_API_KEY=hm_sk_YOUR_KEY \
-e HM_ENDPOINT=https://memoryai.dev \
-e HM_COMPACT_AT=100000 \
-e HM_CRITICAL_AT=150000 \
-- npx -y memoryai-mcp
MCP-only = agent decides when to call tools. CLI hooks = fully automatic.
Edit ~/.cursor/mcp.json:
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev",
"HM_COMPACT_AT": "100000",
"HM_CRITICAL_AT": "150000"
}
}
}
}
Edit ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev",
"HM_COMPACT_AT": "100000",
"HM_CRITICAL_AT": "150000"
}
}
}
}
Edit .vscode/mcp.json (or Cline MCP settings):
{
"servers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev",
"HM_COMPACT_AT": "100000",
"HM_CRITICAL_AT": "150000"
}
}
}
}
Edit .kiro/settings/mcp.json:
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev",
"HM_COMPACT_AT": "100000",
"HM_CRITICAL_AT": "150000"
}
}
}
}
Create ~/.claude/CLAUDE.md (or your IDE's rules file):
# MemoryAI
## Session Start
Call `memory_bootstrap` to load context from previous sessions.
## During Session
- Preference learned → `memory_store` (memory_type="preference")
- Decision made → `memory_store` (memory_type="decision")
- Need past context → `memory_recall`
## Context Guard (every ~15 messages)
Call `context_guard_check` with estimated_tokens and max_tokens.
If "compact_now" → call `context_guard_compact` with summary.
Persistent memory across sessions. No context monitoring.
claude mcp add memoryai -s user \
-e HM_API_KEY=hm_sk_YOUR_KEY \
-e HM_ENDPOINT=https://memoryai.dev \
-- npx -y memoryai-mcp
Edit ~/.cursor/mcp.json:
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev"
}
}
}
}
Same JSON config — paste into your IDE's MCP settings file:
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev"
}
}
}
}
# MemoryAI
## Session Start
Call `memory_bootstrap` to load context.
## During Session
- Preference → `memory_store` (memory_type="preference")
- Decision → `memory_store` (memory_type="decision")
- Need context → `memory_recall`
Prevents context overflow. No persistent memory.
claude mcp add memoryai -s user \
-e HM_API_KEY=hm_sk_YOUR_KEY \
-e HM_ENDPOINT=https://memoryai.dev \
-e HM_COMPACT_AT=100000 \
-e HM_CRITICAL_AT=150000 \
-- npx -y memoryai-mcp
{
"mcpServers": {
"memoryai": {
"command": "npx",
"args": ["-y", "memoryai-mcp"],
"env": {
"HM_API_KEY": "hm_sk_YOUR_KEY",
"HM_ENDPOINT": "https://memoryai.dev",
"HM_COMPACT_AT": "100000",
"HM_CRITICAL_AT": "150000"
}
}
}
}
# Context Guard
## Every ~15 messages
Call `context_guard_check` with estimated_tokens and max_tokens.
- "safe" → continue
- "compact_soon" → prepare summary
- "compact_now" → call `context_guard_compact` immediately
Control when Context Guard triggers. Lower = more aggressive (compact earlier).
| Setting | Env Var | Default | Meaning |
|---|---|---|---|
| Compact warning | HM_COMPACT_AT | 100000 | Warn when conversation reaches 100K tokens (absolute count, not %) |
| Force compact | HM_CRITICAL_AT | 150000 | Force compact at 150K tokens |
Note: v2.3.2+ uses absolute token counts instead of percentages. Pick numbers that match the cost ceiling you want — they apply uniformly across every model, independent of the underlying context window.
| Preset | COMPACT_AT | CRITICAL_AT | Best for |
|---|---|---|---|
| Conservative | 20 | 30 | With MemoryAI (memories saved externally) |
| Balanced | 30 | 50 | General use, no external memory |
| Aggressive | 50 | 70 | Long sessions, trust AI to compact fast |
| Tokens used | % of 1M | With 20/30 setting |
|---|---|---|
| < 200K | < 20% | safe — continue normally |
| 200K - 300K | 20-30% | compact_soon — prepare summary |
| > 300K | > 30% | compact_now — compact immediately |
Context Guard now uses absolute token thresholds (HM_COMPACT_AT / HM_CRITICAL_AT) — no model-window detection required. Pick numbers that match your cost ceiling and you're done.
| Model | Context | Auto-detected? |
|---|---|---|
| claude-opus-4-6[1m] | 1,000,000 | Yes |
| claude-sonnet-4-6[1m] | 1,000,000 | Yes |
| claude-opus-4-6 | 200,000 | Yes |
| gpt-4o | 128,000 | Yes |
| gemini-1.5-pro | 2,000,000 | Yes |
| gemini-2.0-flash | 1,000,000 | Yes |
| Any model with [Xm] suffix | X × 1M | Yes |
HM_COMPACT_AT and HM_CRITICAL_AT in tokens (e.g. 100000 / 150000), not percentagescontext_guard_check call# Test manually:
HM_API_KEY=hm_sk_your_key npx memoryai-mcp
# Should print: "MemoryAI MCP server running"
# Check server:
curl https://memoryai.dev/health
# Should return: {"status": "ok"}
# Override per-request (preferred — absolute tokens):
curl -X POST -H "Authorization: Bearer hm_sk_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"turn_count": 30, "compact_at_tokens": 100000, "critical_at_tokens": 150000}' \
https://memoryai.dev/v1/ide/guard/turn-check
# Or just edit HM_COMPACT_AT / HM_CRITICAL_AT in your IDE config and restart.
MemoryAI — A living brain for AI agents.
Home · Full Docs · GitHub