MemoryAI Setup Guide

Install once. Memory works automatically forever. Full guide for every IDE and platform.

Skip the manual config — install the extension

Kiro · Cursor · Windsurf · VS Code. Paste your API key, done. The extension wires MCP server, hooks, and rules automatically.

Available on the VS Marketplace and Open VSX.

Or via CLI: code --install-extension memoryai.memoryai-vsx · then run MemoryAI: Connect from the command palette.

— or read the full manual setup below (Claude Code / Antigravity / Cline / Bot integrations) —

Table of Contents

1. Quick Start (2 minutes)
2. IDE Setup (per platform)
3. Auto-Bootstrap (fully automatic)
4. Context Guard Setup
5. Bot/Agent Setup (ClawdBot, custom bots)
6. Available Tools
7. How It Works
8. Troubleshooting
9. Context Guard & Session Management
10. Bot Integration (Advanced)

1. Quick Start

Get an API Key

curl -X POST https://memoryai.dev/v1/admin/provision \
  -H "Content-Type: application/json" \
  -d '{"name": "my-agent", "tos_accepted": true}'

Save the api_key from the response. The Pro trial is free for 30 days.

Install MCP Server

npx memoryai-mcp

Or install globally: npm install -g memoryai-mcp

Add config to your IDE (see section 2 below)

Add auto-bootstrap rule (see section 3 below)

Done!

After these 4 steps, memory works automatically. Your AI agent will remember across sessions, protect important memories (DNA), and manage context automatically.

2. IDE Setup

Copy the config below into your IDE's MCP settings file. Replace YOUR_KEY with your API key.

Claude Code (CLI) — Recommended

One command. Fully automatic memory.

The memoryai-claude package installs 3 lifecycle hooks that make memory 100% automatic — no MCP tools needed, no agent decisions. Memory recalls before every prompt and saves after every turn, invisibly.

npx memoryai-claude install

That's it. Restart Claude Code once. The CLI will:

Step	What it does
1	Ask for API key (leave blank = auto-provision a free one)
2	Write 3 hooks to `~/.claude/settings.json`: • SessionStart → bootstrap context at session open • UserPromptSubmit → recall relevant memories before each prompt • Stop → save important info after each turn
3	Append a note to `~/.claude/CLAUDE.md`
4	Backup originals before any writes

Other commands:

npx memoryai-claude doctor     # verify hooks + ping server
npx memoryai-claude status     # show hooks at user/project scope
npx memoryai-claude uninstall  # remove hooks (memory on server stays)

Customize the compact threshold (tokens)

By default, MemoryAI starts compacting (saving) at 160K tokens and forces a compact at the hard ceiling of 192K (1.2× the compact point). The same threshold applies to every model (Claude 200K, Gemini 1M, GPT-5), so your bill stays predictable on huge context windows.

You only choose one number: the compact point. The hard ceiling is derived automatically at 1.2× of it, so there is nothing else to tune. The setting is stored on the server against your API key, so you set it once — it applies across every IDE/host that uses the key (Claude Code, Kiro, Cursor, Windsurf, …).

npx memoryai-claude config                       # show current thresholds
npx memoryai-claude config compact 200000        # compact 200K → ceiling 240K (1.2x)
npx memoryai-claude config compact reset         # clear override → defaults
npx memoryai-claude config reset                 # clear all overrides → defaults

Note: setting compact automatically sets the hard ceiling to 1.2× of it in the same step, so the compact always fires before the ceiling. On small-window models (32K, 16K), the system auto-caps the value so the compact still fires before the host truncates.
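The 1.2× rule can be sketched in a few lines. This is a minimal illustration, not MemoryAI's server code: the function name `derive_thresholds` is hypothetical, and the small-window cap (shrinking the compact point until the derived ceiling fits inside the window) is an assumption, since the guide does not state the exact capping rule.

```python
def derive_thresholds(compact_at, context_window=None):
    """Return (compact_at, hard_ceiling) with ceiling = 1.2 x compact point.

    Integer arithmetic (x * 6 // 5) avoids floating-point rounding.
    The cap for small windows is an assumed rule, not the documented one.
    """
    if compact_at <= 0:
        raise ValueError("compact_at must be positive")
    if context_window is not None:
        # Assumption: keep the derived ceiling inside the model's window.
        compact_at = min(compact_at, context_window * 5 // 6)
    return compact_at, compact_at * 6 // 5


print(derive_thresholds(160_000))          # (160000, 192000)
print(derive_thresholds(200_000))          # (200000, 240000)
print(derive_thresholds(160_000, 32_000))  # (26666, 31999)
```

The first two calls reproduce the 160K → 192K default and the 200K → 240K example from the commands above.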

Flags: --project (project scope instead of user), --endpoint URL (self-host), --key KEY (skip prompt), MEMORYAI_NONINTERACTIVE=1 (CI/scripts).

Alternative: if you prefer MCP-only (no hooks), use the generic MCP config below instead.

MCP-only config (manual, less automatic)

File: ~/.claude/settings.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Note: MCP-only means the agent must decide when to call memory tools. The CLI hooks approach is fully automatic.

Cursor

File: ~/.cursor/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

VS Code (Copilot / Cline)

File: .vscode/mcp.json

{
  "servers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Kiro

File: .kiro/settings/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Windsurf

File: ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Claude Desktop

File: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
%APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Antigravity

File: ~/.antigravity/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Any MCP-Compatible Tool

Same JSON config — just find your tool's MCP settings file and paste.
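If the settings file already lists other MCP servers, pasting the block wholesale would drop them. The sketch below merges the `memoryai` entry into an existing config instead. The helper names `memoryai_entry` and `merge_into_config` are hypothetical; the entry itself and the two root keys (`mcpServers`, and `servers` for `.vscode/mcp.json`) are taken from the configs above.

```python
import json


def memoryai_entry(api_key, endpoint="https://memoryai.dev"):
    """Build the server entry used by every config in this section."""
    return {
        "command": "npx",
        "args": ["-y", "memoryai-mcp"],
        "env": {"HM_ENDPOINT": endpoint, "HM_API_KEY": api_key},
    }


def merge_into_config(existing_text, api_key, root_key="mcpServers"):
    """Add or replace the "memoryai" entry, keeping other servers intact.

    root_key is "mcpServers" for most tools and "servers" for
    .vscode/mcp.json, as in the examples above.
    """
    config = json.loads(existing_text) if existing_text.strip() else {}
    config.setdefault(root_key, {})["memoryai"] = memoryai_entry(api_key)
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "node", "args": ["x.js"]}}}'
print(merge_into_config(existing, "YOUR_KEY"))
```

Write the returned text back to the settings file for your tool, then restart the IDE so it picks up the new server.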

3. One-Click Setup (Copy → Paste → Done)

After MCP config (step 2), create one rule file so your AI agent automatically uses MemoryAI + Context Guard every session.

Option A: One command (fastest):

curl -s https://memoryai.dev/install/SKILL.md > ~/.claude/CLAUDE.md

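This downloads the latest rule file directly. Works for Claude Code. For other IDEs, change the path (see table below). Note that `>` overwrites any existing file at that path; if you already have a CLAUDE.md, back it up first or use `>>` to append instead.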

Option B: Manual copy-paste (see content below)

Platform	Create this file
Claude Code	`~/.claude/CLAUDE.md`
Cursor	`.cursor/rules/memoryai.mdc`
Kiro	`.kiro/rules/memoryai.md`
Windsurf	`.windsurfrules`
VS Code (Copilot)	`.github/copilot-instructions.md`
Any other IDE	Your IDE's system prompt / rules file

Copy this entire block and paste into the file above:

# MemoryAI — Auto Memory + Context Guard

You have access to MemoryAI — a persistent brain that remembers across sessions.
Everything below is AUTOMATIC. Follow these rules every session.

## 1. Session Start (ALWAYS do first)

Call MCP tool `context_guard_bootstrap` or `memory_bootstrap` to load context.
This gives you: user identity, preferences, recent work, and important decisions.
Do NOT ask the user to repeat context from previous sessions.

## 2. During Session

### Store important things (auto, no need to ask user):
- User says a preference → `memory_store` with memory_type="preference"
- A decision is made → `memory_store` with memory_type="decision"
- User shares identity info → `memory_store` with memory_type="identity"
- Important fact learned → `memory_store` with memory_type="fact"

### Recall when needed:
- Need context from past sessions → `memory_recall` with relevant query
- User asks "what did we decide about X?" → `memory_recall`

### Do NOT store:
- "ok", "thanks", "got it" (trivial)
- Temporary debug output
- Things already in code or git

## 3. Context Guard (CRITICAL — prevents context loss)

### Check every ~15-20 messages:
Call `context_guard_check` with:
  {"estimated_tokens": your_current_token_count, "max_tokens": model_context_window}

### Response meanings:
- "safe" → continue normally
- "compact_soon" → prepare to save context soon
- "compact_now" → IMMEDIATELY call `context_guard_compact` with a summary

### When compact_now:
1. Summarize the key points of the current conversation
2. Call `context_guard_compact` with that summary
3. Continue working — memories are saved in MemoryAI

### Settings (already configured on server):
- Compact warning: at 30% of effective context window
- Critical compact: at 50% of effective context window
- Works with ANY model size automatically (200K, 1M, etc.)

## 4. Emotion Awareness (optional)

Call MCP tool to check user emotional state. Adapt your style:
- stressed/frustrated → short, actionable answers
- confused → step-by-step, detailed explanations
- excited → match energy
- calm/focused → balanced, direct

## 5. Important Notes

- If MemoryAI is down, continue working normally (non-fatal).
- Preferences and decisions are DNA-protected — they NEVER expire.
- Regular facts fade over time if unused (like real memory).
- The brain consolidates memories overnight — next session may have new insights.

## Quick Reference — MCP Tools

| Tool | When |
|------|------|
| memory_bootstrap / context_guard_bootstrap | Session start |
| memory_store | Save something important |
| memory_recall | Need past context |
| memory_compact / context_guard_compact | Context getting full |
| memory_health / context_guard_check | Check context pressure |
| memory_explore | Explore memory connections |
| learn | Save action + result + lesson |

That's it! 2 steps total:

1. Paste MCP config (step 2) → MemoryAI connected
2. Paste the rule above → AI knows how to use memory + context guard + emotion

Open IDE → everything automatic. Memory persists. Context protected. Zero manual work.

4. Context Guard

Context Guard is already included in the rule above. Here's what it does:

State	Meaning	What happens
SAFE	Context below compact threshold	Nothing, keep working
COMPACT_SOON	Context at or above compact threshold, below critical	AI prepares to save context
COMPACT_NOW	Context at or above critical threshold	AI saves conversation to MemoryAI immediately

Threshold Settings (env vars)

Env Var	Default	Meaning
`HM_COMPACT_AT`	`100000`	Warn when conversation reaches this many tokens (absolute count)
`HM_CRITICAL_AT`	`150000`	Force compact at this token count

Note: v2.3.2+ uses absolute token counts instead of percentages. Pick numbers that match the cost ceiling you want — they apply uniformly across every model, independent of the underlying context window.

Recommended Presets

Preset	COMPACT_AT	CRITICAL_AT	Best for
Cost-optimised	100000	150000	With MemoryAI (memories saved externally, compact early to keep bills predictable)
Balanced	200000	300000	General use, mid-size sessions
Long-form	400000	600000	Long agentic dev sessions on 1M-window models

Tip: With MemoryAI active, use 100K / 150K

Your memories are already saved externally — you don't need to keep everything in the context window. Compact early, compact often. Nothing is lost.

Example: How thresholds fire

Tokens used	Result
< 100K	safe
100K to < 150K	compact_soon
≥ 150K	compact_now

Custom settings via API:

# Set absolute thresholds via per-request override
curl -X POST https://memoryai.dev/v1/ide/guard/turn-check \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"turn_count": 30, "compact_at_tokens": 100000, "critical_at_tokens": 150000}'

Or set via MCP env vars (in your IDE config):

"env": {
  "HM_COMPACT_AT": "100000",
  "HM_CRITICAL_AT": "150000"
}

Priority: per-request override > env vars > server defaults.
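The priority order and the three states can be sketched together. This is an illustration under stated assumptions, not the server's code: the helper names are hypothetical, thresholds are treated as inclusive (a count that reaches the threshold triggers it), and `None` rather than `0` marks a value as unset.

```python
import os

SERVER_DEFAULTS = {"compact_at": 100_000, "critical_at": 150_000}


def first_set(*values):
    """Return the first value that is not None."""
    for value in values:
        if value is not None:
            return value
    raise ValueError("no value set")


def env_int(env, name):
    """Read an integer env var; missing or empty counts as unset."""
    raw = env.get(name)
    return int(raw) if raw else None


def resolve_thresholds(request=None, env=None):
    """Apply: per-request override > env vars > server defaults."""
    request = request or {}
    env = os.environ if env is None else env
    compact = first_set(request.get("compact_at_tokens"),
                        env_int(env, "HM_COMPACT_AT"),
                        SERVER_DEFAULTS["compact_at"])
    critical = first_set(request.get("critical_at_tokens"),
                         env_int(env, "HM_CRITICAL_AT"),
                         SERVER_DEFAULTS["critical_at"])
    return compact, critical


def guard_state(tokens, compact_at, critical_at):
    """Map a token count to safe / compact_soon / compact_now."""
    if tokens >= critical_at:
        return "compact_now"
    if tokens >= compact_at:
        return "compact_soon"
    return "safe"


env = {"HM_COMPACT_AT": "200000", "HM_CRITICAL_AT": "300000"}
print(resolve_thresholds(env={}))                              # (100000, 150000)
print(resolve_thresholds(env=env))                             # (200000, 300000)
print(resolve_thresholds({"compact_at_tokens": 50_000}, env))  # (50000, 300000)
print(guard_state(120_000, 100_000, 150_000))                  # compact_soon
```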

Context Guard is included in the auto-bootstrap rule above. No extra setup needed.

5. Bot/Agent Setup

Python Bot (ClawdBot, Telegram, Discord, etc.)

pip install hmc-memory

import asyncio

from memoryai import AsyncMemoryAI


async def main():
    # Placeholder transcript; replace with your bot's real conversation text.
    conversation_text = "user: hi\nassistant: hello"

    async with AsyncMemoryAI(
        api_key="hm_sk_your_key",
        base_url="https://memoryai.dev"
    ) as mem:
        # Session start: load context
        ctx = await mem.guard_bootstrap(max_tokens=4000)

        # During conversation: store important stuff
        await mem.store(
            "User prefers Python for backend",
            memory_type="preference",
            zone="important"
        )

        # Search past memories
        results = await mem.recall("what language does user prefer?")

        # Before session ends: compact if needed
        guard = await mem.guard_check(estimated_tokens=85000, max_tokens=200000)
        if guard.get("should_compact"):
            await mem.guard_compact(conversation_text, task_context="chat session")


# `await` and `async with` only work inside a coroutine, so run one.
asyncio.run(main())

Node.js Bot

npm install @cortex-memory/memoryai

import { MemoryAI } from '@cortex-memory/memoryai';

const mem = new MemoryAI({
  apiKey: 'hm_sk_your_key',
  baseUrl: 'https://memoryai.dev'
});

// Session start
const ctx = await mem.contextGuardBootstrap({ maxTokens: 4000 });

// Store memories
await mem.store("User prefers dark mode", { memoryType: "preference" });

// Recall
const results = await mem.recall("user preferences");

// Context guard
const guard = await mem.contextGuardCheck(85000, 200000);
if (guard.should_compact) {
  await mem.contextGuardCompact(conversationText);
}

// Emotion awareness
const emotion = await mem.getEmotion();
// { emotion: "focused", intensity: 0.7, style: "direct", verbosity: "minimal" }

REST API (Any Language)

# Store
curl -X POST https://memoryai.dev/v1/store \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers Python", "memory_type": "preference"}'

# Recall
curl -X POST https://memoryai.dev/v1/recall \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "what does user prefer?", "depth": "deep", "limit": 5}'

# Context Guard Check
curl -X POST https://memoryai.dev/v1/context/guard/check \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"estimated_tokens": 85000, "max_tokens": 200000}'

# Emotion State
curl https://memoryai.dev/v1/emotion \
  -H "Authorization: Bearer YOUR_KEY"
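For a language without an SDK, the curl calls above map directly onto any HTTP client. The sketch below uses only Python's standard library. The helper names `build_request` and `send` are hypothetical, and decoding the response as JSON is an assumption based on the JSON responses shown elsewhere in this guide.

```python
import json
import urllib.request

BASE_URL = "https://memoryai.dev"


def build_request(path, payload, api_key):
    """Build (but do not send) a POST request like the curl examples above."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the response body as JSON (assumed)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


store_req = build_request(
    "/v1/store",
    {"content": "User prefers Python", "memory_type": "preference"},
    "YOUR_KEY",
)
print(store_req.get_method(), store_req.full_url)
# With a real key, call the API with: print(send(store_req))
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything goes over the network.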

6. Available MCP Tools

Tool	What It Does
`memory_bootstrap`	Load DNA + recent context at session start
`memory_store`	Save a memory (fact, decision, preference, identity)
`memory_recall`	Search memories by meaning (semantic + graph + FTS)
`memory_compact`	Save conversation before context loss
`memory_health`	Check context pressure (safe/warning/critical)
`memory_explore`	Explore neural graph connections
`memory_clusters`	View topic clusters
`memory_recover`	Recover session after a break
`learn`	Store action + result + lesson
`entity_list`	List tracked entities
`reasoning_store`	Deep reasoning memory (Pro+)
`reasoning_recall`	Recall reasoned insights (Pro+)
`snapshot_create`	Backup memory state
`snapshot_restore`	Restore from backup
`context_guard_check`	Check context window pressure
`context_guard_compact`	Compact context into MemoryAI
`context_guard_bootstrap`	Full context bootstrap with DNA

v2.2 DNA-aligned tools (npm package `memoryai-mcp`)

8 new tools serving the 3 DNA lines: One brain. ∞ agents. Forever. · Top models are always expensive — MemoryAI is the retina for AI. · Brain belongs to the user.

Tool	What It Does	DNA
`brain_export`	Export entire brain → portable JSON bundle (Bundle Format v1)	#3 vendor-neutral
`brain_import`	Restore bundle into current tenant — idempotent (content_hash dedup)	#3 vendor-neutral
`benchmark_recall_vs_full`	Side-by-side: smart recall vs full-context dump on YOUR brain	#2 retina
`benchmark_pricing`	Public model pricing reference (Claude/GPT/Gemini/etc.)	#2 retina
`trust_agents`	Agent reputation leaderboard (Wilson lower bound) — team+	#1 ∞ agents
`trust_chunk`	Per-chunk trust info: source agent + reputation + helpful/unhelpful counts	#1 ∞ agents
`twin_respond`	Cognitive Twin: predict your free-form response — promax+	#1 ∞ agents
`twin_status`	Twin readiness check (cheap, no LLM call)	#1 ∞ agents

Public spec endpoints (no auth required)

Endpoint	Purpose
`GET /v1/spec`	MemoryAI Protocol v1 — Markdown spec (CC BY 4.0)
`GET /v1/spec/info`	Machine-readable JSON contract: format name, version, conformance levels, endpoints, memory types
`GET /v1/benchmark/pricing`	Assumed $/1M-token pricing for each LLM (used by benchmark)

The spec is published under CC BY 4.0, and the reference implementation is MIT-licensed. Anyone may implement it. There are three conformance levels: Producer, Consumer, and Bidirectional.

7. How It Works

Open IDE
  → MCP auto-connects (from settings.json)
  → Agent reads rules → calls memory_bootstrap
  → Loads: DNA memories + recent work + preferences + entities
  → Agent knows who you are and what you're working on

During session:
  → Agent auto-stores important decisions/preferences
  → Memories get emotional tagging (amygdala)
  → Neural graph links related memories (association learning)
  → Unused memories slowly fade (natural fade)
  → DNA memories (preference/decision/identity) NEVER fade

Context getting full:
  → Agent calls memory_health → "compact_now"
  → Agent calls memory_compact → saves conversation
  → Context compaction happens safely

Background (24/7, no user action):
  → Sleep consolidation (LLM summarizes insights)
  → Association strengthening (co-recalled memories get stronger)
  → Natural fade (unused memories fade)
  → Emotion decay (feelings fade without reinforcement)
  → Community detection (finds topic clusters)

Next session:
  → memory_bootstrap loads everything back
  → Including new insights from overnight consolidation
  → Cycle repeats. Memory grows smarter over time.
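The difference between fading memories and DNA memories in the flow above can be pictured with a toy decay model. Everything numeric here is an assumption for illustration: the guide does not describe the fade curve, so the exponential form, the 30-day half-life, and the function name `strength` are hypothetical. Only the set of DNA types comes from the text.

```python
DNA_TYPES = {"preference", "decision", "identity"}


def strength(memory_type, days_unused, half_life_days=30.0):
    """Relative strength in [0, 1]; DNA types are exempt from fading.

    Assumed model: strength halves every `half_life_days` without use.
    """
    if memory_type in DNA_TYPES:
        return 1.0
    return 0.5 ** (days_unused / half_life_days)


print(strength("preference", 365))  # 1.0
print(strength("fact", 30))         # 0.5
```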

8. Troubleshooting

Problem	Cause	Fix
Tools not showing in IDE	IDE not restarted	Restart IDE after editing config
Connection refused	Server unreachable	Check HM_ENDPOINT URL is correct
401 Unauthorized	Wrong API key	Check HM_API_KEY value
Agent doesn't auto-bootstrap	Missing rule file	Create the rule file (section 3)
npx not found	Node.js not installed	Install Node.js 18+
Memories not persisting	Wrong endpoint	Use https://memoryai.dev
Context Guard not triggering	Rule file missing or estimated_tokens=0	Add context_guard_check to rule file. Send estimated_tokens > 0.
Session compress returns "already_compressed"	Same session_id sent twice	Normal — each session_id can only be compressed once (idempotent)
Bot recall scores all 1.0	Scores are capped at 1.0 after layer weighting	Use the `layer` field to distinguish DNA vs sessions vs archive

9. Context Guard & Session Management

MemoryAI manages context window pressure automatically. When your session gets full, it compacts your conversation into durable memories — no data loss, no user intervention needed.

Two Integration Paths

Path	For	Endpoint Prefix	How It Works
IDE (Simple)	Claude Code, Kiro, Cursor, any MCP client	/v1/ide/guard/*	Bot sends summary when full. Server has buffer fallback — works even if bot sends nothing useful.
Bot (Advanced)	ClawdBot, custom bots with session control	/v1/bot/*	Full session compression with idempotent retries. Requires bot to manage session lifecycle.

Which path should I use?

If you're using an IDE (Claude Code, Kiro, Cursor) or a simple bot — use the IDE path. It works automatically with zero extra code. The Bot path is for advanced integrations that need full conversation preservation.

IDE Path — How It Works

Bot chats normally, calling /v1/recall every message
  → Server silently accumulates queries in session buffer (zero cost)
  → Guard detects pressure (30% warning, 50% critical)
  → Bot calls /v1/ide/guard/compact with summary (or anything)
  → Server stores: paragraphs + reasoning + extracted facts
  → If bot sends useless content → server uses buffer as fallback
  → Next recall finds everything — nothing lost

IDE Guard Check API

POST /v1/ide/guard/check
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "estimated_tokens": 100000,
  "max_tokens": 200000
}

Parameters

Field	Required	Description
estimated_tokens	Yes	Current token count in your session
max_tokens	Yes	Model's context window (128K, 200K, 1M, etc.)
model	No	Model name (auto-detects context window if max_tokens not set)
compact_at_tokens	No	Preferred. Absolute token threshold for compact_soon (e.g. 100000)
critical_at_tokens	No	Preferred. Absolute token threshold for compact_now (e.g. 150000)
compact_pct	No	Legacy decimal override (e.g. 0.30). Used only if absolute tokens not provided.
critical_pct	No	Legacy decimal override (e.g. 0.50). Used only if absolute tokens not provided.

Response

{
  "recommendation": "compact_now",
  "urgency": "high",
  "should_compact": true,
  "usage_percent": 50.0,
  "compact_at_tokens": 60000,
  "critical_at_tokens": 100000,
  "dna_memories": 12,
  "bootstrap_ready": true
}
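The sample response can be reproduced with a small sketch of how the parameters might combine. This is an assumption-laden illustration, not the server's logic: it uses the absolute thresholds when provided and otherwise falls back to the legacy fractions (0.30 and 0.50) of `max_tokens`, which is what yields the 60000 / 100000 values shown above. Setting `should_compact` only for `compact_now` is also an assumption.

```python
def ide_guard_check(estimated_tokens, max_tokens,
                    compact_at_tokens=None, critical_at_tokens=None,
                    compact_pct=0.30, critical_pct=0.50):
    """Sketch of a guard check: absolute thresholds win over fractions."""
    if compact_at_tokens is None:
        compact_at_tokens = round(max_tokens * compact_pct)
    if critical_at_tokens is None:
        critical_at_tokens = round(max_tokens * critical_pct)

    if estimated_tokens >= critical_at_tokens:
        recommendation = "compact_now"
    elif estimated_tokens >= compact_at_tokens:
        recommendation = "compact_soon"
    else:
        recommendation = "safe"

    return {
        "recommendation": recommendation,
        "should_compact": recommendation == "compact_now",
        "usage_percent": round(100 * estimated_tokens / max_tokens, 1),
        "compact_at_tokens": compact_at_tokens,
        "critical_at_tokens": critical_at_tokens,
    }


print(ide_guard_check(100_000, 200_000))
print(ide_guard_check(100_000, 200_000,
                      compact_at_tokens=100_000,
                      critical_at_tokens=150_000)["recommendation"])
```

With the same 100000-token session, the fraction-based thresholds report `compact_now`, while the absolute 100K / 150K thresholds report `compact_soon`.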

IDE Guard Compact API

POST /v1/ide/guard/compact
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "content": "Summary of conversation: discussed pricing for Kiro accounts, decided on 2K plan at $20/acc, margin 92.7%...",
  "task_context": "pricing discussion"
}

Buffer Fallback

If your bot sends a short or useless content string (e.g., "context guard triggered"), the server automatically uses its internal session buffer (accumulated from your recall queries) instead. You don't need to send a perfect summary — but a good one produces better memories.
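The fallback decision can be sketched as follows. The guide does not say how the server judges a summary to be useless, so the word-count heuristic, the `min_words` value, and the function name `choose_compact_source` are all hypothetical stand-ins.

```python
def choose_compact_source(content, session_buffer, min_words=8):
    """Pick what to store: the bot's summary or the buffered recall queries.

    Assumed heuristic: a summary shorter than `min_words` words is
    treated as useless, and the buffer is used if it has anything in it.
    """
    words = content.split() if content else []
    if len(words) >= min_words:
        return "content", content
    if session_buffer:
        return "buffer", "\n".join(session_buffer)
    return "content", content or ""


buffer = ["pricing for Kiro accounts", "margin on the 2K plan"]
print(choose_compact_source("context guard triggered", buffer)[0])  # buffer
print(choose_compact_source(
    "Discussed pricing for Kiro accounts and decided on the 2K plan",
    buffer)[0])                                                     # content
```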

IDE Guard Bootstrap API

POST /v1/ide/guard/bootstrap
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "task": "continue working on pricing model"
}

Returns DNA memories + recent activity + task-relevant context for session start.

Full IDE Integration Example

# Python — IDE integration (simple)
from memoryai import AsyncMemoryAI

mem = AsyncMemoryAI(api_key="hm_sk_xxx", base_url="https://memoryai.dev")

# Session start
ctx = await mem.bootstrap(task="my project")

# Every message: recall (server auto-buffers queries)
memories = await mem.recall("user question here", depth="deep")

# Every 15 messages: check context pressure
guard = await mem.ide_guard_check(
    estimated_tokens=current_tokens,
    max_tokens=200000,
)

if guard.get("should_compact"):
    # Send summary — or anything. Server has buffer fallback.
    await mem.ide_guard_compact(
        content=summarize_conversation(messages),
        task_context="current task"
    )
    # Clear local context, continue working
    # All memories preserved — recall will find them

Thresholds

Level	Trigger	Action
Safe	Below compact_at_tokens	Continue normally
Warning	compact_at_tokens reached	compact_soon — prepare a summary
Critical	critical_at_tokens reached	compact_now — compact immediately

Defaults: compact_at_tokens=100000, critical_at_tokens=150000. Override per-request or via env vars on your MCP config.

For Advanced Bot Integration (ClawdBot)

If your bot can manage session lifecycle (spawn sessions, track token counts, send full message arrays), use the /v1/bot/* endpoints below.

10. Bot Integration (ClawdBot & Custom Bots)

For bots that can manage session lifecycle — full conversation preservation with 100% recall within your plan period.

How It Works

Session 1 reaches 70% context → Guard: "spawn new session"
  → Bot spawns Session 2
  → Session 2 reaches threshold (ready)
  → Bot calls /v1/bot/session/compress for Session 1
  → Server checks idempotency (skip if already compressed)
  → Session 1 compressed → stored in session memory
  → DNA extracted → stored permanently
  → Session 2 recalls compressed memories with freshness boost
  → Scores capped at 1.0 after layer weighting
  → User notices nothing — seamless transition

After plan period (2-60 days, depending on plan):
  → Session memory expires → compacted to long-term archive (permanent)
  → DNA already permanent
  → Frequently accessed chunks → promoted to permanent (as summary)
  → Recall still finds everything via permanent + archive

Plan Tiers — Memory Retention

Plan	100% Recall	After Expiry	DNA
Free	2 days	Summary, permanent	Permanent
Personal	7 days	Summary, permanent	Permanent
Pro	14 days	Summary, permanent	Permanent
ProMax	30 days	Summary, permanent	Permanent
Team	60 days	Summary, permanent	Permanent
Enterprise	Forever	Permanent	Permanent
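The table translates into a simple lookup. The day counts below are copied from the table; the function name `recall_quality` and the choice to treat the last day of the window as still inside it are assumptions.

```python
# Days of 100% recall per plan, from the table above. None means no expiry.
FULL_RECALL_DAYS = {
    "free": 2,
    "personal": 7,
    "pro": 14,
    "promax": 30,
    "team": 60,
    "enterprise": None,
}


def recall_quality(plan, age_days):
    """Return "full" inside the plan's recall window, else "summary"."""
    days = FULL_RECALL_DAYS[plan.lower()]
    if days is None or age_days <= days:
        return "full"
    return "summary"


print(recall_quality("Pro", 10))          # full
print(recall_quality("Pro", 20))          # summary
print(recall_quality("Enterprise", 999))  # full
```

DNA memories are permanent on every plan, so this lookup only applies to session memory.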

Bot Guard Check

POST /v1/bot/guard/check
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "estimated_tokens": 140000,
  "max_tokens": 200000,
  "compress_threshold": 140000
}

Note: compress_threshold accepts 0 (meaning "always spawn"). Omit or set to null to use the default 70%.

Response

{
  "recommendation": "compact_now",
  "should_spawn_new_session": true,
  "spawn_reason": "context_pressure (140000/140000 tokens) — spawn new session, compress old when new reaches 20K",
  "compress_threshold": 140000,
  "usage_percent": 70.0,
  "dna_memories": 8,
  "bootstrap_ready": true
}
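The difference between `0` and `null` for compress_threshold is easy to get wrong in client code, because both are falsy in many languages. A minimal sketch of the decision, assuming the spawn signal fires once the estimate reaches the threshold (the function name is hypothetical):

```python
def should_spawn_new_session(estimated_tokens, max_tokens,
                             compress_threshold=None):
    """Decide whether to spawn a new session.

    None means "use the default 70% of max_tokens";
    0 is a real value meaning "always spawn".
    """
    if compress_threshold is None:  # not `if not compress_threshold`
        compress_threshold = max_tokens * 70 // 100
    return estimated_tokens >= compress_threshold


print(should_spawn_new_session(140_000, 200_000, 140_000))  # True
print(should_spawn_new_session(139_999, 200_000))           # False
print(should_spawn_new_session(10, 200_000, 0))             # True
```

Testing with `is None` keeps an explicit `0` from silently falling back to the default.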

Session Compress

When your new session is ready (20K+ context), compress the old session:

POST /v1/bot/session/compress
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "session_id": "session-1-uuid",
  "messages": [
    {"role": "user", "content": "calculate 100k credit over 15 days..."},
    {"role": "assistant", "content": "Math: 50 acc × $20 = $1000..."},
    ...
  ]
}

Idempotent: If session_id was already compressed, returns "status": "already_compressed" without re-processing. Safe to retry on network errors.

Response

{
  "status": "completed",
  "compress_id": "abc123def456",
  "chunks_created": 7,
  "dna_extracted": 3,
  "summaries": ["Session discussed pricing model...", "..."],
  "message": "Session compressed: 7 chunks, 3 DNA facts promoted to permanent memory."
}
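Idempotency by session_id can be illustrated with an in-memory stand-in for the server. The class `CompressLog` is hypothetical, and echoing the original `compress_id` on a repeat call is an assumption; the guide only states that a repeat returns `"status": "already_compressed"`.

```python
import uuid


class CompressLog:
    """Toy model of a server that compresses each session_id once."""

    def __init__(self):
        self._done = {}  # session_id -> compress_id

    def compress(self, session_id, messages):
        if session_id in self._done:
            # Repeat call: report the earlier result, do no new work.
            return {"status": "already_compressed",
                    "compress_id": self._done[session_id]}
        compress_id = uuid.uuid4().hex[:12]
        self._done[session_id] = compress_id
        return {"status": "completed", "compress_id": compress_id}


log = CompressLog()
messages = [{"role": "user", "content": "hello"}]
first = log.compress("session-1", messages)
retry = log.compress("session-1", messages)  # e.g. after a network timeout
print(first["status"], retry["status"])      # completed already_compressed
```

Because a retry cannot create duplicate chunks, a client can resend the same request after any ambiguous failure.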

Bot Recall

POST /v1/bot/recall
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "query": "pricing calculation yesterday",
  "depth": "deep",
  "limit": 10
}

Smart layered recall — DNA, session memory, regular facts, and archive are merged with priority weighting (all final scores capped at 1.0). Recent sessions get an automatic freshness boost.
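The sketch below shows why several results can share a score of exactly 1.0 after merging. The layer weights, the freshness factor, and the hit fields (`layer`, `score`, `recent`) are hypothetical; only the capping at 1.0 and the idea of per-layer priority come from the description above.

```python
# Hypothetical weights: higher-priority layers are boosted more.
LAYER_WEIGHT = {"dna": 1.5, "session": 1.2, "fact": 1.0, "archive": 0.8}


def merge_layers(hits, freshness_boost=1.1):
    """Weight each hit by layer, boost recent ones, cap at 1.0, and sort."""
    merged = []
    for hit in hits:
        score = hit["score"] * LAYER_WEIGHT[hit["layer"]]
        if hit.get("recent"):
            score *= freshness_boost
        merged.append({**hit, "score": min(1.0, score)})
    # sorted() is stable, so ties at 1.0 keep their input order.
    return sorted(merged, key=lambda h: h["score"], reverse=True)


hits = [
    {"layer": "dna", "score": 0.9},
    {"layer": "session", "score": 0.85, "recent": True},
    {"layer": "archive", "score": 0.5},
]
for hit in merge_layers(hits):
    print(hit["layer"], hit["score"])
```

Two of the three hits end up at 1.0, so the score alone cannot rank them; the layer of each result carries the remaining information.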

Bot Bootstrap

POST /v1/bot/guard/bootstrap
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "task": "continue pricing discussion",
  "mode": "default"
}

Modes:

Mode	Size	Content
default	~5.5K tokens	DNA memories + session summaries
deep	~25K tokens	DNA + summaries + top relevant chunk (full)

Full Integration Example (Python)

from memoryai import AsyncMemoryAI

mem = AsyncMemoryAI(api_key="hm_sk_your_key", base_url="https://memoryai.dev")

# 1. Bootstrap new session
ctx = await mem.bot_bootstrap(task="my project", mode="default")
# → DNA + recent session summaries

# 2. Every message: recall
memories = await mem.bot_recall("user question here", depth="deep")
# → DNA + sessions + regular facts + archive (merged)

# 3. Every 15 messages: check guard
guard = await mem.bot_guard_check(
    estimated_tokens=current_tokens,
    max_tokens=200000,
)

# 4. When spawn signal:
if guard.get("should_spawn_new_session"):
    new_session = spawn_new_session()  # platform-specific

    # ... new session works normally ...
    # ... when new session is ready:

    result = await mem.bot_session_compress(
        session_id="old-session-id",
        messages=old_session_messages,
    )
    # → Old session compressed, DNA promoted to permanent memory
    # → New session recalls them via /v1/bot/recall

What Happens Over Time

Time	What Bot Recalls	Quality
Day 1 (just compressed)	DNA + full session chunks (max freshness)	100%
Day 3	DNA + session chunks (with freshness)	100%
After plan expiry	DNA + detailed summary	90%+
Forever	DNA (preferences, decisions)	Core facts

Frequency Promotion

Frequently recalled chunks are automatically promoted to permanent memory as a summary. Often-accessed memories become permanent — just like the human brain.

Need help?

GitHub: github.com/memoryai-dev/memoryai

memoryai v3.0.0 — A living brain for AI agents
