MemoryAI Setup Guide

Install once. Memory works automatically forever. Full guide for every IDE and platform.

Skip the manual config: install the extension
Supported editors: Kiro, Cursor, Windsurf, VS Code. Paste your API key and you are done. The extension wires up the MCP server, hooks, and rules automatically.
Available on VS Marketplace and Open VSX.
Or via CLI: run code --install-extension memoryai.memoryai-vsx, then run "MemoryAI: Connect" from the command palette.

— or read the full manual setup below (Claude Code / Antigravity / Cline / Bot integrations) —

Table of Contents
1. Quick Start (2 minutes)
2. IDE Setup (per platform)
3. Auto-Bootstrap (fully automatic)
4. Context Guard Setup
5. Bot/Agent Setup (ClawdBot, custom bots)
6. Available Tools
7. How It Works
8. Troubleshooting
9. Context Guard & Session Management
10. Bot Integration (Advanced)

1. Quick Start

Step 1: Get an API Key
curl -X POST https://memoryai.dev/v1/admin/provision \
  -H "Content-Type: application/json" \
  -d '{"name": "my-agent", "tos_accepted": true}'

Save the api_key from the response. The Pro trial is free for 30 days.
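
For scripted setups, the same provisioning call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_provision_request and extract_api_key are hypothetical, and the only response field assumed is the api_key mentioned above.

```python
import json
import urllib.request

PROVISION_URL = "https://memoryai.dev/v1/admin/provision"


def build_provision_request(name: str) -> urllib.request.Request:
    """Build (but do not send) the same POST that the curl command issues."""
    body = json.dumps({"name": name, "tos_accepted": True}).encode("utf-8")
    return urllib.request.Request(
        PROVISION_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_api_key(response_body: str) -> str:
    """Pull the api_key field out of the JSON response text."""
    return json.loads(response_body)["api_key"]


# To actually send the request:
#   with urllib.request.urlopen(build_provision_request("my-agent")) as resp:
#       api_key = extract_api_key(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect before anything goes over the network.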

Step 2: Install MCP Server
npx memoryai-mcp

Or install globally: npm install -g memoryai-mcp

Step 3: Add config to your IDE (see section 2 below)
Step 4: Add auto-bootstrap rule (see section 3 below)

Done!
After these 4 steps, memory works automatically. Your AI agent will remember across sessions, protect important memories (DNA), and manage context automatically.

2. IDE Setup

Copy the config below into your IDE's MCP settings file. Replace YOUR_KEY with your API key.

Claude Code (CLI) — Recommended

One command. Fully automatic memory.

The memoryai-claude package installs 3 lifecycle hooks that make memory 100% automatic — no MCP tools needed, no agent decisions. Memory recalls before every prompt and saves after every turn, invisibly.

npx memoryai-claude install

That's it. Restart Claude Code once. The CLI will:

| Step | What it does |
|------|------|
| 1 | Ask for API key (leave blank = auto-provision a free one) |
| 2 | Write 3 hooks to ~/.claude/settings.json: SessionStart (bootstrap context at session open), UserPromptSubmit (recall relevant memories before each prompt), Stop (save important info after each turn) |
| 3 | Append a note to ~/.claude/CLAUDE.md |
| 4 | Back up originals before any writes |

Other commands:

npx memoryai-claude doctor     # verify hooks + ping server
npx memoryai-claude status     # show hooks at user/project scope
npx memoryai-claude uninstall  # remove hooks (memory on server stays)

Customize the compact threshold (tokens)

By default, MemoryAI suggests a compact at 150K tokens and forces one at 200K tokens. The same thresholds apply to every model (Claude 200K, Gemini 1M, GPT-5), so your bill stays predictable on huge context windows.

You can change the thresholds with the commands below. The setting is stored on the server against your API key, so you only need to set it once — it applies across every IDE/host that uses the key (Claude Code, Kiro, Cursor, Windsurf, …).

npx memoryai-claude config                       # show current thresholds
npx memoryai-claude config compact 180000        # set soft warning to 180K
npx memoryai-claude config critical 250000       # set hard ceiling to 250K
npx memoryai-claude config compact reset         # clear override → default 150K
npx memoryai-claude config critical reset        # clear override → default 200K
npx memoryai-claude config reset                 # clear both → defaults

Note: compact must be strictly less than critical. On small-window models (32K, 16K), the system auto-caps the value so the compact still fires before the host truncates.
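
The ordering rule in the note above can be checked locally before running the config commands. The sketch below is illustrative only: validate_thresholds and cap_to_window are hypothetical helpers, and the 0.5 / 0.75 window fractions used for capping are assumptions for the example, not the rule MemoryAI actually applies.

```python
def validate_thresholds(compact: int, critical: int) -> None:
    """Raise ValueError unless compact is strictly less than critical."""
    if compact <= 0 or critical <= 0:
        raise ValueError("thresholds must be positive token counts")
    if compact >= critical:
        raise ValueError("compact must be strictly less than critical")


def cap_to_window(compact: int, critical: int, window: int) -> tuple[int, int]:
    """Hypothetical cap: shrink both thresholds so they fit inside the window.

    The 0.5 and 0.75 fractions are illustrative assumptions.
    """
    capped_compact = min(compact, int(window * 0.5))
    capped_critical = min(critical, int(window * 0.75))
    return capped_compact, capped_critical


validate_thresholds(180_000, 250_000)          # passes silently
print(cap_to_window(150_000, 200_000, 32_000))  # thresholds shrunk for a 32K model
```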

Flags: --project (project scope instead of user), --endpoint URL (self-host), --key KEY (skip prompt), MEMORYAI_NONINTERACTIVE=1 (CI/scripts).

Alternative: if you prefer MCP-only (no hooks), use the generic MCP config below instead.

MCP-only config (manual, less automatic)

File: ~/.claude/settings.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Note: MCP-only means the agent must decide when to call memory tools. The CLI hooks approach is fully automatic.

Cursor

File: ~/.cursor/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

VS Code (Copilot / Cline)

File: .vscode/mcp.json

{
  "servers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Kiro

File: .kiro/settings/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Windsurf

File: ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Claude Desktop

File: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
%APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Antigravity

File: ~/.antigravity/mcp.json

{
  "mcpServers": {
    "memoryai": {
      "command": "npx",
      "args": ["-y", "memoryai-mcp"],
      "env": {
        "HM_ENDPOINT": "https://memoryai.dev",
        "HM_API_KEY": "YOUR_KEY"
      }
    }
  }
}

Any MCP-Compatible Tool

Same JSON config — just find your tool's MCP settings file and paste.
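
Because every config above shares the same server entry, it can be generated rather than pasted. The following is a small sketch; mcp_config is a hypothetical helper name, and the only variation it models is the root key ("servers" for the VS Code file shown above, "mcpServers" for the others).

```python
import json


def mcp_config(api_key: str, root_key: str = "mcpServers") -> dict:
    """Return the MemoryAI MCP server entry nested under the given root key."""
    return {
        root_key: {
            "memoryai": {
                "command": "npx",
                "args": ["-y", "memoryai-mcp"],
                "env": {
                    "HM_ENDPOINT": "https://memoryai.dev",
                    "HM_API_KEY": api_key,
                },
            }
        }
    }


# The VS Code example uses "servers"; the other examples use "mcpServers".
print(json.dumps(mcp_config("YOUR_KEY", root_key="servers"), indent=2))
```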

3. One-Click Setup (Copy → Paste → Done)

After MCP config (step 2), create one rule file so your AI agent automatically uses MemoryAI + Context Guard every session.

Option A: One command (fastest):

curl -s https://memoryai.dev/install/SKILL.md > ~/.claude/CLAUDE.md

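This downloads the latest rule file directly to the Claude Code location. Note that > overwrites any existing ~/.claude/CLAUDE.md; use >> instead to append to a file you want to keep. For other IDEs, change the path (see the table below).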

Option B: Manual copy-paste (see content below)

| Platform | Create this file |
|------|------|
| Claude Code | ~/.claude/CLAUDE.md |
| Cursor | .cursor/rules/memoryai.mdc |
| Kiro | .kiro/rules/memoryai.md |
| Windsurf | .windsurfrules |
| VS Code (Copilot) | .github/copilot-instructions.md |
| Any other IDE | Your IDE's system prompt / rules file |

Copy this entire block and paste into the file above:

# MemoryAI — Auto Memory + Context Guard

You have access to MemoryAI — a persistent brain that remembers across sessions.
Everything below is AUTOMATIC. Follow these rules every session.

## 1. Session Start (ALWAYS do first)

Call MCP tool `context_guard_bootstrap` or `memory_bootstrap` to load context.
This gives you: user identity, preferences, recent work, and important decisions.
Do NOT ask the user to repeat context from previous sessions.

## 2. During Session

### Store important things (auto, no need to ask user):
- User says a preference → `memory_store` with memory_type="preference"
- A decision is made → `memory_store` with memory_type="decision"
- User shares identity info → `memory_store` with memory_type="identity"
- Important fact learned → `memory_store` with memory_type="fact"

### Recall when needed:
- Need context from past sessions → `memory_recall` with relevant query
- User asks "what did we decide about X?" → `memory_recall`

### Do NOT store:
- "ok", "thanks", "got it" (trivial)
- Temporary debug output
- Things already in code or git

## 3. Context Guard (CRITICAL — prevents context loss)

### Check every ~15-20 messages:
Call `context_guard_check` with:
  {"estimated_tokens": your_current_token_count, "max_tokens": model_context_window}

### Response meanings:
- "safe" → continue normally
- "compact_soon" → prepare to save context soon
- "compact_now" → IMMEDIATELY call `context_guard_compact` with a summary

### When compact_now:
1. Summarize the key points of the current conversation
2. Call `context_guard_compact` with that summary
3. Continue working — memories are saved in MemoryAI

### Settings (already configured on server):
- Compact warning: at 30% of effective context window
- Critical compact: at 50% of effective context window
- Works with ANY model size automatically (200K, 1M, etc.)

## 4. Emotion Awareness (optional)

Call MCP tool to check user emotional state. Adapt your style:
- stressed/frustrated → short, actionable answers
- confused → step-by-step, detailed explanations
- excited → match energy
- calm/focused → balanced, direct

## 5. Important Notes

- If MemoryAI is down, continue working normally (non-fatal).
- Preferences and decisions are DNA-protected — they NEVER expire.
- Regular facts fade over time if unused (like real memory).
- The brain consolidates memories overnight — next session may have new insights.

## Quick Reference — MCP Tools

| Tool | When |
|------|------|
| memory_bootstrap / context_guard_bootstrap | Session start |
| memory_store | Save something important |
| memory_recall | Need past context |
| memory_compact / context_guard_compact | Context getting full |
| memory_health / context_guard_check | Check context pressure |
| memory_explore | Explore memory connections |
| learn | Save action + result + lesson |
That's it! 2 steps total:
1. Paste MCP config (step 2) → MemoryAI connected
2. Paste the rule above → AI knows how to use memory + context guard + emotion

Open IDE → everything automatic. Memory persists. Context protected. Zero manual work.

4. Context Guard

Context Guard is already included in the rule above. Here's what it does:

| State | Meaning | What happens |
|------|------|------|
| SAFE | Context below compact threshold | Nothing, keep working |
| COMPACT_SOON | Context between compact and critical | AI prepares to save context |
| COMPACT_NOW | Context above critical threshold | AI saves conversation to MemoryAI immediately |

Threshold Settings (env vars)

| Env Var | Default | Meaning |
|------|------|------|
| HM_COMPACT_AT | 100000 | Warn when conversation reaches this many tokens (absolute count) |
| HM_CRITICAL_AT | 150000 | Force compact at this token count |

Note: v2.3.2+ uses absolute token counts instead of percentages. Pick numbers that match the cost ceiling you want — they apply uniformly across every model, independent of the underlying context window.

Recommended Presets

| Preset | COMPACT_AT | CRITICAL_AT | Best for |
|------|------|------|------|
| Cost-optimised | 100000 | 150000 | With MemoryAI (memories saved externally, compact early to keep bills predictable) |
| Balanced | 200000 | 300000 | General use, mid-size sessions |
| Long-form | 400000 | 600000 | Long agentic dev sessions on 1M-window models |
Tip: With MemoryAI active, use 100K / 150K
Your memories are already saved externally — you don't need to keep everything in the context window. Compact early, compact often. Nothing is lost.

Example: How thresholds fire

| Tokens used | Result |
|------|------|
| < 100K | safe |
| 100K - 150K | compact_soon |
| > 150K | compact_now |
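
The table above amounts to a two-threshold classifier. A minimal sketch follows, assuming a threshold counts as crossed once the token count reaches it (the table does not say which state applies at exactly 100K or 150K); guard_state is a hypothetical name, not an SDK function.

```python
def guard_state(tokens_used: int,
                compact_at: int = 100_000,
                critical_at: int = 150_000) -> str:
    """Map a token count to safe / compact_soon / compact_now."""
    if tokens_used >= critical_at:   # assumption: reaching the threshold triggers it
        return "compact_now"
    if tokens_used >= compact_at:
        return "compact_soon"
    return "safe"


for tokens in (85_000, 120_000, 160_000):
    print(tokens, guard_state(tokens))
```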

Custom settings via API:

# Set absolute thresholds via per-request override
curl -X POST https://memoryai.dev/v1/ide/guard/turn-check \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"turn_count": 30, "compact_at_tokens": 100000, "critical_at_tokens": 150000}'

Or set via MCP env vars (in your IDE config):

"env": {
  "HM_COMPACT_AT": "100000",
  "HM_CRITICAL_AT": "150000"
}

Priority: per-request override > env vars > server defaults.
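
That priority order can be sketched as a simple fallback chain. This is an illustration of the stated order only; resolve_threshold is a hypothetical helper, and the defaults are the values from the env var table above.

```python
import os
from typing import Mapping, Optional

SERVER_DEFAULTS = {"HM_COMPACT_AT": 100_000, "HM_CRITICAL_AT": 150_000}


def resolve_threshold(env_name: str,
                      request_value: Optional[int] = None,
                      env: Mapping[str, str] = os.environ) -> int:
    """Per-request override wins, then the env var, then the server default."""
    if request_value is not None:
        return request_value
    if env_name in env:
        return int(env[env_name])
    return SERVER_DEFAULTS[env_name]


# Env var set, no per-request override: the env var value is used.
print(resolve_threshold("HM_COMPACT_AT", None, {"HM_COMPACT_AT": "90000"}))
```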

Context Guard is included in the auto-bootstrap rule above. No extra setup needed.

5. Bot/Agent Setup

Python Bot (ClawdBot, Telegram, Discord, etc.)

pip install hmc-memory

import asyncio

from memoryai import AsyncMemoryAI


async def main(conversation_text: str) -> None:
    async with AsyncMemoryAI(
        api_key="hm_sk_your_key",
        base_url="https://memoryai.dev",
    ) as mem:
        # Session start: load context
        ctx = await mem.guard_bootstrap(max_tokens=4000)

        # During conversation: store important stuff
        await mem.store(
            "User prefers Python for backend",
            memory_type="preference",
            zone="important",
        )

        # Search past memories
        results = await mem.recall("what language does user prefer?")

        # Before session ends: compact if needed
        guard = await mem.guard_check(estimated_tokens=85000, max_tokens=200000)
        if guard.get("should_compact"):
            await mem.guard_compact(conversation_text, task_context="chat session")


# Pass in the transcript your bot has accumulated
asyncio.run(main(conversation_text="user: hi\nassistant: hello"))

Node.js Bot

npm install @cortex-memory/memoryai

import { MemoryAI } from '@cortex-memory/memoryai';

const mem = new MemoryAI({
  apiKey: 'hm_sk_your_key',
  baseUrl: 'https://memoryai.dev'
});

// Session start
const ctx = await mem.contextGuardBootstrap({ maxTokens: 4000 });

// Store memories
await mem.store("User prefers dark mode", { memoryType: "preference" });

// Recall
const results = await mem.recall("user preferences");

// Context guard
const guard = await mem.contextGuardCheck(85000, 200000);
if (guard.should_compact) {
  await mem.contextGuardCompact(conversationText);
}

// Emotion awareness
const emotion = await mem.getEmotion();
// { emotion: "focused", intensity: 0.7, style: "direct", verbosity: "minimal" }

REST API (Any Language)

# Store
curl -X POST https://memoryai.dev/v1/store \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers Python", "memory_type": "preference"}'

# Recall
curl -X POST https://memoryai.dev/v1/recall \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "what does user prefer?", "depth": "deep", "limit": 5}'

# Context Guard Check
curl -X POST https://memoryai.dev/v1/context/guard/check \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"estimated_tokens": 85000, "max_tokens": 200000}'

# Emotion State
curl https://memoryai.dev/v1/emotion \
  -H "Authorization: Bearer YOUR_KEY"
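
The curl calls above share one shape: a Bearer token, plus a JSON body for POST requests. As a sketch for languages without an SDK, here is that shape in standard-library Python; build_request is a hypothetical helper, and nothing is sent until urlopen is called.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://memoryai.dev"


def build_request(path: str, api_key: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request: POST with a JSON body, GET without one."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers=headers,
        method="POST" if payload is not None else "GET",
    )


store = build_request("/v1/store", "YOUR_KEY",
                      {"content": "User prefers Python", "memory_type": "preference"})
emotion = build_request("/v1/emotion", "YOUR_KEY")
# Send with: urllib.request.urlopen(store)
```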

6. Available MCP Tools

| Tool | What It Does |
|------|------|
| memory_bootstrap | Load DNA + recent context at session start |
| memory_store | Save a memory (fact, decision, preference, identity) |
| memory_recall | Search memories by meaning (semantic + graph + FTS) |
| memory_compact | Save conversation before context loss |
| memory_health | Check context pressure (safe/warning/critical) |
| memory_explore | Explore neural graph connections |
| memory_clusters | View topic clusters |
| memory_recover | Recover session after a break |
| learn | Store action + result + lesson |
| entity_list | List tracked entities |
| reasoning_store | Deep reasoning memory (Pro+) |
| reasoning_recall | Recall reasoned insights (Pro+) |
| snapshot_create | Back up memory state |
| snapshot_restore | Restore from backup |
| context_guard_check | Check context window pressure |
| context_guard_compact | Compact context into MemoryAI |
| context_guard_bootstrap | Full context bootstrap with DNA |

v2.2 — DNA-aligned tools (npm [email protected])

8 new tools serving the 3 DNA lines:
1. One brain. ∞ agents. Forever.
2. Top models are always expensive: MemoryAI is the retina for AI.
3. Brain belongs to the user.

| Tool | What It Does | DNA |
|------|------|------|
| brain_export | Export entire brain → portable JSON bundle (Bundle Format v1) | #3 vendor-neutral |
| brain_import | Restore bundle into current tenant; idempotent (content_hash dedup) | #3 vendor-neutral |
| benchmark_recall_vs_full | Side-by-side: smart recall vs full-context dump on YOUR brain | #2 retina |
| benchmark_pricing | Public model pricing reference (Claude/GPT/Gemini/etc.) | #2 retina |
| trust_agents | Agent reputation leaderboard (Wilson lower bound), team+ | #1 ∞ agents |
| trust_chunk | Per-chunk trust info: source agent + reputation + helpful/unhelpful counts | #1 ∞ agents |
| twin_respond | Cognitive Twin: predict your free-form response, promax+ | #1 ∞ agents |
| twin_status | Twin readiness check (cheap, no LLM call) | #1 ∞ agents |

Public spec endpoints (no auth required)

| Endpoint | Purpose |
|------|------|
| GET /v1/spec | MemoryAI Protocol v1, Markdown spec (CC BY 4.0) |
| GET /v1/spec/info | Machine-readable JSON contract: format name, version, conformance levels, endpoints, memory types |
| GET /v1/benchmark/pricing | Assumed $/1M-token pricing for each LLM (used by benchmark) |

The spec is published under CC BY 4.0, and the reference implementation is MIT-licensed. Anyone may implement the spec. There are three conformance levels: Producer, Consumer, and Bidirectional.

7. How It Works

Open IDE
  → MCP auto-connects (from settings.json)
  → Agent reads rules → calls memory_bootstrap
  → Loads: DNA memories + recent work + preferences + entities
  → Agent knows who you are and what you're working on

During session:
  → Agent auto-stores important decisions/preferences
  → Memories get emotional tagging (amygdala)
  → Neural graph links related memories (association learning)
  → Unused memories slowly fade (natural fade)
  → DNA memories (preference/decision/identity) NEVER fade

Context getting full:
  → Agent calls memory_health → "compact_now"
  → Agent calls memory_compact → saves conversation
  → Context compaction happens safely

Background (24/7, no user action):
  → Sleep consolidation (LLM summarizes insights)
  → association strengthening (co-recalled memories get stronger)
  → natural fade (unused memories fade)
  → Emotion decay (feelings fade without reinforcement)
  → Community detection (finds topic clusters)

Next session:
  → memory_bootstrap loads everything back
  → Including new insights from overnight consolidation
  → Cycle repeats. Memory grows smarter over time.

8. Troubleshooting

| Problem | Cause | Fix |
|------|------|------|
| Tools not showing in IDE | IDE not restarted | Restart IDE after editing config |
| Connection refused | Server unreachable | Check HM_ENDPOINT URL is correct |
| 401 Unauthorized | Wrong API key | Check HM_API_KEY value |
| Agent doesn't auto-bootstrap | Missing rule file | Create the rule file (section 3) |
| npx not found | Node.js not installed | Install Node.js 18+ |
| Memories not persisting | Wrong endpoint | Use https://memoryai.dev |
| Context Guard not triggering | Rule file missing or estimated_tokens=0 | Add context_guard_check to rule file. Send estimated_tokens > 0. |
| Session compress returns "already_compressed" | Same session_id sent twice | Normal: each session_id can only be compressed once (idempotent) |
| Bot recall scores all 1.0 | Scores are capped at 1.0 after layer weighting | Use the layer field to distinguish DNA vs sessions vs archive |

9. Context Guard & Session Management

MemoryAI manages context window pressure automatically. When your session gets full, it compacts your conversation into durable memories — no data loss, no user intervention needed.

Two Integration Paths

| Path | For | Endpoint Prefix | How It Works |
|------|------|------|------|
| IDE (Simple) | Claude Code, Kiro, Cursor, any MCP client | /v1/ide/guard/* | Bot sends summary when full. Server has a buffer fallback that works even if the bot sends nothing useful. |
| Bot (Advanced) | ClawdBot, custom bots with session control | /v1/bot/* | Full session compression with idempotent retries. Requires bot to manage session lifecycle. |
Which path should I use?

If you're using an IDE (Claude Code, Kiro, Cursor) or a simple bot — use the IDE path. It works automatically with zero extra code. The Bot path is for advanced integrations that need full conversation preservation.

IDE Path — How It Works

Bot chats normally, calling /v1/recall every message
  → Server silently accumulates queries in session buffer (zero cost)
  → Guard detects pressure (30% warning, 50% critical)
  → Bot calls /v1/ide/guard/compact with summary (or anything)
  → Server stores: paragraphs + reasoning + extracted facts
  → If bot sends useless content → server uses buffer as fallback
  → Next recall finds everything — nothing lost

IDE Guard Check API

POST /v1/ide/guard/check
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "estimated_tokens": 100000,
  "max_tokens": 200000
}

Parameters

| Field | Required | Description |
|------|------|------|
| estimated_tokens | Yes | Current token count in your session |
| max_tokens | Yes | Model's context window (128K, 200K, 1M, etc.) |
| model | No | Model name (auto-detects context window if max_tokens not set) |
| compact_at_tokens | No | Preferred. Absolute token threshold for compact_soon (e.g. 100000) |
| critical_at_tokens | No | Preferred. Absolute token threshold for compact_now (e.g. 150000) |
| compact_pct | No | Legacy decimal override (e.g. 0.30). Used only if absolute tokens not provided. |
| critical_pct | No | Legacy decimal override (e.g. 0.50). Used only if absolute tokens not provided. |
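
The precedence between the absolute and legacy percentage fields can be sketched as follows. This is an assumption-laden illustration, not the server's code: effective_threshold is a hypothetical name, and rounding the percentage result to the nearest token is a choice made for the example.

```python
from typing import Optional

DEFAULT_COMPACT_AT = 100_000
DEFAULT_CRITICAL_AT = 150_000


def effective_threshold(max_tokens: int,
                        absolute: Optional[int],
                        pct: Optional[float],
                        default: int) -> int:
    """Absolute token count first, then legacy percentage, then the default."""
    if absolute is not None:
        return absolute
    if pct is not None:
        return round(max_tokens * pct)  # e.g. 0.30 of a 200K window
    return default


# Legacy overrides on a 200K window:
print(effective_threshold(200_000, None, 0.30, DEFAULT_COMPACT_AT))
print(effective_threshold(200_000, None, 0.50, DEFAULT_CRITICAL_AT))
```

With max_tokens of 200000, the legacy overrides 0.30 and 0.50 work out to 60000 and 100000 tokens.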

Response

{
  "recommendation": "compact_now",
  "urgency": "high",
  "should_compact": true,
  "usage_percent": 50.0,
  "compact_at_tokens": 60000,
  "critical_at_tokens": 100000,
  "dna_memories": 12,
  "bootstrap_ready": true
}

IDE Guard Compact API

POST /v1/ide/guard/compact
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "content": "Summary of conversation: discussed pricing for Kiro accounts, decided on 2K plan at $20/acc, margin 92.7%...",
  "task_context": "pricing discussion"
}
Buffer Fallback

If your bot sends a short or useless content string (e.g., "context guard triggered"), the server automatically uses its internal session buffer (accumulated from your recall queries) instead. You don't need to send a perfect summary — but a good one produces better memories.
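
The fallback decision described above might look roughly like this. The sketch is hypothetical throughout: the document does not say how the server judges content to be "useless", so the 80-character cutoff and the choose_compact_source name are assumptions for illustration.

```python
MIN_USEFUL_CHARS = 80  # hypothetical cutoff, not a documented value


def choose_compact_source(content: str, session_buffer: list[str]) -> str:
    """Return the text to compact: the client's content, or the buffer as fallback."""
    if len(content.strip()) >= MIN_USEFUL_CHARS or not session_buffer:
        return content
    # Content is too short to be a real summary: fall back to buffered queries.
    return "\n".join(session_buffer)


buffer = ["pricing for Kiro accounts", "margin on 2K plan"]
print(choose_compact_source("context guard triggered", buffer))
```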

IDE Guard Bootstrap API

POST /v1/ide/guard/bootstrap
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "task": "continue working on pricing model"
}

Returns DNA memories + recent activity + task-relevant context for session start.

Full IDE Integration Example

# Python: IDE integration (simple)
import asyncio

from memoryai import AsyncMemoryAI


def summarize_conversation(messages):
    # Placeholder: replace with your own summarizer
    return "\n".join(messages[-20:])


async def main(messages, current_tokens):
    mem = AsyncMemoryAI(api_key="hm_sk_xxx", base_url="https://memoryai.dev")

    # Session start
    ctx = await mem.bootstrap(task="my project")

    # Every message: recall (server auto-buffers queries)
    memories = await mem.recall("user question here", depth="deep")

    # Every 15 messages: check context pressure
    guard = await mem.ide_guard_check(
        estimated_tokens=current_tokens,
        max_tokens=200000,
    )

    if guard.get("should_compact"):
        # Send a summary, or anything. Server has buffer fallback.
        await mem.ide_guard_compact(
            content=summarize_conversation(messages),
            task_context="current task",
        )
        # Clear local context, continue working
        # All memories preserved; recall will find them


asyncio.run(main(messages=["user question here"], current_tokens=100000))

Thresholds

| Level | Trigger | Action |
|------|------|------|
| Safe | Below compact_at_tokens | Continue normally |
| Warning | compact_at_tokens reached | compact_soon: prepare a summary |
| Critical | critical_at_tokens reached | compact_now: compact immediately |

Defaults: compact_at_tokens=100000, critical_at_tokens=150000. Override per-request or via env vars on your MCP config.

For Advanced Bot Integration (ClawdBot)

If your bot can manage session lifecycle (spawn sessions, track token counts, send full message arrays), use the /v1/bot/* endpoints below.

10. Bot Integration (ClawdBot & Custom Bots)

For bots that can manage session lifecycle — full conversation preservation with 100% recall within your plan period.

How It Works

Session 1 reaches 70% context → Guard: "spawn new session"
  → Bot spawns Session 2
  → Session 2 reaches threshold (ready)
  → Bot calls /v1/bot/session/compress for Session 1
  → Server checks idempotency (skip if already compressed)
  → Session 1 compressed → stored in session memory
  → DNA extracted → stored permanently
  → Session 2 recalls compressed memories with freshness boost
  → Scores capped at 1.0 after layer weighting
  → User notices nothing — seamless transition

After plan period (2-60 days, depending on plan):
  → Session memory expires → compacted to long-term archive (permanent)
  → DNA already permanent
  → Frequently accessed chunks → promoted to permanent (as summary)
  → Recall still finds everything via permanent + archive

Plan Tiers — Memory Retention

| Plan | 100% Recall | After Expiry | DNA |
|------|------|------|------|
| Free | 2 days | Summary, permanent | Permanent |
| Personal | 7 days | Summary, permanent | Permanent |
| Pro | 14 days | Summary, permanent | Permanent |
| ProMax | 30 days | Summary, permanent | Permanent |
| Team | 60 days | Summary, permanent | Permanent |
| Enterprise | Forever | Permanent | Permanent |
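
A bot that wants to know whether a compressed session is still inside its full-recall window can encode the table as a lookup. This is a sketch: full_recall_available is a hypothetical helper, and treating the final day as still inside the window is an assumption.

```python
# Days of 100% recall per plan, taken from the table above (None = forever).
RETENTION_DAYS = {
    "free": 2,
    "personal": 7,
    "pro": 14,
    "promax": 30,
    "team": 60,
    "enterprise": None,
}


def full_recall_available(plan: str, days_since_compress: int) -> bool:
    """True while the session is still within the plan's 100% recall period."""
    limit = RETENTION_DAYS[plan]
    return limit is None or days_since_compress <= limit


print(full_recall_available("pro", 10))   # inside the 14-day window
print(full_recall_available("free", 3))   # past the 2-day window
```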

Bot Guard Check

POST /v1/bot/guard/check
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "estimated_tokens": 140000,
  "max_tokens": 200000,
  "compress_threshold": 140000
}

Note: compress_threshold accepts 0 (meaning "always spawn"). Omit or set to null to use the default 70%.

Response

{
  "recommendation": "compact_now",
  "should_spawn_new_session": true,
  "spawn_reason": "context_pressure (140000/140000 tokens) — spawn new session, compress old when new reaches 20K",
  "compress_threshold": 140000,
  "usage_percent": 70.0,
  "dna_memories": 8,
  "bootstrap_ready": true
}
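
The compress_threshold semantics from the note above (0 means always spawn, null means the default 70%) can be sketched like this. It is an illustration under assumptions: should_spawn_new_session is a hypothetical local function, and treating "reached" as greater-than-or-equal is inferred from the 140000/140000 example.

```python
from typing import Optional

DEFAULT_COMPRESS_FRACTION = 0.70  # default 70% of the context window


def should_spawn_new_session(estimated_tokens: int,
                             max_tokens: int,
                             compress_threshold: Optional[int] = None) -> bool:
    """Decide whether to spawn; None falls back to 70% of max_tokens."""
    if compress_threshold is None:
        compress_threshold = round(max_tokens * DEFAULT_COMPRESS_FRACTION)
    # A threshold of 0 is always reached, which gives the "always spawn" behaviour.
    return estimated_tokens >= compress_threshold


print(should_spawn_new_session(140_000, 200_000, 140_000))
print(should_spawn_new_session(90_000, 200_000))
```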

Session Compress

When your new session is ready (20K+ context), compress the old session:

POST /v1/bot/session/compress
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "session_id": "session-1-uuid",
  "messages": [
    {"role": "user", "content": "calculate 100k credit over 15 days..."},
    {"role": "assistant", "content": "Math: 50 acc × $20 = $1000..."},
    ...
  ]
}

Idempotent: If session_id was already compressed, returns "status": "already_compressed" without re-processing. Safe to retry on network errors.

Response

{
  "status": "completed",
  "compress_id": "abc123def456",
  "chunks_created": 7,
  "dna_extracted": 3,
  "summaries": ["Session discussed pricing model...", "..."],
  "message": "Session compressed: 7 chunks, 3 DNA facts promoted to permanent memory."
}
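
To make the idempotency guarantee concrete, here is an in-memory stand-in for the behaviour described above: the first call for a session_id does the work, and any repeat returns already_compressed. CompressRegistry and the messages_seen field are hypothetical; the real server's storage and response fields are as documented, not as sketched here.

```python
class CompressRegistry:
    """In-memory stand-in for the server's record of compressed sessions."""

    def __init__(self) -> None:
        self._done: dict[str, dict] = {}

    def compress(self, session_id: str, messages: list[dict]) -> dict:
        if session_id in self._done:
            # Repeat call (for example a retry after a network error): no re-processing.
            return {"status": "already_compressed"}
        result = {"status": "completed", "messages_seen": len(messages)}
        self._done[session_id] = result
        return result


registry = CompressRegistry()
msgs = [{"role": "user", "content": "hi"}]
print(registry.compress("session-1", msgs)["status"])
print(registry.compress("session-1", msgs)["status"])
```

Because a retry is harmless, a client can simply resend the same request after a timeout without tracking whether the first attempt landed.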

Bot Recall

POST /v1/bot/recall
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "query": "pricing calculation yesterday",
  "depth": "deep",
  "limit": 10
}

Smart layered recall — DNA, session memory, regular facts, and archive are merged with priority weighting (all final scores capped at 1.0). Recent sessions get an automatic freshness boost.
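
A rough sketch of priority weighting with a 1.0 cap is shown below. The layer weights are invented for illustration (the document does not publish them), as are the merge_layers name and the hit dictionary shape. The example also shows how several strong hits can tie at 1.0 after capping.

```python
# Hypothetical weights; the real values are not documented.
LAYER_WEIGHTS = {"dna": 1.5, "session": 1.2, "fact": 1.0, "archive": 0.8}


def merge_layers(hits: list[dict]) -> list[dict]:
    """Weight each hit by its layer, cap at 1.0, and sort best-first."""
    merged = []
    for hit in hits:
        weighted = hit["score"] * LAYER_WEIGHTS[hit["layer"]]
        merged.append({**hit, "score": min(weighted, 1.0)})
    # sorted() is stable, so tied scores keep their input order.
    return sorted(merged, key=lambda h: h["score"], reverse=True)


hits = [
    {"layer": "archive", "score": 0.5},
    {"layer": "dna", "score": 0.9},
    {"layer": "session", "score": 0.95},
]
for hit in merge_layers(hits):
    print(hit["layer"], hit["score"])
```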

Bot Bootstrap

POST /v1/bot/guard/bootstrap
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "task": "continue pricing discussion",
  "mode": "default"
}

Modes:

| Mode | Size | Content |
|------|------|------|
| default | ~5.5K tokens | DNA memories + session summaries |
| deep | ~25K tokens | DNA + summaries + top relevant chunk (full) |

Full Integration Example (Python)

import asyncio

from memoryai import AsyncMemoryAI


def spawn_new_session():
    # Platform-specific placeholder: start a fresh session and return its handle
    return "new-session-id"


async def main(current_tokens, old_session_messages):
    mem = AsyncMemoryAI(api_key="hm_sk_your_key", base_url="https://memoryai.dev")

    # 1. Bootstrap new session
    ctx = await mem.bot_bootstrap(task="my project", mode="default")
    # → DNA + recent session summaries

    # 2. Every message: recall
    memories = await mem.bot_recall("user question here", depth="deep")
    # → DNA + sessions + regular facts + archive (merged)

    # 3. Every 15 messages: check guard
    guard = await mem.bot_guard_check(
        estimated_tokens=current_tokens,
        max_tokens=200000,
    )

    # 4. When spawn signal:
    if guard.get("should_spawn_new_session"):
        new_session = spawn_new_session()

        # ... new session works normally ...
        # ... when new session is ready:

        result = await mem.bot_session_compress(
            session_id="old-session-id",
            messages=old_session_messages,
        )
        # → Old session compressed, DNA promoted to permanent memory
        # → New session recalls them via /v1/bot/recall


asyncio.run(main(
    current_tokens=140000,
    old_session_messages=[{"role": "user", "content": "calculate 100k credit over 15 days..."}],
))

What Happens Over Time

| Time | What Bot Recalls | Quality |
|------|------|------|
| Day 1 (just compressed) | DNA + full session chunks (max freshness) | 100% |
| Day 3 | DNA + session chunks (with freshness) | 100% |
| After plan expiry | DNA + detailed summary | 90%+ |
| Forever | DNA (preferences, decisions) | Core facts |
Frequency Promotion

Frequently recalled chunks are automatically promoted to permanent memory as a summary, much as often-used memories persist in the human brain.

Need help?

Email: [email protected] | GitHub: github.com/memoryai-dev/memoryai

memoryai v3.0.0 — A living brain for AI agents