Now live — Cloud-hosted MCP server

Your AI should
remember everything

Zero-latency, lossless Context Graph for Claude Code, Cursor & Devin. SUMA replaces noisy chat history with explicitly tracked Decision Traces and Organizational Judgment.

New: QMS Edge Integration. Log in via Google to instantly bridge your SQUAD ticket tracking. SUMA automatically preloads your assigned tickets, team context, and active sprint blockers directly into your IDE before you even type a single keystroke.

AI-ASSISTED SETUP
🤖

Don't want to read the docs? Hit Ctrl+A to select this page, copy it, then paste into Cursor or Claude Code.

Your AI will read our setup instructions and automatically configure itself to use SUMA memory. The ultimate "show, don't tell" for autonomous agents.

settings.json
{
  "mcpServers": {
    "suma-memory": {
      "url": "https://sumapro-mcp.quadframe.work/mcp",
      "headers": {
        "Authorization": "Bearer sk_live_..."
      }
    }
  }
}
Without SUMA
Monday: Explain your codebase to Claude. 10 min, 5000 tokens
Tuesday: New session. Explain again. 10 min, 5000 tokens
Switch to Cursor? Start from zero.
15,000 tokens of context. Every. Single. Message.
With SUMA Pro
Explain once. SUMA ingests it as graph nodes.
Next session: AI calls suma_search. 200 tokens, instant.
Switch tools freely. Same memory everywhere.
98% reduction in context tokens.
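The arithmetic behind the claim above can be sanity-checked in a few lines. This is a hypothetical back-of-envelope sketch using the token figures quoted on this page, not a measured benchmark:

```python
# Token figures come from the comparison above (15,000 tokens of repeated
# context vs. a 200-token suma_search retrieval); they are illustrative.
FULL_CONTEXT_TOKENS = 15_000  # re-sent context without SUMA
SEARCH_TOKENS = 200           # one suma_search retrieval with SUMA

def reduction(before: int, after: int) -> float:
    """Percentage reduction in context tokens."""
    return (1 - after / before) * 100

saved = reduction(FULL_CONTEXT_TOKENS, SEARCH_TOKENS)
print(f"{saved:.1f}% reduction")  # prints "98.7% reduction"
```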

Not just memory. A Context Graph.

SUMA doesn't just store text. It builds a lossless Context Graph — Decision Traces, Organizational Judgment, and structural relationships — so your AI retrieves exactly what matters, not everything you ever said.

Topological Bindings

Finds connections flat-file search can't. Ask "Who knows my deployment pipeline?" — SUMA resolves the structural path and returns the answer.
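Multi-hop structural lookup of this kind can be sketched with a plain breadth-first search. The graph shape, node names, and depth semantics below are illustrative assumptions, not SUMA's internal data model:

```python
from collections import deque

# Toy knowledge graph: who/what is structurally connected to the pipeline.
graph = {
    "deployment_pipeline": ["docker", "alice"],
    "alice": ["team_platform"],
    "docker": [],
    "team_platform": [],
}

def multi_hop(start: str, depth: int) -> set[str]:
    """Return every node reachable from `start` within `depth` hops."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # stop expanding past the requested depth
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, d + 1))
    return seen - {start}

# "Who knows my deployment pipeline?" resolves through alice to her team:
multi_hop("deployment_pipeline", 2)  # {"docker", "alice", "team_platform"}
```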

Contextual Gravity

Not all knowledge is equal. SUMA's proprietary gravity model pulls the highest-impact context to the surface — so your AI always gets what matters most.

QMS Zero-Shot Context

Cross-links with QMS Edge using unified Google Identity. SUMA fetches your active tickets and prepends them to your Claude session instantly. The AI already knows what you're working on.

Structural Continuity

Surfaces the most relevant context over time, not just the most recent or most frequent. The context that counts stays within reach.

Paradox Detection

Finds contradictions in your knowledge graph. "You said X last week but Y today." Resolves conflicts automatically.
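In spirit, a paradox check boils down to flagging two statements that assert different values for the same subject and relation. A toy sketch, with an invented data model:

```python
# Each fact: (subject, relation, value, when stated). Invented for illustration.
facts = [
    ("db", "is", "Postgres", "last week"),
    ("db", "is", "MySQL", "today"),
]

def paradoxes(facts):
    """Flag facts that contradict an earlier value for the same key."""
    seen, out = {}, []
    for subj, rel, obj, when in facts:
        key = (subj, rel)
        if key in seen and seen[key][0] != obj:
            prev_obj, prev_when = seen[key]
            out.append(f"You said {prev_obj} {prev_when} but {obj} {when}.")
        seen[key] = (obj, when)
    return out
```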

Pattern Detection

Detects repeated behaviors over time. "You always deploy on Fridays." Cross-pattern causation analysis.
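A minimal sketch of the temporal side: count which weekday an event recurs on. The dates are made up, and a real pattern engine would do far more than a counter:

```python
from collections import Counter
from datetime import date

# Hypothetical deploy dates (all Fridays in this example).
deploys = [date(2025, 1, 3), date(2025, 1, 10), date(2025, 1, 17)]

weekday_counts = Counter(d.strftime("%A") for d in deploys)
top_day, hits = weekday_counts.most_common(1)[0]
# top_day == "Friday", hits == 3 -> "You always deploy on Fridays."
```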

Enterprise B2B

One algorithm. Five industry profiles.

SUMA's K-WIL gravity engine dynamically shifts its weighting based on your industry. A healthcare AI prioritizes patient context at 0.95 weight. A dev team AI prioritizes the current sprint at 0.90. Same math, different gravity.

👤

Personal

Default for SUMA Companion. Family, health, emotions weighted highest.

Relationships 0.90
Health 0.80
Trauma 0.85
💻

Enterprise Dev

For dev teams. Current sprint and architecture weighted highest.

Current Work 0.90
Architecture 0.85
Team 0.70
🏥

Healthcare

For clinical settings. Patient health and safety weighted highest.

Health 0.95
Trauma 0.85
Identity 0.70
🏛

Government

For civic tech. Financial/civic compliance weighted highest.

Financial 0.85
Health 0.80
Identity 0.75
🎓

Education

For schools/universities. Learning progress weighted highest.

Learning 0.85
Achievements 0.80
Identity 0.75

Enterprise clients can request custom weight profiles tuned to their specific domain.
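The "same math, different gravity" idea can be sketched as one scoring function parameterized by a per-profile weight table. The weights below are taken from the profiles above; the scoring formula and the 0.5 fallback weight are invented for illustration and are not SUMA's K-WIL implementation:

```python
# Per-industry sphere weights, as listed in the profiles above.
PROFILES = {
    "healthcare":     {"health": 0.95, "trauma": 0.85, "identity": 0.70},
    "enterprise_dev": {"current_work": 0.90, "architecture": 0.85, "team": 0.70},
}

def score(relevance: float, sphere: str, profile: str) -> float:
    """Scale raw relevance by the sphere's weight in the active profile."""
    return relevance * PROFILES[profile].get(sphere, 0.5)  # 0.5 = assumed default

# The same node ranks differently under different profiles:
# score(0.8, "health", "healthcare")     -> 0.76
# score(0.8, "health", "enterprise_dev") -> 0.40 (falls back to default weight)
```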

Four steps. Three minutes.

1

Get your API key

Try free for 30 days or subscribe at $4.99/month. Your sk_live_... key appears instantly.

2

Connect your AI tool (one time)

Add to .mcp.json in your project root. Works with Claude Code, Cursor, or any MCP client.

// .mcp.json (create this file in your project root)
{
  "mcpServers": {
    "suma-memory": {
      "type": "http",
      "url": "https://sumapro-mcp.quadframe.work/mcp",
      "headers": {
        "Authorization": "Bearer sk_live_YOUR_KEY_HERE"
      }
    }
  }
}

Restart your AI tool after adding the config, then type /mcp to verify the server shows as connected.

3

Add orchestration rules (recommended)

Add to your project's CLAUDE.md or .cursorrules so your AI uses SUMA automatically — no manual prompting.

## SUMA MCP Rules
You are connected to SUMA Memory via MCP.
1. Before answering project questions, call suma_talk for context.
2. After solving a bug or design, call suma_ingest to store it.
3. For recurring issues, use suma_patterns for analysis.
4

Talk naturally. SUMA learns and remembers.

Just work normally. Your AI calls SUMA in the background — ingesting new knowledge, searching for context, building your graph. Next session, it remembers everything. 200 tokens instead of 15,000.

6 Tools. Infinite memory.

Your AI calls these tools automatically via the Model Context Protocol.

Why only 6 tools? SUMA replaces bloated API surfaces with 6 mathematically pure graph primitives — specifically designed so your LLM never gets confused about which tool to call.

suma_search Mathematical graph search with precision recall
// AI calls this automatically
suma_search(
  query: "deployment pipeline",
  depth: 2,       // multi-hop traversal
  limit: 5,       // top 5 results
  sphere: "work"   // scope to work context
)
// Returns: nodes + precision scores + contextual signals + structural paths
?format=toon TOON serialization — reduces response tokens by 30–60% for LLM cost savings
suma_ingest Add knowledge to the graph
suma_ingest(
  text: "Our auth system uses JWT tokens with role-based access control",
  sphere: "architecture",
  extract_relationships: true  // auto-extracts triplets
)
// Automatically creates: Auth → uses → JWT, Auth → has → role_based_access
suma_talk Bidirectional — searches AND learns in one call
suma_talk(
  message: "We decided to use microservices for the payment module",
  persona: "companion"
)
// Returns: relevant context + learned nodes + structural analysis
suma_node Get a node with its full structural profile
suma_node(
  node_id: "AUTH_SYSTEM",
  include_neighbors: true,
  include_weight_profile: true
)
// Returns: content, sphere, neighbors, structural weight, contextual signal
suma_patterns Detect behavioral patterns and causation
suma_patterns(
  query: "deployment",
  include_cross_patterns: true
)
// Returns: temporal patterns, cross-pattern causation
// e.g., "deploy_friday" CAUSES "hotfix_monday"
suma_forget Delete or correct mistakes in the graph
suma_forget(
  node_id: "WORK_abc123",    // specific node
  // OR
  keyword: "outdated_api"    // fuzzy match
)
// Removes incorrect or outdated knowledge from your graph
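To make the "auto-extracts triplets" behavior of suma_ingest concrete, here is a deliberately naive sketch. A real extractor would use NLP; this regex version handles only a few simple verb shapes and is purely illustrative:

```python
import re

def extract_triplets(text: str) -> list[tuple[str, str, str]]:
    """Pull (subject, verb, object) triplets for a few known verbs."""
    pattern = r"(\w[\w ]*?)\s+(uses|has|depends on)\s+(\w[\w ]*)"
    return [tuple(m.groups()) for m in re.finditer(pattern, text, re.I)]

# "Our auth system uses JWT tokens" -> one (subject, "uses", object) triplet,
# the kind of edge suma_ingest's extract_relationships flag describes.
extract_triplets("Our auth system uses JWT tokens")
```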

Setup in 2 minutes.

One command. Perfect stability. Zero disconnects.

1

Get your API key

Subscribe or try free — your key is emailed instantly.

2

Add to your IDE config

Create .cursor/mcp.json in your project root:

{
  "mcpServers": {
    "suma-memory": {
      "command": "npx",
      "args": ["suma-mcp-proxy", "--key=sk_live_YOUR_KEY"]
    }
  }
}
🔐
Firewall-Piercing Local Bridge

The suma-mcp-proxy runs entirely over local stdio and makes only outbound HTTPS calls to SUMA Cloud. Because nothing listens for inbound connections, it works behind corporate firewalls that block incoming MCP traffic. Enterprise-ready by design.

3

Restart your IDE

Restart Claude Code or Cursor. Your AI now has persistent memory across all sessions.

The Magic

The Zero-Prompt Agent Architecture

Standard AI requires you to constantly say "Hey AI, please remember this for later." With SUMA's orchestration rules, you never type the word "remember" again. Your AI becomes a continuous background archivist.

🔍 PRE-FLIGHT PROTOCOL (Read Before Acting)
### Autonomous Pre-Flight Triggers
BEFORE taking action, query memory in these scenarios:

1. Deployment Pre-Check
   Before running deploy commands (docker push, xcrun altool, gcloud deploy)
   → Call suma_search("deployment credentials API key")
   NEVER guess credentials. Always verify first.

2. Friction Pre-Check
   When you see an error you might have solved before
   → Call suma_talk("error: [paste error signature]")
   Check if this was already solved. Don't repeat mistakes.

3. Architecture Pre-Check
   Before creating new components or answering "how should we build this?"
   → Call suma_search("component pattern [area]")
   Verify existing patterns. Maintain consistency.
⚡ POST-FLIGHT PROTOCOL (Write After Completing)
### Autonomous Ingestion Triggers
AFTER completing work, archive to memory in these scenarios:

1. Deployment → Shipped code to production? Call suma_ingest
2. Architectural → Finalized a design decision? Call suma_ingest
3. Session End → User says "done for today"? Call suma_ingest
4. Friction → Same bug twice? Archive what NOT to do.
5. Vision → New business goal stated? Archive the "why".

Pre-Flight = Read first. Post-Flight = Write after. Your AI never guesses and never forgets.

See what your AI is thinking.

Your subscription includes the SUMA Pro Dashboard — watch your AI build your knowledge graph in real-time.

4
Knowledge Nodes
$12.40
Saved This Month
3
Knowledge Spheres

Recent Activity

📥
Ingested 40 tokens → 1 node
"Authentication uses JWT tokens with RBAC..."
🔍
Searched → 3 results
"deployment pipeline"

Watch how Claude and Cursor map your codebase. Inspect K-WIL node weights. See exactly how much you're saving.

One price. One claim.

Can reduce your AI API bill by up to 80% — based on K-WIL retrieval precision benchmarks. Cloud-hosted. No setup. Cancel anytime.

SUMA Pro — Cloud
$4.99 /month
Full brain. Secure Cloud API. Zero installation.
Secure Cloud API — plug into Claude Code, Cursor, Devin
Contextual Gravity Engine
Topological Binding search
Paradox + pattern detection
K-WIL weighting (patent pending)
MCP native (3-line config, no Docker)
20,000 queries/month
Cancel anytime. API key emailed immediately.

"Your data is encrypted in transit and at rest. Multi-tenant isolation enforced at the database layer. You own your story."

Enterprise / air-gapped deployment? Contact us

Copy. Paste. Your AI remembers.

Add this template to your project's CLAUDE.md or .cursorrules. Your AI becomes a proactive memory engine.

CLAUDE.md — paste at the bottom of your existing file
# [SYSTEM DIRECTIVE: SUMA MEMORY INTEGRATION]
You are connected to the developer's Long-Term Knowledge Graph
via the suma_ingest and suma_talk MCP tools.

## Proactive Rules:
1. Before answering architecture questions, SILENTLY call suma_talk.
2. After solving a bug or finalizing a design, call suma_ingest
   with a 50-word compressed "Gist" in the relevant sphere.
3. For recurring issues, use suma_patterns to analyze causation.
4. Use [System]_[Component]_[Status] naming for nodes.
5. Use universal verbs: DEPENDS_ON, IMPLEMENTS, RESOLVES, BUILT_BY,
   USES, HAS_ISSUE, MANAGES, LEADS, DEPLOYED_TO, LEARNING.

## Auto-Ingest Triggers (no manual prompting needed):
- New database schema or design pattern → ingest to "architecture"
- Bug fixed or ticket resolved → ingest root cause + solution
- Career goal or learning intent mentioned → ingest to "personal"
- Same topic debugged 3+ times → create a friction node

Try this. We dare you.

Two machines. One brain. Zero sync scripts.

💻
Machine A (laptop)

Add your API key to Claude Code. Tell it: "Our auth system uses JWT with role-based access. The deploy pipeline runs on Docker."

🚶
Walk to Machine B (desktop)

Add the same API key to Cursor. Ask: "How does our auth work, and how do we deploy?"

It just works

Cursor answers: "JWT with role-based access, deployed via Docker." Same brain. Different machine.

Same API key = same brain. Works across Claude Code, Cursor, Devin, or any MCP client. No sync. No git. No iCloud.

Transparent security. No crypto-jargon.

We tell you exactly how your data is protected. No buzzwords.

Data in Transit

HTTPS/TLS encryption on every request. Same standard as Stripe, GitHub, and Slack.

Data at Rest

Google Cloud KMS server-side encryption. Your data is encrypted on disk by Google's infrastructure.

Multi-Tenant Isolation

Every query filtered by your API key's org_id. You can never see another customer's data. Mathematically enforced.
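The isolation rule reduces to one idea: every read is filtered by the caller's org_id. A minimal sketch with invented rows and field names:

```python
# Invented multi-tenant table; field names are illustrative only.
NODES = [
    {"org_id": "org_a", "content": "JWT auth"},
    {"org_id": "org_b", "content": "billing schema"},
]

def query(org_id: str) -> list[dict]:
    """Return only the rows owned by the calling tenant."""
    return [n for n in NODES if n["org_id"] == org_id]

# org_a sees its own node; an unknown tenant sees nothing.
```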

V1 uses the Shared Trust Cloud Model (same as GitHub, Notion, Linear). V2 Zero-Knowledge Local Encryption coming Q3 2026.

Stop re-explaining your code.

Give your AI a brain that remembers.

Try Free — 30 Days