โ† Back to AI & LLMs
AI & LLMs by @emberdesire

jasper-recall

Local RAG system for agent memory using ChromaDB

0
Source Code

Jasper Recall v0.2.3

Local RAG (Retrieval-Augmented Generation) system for AI agent memory. Gives your agent the ability to remember and search past conversations.

New in v0.2.2: Shared ChromaDB Collections โ€” separate collections for private, shared, and learnings content. Better isolation for multi-agent setups.

New in v0.2.1: Recall Server โ€” HTTP API for Docker-isolated agents that can't run CLI directly.

New in v0.2.0: Shared Agent Memory โ€” bidirectional learning between main and sandboxed agents with privacy controls.

When to Use

  • Memory recall: Search past sessions for context before answering
  • Continuous learning: Index daily notes and decisions for future reference
  • Session continuity: Remember what happened across restarts
  • Knowledge base: Build searchable documentation from your agent's experience

Quick Start

Setup

One command installs everything:

npx jasper-recall setup

This creates:

  • Python venv at ~/.openclaw/rag-env
  • ChromaDB database at ~/.openclaw/chroma-db
  • CLI scripts in ~/.local/bin/
  • OpenClaw plugin config in openclaw.json

Why Python?

The core search and embedding functionality uses Python libraries:

  • ChromaDB โ€” Vector database for semantic search
  • sentence-transformers โ€” Local embedding models (no API needed)

These are the gold standard for local RAG. There are no good Node.js equivalents that work fully offline.

Why a Separate Venv?

The venv at ~/.openclaw/rag-env provides:

Benefit Why It Matters
Isolation Won't conflict with your other Python projects
No sudo Installs to your home directory, no root needed
Clean uninstall Delete the folder and it's gone
Reproducibility Same versions everywhere

The dependencies are heavy (~200MB total with the embedding model), but this is a one-time download that runs entirely locally.

Basic Usage

Search your memory:

recall "what did we decide about the API design"
recall "hopeIDS patterns" --limit 10
recall "meeting notes" --json

Index your files:

index-digests  # Index memory files into ChromaDB

Create session digests:

digest-sessions          # Process new sessions
digest-sessions --dry-run  # Preview what would be processed

How It Works

Three Components

  1. digest-sessions โ€” Extracts key info from session logs (topics, tools used)
  2. index-digests โ€” Chunks and embeds markdown files into ChromaDB
  3. recall โ€” Semantic search across your indexed memory

What Gets Indexed

By default, indexes files from ~/.openclaw/workspace/memory/:

  • *.md โ€” Daily notes, MEMORY.md
  • session-digests/*.md โ€” Session summaries
  • repos/*.md โ€” Project documentation
  • founder-logs/*.md โ€” Development logs (if present)

Embedding Model

Uses sentence-transformers/all-MiniLM-L6-v2:

  • 384-dimensional embeddings
  • ~80MB download on first run
  • Runs locally, no API needed

Agent Integration

Memory-Augmented Responses

# Before answering questions about past work
results = exec("recall 'project setup decisions' --json")
# Include relevant context in your response

Automated Indexing (Heartbeat)

Add to HEARTBEAT.md:

## Memory Maintenance
- [ ] New session logs? โ†’ `digest-sessions`
- [ ] Memory files updated? โ†’ `index-digests`

Cron Job

Schedule regular indexing:

{
  "schedule": { "kind": "cron", "expr": "0 */6 * * *" },
  "payload": {
    "kind": "agentTurn",
    "message": "Run index-digests to update the memory index"
  },
  "sessionTarget": "isolated"
}

Shared Agent Memory (v0.2.0+)

For multi-agent setups where sandboxed agents need access to some memories:

Memory Tagging

Tag entries in daily notes:

## 2026-02-05 [public] - Feature shipped
This is visible to all agents.

## 2026-02-05 [private] - Personal note
This is main agent only (default if untagged).

## 2026-02-05 [learning] - Pattern discovered
Learnings shared bidirectionally between agents.

ChromaDB Collections (v0.2.2+)

Memory is stored in separate collections for isolation:

Collection Purpose Who accesses
private_memories Main agent's private content Main agent only
shared_memories [public] tagged content Sandboxed agents
agent_learnings Learnings from any agent All agents
jasper_memory Legacy unified (backward compat) Fallback

Collection selection:

# Main agent (default) - searches private_memories
recall "api design"

# Sandboxed agents - searches shared_memories only
recall "product info" --public-only

# Search learnings only
recall "patterns" --learnings

# Search all collections (merged results)
recall "everything" --all

# Specific collection
recall "something" --collection private_memories

# Legacy mode (single collection)
recall "old way" --legacy

Sandboxed Agent Access

# Sandboxed agents use --public-only
recall "product info" --public-only

# Main agent can see everything
recall "product info"

Moltbook Agent Setup (v0.4.0+)

For the moltbook-scanner (or any sandboxed agent), use the built-in setup:

# Configure sandboxed agent with --public-only restriction
npx jasper-recall moltbook-setup

# Verify the setup is correct
npx jasper-recall moltbook-verify

This creates:

  • ~/bin/recall โ€” Wrapper that forces --public-only flag
  • shared/ โ€” Symlink to main workspace's shared memory

The sandboxed agent can then use:

~/bin/recall "query"  # Automatically restricted to public memories

Privacy model:

  1. Main agent tags memories as [public] or [private] in daily notes
  2. sync-shared extracts [public] content to memory/shared/
  3. Sandboxed agents can ONLY search the shared collection

Privacy Workflow

# Check for sensitive data before sharing
privacy-check "text to scan"
privacy-check --file notes.md

# Extract [public] entries to shared directory
sync-shared
sync-shared --dry-run  # Preview first

CLI Reference

recall

recall "query" [OPTIONS]

Options:
  -n, --limit N     Number of results (default: 5)
  --json            Output as JSON
  -v, --verbose     Show similarity scores and collection source
  --public-only     Search shared_memories only (sandboxed agents)
  --learnings       Search agent_learnings only
  --all             Search all collections (merged results)
  --collection X    Search specific collection by name
  --legacy          Use legacy jasper_memory collection

serve (v0.2.1+)

npx jasper-recall serve [OPTIONS]

Options:
  --port, -p N    Port to listen on (default: 3458)
  --host, -h H    Host to bind (default: 127.0.0.1)

Starts HTTP API server for Docker-isolated agents.

Endpoints:
  GET /recall?q=query&limit=5    Search memories
  GET /health                    Health check

Security: public_only=true enforced by default.
Set RECALL_ALLOW_PRIVATE=true to allow private queries.

Example (from Docker container):

curl "http://host.docker.internal:3458/recall?q=product+info"

privacy-check (v0.2.0+)

privacy-check "text"     # Scan inline text
privacy-check --file X   # Scan a file

Detects: emails, API keys, internal IPs, home paths, credentials.
Returns: CLEAN or list of violations.

sync-shared (v0.2.0+)

sync-shared [OPTIONS]

Options:
  --dry-run    Preview without writing
  --all        Process all daily notes

Extracts [public] tagged entries to memory/shared/.

index-digests

index-digests

Indexes markdown files from:
  ~/.openclaw/workspace/memory/*.md
  ~/.openclaw/workspace/memory/session-digests/*.md
  ~/.openclaw/workspace/memory/repos/*.md
  ~/.openclaw/workspace/memory/founder-logs/*.md

Skips files that haven't changed (content hash check).

digest-sessions

digest-sessions [OPTIONS]

Options:
  --dry-run    Preview without writing
  --all        Process all sessions (not just new)
  --recent N   Process only N most recent sessions

Configuration

Custom Paths

Set environment variables:

export RECALL_WORKSPACE=~/.openclaw/workspace
export RECALL_CHROMA_DB=~/.openclaw/chroma-db
export RECALL_SESSIONS_DIR=~/.openclaw/agents/main/sessions

Chunking

Default settings in index-digests:

  • Chunk size: 500 characters
  • Overlap: 100 characters

Security Considerations

โš ๏ธ Review these settings before enabling in production:

Server Binding

The serve command defaults to 127.0.0.1 (localhost only). Do not use --host 0.0.0.0 unless you explicitly intend to expose the API externally and have secured it appropriately.

Private Memory Access

The server enforces public_only=true by default. The env var RECALL_ALLOW_PRIVATE=true bypasses this restriction. Never set this on public/shared hosts โ€” it exposes your private memories to any client.

autoRecall Plugin

When autoRecall: true in the OpenClaw plugin config, memories are automatically injected before every agent message. Consider:

  • Set publicOnly: true in plugin config for sandboxed agents
  • Review which collections will be searched
  • Use minScore to filter low-relevance injections

What's automatically skipped (no recall triggered):

  • Heartbeat polls (HEARTBEAT, Read HEARTBEAT.md, HEARTBEAT_OK)
  • Messages containing NO_REPLY
  • Messages < 10 characters
  • Agent-to-agent messages (cron jobs, workers, spawned agents)
  • Automated reports (๐Ÿ“‹ PR Review, ๐Ÿค– Codex Watch, ANNOUNCE_*)
  • Messages from senders starting with agent: or worker-

Safer config for untrusted contexts:

"jasper-recall": {
  "enabled": true,
  "config": {
    "autoRecall": true,
    "publicOnly": true,
    "minScore": 0.5
  }
}

Environment Variables

The following env vars affect behavior โ€” set them explicitly rather than relying on defaults:

Variable Default Purpose
RECALL_WORKSPACE ~/.openclaw/workspace Memory files location
RECALL_CHROMA_DB ~/.openclaw/chroma-db Vector database path
RECALL_SESSIONS_DIR ~/.openclaw/agents/main/sessions Session logs
RECALL_ALLOW_PRIVATE false Server private access
RECALL_PORT 3458 Server port
RECALL_HOST 127.0.0.1 Server bind address

Dry-Run First

Before sharing or syncing, use dry-run options to preview what will be exposed:

privacy-check --file notes.md     # Scan for sensitive data
sync-shared --dry-run             # Preview public extraction
digest-sessions --dry-run         # Preview session processing

Sandboxed Environments

For maximum isolation, run jasper-recall in a container or dedicated account:

  • Limits risk of accidental data exposure
  • Separates private memory from shared contexts
  • Recommended for multi-agent setups with untrusted agents

Troubleshooting

"No index found"

index-digests  # Create the index first

"Collection not found"

rm -rf ~/.openclaw/chroma-db  # Clear and rebuild
index-digests

Model download slow First run downloads ~80MB model. Subsequent runs are instant.

Links