← Back to Clawdbot Tools
Clawdbot Tools by @orosha-ai

agent-observability-dashboard


Agent Observability Dashboard 📊

Unified observability for OpenClaw agents: metrics, traces, and performance insights.

What It Does

OpenClaw agents need production-grade visibility. Several platforms exist (Langfuse, LangSmith, AgentOps), but there is no unified view across them.

Agent Observability Dashboard provides:

  • Metrics tracking: latency, success rate, token usage, error counts
  • Trace visualization: tool chains, decision flows, session timelines
  • Cross-agent aggregation: compare performance across multiple agents/sessions
  • Exportable reports: JSON, CSV, markdown for human review
  • Alert thresholds: notify when metrics exceed limits
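
As an illustration of the exportable-reports feature, the sketch below renders the same metric rows as JSON, CSV, and a markdown table using only the standard library. The row fields and the function names (`to_json`, `to_csv`, `to_markdown`) are assumptions made for this example; they are not taken from `scripts/observability.py`.

```python
import csv
import io
import json

# Hypothetical metric rows; the field names are assumptions for illustration.
ROWS = [
    {"session": "agent:main", "latency_ms": 1234, "success": True, "tokens": 150},
    {"session": "agent:main", "latency_ms": 45, "success": True, "tokens": 0},
]


def to_json(rows):
    """Serialize the rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the rows as CSV with a header taken from the first row."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_markdown(rows):
    """Render the rows as a pipe-delimited markdown table."""
    if not rows:
        return ""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_csv(ROWS))
    print(to_markdown(ROWS))
```

Keeping each format as a pure function over a list of dicts means the same rows can feed a file export, an HTTP response, or a test without changes.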

Problem It Solves

  • No centralized view of OpenClaw agent performance
  • Hard to debug across multiple tool calls
  • No way to compare agents or track regressions
  • Production services get enterprise-grade monitoring; agents need the same

Usage

# Start dashboard server
python3 scripts/observability.py --dashboard

# Record metrics from a session
python3 scripts/observability.py --record --session agent:main --latency 1.5 --success true

# View session trace
python3 scripts/observability.py --trace --session agent:main:12345

# Get performance report
python3 scripts/observability.py --report --period 24h

# Export to CSV
python3 scripts/observability.py --export metrics.csv

# Set alert thresholds
python3 scripts/observability.py --alert --metric latency --threshold 5.0
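
The alert command above pairs a metric name with a numeric limit. A minimal sketch of how such a check could work follows. The `Threshold` class and `breached` function are hypothetical names, and the sketch assumes the threshold and the recorded value share a unit (seconds here, matching `--latency 1.5` and `--threshold 5.0`); the real script may differ.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """A hypothetical alert rule: fire when `metric` exceeds `limit`."""
    metric: str
    limit: float


def breached(thresholds, sample):
    """Return the rules whose metric is present in `sample` and above its limit."""
    return [
        t for t in thresholds
        if t.metric in sample and sample[t.metric] > t.limit
    ]


if __name__ == "__main__":
    rules = [Threshold("latency", 5.0)]  # assumed unit: seconds
    print(breached(rules, {"latency": 1.5}))  # within the limit
    print(breached(rules, {"latency": 7.2}))  # above the limit
```

A metric missing from the sample is treated as "no alert" rather than an error, so one rule set can be applied to records that carry different fields.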

Metrics Tracked

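| Category    | Metric             | Description                 |
|-------------|--------------------|-----------------------------|
| Performance | Latency            | Tool call latency (ms)      |
| Performance | Throughput         | Calls per second            |
| Success     | Success Rate       | % of successful tool calls  |
| Success     | Error Count        | Failed operations           |
| Cost        | Token Usage        | Input + output tokens       |
| Cost        | API Cost           | Estimated cost in USD       |
| Quality     | Hallucinations     | Detected false outputs      |
| Quality     | Corrections Needed | User corrections            |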
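
To make the performance, success, and cost metrics concrete, here is a rough sketch of how they could be derived from a list of per-call records. The record fields, the `summarize` function, and the flat `usd_per_1k_tokens` price are assumptions for illustration only; real pricing usually differs between input and output tokens.

```python
from statistics import mean

# Hypothetical per-call records; the field names are assumptions.
CALLS = [
    {"latency_ms": 1234, "success": True, "input_tokens": 100, "output_tokens": 50},
    {"latency_ms": 45, "success": True, "input_tokens": 0, "output_tokens": 0},
    {"latency_ms": 800, "success": False, "input_tokens": 60, "output_tokens": 0},
]


def summarize(calls, window_seconds, usd_per_1k_tokens=0.0):
    """Aggregate call records observed over `window_seconds` (must be > 0)."""
    total = len(calls)
    failures = sum(1 for c in calls if not c["success"])
    tokens = sum(c["input_tokens"] + c["output_tokens"] for c in calls)
    return {
        "avg_latency_ms": mean(c["latency_ms"] for c in calls) if calls else 0.0,
        "throughput_per_s": total / window_seconds,
        "success_rate_pct": 100.0 * (total - failures) / total if total else 0.0,
        "error_count": failures,
        "token_usage": tokens,  # input + output tokens
        "api_cost_usd": tokens / 1000 * usd_per_1k_tokens,
    }


if __name__ == "__main__":
    print(summarize(CALLS, window_seconds=60, usd_per_1k_tokens=0.002))
```

The quality metrics (hallucinations, corrections needed) are left out because they depend on human or model judgments rather than on fields that can be summed.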

Trace Format

Each tool call is logged with:

  • Timestamp
  • Agent session ID
  • Tool name + parameters
  • Latency
  • Success/failure
  • Token usage
  • Error details (if failed)

Example trace:

{
  "session_id": "agent:main:12345",
  "trace": [
    {
      "timestamp": "2026-01-31T14:00:00Z",
      "tool": "web_search",
      "params": {"query": "agent observability"},
      "latency_ms": 1234,
      "success": true,
      "tokens_used": 150
    },
    {
      "timestamp": "2026-01-31T14:00:02Z",
      "tool": "memory_write",
      "params": {"content": "..."},
      "latency_ms": 45,
      "success": true,
      "tokens_used": 0
    }
  ]
}
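
A trace in this shape is straightforward to post-process. The sketch below loads the example trace and reduces it to a tool chain, totals, and a wall-clock span. `summarize_trace` and `parse_ts` are hypothetical helper names, not part of the tool.

```python
import json
from datetime import datetime

# The example trace from this README, embedded as a string.
TRACE_JSON = """
{
  "session_id": "agent:main:12345",
  "trace": [
    {"timestamp": "2026-01-31T14:00:00Z", "tool": "web_search",
     "params": {"query": "agent observability"},
     "latency_ms": 1234, "success": true, "tokens_used": 150},
    {"timestamp": "2026-01-31T14:00:02Z", "tool": "memory_write",
     "params": {"content": "..."},
     "latency_ms": 45, "success": true, "tokens_used": 0}
  ]
}
"""


def parse_ts(value):
    """Parse an ISO 8601 timestamp; older Pythons reject a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_trace(doc):
    """Reduce a non-empty trace document to per-session totals."""
    steps = doc["trace"]
    start = parse_ts(steps[0]["timestamp"])
    end = parse_ts(steps[-1]["timestamp"])
    return {
        "session_id": doc["session_id"],
        "tool_chain": [s["tool"] for s in steps],
        "total_latency_ms": sum(s["latency_ms"] for s in steps),
        "total_tokens": sum(s["tokens_used"] for s in steps),
        "failed_tools": [s["tool"] for s in steps if not s["success"]],
        "span_seconds": (end - start).total_seconds(),
    }


if __name__ == "__main__":
    print(summarize_trace(json.loads(TRACE_JSON)))
```

Note that the span is measured between the first and last call start times, so it does not include the latency of the final call.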

Architecture

┌─────────────────┐
│  Instrumentation│  ← Auto-capture from OpenClaw logs
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Metrics Store  │  ← SQLite/InfluxDB for time-series
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Analytics      │  ← Aggregations, trends, anomalies
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Dashboard UI   │  ← Web interface (Flask/FastAPI)
└─────────────────┘
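
The middle two layers of the diagram (metrics store and analytics) can be sketched with the standard-library `sqlite3` module. The table name `tool_calls`, its columns, and the helper functions below are assumptions chosen to mirror the trace format; the project's actual schema is not documented here.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite metrics store with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tool_calls (
               ts          TEXT    NOT NULL,
               session_id  TEXT    NOT NULL,
               tool        TEXT    NOT NULL,
               latency_ms  REAL    NOT NULL,
               success     INTEGER NOT NULL,
               tokens_used INTEGER NOT NULL
           )"""
    )
    return conn


def record(conn, ts, session_id, tool, latency_ms, success, tokens_used):
    """Store one tool call; `success` is saved as 0 or 1."""
    conn.execute(
        "INSERT INTO tool_calls VALUES (?, ?, ?, ?, ?, ?)",
        (ts, session_id, tool, latency_ms, int(success), tokens_used),
    )
    conn.commit()


def per_session_stats(conn):
    """Analytics step: call count, mean latency, and success rate per session."""
    rows = conn.execute(
        """SELECT session_id, COUNT(*), AVG(latency_ms), AVG(success) * 100.0
           FROM tool_calls
           GROUP BY session_id
           ORDER BY session_id"""
    ).fetchall()
    return [
        {"session_id": s, "calls": n, "avg_latency_ms": lat, "success_rate_pct": rate}
        for s, n, lat, rate in rows
    ]


if __name__ == "__main__":
    conn = open_store()
    record(conn, "2026-01-31T14:00:00Z", "agent:main:12345", "web_search", 1234, True, 150)
    record(conn, "2026-01-31T14:00:02Z", "agent:main:12345", "memory_write", 45, True, 0)
    print(per_session_stats(conn))
```

Storing `success` as an integer lets the success rate fall out of a plain `AVG`, and grouping by `session_id` gives the cross-session comparison without extra code.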

Requirements

  • Python 3.9+
  • flask (for dashboard web UI)
  • pandas (for analytics)
  • influxdb-client (optional, for production storage)

Installation

# Clone repo and enter it
git clone https://github.com/orosha-ai/agent-observability-dashboard
cd agent-observability-dashboard

# Install dependencies
pip install flask pandas influxdb-client

# Run dashboard
python3 scripts/observability.py --dashboard
# Open http://localhost:5000

Inspiration

  • Dynatrace AI Observability App: enterprise-grade unified observability
  • Langfuse vs AgentOps benchmarks: comparison of platforms
  • Microsoft .NET tracing guide: practical implementation patterns
  • OpenLLMetry: OpenTelemetry integration for LLMs

Local-Only Promise

  • Metrics stored locally (SQLite/InfluxDB)
  • Dashboard runs locally
  • No data sent to external services

Version History

  • v0.1: MVP with metrics tracking, trace visualization, dashboard UI
  • Roadmap: InfluxDB integration, anomaly detection, multi-agent comparison