AI agent debugger

Your agent failed. Find out why in 30 seconds.

Paste a trace. Get the exact step that failed, what broke next, and how to fix it. Debug in the browser, record in code, and gate in CI.

Analysis complete · 1 fault detected · 3 steps affected · Fix ready

trace_id: trc_8f2d94a1c3e7
agent: revenue-summarizer
total: 6.93s

Span detail: fetch_document (faulted)

span_id: span_d4e5f6
tool: fetch_document
duration: 4.82s
status: FAULT

input:
{
  "document_id": "rev_q3_2024",
  "format": "markdown"
}

error: TimeoutError: fetch_document exceeded 5000ms threshold (TOOL_TIMEOUT)

Analysis

Debugger output

Root Cause

fetch_document timed out after 4.8s. No retry policy is configured on this tool call. The agent proceeded without a valid document source.

What broke next

vector_search: returned 0 results, fallback activated
llm_completion: completed with stale cache, output degraded
Output: marked incomplete, missing source data

Suggested Fix

fetch_document:
  timeout: 2000
  retry:
    attempts: 3
    backoff: "exponential"
  fallback_source: "cache_store"
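The same policy can be sketched in plain Python (a minimal sketch, not Sepurux's API; `fetch_fn` and the cache fallback are hypothetical stand-ins for your own tool call):

```python
import time

def fetch_with_retry(fetch_fn, doc_id, attempts=3, timeout=2.0,
                     base_delay=1.0, fallback=None):
    """Retry a flaky tool call with exponential backoff, then fall back.

    fetch_fn and fallback are placeholders for your own tool call and
    cache lookup -- this mirrors the config above, not a fixed API.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch_fn(doc_id, timeout=timeout)
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: base_delay, then 2x, 4x, ...
                time.sleep(base_delay * (2 ** attempt))
    if fallback is not None:
        return fallback(doc_id)  # e.g. a cache_store lookup
    raise last_error
```

The point is that the agent never proceeds on an empty document: either a retry succeeds, the fallback answers, or the failure is raised loudly instead of cascading.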

No account · Paste any format · Instant results

What Sepurux catches

AI agents fail in specific ways. Most are invisible without a trace.

Sepurux scans your trace for these patterns. Each one maps to a root cause, not just an error code.

TOOL_TIMEOUT

Tool timed out, cascade started.

fetch_document hung at 4.8s. The agent continued on empty output. Three downstream steps failed silently.

SCHEMA_BREAK

Tool returned the wrong shape.

Expected string, got null. The LLM filled in the gap with plausible-sounding fabricated data.
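A schema guard is the standard defense here. A minimal sketch (the function name and span shape are illustrative, not Sepurux's API):

```python
def guard_tool_output(output, required_fields):
    """Fail loudly on a wrong-shaped tool result instead of letting
    the LLM fill the gap with plausible fabricated data."""
    if output is None:
        raise ValueError("tool returned null output")
    missing = [f for f in required_fields if output.get(f) is None]
    if missing:
        raise ValueError(f"tool output missing fields: {missing}")
    return output
```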

RETRY_EXHAUSTED

Retried with identical params. All failed.

query_db was called three times with the same arguments. No backoff, no parameter change. Same error each time.
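This pattern is detectable directly from a trace. A sketch of the check, assuming a simple list-of-spans shape (illustrative, not a fixed Sepurux schema):

```python
import json
from collections import Counter

def find_identical_retries(spans, threshold=3):
    """Flag tools that faulted repeatedly with byte-identical arguments --
    the signature of a retry loop with no backoff and no parameter change."""
    counts = Counter(
        (span["tool"], json.dumps(span.get("input", {}), sort_keys=True))
        for span in spans
        if span.get("status") == "FAULT"
    )
    return [tool for (tool, _args), n in counts.items() if n >= threshold]
```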

CONTEXT_OVERFLOW

Prompt hit the token limit silently.

16,420 tokens in, limit is 16,384. The model truncated the last 2,000 tokens without raising an error.
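The fix is to trim before the call rather than let the model truncate after it. One minimal sketch, assuming a chat-style message list and a tokenizer callable you supply:

```python
def trim_to_budget(messages, count_tokens, limit=16384, reserve=512):
    """Drop the oldest conversational turns (keeping the system prompt
    at index 0) until the prompt fits under the token limit, instead of
    letting the model truncate silently. count_tokens is your tokenizer."""
    msgs = list(messages)
    while len(msgs) > 1 and sum(count_tokens(m) for m in msgs) + reserve > limit:
        msgs.pop(1)  # oldest non-system message goes first
    return msgs
```

Reserving headroom (`reserve`) leaves room for the model's reply tokens, not just the prompt.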

FALLBACK_TRIGGERED

Primary failed. Fallback returned nothing.

vector_search fell back to cache. Cache returned 0 results. The agent treated that as valid context.
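The guard is to treat an empty fallback result as a failure in its own right. A minimal sketch (`primary` and `fallback` are hypothetical search callables):

```python
def search_with_fallback(primary, fallback, query):
    """A fallback that returns zero results is still a failure --
    raise instead of handing the agent empty context as if it were valid."""
    try:
        results = primary(query)
    except Exception:
        results = []
    if not results:
        results = fallback(query)
    if not results:
        raise LookupError(f"no results for {query!r} from primary or fallback")
    return results
```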

HALLUCINATION_RISK

LLM completed from an empty tool output.

The document tool returned null. The model generated a summary anyway. Output looks real — it isn't.
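The cheapest mitigation is a precondition check before the completion call. A sketch, where `llm` is a hypothetical completion callable:

```python
def summarize_document(llm, document):
    """Refuse to generate when the source is empty -- an honest
    'incomplete' beats a confident fabrication."""
    if not document:
        return {"status": "incomplete", "reason": "source document is empty"}
    return {"status": "ok", "summary": llm(f"Summarize:\n{document}")}
```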

How it works

Three steps. Usually under a minute.

01

Paste the trace

Drop in JSON, raw logs, a LangSmith run URL, or a bare trace ID. Sepurux parses whatever you have.

No SDK. No account. No setup.
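For example, a pasted JSON trace might look like this (an illustrative shape only; Sepurux parses many formats and does not require this schema):

```python
import json

# Illustrative shape only -- not a required Sepurux schema.
trace = {
    "trace_id": "trc_8f2d94a1c3e7",
    "agent": "revenue-summarizer",
    "spans": [{
        "span_id": "span_d4e5f6",
        "tool": "fetch_document",
        "input": {"document_id": "rev_q3_2024", "format": "markdown"},
        "duration_ms": 4820,
        "status": "FAULT",
        "error": "TimeoutError: fetch_document exceeded 5000ms threshold",
    }],
}
print(json.dumps(trace, indent=2))
```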

02

Get the root cause

Not just the error — the step that caused it, what broke next, and which downstream steps silently continued on bad data.

Scored by confidence and impact.

03

Apply the fix

Each failure surfaces a concrete code snippet — retry config, timeout tuning, schema guard, or context trim.

Replay it, stress-test it, or add it to CI when you're ready.

When you need to go deeper

Debug first. Go further when you need to.

Sepurux starts as a debugger. Then you can install the recorder, capture traces automatically, replay and stress-test the same path, and gate releases in CI.

Stress Test

Stress-test a passing trace

Take a trace that's working and inject failures — timeouts, schema breaks, rate limits. Find out if your agent handles them before your users do.

Scheduled Runs

Run checks on a schedule

Run traces on a cron schedule and get notified when a passing agent starts failing. Catch regressions automatically.

Policy

Set limits on what your agent can do

Control which tools your agent can call. Block risky invocations or require human approval before sensitive tool calls go through.

Works with the stacks you already ship

LangChain · LangGraph · CrewAI · OpenAI Agents SDK · LlamaIndex · Pydantic AI · MCP Workflows
Python SDK · TypeScript SDK · Go SDK

pip install sepurux
npm install @sepurux/recorder
go get github.com/sepurux/go-recorder

Debug in the browser. Record in code. Gate in CI.

You have a broken agent.
Start here.

No account. No setup. Paste a trace and know what failed in 30 seconds.