Product

Debug fast. Everything else follows when you need it.

Debug in the browser first, then install the recorder to capture traces in code, and gate releases in CI when the workflow is stable. No account required to start.

Trace debuggerReplayCompare runsStress testsPolicy controlsCI integration

What ships

Debug in the browser

Paste a trace and see where it failed or looks risky.

Record in code

Install the recorder to capture traces automatically from real workflows.

Gate in CI

Turn reliability findings into release gates that block regressions automatically.

Analysis

Instant

No sign-in required

History

Saved later

When the trace matters

CI

CI-ready

Block on pass rate

Platform surfaces

Starts with a trace. Goes as far as you need.

Debug in the browser, record in code, then gate in CI.

Failure classes

The costly bugs are the ones that look plausible until production proves otherwise.

Agents rarely fail like ordinary software. They often continue returning believable outputs while a schema shifts, a downstream service degrades, or approval logic is quietly bypassed.

Scenario

Database timeout cascade

Retries would have stacked, delayed refunds, and obscured the real first failing tool.

Detected by

Stress test: db.timeout + replay attribution

Outcome

Caught in staging, replayed, and hardened before rollout.

Scenario

Prompt injection through support tooling

An agent could have triggered unauthorized tool actions from crafted external input.

Detected by

Policy + security checks on replayed support workflow

Outcome

Sanitization and deny rules were added before release.

Scenario

Schema drift in webhook parsing

Downstream logic would have silently dropped billing events while returning superficially valid responses.

Detected by

Schema stress test + contract validation

Outcome

Parser was hardened and the CI gate blocked the change until fixed.

Screenshots

Every failure, fully visible.

Replay traces, review stress test results, and read the failure report — all in one place.

Sepurux dashboard showing agent trace replay and stress test timeline
Sepurux trace view with prompt, tool call, and breakpoint inspection
Sepurux reliability score report summarizing agent failure analysis

Fits your stack

Debug in the browser. Record in code. Gate in CI.

Sepurux is easiest to adopt when the first step is obvious: debug a trace first, then install the recorder to capture traces automatically, and gate releases in CI.

Connects to

OpenAI SDK

LangChain

LangGraph

GitHub Actions

OpenTelemetry

Vercel AI SDK

Pydantic AI

MCP tools

Typical workflow

01

Capture the trace once

Keep the real agent story instead of rebuilding it by hand.

02

Install the recorder in code

Capture traces automatically from real workflows after the first browser debug pass.

03

Promote stable traces into CI gates

Hand the same trace to replay, stress testing, or release gating when the fix is ready.

Replay later

Use the same trace again when you want a second pass after the fix.

Continuous checks

Promote stable cases into CI or scheduled reliability checks.

Saved traces

Persist the analysis once it matters for your team history.

Policy gates

Keep approvals and guardrails available when the workflow gets sensitive.

Release control

Put reliability evidence directly into the deployment path.

The product is strongest when it stops being a dashboard you check later and becomes a gate the team cannot ignore.

.github/workflows/sepurux-gate.yml
name: Sepurux Reliability Gate

on:
  pull_request:
  workflow_dispatch:

jobs:
  reliability-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
        with:
          base_url: ${{ secrets.SEPURUX_API_BASE_URL }}
          token: ${{ secrets.SEPURUX_API_KEY }}
          project_id: ${{ secrets.SEPURUX_PROJECT_ID }}
          campaign_name: refund-reliability
          min_pass_rate: "0.85"

What the gate gives the team

  • Register a scenario pool once and replay every trace on each pull request.
  • Gate on pass rate, scenario coverage, and unsafe attempt count instead of a vague quality claim.
  • Push a single verdict into branch protection and release review.

Scenario coverage

100%

Unsafe attempts

0

Pass rate

0.92

Governed scale

Start with the debugger, then grow into a full control plane.

As programs mature, Sepurux becomes the place where platform teams, security reviewers, and operators evaluate whether an AI workflow should run at all.

Capability

Compliance Summary

Project-scoped metrics across audit, policy, and security events for governance reporting and release review.

Capability

Policy + Security Controls

Approval rules, unsafe action blocks, sanitization layers, and tool zoning for sensitive workflows.

Capability

Audit Visibility

Structured event history for incident review, approvals, and operational accountability as the program scales.

Ready when you are

Start with the trace. Go further when you're ready.

Use the sandbox for the first analysis, install the recorder for continuous capture, then move into replay, stress testing, and CI gates when you want the fix to hold.

FAQ

Common questions from platform teams.