Product
Reliability workflow for production AI agents.
Sepurux helps platform teams replay real traces, inject controlled chaos, evaluate outcomes, and gate merges when agent quality regresses.
Replay
Deterministic timeline reconstruction for every attempt. Pinpoint the first failing tool and its event index.
- Tool call ordering
- Failure reason capture
- Artifacts compatible with the run inspector
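As a rough illustration of pinpointing the first failing tool and event index, here is a minimal sketch. The `TraceEvent` shape and the `first_failure` helper are hypothetical names chosen for this example, not Sepurux's actual data model or API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical event record: one tool call within an attempt."""
    index: int        # position of the event in the attempt timeline
    tool: str         # name of the tool that was called
    ok: bool          # whether the call succeeded
    reason: str = ""  # captured failure reason, empty on success


def first_failure(events: Iterable[TraceEvent]) -> Optional[TraceEvent]:
    """Return the earliest failing event, or None if every call succeeded."""
    # Sorting by index keeps the result deterministic even if events
    # arrive out of order.
    for event in sorted(events, key=lambda e: e.index):
        if not event.ok:
            return event
    return None


events = [
    TraceEvent(2, "send_email", False, "missing input"),
    TraceEvent(0, "search", True),
    TraceEvent(1, "fetch_invoice", False, "timeout"),
]
failure = first_failure(events)
```

In this sketch `failure` points at the `fetch_invoice` call at index 1. The later `send_email` failure is treated as a downstream effect, not the root event.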
Mutations
Seeded chaos scenarios with schema drift, fault injection, and poisoning to stress brittle agent behavior.
- Weighted mutation packs
- Deterministic seeds
- Per-attempt mutation trace
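The combination of weighted packs and deterministic seeds can be sketched as follows. The pack contents, the weights, and the `plan_mutations` function are assumptions made for illustration. The only library call used is Python's standard `random.Random(...).choices`.

```python
import random
from typing import Dict, List

# Hypothetical mutation pack: mutation name -> relative weight.
MUTATION_PACK: Dict[str, float] = {
    "schema_drift": 3.0,
    "fault_injection": 2.0,
    "poisoning": 1.0,
}


def plan_mutations(seed: int, attempts: int,
                   pack: Dict[str, float] = MUTATION_PACK) -> List[str]:
    """Pick one mutation per attempt, reproducibly for a given seed."""
    rng = random.Random(seed)   # private generator, no global state touched
    names = sorted(pack)        # fixed order so dict ordering cannot matter
    weights = [pack[name] for name in names]
    return rng.choices(names, weights=weights, k=attempts)


# The returned list doubles as a per-attempt mutation trace:
# position i holds the mutation applied to attempt i.
plan = plan_mutations(seed=42, attempts=5)
```

Calling `plan_mutations` again with the same seed and pack yields the same plan, which is what makes a failing campaign replayable.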
Counterfactuals
What-if replay strategies that evaluate whether a remediation option would have prevented the failure.
- Retry policy simulation
- Policy gate alternatives
- Diff snapshots
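A retry-policy counterfactual might look like the sketch below: recorded per-try outcomes are replayed under a baseline and a candidate retry budget, and the two results are compared. `first_success` and `compare_policies` are hypothetical helpers, and modelling a policy as a plain retry count is a simplifying assumption.

```python
from typing import Dict, Optional, Sequence


def first_success(outcomes: Sequence[bool], max_retries: int) -> Optional[int]:
    """Return the 0-based try that succeeds within 1 + max_retries tries."""
    for attempt, ok in enumerate(outcomes[: max_retries + 1]):
        if ok:
            return attempt
    return None


def compare_policies(outcomes: Sequence[bool], baseline_retries: int,
                     candidate_retries: int) -> Dict[str, bool]:
    """Diff the baseline policy against a what-if candidate policy."""
    baseline = first_success(outcomes, baseline_retries)
    candidate = first_success(outcomes, candidate_retries)
    return {
        "baseline_recovered": baseline is not None,
        "candidate_recovered": candidate is not None,
        # True only when the candidate fixes a run the baseline lost.
        "would_have_prevented_failure": baseline is None and candidate is not None,
    }


# Recorded outcomes of the same tool call: two failures, then a success.
recorded = [False, False, True]
diff = compare_policies(recorded, baseline_retries=0, candidate_retries=2)
```

Here the no-retry baseline fails while a budget of two retries reaches the successful third try, so `would_have_prevented_failure` is `True`.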
Policy Packs
Guardrails for irreversible actions with allow/deny/require_approval decisions and event-level records.
- Tool pattern rules
- Approval token checks
- Policy events per run
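To make the allow/deny/require_approval decisions concrete, here is a minimal sketch of a first-match rule evaluator. The rule patterns, tool names, and `decide` function are invented for this example, and treating any non-empty approval token as valid is a deliberate simplification. Pattern matching uses the standard library's `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional, Tuple

# Hypothetical policy pack: the first matching pattern wins.
RULES: List[Tuple[str, str]] = [
    ("payments.refund*", "require_approval"),
    ("db.drop_*", "deny"),
    ("*", "allow"),
]


def decide(tool: str, approval_token: Optional[str] = None,
           rules: List[Tuple[str, str]] = RULES) -> Dict[str, Optional[str]]:
    """Return a policy event recording the decision for one tool call."""
    for pattern, decision in rules:
        if fnmatchcase(tool, pattern):
            # Assumption: a non-empty token satisfies an approval requirement.
            if decision == "require_approval" and approval_token:
                decision = "allow"
            return {"tool": tool, "rule": pattern, "decision": decision}
    # No rule matched: fail closed.
    return {"tool": tool, "rule": None, "decision": "deny"}


policy_events = [
    decide("search.web"),
    decide("db.drop_table"),
    decide("payments.refund_order"),
    decide("payments.refund_order", approval_token="tok-123"),
]
```

Failing closed when no rule matches is a design choice in this sketch. A real policy pack may define its own default.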
CI Gate
Create CI runs and block pull requests when pass-rate or unsafe-action thresholds are violated.
- Strict pass/fail decisions
- Threshold-aware polling
- GitHub Action integration
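The gating decision itself can be sketched as a small pure function. The threshold names and default values below are assumptions for illustration, not Sepurux's actual configuration keys.

```python
from typing import Tuple


def gate(passed: int, total: int, unsafe: int,
         min_pass_rate: float = 0.95, max_unsafe: int = 0) -> Tuple[bool, str]:
    """Return (ok, reason) for a finished CI run."""
    if total <= 0:
        # An empty run proves nothing, so it does not pass.
        return False, "no attempts recorded"
    pass_rate = passed / total
    if pass_rate < min_pass_rate:
        return False, f"pass rate {pass_rate:.2%} below {min_pass_rate:.2%}"
    if unsafe > max_unsafe:
        return False, f"{unsafe} unsafe actions exceed limit {max_unsafe}"
    return True, "all thresholds met"


ok, reason = gate(passed=98, total=100, unsafe=0)
```

In a CI job, a strict pass/fail decision would typically map to the process exit code, for example `sys.exit(0 if ok else 1)`, so that a failing gate blocks the pull request.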
System flow
Trace intake feeds campaign runs. A worker replays each mutation attempt, policy checks are applied, analytics summarize failures, and the CI gate returns a pass/fail decision for the release.
trace -> run -> attempts -> policy_events -> analytics -> ci_decision
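The flow above can be traced end to end in a toy sketch. Every structure here (a trace as a list of tool names, a mutation as a dict with `name` and `breaks`, the `run_pipeline` function) is a simplified assumption meant only to show how the stages hand data to each other.

```python
from collections import Counter
from typing import Any, Dict, List, Set


def run_pipeline(trace: List[str], mutations: List[Dict[str, Any]],
                 denied_tools: Set[str],
                 min_pass_rate: float = 0.9) -> Dict[str, Any]:
    """trace -> run -> attempts -> policy_events -> analytics -> ci_decision"""
    attempts: List[Dict[str, Any]] = []
    policy_events: List[Dict[str, Any]] = []
    for number, mutation in enumerate(mutations):  # one attempt per mutation
        broken = set(mutation["breaks"])
        # First tool in the trace that this mutation breaks, if any.
        failed_tool = next((tool for tool in trace if tool in broken), None)
        attempts.append({"attempt": number, "mutation": mutation["name"],
                         "failed_tool": failed_tool})
        for tool in trace:                         # policy checks
            if tool in denied_tools:
                policy_events.append({"attempt": number, "tool": tool,
                                      "decision": "deny"})
    # Analytics: count failures per tool and compute the pass rate.
    failures = Counter(a["failed_tool"] for a in attempts if a["failed_tool"])
    passed = sum(a["failed_tool"] is None for a in attempts)
    pass_rate = passed / len(attempts) if attempts else 0.0
    unsafe = len(policy_events)
    decision = "pass" if pass_rate >= min_pass_rate and unsafe == 0 else "fail"
    return {"pass_rate": pass_rate, "failures": dict(failures),
            "unsafe": unsafe, "ci_decision": decision}


result = run_pipeline(
    trace=["search", "fetch"],
    mutations=[{"name": "baseline", "breaks": []},
               {"name": "fault", "breaks": ["fetch"]}],
    denied_tools=set(),
)
```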