GitHub Action
Use sepurux-gate directly in any workflow step. Pass trace_file + campaign_name to avoid storing UUIDs in secrets — the action uploads the trace and resolves the campaign automatically.
# Preferred: trace_file + campaign_name — no UUIDs in secrets
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  with:
    base_url: ${{ secrets.SEPURUX_API_BASE_URL }}
    token: ${{ secrets.SEPURUX_API_KEY }}
    project_id: ${{ secrets.SEPURUX_PROJECT_ID }}
    trace_file: trace.json            # file written by your test/instrumentation step
    campaign_name: refund-reliability # human-readable name, not a UUID
    min_pass_rate: "0.85"
# Alternative: pass UUIDs directly (no lookup needed)
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  with:
    base_url: ${{ secrets.SEPURUX_API_BASE_URL }}
    token: ${{ secrets.SEPURUX_CI_TOKEN }}
    trace_id: ${{ steps.record.outputs.trace_id }}
    campaign_id: ${{ secrets.SEPURUX_CAMPAIGN_ID }}
trace_file
Path to a JSON trace file generated by your test step. The action uploads it and uses the returned trace_id — no pre-existing UUID needed.
campaign_name
Human-readable name of your campaign. The action looks up the UUID via the campaigns API — no secrets to rotate when campaigns change.
Outputs
The step exposes run_id, trace_id, campaign_id, decision (pass|fail), and dashboard_url as step outputs for downstream steps.
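Give the gate step an id and read the outputs from a later step. A minimal sketch (the step id gate and the reporting step are illustrative; if: always() keeps the report running even when the gate fails):
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  id: gate
  with:
    trace_file: trace.json
    campaign_name: refund-reliability
    ...
- name: Report gate result
  if: always()  # run the report even when the gate step fails
  run: |
    echo "decision:  ${{ steps.gate.outputs.decision }}"
    echo "dashboard: ${{ steps.gate.outputs.dashboard_url }}"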
Full Workflow Example
Run a dedicated reliability-gate job on pull requests and mark it as a required status check in your repository's branch protection rules.
name: Sepurux Reliability Gate
on:
  pull_request:
  workflow_dispatch:

permissions:
  contents: read
  pull-requests: write
  checks: write
  statuses: write

jobs:
  reliability-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install Sepurux CLI
        run: |
          python -m pip install --upgrade pip
          python -m pip install sepurux
      - name: Run reliability gate
        env:
          # Use the live cloud API in CI, not localhost.
          SEPURUX_API_BASE_URL: https://app.sepurux.dev/api/backend
          SEPURUX_API_KEY: ${{ secrets.SEPURUX_API_KEY }}
          SEPURUX_PROJECT_ID: ${{ secrets.SEPURUX_PROJECT_ID }}
          SEPURUX_CAMPAIGN_ID: ${{ secrets.SEPURUX_CAMPAIGN_ID }}
          # Produced by earlier instrumentation/upload steps in your pipeline.
          # Omit if the campaign already has an eval-set — batch runs all registered traces.
          SEPURUX_TRACE_ID: ${{ steps.record.outputs.trace_id }}
        run: |
          python - <<'PY'
          import os
          import sys
          import time

          from sepurux import SepuruxClient

          min_pass_rate = 0.85
          timeout_seconds = 600

          campaign_id = (os.getenv("SEPURUX_CAMPAIGN_ID") or "").strip()
          trace_id = (os.getenv("SEPURUX_TRACE_ID") or "").strip() or None
          if not campaign_id:
              print("Missing SEPURUX_CAMPAIGN_ID")
              sys.exit(1)

          client = SepuruxClient(
              base_url=os.environ["SEPURUX_API_BASE_URL"],
              api_key=os.environ["SEPURUX_API_KEY"],
              project_id=(os.getenv("SEPURUX_PROJECT_ID") or None),
          )

          # start_run always returns a batch_id. When the campaign has an eval-set,
          # one run is created per registered trace; otherwise it uses the supplied trace_id.
          batch_id = client.start_run(
              trace_id=trace_id or "",
              campaign_id=campaign_id,
              thresholds={"min_pass_rate": min_pass_rate, "max_unsafe": 0},
          )
          print(f"Started Sepurux batch: {batch_id}")

          terminal = {"completed", "done", "failed", "pass", "fail"}
          deadline = time.time() + timeout_seconds
          latest = {}
          while time.time() < deadline:
              latest = client.get_ci_batch(batch_id)
              status = str(latest.get("status", "")).lower()
              coverage = float(latest.get("scenario_coverage_pct") or 0)
              print(f"Batch status: {status} coverage: {coverage:.0f}%")
              if status in terminal:
                  break
              time.sleep(5)
          else:
              print(f"Batch timed out after {timeout_seconds}s")
              sys.exit(1)

          decision = str(latest.get("decision", "")).lower()
          pass_rate = float(latest.get("pass_rate") or 0)
          unsafe = int(latest.get("unsafe_attempts") or 0)
          print(f"Decision: {decision} pass_rate: {pass_rate:.0%} unsafe: {unsafe}")
          if decision == "fail" or unsafe > 0:
              print("Gate failed.")
              sys.exit(1)
          if decision != "pass":
              print(f"Unexpected decision state: {decision}")
              sys.exit(1)
          print("Gate passed.")
          PY
Trace Source Options
Either generate a trace file in CI and pass its path via trace_file, or capture a trace_id from an earlier SDK step and pass it directly.
# Option A: write trace.json in a test/smoke step
- name: Run smoke test and export trace
  run: python scripts/smoke_test.py --output trace.json
# Pass the file to the gate action — it uploads and resolves trace_id automatically
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  with:
    trace_file: trace.json
    campaign_name: refund-reliability
    ...
# Option B: capture trace_id from an earlier SDK step and pass it directly
- name: Record trace
  id: record
  run: python record.py  # writes trace_id=<uuid> to $GITHUB_OUTPUT
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  with:
    trace_id: ${{ steps.record.outputs.trace_id }}
    campaign_id: ${{ vars.SEPURUX_CAMPAIGN_ID }}
Scenario Coverage & Batch Runs
Register a pool of real-world traces against a campaign's eval-set. CI submissions against that campaign automatically run every registered trace in parallel and gate on an aggregated verdict.
1. Register scenario pool (one-time setup)
# Register real-world traces as the scenario pool for a campaign.
# Run this once (or incrementally) before your first batch CI submission.
curl -X PATCH https://app.sepurux.dev/api/backend/v1/campaigns/<campaign_uuid>/eval-set \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $SEPURUX_API_KEY" \
  -H "X-Project-Id: $SEPURUX_PROJECT_ID" \
  -d '{
    "add_trace_ids": ["<trace_uuid_1>", "<trace_uuid_2>", "<trace_uuid_3>"],
    "min_scenarios": 3
  }'
2. Submit batch CI run + poll
# POST /v1/ci/runs — when the campaign has eval-set traces, this creates
# one CI run per registered trace and returns a shared ci_batch_id.
curl -X POST https://app.sepurux.dev/api/backend/v1/ci/runs \
  -H "Content-Type: application/json" \
  -H "X-Sepurux-Token: $CI_TOKEN" \
  -d '{
    "campaign_id": "<campaign_uuid>",
    "thresholds": {
      "min_pass_rate": 0.85,
      "max_unsafe": 0,
      "max_failures": 0
    },
    "repo": "sepurux/sepurux-platform",
    "pull_request_number": 42
  }'
# → { "run_id": "...", "run_ids": [...], "trace_count": 3, "ci_batch_id": "<batch_uuid>" }
# Poll the aggregated batch verdict — pass requires ALL scenario runs to pass.
curl https://app.sepurux.dev/api/backend/v1/ci/batch/<batch_uuid> \
  -H "X-Sepurux-Token: $CI_TOKEN"
Batch verdict response
{
  "batch_id": "...",
  "status": "done",
  "decision": "pass",
  "pass_rate": 0.91,
  "unsafe_attempts": 0,
  "failures": 0,
  "trace_count": 3,
  "scenario_coverage_pct": 100.0,
  "top_failing_tools": [],
  "dashboard_url": "https://app.sepurux.dev/runs/..."
}
min_scenarios gate
If the registered scenario pool has fewer traces than min_scenarios, the CI submission is rejected with HTTP 422. This prevents gating on a trivially small sample.
Batch verdict rule
Decision is `pass` only when every scenario run passes. A single failing trace produces a `fail` verdict regardless of aggregate pass rate.
scenario_coverage_pct
Completed runs / total traces × 100. Use this field to confirm that all scenarios were actually evaluated before acting on the verdict; see the sketch below.
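Putting these rules together, a client-side guard over the batch verdict payload might look like this (a sketch: the verdict dict mirrors the batch verdict response shown earlier, and enforce_batch_verdict is an illustrative helper, not part of the SDK):
import sys

def enforce_batch_verdict(verdict: dict) -> None:
    # Refuse to act on a partial batch (completed runs / total traces x 100).
    coverage = float(verdict.get("scenario_coverage_pct") or 0)
    if coverage < 100.0:
        print(f"Only {coverage:.0f}% of scenarios evaluated; refusing to gate.")
        sys.exit(1)
    # A single failing trace fails the whole batch, regardless of pass_rate.
    if verdict.get("decision") != "pass":
        print(f"Batch verdict: {verdict.get('decision')}")
        sys.exit(1)
    print("All scenarios passed.")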
CI API Flow (Single Trace)
For single-trace mode, or for custom CI systems that do not use an eval-set, call the CI endpoints directly and decide pass/fail from the returned decision payload.
# 1) create CI run
curl -X POST https://app.sepurux.dev/api/backend/v1/ci/runs \
  -H "Content-Type: application/json" \
  -H "X-Sepurux-Token: $CI_TOKEN" \
  -d '{
    "trace_id": "<trace_uuid>",
    "campaign_id": "<campaign_uuid>",
    "thresholds": {
      "min_pass_rate": 0.85,
      "max_unsafe": 0,
      "max_failures": 0,
      "min_reliability_score": 80
    },
    "repo": "sepurux/sepurux-platform",
    "pull_request_number": 42
  }'
# 2) poll CI decision
curl -X GET https://app.sepurux.dev/api/backend/v1/ci/runs/<run_uuid> \
  -H "X-Sepurux-Token: $CI_TOKEN"
Decision Model
A CI run returns `pass`, `fail`, or `pending`. Block deploy on `fail`; poll while `pending`. Batch runs include `trace_count` and `scenario_coverage_pct`.
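For a custom CI system, the poll loop can be as small as the following sketch (assumes the requests library; SEPURUX_RUN_ID is an illustrative variable holding the run_id returned by step 1):
import os
import sys
import time

import requests

BASE_URL = "https://app.sepurux.dev/api/backend"
HEADERS = {"X-Sepurux-Token": os.environ["CI_TOKEN"]}
run_id = os.environ["SEPURUX_RUN_ID"]  # from the create-run response

deadline = time.time() + 600
while time.time() < deadline:
    resp = requests.get(f"{BASE_URL}/v1/ci/runs/{run_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    payload = resp.json()
    decision = payload.get("decision")
    if decision == "pass":
        sys.exit(0)  # safe to deploy
    if decision == "fail":
        print(payload.get("dashboard_url", ""))
        sys.exit(1)  # block deploy
    time.sleep(5)    # still pending: keep polling
print("Timed out waiting for a decision")
sys.exit(1)
The decision payload polled above looks like: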
{
  "run_id": "...",
  "status": "done",
  "decision": "fail",
  "pass_rate": 0.72,
  "unsafe_attempts": 1,
  "failures": 5,
  "trace_count": 1,
  "top_failing_tools": [
    {"tool": "payments.refund", "count": 3}
  ],
  "dashboard_url": "https://app.sepurux.dev/runs/..."
}
Rollout Tips
Use a staged enforcement model so teams adopt reliability gating without breaking delivery velocity. A sketch of the observe-only stage follows the schedule below.
Week 1
Observe only: report scores and failure tools, but do not block merges.
Week 2
Soft gate: require review when reliability falls below threshold.
Week 3+
Hard gate: block merge/deploy when decision is fail.
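For the observe-only stage, GitHub Actions' continue-on-error flag lets the gate report its verdict without failing the job (a sketch; drop the flag when you move to a hard gate):
- uses: sepurux/sepurux-platform/.github/actions/sepurux-gate@main
  # Week 1: record the verdict, never block the merge.
  continue-on-error: true
  with:
    trace_file: trace.json
    campaign_name: refund-reliability
    ...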
