Documentation
Developer-grade guidance for deploying AI Safety Gate into production workflows.
Prefer Postman? Download the collection: ai-safety-gate.postman_collection.json
AI Safety Gate documentation
This documentation explains how to integrate AI Safety Gate into production systems that execute AI-driven actions. You will learn where to place validate(), how PASS/WARN/BLOCK enforcement works, and how to integrate the enforcement API into any system.
How it works (end-to-end)
Exact request/response flow: validate → enforce → (WARN) approve.
Step 1 — Generate AI output
Your system produces the final AI output you intend to use to trigger an action.
Step 2 — Call the Safety Gate before executing
Your app or service calls POST https://aisafegate.com/api/validate with ai_output (string) and context (object).
Step 3 — Enforce PASS / WARN / BLOCK
PASS: execute the action.
BLOCK: do not execute the action.
WARN: do not execute yet; persist decision_id and approval_token and begin polling.
Step 4 — Human approval (WARN only)
Poll GET https://aisafegate.com/api/decisions/:id/approval with the approval token in X-Approval-Token (or Authorization: Bearer ...) until you receive {"approved": true}.
Step 5 — Fail closed
If validation fails, approval polling never returns {"approved": true}, or your system times out, you must stop and not execute.
n8n is optional: the n8n drop-in workflow follows the same contract, but you can integrate directly from any app or service using the HTTP API.
What AI Safety Gate does
One validation call that returns an enforceable decision.
Rule of thumb: validate after AI output and before any real-world action (money, customer contact, production writes).
PASS
Proceed automatically.
WARN
Review required — route to review or retry.
BLOCK
Stop execution.
Flow: AI (LLM output — message / tool args) → GATE (validate() — policy + risk checks) → WORKFLOW (branch by status: PASS / WARN / BLOCK).
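Branching by status can be expressed as a small, deterministic mapping. This is a sketch; the branch names ("action", "review", "stop") are illustrative, not part of the API.

```python
def enforce(decision):
    """Map a Safety Gate decision onto a workflow branch."""
    status = decision.get("status")
    if status == "PASS":
        return "action"   # proceed automatically
    if status == "WARN":
        return "review"   # interlock: route to human approval
    if status == "BLOCK":
        return "stop"     # terminal for the action path
    return "stop"         # unknown or missing status: fail closed
```

Note the final fallback: anything other than an explicit PASS or WARN stops the action path, consistent with the fail-closed default described below.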
Where it fits in systems
Think of it as an enforcement layer, not a content filter.
Place AI Safety Gate on the boundary between model output and irreversible actions.
It is most effective when workflows treat BLOCK as terminal and WARN as an interlock requiring review.
n8n Cloud quick start
Deploy in minutes and branch safely by decision.
STEP 1
Import workflow JSON
Workflows → Import from File → select the template.
STEP 2
Replace the trigger
Swap in your Webhook/Cron/app event trigger.
STEP 3
Connect AI output → Gate
Wire the LLM output into the validate request.
STEP 4
Branch by decision
PASS → action, WARN → review, BLOCK → safe fallback.
Do not place the gate after the action. If the refund/email/write already happened, validation is too late.
Placement guide (DO / DO NOT)
Keep enforcement close to execution.
Place the gate inline: immediately after model output is produced and immediately before the irreversible action node.
DO:
AI → Gate → Action
Route WARN to approvals
Log every decision
DO NOT:
Validate only prompts (validate outputs too)
Treat WARN as PASS
Allow BLOCK to reach action nodes
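The DO pattern (AI → Gate → Action, WARN to approvals, every decision logged) can be sketched as one inline pipeline. All function parameters here are stand-ins you would wire to your own system; nothing about their names is prescribed by the API.

```python
def run_workflow(generate_ai_output, validate, execute_action,
                 route_to_review, log_decision):
    output = generate_ai_output()        # AI produces the final output
    decision = validate(output)          # Gate sits inline, before the action
    log_decision(decision)               # log every decision for audit
    status = decision.get("status")
    if status == "PASS":
        return execute_action(output)    # AI → Gate → Action
    if status == "WARN":
        return route_to_review(decision) # WARN goes to approvals, never to the action
    return None                          # BLOCK (or anything else): action never runs
```

Because the gate is called between output generation and execution, there is no code path where a BLOCK decision can reach the action node.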
PASS / WARN / BLOCK behavior
Deterministic workflow branching.
PASS means low risk. Continue to the real action node.
WARN means elevated risk (review required). Add human review, retry with stricter prompts, or degrade to safe templates.
BLOCK means stop. Treat it as terminal for the action path.
Recommended Safety Defaults (Production)
Customer-safe defaults for reliable enforcement.
- Fail closed: if validation fails (timeouts, parsing, missing fields), do not execute.
- Treat WARN as an interlock: pause execution until a human explicitly approves.
- Treat BLOCK as terminal for the action path.
- Log the decision status and explanation for audit and incident response.
- Use environment separation (dev/stage/prod) and verify policy changes before rollout.
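The fail-closed default can be implemented as a thin wrapper around the validation call: any error, timeout, or malformed response is treated as BLOCK. This is a sketch under the assumptions above; the explanation strings are illustrative.

```python
def validate_fail_closed(call_validate, *args, **kwargs):
    """Run a validation call and fail closed on any error or bad response."""
    try:
        decision = call_validate(*args, **kwargs)
    except Exception:
        # Timeouts, network errors, parsing failures: do not execute.
        return {"status": "BLOCK",
                "explanation": "validation call failed; failing closed"}
    if decision.get("status") not in ("PASS", "WARN", "BLOCK"):
        # Missing or unrecognized status: do not execute.
        return {"status": "BLOCK",
                "explanation": "malformed decision; failing closed"}
    return decision
```

Pair this with decision logging so fail-closed events are visible in audit and incident response, not silently swallowed.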
Known limitations (v1)
Practical boundaries to plan around.
- Decisions are only as accurate as the context you provide (action type, channel, environment, and payload).
- AI Safety Gate cannot prevent actions your system executes without enforcing the returned outcome.
- WARN and BLOCK handling must be implemented deterministically in your workflow (no “best effort” execution).
- Misuse-resistant operations require customer-side controls (RBAC, rate limits, and review processes) in addition to gating.
Data evaluated
Pass enough context to make decisions accurate.
Recommended inputs
AI output text (message/tool args)
Action type (refund/email/write)
Workflow name / environment
Optional: actor, channel, customer segment
Returned outputs
status: PASS | WARN | BLOCK
explanation: human-readable reason
timestamps + identifiers for audit
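The inputs and outputs above might look like the following shapes. Only ai_output, context, status, and explanation are documented field names on this page; the context keys, decision_id, approval_token, and all values are illustrative assumptions.

```python
# Illustrative request body: AI output plus enough context for an accurate decision.
request_body = {
    "ai_output": "Sure, I've issued a $250 refund to the customer.",
    "context": {
        "action_type": "refund",        # refund / email / write
        "workflow": "support-refunds",  # workflow name
        "environment": "prod",          # dev / stage / prod
        "channel": "email",             # optional
    },
}

# Illustrative response body for a WARN decision.
response_body = {
    "status": "WARN",                   # PASS | WARN | BLOCK
    "explanation": "Refund amount exceeds the auto-approve threshold.",
    "decision_id": "dec_123",           # persist for approval polling
    "approval_token": "tok_abc",        # send in X-Approval-Token
}
```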
Policy management
How policy changes relate to integration behavior.
Single integration contract
The integration contract is stable: you call validate(), enforce PASS/WARN/BLOCK, and (for WARN) poll approval.
Managed policies
Managed policies may be updated over time to improve safety. Your integration must not assume outcomes remain constant.
Customer-defined policies (if enabled)
If your account allows customer-defined policies, changes can affect future decisions. Treat the Safety Gate response as the source of truth and enforce it.
Nodes do not define policy
Workflow nodes never define policy. They call validate() and route outcomes (PASS/WARN/BLOCK) deterministically.
Common mistakes
Issues that reduce enforcement reliability.
Gating only prompts. Validate outputs too.
Ignoring WARN. Review required — treat it as an interlock.
Non-terminal BLOCK. BLOCK must stop the action path.
No audit trail. Store status + explanation.
FAQ
Operational questions
Do I need n8n?
No. Any system that can call an HTTP API and branch by status works.
What do I do on WARN?
Route to human approval, retry, or fallback to safe templates.
What do I do on BLOCK?
Stop execution, notify your team, and log the decision.
How do I use this as AI workflow enforcement?
Place validate() after the model response and before the irreversible step. Treat BLOCK as terminal and WARN as an interlock.