Prompt Injection Policies

Managed Mode reference. You do not configure these policies. They are enforced automatically to prevent instruction override and enforcement bypass.

Trigger example (derived from the managed catalog)

These policies evaluate ai_output for indicators of prompt injection or attempts to bypass system constraints.
Example ActionKind where these policies apply: workflow.execute
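To see which policies are in scope for a given ActionKind, the policy table further down this page can be transcribed into data and filtered. This is only an illustrative sketch: `POLICIES`, `policies_for`, and `strictest_severity` are hypothetical names, not part of any managed API, and treating BLOCK as stricter than WARN is an assumption.

```python
# Rows transcribed from the policy table on this page (summary, severity, applies to).
# The data structure and function names are hypothetical, for illustration only.
POLICIES = [
    {
        "summary": "Block prompt injection and instruction override attempts.",
        "severity": "BLOCK",
        "applies_to": {
            "workflow.execute", "workflow.modify", "infra.change",
            "money.refund", "data.export", "messaging.send",
        },
    },
    {
        "summary": "Block any attempt to bypass the Safety Gate or disable enforcement.",
        "severity": "BLOCK",
        "applies_to": {"security.policy_change", "workflow.modify", "workflow.execute"},
    },
    {
        "summary": "Warn when output attempts to rewrite safety rules/policies or reduce constraints.",
        "severity": "WARN",
        "applies_to": {"security.policy_change", "workflow.modify"},
    },
]


def policies_for(action_kind: str) -> list[dict]:
    """Return the policies whose 'Applies to' column lists this ActionKind."""
    return [p for p in POLICIES if action_kind in p["applies_to"]]


def strictest_severity(action_kind: str):
    """Return the strictest severity that could fire, or None if no policy applies.

    Assumption: BLOCK outranks WARN.
    """
    severities = {p["severity"] for p in policies_for(action_kind)}
    if "BLOCK" in severities:
        return "BLOCK"
    if "WARN" in severities:
        return "WARN"
    return None
```

For example, `policies_for("workflow.execute")` returns the two BLOCK policies, while `policies_for("workflow.modify")` returns all three rows.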

Example decision

```json
{
  "status": "BLOCK",
  "decision_id": "dec_...",
  "reasons": [
    "Block prompt injection and instruction override attempts."
  ]
}
```
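A caller might inspect a decision like the one above before running the action. The following is a minimal sketch, assuming the decision arrives as a JSON string with the `status`, `decision_id`, and `reasons` fields shown; `summarize_decision` is a hypothetical helper, and treating a missing `status` as blocked is a fail-closed design choice rather than documented behavior.

```python
import json


def summarize_decision(raw: str) -> dict:
    """Parse a decision payload and flag whether the action must not run.

    Hypothetical helper. Only the BLOCK and WARN values appear on this page;
    any other status is passed through untouched in 'status'.
    """
    decision = json.loads(raw)
    status = decision.get("status")
    return {
        "status": status,
        # Fail closed: a payload without a status is treated as blocked.
        "blocked": status == "BLOCK" or status is None,
        "needs_review": status == "WARN",
        "decision_id": decision.get("decision_id"),
        "reasons": list(decision.get("reasons", [])),
    }


if __name__ == "__main__":
    sample = """
    {
      "status": "BLOCK",
      "decision_id": "dec_...",
      "reasons": ["Block prompt injection and instruction override attempts."]
    }
    """
    summary = summarize_decision(sample)
    if summary["blocked"]:
        # Do not execute the action; surface the reasons for remediation.
        for reason in summary["reasons"]:
            print("blocked:", reason)
```

Keeping the `decision_id` alongside the reasons makes it easier to correlate a blocked action with the decision that stopped it when investigating.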

Resolution

Apply the remediation listed for the triggered policy in the table below, then re-validate the action.

| Summary | Severity | Applies to | Required context | Remediation |
| --- | --- | --- | --- | --- |
| Block prompt injection and instruction override attempts. | BLOCK | workflow.execute, workflow.modify, infra.change, money.refund, data.export, messaging.send | ai_output, system_instructions, user_input | Remove/neutralize attacker instructions. Use structured tool arguments. Re-run with sanitized user input. |
| Block any attempt to bypass the Safety Gate or disable enforcement. | BLOCK | security.policy_change, workflow.modify, workflow.execute | ai_output | Treat as hostile. Do not execute. Investigate the source prompt and upstream inputs. |
| Warn when output attempts to rewrite safety rules/policies or reduce constraints. | WARN | security.policy_change, workflow.modify | ai_output | Only allow explicit human-reviewed policy updates with a clear change ticket and rollback plan. |
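
The "Use structured tool arguments" remediation can be illustrated with a small validator that accepts only typed, whitelisted fields instead of free text. This is a sketch under stated assumptions: the `money.refund` ActionKind comes from the table, but `RefundArgs`, `parse_refund_args`, and the `order_id` and `amount` fields are hypothetical and not taken from any documented schema.

```python
import re
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

# Hypothetical identifier format: short, with no spaces or punctuation that
# could carry embedded instructions.
ORDER_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")
ALLOWED_FIELDS = {"order_id", "amount"}


@dataclass(frozen=True)
class RefundArgs:
    """Typed arguments for a hypothetical money.refund tool call."""
    order_id: str
    amount: Decimal


def parse_refund_args(raw: dict) -> RefundArgs:
    """Validate model-proposed arguments; raise ValueError on anything unexpected."""
    extra = set(raw) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")

    order_id = raw.get("order_id")
    if not isinstance(order_id, str) or not ORDER_ID.fullmatch(order_id):
        raise ValueError("order_id must be a short alphanumeric identifier")

    try:
        amount = Decimal(str(raw.get("amount")))
    except InvalidOperation:
        raise ValueError("amount must be a decimal number") from None
    if not amount.is_finite() or amount <= 0:
        raise ValueError("amount must be a positive, finite number")

    return RefundArgs(order_id=order_id, amount=amount)
```

Because every field is checked against a narrow format, text such as "ignore previous instructions" cannot ride along inside an argument; it fails validation before the action is re-run.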

Legal & Responsibility Notice

Informational only: Provided for general guidance. Not legal, compliance, security, or professional advice.
You control implementation: You are responsible for policies, prompts, integrations, workflows, and regulatory requirements.
Liability limitation: To the maximum extent permitted by law, the company disclaims liability for losses arising from use of this documentation or implementations based on it.