Prompt Injection Policies

Managed Mode reference. You do not configure these policies. They are enforced automatically to prevent instruction override and enforcement bypass.

Trigger example (derived from the managed catalog)

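These policies evaluate ai_output for indicators of prompt injection or attempts to bypass system constraints. The instruction-override policy additionally requires system_instructions and user_input as context.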

Example ActionKind where these policies apply: workflow.execute

Example decision

```json
{
  "status": "BLOCK",
  "decision_id": "dec_...",
  "reasons": [
    "Block prompt injection and instruction override attempts."
  ]
}
```
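
A caller usually has to branch on the decision's status. The Python sketch below parses a decision shaped like the example above and reports whether the action may proceed. The names `Decision`, `parse_decision`, and `may_proceed` are hypothetical, as is the `ALLOW` status; only `BLOCK` and `WARN` appear in this reference, and treating `WARN` as non-blocking is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Statuses that let the action continue. "WARN" comes from the severity
# column of this reference; "ALLOW" is a hypothetical placeholder.
PROCEED_STATUSES = {"WARN", "ALLOW"}


@dataclass
class Decision:
    status: str
    decision_id: str
    reasons: List[str] = field(default_factory=list)


def parse_decision(raw: str) -> Decision:
    """Parse a JSON decision with the fields shown in the example."""
    data = json.loads(raw)
    return Decision(
        status=data["status"],
        decision_id=data["decision_id"],
        reasons=list(data.get("reasons", [])),
    )


def may_proceed(decision: Decision) -> bool:
    """Return True only for statuses known to be non-blocking.

    BLOCK and any unrecognized status return False (fail closed).
    """
    return decision.status in PROCEED_STATUSES


if __name__ == "__main__":
    raw = """
    {
      "status": "BLOCK",
      "decision_id": "dec_...",
      "reasons": ["Block prompt injection and instruction override attempts."]
    }
    """
    decision = parse_decision(raw)
    if not may_proceed(decision):
        print("Halted:", "; ".join(decision.reasons))
```

Unrecognized statuses fail closed in this sketch, so a malformed or unexpected response does not let an action through.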

Resolution

Apply the remediation listed for the triggered policy in the table below, then resubmit the action for validation.

Summary	Severity	Applies to	Required context	Remediation
Block prompt injection and instruction override attempts.	BLOCK	workflow.execute, workflow.modify, infra.change, money.refund, data.export, messaging.send	ai_output, system_instructions, user_input	Remove/neutralize attacker instructions. Use structured tool arguments. Re-run with sanitized user input.
Block any attempt to bypass the Safety Gate or disable enforcement.	BLOCK	security.policy_change, workflow.modify, workflow.execute	ai_output	Treat as hostile. Do not execute. Investigate the source prompt and upstream inputs.
Warn when output attempts to rewrite safety rules/policies or reduce constraints.	WARN	security.policy_change, workflow.modify	ai_output	Only allow explicit human-reviewed policy updates with a clear change ticket and rollback plan.
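
The table above can also be read as data: each policy lists the ActionKinds it applies to and the context fields it requires. The Python sketch below encodes the three rows and shows how a client might check, before submitting an action, which policies are in scope and which context fields are still missing. The `Policy` class and the helpers `applicable_policies` and `missing_context` are hypothetical and not part of any managed API; the policies themselves are enforced automatically and are not configured this way.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Policy:
    summary: str
    severity: str
    applies_to: FrozenSet[str]
    required_context: FrozenSet[str]


# The three rows of the table, transcribed as data.
POLICIES = [
    Policy(
        "Block prompt injection and instruction override attempts.",
        "BLOCK",
        frozenset({"workflow.execute", "workflow.modify", "infra.change",
                   "money.refund", "data.export", "messaging.send"}),
        frozenset({"ai_output", "system_instructions", "user_input"}),
    ),
    Policy(
        "Block any attempt to bypass the Safety Gate or disable enforcement.",
        "BLOCK",
        frozenset({"security.policy_change", "workflow.modify",
                   "workflow.execute"}),
        frozenset({"ai_output"}),
    ),
    Policy(
        "Warn when output attempts to rewrite safety rules/policies "
        "or reduce constraints.",
        "WARN",
        frozenset({"security.policy_change", "workflow.modify"}),
        frozenset({"ai_output"}),
    ),
]


def applicable_policies(action_kind: str) -> List[Policy]:
    """Return the policies whose 'Applies to' column lists action_kind."""
    return [p for p in POLICIES if action_kind in p.applies_to]


def missing_context(action_kind: str, context: Dict[str, object]) -> List[str]:
    """Return required context fields absent from context, sorted by name."""
    needed = set()
    for policy in applicable_policies(action_kind):
        needed |= policy.required_context
    return sorted(needed - set(context))


if __name__ == "__main__":
    for policy in applicable_policies("workflow.execute"):
        print(policy.severity, "-", policy.summary)
    print(missing_context("workflow.execute", {"ai_output": "..."}))
```

A pre-flight check like this only helps a caller assemble complete context; the decision itself still comes from the managed enforcement.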

Legal & Responsibility Notice

Summary

Informational only

Provided for general guidance. Not legal, compliance, security, or professional advice.

You control implementation

You are responsible for policies, prompts, integrations, workflows, and regulatory requirements.

Liability limitation

To the maximum extent permitted by law, the company disclaims liability for losses arising from use of this documentation or implementations based on it.