ADR-007: Guardrail SPI Design

Status

Accepted (v0.7)

Context

Kairo needs a request/response interception mechanism for safety concerns including content filtering, prompt injection detection, PII filtering, and output validation.

The existing PermissionGuard SPI is tool-scoped — it checks tool name + input only. It cannot intercept model requests/responses or tool outputs. Guardrails need to operate at four distinct boundary points: PRE_MODEL, POST_MODEL, PRE_TOOL, and POST_TOOL.

Overloading PermissionGuard with model-level interception would violate its single responsibility and bloat its contract with unrelated concerns.

Decision

New GuardrailPolicy SPI

Introduce a new GuardrailPolicy SPI in kairo-api (package io.kairo.api.guardrail), marked @Experimental, completely separate from PermissionGuard.

Core types (6 total):

- GuardrailPolicy: SPI interface with evaluate(GuardrailContext): Mono<GuardrailDecision>
- GuardrailContext: record carrying the typed payload, agent identity, and metadata map
- GuardrailPayload: sealed interface with 4 variants:
  - ModelInput: messages about to be sent to the model
  - ModelOutput: messages returned from the model
  - ToolInput: tool name + arguments before execution
  - ToolOutput: tool result after execution
- GuardrailDecision: record with Action enum and optional reason/modified payload
- GuardrailDecision.Action: enum with values ALLOW, DENY, MODIFY, WARN
- DefaultGuardrailChain: ordered chain evaluator (in kairo-core)
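
To make the shape of these types concrete, here is a minimal, dependency-free sketch. It is not the actual kairo-api source. The field names, the Msg stand-in, the allow()/deny() factories, and the DENY_SHELL example policy are all assumptions for illustration, and CompletableFuture stands in for Reactor's Mono so the sketch runs on a plain JDK 17+.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

public class GuardrailTypesSketch {

    // Stand-in for the project's existing Msg type (its real fields are not shown in this ADR).
    record Msg(String role, String text) {}

    // Sealed payload: exactly one variant per boundary point.
    sealed interface GuardrailPayload permits ModelInput, ModelOutput, ToolInput, ToolOutput {}
    record ModelInput(List<Msg> messages) implements GuardrailPayload {}
    record ModelOutput(List<Msg> messages) implements GuardrailPayload {}
    record ToolInput(String toolName, Map<String, Object> arguments) implements GuardrailPayload {}
    record ToolOutput(String toolName, String result) implements GuardrailPayload {}

    // Decision with an action plus an optional reason and optional modified payload.
    record GuardrailDecision(GuardrailDecision.Action action,
                             Optional<String> reason,
                             Optional<GuardrailPayload> modifiedPayload) {
        enum Action { ALLOW, DENY, MODIFY, WARN }

        // Hypothetical convenience factories, not confirmed by the ADR.
        static GuardrailDecision allow() {
            return new GuardrailDecision(Action.ALLOW, Optional.empty(), Optional.empty());
        }
        static GuardrailDecision deny(String reason) {
            return new GuardrailDecision(Action.DENY, Optional.of(reason), Optional.empty());
        }
    }

    record GuardrailContext(GuardrailPayload payload, String agentId, Map<String, Object> metadata) {}

    // Assumption: CompletableFuture replaces Mono to keep the sketch free of external dependencies.
    interface GuardrailPolicy {
        CompletableFuture<GuardrailDecision> evaluate(GuardrailContext context);
    }

    // Hypothetical example policy: deny any call to a tool named "shell".
    static final GuardrailPolicy DENY_SHELL = ctx -> {
        if (ctx.payload() instanceof ToolInput in && in.toolName().equals("shell")) {
            return CompletableFuture.completedFuture(GuardrailDecision.deny("shell tool is blocked"));
        }
        return CompletableFuture.completedFuture(GuardrailDecision.allow());
    };

    public static void main(String[] args) {
        GuardrailContext ctx = new GuardrailContext(
                new ToolInput("shell", Map.of("cmd", "ls")), "agent-1", Map.of());
        System.out.println(DENY_SHELL.evaluate(ctx).join().action());
    }
}
```

Because the payload is sealed, a policy can match on the variant it cares about and pass everything else through, as DENY_SHELL does with ToolInput.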

GuardrailPayload reuses List<Msg>, the project's existing message type, rather than introducing a new Message type. This avoids an unnecessary abstraction.

Chain Evaluation Semantics

DefaultGuardrailChain evaluates policies in registration order:

- Short-circuits on DENY: remaining policies are not evaluated.
- Merges MODIFY payloads sequentially: each policy sees the modified output of the previous one.
- WARN is recorded but does not halt the chain.
- Empty chain returns ALLOW (no-op), so there is zero overhead when no policies are registered.
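
The four rules above can be sketched as a simple loop. This is an illustrative, synchronous approximation rather than the DefaultGuardrailChain implementation: payloads are plain strings, policies are Function values, and the ChainResult record (including the choice to report MODIFY when any policy changed the payload) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class GuardrailChainSketch {

    enum Action { ALLOW, DENY, MODIFY, WARN }

    // reason is used by DENY and WARN; payload is used by MODIFY.
    record Decision(Action action, String reason, String payload) {
        static Decision allow() { return new Decision(Action.ALLOW, null, null); }
        static Decision deny(String reason) { return new Decision(Action.DENY, reason, null); }
        static Decision modify(String payload) { return new Decision(Action.MODIFY, null, payload); }
        static Decision warn(String reason) { return new Decision(Action.WARN, reason, null); }
    }

    // Hypothetical aggregate result: final action, payload after modifications, recorded warnings.
    record ChainResult(Action action, String payload, String denyReason, List<String> warnings) {}

    static ChainResult evaluate(List<Function<String, Decision>> policies, String payload) {
        String current = payload;
        boolean modified = false;
        List<String> warnings = new ArrayList<>();
        for (Function<String, Decision> policy : policies) {
            Decision d = policy.apply(current); // each policy sees the output of the previous one
            switch (d.action()) {
                case DENY -> {
                    // Short-circuit: later policies are never invoked.
                    return new ChainResult(Action.DENY, current, d.reason(), warnings);
                }
                case MODIFY -> {
                    current = d.payload();
                    modified = true;
                }
                case WARN -> warnings.add(d.reason()); // recorded, chain continues
                case ALLOW -> { }
            }
        }
        // An empty list falls straight through to ALLOW with the payload untouched.
        return new ChainResult(modified ? Action.MODIFY : Action.ALLOW, current, null, warnings);
    }

    public static void main(String[] args) {
        List<Function<String, Decision>> policies = List.of(
                p -> Decision.modify(p.replace("secret", "[redacted]")),
                p -> p.contains("secret") ? Decision.deny("leak") : Decision.warn("checked"));
        System.out.println(evaluate(policies, "my secret"));
    }
}
```

In the demo, the second policy only sees the redacted text, so it warns instead of denying, which is the sequential MODIFY behavior described above.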

Tool Pipeline Ordering

The full tool execution pipeline with guardrail placement:

CircuitBreaker → ActiveToolConstraints → PlanMode → PermissionGuard → Guardrail(PRE_TOOL) → Execution → Guardrail(POST_TOOL) → Sanitize

Key invariants:

- PermissionGuard rejects BEFORE Guardrail, consistent with the "static first, dynamic second" principle.
- A PermissionGuard denial does NOT trigger Guardrail evaluation.
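
The ordering invariant can be illustrated with a small sketch that walks the pre-execution stages in sequence and stops at the first rejection. The Stage and Outcome records and the defaultStages helper are hypothetical, and the first three stages are modeled as always passing. The real DefaultToolExecutor is not shown in this ADR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class ToolPipelineSketch {

    // A pre-execution stage returns a rejection reason, or empty to let the call proceed.
    record Stage(String name, Function<String, Optional<String>> check) {}

    // visited lists every stage that was actually reached, in order.
    record Outcome(boolean executed, String rejectedBy, List<String> visited) {}

    static List<Stage> defaultStages(Function<String, Optional<String>> permissionGuard,
                                     Function<String, Optional<String>> preToolGuardrail) {
        Function<String, Optional<String>> pass = tool -> Optional.empty();
        return List.of(
                new Stage("CircuitBreaker", pass),
                new Stage("ActiveToolConstraints", pass),
                new Stage("PlanMode", pass),
                new Stage("PermissionGuard", permissionGuard),
                new Stage("Guardrail(PRE_TOOL)", preToolGuardrail));
    }

    static Outcome run(List<Stage> preStages, String toolName) {
        List<String> visited = new ArrayList<>();
        for (Stage stage : preStages) {
            visited.add(stage.name());
            Optional<String> rejection = stage.check().apply(toolName);
            if (rejection.isPresent()) {
                // First rejection wins: later stages, including guardrails, are skipped.
                return new Outcome(false, stage.name(), visited);
            }
        }
        visited.add("Execution");
        visited.add("Guardrail(POST_TOOL)");
        visited.add("Sanitize");
        return new Outcome(true, null, visited);
    }

    public static void main(String[] args) {
        List<Stage> stages = defaultStages(
                tool -> tool.equals("rm") ? Optional.of("not permitted") : Optional.empty(),
                tool -> Optional.empty());
        System.out.println(run(stages, "rm").visited());
        System.out.println(run(stages, "read_file").visited());
    }
}
```

For the denied tool, the visited list ends at PermissionGuard and never reaches Guardrail(PRE_TOOL), which matches the second invariant.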

MCP Integration

MCP static policy (allow/deny lists) is implemented as McpStaticGuardrailPolicy with order = Integer.MIN_VALUE, the lowest possible order value, within the same DefaultGuardrailChain. This gives a single implementation point and a single audit trail. See ADR-009 for the full MCP security design.
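
As a rough illustration of how an order value of Integer.MIN_VALUE could place the MCP policy ahead of all others, the sketch below sorts policies by ascending order. That lower values run first is an assumption borrowed from common ordering conventions and is not stated in this ADR. The NamedPolicy record and the policy names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PolicyOrderingSketch {

    // Hypothetical pairing of a policy name with its order value.
    record NamedPolicy(String name, int order) {}

    static List<String> evaluationOrder(List<NamedPolicy> registered) {
        List<NamedPolicy> sorted = new ArrayList<>(registered);
        // comparingInt delegates to Integer.compare, which is safe for Integer.MIN_VALUE
        // (a subtraction-based comparator would overflow). The sort is stable, so
        // policies with equal order keep their registration order.
        sorted.sort(Comparator.comparingInt(NamedPolicy::order));
        return sorted.stream().map(NamedPolicy::name).toList();
    }

    public static void main(String[] args) {
        List<NamedPolicy> registered = List.of(
                new NamedPolicy("content-filter", 0),
                new NamedPolicy("pii-redactor", 10),
                new NamedPolicy("mcp-static", Integer.MIN_VALUE),
                new NamedPolicy("tool-audit", 0));
        System.out.println(evaluationOrder(registered));
    }
}
```

Under this assumption the MCP allow/deny check runs before any other policy regardless of when it was registered, so a DENY from it short-circuits the rest of the chain.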

Consequences

- Positive: New package io.kairo.api.guardrail with 6 well-scoped types.
- Positive: DefaultGuardrailChain in kairo-core provides ordered, short-circuit evaluation.
- Positive: GuardrailChain is injected into ReActLoop (model boundaries) and DefaultToolExecutor (tool boundaries), so all four interception points are covered.
- Positive: Empty chain is zero-overhead, so there is no performance impact for users who don't register policies.
- Negative: All v0.7 guardrail types are @Experimental, so the contract may change in v0.8.
- Negative: Adds complexity to the tool pipeline ordering, which must be carefully documented.

References

ADR-004 (Exception Hierarchy Design)
PermissionGuard SPI in kairo-api
OWASP Agentic AI Top 10

Spring Boot Configuration

Example application.yml registering guardrail policies:

kairo:
  guardrail:
    policies:
      - name: content-filter
        phase: PRE_MODEL
        order: 0
      - name: pii-redactor
        phase: POST_MODEL
        order: 10
      - name: tool-audit
        phase: PRE_TOOL
        order: 0
