Production Readiness

This page documents practical production-hardening guidance for the current framework release.

Scope

ai-agent-framework is a library, not a hosted runtime. Production readiness mostly depends on how you run it inside your application.

1. Error Boundaries

Handle framework errors at your app boundary and map them to safe user responses.

Common errors:

  • ModelError: provider call failed
  • ToolNotFoundError: model requested unknown tool
  • ToolValidationError: tool args did not match zod schema
  • MaxStepsExceededError: agent loop exhausted step budget
  • PromptTemplateError: missing prompt variables
  • OutputParserError: invalid parser input (for example non-JSON in JSON parser)

Recommended pattern:

  1. catch framework errors in one place
  2. return sanitized responses to users
  3. log structured details internally with request IDs
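
The pattern above can be sketched as a single boundary function. This is a minimal sketch, not the framework's API: the error classes are local stand-ins so the example runs on its own (in an application they would be imported from the framework), and `toSafeResponse`, `handleRequest`, the status codes, and the messages are hypothetical application choices.

```typescript
// Stand-ins for the framework's error classes so this sketch is self-contained.
// In a real application, import the framework's own classes instead.
class ModelError extends Error {}
class ToolNotFoundError extends Error {}
class ToolValidationError extends Error {}
class MaxStepsExceededError extends Error {}
class PromptTemplateError extends Error {}
class OutputParserError extends Error {}

interface SafeResponse {
  status: number;
  message: string;
  requestId: string;
}

// Hypothetical mapping: status codes and wording are application decisions.
function toSafeResponse(err: unknown, requestId: string): SafeResponse {
  if (err instanceof ModelError) {
    return { status: 502, message: "The model provider is unavailable. Please retry.", requestId };
  }
  if (err instanceof MaxStepsExceededError) {
    return { status: 422, message: "The request could not be completed within its step budget.", requestId };
  }
  if (
    err instanceof ToolNotFoundError ||
    err instanceof ToolValidationError ||
    err instanceof OutputParserError
  ) {
    return { status: 502, message: "The assistant produced an invalid response. Please retry.", requestId };
  }
  if (err instanceof PromptTemplateError) {
    return { status: 500, message: "Internal configuration error.", requestId };
  }
  return { status: 500, message: "Unexpected error.", requestId };
}

// One boundary: run the work, log full details internally, return only sanitized data.
async function handleRequest<T>(
  requestId: string,
  run: () => Promise<T>,
  log: (entry: Record<string, unknown>) => void = (entry) => console.error(JSON.stringify(entry)),
): Promise<{ ok: true; data: T } | { ok: false; error: SafeResponse }> {
  try {
    return { ok: true, data: await run() };
  } catch (err) {
    const e = err instanceof Error ? err : new Error(String(err));
    log({ requestId, errorName: e.constructor.name, message: e.message, stack: e.stack });
    return { ok: false, error: toSafeResponse(err, requestId) };
  }
}
```

The raw error message and stack go only to the internal log; the user-facing response carries the request ID so support can correlate the two.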

2. Timeouts And Retries

The framework does not enforce provider timeouts/retries for you.

Recommended:

  • apply network timeout at provider client layer
  • retry only transient failures (rate limits, timeouts, transport errors)
  • cap retries and use jittered backoff
  • avoid retrying deterministic validation failures
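
A minimal sketch of capped retries with jittered backoff follows. `TransientError`, `RetryOptions`, `backoffDelayMs`, and `withRetry` are hypothetical names, not framework exports. The sketch assumes the provider client already enforces its own network timeout and that its transient failures (rate limits, timeouts, transport errors) are mapped onto `TransientError` or recognized by a custom `isTransient` predicate.

```typescript
// Hypothetical marker class; map your provider client's transient failures onto it.
class TransientError extends Error {}

interface RetryOptions {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // backoff ceiling for the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// "Full jitter": a uniform random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

async function withRetry<T>(
  call: () => Promise<T>,
  opts: RetryOptions,
  isTransient: (err: unknown) => boolean = (err) => err instanceof TransientError,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  if (opts.maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      const isLastAttempt = attempt === opts.maxAttempts - 1;
      // Deterministic failures (for example validation errors) are rethrown immediately.
      if (isLastAttempt || !isTransient(err)) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
  throw new Error("unreachable: the loop always returns or throws");
}
```

Because the default predicate only matches `TransientError`, errors such as ToolValidationError or PromptTemplateError pass straight through without a retry.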

3. Agent Guardrails

For tool-using agents:

  • keep maxSteps conservative (start around 6-12)
  • keep tool schemas strict and explicit
  • keep tool side effects idempotent when possible
  • require confirmation for destructive operations at app layer
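
The confirmation requirement can be sketched as a small gate that the application consults before running a tool handler. Everything here is illustrative: `ToolCall`, `GateDecision`, `gateToolCall`, and the tool names are hypothetical and not part of the framework.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type GateDecision =
  | { action: "execute" }
  | { action: "confirm"; prompt: string };

// Hypothetical tool names; list whichever tools have irreversible effects.
const DESTRUCTIVE_TOOLS: ReadonlySet<string> = new Set(["delete_record", "send_payment"]);

// Decide at the app layer whether a requested tool call may run right away.
function gateToolCall(call: ToolCall, userConfirmed: boolean): GateDecision {
  if (DESTRUCTIVE_TOOLS.has(call.name) && !userConfirmed) {
    return {
      action: "confirm",
      prompt: `Confirm ${call.name} with ${JSON.stringify(call.args)}?`,
    };
  }
  return { action: "execute" };
}
```

Keeping the gate outside the model loop means the model cannot talk its way past it: only an explicit user confirmation recorded by the application flips `userConfirmed`.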

4. Observability Baseline

Minimum signals to capture:

  • request ID / trace ID
  • prompt + tool execution latency
  • model/token usage from provider responses (if available)
  • tool call counts and failure rates
  • parser failure rates
  • max-step exhaustion count

Use hooks for lifecycle instrumentation:

  • hooks.onStart(state)
  • hooks.onEnd(state, result)

Runtime spans are also available on state for step-level timings.
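
As a rough illustration, the two hooks can feed a latency counter. This sketch assumes the same `state` object instance is passed to both `onStart` and `onEnd`; if that does not hold in your release, key the timings on a request ID instead. `createTimingHooks` and the metrics shape are hypothetical, and the exact way hooks are registered is not shown here.

```typescript
interface RunMetrics {
  runs: number;
  totalLatencyMs: number;
  maxLatencyMs: number;
}

// Builds onStart/onEnd handlers that record per-run latency.
// The clock is injectable so the behavior can be tested deterministically.
function createTimingHooks(now: () => number = Date.now) {
  // WeakMap lets finished state objects be garbage collected.
  const startedAt = new WeakMap<object, number>();
  const metrics: RunMetrics = { runs: 0, totalLatencyMs: 0, maxLatencyMs: 0 };

  return {
    metrics,
    hooks: {
      onStart(state: object): void {
        startedAt.set(state, now());
      },
      onEnd(state: object, _result: unknown): void {
        const start = startedAt.get(state);
        if (start === undefined) {
          return; // onEnd without a matching onStart: nothing to record
        }
        startedAt.delete(state);
        const elapsed = now() - start;
        metrics.runs += 1;
        metrics.totalLatencyMs += elapsed;
        metrics.maxLatencyMs = Math.max(metrics.maxLatencyMs, elapsed);
      },
    },
  };
}
```

The collected `metrics` object can then be flushed to whatever metrics pipeline the application already uses.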

5. Prompt And Output Safety

  • enforce strict output contracts with JsonOutputParser where possible
  • validate downstream business constraints after parsing
  • version prompts intentionally; treat prompt changes like code changes
  • never trust model output directly for privileged actions
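
The sketch below shows the two-stage idea: first enforce the output contract, then check business constraints before acting. It uses plain `JSON.parse` as a stand-in for the parsing step, since the exact JsonOutputParser API is not shown on this page. `RefundDecision`, `parseRefundDecision`, and `checkRefundPolicy` are hypothetical.

```typescript
interface RefundDecision {
  orderId: string;
  amountCents: number;
}

// Stage 1: structural contract. Rejects anything that is not the expected JSON shape.
function parseRefundDecision(raw: string): RefundDecision {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    throw new Error("model output is not valid JSON");
  }
  if (typeof value !== "object" || value === null) {
    throw new Error("model output is not a JSON object");
  }
  const record = value as Record<string, unknown>;
  const orderId = record["orderId"];
  const amountCents = record["amountCents"];
  if (typeof orderId !== "string" || orderId.length === 0) {
    throw new Error("orderId must be a non-empty string");
  }
  if (typeof amountCents !== "number" || !Number.isInteger(amountCents) || amountCents <= 0) {
    throw new Error("amountCents must be a positive integer");
  }
  return { orderId, amountCents };
}

// Stage 2: business constraint, checked against trusted application data.
function checkRefundPolicy(decision: RefundDecision, orderTotalCents: number): boolean {
  return decision.amountCents <= orderTotalCents;
}
```

A well-formed output can still be wrong: a refund larger than the order total parses cleanly and is only caught by the second stage.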

6. Tool Safety

  • least-privilege tool design
  • authz checks inside tool handlers
  • redact secrets in tool outputs before storing in memory/logs
  • rate-limit expensive or external tools
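
Redaction can be sketched as a recursive pass over a tool's output before it is written to memory or logs. The key names and token patterns below are illustrative assumptions and will need tuning for the secrets your tools actually handle; `redact` is a hypothetical helper, not a framework function.

```typescript
// Illustrative patterns only; extend them for the credentials your tools can see.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9]{16,}/g,
  /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g,
];
const SENSITIVE_KEYS = /^(api[_-]?key|token|password|secret|authorization)$/i;

// Returns a redacted copy; the input value is not mutated.
function redact(value: unknown): unknown {
  if (typeof value === "string") {
    return SECRET_PATTERNS.reduce((text, pattern) => text.replace(pattern, "[REDACTED]"), value);
  }
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      // Sensitive keys are masked wholesale; other values are scanned recursively.
      out[key] = SENSITIVE_KEYS.test(key) ? "[REDACTED]" : redact(source[key]);
    }
    return out;
  }
  return value;
}
```

Pattern-based redaction is a safety net, not a guarantee; the stronger control is to keep secrets out of tool outputs in the first place.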

7. Configuration Hygiene

  • keep API keys in environment/secret manager, not code
  • separate staging and production model configs
  • pin model names intentionally and review changes before upgrades
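
One way to apply these points is a small loader that reads secrets from the environment and selects a pinned model per stage. The variable names (`APP_STAGE`, `MODEL_API_KEY`) and the model identifiers are placeholders chosen for this sketch, not names the framework defines.

```typescript
type Stage = "staging" | "production";

interface ModelConfig {
  stage: Stage;
  apiKey: string;
  model: string;
}

// Placeholder identifiers; pin the exact model versions you have reviewed.
const PINNED_MODELS: Record<Stage, string> = {
  staging: "example-model-candidate",
  production: "example-model-stable",
};

function isStage(value: string | undefined): value is Stage {
  return value === "staging" || value === "production";
}

// Fails fast at startup instead of at the first provider call.
function loadModelConfig(env: Record<string, string | undefined>): ModelConfig {
  const stage = env["APP_STAGE"];
  if (!isStage(stage)) {
    throw new Error("APP_STAGE must be 'staging' or 'production'");
  }
  const apiKey = env["MODEL_API_KEY"];
  if (!apiKey) {
    throw new Error("MODEL_API_KEY is not set");
  }
  return { stage, apiKey, model: PINNED_MODELS[stage] };
}
```

In a Node.js application this would typically be called once at startup as `loadModelConfig(process.env)`. Changing a pinned model then shows up as a reviewable code change.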

8. Testing Strategy

Layered testing:

  1. unit test runnables, parsers, and tools in isolation
  2. integration test chain/agent orchestration with provider mocks
  3. golden tests for stable prompt/output contracts
  4. failure-path tests: tool validation, missing tools, max-step exhaustion

9. Rollout Plan

  1. ship chain-based workflows first
  2. add tools behind feature flags
  3. enable agent loops for a narrow user segment
  4. monitor latency, failure rates, and max-step exhaustions
  5. expand traffic only after error rates have stayed within the error budget
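
The staged rollout above can be driven by two flags: one that enables tools and one that sets the percentage of users who get agent loops. This is a sketch under assumed names (`RolloutFlags`, `bucketOf`, `selectMode`); in practice the flags would come from your own feature-flag system.

```typescript
// Hypothetical flag shape.
interface RolloutFlags {
  toolsEnabled: boolean;
  agentLoopPercent: number; // 0 to 100
}

type ExecutionMode = "chain" | "chain_with_tools" | "agent";

// Deterministic bucket in [0, 100) from an FNV-1a-style hash over UTF-16 code units,
// so a given user stays in the same segment across requests.
function bucketOf(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

function selectMode(userId: string, flags: RolloutFlags): ExecutionMode {
  if (!flags.toolsEnabled) {
    return "chain";
  }
  return bucketOf(userId) < flags.agentLoopPercent ? "agent" : "chain_with_tools";
}
```

Raising `agentLoopPercent` only adds users to the agent segment; users already in it stay there, which keeps latency and failure-rate comparisons between steps meaningful.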

Current Limits

The current release does not include:

  • built-in persistence or distributed queue execution
  • built-in auth/authz layer
  • built-in policy engine for tool permissions
  • built-in metrics export pipeline

Treat these as application responsibilities in the current release.