Developer Guide

How to model agents, compose workflows, and build production systems on AgentWorld.

Developers should approach AgentWorld as an execution platform, not a prompt playground.

The goal is to express a business process as typed runtime components. Start with roles and state. Add tools second. Add autonomy last.

1. Define the business unit

Choose the boundary that owns policy, budgets, and treasury.

2. Model the workflow

Write the trigger, checkpoints, terminal states, and required approvals.

3. Define agents by role

Create narrow workers with clear objectives, memory scope, and tool access.

4. Add tool adapters

Expose only the minimal side effects needed for the role.

5. Attach policy

Set spend limits, retry ceilings, escalation conditions, and approval rules.

6. Instrument everything

Capture run metrics, queue depth, error classes, and settlement outcomes.
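The workflow and policy steps can be made concrete with a small typed model. The following is a minimal sketch, not AgentWorld's actual API: `State`, `TRANSITIONS`, `Policy`, and `WorkflowRun` are hypothetical names chosen for illustration, and the state names are assumptions. It shows a workflow whose legal transitions and terminal states are declared up front, with a spend limit enforced by the run itself.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Hypothetical workflow states; a real workflow would define its own.
    TRIGGERED = "triggered"
    DRAFTED = "drafted"
    AWAITING_APPROVAL = "awaiting_approval"
    COMPLETED = "completed"
    REJECTED = "rejected"


# Allowed transitions. Terminal states have no outgoing edges.
TRANSITIONS = {
    State.TRIGGERED: {State.DRAFTED},
    State.DRAFTED: {State.AWAITING_APPROVAL, State.COMPLETED},
    State.AWAITING_APPROVAL: {State.COMPLETED, State.REJECTED},
    State.COMPLETED: set(),
    State.REJECTED: set(),
}


@dataclass(frozen=True)
class Policy:
    spend_limit_usd: float


@dataclass
class WorkflowRun:
    policy: Policy
    state: State = State.TRIGGERED
    spent_usd: float = 0.0

    def advance(self, target: State) -> None:
        """Move to `target` only if the transition is declared legal."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {target.value}"
            )
        self.state = target

    def charge(self, amount_usd: float) -> None:
        """Record spend, refusing anything that would exceed the policy limit."""
        if self.spent_usd + amount_usd > self.policy.spend_limit_usd:
            raise PermissionError("spend limit exceeded; escalate for approval")
        self.spent_usd += amount_usd

    @property
    def is_terminal(self) -> bool:
        return not TRANSITIONS[self.state]
```

Declaring transitions as data keeps the workflow inspectable: a reviewer can read the table without tracing agent behavior.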

Agent design pattern

The safest pattern is one role, one job, one authority surface.

A support triage agent should classify and route. It should not also issue refunds. A treasury agent should settle approved obligations. It should not also rewrite policy. Narrow roles produce cleaner prompts, better evaluations, and smaller blast radius.
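One way to express "one authority surface" in code is to attach an explicit tool allow-list to each role and check it before every call. This is a minimal sketch under stated assumptions: the `Role` class and the tool names `tickets.read`, `tickets.route`, and `refunds.issue` are hypothetical and do not come from AgentWorld.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    objective: str
    allowed_tools: frozenset

    def authorize(self, tool: str) -> None:
        """Raise if the tool is outside this role's authority surface."""
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not call {tool}")


# Hypothetical triage role: it can classify and route, nothing else.
triage = Role(
    name="support-triage",
    objective="classify and route tickets",
    allowed_tools=frozenset({"tickets.read", "tickets.route"}),
)

triage.authorize("tickets.route")  # allowed, returns None
```

A call such as `triage.authorize("refunds.issue")` raises `PermissionError`, so the refund capability has to live in a separate role.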

Suggested build manifest

business_unit: growth
workflow: outbound-campaign
trigger: new-target-list
agents:
  - name: research-agent
    objective: enrich target accounts
    tools: [web.search, crm.read, notes.write]
  - name: offer-agent
    objective: draft personalized outbound offers
    tools: [crm.read, templates.read, drafts.write]
  - name: review-agent
    objective: validate policy and brand compliance
    tools: [policy.check, drafts.read]
approvals:
  - required_for: high_value_offer
    reviewer: sales-lead
budgets:
  token_limit: 120000
  spend_limit_usd: 200

The exact syntax can vary. The structural idea should not.

Tool adapter guidance

Tool adapters should be boring.

They should expose a stable interface, clear schemas, explicit scopes, and predictable errors. Avoid adapters that return giant ambiguous payloads or perform multiple side effects in one call.
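A "boring" adapter might look like the sketch below. Everything here is an illustrative assumption rather than an AgentWorld interface: `DraftsWriteAdapter`, `ToolError`, `ErrorClass`, and the in-memory store are hypothetical. The sketch shows a typed request and result, one required scope, one side effect per call, and errors that carry a class instead of a free-form message.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorClass(Enum):
    INVALID_INPUT = "invalid_input"
    FORBIDDEN = "forbidden"


class ToolError(Exception):
    """Adapter failure with a machine-readable class."""

    def __init__(self, error_class: ErrorClass, message: str) -> None:
        super().__init__(message)
        self.error_class = error_class


@dataclass(frozen=True)
class DraftWriteRequest:
    account_id: str
    body: str


@dataclass(frozen=True)
class DraftWriteResult:
    draft_id: str


class DraftsWriteAdapter:
    """Hypothetical adapter with exactly one side effect: storing a draft."""

    required_scope = "drafts.write"

    def __init__(self) -> None:
        self._drafts = {}  # in-memory stand-in for a real drafts store

    def call(self, request: DraftWriteRequest, granted_scopes: set) -> DraftWriteResult:
        if self.required_scope not in granted_scopes:
            raise ToolError(ErrorClass.FORBIDDEN, "missing scope drafts.write")
        if not request.body.strip():
            raise ToolError(ErrorClass.INVALID_INPUT, "body must not be empty")
        draft_id = f"draft-{len(self._drafts) + 1}"
        self._drafts[draft_id] = request
        return DraftWriteResult(draft_id=draft_id)
```

Because the result is a small typed object rather than a large payload, it can be validated before anything is promoted into workflow state.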

Evaluation strategy

A strong evaluation stack should test:

  • task completion rate

  • structured output validity

  • policy violation rate

  • approval deferral rate

  • tool failure rate

  • average cost per completed workflow

  • settlement reconciliation accuracy

These metrics are more useful than generic chat benchmarks.
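Most of the metrics above reduce to simple ratios over per-run records. The sketch below assumes a hypothetical `RunRecord` shape; the field names are illustrative and not an AgentWorld schema. Settlement reconciliation accuracy is left out because it depends on ledger data that this record does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    # Hypothetical per-run record; field names are assumptions.
    completed: bool
    output_valid: bool
    policy_violations: int
    deferred_to_approval: bool
    tool_calls: int
    tool_failures: int
    cost_usd: float


def summarize(runs: list) -> dict:
    """Aggregate run records into workflow-level evaluation metrics."""
    if not runs:
        raise ValueError("need at least one run")
    total = len(runs)
    completed = sum(r.completed for r in runs)
    tool_calls = sum(r.tool_calls for r in runs)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "task_completion_rate": completed / total,
        "output_validity_rate": sum(r.output_valid for r in runs) / total,
        # Share of runs with at least one policy violation.
        "policy_violation_rate": sum(r.policy_violations > 0 for r in runs) / total,
        "approval_deferral_rate": sum(r.deferred_to_approval for r in runs) / total,
        "tool_failure_rate": (
            sum(r.tool_failures for r in runs) / tool_calls if tool_calls else 0.0
        ),
        # Cost of failed runs is included, so failures raise this number.
        "cost_per_completed_workflow": (
            total_cost / completed if completed else float("inf")
        ),
    }
```

Charging failed runs against completed ones is a deliberate choice in this sketch: it makes wasted spend visible in a single number.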

Environment strategy

Use different assumptions per environment.

Local

Run with mock tools, synthetic wallets, and aggressive logging.

Staging

Use real integrations with reduced scopes, test funds, and approval-heavy policy.

Production

Use minimal tool access, live policy versions, signer separation, and strict observability.
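The per-environment assumptions can be written down as data so that they are reviewed like any other configuration. This is a sketch only: `EnvProfile`, its field names, and the string values are hypothetical, and values the text does not state (such as the local policy mode) are marked as assumptions in comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvProfile:
    tools: str               # how tool adapters are wired
    funds: str               # what kind of wallet or funds are used
    policy: str              # which policy posture applies
    signer_separation: bool  # stated for production only; others are assumptions


PROFILES = {
    # Local: mock tools and synthetic wallets. The "relaxed" policy is an assumption.
    "local": EnvProfile("mock", "synthetic", "relaxed", False),
    # Staging: real integrations with reduced scopes, test funds, heavy approvals.
    "staging": EnvProfile("real-reduced-scope", "test", "approval-heavy", False),
    # Production: minimal tool access, live policy versions, signer separation.
    "production": EnvProfile("real-minimal", "live", "live-versioned", True),
}


def profile_for(env_name: str) -> EnvProfile:
    """Look up a profile, failing loudly on an unknown environment name."""
    try:
        return PROFILES[env_name]
    except KeyError:
        raise ValueError(f"unknown environment: {env_name!r}") from None
```

Failing on an unknown name, rather than falling back to a default, avoids silently running with the wrong funds or scopes.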

Observability checklist

Developers should be able to answer these questions quickly:

  • Which workflows are stuck?

  • Which agents consume the most cost?

  • Which approvals create the most delay?

  • Which tool adapters fail most often?

  • Which transaction intents did not reconcile cleanly?

If these answers are hard to get, the platform is under-instrumented.
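If the runtime emits structured events, two of these questions become short queries. The sketch below assumes a hypothetical `Event` record with `started`, `finished`, and `tool_failed` kinds; none of these names are taken from AgentWorld.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical runtime log entry; field names are assumptions.
    workflow_id: str
    kind: str            # "started", "finished", or "tool_failed"
    timestamp: float     # seconds since some epoch
    tool: str = ""       # set only for "tool_failed" events


def stuck_workflows(events: list, now: float, max_age_seconds: float) -> list:
    """Workflows that started, never finished, and exceed the age limit."""
    started = {}
    finished = set()
    for event in events:
        if event.kind == "started":
            started[event.workflow_id] = event.timestamp
        elif event.kind == "finished":
            finished.add(event.workflow_id)
    return sorted(
        wf for wf, ts in started.items()
        if wf not in finished and now - ts > max_age_seconds
    )


def failures_by_tool(events: list) -> list:
    """(tool, failure count) pairs, most frequent first."""
    counts = Counter(e.tool for e in events if e.kind == "tool_failed")
    return counts.most_common()
```

The same event stream could answer the cost and approval-delay questions if events also carried cost and approval timestamps, which this sketch does not model.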

Anti-patterns

Avoid these patterns from the start:

  • giant general-purpose agents

  • hidden prompt state as business memory

  • direct wallet access from open-ended tools

  • untyped tool outputs promoted to state

  • retries without failure classification

  • approvals happening outside the runtime log
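As a counterexample to "retries without failure classification", the sketch below retries only failures that are explicitly classified as transient and stops at a retry ceiling. `FailureClass`, `ClassifiedError`, and `run_with_retries` are hypothetical names for illustration, and the three classes shown are an assumed taxonomy.

```python
import time
from enum import Enum


class FailureClass(Enum):
    TRANSIENT = "transient"   # e.g. timeout; safe to retry
    PERMANENT = "permanent"   # e.g. invalid input; retrying cannot help
    POLICY = "policy"         # blocked by policy; escalate instead


class ClassifiedError(Exception):
    def __init__(self, failure_class: FailureClass, message: str) -> None:
        super().__init__(message)
        self.failure_class = failure_class


def run_with_retries(operation, retry_ceiling: int, backoff_seconds: float = 0.0):
    """Call `operation`, retrying at most `retry_ceiling` times on transient errors.

    Permanent and policy failures are re-raised immediately. Unclassified
    exceptions are not caught at all, so they also propagate without a retry.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation()
        except ClassifiedError as err:
            out_of_retries = attempts > retry_ceiling
            if err.failure_class is not FailureClass.TRANSIENT or out_of_retries:
                raise
            # Linear backoff: wait longer after each failed attempt.
            time.sleep(backoff_seconds * attempts)
```

With `retry_ceiling=2`, an operation that keeps failing transiently is attempted three times in total: the first call plus two retries.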

The fastest way to a robust AgentWorld app is to reduce ambiguity at every interface.
