Developer Guide
How to model agents, compose workflows, and build production systems on AgentWorld.
Developers should approach AgentWorld as an execution platform, not a prompt playground.
The goal is to express a business process as typed runtime components. Start with roles and state. Add tools second. Add autonomy last.
Recommended build order
1. Define the business unit. Choose the boundary that owns policy, budgets, and treasury.
2. Model the workflow. Write the trigger, checkpoints, terminal states, and required approvals.
3. Define agents by role. Create narrow workers with clear objectives, memory scope, and tool access.
4. Add tool adapters. Expose only the minimal side effects needed for the role.
5. Attach policy. Set spend limits, retry ceilings, escalation conditions, and approval rules.
6. Instrument everything. Capture run metrics, queue depth, error classes, and settlement outcomes.
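As a rough illustration of steps 2 and 5, a workflow can be sketched as a small typed state machine with policy attached. Everything below is hypothetical (the State names, the Policy fields, and the request_spend method are assumptions made for this sketch, not AgentWorld APIs); it only shows the idea of explicit terminal states and a policy check that decides between completion, approval, and rejection.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    TRIGGERED = "triggered"
    AWAITING_APPROVAL = "awaiting_approval"
    COMPLETED = "completed"  # terminal
    REJECTED = "rejected"    # terminal


# Allowed transitions; terminal states have no outgoing edges.
TRANSITIONS = {
    State.TRIGGERED: {State.AWAITING_APPROVAL, State.COMPLETED, State.REJECTED},
    State.AWAITING_APPROVAL: {State.COMPLETED, State.REJECTED},
    State.COMPLETED: set(),
    State.REJECTED: set(),
}


@dataclass(frozen=True)
class Policy:
    spend_limit_usd: float         # hard ceiling for the whole run
    approval_threshold_usd: float  # single spends at or above this need a human


@dataclass
class WorkflowRun:
    policy: Policy
    state: State = State.TRIGGERED
    spent_usd: float = 0.0

    def advance(self, target: State) -> None:
        """Move to target, refusing any transition not listed above."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.state = target

    def request_spend(self, amount_usd: float) -> State:
        """Apply policy to a spend request and return the resulting state."""
        if self.spent_usd + amount_usd > self.policy.spend_limit_usd:
            self.advance(State.REJECTED)
        elif amount_usd >= self.policy.approval_threshold_usd:
            self.advance(State.AWAITING_APPROVAL)
        else:
            self.spent_usd += amount_usd
            self.advance(State.COMPLETED)
        return self.state
```

Keeping the transition table as data makes it easy to review which states are terminal and to reject anything the workflow model does not explicitly allow.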
Agent design pattern
The safest pattern is one role, one job, one authority surface.
A support triage agent should classify and route. It should not also issue refunds. A treasury agent should settle approved obligations. It should not also rewrite policy. Narrow roles produce cleaner prompts, better evaluations, and smaller blast radius.
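One way to make the authority surface concrete is to give each role an explicit allow-list of tools and check it before any call. The sketch below is an assumption-laden illustration: AgentRole, authorize, and the tool names tickets.read, tickets.route, and refunds.issue are invented for this example and are not part of any documented AgentWorld interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRole:
    name: str
    objective: str
    allowed_tools: frozenset[str]  # the role's entire authority surface

    def authorize(self, tool: str) -> None:
        """Raise if the tool is outside this role's authority surface."""
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not call {tool}")


# Hypothetical triage role: it can read and route tickets, nothing else.
triage = AgentRole(
    name="support-triage",
    objective="classify and route tickets",
    allowed_tools=frozenset({"tickets.read", "tickets.route"}),
)

triage.authorize("tickets.route")      # permitted: part of the job
try:
    triage.authorize("refunds.issue")  # outside the role, so rejected
except PermissionError as err:
    print(err)
```

Because the role is frozen, its authority cannot be widened at run time; granting refunds would require defining a separate role.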
Suggested build manifest
business_unit: growth
workflow: outbound-campaign
trigger: new-target-list
agents:
  - name: research-agent
    objective: enrich target accounts
    tools: [web.search, crm.read, notes.write]
  - name: offer-agent
    objective: draft personalized outbound offers
    tools: [crm.read, templates.read, drafts.write]
  - name: review-agent
    objective: validate policy and brand compliance
    tools: [policy.check, drafts.read]
approvals:
  - required_for: high_value_offer
    reviewer: sales-lead
budgets:
  token_limit: 120000
  spend_limit_usd: 200
The exact syntax can vary. The structural idea should not.
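A manifest like this can be checked structurally before anything runs. The following is a minimal sketch that assumes the manifest has already been parsed into a plain dictionary. The validate_manifest function and the rule that an agent should hold at most one tool ending in ".write" are illustrative assumptions, not platform requirements.

```python
# The manifest above, represented as an already-parsed dictionary.
MANIFEST = {
    "business_unit": "growth",
    "workflow": "outbound-campaign",
    "trigger": "new-target-list",
    "agents": [
        {"name": "research-agent", "objective": "enrich target accounts",
         "tools": ["web.search", "crm.read", "notes.write"]},
        {"name": "offer-agent", "objective": "draft personalized outbound offers",
         "tools": ["crm.read", "templates.read", "drafts.write"]},
        {"name": "review-agent", "objective": "validate policy and brand compliance",
         "tools": ["policy.check", "drafts.read"]},
    ],
    "approvals": [{"required_for": "high_value_offer", "reviewer": "sales-lead"}],
    "budgets": {"token_limit": 120000, "spend_limit_usd": 200},
}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    for key in ("business_unit", "workflow", "trigger", "agents", "budgets"):
        if key not in manifest:
            problems.append(f"missing top-level key: {key}")
    for agent in manifest.get("agents", []):
        name = agent.get("name", "<unnamed>")
        for field in ("name", "objective", "tools"):
            if field not in agent:
                problems.append(f"{name}: missing {field}")
        # Assumed convention: tools ending in ".write" have side effects.
        writes = [t for t in agent.get("tools", []) if t.endswith(".write")]
        if len(writes) > 1:
            problems.append(f"{name}: more than one write tool {writes}")
    return problems


print(validate_manifest(MANIFEST))
```

Whatever the concrete syntax, a check of this kind keeps the structural idea enforceable: every agent has a name, an objective, and a bounded tool list, and every workflow carries a budget.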
Tool adapter guidance
Tool adapters should be boring.
They should expose a stable interface, clear schemas, explicit scopes, and predictable errors. Avoid adapters that return giant ambiguous payloads or perform multiple side effects in one call.
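A "boring" adapter might look like the sketch below: one narrowly scoped read operation, a typed return value, and a small closed set of error kinds. All names here (CrmReadAdapter, AccountRecord, ToolError, ToolErrorKind) are hypothetical, and the in-memory dictionary stands in for whatever real CRM client an integration would use.

```python
from dataclasses import dataclass
from enum import Enum


class ToolErrorKind(Enum):
    NOT_FOUND = "not_found"
    FORBIDDEN = "forbidden"


class ToolError(Exception):
    """Single error type with a machine-readable kind, so callers can branch."""

    def __init__(self, kind: ToolErrorKind, message: str) -> None:
        super().__init__(message)
        self.kind = kind


@dataclass(frozen=True)
class AccountRecord:
    account_id: str
    name: str
    segment: str


class CrmReadAdapter:
    """Read-only adapter: one call, one lookup, no side effects."""

    scope = "crm.read"

    def __init__(self, records: dict[str, AccountRecord]) -> None:
        self._records = records  # stand-in for a real CRM client

    def get_account(self, account_id: str, granted_scopes: set[str]) -> AccountRecord:
        # Check the explicit scope first, then do exactly one lookup.
        if self.scope not in granted_scopes:
            raise ToolError(ToolErrorKind.FORBIDDEN, f"missing scope {self.scope}")
        record = self._records.get(account_id)
        if record is None:
            raise ToolError(ToolErrorKind.NOT_FOUND, f"no account {account_id}")
        return record
```

The caller receives either a small typed record or a classified error, never a large free-form payload that it has to interpret.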
Evaluation strategy
A strong evaluation stack should measure:
task completion rate
structured output validity
policy violation rate
approval deferral rate
tool failure rate
average cost per completed workflow
settlement reconciliation accuracy
These metrics are more useful than generic chat benchmarks because they measure workflow outcomes rather than conversational quality.
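The metrics listed above could be computed from per-run records along the following lines. This is a sketch under stated assumptions: the RunRecord fields are invented for illustration, amounts are held in integer cents to avoid floating-point comparison, and the precise definitions (for example, dividing total cost by the number of completed runs) are one reasonable choice among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    completed: bool
    output_valid: bool
    policy_violations: int
    deferred_for_approval: bool
    tool_calls: int
    tool_failures: int
    cost_usd: float
    settled_cents: int   # amount actually settled
    expected_cents: int  # amount the workflow intended to settle


def summarize(runs: list[RunRecord]) -> dict[str, float]:
    """Aggregate the evaluation metrics over a non-empty batch of runs."""
    if not runs:
        raise ValueError("no runs to summarize")
    n = len(runs)
    completed = sum(r.completed for r in runs)
    tool_calls = sum(r.tool_calls for r in runs)
    tool_failures = sum(r.tool_failures for r in runs)
    total_cost = sum(r.cost_usd for r in runs)
    reconciled = sum(r.settled_cents == r.expected_cents for r in runs)
    return {
        "task_completion_rate": completed / n,
        "structured_output_validity": sum(r.output_valid for r in runs) / n,
        "policy_violation_rate": sum(r.policy_violations > 0 for r in runs) / n,
        "approval_deferral_rate": sum(r.deferred_for_approval for r in runs) / n,
        "tool_failure_rate": tool_failures / tool_calls if tool_calls else 0.0,
        # Failed runs still cost money, so total cost is spread over completions.
        "avg_cost_per_completed_workflow": (
            total_cost / completed if completed else float("inf")
        ),
        "settlement_reconciliation_accuracy": reconciled / n,
    }
```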
Environment strategy
Each environment should run under its own assumptions about tools, funds, and policy.
Local
Run with mock tools, synthetic wallets, and aggressive logging.
Staging
Use real integrations with reduced scopes, test funds, and approval-heavy policy.
Production
Use minimal tool access, live policy versions, signer separation, and strict observability.
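The three environments can be captured as explicit, immutable profiles so that nothing falls back to production settings by accident. The field names and values below are assumptions chosen to mirror the descriptions above (for instance, treating "approval-heavy" as routing every spend through approval); they are not a documented AgentWorld configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProfile:
    tool_mode: str           # how tool adapters are wired
    funds: str               # what kind of wallet balance is at stake
    log_level: str
    approve_all_spend: bool  # True routes every spend through approval
    separate_signer: bool    # True keeps signing keys away from agents


# Hypothetical profiles; the values are illustrative, not prescribed.
PROFILES = {
    "local": EnvironmentProfile("mock", "synthetic", "DEBUG", False, False),
    "staging": EnvironmentProfile("reduced-scope", "test", "INFO", True, False),
    "production": EnvironmentProfile("minimal", "live", "INFO", False, True),
}


def load_profile(name: str) -> EnvironmentProfile:
    """Look up a profile by name, failing loudly on anything unrecognized."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown environment: {name!r}") from None
```

Raising on an unknown name, rather than defaulting, avoids a misspelled environment variable silently selecting live funds.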
Observability checklist
Developers should be able to answer these questions quickly:
Which workflows are stuck?
Which agents consume the most cost?
Which approvals create the most delay?
Which tool adapters fail most often?
Which transaction intents did not reconcile cleanly?
If these answers are hard to get, the platform is under-instrumented.
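If run events are captured in a structured form, several of these questions reduce to short queries. The sketch below assumes a hypothetical RunEvent record with a kind of "step", "tool_error", or "terminal"; the event shape and the idle-time definition of "stuck" are illustrative assumptions, not a platform schema.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunEvent:
    workflow_id: str
    agent: str
    kind: str          # "step", "tool_error", or "terminal"
    timestamp: float   # seconds since epoch
    cost_usd: float = 0.0
    tool: Optional[str] = None


def stuck_workflows(events: list[RunEvent], now: float, max_idle_s: float) -> list[str]:
    """Workflows with no terminal event and no activity for max_idle_s seconds."""
    last_seen: dict[str, float] = {}
    finished: set[str] = set()
    for e in events:
        last_seen[e.workflow_id] = max(last_seen.get(e.workflow_id, 0.0), e.timestamp)
        if e.kind == "terminal":
            finished.add(e.workflow_id)
    return sorted(
        wf for wf, seen in last_seen.items()
        if wf not in finished and now - seen > max_idle_s
    )


def cost_by_agent(events: list[RunEvent]) -> dict[str, float]:
    """Total cost per agent, most expensive first."""
    totals: dict[str, float] = defaultdict(float)
    for e in events:
        totals[e.agent] += e.cost_usd
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def failures_by_tool(events: list[RunEvent]) -> list[tuple[str, int]]:
    """Tool adapters ranked by number of recorded errors."""
    errors = (e.tool for e in events if e.kind == "tool_error" and e.tool)
    return Counter(errors).most_common()
```

Approval delay and reconciliation questions would follow the same pattern once approval and settlement events are logged with the same workflow identifier.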
Anti-patterns
Avoid these patterns from the start:
giant general-purpose agents
hidden prompt state as business memory
direct wallet access from open-ended tools
untyped tool outputs promoted to state
retries without failure classification
approvals happening outside the runtime log
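To illustrate the alternative to retries without failure classification, the sketch below retries only failures marked as transient and re-raises everything else immediately. FailureClass, ClassifiedError, and call_with_retries are hypothetical names for this example, and the exponential backoff is one common choice rather than a platform default.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class FailureClass(Enum):
    TRANSIENT = "transient"  # e.g. a timeout; safe to retry
    PERMANENT = "permanent"  # e.g. invalid input; retrying cannot help
    POLICY = "policy"        # blocked by policy; escalate instead of retrying


class ClassifiedError(Exception):
    def __init__(self, failure_class: FailureClass, message: str) -> None:
        super().__init__(message)
        self.failure_class = failure_class


def call_with_retries(
    fn: Callable[[], T], retry_ceiling: int, base_delay_s: float = 0.5
) -> T:
    """Call fn, retrying only TRANSIENT failures, at most retry_ceiling times."""
    attempt = 0
    while True:
        try:
            return fn()
        except ClassifiedError as err:
            retryable = err.failure_class is FailureClass.TRANSIENT
            if not retryable or attempt >= retry_ceiling:
                raise  # surface the classified failure to the caller
            attempt += 1
            # Exponential backoff: base, 2x base, 4x base, ...
            time.sleep(base_delay_s * 2 ** (attempt - 1))
```

Permanent and policy failures reach the caller on the first attempt, which keeps the retry ceiling from being spent on calls that can never succeed.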
The fastest way to a robust AgentWorld app is to reduce ambiguity at every interface.