Agent Observability Beyond Logs

Marcus Chen

Logs tell you what happened. Traces tell you why.

When a traditional API fails, you check the logs, find the error, and fix it. When an agent fails, the failure mode is fundamentally different. The agent might have made a reasonable plan, called the right tools, received correct data — and still produced a wrong answer because it misinterpreted the results at step four of a seven-step chain.

Logs cannot capture this. You need traces.

What an AgentKit trace contains

Every agent run produces a trace. A trace is a tree of steps, where each step contains:

  • The prompt sent to the model

  • The model's response, including any reasoning

  • Any tool calls made, with inputs and outputs

  • Token count and cost for that step

  • Latency breakdown (model time vs tool time vs overhead)

  • The decision the agent made based on the response

This tree can be arbitrarily deep. An agent that plans four subtasks, each of which calls two tools, produces a trace with at least twelve nodes: four subtask steps plus eight tool calls, before counting the root planning step.
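
To make the shape of such a tree concrete, here is a minimal Python sketch. The class and field names are assumptions chosen to mirror the list above, not AgentKit's actual schema, and counting each tool call as its own node is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToolCall:
    name: str
    inputs: Dict[str, Any]
    output: Any


@dataclass
class Step:
    # Hypothetical field names that mirror the list above; not AgentKit's schema.
    prompt: str
    response: str
    decision: str = ""
    tool_calls: List[ToolCall] = field(default_factory=list)
    tokens: int = 0
    cost_usd: float = 0.0
    latency_ms: Dict[str, float] = field(default_factory=dict)  # model / tool / overhead
    children: List["Step"] = field(default_factory=list)


def count_nodes(step: Step) -> int:
    """Count a step, each of its tool calls, and every descendant as one node each."""
    return 1 + len(step.tool_calls) + sum(count_nodes(child) for child in step.children)


def make_subtask(i: int) -> Step:
    # Each subtask makes two illustrative tool calls.
    calls = [ToolCall(f"tool_{i}_{j}", {"query": "example"}, "ok") for j in range(2)]
    return Step(prompt=f"subtask {i}", response="done", tool_calls=calls)


root = Step(prompt="plan", response="four subtasks",
            children=[make_subtask(i) for i in range(4)])
subtask_nodes = sum(count_nodes(child) for child in root.children)
total_nodes = count_nodes(root)
```

In this sketch the four subtasks and their eight tool calls account for twelve nodes, and the root planning step brings the total to thirteen.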

Replay

The most-used feature of AgentKit Trace is replay. Click any run in the dashboard and step through it exactly as it happened. You see what the agent saw at each decision point.

This is how one of our users caught a subtle regression. After a model update, their agent started over-extracting invoice line items. The logs showed successful runs. The trace showed that step three was now returning 40% more items than before because the extraction prompt was matching partial strings.

Without replay, this would have shipped to production and inflated their anomaly count for weeks.
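
A regression like this can also be spotted programmatically by comparing the same step across two runs. The sketch below is not part of Trace. It assumes you have already pulled per-step output item counts out of a baseline trace and a newer one; the function name and the 20% threshold are illustrative.

```python
from typing import List, Tuple


def item_count_drift(
    baseline: List[int], candidate: List[int], threshold: float = 0.2
) -> List[Tuple[int, int, int, float]]:
    """Flag steps whose output item count changed by more than `threshold` (relative).

    Steps are numbered from 1. Steps with a zero baseline are skipped because
    relative change is undefined for them.
    """
    flagged = []
    for number, (old, new) in enumerate(zip(baseline, candidate), start=1):
        if old == 0:
            continue
        change = (new - old) / old
        if abs(change) > threshold:
            flagged.append((number, old, new, change))
    return flagged


# Hypothetical per-step item counts taken from two traces of the same input.
before_update = [1, 3, 10, 1]
after_update = [1, 3, 14, 1]
drift = item_count_drift(before_update, after_update)
```

With these sample counts, only step three is flagged, with a relative change of 40%. Every run still "succeeds", which is why the logs alone would not show the problem.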

Cost attribution

Every trace includes per-step cost. You can aggregate costs by agent, by user, by organization, or by tool. This is how teams discover that one poorly written tool is responsible for 60% of their spend because it makes redundant API calls that the agent then has to deduplicate.
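
The aggregation itself is a simple group-and-sum. Here is a minimal sketch, assuming each step has been flattened into a record with a cost and a few attribution keys. The record layout, key names, and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def aggregate_cost(steps: Iterable[Mapping], key: str) -> Dict[str, float]:
    """Sum per-step cost grouped by the given attribution key."""
    totals = defaultdict(float)
    for step in steps:
        totals[step[key]] += step["cost_usd"]
    return dict(totals)


def cost_share(totals: Mapping[str, float]) -> Dict[str, float]:
    """Convert absolute totals into fractions of overall spend."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: cost / grand_total for name, cost in totals.items()}


# Hypothetical flattened step records; not AgentKit's export format.
steps = [
    {"agent": "invoices", "user": "u1", "tool": "vendor_lookup", "cost_usd": 0.30},
    {"agent": "invoices", "user": "u2", "tool": "vendor_lookup", "cost_usd": 0.30},
    {"agent": "invoices", "user": "u1", "tool": "ocr", "cost_usd": 0.25},
    {"agent": "support", "user": "u2", "tool": "search", "cost_usd": 0.15},
]
by_tool = aggregate_cost(steps, "tool")
share_by_tool = cost_share(by_tool)
by_user = aggregate_cost(steps, "user")
```

In this sample data, `vendor_lookup` accounts for roughly 60% of total spend. Switching the key to `"user"` or `"agent"` gives the other breakdowns from the same records.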

Live tail

For real-time monitoring, Trace supports live tail. Stream traces to your dashboard as they happen. Watch an agent work step by step. This is useful during development and during incident response when you need to see what production agents are doing right now.

Export

Traces export as JSON for integration with your existing observability stack. You can also generate shareable links for individual runs — useful for debugging sessions with your team or for sharing a trace in a pull request review.
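
Most observability stacks prefer flat records to nested trees, so an exported trace usually needs flattening before ingestion. The sketch below shows one way to do that. The JSON layout, including the `run_id`, `root`, `name`, and `children` keys, is an assumption made for illustration; the real export format may differ.

```python
import json
from typing import Any, Dict, Iterator

# Hypothetical export; the real JSON layout may differ.
exported = """
{
  "run_id": "run_123",
  "root": {
    "name": "plan",
    "tokens": 500,
    "cost_usd": 0.01,
    "children": [
      {"name": "extract", "tokens": 1200, "cost_usd": 0.03, "children": []},
      {"name": "summarize", "tokens": 300, "cost_usd": 0.005, "children": []}
    ]
  }
}
"""


def flatten(step: Dict[str, Any], path: str = "") -> Iterator[Dict[str, Any]]:
    """Yield one flat record per step, depth first, with a slash-separated path."""
    current = f"{path}/{step['name']}" if path else step["name"]
    yield {
        "path": current,
        "tokens": step.get("tokens", 0),
        "cost_usd": step.get("cost_usd", 0.0),
    }
    for child in step.get("children", []):
        yield from flatten(child, current)


trace = json.loads(exported)
# Tag every record with the run it came from so rows from many runs can be mixed.
records = [{"run_id": trace["run_id"], **row} for row in flatten(trace["root"])]
```

Each record keeps its position in the tree through the `path` field, so the hierarchy can be rebuilt or filtered downstream.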

Usage-based pricing that scales with you.

Start free, pay for what runs. No seats, no platform fees, no surprise overages — just runs, tokens, and the tools you actually use.

The framework for building, deploying, and observing production AI agents. Made in Berlin, shipped globally.

© AgentKit Inc.
Berlin — 2026
