Route every LLM call to the cheapest model that can handle it. Cut agent costs by 40% with zero changes to your agent logic.

Automatically routes simple tasks to cheaper models (Haiku, GPT-4o-mini) and hard tasks to smart models.
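To give a flavor of the idea, here's a toy routing heuristic. This is an illustrative sketch only, not token_aud's actual routing logic; the model names and the `pick_model` helper are hypothetical.

```python
# Hypothetical model tiers, for illustration only.
CHEAP, SMART = "claude-haiku", "claude-sonnet"

def pick_model(prompt: str) -> str:
    """Toy heuristic: long prompts or 'hard' keywords go to the smart tier."""
    hard_markers = ("prove", "refactor", "multi-step", "analyze")
    if len(prompt) > 2000 or any(m in prompt.lower() for m in hard_markers):
        return SMART
    return CHEAP

pick_model("Summarize this sentence.")            # routes to the cheap tier
pick_model("Analyze the failure modes of X.")     # routes to the smart tier
```

A production router weighs much more than length and keywords, but the shape is the same: classify the task, then dispatch to the cheapest model that clears the bar.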
Detects runaway agent loops instantly. Escalates to smarter models or halts execution to save your wallet.
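One common way to detect a runaway loop is to watch for the same step repeating within a short window of recent calls. The sketch below shows that idea in plain Python; the `make_loop_guard` helper and its parameters are hypothetical, not part of the token_aud API.

```python
from collections import deque

def make_loop_guard(window=6, threshold=3):
    """Flag a runaway loop when the same (step, prompt) pair
    repeats `threshold` times within the last `window` calls."""
    recent = deque(maxlen=window)

    def check(step, prompt):
        key = (step, prompt)
        recent.append(key)
        return recent.count(key) >= threshold

    return check

guard = make_loop_guard()
for _ in range(3):
    looping = guard("reason", "retry the same tool call")
print(looping)  # True: the same call repeated 3 times in the window
```

Once a guard like this fires, the runtime can escalate the step to a smarter model or halt the run before the budget burns down.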
See exactly which step of your agent is costing the most money. Full visibility into every token.
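Per-step attribution boils down to a ledger keyed by step name. The sketch below shows the kind of breakdown the dashboard surfaces; the `record` helper and the per-1K-token prices are illustrative assumptions, not real rates.

```python
from collections import defaultdict

# Illustrative per-1K-token prices (assumed, not real).
PRICE_PER_1K = {"gpt-4o-mini": 0.00015, "haiku": 0.00025}

ledger = defaultdict(float)

def record(step, model, tokens):
    """Attribute the cost of one LLM call to its agent step."""
    ledger[step] += tokens / 1000 * PRICE_PER_1K[model]

record("plan", "haiku", 1200)
record("reason", "gpt-4o-mini", 800)
record("reason", "gpt-4o-mini", 400)

costliest = max(ledger, key=ledger.get)
```

Summing cost per step instead of per call is what makes the expensive part of an agent visible at a glance.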
Works with your existing Python agents. Just wrap your LLM calls and we handle the rest.
from token_aud.agent import AgentSpend

# Initialize the router
agent = AgentSpend.default()

# Route a task - we pick the model
result = agent.route_call(
    step="reason",
    messages=[{"role": "user", "content": "..."}],
)
print(f"Saved: ${result.cost_saved_usd}")
# Output: Saved: $0.024

Join the beta and start optimizing your agent fleet today.