Daily brief · Token cost · Agent workflow

AI token economics, written like a field brief.

A bright, readable guide to what AI really costs: model prices, coding-agent workflows, benchmark signals, and practical ways to spend fewer tokens.

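Bright, readable, and trackable: helping individuals and small teams make sense of model prices, agent workflows, benchmark signals, and ways to save tokens.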

Price Watch: Input, output, cache, batch, context, retry.
Agent Cost: Claude Code, Codex, Cursor, Aider, OpenCode.
Token Saving: Prompt caching, routing, compression, context discipline.
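
The cost dimensions listed above (input, output, cache, batch, retry) can be combined into one rough per-request estimate. The sketch below is illustrative only: the `Prices` class, the `request_cost` function, and every number passed to them are hypothetical, not real vendor rates, and it assumes each retry resends the full request at the same price.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    """Hypothetical per-million-token prices in USD (not real vendor rates)."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float
    batch_discount: float = 0.0  # fraction taken off in batch mode, e.g. 0.5


def request_cost(prices, input_tokens, output_tokens,
                 cached_tokens=0, batch=False, retries=0):
    """Estimate the USD cost of one request, including full-price retries."""
    uncached = input_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    one_attempt = (
        uncached * prices.input_per_m
        + cached_tokens * prices.cached_input_per_m
        + output_tokens * prices.output_per_m
    ) / 1_000_000
    if batch:
        one_attempt *= 1 - prices.batch_discount
    # Assumption: every retry repeats the whole attempt at the same cost.
    return one_attempt * (1 + retries)


if __name__ == "__main__":
    demo = Prices(input_per_m=2.0, output_per_m=10.0,
                  cached_input_per_m=0.2, batch_discount=0.5)
    print(request_cost(demo, 1_000_000, 100_000))
    print(request_cost(demo, 1_000_000, 100_000, cached_tokens=500_000))
```

Comparing the two printed values shows how much of the bill comes from input tokens that could have been served from cache; raising `retries` shows how quickly failed attempts outweigh any per-token discount.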

Starting hooks

From top to bottom, each card gives title, date, summary, and the opening signal so readers can decide fast.

01

Fable's judgement

Agent savings can come from fewer bad loops, not only cheaper tokens.

One of the most interesting tips I got from the Fireside Chat I hosted with Cat Wu and Thariq Shihipar from the Claude Code team at AIE on Wednesday was …

Claude Code · agent judgment · workflow
Read source →
05

How Cursor deploys AI inside the enterprise

Vibe coding becomes a team budget problem when workflows scale.

Cursor's Pauline Brunet explains how her team of Forward Deployed Engineers help organizations implement agents — essentially setting up software factories.

Cursor · software factory · enterprise AI
Read source →
06

What’s new in Claude Sonnet 5

New model releases affect defaults, agent costs, and failure rates.

Claude Sonnet 5 came out this morning. I always head straight for the "what's new" developer docs because they tend to have more actionable information than the official announcement post. …

Claude · Sonnet · model update
Read source →
08

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

Open-weight coding models can change the API cost equation.

This is an interesting new open weights (MIT licensed) model, the first model release from DeepReinforce. [...] with variants including 9B Dense, 31B Dense, 35B MoE, and 397B MoE. Built …

open weights · agentic coding · coding model
Read source →
12

Qwen3 benchmark results

A durable bridge between Chinese models and coding-agent evaluation.

Benchmark results for Qwen3 models using the Aider polyglot coding benchmark.

Qwen · Aider · coding benchmark
Read source →
13

How Claude Code uses prompt caching

Prompt caching directly changes speed and token cost.

Claude Code manages prompt caching automatically. See why a model switch triggers a slow uncached turn, what /compact costs, why CLAUDE.md edits don't apply mid-session, and how to check your cache hit rate.

Claude Code · prompt caching · cache hit rate
Read source →
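
As a rough illustration of what a cache hit rate measures, the sketch below computes one from per-turn token counts. The field names mirror the usage fields of the Anthropic Messages API (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`); whether Claude Code reports them in this shape, and whether it defines hit rate this way, are assumptions. The `cache_hit_rate` helper and the sample turns are hypothetical.

```python
def cache_hit_rate(turns):
    """Share of all input tokens that were read from the prompt cache.

    Each turn is a dict of token counts. Missing keys count as zero.
    This definition (reads / all input tokens) is an assumption.
    """
    read = sum(t.get("cache_read_input_tokens", 0) for t in turns)
    written = sum(t.get("cache_creation_input_tokens", 0) for t in turns)
    fresh = sum(t.get("input_tokens", 0) for t in turns)
    total = read + written + fresh
    return read / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical session: the first turn writes the cache, the second reads it.
    session = [
        {"input_tokens": 100, "cache_creation_input_tokens": 900},
        {"input_tokens": 100, "cache_read_input_tokens": 900},
    ]
    print(f"{cache_hit_rate(session):.0%}")
```

Under this definition, a turn that has to rebuild the cache adds to the denominator without adding any reads, so a session with frequent uncached turns shows a lower rate.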
14

Ditching Claude for OpenCode and OpenRouter

A real-world case of switching from default tools to OpenRouter and open-weight model workflows.

For the entirety of June I ditched Claude Code and have been using open weight models with Opencode and openrouter.ai. Here

OpenCode · OpenRouter · Claude
Read source →

Readable by people and agents

Static HTML first, with machine-readable endpoints for automation and search.