Fable's judgement
Agent savings can come from fewer bad loops, not only cheaper tokens.
One of the most interesting tips I got from the Fireside Chat I hosted with Cat Wu and Thariq Shihipar from the Claude Code team at AIE on Wednesday was …
A bright, readable guide to what AI really costs: model prices, coding-agent workflows, benchmark signals, and practical ways to spend fewer tokens.
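Bright, readable, and trackable: helping individuals and small teams understand model prices, agent workflows, benchmark signals, and ways to save tokens.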
From top to bottom, each card gives title, date, summary, and the opening signal so readers can decide fast.
A minimal coding agent maps where token spend happens.
A coding agent built on LLM
Prompt optimization can be evaluated with harnesses instead of vibes.
Leveraging the DSPy framework, this project evaluates and refines the core production system prompts used by Datasette Agent’s read-only SQL question answerer. The methodology involves a harness where DSPy agents …
Agent-readable websites are becoming part of the product surface.
The Vercel Chief of Software explains how its agent framework, eve, was created — and why skills, sandboxes and agent-readable websites now matter.
Vibe coding becomes a team budget problem when workflows scale.
Cursor's Pauline Brunet explains how her team of Forward Deployed Engineers help organizations implement agents — essentially setting up software factories.
New model releases affect defaults, agent costs, and failure rates.
Claude Sonnet 5 came out this morning. I always head straight for the "what's new" developer docs because they tend to have more actionable information than the official announcement post. …
Agent benchmarks are moving toward real enterprise migration tasks.
A blog post by IBM Research on Hugging Face
Open-weight coding models can change the API cost equation.
This is an interesting new open weights (MIT licensed) model, the first model release from DeepReinforce. [...] with variants including 9B Dense, 31B Dense, 35B MoE, and 397B MoE. Built …
Output token growth is a major hidden cost in agent workflows.
It's happening.
Chinese and open models are part of global agent cost/performance comparisons.
A capability threshold I've been carefully monitoring.
Your own tooling may matter more than public leaderboard rank.
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
A durable bridge between Chinese models and coding-agent evaluation.
Benchmark results for Qwen3 models using the Aider polyglot coding benchmark.
Prompt caching directly changes speed and token cost.
Claude Code manages prompt caching automatically. See why a model switch triggers a slow uncached turn, what /compact costs, why CLAUDE.md edits don't apply mid-session, and how to check your cache hit rate.
A real case of switching from default tools to OpenRouter and open-weight model workflows.
For the entirety of June I ditched Claude Code and have been using open weight models with Opencode and openrouter.ai. Here
Agent history and reusable context can reduce repeated token spend.
Your Claude Code and Codex history auto-deletes. Contextify keeps it forever in a searchable database, syncs it across every machine, and runs on macOS and Linux.
Static HTML first, with machine-readable endpoints for automation and search.