"Token usage" is becoming the new lines-of-code metric. Easy to measure. Weakly connected to value.
AI adoption should be judged by useful shipped work, validated outcomes, maintenance cost, and whether the workflow actually improved after the model entered it.
In engineering leadership roles, I have seen teams optimize for visible activity when the harder work is system improvement. AI creates a new version of the same trap: more prompts, more tokens, more internal training sessions, more generated artifacts — none of which automatically mean better delivery.
The better signal is whether a team ships safer changes, reduces rework, improves cycle time, lowers operational cost, or makes a previously manual workflow auditable and repeatable. Everything else is instrumentation noise.
AI maturity starts when the metric moves from usage to operational impact.