AI gateways are growing spend controls. Here is what they still miss.
Databricks just put budgets, resource tagging, and email alerts inside its Unity AI Gateway. It is exactly the right move. It also quietly proves the harder point: a gateway can only govern the traffic that flows through it, and almost no enterprise runs all of its AI through one.
What Databricks shipped
Databricks added AI Spend Controls to Unity AI Gateway: a way to set proactive budgets and alerts across the foundation models, agents, and external providers that route through the gateway. The design is genuinely good, and it is worth naming the pieces because they are the pieces every AI cost program eventually needs:
- Budgets at four levels. Per user (for example $2,000/month for individual experimentation), per use case (for example $1,000/month for coding agents), per workspace (for example $50,000 for production, $5,000 for a sandbox), and an organization-wide ceiling (for example $200,000/month).
- Resource tagging. Budgets attach to specific gateway endpoints and models by tag, so a limit can target one use case rather than the whole account.
- Email alerts. Configurable alerts fire when a threshold is crossed, and budget detail pages show spend trending in close to real time.
- Cost analytics. Spend is broken down by identity, workspace, model, and provider, with every request logged to Unity Catalog system tables and costed automatically across provisioned throughput, pay-per-token, and external provider usage.
If you live inside Databricks, turn this on today. Multi-level budgets plus per-identity attribution plus alerts is the correct shape for AI cost governance, and it is a real upgrade over the single annual number most teams still run.
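To make the four budget levels concrete, here is a minimal sketch of how a multi-level check could work. It is not Databricks' implementation or API: the `Request` type, the `LIMITS` table, and the names `alice`, `coding-agents`, `production`, and `acme` are all hypothetical, and the workspace figure is treated as monthly, which is an assumption. The idea is simply that one request counts against every scope it belongs to, and any scope can trip.

```python
from dataclasses import dataclass

# Hypothetical monthly limits in USD, borrowed from the examples in the list above.
# Treating the workspace figure as monthly is an assumption.
LIMITS = {
    ("user", "alice"): 2_000,
    ("use_case", "coding-agents"): 1_000,
    ("workspace", "production"): 50_000,
    ("org", "acme"): 200_000,
}


@dataclass(frozen=True)
class Request:
    user: str
    use_case: str
    workspace: str
    org: str
    cost_usd: float


def scopes(req):
    """Every budget scope a single request counts against."""
    return [
        ("user", req.user),
        ("use_case", req.use_case),
        ("workspace", req.workspace),
        ("org", req.org),
    ]


def breached_scopes(spend, req):
    """Return the scopes whose limit would be exceeded if req were charged."""
    breached = []
    for scope in scopes(req):
        limit = LIMITS.get(scope)
        if limit is not None and spend.get(scope, 0.0) + req.cost_usd > limit:
            breached.append(scope)
    return breached


def charge(spend, req):
    """Add the request's cost to the running total of every scope it touches."""
    for scope in scopes(req):
        spend[scope] = spend.get(scope, 0.0) + req.cost_usd


# Example: the coding-agents use case has already spent $990 this month.
spend = {("use_case", "coding-agents"): 990.0}
req = Request("alice", "coding-agents", "production", "acme", 25.0)
over = breached_scopes(spend, req)
```

Here the $25 request fits comfortably inside the user, workspace, and organization budgets but pushes the use case past $1,000, so `over` contains only the use-case scope. That is the point of layering: the tightest relevant envelope decides.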
Why the timing matters
The same week these controls landed, the press was full of reports that Uber had capped engineers at $1,500/month after blowing its annual AI budget on Claude Code and Cursor. Neither of those tools routes through a Databricks gateway. The control plane caught up; the blast radius had already moved outside it.
The structural limit of any gateway
A gateway governs what passes through it. That is its strength and its boundary. In 2026 the typical engineering org spends on AI through several channels at once, and only some of them sit behind a managed gateway:
- Agentic coding tools that bill direct. Claude Code, Cursor, and similar tools meter against their own accounts or straight to the model provider. They are where the runaway spend stories keep happening, and they rarely sit behind your gateway.
- Seat-based assistants. GitHub Copilot bills through GitHub; ChatGPT Enterprise through OpenAI. Real money, different console, no gateway in the path.
- Cloud model consoles. Amazon Bedrock, Google Vertex AI, and Azure OpenAI each have their own usage and billing surface. Spend in one is invisible to a gateway in front of another.
- Shadow usage. A personal API key on a corporate card never touches the gateway at all, which is exactly why it is dangerous.
Stack those up and the gateway dashboard, however good, answers only "what did my gateway traffic cost?" The question finance actually asks is "what did AI cost this company, by team, and is it buying anything?" Those are not the same question, and the gap between them is where budget surprises live.
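A quick way to size that gap is to ask what share of total AI spend the gateway actually sees. The sketch below does the arithmetic. Every figure and channel name in it is invented for illustration, not taken from any real company.

```python
# Hypothetical monthly spend by channel in USD. All figures are invented.
monthly_spend = {
    "gateway": 40_000,
    "coding_tools_direct": 35_000,
    "seat_assistants": 15_000,
    "cloud_consoles": 8_000,
    "shadow_keys": 2_000,
}


def gateway_coverage(spend_by_channel, gateway_channel="gateway"):
    """Fraction of total AI spend that is visible to the gateway."""
    total = sum(spend_by_channel.values())
    if total == 0:
        return 0.0
    return spend_by_channel.get(gateway_channel, 0) / total


coverage = gateway_coverage(monthly_spend)
print(f"Gateway sees {coverage:.0%} of AI spend")
```

With these made-up numbers the gateway dashboard accounts for $40,000 of a $100,000 month, so a budget set on gateway traffic alone would govern well under half of the bill.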
What sits above the gateway
The fix is not to abandon gateway-native controls. Keep them. The fix is a spend layer that sits one level up and consolidates every source into one set of per-team budgets and alerts:
- One roll-up across providers and tools. Gateway traffic, direct-billed coding tools, seat-based assistants, and cloud consoles, normalized into the same per-team and per-model view.
- Budgets that follow the team, not the tool. A team's envelope should cover its Claude Code, its Copilot seats, and its Bedrock calls together, so a limit reflects the whole spend and not one slice of it.
- Alerts on the aggregate. An 80% warning is only useful if 80% means 80% of everything the team spends on AI, not 80% of the part that happened to route through one gateway.
- Discovery of what is not behind the gateway. Surfacing the rogue API key and the unmanaged tool is half the value, because you cannot budget for spend you cannot see.
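The four points above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of how any product implements it: the record shape, source names, team names, and budgets are all hypothetical. It rolls spend from every source into one per-team total, alerts on the aggregate, and sets aside anything with no owning team for follow-up.

```python
from collections import defaultdict

# Hypothetical normalized records: (source, owning team or None, cost in USD).
records = [
    ("gateway", "platform", 3_000.0),
    ("claude_code", "platform", 4_500.0),
    ("copilot_seats", "platform", 900.0),
    ("bedrock", "data", 1_200.0),
    ("openai_api_key", None, 600.0),  # no owning team: candidate shadow usage
]

# Hypothetical monthly envelopes per team, in USD.
team_budgets = {"platform": 10_000.0, "data": 5_000.0}


def roll_up(records):
    """Sum spend per team across all sources; keep unowned spend separate."""
    by_team = defaultdict(float)
    unattributed = defaultdict(float)
    for source, team, cost in records:
        if team is None:
            unattributed[source] += cost
        else:
            by_team[team] += cost
    return dict(by_team), dict(unattributed)


def teams_over_threshold(by_team, budgets, threshold=0.8):
    """Teams whose aggregate spend has reached the given share of their budget."""
    return sorted(
        team
        for team, spent in by_team.items()
        if team in budgets and spent >= threshold * budgets[team]
    )


by_team, unattributed = roll_up(records)
alerts = teams_over_threshold(by_team, team_budgets)
```

In this example the platform team's gateway traffic alone is $3,000, or 30% of its envelope, so a gateway-only alert would stay silent. Across all sources the team is at $8,400, past the 80% line, and `alerts` flags it. The $600 on an unowned API key lands in `unattributed`, which is the discovery list.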
The honest read
Databricks shipping spend controls is a signal, not a solved problem. It tells you the industry now agrees that AI cost governance needs budgets, tagging, attribution, and alerts. It also tells you those controls live inside each platform's walls, and your spend does not respect those walls.
That is the layer Inventoria's AI Spend module provides: per-team and per-model token-and-dollar visibility across OpenAI, Anthropic, Azure OpenAI, Vertex/Gemini, and GitHub Copilot, on a few-hour polling cadence, with monthly budgets and automatic alerts. It complements your gateway instead of competing with it. The gateway governs its traffic; Inventoria tells you what AI is costing the whole company, by the team that spent it.
One budget for all your AI, not one per console.
Connect every provider and tool in five minutes. See spend by team and model across all of them, set budgets, get alerted before the bill does the talking.