
Product · Cost & FinOps

See, attribute and govern your model spend

Two modules in one place: a declared catalogue of the models in your estate (capabilities, list pricing, routing policy) and a FinOps view that breaks token and cost spend down by model, provider, agent, session, team and project. Set budgets and thresholds, read a run-rate projection, and let budget signals gate model resolution. Olivares is for seeing and governing cost, not for running your inference.

In the product

The cost dashboard

A genuine screenshot, populated with example data. The executive view shows spend to date, a run-rate projection, token volume, the active governed models and a spend-trend chart, broken down so you can see where the money goes.

Real screenshot: the Olivares cost dashboard, showing headline spend and token totals, a run-rate projection, a count of active governed models, and a spend-trend chart over time, populated with example data.

What you get

Two modules: the model estate, and the money

A catalogue of the models you govern and the policy that resolves them, paired with a cost view that attributes every micro-dollar.

A declared model catalogue

The models in your estate with their capabilities and list pricing, governed centrally. Pricing is a declared, dated reference you edit — verify it against the provider; we never present it as immutable truth.

Routing policy with a fallback chain

Define how a request resolves to a model (by cost, by latency, by capability, or pinned to a specific model) with a /resolve fallback chain. This is the policy that decides; running the inference is a separate, explicitly provisioned step.
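To make the idea of a fallback chain concrete, here is a minimal Python sketch of one way such a policy could order candidates. It is an illustration under stated assumptions, not the Olivares implementation: the `Model` fields, the `resolve` function, the tie-breaking rules and the sample catalogue are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    price_micro_usd_per_1k: int   # declared list price (hypothetical field)
    latency_ms: int               # typical latency (hypothetical field)
    capabilities: frozenset = frozenset()


def resolve(models, strategy, required=frozenset(), pinned=None,
            unavailable=frozenset()):
    """Return the ordered fallback chain of model names for one request.

    The first entry is the preferred model; later entries are fallbacks.
    An empty list means nothing qualifies, and the caller should refuse.
    """
    # Only models that offer every required capability are eligible.
    eligible = [m for m in models if frozenset(required) <= m.capabilities]

    if strategy == "cost":
        key = lambda m: (m.price_micro_usd_per_1k, m.name)
    elif strategy == "latency":
        key = lambda m: (m.latency_ms, m.name)
    elif strategy == "capability":
        # Richest capability set first, cheapest as the tie-breaker.
        key = lambda m: (-len(m.capabilities), m.price_micro_usd_per_1k, m.name)
    elif strategy == "pinned":
        # The pinned model leads; the rest fall back in cost order.
        key = lambda m: (m.name != pinned, m.price_micro_usd_per_1k, m.name)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Skip models that are currently unavailable; order is preserved.
    return [m.name for m in sorted(eligible, key=key)
            if m.name not in unavailable]


# Hypothetical catalogue with made-up prices and latencies.
CATALOGUE = [
    Model("small", 200, 300, frozenset({"chat"})),
    Model("medium", 1_000, 250, frozenset({"chat", "tools"})),
    Model("large", 5_000, 900, frozenset({"chat", "tools", "vision"})),
]
```

With this catalogue, `resolve(CATALOGUE, "cost", required={"tools"})` puts `medium` first and keeps `large` as the fallback. The sketch only decides; it never calls a model.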

Spend, attributed

Token and cost spend broken down by model, provider, agent, session, team and project. Money is stored internally as integer micro-USD (millionths of a dollar), so totals add up exactly. Model and provider breakdowns are always present; finer attribution depends on the connector wired in.
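The reason integer micro-USD adds up exactly can be shown in a few lines of Python. This is a sketch of the general technique only; the helper names `to_micro_usd` and `format_usd` and the sample amounts are hypothetical, not part of Olivares.

```python
from decimal import Decimal

MICRO_PER_USD = 1_000_000  # 1 USD = 1,000,000 micro-USD


def to_micro_usd(usd: str) -> int:
    """Convert a decimal USD string to integer micro-USD (hypothetical helper)."""
    micro = Decimal(usd) * MICRO_PER_USD
    if micro != micro.to_integral_value():
        raise ValueError(f"{usd} is finer than one micro-USD")
    return int(micro)


def format_usd(micro: int) -> str:
    """Render integer micro-USD as a USD string with six decimal places."""
    sign = "-" if micro < 0 else ""
    whole, frac = divmod(abs(micro), MICRO_PER_USD)
    return f"{sign}{whole}.{frac:06d}"


# Binary floats drift: $0.10 + $0.20 is not exactly $0.30.
float_total = 0.1 + 0.2

# Integers do not: the same charges in micro-USD sum exactly.
micro_total = to_micro_usd("0.10") + to_micro_usd("0.20")
```

Here `float_total == 0.3` is false, while `micro_total == to_micro_usd("0.30")` is true, which is why per-model and per-team subtotals can be summed in any order and still match the headline figure.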

Budgets that can gate resolution

Budgets with thresholds, alerts and recommendations. A breached budget can gate model resolution — block or throttle — so cost limits are enforced at the decision point, not discovered on the invoice.
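As a rough illustration of a budget acting at the decision point, the sketch below maps spend against a budget to a gating outcome. The `Budget` fields, the `gate` function, the 80% default alert threshold and the outcome strings are assumptions made for the example, not the product's configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    limit_micro_usd: int
    alert_percent: int = 80      # alert threshold as a percentage of the limit
    on_breach: str = "block"     # "block" or "throttle"


def gate(budget: Budget, spent_micro_usd: int) -> str:
    """Return "allow", "alert", "throttle" or "block" for the next resolution."""
    if budget.on_breach not in ("block", "throttle"):
        raise ValueError(f"unknown breach action: {budget.on_breach!r}")

    # A breached budget gates resolution with the configured action.
    if spent_micro_usd >= budget.limit_micro_usd:
        return budget.on_breach

    # Integer comparison avoids float rounding right at the threshold.
    if spent_micro_usd * 100 >= budget.limit_micro_usd * budget.alert_percent:
        return "alert"

    return "allow"
```

For a hypothetical $100 budget (`Budget(100_000_000)`), $50 of spend is allowed, $80 raises an alert, and $100 blocks further resolution; setting `on_breach="throttle"` swaps the block for a throttle.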

What’s real

Live for seeing and governing cost — not for running your inference

We are precise about what each number is, because finance decisions depend on it:

  • Live: read, analysis and budget signalling. Spend by model, provider, agent, session, team and project; budgets with thresholds, alerts and recommendations; and budget enforcement that can gate model resolution by block or throttle. Model and provider breakdowns are always populated.
  • Honest gaps in the data: list pricing is a declared, dated reference you maintain — verify it against the provider before you act on it. The forecast is a linear projection at the current run-rate, not a predictive model. Per-agent, per-session and per-team attribution may read empty until a session-attributing connector is wired — and a truncated aggregate is shown as partial, never as an exact total. We do not derive a cache-savings figure from the cost stream, so we do not show one.
  • Roadmap / seam: routing policy is defined here, but routing execution — the gateway that actually calls a model — is a separate component. Model /execute is deny-closed and returns 503 without explicit provisioning. Olivares helps you see and govern cost; it does not run your inference for you.

Cost & FinOps — questions

Where do the prices come from — are they live from the providers?

No. The pricing in the catalogue is a declared list price: a dated reference you edit and maintain, not a live feed. It is there so cost estimates are consistent, not so you can treat it as the provider’s current truth. Verify it against the provider before you base a decision on it.

Is the forecast a prediction of what we’ll spend?

It is a linear projection at the current run-rate — it extends your present spend rate forward, nothing more. It is not a predictive model and does not account for seasonality, planned changes or anything you have not done yet. Read it as “if nothing changes, this is the trajectory”.
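The arithmetic behind a linear run-rate projection is small enough to write out. The following Python sketch assumes a daily run-rate over a fixed budget period; the function name, its parameters and the choice of days as the unit are hypothetical.

```python
def project_period_spend(spend_to_date_micro_usd: int,
                         days_elapsed: int,
                         days_in_period: int) -> int:
    """Extend the current run-rate to the end of the period, in micro-USD."""
    if days_elapsed <= 0:
        raise ValueError("need at least one elapsed day to compute a run-rate")
    if days_in_period < days_elapsed:
        raise ValueError("period cannot be shorter than the time elapsed")

    # run-rate = spend / elapsed; projection = run-rate * period length.
    # Multiplying before dividing keeps the arithmetic in integers;
    # floor division rounds down to a whole micro-USD.
    return spend_to_date_micro_usd * days_in_period // days_elapsed
```

For example, $120 spent in the first 10 days of a 30-day period (`project_period_spend(120_000_000, 10, 30)`) projects to 360,000,000 micro-USD, or $360. Nothing in the calculation knows about seasonality or planned changes.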

Why is some per-agent or per-team cost showing as empty?

Because that attribution needs a connector that tags spend with the session, agent or team it belongs to. Until that is wired, the breakdown is honestly empty rather than guessed — and where an aggregate is incomplete it is labelled partial, never presented as an exact total. Model and provider breakdowns do not depend on this and are always present.
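One way to picture "empty rather than guessed" and "partial, never exact" is an aggregation that reports both conditions explicitly. This is a sketch under assumptions: the record shape, the `spend_by` function, the `max_records` cap and the result keys are hypothetical and only illustrate the behaviour described above.

```python
from collections import defaultdict


def spend_by(records, dimension, max_records=None):
    """Sum micro-USD by one dimension, flagging truncated input as partial."""
    rows = list(records)
    truncated = max_records is not None and len(rows) > max_records
    if truncated:
        rows = rows[:max_records]

    totals = defaultdict(int)
    unattributed = 0
    for row in rows:
        key = row.get(dimension)
        if key is None:
            # No connector tagged this spend: count it, but do not guess an owner.
            unattributed += row["cost_micro_usd"]
        else:
            totals[key] += row["cost_micro_usd"]

    return {
        "totals": dict(totals),
        "unattributed_micro_usd": unattributed,
        "partial": truncated,   # True means the totals are not exact
    }


# Hypothetical cost records; only one carries a team tag.
RECORDS = [
    {"model": "small", "team": None, "cost_micro_usd": 1_500},
    {"model": "large", "team": "search", "cost_micro_usd": 4_000},
    {"model": "small", "cost_micro_usd": 500},
]
```

Grouping `RECORDS` by `"model"` attributes every record, while grouping by `"team"` attributes only the tagged one and reports the rest as unattributed. Passing `max_records=2` marks the result as partial.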

Does Olivares route and run my model calls?

No. You define routing policy here (by cost, by latency, by capability, or pinned to a specific model, with a /resolve fallback chain), but executing the call is a separate gateway component. Model /execute is deny-closed and returns 503 unless it is explicitly provisioned. This surface is about seeing and governing cost, not about Olivares sitting in your inference path.
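"Deny-closed" means the default answer is a refusal and only an explicit opt-in changes it. A minimal Python sketch of that pattern follows; the `handle_execute` function, its `provisioned` flag and the response bodies are hypothetical, and the success branch in particular is an assumption about what a provisioned deployment might return.

```python
from http import HTTPStatus


def handle_execute(provisioned: bool = False):
    """Return (status code, body) for an execute request, refusing by default."""
    # Anything other than an explicit True is treated as "not provisioned".
    if provisioned is not True:
        return (HTTPStatus.SERVICE_UNAVAILABLE.value,
                {"error": "execution is not provisioned"})

    # Assumed behaviour once provisioned: hand off to the separate gateway.
    return HTTPStatus.OK.value, {"status": "forwarded to gateway"}
```

Calling `handle_execute()` with no arguments returns status 503, so a missing or malformed setting fails safe instead of silently running inference.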

Take control of your model spend

Deploy Olivares on your own infrastructure, declare your model estate, attribute every micro-dollar, and let budgets gate resolution before the cost is incurred.