Amazon Bedrock gets detailed inference cost attribution across users and applications

Q: What is the source?

Originally published on AWS Machine Learning Blog. Hamidun News processes and adapts the material with AI.

Q: When was it published?

May 2, 2026. Reading time: 3 min.

Hamidun News Editorial

On April 17, 2026, AWS launched granular cost attribution for Amazon Bedrock, which tracks inference spending in detail. Costs can now be tied automatically to a specific user, application, IAM role, or tenant without changing how models are already called.

How it works

Bedrock now sends AWS Billing data about which IAM principal made each model request. The principal can be a regular IAM user, an application role, a temporary federated session via Okta or Entra ID, or a Bedrock API key if it is associated with an IAM identity.

The CUR 2.0 report gets a new field, line_item_iam_principal, and line_item_usage_type shows which model was used, in which region, and whether the money went to input or output tokens. On top of this, AWS suggests adding cost allocation tags. They can be attached directly to IAM users and roles or passed as session tags during federated authorization and AssumeRole.
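As a rough illustration of how these fields could be used, the sketch below sums cost per principal and usage type from a CUR 2.0 export in CSV form. The columns line_item_iam_principal and line_item_usage_type come from the text above; line_item_unblended_cost is assumed to be the cost column in the export, and the sample rows, ARNs, and usage type strings are invented for the example.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_principal(cur_csv_text):
    """Sum cost per (IAM principal, usage type) from CUR rows given as CSV text."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(cur_csv_text)):
        # Rows without a principal are grouped under a placeholder label.
        principal = row.get("line_item_iam_principal") or "(no principal)"
        key = (principal, row["line_item_usage_type"])
        # Assumption: the export contains line_item_unblended_cost.
        totals[key] += Decimal(row["line_item_unblended_cost"])
    return dict(totals)


# Invented sample rows; the usage type strings are placeholders, not real values.
SAMPLE = """line_item_iam_principal,line_item_usage_type,line_item_unblended_cost
arn:aws:iam::111122223333:user/alice,example-input-tokens,0.12
arn:aws:iam::111122223333:user/alice,example-output-tokens,0.48
arn:aws:iam::111122223333:role/batch-job,example-input-tokens,1.50
arn:aws:iam::111122223333:user/alice,example-input-tokens,0.08
"""

if __name__ == "__main__":
    for (principal, usage_type), cost in sorted(cost_by_principal(SAMPLE).items()):
        print(f"{principal}  {usage_type}  {cost}")
```

Using Decimal rather than float avoids rounding drift when many small line items are added together.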

Once activated in billing, these tags appear in both CUR 2.0 and Cost Explorer, where expenses can then be aggregated by team, project, cost center, or tenant. The feature is available in commercial regions at no extra charge, but it requires enabling IAM principal export to CUR 2.0 and waiting 24–48 hours for tags to appear.
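Attaching tags to a role might look like the sketch below. It assumes a boto3 IAM client is passed in (for example `boto3.client("iam")`) and uses its `tag_role` call; the role name and the tag keys `team` and `cost-center` are hypothetical. The tag keys would still need to be activated as cost allocation tags in billing, as described above.

```python
def to_iam_tags(labels):
    """Convert {"team": "search"} into the Key/Value list shape used by IAM tags."""
    return [{"Key": key, "Value": value} for key, value in sorted(labels.items())]


def tag_role_for_cost_allocation(iam_client, role_name, labels):
    """Attach cost labels to an IAM role through the client's tag_role call."""
    tags = to_iam_tags(labels)
    iam_client.tag_role(RoleName=role_name, Tags=tags)
    return tags


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   tag_role_for_cost_allocation(
#       boto3.client("iam"),
#       "search-backend",                       # hypothetical role name
#       {"team": "search", "cost-center": "cc-1042"},
#   )
```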

"Understanding exactly who is spending money on inference is the first step toward chargeback, forecasting, and optimization."

Four accounting scenarios

AWS describes four typical setups where the new attribution is especially useful. The logic is simple: whoever calls Bedrock becomes the accounting unit. The labeling method, however, depends on whether the caller is a person, a service, a corporate SSO session, or a shared LLM gateway. That choice determines where the caller's identifier and tags are stored, and therefore how costs can later be aggregated in reports and alerts.

IAM users and API keys — suitable for small teams and prototypes: you can see each developer's spending separately.
IAM roles for applications — convenient for production services: expenses are divided by backends, batch jobs, and projects.
Federated users via IdP — corporate users are visible by session name and tags from SAML or OIDC.
LLM gateway or proxy — for SaaS and internal AI platforms where you need a breakdown by users and tenants, not a single line for the entire gateway.

The least trivial option is the gateway. If the proxy accesses Bedrock under a single role, billing sees only that role and loses granularity. AWS suggests solving this by calling AssumeRole for each user or tenant and passing a role session name and session tags. The resulting credentials can be cached for up to an hour, so this approach does not require calling STS on every request. The default STS limit for AssumeRole is 500 calls per second per account, which matters in high-throughput systems.
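A minimal sketch of the gateway pattern, under stated assumptions: the class name `TenantCredentialCache`, the `tenant` tag key, and the five-minute refresh margin are illustrative choices, not part of the AWS guidance. The `sts_client` is expected to behave like `boto3.client("sts")`, whose `assume_role` call accepts `RoleArn`, `RoleSessionName`, `Tags`, and `DurationSeconds`.

```python
import time


class TenantCredentialCache:
    """Cache per-tenant AssumeRole credentials so STS is not called on every request."""

    def __init__(self, sts_client, role_arn, ttl_seconds=3600,
                 refresh_margin=300, clock=time.monotonic):
        self._sts = sts_client
        self._role_arn = role_arn
        self._ttl = ttl_seconds
        self._margin = refresh_margin  # refresh slightly before expiry
        self._clock = clock
        self._cache = {}  # (tenant_id, user_id) -> (refresh_at, credentials)

    def get(self, tenant_id, user_id):
        key = (tenant_id, user_id)
        now = self._clock()
        cached = self._cache.get(key)
        if cached and cached[0] > now:
            return cached[1]
        # Assumption: the ids contain only characters valid in a session name,
        # and the role's trust policy permits session tags.
        response = self._sts.assume_role(
            RoleArn=self._role_arn,
            RoleSessionName=f"{tenant_id}.{user_id}"[:64],
            Tags=[{"Key": "tenant", "Value": tenant_id}],
            DurationSeconds=self._ttl,
        )
        credentials = response["Credentials"]
        self._cache[key] = (now + self._ttl - self._margin, credentials)
        return credentials
```

The returned dictionary holds `AccessKeyId`, `SecretAccessKey`, and `SessionToken`, which the gateway would use to build the Bedrock client for that tenant's requests. With hourly caching, 10,000 active tenant sessions average out to roughly 3 AssumeRole calls per second, although a cold start can produce a much sharper burst.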

Practical effect for teams

For FinOps and AI platform teams, this closes a long-standing gap. Previously, Bedrock expenses were often visible only at the level of an account or a single service user, so teams had to build their own logging layer and distribute costs manually. Now AWS provides a native chargeback and showback mechanism through the same IAM identities and tags that companies already use for access and governance. An additional benefit is the ability to quickly see who is calling expensive models such as Opus, who is on lighter configurations, and where the budget is consumed by output tokens.

For engineering teams, the value goes beyond finances. If each microservice has its own role and employees have their own federated session, then the same mechanism simultaneously enhances security and transparency. In multi-tenant SaaS this is especially useful: you can compare the cost of serving customers, build internal pricing, and set alerts in Cost Explorer by tags.

Essentially, AWS turns the model call identity into a full-fledged financial label that can be used to build reports without a separate data pipeline.

What this means

Amazon Bedrock becomes noticeably more convenient for companies that are scaling GenAI across real teams, services, and customers rather than running a few dozen demo requests. The more AI traffic a business has, the more important it is to see not just the total bill but the specific source of each expense.
