From Resource Tagging to Token Tracing: The FinOps Code for AI Cost Control
Shifting from Infrastructure Tagging to Application-Level Telemetry
FinOps Evolution - The 7 Phases
The public cloud has existed for more than two decades, since AWS launched its first services in 2002.  
FinOps = the discipline and practice of managing cloud spend
Phase 1: Observational FinOps
The Infancy
Focus: Accessing and collecting cost and usage data.
Goal: Gaining a clear picture of consumption and cost before receiving the cloud bill.
Challenges: Early cloud providers did not expose cost data effectively, and reporting formats varied by vendor.
Improvements: FOCUS (FinOps Open Cost and Usage Specification) provides uniform cost and usage datasets.
Status: Observational FinOps is a necessary, foundational component.

Phase 2: Analytical FinOps
The Childhood
Focus: Analyzing collected data to understand the underlying drivers of cost.
Challenge: It is often hard to see where the money goes: managed services bundle compute, network, and storage costs, and idle resources still incur charges.
Outcome:
    - Extracting meaning from data is vital for actual optimization.
    - Leads to identifying potential waste, detecting anomalies, and defining automated guardrails.
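
As one illustration of anomaly detection on collected cost data, the sketch below flags days whose spend sits far from the median. The function name `find_cost_anomalies`, the 3.5 threshold, and the sample figures are assumptions made for this example; they do not come from any specific FinOps tool.

```python
from statistics import median


def find_cost_anomalies(daily_costs, threshold=3.5):
    """Return indices of days whose cost deviates strongly from the median.

    Uses the modified z-score based on the median absolute deviation (MAD),
    which is less distorted by the outliers it is trying to find than a
    mean / standard-deviation test.
    """
    center = median(daily_costs)
    mad = median(abs(cost - center) for cost in daily_costs)
    if mad == 0:
        return []  # no spread around the median, nothing to flag
    return [
        index
        for index, cost in enumerate(daily_costs)
        if 0.6745 * abs(cost - center) / mad > threshold
    ]


# Hypothetical daily spend in USD; day 5 contains a spike.
daily_costs = [102.0, 98.5, 101.2, 99.8, 100.4, 310.0, 100.9]
print(find_cost_anomalies(daily_costs))  # [5]
```

A median-based test is used here because a single large spike would inflate a mean and standard deviation enough to hide itself.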

Phase 3: Attributional FinOps
The Adolescence
Focus: Attributing the undifferentiated cost of resources to specific services to manage infrastructure costs.
Process: Starts with foundational practices like resource tagging.
Complexity: Attribution gets harder with shared resources (load balancers, Kubernetes clusters).
Impact: Closes the FinOps feedback loop by providing financial data back to engineers, allowing them to evaluate how their components impact the overall system cost.
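
A common way to attribute a shared resource is to split its cost in proportion to some usage measure. The sketch below assumes CPU core-hours as the allocation key; the function name, the service names, and the numbers are hypothetical.

```python
def allocate_shared_cost(shared_cost, usage_by_service):
    """Split a shared cost across services in proportion to their usage."""
    total_usage = sum(usage_by_service.values())
    if total_usage == 0:
        raise ValueError("cannot allocate: no usage recorded")
    return {
        service: shared_cost * usage / total_usage
        for service, usage in usage_by_service.items()
    }


# Hypothetical monthly cluster bill (USD) and CPU core-hours per service.
cpu_core_hours = {"checkout": 600, "search": 300, "reporting": 100}
allocation = allocate_shared_cost(1200.0, cpu_core_hours)
print(allocation)  # {'checkout': 720.0, 'search': 360.0, 'reporting': 120.0}
```

Other keys (memory, request count, network bytes) can be swapped in; the choice of key is a policy decision rather than a technical one.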

Phase 4: Applied FinOps
The Early Adulthood
Focus: Applying changes based on analysis to achieve financial goals.
Core Practices:
    - Smart use of Committed Use Discounts (CUDs)
    - Spot/Preemptible instance utilization
    - Right-sizing
    - Data Tiering
    - Waste identification and elimination
    - Outsourcing or Insourcing (based on cost-effectiveness)
Challenge: Application is often reactive, done as an afterthought, rather than being integrated into system design.
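
To make the commitment trade-off concrete, here is a deliberately simplified model: with a flat discount `d`, a commitment is paid for every hour whether used or not, so it only beats on-demand pricing when utilization stays above `1 - d`. Real commitment pricing varies by provider and term; the rate, discount, and function names below are assumptions for illustration.

```python
def break_even_utilization(discount):
    """Utilization above which a flat-discount commitment is cheaper."""
    return 1.0 - discount


def commitment_savings(on_demand_rate, discount, hours, utilization):
    """Savings from committing versus paying on demand (negative = loss)."""
    on_demand_cost = on_demand_rate * hours * utilization
    committed_cost = on_demand_rate * (1.0 - discount) * hours
    return on_demand_cost - committed_cost


# Hypothetical: 1.00 USD/hour on demand, 30% discount, 730 hours per month.
for utilization in (0.90, 0.50):
    saved = commitment_savings(1.00, 0.30, 730, utilization)
    print(f"utilization {utilization:.0%}: savings {saved:.2f} USD")
```

At 90% utilization the commitment saves money; at 50% it costs more than on-demand, which is why commitments are usually sized against the steady baseline rather than the peak.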

Phase 5: Architectural FinOps
The Adulthood
Focus: Returning to the design board to build systems with cost, alongside reliability and performance, as a key consideration.
Process: Relies on feedback from all preceding FinOps practices to identify bottlenecks and costly system parts.
Examples:
    - Rewriting resource-intensive code in natively compiled languages such as C or Rust.
    - Smart use of queueing and caching.
    - Re-evaluating autoscaling strategies.
Note: Autoscaling and microservices can become a source of waste if not correctly designed.

Phase 6: Automated FinOps
The Maturity
The Necessity: The FinOps Feedback Loop is time-consuming and requires unwavering discipline, especially with growing system complexity. The solution is automation.
Definition: Codifying the analysis and application of FinOps knowledge to occur continuously throughout the software delivery lifecycle.
Essence: Continuous evaluation and automated balancing of the conflicting concerns of performance, reliability, and cost.
Conclusion: In 2026, automation is the only sustainable way to manage costs; manual calculation leads either to burnout or to rigid guardrails that hurt innovation.
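
One small example of codifying a FinOps check so it can run continuously (for instance as a pipeline step) is a budget projection. The sketch assumes a simple linear run-rate forecast; the function names and figures are hypothetical.

```python
import calendar
from datetime import date


def projected_month_end_spend(spend_to_date, today):
    """Extrapolate month-end spend linearly from the daily run rate so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def check_budget(spend_to_date, monthly_budget, today):
    """Return (within_budget, projected_spend) for the current month."""
    projected = projected_month_end_spend(spend_to_date, today)
    return projected <= monthly_budget, projected


# Hypothetical: 4,000 USD spent by 10 March against a 10,000 USD budget.
within_budget, projected = check_budget(4000.0, 10000.0, date(2026, 3, 10))
print(within_budget, projected)  # False 12400.0
```

A failing check could block a deployment or simply open a ticket; which response is appropriate depends on how much friction the team is willing to accept.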

The Future: Integrated FinOps

Near-term: Automation will continue to evolve, with AI/ML augmenting existing FinOps observability and analysis capabilities.
Major Shift: Integrating all FinOps practices, from observation to automation, into the platforms used to run software.
Benefit: This integration makes FinOps accessible and proactive, enabling continuous infrastructure optimization aligned with business goals.
Next: Practical Implementation with AI Cost Control.
The New FinOps Problem: Runaway Tokens

Old Cloud FinOps Challenge: The un-tagged resource (cost built up over weeks).
New AI FinOps Challenge: Runaway tokens (budget-busting cost spikes in hours).
Problem: Unoptimized prompts hitting expensive LLMs can cause cost explosions within hours.
Solution Shift: Move from infrastructure tagging to application-level telemetry to track every token in real time.
Key Focus: Autonomous AI agents doing work for the whole team.
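
Because token spend can spike within hours, one possible guard is a sliding-window budget that is checked on every call. The class below is a minimal sketch; `TokenBudgetGuard`, the 10,000-token limit, and the one-hour window are assumptions, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque


class TokenBudgetGuard:
    """Tracks token usage over a sliding time window."""

    def __init__(self, max_tokens, window_seconds):
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, tokens) pairs, oldest first
        self._total = 0

    def record(self, tokens, now):
        """Record usage at time `now`; return True while within budget."""
        # Drop events that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, expired = self._events.popleft()
            self._total -= expired
        self._events.append((now, tokens))
        self._total += tokens
        return self._total <= self.max_tokens


guard = TokenBudgetGuard(max_tokens=10_000, window_seconds=3600)
print(guard.record(4_000, now=0))     # True  (4,000 in window)
print(guard.record(5_000, now=1800))  # True  (9,000 in window)
print(guard.record(3_000, now=2400))  # False (12,000 in window)
print(guard.record(1_000, now=4000))  # True  (first event expired: 9,000)
```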
FinOps 1.0 vs. FinOps for AI
Integrated Solution: OpenTelemetry (OTel)

✅ What is OTel?
An open-source framework for collecting observability data (traces, metrics, logs).
✅ How it Helps:
Use tracing capabilities to wrap LLM calls and inject critical FinOps context.
✅ Goal:
Embed FinOps intelligence directly into the application layer to report on every token instantly.

Code Example:
The OpenTelemetry framework is used with Google's Agent Development Kit (ADK) for cost allocation tracking.
How FinOps Tags Get In (Code Breakdown)
Mechanism: Wrap the agent's activity in a custom OpenTelemetry Span that carries budget details.
1. Starting the FinOps Span:
Declares a parent span. Subsequent instrumented code (like ADK's internal LLM calls) creates child spans.
2. Adding FinOps Metadata (Tags):
Injects cost center details. Child spans (with token counts) are automatically linked for allocation.
The Critical FinOps Metrics

These attributes, found in the nested LLM call span, are essential for a billable cost report:

finops.project_code
The Cost Center for Allocation (e.g., BLOG-FINOPS-001).
llm.usage.input_tokens
Cost Metric 1: Tokens sent to the model (part of the bill).
llm.usage.output_tokens
Cost Metric 2: Tokens received from the model (the other part of the bill).
llm.model_used
The Pricing Tier for calculation (e.g., gemini-2.5-flash-latest).
Purpose of each attribute:
    - Allocation: finops.project_code
    - Calculation: llm.usage.input_tokens, llm.usage.output_tokens, llm.model_used
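
To show how these attributes turn into a billable figure, here is a minimal cost calculation. The prices in `PRICES_PER_MILLION_TOKENS` are placeholders chosen for easy arithmetic, not real list prices, and the attribute dictionary is a hypothetical stand-in for a span read back from a tracing backend.

```python
from decimal import Decimal

# Placeholder prices in USD per one million tokens (NOT real list prices).
PRICES_PER_MILLION_TOKENS = {
    "gemini-2.5-flash-latest": {
        "input": Decimal("0.50"),
        "output": Decimal("2.00"),
    },
}


def span_cost(attributes):
    """Compute the cost of one LLM call from its span attributes."""
    prices = PRICES_PER_MILLION_TOKENS[attributes["llm.model_used"]]
    input_cost = prices["input"] * attributes["llm.usage.input_tokens"] / 1_000_000
    output_cost = prices["output"] * attributes["llm.usage.output_tokens"] / 1_000_000
    return input_cost + output_cost


attributes = {
    "finops.project_code": "BLOG-FINOPS-001",
    "llm.model_used": "gemini-2.5-flash-latest",
    "llm.usage.input_tokens": 12_000,
    "llm.usage.output_tokens": 3_000,
}
cost = span_cost(attributes)
print(attributes["finops.project_code"], cost)
```

`Decimal` is used instead of floats so that small per-call amounts add up without rounding drift when many calls are summed.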
Advanced FinOps: Multi-Agent Flows

Scenario:
Complex workflows where one agent delegates work to another.

OTel Power:
The trace context of the parent span is automatically carried down the call chain.

Result:
The entire multi-step agent choreography executes within the initial FinOps Span's context.

Benefit:
One unified, auditable cost report for the whole complex workflow, covered by a single, top-level FinOps tag.
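
On the reporting side, a unified report can be built by grouping spans by trace and applying the top-level FinOps tag to every span in that trace. The sketch below works on plain dictionaries; the span shape (`trace_id`, `name`, `attributes`), the span names, and the `UNALLOCATED` bucket are assumptions for illustration, not an export format defined by OpenTelemetry.

```python
from collections import defaultdict


def tokens_by_project(spans):
    """Sum token usage per project code, joining spans on their trace ID."""
    # The FinOps tag lives on the top-level span of each trace.
    project_of_trace = {
        span["trace_id"]: span["attributes"]["finops.project_code"]
        for span in spans
        if "finops.project_code" in span["attributes"]
    }
    totals = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0})
    for span in spans:
        project = project_of_trace.get(span["trace_id"], "UNALLOCATED")
        attrs = span["attributes"]
        totals[project]["input_tokens"] += attrs.get("llm.usage.input_tokens", 0)
        totals[project]["output_tokens"] += attrs.get("llm.usage.output_tokens", 0)
    return dict(totals)


# Hypothetical spans: one tagged two-agent workflow plus one untagged call.
spans = [
    {"trace_id": "t1", "name": "finops.agent_run",
     "attributes": {"finops.project_code": "BLOG-FINOPS-001"}},
    {"trace_id": "t1", "name": "planner.llm_call",
     "attributes": {"llm.usage.input_tokens": 800, "llm.usage.output_tokens": 200}},
    {"trace_id": "t1", "name": "writer.llm_call",
     "attributes": {"llm.usage.input_tokens": 1500, "llm.usage.output_tokens": 900}},
    {"trace_id": "t2", "name": "orphan.llm_call",
     "attributes": {"llm.usage.input_tokens": 100, "llm.usage.output_tokens": 50}},
]
report = tokens_by_project(spans)
print(report)
```

Keeping an explicit `UNALLOCATED` bucket makes untagged usage visible instead of silently dropping it.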
The Tutorial
IMPORTANT: The tutorial uses `ConsoleSpanExporter`, which prints spans to the terminal.
Do not use it in production.
Production Setup

Replace it with a dedicated OTLP exporter that sends data to a durable backend service:
Google Cloud Trace
Other observability backends (Jaeger, Datadog, New Relic)
Backend Value:
Enables querying and aggregation for FinOps reports.
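
A production-style setup might look like the configuration sketch below, which swaps the console exporter for an OTLP exporter behind a batching processor. It assumes the `opentelemetry-exporter-otlp` package is installed and that a collector or backend is listening at the endpoint shown; the endpoint and the service name `finops-agent` are placeholders to be replaced with real values.

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Placeholder service name; identifies this application in the backend.
resource = Resource.create({"service.name": "finops-agent"})

# Placeholder endpoint; point this at your collector or backend.
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)

provider = TracerProvider(resource=resource)
# Batch spans in the background instead of exporting each one synchronously.
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
```

Batching matters here because exporting every span synchronously would add latency to each LLM call being measured.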
Resources & Links
ADK Observability
Confirms ADK's native, built-in support for OpenTelemetry instrumentation.
Getting Started with OpenTelemetry
Explains Spans, Context, and Context Propagation.
ADK Agents
Details agent types that benefit from this tracing.
Runnable Code
Complete implementation examples and tutorials.