Azure OpenAI PTU Calculator

Optimize your Azure OpenAI costs with intelligent PTU sizing, real-time pricing from the Azure Retail Prices API, and comprehensive cost analysis.

Try it live at ptucalc.com | User Guide | Changelog

Features

Pricing & Cost Analysis

Live Azure pricing — fetches real-time rates from the Azure Retail Prices API with intelligent fallback
5 pricing tiers compared — PAYGO, PTU On-Demand, PTU Monthly Reserved, PTU 1-Year Reserved, and Spillover (hybrid) model
Priority Processing (GA) — new pay-per-token tier with SLA-backed latency guarantees
Deployment-aware pricing — Global, Data Zone, and Regional deployments with correct per-deployment rates

Models & Usage

19 PTU-supported models — GPT-5.5, GPT-5.4, GPT-5.4 Mini, GPT-5.3 Codex, GPT-5.2, GPT-5.2 Codex, GPT-5.1, GPT-5.1 Codex, GPT-5, GPT-5 Mini, GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, GPT-4o, GPT-4o Mini, o3, o4-mini, o1, o3-mini
Two input methods — KQL/TPM data from Azure Log Analytics (Method A) or direct monthly token counts (Method B)
Prompt Cache Hit Rate — factor in Azure's prompt caching to reduce effective input tokens for more accurate PTU sizing
Region-aware model filtering — model picker shows only models available in the selected region
Spillover strategy — reserve base PTUs for average usage, let burst traffic spill over to PAYGO

Smart Analysis

Context-aware recommendations — PAYGO, Full PTU, or Spillover based on utilization rate and cost comparison
Break-even analysis — shows when PTU becomes cost-effective vs PAYGO
Burst pattern detection — identifies usage spikes and sizing implications
Interactive cost comparison chart with all tiers visualized
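
The break-even idea can be illustrated with a small sketch. It assumes a flat monthly PTU bill and a single blended PAYGO rate per million tokens; the function name `breakEvenMonthlyTokens` and all rates are hypothetical placeholders, not Azure prices or the calculator's actual implementation.

```javascript
// Hypothetical helper: monthly token volume at which a flat PTU bill
// equals the PAYGO bill. Rates are illustrative, not Azure prices.
function breakEvenMonthlyTokens({ ptuCount, ptuMonthlyRatePerUnit, paygoRatePerMillionTokens }) {
  if (!(paygoRatePerMillionTokens > 0)) {
    throw new RangeError('paygoRatePerMillionTokens must be positive');
  }
  const ptuMonthlyCost = ptuCount * ptuMonthlyRatePerUnit;
  // PAYGO cost = (tokens / 1e6) * rate, so solve for tokens.
  return (ptuMonthlyCost / paygoRatePerMillionTokens) * 1e6;
}

const breakEven = breakEvenMonthlyTokens({
  ptuCount: 15,
  ptuMonthlyRatePerUnit: 260,   // assumed $/PTU/month
  paygoRatePerMillionTokens: 5, // assumed blended $/1M tokens
});
console.log(`PTU is cheaper above ${breakEven.toLocaleString('en-US')} tokens/month`);
```

Below that volume PAYGO costs less under these assumptions; above it, the flat PTU bill wins.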

Optimization & Throttling Prevention (NEW)

429 Risk Score — real-time throttling risk gauge (0-100) with prioritized mitigation checklist
max_tokens Optimizer — concurrency impact analysis showing how tightening max_tokens increases effective capacity
Leaky Bucket Simulation — interactive visualization of Azure's rate-limiting algorithm with burst testing
Spillover Architecture Comparison — side-by-side analysis of PTU→PayGo, PTU→Priority Processing, and APIM AI Gateway patterns
Retry & Backoff Calculator — configurable retry strategy with exponential backoff + jitter, with code snippets (Python, JS, C#)
Right-Size Wizard — 4-step guided wizard for workload analysis, model selection, PTU sizing, and applying recommendations
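
As a rough illustration of exponential backoff with jitter, the sketch below uses the "full jitter" variant, where each delay is drawn uniformly between zero and a capped exponential ceiling. The names `backoffDelayMs` and `withRetry`, the default values, and the assumption that a throttled call rejects with an error carrying `status === 429` are all hypothetical; the snippets generated by the calculator may differ.

```javascript
// Full-jitter backoff: delay is uniform in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retries an async operation only while it fails with HTTP 429.
// Assumes the rejected error exposes a numeric `status` property.
async function withRetry(operation, options = {}) {
  const {
    maxRetries = 5,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    ...backoff
  } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (err.status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, backoff));
    }
  }
}
```

Randomizing the whole delay, rather than adding a small jitter on top of a fixed delay, spreads retries from many clients more evenly after a shared throttling event.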

User Experience

Guided Quick Tour — 8-step interactive walkthrough with sample data (now includes Optimization tab)
Tabbed results — Cost Analysis, Usage Patterns, Optimization, and Advanced tabs
Sticky executive summary — recommendation, savings, PTUs, and utilization always visible
Export — download analysis as CSV or copy results as JSON
Built-in KQL query — ready-to-use Log Analytics query for gathering usage data

How Pricing Works

The calculator uses a 4-tier pricing priority system:

Custom Override — user-entered rates (for enterprise/negotiated pricing)
Live Azure API — real-time from Azure Retail Prices API via a Vercel serverless proxy (api/azure-pricing.js)
Official Hardcoded — curated rates from Microsoft documentation
Fallback — conservative estimates when all else fails
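
The priority order above can be sketched as a simple first-match lookup. This is a minimal illustration, not the code in `enhanced_pricing_service.js`; the function name `resolveRate` and the shape of its argument are assumptions.

```javascript
// Returns the first usable rate in priority order, labelled with its source.
// A rate is considered usable if it is a finite number greater than zero.
function resolveRate({ customOverride, liveApi, officialHardcoded, fallback }) {
  const candidates = [
    ['custom', customOverride],
    ['live', liveApi],
    ['official', officialHardcoded],
    ['fallback', fallback],
  ];
  for (const [source, rate] of candidates) {
    if (Number.isFinite(rate) && rate > 0) return { source, rate };
  }
  throw new Error('No pricing source produced a usable rate');
}

// The live API is unavailable here, so the curated rate is used.
console.log(resolveRate({ liveApi: null, officialHardcoded: 1.1, fallback: 1.5 }));
```

Returning the source alongside the rate lets the UI show where a price came from.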

Live pricing is cached for 3 hours and includes:

PTU hourly on-demand rates per deployment type
PTU reservation prices (1-Month and 1-Year terms)
PAYGO per-token rates (input/output) per model and deployment
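
A 3-hour cache of this kind could look roughly like the sketch below. The helper `createTtlCache` and its injectable clock are hypothetical and exist only to illustrate time-based expiry; the project's real cache may be structured differently.

```javascript
const THREE_HOURS_MS = 3 * 60 * 60 * 1000; // 10,800,000 ms

// Minimal in-memory cache whose entries expire after ttlMs.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache(ttlMs = THREE_HOURS_MS, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const hit = entries.get(key);
      if (!hit) return undefined;
      if (now() - hit.storedAt >= ttlMs) {
        entries.delete(key); // stale: drop it and report a miss
        return undefined;
      }
      return hit.value;
    },
    set(key, value) {
      entries.set(key, { value, storedAt: now() });
    },
  };
}
```

A caller would check `get(key)` first and only fetch from the pricing API on a miss, then store the result with `set(key, value)`.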

Azure PTU Reservation Tiers

Azure offers two PTU reservation options (there is no 3-year PTU reservation):

Reservation	Discount vs On-Demand	Commitment
Monthly (1-Month)	~64% off	No long-term commitment
1-Year	~70% off	1-year commitment
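
To see what these discounts mean in practice, the sketch below applies the approximate percentages from the table to an on-demand hourly rate. It is only a back-of-the-envelope model: the 730 hours-per-month figure, the $1/PTU/hour rate, and the function name `monthlyPtuCost` are assumptions, and actual reservation prices should be taken from Azure rather than derived from a discount.

```javascript
const HOURS_PER_MONTH = 730; // common billing approximation (assumption)

// Approximate discounts vs on-demand, taken from the table above.
const DISCOUNTS = { onDemand: 0, monthly: 0.64, yearly: 0.70 };

function monthlyPtuCost(ptuCount, hourlyRatePerPtu, tier) {
  const discount = DISCOUNTS[tier];
  if (discount === undefined) throw new RangeError(`Unknown tier: ${tier}`);
  return ptuCount * hourlyRatePerPtu * HOURS_PER_MONTH * (1 - discount);
}

// 15 PTUs at an illustrative $1/PTU/hour on-demand rate.
for (const tier of Object.keys(DISCOUNTS)) {
  console.log(tier, monthlyPtuCost(15, 1, tier).toFixed(2));
}
```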

KQL Query for Usage Data

Use this query in Azure Monitor Log Analytics to get your TPM data:

let window = 1m;
let p = 99;  // percentile() expects a value in the range 0-100
AzureMetrics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where MetricName in ("ProcessedPromptTokens", "ProcessedCompletionTokens")
| where TimeGenerated >= ago(7d)
| summarize Tokens = sum(Total) by bin(TimeGenerated, window)
| summarize
    AvgTPM = avg(Tokens),
    P99TPM = percentile(Tokens, p),
    MaxTPM = max(Tokens)
| extend
    AvgPTU = ceiling(AvgTPM / 50000.0),
    P99PTU = ceiling(P99TPM / 50000.0),
    MaxPTU = ceiling(MaxTPM / 50000.0)
| extend RecommendedPTU = max_of(AvgPTU, P99PTU)
| project AvgTPM, P99TPM, MaxTPM, AvgPTU, P99PTU, MaxPTU, RecommendedPTU

Note: The 50000.0 divisor is a generic placeholder. Replace it with the TPM-per-PTU value for your model from the official table (e.g., GPT-4.1 = 3,000 TPM/PTU, GPT-4o = 2,500 TPM/PTU). Leaving the placeholder in place will understate the PTUs required for those models.
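
The same conversion with a model-specific divisor can be sketched in a few lines. The lookup table contains only the two values quoted in the note above, and the function name `ptusForTpm` is hypothetical. Deployment minimums and purchase increments are not modeled here.

```javascript
// TPM-per-PTU values quoted in the note above; confirm them against the
// official table before relying on the result.
const TPM_PER_PTU = { 'gpt-4.1': 3000, 'gpt-4o': 2500 };

function ptusForTpm(model, tpm) {
  const perPtu = TPM_PER_PTU[model];
  if (!perPtu) throw new RangeError(`No TPM-per-PTU value for model: ${model}`);
  return Math.ceil(tpm / perPtu); // round up: partial PTUs cannot be purchased
}

console.log(ptusForTpm('gpt-4.1', 40000)); // 40000 / 3000 = 13.33, rounds up to 14
console.log(ptusForTpm('gpt-4o', 40000));  // 40000 / 2500 = 16
```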

Quick Start

Prerequisites

Node.js 18+
npm

Local Development

git clone https://github.com/ricmmartins/azureptucalc.git
cd azureptucalc
npm install
npm run dev
# Visit http://localhost:5173

Deploy to Vercel (Recommended)

Fork or clone this repo
Go to vercel.com, import your repo, and click Deploy
Or use Vercel CLI:
```
npm install -g vercel
vercel --prod
```

The included vercel.json is pre-configured. The api/azure-pricing.js serverless function handles Azure Retail Prices API proxying to avoid CORS issues.
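
For readers curious what such a proxy involves, here is a minimal sketch of a Vercel-style Node.js function that forwards a `$filter` expression to the Azure Retail Prices API. It is not the contents of api/azure-pricing.js: the `filter` query parameter, the error payloads, and the cache header value are assumptions made for illustration, and the sketch relies on the global `fetch` available in Node.js 18+.

```javascript
const RETAIL_PRICES_URL = 'https://prices.azure.com/api/retail/prices';

// Builds the upstream URL, letting URLSearchParams handle the encoding
// of the OData $filter expression.
function buildRetailPricesUrl(filter) {
  const url = new URL(RETAIL_PRICES_URL);
  if (filter) url.searchParams.set('$filter', filter);
  return url.toString();
}

// Vercel-style handler: the browser calls this same-origin endpoint,
// and the server-side fetch avoids the CORS restriction.
async function handler(req, res) {
  try {
    const upstream = await fetch(buildRetailPricesUrl(req.query.filter));
    if (!upstream.ok) {
      return res.status(upstream.status).json({ error: 'Upstream pricing request failed' });
    }
    const data = await upstream.json();
    res.setHeader('Cache-Control', 's-maxage=10800'); // 3 hours, assumed
    return res.status(200).json(data);
  } catch (err) {
    return res.status(502).json({ error: 'Pricing proxy error' });
  }
}
```

In a real Vercel project the handler would be the file's default export. A request might pass a filter such as `serviceName eq 'Cognitive Services'`, which is an illustrative value only.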

Project Structure

azureptucalc/
├── api/
│   └── azure-pricing.js          # Vercel serverless: Azure Retail Prices API proxy
├── src/
│   ├── components/
│   │   ├── ui/                   # Shadcn/UI components
│   │   ├── EnhancedResults.jsx   # Executive summary & cost breakdown
│   │   ├── GuidedTour.jsx        # Interactive quick tour
│   │   ├── MaxTokensOptimizer.jsx    # max_tokens concurrency optimizer
│   │   ├── LeakyBucketVisualization.jsx  # Interactive leaky bucket simulation
│   │   ├── ThrottlingAdvisor.jsx     # 429 Risk Score gauge
│   │   ├── SpilloverComparison.jsx   # Spillover architecture comparison
│   │   ├── RetryCalculator.jsx       # Retry & backoff calculator
│   │   └── RightSizeWizard.jsx       # 4-step PTU right-sizing wizard
│   ├── enhanced_pricing_service.js   # Pricing API client with cache & fallback
│   ├── officialPTUPricing.js         # Official PTU rates & reservation overrides
│   ├── official_token_pricing.js     # PAYGO & Priority Processing rates
│   ├── enhanced_model_config.json    # 19 PTU model definitions
│   ├── ptu_supported_models.json     # Model support matrix
│   ├── external_pricing_config.json  # Fallback pricing config
│   ├── ExternalPricingService.js     # Config-based pricing service
│   ├── App.jsx                       # Main application
│   └── main.jsx                      # Entry point
├── deployment/                   # Docker, Bicep, Azure Static Web Apps configs
├── docs/                         # User guide
├── vercel.json                   # Vercel deployment config
└── package.json

Configuration

Environment Variables (Optional)

Set in Vercel dashboard:

VITE_AZURE_PRICING_API=https://prices.azure.com/api/retail/prices
VITE_CACHE_DURATION=10800000

Alternative Deployment Options

Azure Static Web Apps

az staticwebapp create \
  --name azureptucalc \
  --resource-group rg-azureptucalc \
  --source https://github.com/ricmmartins/azureptucalc \
  --location "East US 2" \
  --branch main \
  --app-location "/" \
  --output-location "dist"

Docker / Azure Container Apps

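# Build stage: install dependencies and produce the static bundle in /app/dist
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# npm ci installs exactly the versions pinned in package-lock.json
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: serve the built files with nginx
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]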

Contributing

We welcome contributions! See CONTRIBUTING.md for details.

Bug Reports — include screenshots and browser info
Feature Requests — describe the business value
Code — fork → branch → PR
Docs — improve clarity, add examples

FAQ

How accurate are the pricing calculations? The calculator fetches live rates from the Azure Retail Prices API. Fallback data is regularly updated against official documentation. Always verify with the Azure pricing page before making purchasing decisions.

Can I use this without KQL data? Yes! Use Method B (token counts) to enter your monthly input/output token consumption directly, or use Method A with estimated TPM values.

What deployment types are supported? Global (multi-region, lowest cost), Data Zone (EU/US data residency), and Regional (single-region, lowest latency). Each has different PTU pricing.

How does the spillover model work? Reserve base PTUs for average usage, let burst traffic spill over to PAYGO. Ideal for predictable baselines with occasional spikes (2–5× average).
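
The cost of a spillover setup can be approximated as a fixed reserved part plus a variable overflow part. The sketch below assumes a flat monthly rate per PTU and a single blended PAYGO rate; the function name `spilloverMonthlyCost` and every number are illustrative, not Azure prices.

```javascript
// Hypothetical estimate: reserved PTUs cover the baseline, and tokens the
// reservation cannot absorb are billed at a blended PAYGO rate.
function spilloverMonthlyCost({ basePtus, ptuMonthlyRatePerUnit, overflowTokens, paygoRatePerMillionTokens }) {
  const reserved = basePtus * ptuMonthlyRatePerUnit;
  const overflow = (overflowTokens / 1e6) * paygoRatePerMillionTokens;
  return { reserved, overflow, total: reserved + overflow };
}

const estimate = spilloverMonthlyCost({
  basePtus: 15,
  ptuMonthlyRatePerUnit: 260,   // assumed $/PTU/month
  overflowTokens: 40000000,     // tokens served by PAYGO during bursts
  paygoRatePerMillionTokens: 5, // assumed blended $/1M tokens
});
console.log(estimate); // { reserved: 3900, overflow: 200, total: 4100 }
```

Comparing this total with the cost of sizing PTUs for peak traffic shows whether spillover is worthwhile for a given burst profile.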

What is Priority Processing? A GA pay-per-token option with SLA-backed low-latency guarantees. Available for select models on Global and Data Zone deployments. Pricing varies by model.

Is my data secure? All calculations happen in your browser. No usage data is sent to external servers. The app only fetches public Azure pricing information.

License

This project is licensed under the MIT License. See LICENSE for details.

Acknowledgements

Azure OpenAI Service team for official pricing data
Ahmed Geedi for the PTU sizing & throttling prevention best practices document that inspired the Optimization tab
All contributors and testers
The Azure community

Made with love for the Azure community — ptucalc.com
