# I Compared the Real Cost of Claude Code, OpenRouter, and Image APIs

> Source: <https://dev.to/cleandatadev/i-compared-the-real-cost-of-claude-code-openrouter-and-image-apis-1cip>
> Published: 2026-06-17 14:15:00+00:00

An API request that looks cheap on a pricing page can become much more expensive inside a real product.

The pricing page normally gives you the unit rate, such as a price per 1M input tokens, per 1M output tokens, or per generated image.

That is useful, but it is not yet a production budget.

A production workflow can also include repeated context, tool results, cache writes, retries, duplicate submissions, failed media jobs, and outputs that users reject and regenerate.

I wanted to see how much those factors change the estimate, so I built the same budget model for three different workloads: a coding agent (Claude Code), a model gateway (OpenRouter), and image generation APIs.

This article explains the method. I also published an editable calculator and downloadable CSV dataset at the end.

Consider a small application with the following usage:

```
100 monthly active users
20 active days per month
3 requests per user per active day
2,000 average input tokens per request
500 average output tokens per request
$3 per 1M input tokens
$15 per 1M output tokens
3% retry rate
15% planning buffer
```

These are editable planning assumptions, not universal usage averages or an official provider quote.

The first step is to estimate request volume:

```
monthly requests =
100 users
× 20 active days
× 3 requests

monthly requests = 6,000
```

The input-token cost is:

```
6,000 requests
× 2,000 input tokens
÷ 1,000,000
× $3

input cost = $36
```

The output-token cost is:

```
6,000 requests
× 500 output tokens
÷ 1,000,000
× $15

output cost = $45
```

That produces a listed base cost of:

```
$36 + $45 = $81
```

Adding the editable 3% retry assumption:

```
$81 × 1.03 = $83.43
```

Adding a further 15% planning buffer:

```
$83.43 × 1.15 = $95.94
```

The difference is modest in this example ($14.94 on top of the $81 base), but the same percentage multipliers translate into much larger absolute amounts at higher volume.
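The arithmetic above can be collected into one small function. This is a minimal Python sketch, not the published calculator: the function name `monthly_text_budget` and its parameter names are assumptions made for illustration, and the inputs are the same editable planning values used in the example.

```python
def monthly_text_budget(
    users: int,
    active_days: int,
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
    retry_rate: float,
    buffer_rate: float,
) -> dict[str, float]:
    """Return every stage of the estimate so no multiplier is hidden."""
    requests = users * active_days * requests_per_day
    input_cost = requests * input_tokens / 1_000_000 * input_price_per_m
    output_cost = requests * output_tokens / 1_000_000 * output_price_per_m
    base_cost = input_cost + output_cost
    with_retries = base_cost * (1 + retry_rate)      # retry assumption
    planned_budget = with_retries * (1 + buffer_rate)  # planning buffer
    return {
        "monthly_requests": requests,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "base_cost": base_cost,
        "with_retries": with_retries,
        "planned_budget": planned_budget,
    }


# Same planning assumptions as the worked example (illustrative, not a quote).
estimate = monthly_text_budget(
    users=100, active_days=20, requests_per_day=3,
    input_tokens=2_000, output_tokens=500,
    input_price_per_m=3.0, output_price_per_m=15.0,
    retry_rate=0.03, buffer_rate=0.15,
)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

Returning each intermediate value, rather than only the final number, keeps the listed base cost visible next to the planned budget.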

The important point is that `$95.94` is not an official price quote. It is a derived planning result based on listed prices, usage assumptions, a retry assumption, and a planning buffer.

Those categories should remain separate.

A coding-agent task is not equivalent to one chat message.

One completed task may involve:

The user may only see a short final response such as:

> Fixed the validation bug and added a regression test.

However, the agent may have processed a much larger amount of context before producing that response.

For coding-agent workflows, I find this cost unit more useful:

```
cost per completed task
```

rather than:

```
cost per user message
```

A simplified task formula is:

```
task cost =
model calls per task
× estimated cost per model call
× retry multiplier
```

Each model call may include:

```
input tokens =
instructions
+ conversation history
+ source files
+ tool definitions
+ previous tool results
```

That is why two tasks with similarly short final answers can have very different costs.

A one-file configuration fix and a repository-wide migration should not share the same expected token budget.
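To make the task formula concrete, here is a hedged Python sketch. It sums a per-call cost over every model call in a task, which is equivalent to calls per task times the average cost per call, and then applies the retry multiplier. The class `ModelCall`, the function names, every token count, and the reuse of the illustrative $3 and $15 per 1M token rates are assumptions for illustration, not measurements of any real agent.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    """Token counts for one model call (all values are hypothetical)."""
    instructions: int
    conversation_history: int
    source_files: int
    tool_definitions: int
    previous_tool_results: int
    output_tokens: int

    @property
    def input_tokens(self) -> int:
        # Mirrors the input-token breakdown listed above.
        return (
            self.instructions
            + self.conversation_history
            + self.source_files
            + self.tool_definitions
            + self.previous_tool_results
        )


def call_cost(call: ModelCall, input_price_per_m: float, output_price_per_m: float) -> float:
    return (
        call.input_tokens / 1_000_000 * input_price_per_m
        + call.output_tokens / 1_000_000 * output_price_per_m
    )


def task_cost(
    calls: list[ModelCall],
    retry_multiplier: float = 1.0,
    input_price_per_m: float = 3.0,    # illustrative rate, not a quote
    output_price_per_m: float = 15.0,  # illustrative rate, not a quote
) -> float:
    per_call = (call_cost(c, input_price_per_m, output_price_per_m) for c in calls)
    return sum(per_call) * retry_multiplier


# Two made-up tasks that could both end with a one-line final answer.
config_fix = [ModelCall(1_000, 500, 2_000, 1_500, 0, 300)] * 2
migration = [ModelCall(1_000, 8_000, 40_000, 1_500, 20_000, 800)] * 30

print(f"config fix: ${task_cost(config_fix, retry_multiplier=1.03):.4f}")
print(f"migration:  ${task_cost(migration, retry_multiplier=1.03):.4f}")
```

With these invented inputs the migration costs over a hundred times more than the configuration fix, even though the user-visible reply could be equally short.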

Tools do not need a separate flat fee to increase the total cost.

The model may need to:

Large command outputs and large file reads can become part of later input context.
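A short Python sketch shows how this compounds. It assumes the simplest possible case: every tool result stays in the context for all later calls, with no caching discount, truncation, or context compaction. The function name and all token counts are hypothetical.

```python
def cumulative_input_tokens(base_context: int, tool_output_sizes: list[int]) -> list[int]:
    """Input tokens seen by each model call when earlier tool results
    are carried forward unchanged into every later call."""
    per_call = []
    context = base_context
    for size in tool_output_sizes:
        per_call.append(context)  # this call reads everything so far
        context += size           # its tool result joins the context
    per_call.append(context)      # final call that writes the answer
    return per_call


# Three tool calls; the second one returns a large command output.
calls = cumulative_input_tokens(3_000, [500, 12_000, 800])
print(calls)       # [3000, 3500, 15500, 16300]
print(sum(calls))  # 38300 input tokens processed in total
```

The tool outputs add up to 13,300 tokens, but the 12,000-token result is read again by each later call, so the total input processed is 38,300 tokens.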

For a deeper breakdown, see the [Claude Code Token Cost Guide](https://aicostplanner.com/claude-code-token-cost/).

Image generation needs a different cost model.

Depending on the provider and model, the billing unit may be:

The cost of one accepted image is also different from the cost of one submitted API job.

Suppose a product needs 1,000 accepted images per month.

If users accept the first result every time, the application submits approximately 1,000 generation jobs.

But if one accepted image requires an average of 2.4 attempts:

```
1,000 accepted images
× 2.4 attempts

= 2,400 submitted jobs
```

At an illustrative price of `$0.04` per submitted generation:

```
2,400 × $0.04 = $96
```

The effective cost per accepted image becomes:

```
$96 ÷ 1,000 = $0.096
```

That is more than twice the listed one-generation price.

This does not mean every provider charges for every failed request. Failed-job billing varies by provider, failure type, and processing stage.
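The same calculation as a small Python sketch. The function name and the `billable_share` parameter are assumptions added for illustration: `billable_share` is a placeholder for the fraction of submitted jobs that a given provider actually bills, and it defaults to 1.0 to reproduce the example above.

```python
def accepted_image_cost(
    accepted_images: int,
    attempts_per_accepted: float,
    price_per_job: float,
    billable_share: float = 1.0,  # hypothetical: share of submitted jobs billed
) -> dict[str, float]:
    submitted_jobs = accepted_images * attempts_per_accepted
    billed_jobs = submitted_jobs * billable_share
    total_cost = billed_jobs * price_per_job
    return {
        "submitted_jobs": submitted_jobs,
        "billed_jobs": billed_jobs,
        "total_cost": total_cost,
        "cost_per_accepted_image": total_cost / accepted_images,
    }


# 1,000 accepted images, 2.4 attempts each, illustrative $0.04 per job.
result = accepted_image_cost(1_000, 2.4, 0.04)
for name, value in result.items():
    print(f"{name}: {value:,.3f}")
```

Lowering `billable_share` is one way to test how much failed-job billing rules matter, but the real value has to come from the provider's documentation and billing records.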

The useful distinction is between:

Those numbers should be reconciled with request IDs and the provider billing dashboard.

The [Image Generation API Cost Guide](https://aicostplanner.com/image-generation-api-cost/) explains the different billing units in more detail.

A marketplace or gateway introduces another layer of cost management.

With OpenRouter, it is useful to distinguish between:

These concepts are related, but they are not interchangeable.

For example, a key may have a spending limit even when the account still has credits. A request may also fail because of a permission or policy restriction rather than insufficient balance.

OpenRouter currently reserves the right to expire unused credits one year after purchase. An HTTP `402` normally indicates insufficient credits, while `403` generally points to a permission, guardrail, or moderation restriction.
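In application code, that distinction can be kept visible by tagging failures before they reach logs or alerts. The sketch below is based only on the status-code meanings described above. The function name and category labels are hypothetical, and the provider's current error documentation should be checked before relying on the mapping.

```python
def classify_gateway_error(status_code: int) -> str:
    """Map an HTTP status code to a coarse billing-related category."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 402:
        # Normally insufficient credits: check balance and key limits.
        return "insufficient_credits"
    if status_code == 403:
        # Generally permission, guardrail, or moderation: topping up will not help.
        return "permission_or_policy"
    return "other"


for code in (200, 402, 403, 500):
    print(code, classify_gateway_error(code))
```

Keeping the two categories apart avoids treating a policy restriction as a billing problem.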

I documented the operational checks separately in [OpenRouter Credits](https://aicostplanner.com/openrouter-credits/).

For a text API request, I would record at least:

```
request_id
provider
model
input_tokens
output_tokens
estimated_cost
status
```

For a more useful production record, I would add:

```
cached_input_tokens
retry_count
latency
provider_reported_cost
created_at
```

For media jobs, I would also record:

```
job_id
duration_or_resolution
submitted_at
completed_at
failure_stage
accepted_by_user
```
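As a sketch of how the text-request fields might be stored, the Python dataclass below uses the field names listed above. The types, the defaults, the choice of seconds for `latency`, and the `estimate_gap` helper are assumptions for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TextRequestRecord:
    # Minimum fields
    request_id: str
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    estimated_cost: float
    status: str
    # Additional production fields
    cached_input_tokens: int = 0
    retry_count: int = 0
    latency: Optional[float] = None  # assumed to be seconds
    provider_reported_cost: Optional[float] = None
    created_at: Optional[datetime] = None


def estimate_gap(records: list[TextRequestRecord]) -> float:
    """Provider-reported cost minus estimated cost, summed over the
    records that actually carry a provider-reported figure."""
    return sum(
        r.provider_reported_cost - r.estimated_cost
        for r in records
        if r.provider_reported_cost is not None
    )


records = [
    TextRequestRecord(
        "req-1", "example-provider", "example-model", 2_000, 500, 0.0135, "ok",
        provider_reported_cost=0.0150,
        created_at=datetime.now(timezone.utc),
    ),
    TextRequestRecord("req-2", "example-provider", "example-model", 1_800, 400, 0.0114, "ok"),
]
print(f"gap on reconciled records: ${estimate_gap(records):.4f}")
```

A persistent non-zero gap is a signal that the estimate is missing something, such as cache writes or retries that were not logged.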

Without request-level records, it is difficult to answer basic billing questions:

I now separate cost planning into four layers.

The first layer is the listed price, which comes from provider pricing documentation.

Examples include:

```
price per 1M input tokens
price per 1M output tokens
price per generated image
price per generated video second
```

The second layer is usage assumptions, which are specific to the product:

```
active users
requests per user
average input tokens
average output tokens
tool calls per task
attempts per accepted image
generated video duration
```

The third layer is operational factors, which may include:

```
retries
duplicate submissions
cache writes
cache reads
rejected outputs
failed jobs
```

These should be editable assumptions rather than universal facts.

The fourth layer is the budget buffer. It can help when usage is uncertain, but it should not hide the underlying estimate.

The calculator should show both:

```
base API cost
planned operational budget
```
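One way to keep the four layers apart in code is to give each its own type and report both totals together. This is a sketch under stated assumptions, not the published calculator: the class names, the field names, and the choice to add the retry and duplicate rates into a single multiplier are all illustrative simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListedPrice:  # layer 1: from provider pricing documentation
    input_per_m: float
    output_per_m: float


@dataclass(frozen=True)
class UsageAssumptions:  # layer 2: specific to the product
    monthly_requests: int
    avg_input_tokens: int
    avg_output_tokens: int


@dataclass(frozen=True)
class OperationalFactors:  # layer 3: editable assumptions
    retry_rate: float = 0.0
    duplicate_rate: float = 0.0

    @property
    def multiplier(self) -> float:
        # Simplifying assumption: the rates add up linearly.
        return 1 + self.retry_rate + self.duplicate_rate


def plan_budget(
    price: ListedPrice,
    usage: UsageAssumptions,
    ops: OperationalFactors,
    buffer_rate: float,  # layer 4: planning buffer
) -> dict[str, float]:
    tokens_cost = (
        usage.avg_input_tokens * price.input_per_m
        + usage.avg_output_tokens * price.output_per_m
    )
    base = usage.monthly_requests * tokens_cost / 1_000_000
    planned = base * ops.multiplier * (1 + buffer_rate)
    return {"base_api_cost": base, "planned_operational_budget": planned}


budget = plan_budget(
    ListedPrice(3.0, 15.0),
    UsageAssumptions(6_000, 2_000, 500),
    OperationalFactors(retry_rate=0.03),
    buffer_rate=0.15,
)
print(f"base API cost:              ${budget['base_api_cost']:.2f}")
print(f"planned operational budget: ${budget['planned_operational_budget']:.2f}")
```

Because each layer is a separate object, changing a buffer or retry assumption never alters the listed price or the usage inputs.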

A cost calculator has important limitations.

It cannot determine:

Price is also not a measure of output quality.

A calculator is a planning tool, not a replacement for provider usage records or billing dashboards.

I put the complete model into the [2026 AI API Cost Benchmark](https://aicostplanner.com/ai-api-cost-benchmark/).

It includes:

The current pricing snapshot was reviewed on June 17, 2026.

The calculator is free to use and does not require an API key. Pricing changes frequently, so verify current provider documentation before launching a production workload and compare the estimate with a small real-world test.

What caused the largest difference between your original estimate and your actual AI bill: output tokens, repeated agent context, retries, or media regeneration?
