Idempotency For LLM Requests

Idempotency keeps a retried LLM operation from becoming duplicate work.

Retries without deduplication are dangerous: every retry can trigger another model call. This post explains how to choose idempotency keys for LLM tasks and how ReqRun deduplicates by project.

The failure mode

A user submits a support ticket classification. The server accepts the request but the client times out before it sees the response. The client retries.

Without idempotency, the second request may create a second model call. Now the app has two answers for one logical operation and no clean way to know which one should win.

Where duplicates come from

Duplicate LLM work is not only caused by impatient users. It can come from browser retries, queue retries, webhook redelivery, cron replays, serverless retries, and wait-mode fallback logic.

In every case the pattern is the same: one logical operation is submitted more than once. Typical examples:

  • Client retries after a timeout
  • Webhook sender retries delivery
  • Background job runner retries a failed job
  • User double-clicks a submit action
  • API gateway repeats a request after a network failure

The key should identify the operation

A good idempotency key is stable for the logical operation. It should not be a fresh random UUID generated on every retry.

Use identifiers your application already trusts: ticket id plus action name, webhook event id, task id, document id plus operation, or a database record id.

TypeScript
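// Assumes `reqrun` is an already-configured ReqRun client instance.
// The key comes from the ticket id plus the action, so every retry reuses it.
await reqrun.chat.completions.create({
  model: "gpt-5-nano",
  messages: [{ role: "user", content: "Classify ticket 842" }],
  wait: true,
  idempotency_key: "ticket-842-classification",
});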
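
One way to keep keys stable is to build them in a single place from identifiers the application already owns. The helper below is a minimal sketch, not part of the ReqRun SDK: the `OperationRef` type and the `idempotencyKeyFor` function are hypothetical names, and the normalization rules are an assumption you may want to adapt.

```typescript
// Hypothetical helper, not part of any ReqRun SDK.
// Describes one logical operation using identifiers the app already trusts.
type OperationRef = {
  entity: string; // e.g. "ticket", "document", "webhook-event"
  entityId: string | number; // the app's own id for that entity
  action: string; // e.g. "classification", "summary"
};

function idempotencyKeyFor(op: OperationRef): string {
  // Normalize each part so the same logical operation always yields the same key.
  const parts = [op.entity, String(op.entityId), op.action].map((part) =>
    part
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "")
  );
  if (parts.some((part) => part.length === 0)) {
    throw new Error("Every part of an idempotency key must be non-empty");
  }
  return parts.join("-");
}

const classificationKey = idempotencyKeyFor({
  entity: "ticket",
  entityId: 842,
  action: "classification",
});
// classificationKey is "ticket-842-classification" on the first attempt and on every retry.
```

Because the key is a pure function of the operation, a retry from a different process or after a restart still produces the same value.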

How ReqRun scopes idempotency

ReqRun deduplicates by project plus idempotency_key. The API key resolves the project automatically, so public API callers do not send a project_id.

If the same project submits the same idempotency key again within the dedupe window, ReqRun returns the existing request instead of creating duplicate work.
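
To make the scoping rule concrete, here is a small in-memory model of project-scoped deduplication. It is an illustration only, not ReqRun's implementation: the `DedupeStore` class, the window length, the counter-based `rr_` ids, and the behavior after the window expires are all assumptions made for the sketch.

```typescript
// Illustrative model only; ReqRun's real storage and window length are not shown here.
type StoredRequest = { requestId: string; createdAtMs: number };
type SubmitResult = { requestId: string; deduplicated: boolean };

class DedupeStore {
  private readonly entries = new Map<string, StoredRequest>();
  private readonly windowMs: number;
  private counter = 0;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  submit(projectId: string, idempotencyKey: string, nowMs: number): SubmitResult {
    // Scope the key by project so two projects can reuse the same key safely.
    const scopedKey = JSON.stringify([projectId, idempotencyKey]);
    const existing = this.entries.get(scopedKey);
    if (existing && nowMs - existing.createdAtMs < this.windowMs) {
      // Same project, same key, inside the window: return the existing request.
      return { requestId: existing.requestId, deduplicated: true };
    }
    // First submission (or window expired, by assumption): create new work.
    this.counter += 1;
    const requestId = `rr_${this.counter}`; // made-up id format for the sketch
    this.entries.set(scopedKey, { requestId, createdAtMs: nowMs });
    return { requestId, deduplicated: false };
  }
}

const demoStore = new DedupeStore(60000);
const firstSubmit = demoStore.submit("project-a", "ticket-842-classification", 0);
const retrySubmit = demoStore.submit("project-a", "ticket-842-classification", 5000);
const otherProject = demoStore.submit("project-b", "ticket-842-classification", 5000);
// retrySubmit reuses firstSubmit's request id; otherProject gets its own.
```

The point of the composite key is that a key only has to be unique within one project, which is why callers never need to namespace their keys by hand.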

Bad keys are worse than no plan

A random idempotency key created per attempt gives a false sense of safety. It looks like the system supports idempotency, but every retry still creates new work.

The rule of thumb is simple: if the key changes when the client retries the same operation, it is not an idempotency key. It is just a request label.
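
The difference is easy to see in a small simulation. The sketch below models a deduplicating server as nothing more than a set of seen keys and counts how many distinct pieces of work three attempts would create under each strategy. The `simulateRetries` function is a hypothetical name and the model ignores dedupe windows and projects.

```typescript
// Counts how many distinct units of work a deduplicating server would create
// when the client makes `attempts` tries, choosing a key for each try.
function simulateRetries(
  keyForAttempt: (attempt: number) => string,
  attempts: number
): number {
  const seenKeys = new Set<string>();
  for (let attempt = 1; attempt <= attempts; attempt++) {
    seenKeys.add(keyForAttempt(attempt));
  }
  return seenKeys.size;
}

// Bad: a fresh key per attempt, so the server cannot tell the tries apart.
const workWithFreshKeys = simulateRetries(
  (attempt) => `attempt-${attempt}-${Math.random()}`,
  3
);

// Good: the key depends only on the operation, so every try maps to one request.
const workWithStableKey = simulateRetries(() => "ticket-842-classification", 3);

console.log({ workWithFreshKeys, workWithStableKey });
// workWithFreshKeys is 3, workWithStableKey is 1
```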

What to store

Store the ReqRun request id (the value prefixed with rr_) next to your own operation record. That gives your application a durable way to reconnect to the result, even if the process that sent the request is gone.

For example, a ticket classification row can store ticket_id, operation_name, idempotency_key, and reqrun_request_id.
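
As a rough sketch, that row and the two writes around it might look like this. The column names come from the example above; everything else is an assumption for illustration, including the in-memory `Map` standing in for a database table, the `recordOperation` and `attachRequestId` helpers, and the `rr_example` id.

```typescript
// Row shape for one logical operation. Column names follow the example in the text.
interface ClassificationOperation {
  ticket_id: number;
  operation_name: string;
  idempotency_key: string;
  reqrun_request_id: string | null; // null until ReqRun has accepted the request
}

// In-memory stand-in for a table with a unique index on idempotency_key.
const operationRows = new Map<string, ClassificationOperation>();

// Create the row before calling ReqRun, or return the existing one on a retry.
function recordOperation(ticketId: number, operationName: string): ClassificationOperation {
  const idempotencyKey = `ticket-${ticketId}-${operationName}`;
  const existing = operationRows.get(idempotencyKey);
  if (existing) return existing;
  const row: ClassificationOperation = {
    ticket_id: ticketId,
    operation_name: operationName,
    idempotency_key: idempotencyKey,
    reqrun_request_id: null,
  };
  operationRows.set(idempotencyKey, row);
  return row;
}

// Save the request id once ReqRun returns it, so later code can look up the result.
function attachRequestId(idempotencyKey: string, requestId: string): void {
  const row = operationRows.get(idempotencyKey);
  if (!row) throw new Error(`No operation recorded for key ${idempotencyKey}`);
  row.reqrun_request_id = requestId;
}

const ticketRow = recordOperation(842, "classification");
attachRequestId(ticketRow.idempotency_key, "rr_example"); // placeholder id
```

Writing the row before sending the request means a crash between the two steps still leaves a record whose key can be resubmitted safely.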