# RFC: Stateless OAuth Client Identity for Ephemeral (DCR) Clients on a Doorkeeper-style Provider

> Source: <https://gist.github.com/elct9620/f884eb73b0dbf2a2852feb665233366b>
> Published: 2026-06-21 09:01:48+00:00

| Field | Value |
|---|---|
| Category | Informational / Design Pattern |
| Status | Draft — distilled from production experience |
| Applies to | Rails + Doorkeeper OAuth providers (Ruby), serving MCP or other dynamically-registered clients |
| Author | Aotokitsuruya（蒼時弦也） |
| Date | 2026-06-21 |
| Audience | Engineers who must host OAuth for dynamically-registered, one-time, unbounded clients (typically MCP agents) on top of Doorkeeper |

This document records a reusable design: when an OAuth client is **dynamically registered (RFC 7591 DCR), one-time, and unbounded in number** (the MCP client is the canonical case), how to carry it on a **Doorkeeper-style provider** with a **stateless JWS client identity** — so the database never accumulates an unbounded client-registration table — and how to then handle the **Access Grant / Access Token lifecycle, cleanup, revocation, and "Active Session" modeling** correctly.

Core conclusion: **the client can be stateless (a JWS), but grants and tokens necessarily persist.** The real engineering problem is not "whether to store" but "how the persisted rows get reclaimed — correctly, promptly, and completely." Cleanup **MUST** be keyed on *time and revocation state alone*, never on the client/application dimension, which under this design is dynamic and non-recurring. Statelessness is a deliberate trade with real costs (§4.5); this document states them rather than hiding them. An optional security extension (a `client_subject` denylist, §16) recovers per-client revocation at the price of a bounded amount of state — a deliberate hybrid, not pure statelessness.

Normative guidance uses RFC 2119 keywords (MUST / SHOULD / MAY). Code is a *recommended* shape, not a drop-in library; §15 lists the collaborators an adopter must supply.

1. Terminology
2. Problem Statement — and where CIMD fits
3. Architecture Overview
4. Stateless Client Identity (JWS as `client_id`)
5. Doorkeeper Integration — `application_class` and `by_uid`
6. Grant / Token Model, Custom Attributes, and Authorize-time Population
7. Coexistence with Classic OAuth — Type Dispatch
8. Lifecycle & Cleanup — the core problem
9. Active Session Modeling — a decoupled metadata record
10. Revocation — scope it; do not use the library-wide `revoke_all_for`
11. Security Considerations
12. Operational Considerations
13. Pitfalls / Landmines
14. Decision Log
15. Reusability Checklist — collaborators an adopter must supply
16. Future Work / Unverified Extensions
17. References

## 1. Terminology

- **DCR** — Dynamic Client Registration (RFC 7591). A client registers at runtime to obtain its `client_id`.
- **CIMD** — Client ID Metadata Document. An alternative where the `client_id` is an HTTPS URL the authorization server fetches to obtain the client's metadata; no server-issued identity (§2.1).
- **Ephemeral client** — a one-time client. Each instance / connection may run its own DCR and produce a non-reused `client_id`; the count is unbounded. MCP clients behave this way.
- **Stateless client identity** — the client's identity is encoded in a self-contained, server-signed string (a JWS), verified by signature rather than by a lookup; the server stores **no** client-registration row. (An optional denylist extension, §16, adds a bounded post-verification lookup — a hybrid.)
- **`client_subject`** — a stable per-registration subject identifier (a UUIDv7), embedded in the JWS and recorded *by value* on each grant/token. It is the true client-isolation key (§5).
- **derived_id** — an integer deterministically derived from `client_subject`, used as the provider's `application_id` surrogate. It is dynamic, non-recurring, and has no backing row. It is plumbing, not an isolation key.
- **denylist** — a small set of `client_subject`s blocked at a gate after resolution; a bounded amount of state that recovers per-client revocation (§16, unverified).
- **call-time boundary** — the resource server's check, on every request, that the presented token is acceptable (not revoked, not expired, right audience, owner still a member). It is the real authorization gate; this document scopes itself to the provider side and treats the boundary as adopter-supplied (§15).
- **SoT** — Single Source of Truth.
- **Normative keywords** — **MUST / MUST NOT / SHOULD / SHOULD NOT / MAY** follow RFC 2119.

## 2. Problem Statement — and where CIMD fits

A traditional OAuth provider assumes **a bounded set of long-lived clients, registered by humans or a controlled process** — so storing each client (with a secret) in an `oauth_applications` table is reasonable.

MCP (and similar agent / native-app ecosystems) breaks that assumption:

- **Clients self-register via DCR**, not human registration.
- **Clients are often one-time**: each desktop instance, reinstall, or even each connection may register anew.
- **The count is unbounded** — there is no natural cap.
- **Public clients (no secret)** — a native app cannot keep a secret, so it uses PKCE.

Persisting an `oauth_applications` row per DCR would grow an **unbounded table of mostly-dead rows**. That is the problem this design solves.

> **Key insight (the heart of this design):** making the *client* stateless solves unbounded growth of the *client table*. But *grants and tokens still persist*, so the unbounded-growth problem *moves* to the access-grant and access-token tables. Anyone adopting this pattern **MUST** treat "grant/token reclamation" as a first-class design problem (§8), not an afterthought.

A standards-track alternative is emerging. With **CIMD**, the `client_id` is an HTTPS URL the authorization server fetches to read the client's metadata document — there is no server-issued identity at all, which sidesteps the unbounded-table problem differently. The 2025-11-25 MCP authorization revision **elevated CIMD to SHOULD and demoted DCR to MAY**, with a registration preference order: pre-registration → CIMD → DCR → ask the user.

As of mid-2026, CIMD support across clients and servers is **still low in practice**, so DCR remains a pragmatic choice where CIMD is not yet viable. A robust provider **SHOULD** support DCR for those clients and **SHOULD** become CIMD-compatible (accept a URL `client_id` and fetch its metadata) as adoption grows — the two coexist behind the same resolution dispatch (§7).

The stateless **JWS-as-`client_id`** pattern in this document is one way to implement DCR without an unbounded registry. It remains preferable to CIMD when there is **no client-hosted metadata URL**, when identities must be **issued offline**, or when the AS needs to **control identity expiry and rotation itself**. Where CIMD fits the deployment, prefer it; this pattern is for the DCR path.

## 3. Architecture Overview

```
┌────────────────────────────────────────────────────────────────┐
│ Layer            Owner                          State?           │
├────────────────────────────────────────────────────────────────┤
│ Client identity  JWS (signed client_id)         STATELESS (none) │
│ OAuth flow       Doorkeeper (authz code + PKCE)  —               │
│ Credentials/SoT  access_tokens / access_grants   PERSISTED       │
│ Session metadata side table (last_seen, name)    PERSISTED (thin)│
│ Resource server  the call-time boundary          adopter-supplied│
└────────────────────────────────────────────────────────────────┘
```

Normative division of responsibility:

- The client-identity layer **MUST** write no per-client-registration row. A bounded denylist keyed only on *revoked* subjects (§16) is permitted — it is not a registration table and does not grow with registrations.
- Grants and tokens are the **SoT** for whether a *call* is authenticated.
- The session-metadata layer holds only what a token cannot express (display name, last-seen); it is observational and **MUST NOT** be the authority that gates a call.
- The resource server **MUST** complete its call-time boundary (resolve the token to scope / tenant / audience / membership) before any action runs. That boundary is the adopter's; this document specifies the provider side.

## 4. Stateless Client Identity (JWS as `client_id`)

The DCR endpoint accepts RFC 7591 metadata (`redirect_uris`, `client_name`, `scope`, …), **validates it**, and **issues a JWS as the client_id**, writing **nothing** to the database. The endpoint is unauthenticated, so two validations are non-negotiable *before* issuance:

- **redirect_uri validation** per RFC 7591 / RFC 8252 §7 — accept loopback (§7.3), private-use scheme (§7.1), or claimed-`https` (§7.2, the form RFC 8252 prefers); reject non-loopback `http`, wildcards, and open-redirect-prone values. This is an existing DCR / native-app requirement, not a new one: skipping it issues a *signed* open-redirect that PKCE does not fully contain.
- **rate limiting** of the endpoint (§11).
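
The redirect URI rules above can be sketched as a small predicate. This is a minimal illustration using only Ruby's standard library, not a complete validator: `RedirectUriPolicy` and its rules are hypothetical names introduced here, it leaves `localhost` out of the loopback set (RFC 8252 §8.3 marks it NOT RECOMMENDED), and it does not attempt to detect every open-redirect-prone value.

```ruby
require "uri"

# Hypothetical policy object; the name and the exact rules are assumptions for illustration.
module RedirectUriPolicy
  # RFC 8252 §7.3 loopback IP literals. `localhost` is left out on purpose.
  LOOPBACK_HOSTS = %w[127.0.0.1 ::1].freeze

  def self.acceptable?(value)
    text = value.to_s
    return false if text.include?("*")               # wildcards are never accepted

    uri = URI.parse(text)
    return false if uri.scheme.nil? || uri.fragment  # must be absolute, with no fragment

    case uri.scheme.downcase
    when "https" then !uri.host.to_s.empty?                  # claimed https (§7.2)
    when "http"  then LOOPBACK_HOSTS.include?(uri.hostname)  # loopback only (§7.3)
    else uri.scheme.include?(".")                            # private-use, reverse-domain scheme (§7.1)
    end
  rescue URI::InvalidURIError
    false                                            # unparseable input fails closed
  end
end
```

A DCR endpoint would run every entry of `redirect_uris` through such a check and refuse to issue the JWS if any entry fails, so that no invalid value is ever signed.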

A recommended `ClientIdentity` collaborator (adopter-supplied; see §15). The `metadata` it embeds is the validated RFC 7591 subset this design consumes — the wire names `client_name`, `redirect_uris`, `scope` (the resolved `Identity`'s `requested_scope` is just the alias for `scope`).

```
class ClientIdentity
  ISSUER   = "https://issuer.example"   # this AS's signing-domain tag (see note below)
  LIFETIME = 90.days
  ALG      = "ES256"

  Identity = Struct.new(:subject, :client_name, :redirect_uris, :requested_scope, keyword_init: true)

  # issue → the JWS string that becomes the client_id.
  # `metadata` MUST already have its redirect_uris validated (RFC 7591 / 8252 §7) — see above.
  def self.issue(metadata)
    now = Time.current.to_i
    payload = { iss: ISSUER, sub: SecureRandom.uuid_v7, iat: now,
                exp: now + LIFETIME.to_i, reg: metadata }
    JWT.encode(payload, KeyStore.active_private_key, ALG, kid: KeyStore.active_kid)
  end

  # resolve → an Identity, or nil when the JWS does not verify (fail closed).
  # Pure signature verification — NO database lookup. The optional denylist (§16) is a
  # separate gate run AFTER resolution, not here, so this stays stateless.
  def self.resolve(client_id)
    payload, _ = JWT.decode(client_id.to_s, nil, true,
                            algorithm: ALG, iss: ISSUER, verify_iss: true,
                            verify_expiration: true, required_claims: %w[iss sub exp],
                            leeway: 30) do |header|
      KeyStore.verification_key(header["kid"])   # unknown/retired kid → nil → verification fails
    end
    reg = payload["reg"] || {}
    Identity.new(subject: payload["sub"], client_name: reg["client_name"],
                 redirect_uris: Array(reg["redirect_uris"]), requested_scope: reg["scope"])
  rescue JWT::DecodeError
    nil
  end
end
```

> **`iss` note:** this JWS is an *internal* artifact, not an OAuth/OIDC token that leaves the AS's trust domain. Its `iss` is a private signing-domain tag, *not* the OAuth issuer URL of RFC 8414/9207; it still **MUST** be verified so the signature is bound to this AS's key namespace. (The library shown is `ruby-jwt`; treat the option names as illustrative and confirm against your JWT library.)

A presented `client_id` is verified by signature (`resolve` above): the signature, `iss`, and `exp` are checked (with a small `leeway` for clock skew, §11), and the required claims must be present. An unknown `kid` (a retired key) yields no verification key and fails. Any failed check **MUST** fail closed (treated as "no such client") and **MUST NOT** leak which check failed. If the denylist extension (§16) is enabled, its gate runs after this, also fail-closed.

| Property | Mechanism |
|---|---|
| Survives restarts | Verification is by signature, not a lookup; a restart does not affect an existing `client_id` |
| No registration table | Registration writes no row |
| Retirable | `exp` lapses → the client re-registers; rotating the signing key invalidates every `client_id` it signed |
| Retirement is not restart-triggered | Only expiry or key rotation retires an identity — both controllable |

> Adopters **MUST NOT** use "restart" as a retirement mechanism, and **MUST NOT** treat the JWS as a session or a credential. It is an *identity*: it authenticates *nothing* about whoever presents it. Anyone holding a `client_id` JWS can begin an authorization; access requires completing PKCE and user consent, so the JWS grants nothing on its own (§11). The credential is the access token. Enabling the denylist extension (§16) deliberately trades a slice of "no lookup" for per-client revocability — a hybrid, chosen knowingly.

The signing key is asymmetric (ES256). A recommended `KeyStore` collaborator:

- Keep the active **private** key in a secret store (env / secret manager), **never** in source.
- The JWS header's `kid` is a fingerprint of the **public** key; `verification_key(kid)` is a lookup in a small map of `kid → public key`.
- **Planned rotation:** hold both the outgoing and incoming keys in the map for a grace period (≥ the JWS `LIFETIME` for zero forced re-registration). Sign new identities with the incoming key while still verifying the outgoing one; drop the outgoing key only after the grace period.
- **Emergency rotation** (key compromise) deliberately collapses the grace period and forces all clients to re-register — see §4.5.
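
The rotation rules above can be sketched with an in-memory store. This is only an illustration under stated assumptions: `RotatingKeyStore`, `activate`, and `retire` are hypothetical names, the `kid` is taken to be a truncated SHA-256 of the public key's DER encoding (one possible fingerprint, not a prescribed one), and a real `KeyStore` would load its keys from a secret manager rather than hold them in a Ruby object.

```ruby
require "openssl"

# Hypothetical in-memory KeyStore sketch; names and the kid format are assumptions.
class RotatingKeyStore
  attr_reader :active_kid, :active_private_key

  def initialize
    @public_keys = {} # kid => public key (DER); the small verification map
  end

  # One possible fingerprint: truncated SHA-256 over the public key's DER encoding.
  def self.kid_for(key)
    OpenSSL::Digest::SHA256.hexdigest(key.public_to_der)[0, 16]
  end

  # Start signing with `key`. Previously activated keys stay verifiable until retired.
  def activate(key)
    @active_private_key = key
    @active_kid = self.class.kid_for(key)
    @public_keys[@active_kid] = key.public_to_der
    @active_kid
  end

  # Unknown or retired kid yields nil, so verification fails closed.
  def verification_key(kid)
    der = @public_keys[kid]
    der && OpenSSL::PKey.read(der)
  end

  # End of the grace period: stop verifying identities signed by the outgoing key.
  def retire(kid)
    raise ArgumentError, "cannot retire the active key" if kid == @active_kid
    @public_keys.delete(kid)
  end
end
```

Under this sketch a planned rotation is `activate(new_key)` followed, one grace period later, by `retire(old_kid)`. An emergency rotation is the same two calls back to back, which is what forces every client signed under the old key to re-register.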

Statelessness is a deliberate trade. The costs are real and **MUST** be weighed:

- **No single-client revocation (without the §16 extension).** There is no per-client row, so there is no per-client off switch. A compromised or abusive `client_id` can otherwise be retired only by waiting out its `exp` or rotating the signing key (which retires everyone). The §16 denylist extension blocks *future* authorizations for a subject — but already-issued tokens stay valid until their own expiry, so it **MUST** be coupled with token revocation (§10), and revocation immediacy remains bounded by the token TTL.
- **Key-rotation blast radius.** Rotating the signing key invalidates every `client_id` it signed. Planned rotation needs the multi-`kid` grace period (§4.4); emergency rotation forces all clients to re-register at once.
- **Identifier size.** A signed JWS `client_id` is ~300–500 bytes (ES256, base64url). It travels in authorize URLs (mind URL-length limits), redirects, and logs, and its `client_subject` is recorded by value on every grant and token (§6) — modest storage amplification. Anyone who can read those logs can also decode the JWS payload (it is signed, not encrypted), so keep no sensitive data in `reg`. In particular `client_name` (embedded in `reg`) is client-supplied and may carry user- or device-identifying text; since it rides in those same URLs and logs, treat it as possibly sensitive — evaluate per deployment, or resolve it out of band instead of embedding. Acceptable, but not free.
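
To make "signed, not encrypted" concrete, the sketch below reads the claims of a compact JWS without any key. It uses only the standard library; `peek_jws_payload` is a hypothetical helper name, and the token it inspects is a stand-in built with a fake signature, since the point is that the payload is readable regardless of whether the signature is valid.

```ruby
require "base64"
require "json"

# Hypothetical helper: decodes the claims of a compact JWS WITHOUT verifying it.
# Suitable for showing what a log reader can see, never for authentication.
def peek_jws_payload(compact_jws)
  _header, payload, _signature = compact_jws.split(".", 3)
  JSON.parse(Base64.urlsafe_decode64(payload))
end

# Build a stand-in token: header and payload are base64url JSON, the signature is fake.
segment = ->(obj) { Base64.urlsafe_encode64(JSON.generate(obj), padding: false) }
claims  = { "sub" => "example-subject", "reg" => { "client_name" => "Alice's laptop" } }
token   = [segment.call({ "alg" => "ES256" }), segment.call(claims), "fake-signature"].join(".")

leaked_name = peek_jws_payload(token).dig("reg", "client_name")
```

Here `leaked_name` is recovered from the token alone, which is why anything placed in `reg` should be treated as visible to whoever can read URLs and logs.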

## 5. Doorkeeper Integration — `application_class` and `by_uid`

This is the single bridge that lets a stateless client sit on Doorkeeper, and it is a **supported extension point**.

```
# config/initializers/doorkeeper.rb (the config this design depends on)
application_class "OauthApplication"
force_pkce                                            # public clients MUST use PKCE (S256)
grant_flows %w[authorization_code]
default_scopes :mcp
custom_access_token_attributes [ :workspace_id, :audience, :client_subject ]  # §6
access_token_expires_in 2.hours                       # see §8.1 / §12.2 on TTL
# The hook fires for BOTH the authorize and the token endpoints, so it MUST be
# guarded — record_active_session lives only on the authorize controller (§6.2).
after_successful_authorization do |controller, _ctx|
  controller.send(:record_active_session) if controller.respond_to?(:record_active_session, true)
end

# In config/routes.rb — NOT this initializer: `controllers` is a routes DSL method.
#   use_doorkeeper do
#     controllers authorizations: "oauth/authorizations"   # the authorize-time hook (§6.2)
#   end

# The application model (a separate file from the initializer above)
class OauthApplication < ApplicationRecord
  include ::Doorkeeper::Orm::ActiveRecord::Mixins::Application

  # Doorkeeper calls this to resolve a presented client_id
  # (Doorkeeper::OAuth::Client.find / .authenticate). nil → "no such client" (fail closed).
  def self.by_uid(uid)
    identity = ClientIdentity.resolve(uid)  # verify the JWS, do not look up a row
    return unless identity
    virtual(uid, identity)
  end

  # An in-memory record marked "persisted but never saved".
  def self.virtual(uid, identity)
    instantiate(
      "id"           => derived_id(identity.subject),  # stable integer surrogate (plumbing, not isolation)
      "uid"          => uid,
      "name"         => identity.client_name.presence || "MCP Client",
      "redirect_uri" => identity.redirect_uris.join("\n"),
      "scopes"       => identity.requested_scope.presence || Doorkeeper.config.default_scopes.to_s,
      "confidential" => false,  # public client
      "secret"       => nil
    )
  end

  # client_subject (UUID) → a stable positive bigint for Doorkeeper's integer application_id.
  # Mask to 63 bits so it always fits a signed bigint and stays non-negative.
  def self.derived_id(subject)
    OpenSSL::Digest::SHA256.digest(subject.to_s).unpack1("Q>") & 0x7FFF_FFFF_FFFF_FFFF
  end
end
```

Normative points:

- `instantiate` marks the object **not a new record** (it relies on `ActiveRecord::Persistence.instantiate` setting `@new_record = false`), so assigning it to a grant/token sets only the foreign key and does **not** autosave (autosave fires only for new records). Adopters **MUST** use `instantiate`, not `new`. (This leans on an ActiveRecord behavior; re-verify across **Rails** versions, not only Doorkeeper versions, and include the model's NOT-NULL columns in the hash.)
- `derived_id` **MUST** be a deterministic function of `client_subject` landing in a positive bigint. Doorkeeper compares `grant.application_id == client.id` (**integer comparison**) across the authorize and token requests, so the surrogate must be stable and non-nil.
- **On `derived_id` collisions:** the surrogate is a 63-bit hash, so two distinct `client_subject`s can in principle collide (the birthday bound applies to the distinct `client_subject`s carried by *all un-cleaned grants/tokens at once*, not just "live" registrations — negligible at realistic scale, but not zero). Two defenses, with different reach:
  - At the **token exchange**, a collision is non-exploitable: the PKCE `code_verifier` is bound to the specific authorization, so a colliding client cannot redeem another's code. (PKCE covers *only* the exchange.)
  - Everywhere else — **revocation, segmentation, session identity** — never use `derived_id` as a key; PKCE does not protect those. Key them on `client_subject`, the true isolation key (§8.5, §10, §9, Pitfalls #4/#15).
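
The "negligible at realistic scale, but not zero" claim can be checked with the usual birthday approximation, P ≈ 1 − exp(−n(n−1) / 2N) with N = 2^63 possible surrogates. The sketch below repeats the `derived_id` derivation from the model above and adds a hypothetical `collision_probability` helper; it treats SHA-256 output as uniform, which is an assumption of the estimate rather than something this document proves.

```ruby
require "openssl"
require "securerandom"

# Same derivation as OauthApplication.derived_id above: first 8 digest bytes, top bit masked.
def derived_id(subject)
  OpenSSL::Digest::SHA256.digest(subject.to_s).unpack1("Q>") & 0x7FFF_FFFF_FFFF_FFFF
end

# Hypothetical helper. Birthday approximation over N = 2**63 equally likely surrogates:
# the chance that at least two of `distinct_subjects` values share a derived_id.
def collision_probability(distinct_subjects)
  pairs = distinct_subjects * (distinct_subjects - 1) / 2.0
  1.0 - Math.exp(-pairs / 2**63)
end

subject = SecureRandom.uuid
stable  = derived_id(subject) == derived_id(subject) # deterministic across requests
million = collision_probability(1_000_000)           # subjects held by un-cleaned rows
billion = collision_probability(1_000_000_000)
```

Under these assumptions a million distinct subjects gives a probability on the order of 5e-8, while a billion gives roughly 5%. That gap is one more reason to keep un-cleaned rows bounded (§8) and to key anything security-relevant on `client_subject`.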

> **Verified against Doorkeeper 5.9.1:** the authorization-code→token exchange validates the grant with `grant.application_id == client.id` (integer comparison, `authorization_code_request.rb`); it does *not* dereference the application row. The core handshake is therefore correct with no application row present.

## 6. Grant / Token Model, Custom Attributes, and Authorize-time Population

Doorkeeper flows custom attributes from the grant onto the token (it slices `grant.attributes` by `custom_access_token_attributes` at issuance). Under that mechanism, adopters **MUST** add a database column for each on **both** the grant and token tables (the slice reads from the grant).

| Attribute | Purpose | Why by value |
|---|---|---|
| `audience` | RFC 8707 audience binding; the token is valid only for this server | security invariant |
| `workspace_id` / tenant key | the tenant the token is bound to | resolve the tenant at call time |
| `client_subject` | which client (= the JWS `sub`) | the server keeps no client registry; this is the isolation key, recorded by value |

> **Normative:** any dimension you will later use to identify a client / revoke / list sessions **MUST** be recorded by value on the token; you **MUST NOT** rely on resolving it through `application_id` (dynamic noise with no backing row).
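
The grant-to-token slice described above can be pictured with plain hashes. This is a toy model, not Doorkeeper's code: `token_attributes_from` is a hypothetical stand-in for the slice, and the rows are ordinary Ruby hashes with illustrative values.

```ruby
CUSTOM_ATTRIBUTES = %w[workspace_id audience client_subject].freeze

# Hypothetical stand-in for "copy the configured custom attributes from grant to token".
def token_attributes_from(grant_attributes)
  grant_attributes.slice(*CUSTOM_ATTRIBUTES)
end

grant_row = {
  "application_id" => 4_611_686_018_427_387_904, # derived surrogate, not copied by the slice
  "workspace_id"   => 7,
  "audience"       => "https://mcp.example/",
  "client_subject" => nil                        # was never set during authorization
}

token_attrs = token_attributes_from(grant_row)
```

In this model `token_attrs` carries `workspace_id` and `audience`, but its `client_subject` is still `nil`: a value that was not recorded on the grant cannot appear on the token later, which is why the attributes are populated at authorize time.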

Custom attributes reach the token only if they are first set on the **grant** during authorization. Subclass the provider's authorization controller and set them in a `before_action`:

```
class Oauth::AuthorizationsController < Doorkeeper::AuthorizationsController
  before_action :bind_audience, :resolve_tenant, :record_client_subject, only: %i[new create]

  private

  # current_resource_owner — Doorkeeper-provided: the signed-in member.

  # Honor the client's RFC 8707 `resource`, then set the audience to the *validated*
  # resource; fall back to the canonical value only when none was sent.
  def bind_audience
    resource = validate_requested_resource!(params[:resource])  # → the validated resource, or renders invalid_target
    params[:audience] = resource || CANONICAL_AUDIENCE          # CANONICAL_AUDIENCE: this server's RFC 9728 resource id
  end

  # → a tenant id the owner may authorize for, validated against its memberships;
  #   renders invalid_target on a tenant the owner cannot authorize (never raw from input).
  def resolve_tenant
    params[:workspace_id] = authorized_tenant_id_for(current_resource_owner, params)
  end

  # Fail-secure: trust only the verified JWS subject, overwriting any supplied value.
  def record_client_subject
    params[:client_subject] = ClientIdentity.resolve(params[:client_id])&.subject
  end

  # Invoked from the guarded after_successful_authorization hook (§5), which fires for
  # both the authorize and token endpoints — the guard ensures this runs only here.
  def record_active_session
    identity = ClientIdentity.resolve(params[:client_id])  # client_id is present on the authorize request
    McpSession.authorize!(
      tenant:         Tenant.find(params[:workspace_id]),
      user:           current_resource_owner,
      client_subject: params[:client_subject],   # already set, fail-secure, by the before_action
      client_name:    identity&.client_name,
      expires_at:     Time.current + Doorkeeper.config.access_token_expires_in
    )
  end
end
```

> **How do these `params` reach the grant?** Doorkeeper's `PreAuthorization` reads them from the request params and slices them (by `custom_access_token_attributes`) onto the grant when it is created; at token exchange the same slice copies them to the token. So setting them in a `before_action` — which runs before the `new` / `create` action builds the pre-authorization — lands them on the grant. (Verified against Doorkeeper 5.9.1.)

Helper contracts the adopter supplies: `validate_requested_resource!(resource)` → the resource when it names this server (RFC 8707 §2 / RFC 9728 id), else renders `invalid_target`; `authorized_tenant_id_for(owner, params)` → a tenant id the owner may authorize for, else renders `invalid_target`; `CANONICAL_AUDIENCE` → this server's canonical resource identifier (the value it publishes via RFC 9728).

## 7. Coexistence with Classic OAuth — Type Dispatch

If one provider must serve both "classic persisted OAuth clients" and "stateless DCR clients" (and, later, CIMD URL clients), **they do not conflict in Doorkeeper's core; they conflict in your customization layer**, which collapses to a single discriminator question.

| Seam | How to coexist |
|---|---|
| `by_uid` resolution | `super(uid)` … |
| Authorize endpoint | The DCR-specific invariants (audience binding, tenant binding, session recording) MUST activate only on the DCR branch, and SHOULD live in a concern loaded only there, so they cannot leak onto classic OAuth |
| Global config | `default_scopes`, `grant_flows`, TTLs are global switches. Per-context-adjustable ones (`custom_access_token_expires_in`, per-request scope) SHOULD be dispatched in a block |
| Resource servers | Each endpoint guards itself; naturally no conflict. If a token-introspection endpoint (RFC 7662) is exposed, it MUST answer consistently for both client types — return `client_subject` (not the raw JWS) as the client id, and align `active` with the call-time boundary, not the decoupled session view. Introspection necessarily reads the credential tables; the decoupled session's query savings (§9.2) apply only to the session list, not to introspection |
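
The "single discriminator question" can be sketched as a function that classifies a presented `client_id` by its shape. Everything here is hypothetical: the `ClientIdType` module, the three type names, and the shape rules are assumptions for illustration, and the shape only selects a branch. A JWS-shaped value still has to pass signature verification, and a URL still has to be fetched and validated, before the client is trusted.

```ruby
require "uri"

# Hypothetical discriminator; module name, type names, and shape rules are assumptions.
module ClientIdType
  # Compact JWS: three non-empty base64url segments separated by dots.
  JWS_SHAPE = /\A[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\z/

  def self.of(client_id)
    value = client_id.to_s
    return :cimd_url      if https_url?(value)
    return :stateless_dcr if value.match?(JWS_SHAPE)

    :classic # anything else is looked up as a persisted application row
  end

  def self.https_url?(value)
    uri = URI.parse(value)
    uri.is_a?(URI::HTTPS) && !uri.host.to_s.empty?
  rescue URI::InvalidURIError
    false
  end
end
```

With a function like this, `by_uid` and the authorize-time concern can both branch on the same answer, which keeps the DCR-only invariants off the classic path.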

> **Anti-pattern:** writing the DCR-specific audience/tenant invariants as *unconditional* before-actions hijacks every authorization and blocks classic clients. They **MUST** be conditional on the resolved type.

Do *not* stand up two Doorkeeper providers to coexist (one dispatching `by_uid` serves all). Do *not* make DCR clients persisted again to coexist — one-time clients explode the client table, which is the whole reason for statelessness.

## 8. Lifecycle & Cleanup — the core problem

- Each authorization → one grant (short TTL, typically 10 minutes; revoked at token exchange).
- Each token issuance → one token row.
- Without refresh tokens, an expired token requires re-running the full authorization flow (which needs user consent), so DCR token TTLs are often set **long** (days/weeks). This is a UX-vs-security trade — see §11 and §12.2. It assumes **human-in-the-loop consent**; if a host can complete consent non-interactively, the long-TTL-for-UX rationale collapses — prefer short TTL with automated re-consent, which also shrinks the leak window.

Doorkeeper's own `StaleRecordsCleaner` (exposed via `rake doorkeeper:db:cleanup`) runs queries shaped, *conceptually*, like:

```
clean_revoked : where.not(revoked_at: nil).where(revoked_at < now).in_batches(&:delete_all)
clean_expired : where.not(expires_in: nil).where(created_at < now - ttl)
                  .where(<adapter-specific expiration SQL> < now).in_batches(&:delete_all)
```

> **Verified fact (Doorkeeper 5.9.1):** both queries key *only* on `revoked_at` / `created_at` + `expires_in` — they *never* touch `application_id`.

Consequences:

- The dynamic, non-recurring `application_id` (§5) has **no effect** on cleanup — cleanup deletes each row by its own time/revocation state and never needs to know which client it belonged to.
- Adopters **MUST** clean up by time + revocation only. A `dependent: :delete_all` cascade from the application would never help here anyway (there is no application row to cascade from), so it plays no part.
- `application_id` is *not* a usable discriminator for cleanup or segmentation (per-registration noise with no backing row); use `audience` / `client_subject` instead (§8.5).
- The `oauth_applications` table stays empty; there is nothing to clean there.

A row escapes **both** queries only when `revoked_at IS NULL` **and** `expires_in IS NULL` (an "immortal" row).

> **Normative:** under this reclamation model, adopters **MUST NOT** issue non-expiring tokens (`expires_in` always set). With that held, every row is eventually reclaimed: revoked rows by `clean_revoked` (prompt, TTL-independent), naturally expired rows by `clean_expired` (after their TTL). **No immortal rows, no unreachable rows.** (This bounds *storage*; it does not bound a leaked token's validity window — see §11.)
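
The reclamation argument can be checked on a toy model. The predicates below are a simplified restatement of the two conceptual queries (the coarse `created_at` pre-filter is left out), written over plain structs rather than database rows; the names `Row`, `reclaimed_by_clean_revoked?`, `reclaimed_by_clean_expired?`, and `immortal?` are hypothetical.

```ruby
Row = Struct.new(:created_at, :expires_in, :revoked_at, keyword_init: true)

# Simplified model of clean_revoked: the row carries a revocation time in the past.
def reclaimed_by_clean_revoked?(row, now)
  !row.revoked_at.nil? && row.revoked_at < now
end

# Simplified model of clean_expired: the row has a TTL and that TTL has elapsed.
def reclaimed_by_clean_expired?(row, now)
  !row.expires_in.nil? && row.created_at + row.expires_in < now
end

# A row that neither query can ever match, no matter how much time passes.
def immortal?(row)
  row.revoked_at.nil? && row.expires_in.nil?
end

now     = Time.at(1_000_000)
expired = Row.new(created_at: now - 7200, expires_in: 3600, revoked_at: nil)
revoked = Row.new(created_at: now - 60,   expires_in: 3600, revoked_at: now - 30)
forever = Row.new(created_at: now - 7200, expires_in: nil,  revoked_at: nil)
```

In this model `expired` is collected by the expiry pass, `revoked` by the revocation pass even though its TTL has not elapsed, and `forever` by neither. Setting `expires_in` on every token is what removes the third case.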

Wrap the native cleaner so the logic is a class method reproducible in any environment (console, test, rake) — not only where the schedule runs — and schedule it as a job:

```
class OauthCredentialCleanupJob < ApplicationJob
  queue_as :default

  def self.cleanup
    tokens = Doorkeeper.config.access_token_model
    grants = Doorkeeper.config.access_grant_model

    Doorkeeper::StaleRecordsCleaner.new(tokens.where(refresh_token: nil))
      .clean_expired(Doorkeeper.config.access_token_expires_in)
    Doorkeeper::StaleRecordsCleaner.new(tokens).clean_revoked
    Doorkeeper::StaleRecordsCleaner.new(grants)
      .clean_expired(Doorkeeper.config.authorization_code_expires_in)
    Doorkeeper::StaleRecordsCleaner.new(grants).clean_revoked
  end

  def perform = self.class.cleanup
end

# config/recurring.yml  (Solid Queue)
production:
  oauth_credential_cleanup:
    class: OauthCredentialCleanupJob
    schedule: every hour
```

- **Two cadences, not one.** `clean_revoked` **SHOULD** run frequently (e.g. hourly) regardless of TTL, so a revocation is reflected in storage promptly; `clean_expired` only needs to keep pace with the TTL. Do **not** collapse both into a single "run every TTL" schedule — under a long TTL that would let revoked rows pile up. A single hourly job calling all four steps satisfies both.
- SQLite **is** supported by the native cleaner's adapter-specific expiration SQL; other adapters not in the support table degrade to the coarse `created_at` filter (with a warning) — provide a `custom_expiration_time_sql` there.
- **Refresh-token caveat:** `clean_expired` skips rows with a non-null `refresh_token` (they may be refreshed). If you enable refresh tokens (§12.2), those rows are reclaimed only by `clean_revoked` or refresh-token rotation — design that lifecycle explicitly before turning refresh on. This document's cleanup model assumes no refresh tokens.

- **Single TTL (DCR-only):** the native cleanup as scheduled is sufficient.
- **Mixed TTL (long DCR + short classic OAuth):** `clean_expired`'s coarse filter `created_at < now - globalTTL` is keyed to a single global TTL. If the global is the longest value, short-TTL tokens linger until that window. Run a **segmented cleanup**: `StaleRecordsCleaner` accepts any relation + an explicit ttl, so segment by `audience` (or scope) and pass each segment's TTL. If a segment uses a longer `custom_access_token_expires_in`, the global `access_token_expires_in` **MUST** be ≥ the largest segment TTL, or the coarse filter under-collects.
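
The lingering effect in the mixed-TTL case is easy to see on a toy model of the two-step filter. The code below is a pure-Ruby simulation with hypothetical names (`Token`, `survivors_after_clean_expired`, `SEGMENT_TTLS`) and illustrative TTLs of two hours and seven days; it models the behavior described above and is not Doorkeeper's cleaner.

```ruby
HOUR = 3600
DAY  = 24 * HOUR

Token = Struct.new(:audience, :created_at, :expires_in, keyword_init: true)

# Toy model of the two-step filter: coarse `created_at < now - ttl`, then precise expiry.
# Returns the rows that SURVIVE a cleanup pass run with the given ttl.
def survivors_after_clean_expired(rows, ttl, now)
  rows.reject do |row|
    row.created_at < now - ttl && row.created_at + row.expires_in < now
  end
end

now = Time.at(30 * DAY)
tokens = [
  Token.new(audience: "classic", created_at: now - DAY, expires_in: 2 * HOUR), # expired 22 hours ago
  Token.new(audience: "mcp",     created_at: now - DAY, expires_in: 7 * DAY)   # still valid
]

# One pass keyed to the longest TTL: the expired classic token is not yet eligible.
global_pass = survivors_after_clean_expired(tokens, 7 * DAY, now)

# Segmented pass: each audience is cleaned with its own TTL.
SEGMENT_TTLS = { "classic" => 2 * HOUR, "mcp" => 7 * DAY }.freeze
segmented = tokens.group_by(&:audience).flat_map do |audience, rows|
  survivors_after_clean_expired(rows, SEGMENT_TTLS.fetch(audience), now)
end
```

In this model the global pass leaves both rows in place, while the segmented pass removes the expired classic token and keeps the still-valid long-TTL one.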

## 9. Active Session Modeling — a decoupled metadata record

Requirement: show "which clients a member authorized, when each was last seen, and whether each is still live."

An Active Session is closer to a **client** than to a **token**: one authorization aggregates many rotating tokens, so the session **persists across token rotation** and is **not 1:1 with a token**. Model it as a **decoupled metadata record**, keyed by `(tenant, member, client_subject)`, holding its own validity window plus the observational fields a token cannot express. A recommended shape (adopter-supplied):

```
# Table: tenant_id, user_id, client_subject (unique together),
#        client_name, expires_at, last_seen_at, timestamps
class McpSession < ApplicationRecord
  RECENCY = 1.hour     # deployment-set display window

  scope :active, -> { where(last_seen_at: RECENCY.ago..) }   # shown vs hidden

  # Born at consent; the validity window is advanced on each re-authorization.
  def self.authorize!(tenant:, user:, client_subject:, client_name:, expires_at:)
    find_or_initialize_by(tenant_id: tenant.id, user_id: user.id, client_subject: client_subject)
      .update!(client_name: client_name, expires_at: expires_at)  # = now + access_token_expires_in
  end

  # Observed on each authenticated call, throttled to a minute (do not write the auth-hot path hard).
  def self.touch_seen(tenant:, user_id:, client_subject:)
    rec = find_by(tenant_id: tenant.id, user_id: user_id, client_subject: client_subject)
    return if rec.nil? || (rec.last_seen_at && rec.last_seen_at > 1.minute.ago)
    rec.update_column(:last_seen_at, Time.current)
  end

  # Held vs discarded — its own lifecycle, independent of token cleanup.
  def self.prune_expired = where(expires_at: ..Time.current).delete_all
end
```

Display and retention are two **orthogonal** filters: *shown vs hidden* = `last_seen_at` within the recency window; *held vs discarded* = the session's own `expires_at` passed.
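
The independence of the two filters can be shown on plain structs. This sketch mirrors the `active` scope and `prune_expired` from the model above as two hypothetical predicates, `shown?` and `held?`; the one-hour window matches `RECENCY` in that model, and the sample sessions are invented for illustration.

```ruby
RECENCY_SECONDS = 3600 # mirrors RECENCY = 1.hour in the model above

Session = Struct.new(:client_name, :last_seen_at, :expires_at, keyword_init: true)

# Shown vs hidden: seen within the recency window (never-seen sessions are hidden).
def shown?(session, now)
  !session.last_seen_at.nil? && session.last_seen_at >= now - RECENCY_SECONDS
end

# Held vs discarded: the session's own validity window has not passed yet.
def held?(session, now)
  session.expires_at > now
end

now = Time.at(2_000_000)
sessions = [
  Session.new(client_name: "recent, valid",  last_seen_at: now - 60,     expires_at: now + 3600),
  Session.new(client_name: "idle, valid",    last_seen_at: now - 86_400, expires_at: now + 3600),
  Session.new(client_name: "recent, lapsed", last_seen_at: now - 60,     expires_at: now - 1),
  Session.new(client_name: "never seen",     last_seen_at: nil,          expires_at: now + 3600)
]

listed = sessions.select { |s| shown?(s, now) }.map(&:client_name)
kept   = sessions.select { |s| held?(s, now) }.map(&:client_name)
```

All four combinations occur: a session can be hidden but still held (idle, or never seen), or recently seen but already due for pruning. Neither filter implies the other.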

You *could* instead define `active = EXISTS a non-revoked, unexpired token for (tenant, member, client_subject)`. That existence query is the **same shape** as the scoped-revoke scan (§10) and needs no extra index either — so "avoiding an index" is **not** a real reason to prefer one over the other. The honest trade-off is:

| Option | Benefit | Cost |
|---|---|---|
| Decoupled | the session is a client-level record that persists across token rotation; the list renders without touching the credential tables | brief staleness after a revoke (the record ages out by its own expiry / recency) |
| Token-projected | the list is exact at all times | a credential-table existence query on every render, and coupling a client-level view to token-level rows |

The choice turns on **one observable property: does the list offer a per-row revoke action?**

- **Display-only list (no per-row revoke) → decoupled is the right default.** The list is observational; the real gate is the call-time boundary, which rejects a revoked/expired token regardless of what the list shows; brief staleness is harmless. A new token under the same authorization keeps the session legitimately live, which last-seen recency reflects exactly.
- **List drives revocation (a per-row revoke control) → decoupled is the wrong default.** An operator acting on a stale row could revoke the wrong thing, or see a just-revoked client still shown as live. There the session **MUST** reflect true credential state — either project liveness from tokens, or revoke-and-remove the session record in one step (so the action the operator just took is visibly reflected). Evaluate this against your roadmap, not only today's UI: if a per-row revoke action is planned (§16), adopt the revoke-synced shape now — decoupled-vs-projected is a schema decision costly to reverse later.

Prefer decoupled where it applies (the common display-only case) because the session is conceptually a client that persists across token rotation, and the list renders without touching the credential tables.

- The session's `prune_expired` (by its own expiry) is independent of token cleanup: two independent entities, two reclamations. Intentional, not a smell.
- Member-removal revocation (§10) acts on tokens. For a display-only list the session record simply ages out; for a revocation-driving list, remove/expire the session record at revocation time (§9.2).
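The two orthogonal filters described above can be sketched in plain Ruby. This is a minimal in-memory model, not the ActiveRecord implementation: the `ActiveSession` struct, the `shown` helper, and the 30-day `RECENCY_WINDOW` are assumptions made for illustration; only `last_seen_at`, `expires_at`, and `prune_expired` come from the text.

```ruby
# Hypothetical in-memory stand-in for the Active Session record.
ActiveSession = Struct.new(:client_subject, :last_seen_at, :expires_at, keyword_init: true)

DAY = 24 * 60 * 60
RECENCY_WINDOW = 30 * DAY # assumed display window, in seconds

# Display filter: a session is shown only if it was seen within the window.
def shown(sessions, now)
  sessions.select { |s| s.last_seen_at >= now - RECENCY_WINDOW }
end

# Retention filter: a session is held until its own expires_at has passed.
def prune_expired(sessions, now)
  sessions.reject { |s| s.expires_at <= now }
end

now = Time.utc(2026, 6, 21)
sessions = [
  ActiveSession.new(client_subject: "recent", last_seen_at: now - 1 * DAY,  expires_at: now + 60 * DAY),
  ActiveSession.new(client_subject: "quiet",  last_seen_at: now - 45 * DAY, expires_at: now + 10 * DAY),
  ActiveSession.new(client_subject: "dead",   last_seen_at: now - 90 * DAY, expires_at: now - 1 * DAY)
]

shown_subjects = shown(sessions, now).map(&:client_subject)
held_subjects  = prune_expired(sessions, now).map(&:client_subject)
```

The "quiet" session illustrates the orthogonality: it is hidden from the list (outside the recency window) yet still held, because its own expiry has not passed. Neither filter reads a token row.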

When access must be cut off (member removal now; per-client "end session" via §16), the **credential** must actually stop working, not merely disappear from a list.

**Verified fact (Doorkeeper 5.9.1):** the built-in `revoke_all_for(application_id, resource_owner)` keys only on `(application_id, resource_owner)`; it has no tenant filter.

Because one stateless client (`client_subject` → one `derived_id`) can be authorized in **multiple tenants** by the same user, `revoke_all_for` would **over-revoke across tenants** in a multi-tenant deployment.

Normative:

- In a **multi-tenant** deployment, revocation **MUST** be scoped to the full granularity it targets and **MUST NOT** use `revoke_all_for`. (In a strictly single-tenant deployment the over-revoke risk does not arise; `revoke_all_for` is then acceptable.) Member removal scopes by `(tenant_id, resource_owner_id)`; a per-client "end session" adds `client_subject`.
- Revocation **MUST** act on **both** tokens and any unredeemed grants (an in-flight grant could otherwise be exchanged into a token after revocation, which is a race window).
- **Per-token revocation** (revoking a single token by its id) is cheap and needs no client registry; expose it together with an enumeration entry (§11/§15) so a *leaked token* can actually be located and killed without waiting out a long TTL.

```
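# On member removal: revoke that member's credentials for this tenant, scoped.
# Assumed to be an instance method on the tenant (workspace) model, so `id`
# below is the tenant's own id.
def revoke_member_credentials(user_id)
  now = Time.current.utc
  # Revoke tokens and unredeemed grants alike, so an in-flight grant cannot
  # be exchanged into a token after revocation.
  [ Doorkeeper::AccessToken, Doorkeeper::AccessGrant ].each do |model|
    model.where(resource_owner_id: user_id, workspace_id: id, revoked_at: nil)
         .update_all(revoked_at: now)
  end
end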
```

- After revocation the token is immediately inactive (the call-time boundary checks `revoked_at`), and the scheduled cleaner's `clean_revoked` reclaims the row: a **closed loop, revoke → reclaim**.
- Member removal **SHOULD** combine active revocation (above) with the call-time membership re-check as **defense in depth**: even if a revoke is missed, the boundary still denies a removed member.
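To make the over-revoke risk concrete, the following sketch models credential rows in memory and compares a revoke keyed only on `(application_id, resource_owner)` with a per-client "end session" scoped by `(tenant, owner, client_subject)`. The `Credential` struct and both helper names are hypothetical; `revoke_unscoped` only imitates the keying described in the verified fact above and is not Doorkeeper code.

```ruby
# Hypothetical in-memory stand-in for token and grant rows.
Credential = Struct.new(:id, :kind, :application_id, :resource_owner_id,
                        :workspace_id, :client_subject, :revoked_at,
                        keyword_init: true)

# Imitates a revoke keyed only on (application_id, resource_owner): no tenant filter.
def revoke_unscoped(rows, application_id:, resource_owner_id:, now:)
  rows.select do |r|
    r.revoked_at.nil? && r.application_id == application_id &&
      r.resource_owner_id == resource_owner_id
  end.each { |r| r.revoked_at = now }
end

# Per-client "end session": scoped by (tenant, owner, client_subject),
# applied to tokens and grants alike.
def revoke_scoped(rows, workspace_id:, resource_owner_id:, client_subject:, now:)
  rows.select do |r|
    r.revoked_at.nil? && r.workspace_id == workspace_id &&
      r.resource_owner_id == resource_owner_id && r.client_subject == client_subject
  end.each { |r| r.revoked_at = now }
end

# Same client and same user, authorized in two tenants (workspace 1 and 2).
build_rows = lambda do
  [
    Credential.new(id: 1, kind: :token, application_id: 42, resource_owner_id: 7, workspace_id: 1, client_subject: "sub-a"),
    Credential.new(id: 2, kind: :grant, application_id: 42, resource_owner_id: 7, workspace_id: 1, client_subject: "sub-a"),
    Credential.new(id: 3, kind: :token, application_id: 42, resource_owner_id: 7, workspace_id: 2, client_subject: "sub-a")
  ]
end

now = Time.now.utc
scoped_ids   = revoke_scoped(build_rows.call, workspace_id: 1, resource_owner_id: 7, client_subject: "sub-a", now: now).map(&:id)
unscoped_ids = revoke_unscoped(build_rows.call, application_id: 42, resource_owner_id: 7, now: now).map(&:id)
```

The scoped call touches only tenant 1's token and grant; the unscoped call also revokes the tenant 2 token, which is the cross-tenant over-revoke the normative rule forbids.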

- **Audience binding (RFC 8707).** RFC 8707 is *client-driven*: the client sends one or more `resource` parameters, and the AS applies an audience restriction. The verifiable invariant: a provider **MUST NOT** unconditionally overwrite a validly-sent `resource`. Validate each `resource` against the set the resource-owner is authorized to mint tokens for, **not merely "names this server"**: a server exposing several resource variants must not let a client bind a token to a variant outside the owner's grant (the same authorization check `resolve_tenant` already applies to the tenant). Set the token's audience to the validated resource(s), falling back to the canonical value only when none is sent (a single-audience deployment collapses to it), and handle multiple `resource` values explicitly rather than silently keeping one. The audience is expressed on the token (RFC 7519 `aud` / RFC 7662) and **MUST** be compared at the call-time boundary.
- **DCR redirect_uri validation.** The DCR endpoint is unauthenticated; it **MUST** validate `redirect_uris` at registration per RFC 7591 / RFC 8252 §7 (loopback or private-use scheme; reject non-loopback `http`, wildcards) **before** issuing a `client_id`. Skipping this issues a *signed* open-redirect that PKCE does not fully contain.
- **`client_subject` fail-secure.** The provider **MUST** trust only the `sub` from the verified JWS; a same-named value in the request **MUST** be overwritten (§6.2).
- **JWS is not a credential.** The `client_id` JWS appears in authorize URLs and logs and is **not** a secret. It authenticates nothing about the presenter; security comes from **PKCE possession + audience + user consent**. Do **not** treat JWS validity as an access-control lever or replay defense. (Its payload is signed, not encrypted, so it is readable by anyone with the log.)
- **PKCE.** A stateless client is public (no secret), so the provider **MUST** `force_pkce` with S256 (RFC 7636 / RFC 8252). PKCE is also what makes a `derived_id` collision non-exploitable *at the token exchange* (§5), but only there.
- **DCR endpoint abuse / DoS.** Statelessness stops the *client table* from growing, but each issued `client_id` can drive authorizations that create grant/token rows, so the DoS surface **moves** to those tables. Rate-limit the DCR/authorize/token endpoints (Rails 8 ships `rate_limit`; Doorkeeper supports throttling; **neither is on by default**). Note the DCR endpoint has no stable client identity to limit by, so it can only be limited coarsely (per-IP/global), which an attacker can evade by rotating IPs; the real defense for grant/token DoS is the tight cleanup schedule (§8.4) plus a bounded token TTL.
- **Token leak window vs long TTL.** With long TTLs and no refresh, a leaked bearer token stays valid for that TTL, and member-removal / membership-recheck do **not** help if the member is still valid. To kill a leaked token you must first *find* it: the decoupled session (§9) is client-level and carries no token id, so provide an **admin entry that enumerates a member's live tokens by `(tenant, owner, client_subject)` and revokes by id** (cheap, no registry). Provide that, or prefer short access tokens + refresh rotation (OAuth 2.0 Security BCP); long TTL with neither leaves a multi-week leak window.
- **Member removal is immediate.** Removal **MUST** revoke the member's credentials (§10), and the boundary **SHOULD** independently re-check membership.
- **Tenant is a selector, not a source.** A tenant named in the request **MUST** only be matched against the token's tenant, never used as its source.
- **Fail closed.** Any failed boundary check **MUST** collapse to one unauthenticated outcome, not leaking whether a resource exists. A denylist (§16) lookup failure **MUST** also fail closed.
- **Clock skew.** Cleanup and `exp` / `expires_in` checks compare against wall-clock time. In a multi-node deployment a cleaner whose clock runs fast can delete a token slightly before its real expiry, and verifiers can disagree on validity. Keep nodes on NTP, allow a small `leeway` on `exp` verification (§4.1), and centralize the comparison on a **single clock source** (e.g. the database's `now()` via the provider's expiration SQL). Note this *centralizes* skew rather than eliminating it, and a per-node (non-shared) SQLite deployment does not get this benefit; there, let a single node own expiry decisions and give verifiers adequate `leeway`.
- **Key rotation.** Rotating the signing key retires every `client_id`; use the multi-`kid` grace period (§4.4) for planned rotation.
- **Audit (a named compensating control).** Discarding the client table also discards the natural registration history, so the *only* trace of who registered when is the audit log. Security-relevant events (DCR registration, authorization, revocation, member removal) **SHOULD** be logged; for a design that deliberately keeps no client registry, registration audit is the compensating control for after-the-fact investigation, not an optional nicety.
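The redirect_uri rule in the list above (loopback or private-use scheme, no non-loopback `http`, no wildcards) might be sketched as a pure check that runs before a `client_id` is issued. The method names are hypothetical, and three policy choices here are assumptions rather than statements from this document: `https` redirect URIs are accepted, a scheme containing a dot is treated as a private-use (reverse-domain) scheme, and only IP-literal loopback hosts are admitted (whether to also admit `localhost` is left to the adopter).

```ruby
require "uri"

# Loopback IP literals; admitting "localhost" is a policy choice left out here.
LOOPBACK_IPS = %w[127.0.0.1 ::1].freeze

# Returns true only for a redirect URI this sketch's policy accepts.
def acceptable_redirect_uri?(value)
  return false unless value.is_a?(String) && !value.include?("*") # no wildcards

  uri = URI.parse(value)
  return false if uri.scheme.nil? || uri.fragment # absolute, no fragment

  case uri.scheme.downcase
  when "http"  then LOOPBACK_IPS.include?(uri.hostname) # loopback only
  when "https" then !uri.host.to_s.empty?               # assumption: https allowed
  else uri.scheme.include?(".")                         # assumption: reverse-domain scheme
  end
rescue URI::InvalidURIError
  false # unparseable input is rejected
end

# A registration passes only if it carries at least one URI and all are acceptable.
def valid_redirect_uris?(uris)
  uris.is_a?(Array) && !uris.empty? && uris.all? { |u| acceptable_redirect_uri?(u) }
end
```

Running the check before `issue` means a rejected registration never yields a signed `client_id`, so no signed open-redirect is minted.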

- Adopters **SHOULD NOT** add or remove indexes on, or migrate, the OAuth provider's own tables for their feature's query patterns. Adding columns through the provider's own `custom_access_token_attributes` mechanism is supported (the provider knows about them); reaching in to re-index the provider's tables for your access pattern fights the library across upgrades.
- You will not *need* such an index: scoped revocation (§10) filters by `resource_owner_id` (already indexed by the provider) plus the tenant key, which is a tiny scan within one owner's rows, run rarely; and the decoupled Active Session (§9) does not query the credential tables at all. (This is a "you won't need it" observation, not the *reason* the session is decoupled; that reason is in §9.2.)
- The provider's default `application_id` index becomes high-cardinality and unused under this design (every registration is a distinct id), but it is harmless write overhead bounded by live rows. Leaving it respects the provider's schema ownership.

- With no refresh and human-in-the-loop re-authorization, token TTL **SHOULD** be set long to avoid frequent consent, **but** weigh the leak window (§11): pair a long TTL with per-token revocation (§10), or prefer enabling refresh tokens (with a designed cleanup lifecycle, §8.4) where the blast radius matters.
- When using a per-segment `custom_access_token_expires_in`, the global TTL **MUST** be ≥ the largest segment TTL (§8.5).
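The second rule above is easy to violate silently when a segment TTL is later raised, so one option is a boot-time guard. The sketch below is plain Ruby and does not call Doorkeeper; the `SEGMENT_TTLS` table, its audiences and values, and the `check_global_ttl!` helper are all hypothetical.

```ruby
# Hypothetical per-audience TTL table, in seconds (illustrative values only).
SEGMENT_TTLS = {
  "https://mcp.example.com/" => 14 * 24 * 60 * 60, # long-lived agent tokens
  "https://api.example.com/" => 2 * 60 * 60        # short-lived API tokens
}.freeze

# Raises unless the global TTL is at least the largest segment TTL.
def check_global_ttl!(global_ttl, segment_ttls)
  largest = segment_ttls.values.max
  return global_ttl if global_ttl >= largest

  raise ArgumentError,
        "global TTL #{global_ttl}s is below the largest segment TTL #{largest}s"
end

global_ttl = check_global_ttl!(14 * 24 * 60 * 60, SEGMENT_TTLS)
```

Calling such a guard from the initializer would turn a misconfiguration into a boot failure instead of a cleanup anomaly discovered later.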

**Verified fact (Doorkeeper 5.9.1):** `revoke_previous_authorization_code_token` calls `revoke_previous_tokens(grant.application, ...)` → `application.id` with no safe-navigation; under stateless identity `grant.application` (loaded by the `application_id` association, which has no backing row) is nil → `nil.id` → NoMethodError.

`revoke_previous_authorization_code_token` and `revoke_previous_client_credentials_token` **MUST NOT** be enabled until the nil-application path is handled (return a virtual application for the `grant.application` association, or guard the nil).

`reuse_access_token` is **not** in the same crash class: its matching reads `application&.scopes` (safe-navigated) and, in the authorization-code flow, the client is the virtual app (non-nil). It still involves the application object, so verify before enabling, but it does not crash on a nil application the way `revoke_previous_*` does. Verdict: leave it off. This design needs no token reuse, and enabling it only adds matching over the virtual app to re-verify for no benefit.
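The crash class and the two ways of handling it can be modeled without Doorkeeper. The structs and helper names below are stand-ins invented for this sketch; they imitate the shape of the dereference described in the verified fact and are not the library's code.

```ruby
# Stand-ins only: these model the shapes described above, not Doorkeeper classes.
VirtualApp = Struct.new(:id)
GrantRow   = Struct.new(:application_id, :application)

# Unguarded dereference: raises NoMethodError when the association is nil.
def unguarded_application_id(grant)
  grant.application.id
end

# Option A: guard the nil and fall back to the id stored on the grant row.
def guarded_application_id(grant)
  grant.application&.id || grant.application_id
end

# Option B: substitute an unsaved virtual application for the missing row.
def application_for(grant)
  grant.application || VirtualApp.new(grant.application_id)
end

# Under stateless identity the grant carries an id but no backing application row.
stateless_grant = GrantRow.new(123, nil)

crashed = begin
  unguarded_application_id(stateless_grant)
  false
rescue NoMethodError
  true
end
```

Either option removes the nil dereference; which one is appropriate depends on whether downstream code needs a full application object or only its id.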

| # | Pitfall | Consequence | Rule |
|---|---|---|---|
| 1 | Persist one-time clients | client table grows unbounded | use a stateless JWS (or CIMD) |
| 2 | Assume statelessness removes unbounded growth | grant/token tables grow instead | treat cleanup as first-class |
| 3 | Expect an application cascade to clean up | no application row → it cannot help | clean by time/revocation |
| 4 | Use `application_id` as a cleanup/revoke/segment key | dynamic noise; collision mis-targets | use `audience` / `client_subject` |
| 5 | Issue non-expiring tokens (this model) | immortal rows | `expires_in` always set |
| 6 | One "every TTL" cleanup cadence | revoked rows pile up under long TTL | run `clean_revoked` frequently; `clean_expired` per TTL |
| 7 | Single global coarse filter under mixed TTL | short-TTL tokens linger | segmented `StaleRecordsCleaner` |
| 8 | Decoupled session for a list that drives revocation | operator acts on a stale/just-revoked row | decoupled only for display-only lists; project or revoke-sync when a per-row revoke action exists |
| 9 | Revoke with library-wide `revoke_all_for` (multi-tenant) | cross-tenant over-revoke | scope by `(tenant, owner[, client_subject])` |
| 10 | Revoke tokens but not grants | in-flight grant race | revoke both |
| 11 | DCR invariants as unconditional before-actions | hijacks classic OAuth authorization | conditional on resolved type |
| 12 | Enable `revoke_previous_*` | dereference nil application → crash | handle the virtual application first |
| 13 | Build the virtual app with `new` instead of `instantiate` | autosaves an application row | use `instantiate` |
| 14 | Leave the DCR endpoint unthrottled / its redirect_uris unvalidated | grant/token DoS; signed open-redirect | rate-limit; validate redirect_uris (RFC 8252 §7) before issuing |
| 15 | `derived_id` treated as the isolation key | collision → wrong-client confusion | `client_subject` is the isolation key; PKCE backstops only the exchange |
| 16 | Unconditionally overwrite the client's `resource` | breaks RFC 8707 client-driven audience | set audience to the validated resource; canonical is fallback |

| Decision | Trade-off | Conclusion |
|---|---|---|
| Persisted client vs stateless JWS | one-time clients explode the table | stateless (hard requirement for DCR) |
| DCR vs CIMD | CIMD is now SHOULD (low support today); DCR is MAY | DCR where CIMD isn't viable; become CIMD-compatible as it lands (§2.1) |
| Signing algorithm | symmetric vs asymmetric | ES256 (asymmetric, ships with Ruby's OpenSSL) |
| `application_id` surrogate | nil breaks `validate_grant`'s integer compare | non-nil per-registration derived_id, masked to a positive 63-bit integer; `client_subject` (by value) is the true isolation key |
| Custom prune vs the library's native cleaner | the native `StaleRecordsCleaner` already covers it and supports SQLite | native cleaner, wrapped in a scheduled job/class method |
| Token-derived session vs decoupled session | projection is exact but couples client-level to token-level; decoupled is simpler but briefly stale | decoupled for display-only lists; project or revoke-sync when the list drives revocation |
| `revoke_all_for` vs scoped revoke | the built-in lacks a tenant filter | scoped by `(tenant, owner)` in multi-tenant; `revoke_all_for` acceptable only single-tenant |
| Touch the provider's schema vs leave it | re-indexing the provider's tables is intrusive and unnecessary | leave the provider's schema alone |
| Per-client revocation | full statelessness gives it up | optional denylist + per-token revoke extension (§16): a deliberate bounded-state hybrid, unverified |

This document specifies the provider-side design; an adopter **MUST** supply these collaborators to reach an equivalent mechanism:

- **`ClientIdentity`**: `issue(metadata) → JWS`, `resolve(client_id) → Identity|nil` (pure verify, no lookup) (§4.1). Identity exposes `subject / client_name / redirect_uris / requested_scope`.
- **DCR registration validation**: validate `redirect_uris` (RFC 7591 / 8252 §7) and rate-limit the endpoint **before** `issue` (§4.1, §11).
- **`KeyStore`**: `active_private_key`, `active_kid`, `verification_key(kid)`; keys from a secret store; multi-`kid` rotation grace period (§4.4).
- **`OauthApplication.by_uid`**: returns an `instantiate`d virtual app; `derived_id` deterministic, a positive 63-bit integer (§5).
- **The authorize-time hook**: `bind_audience` (honor + validate `resource`, canonical fallback), `resolve_tenant`, `record_client_subject` (verified JWS), `record_active_session`; helper contracts per §6.2.
- **Custom-attribute columns** on **both** grant and token tables + `custom_access_token_attributes` (§6.1).
- **The Active Session record**: table + `authorize!` / `touch_seen` / `prune_expired` / `active` scope; decoupled for display-only, projected/revoke-synced if it drives revocation (§9).
- **The call-time boundary** (resource-server side): token acceptable + audience + tenant + membership, fail closed (§3, §11).
- **Cleanup**: a scheduled job wrapping the native cleaner (revoked + expired, tokens **and** grants); `expires_in` never nil; `clean_revoked` frequent, `clean_expired` per TTL (§8.4).
- **Revocation**: scoped by `(tenant, owner)`, tokens **and** grants; never `revoke_all_for` in multi-tenant (§10).
- **Token enumeration + revoke entry**: list a member's un-revoked tokens by `(tenant, owner, client_subject)` and revoke by id, so a leaked token can be killed without waiting out the TTL (§11).
- **Rate limiting**: on DCR/authorize/token endpoints (not on by default) (§11).
- **Audit log**: DCR registration / authorization / revocation events (§11).
- Config (initializer): `force_pkce` (S256), single `grant_flows`, `application_class`, **guarded** `after_successful_authorization` (§5). Routes: `use_doorkeeper { controllers authorizations: "oauth/authorizations" }` (the `controllers` DSL belongs in routes, not the initializer).
- Mixed TTL → segmented `StaleRecordsCleaner`, discriminator = audience; global TTL ≥ the largest segment (§8.5).
- `revoke_previous_*` left off unless the nil-application path is handled (§12.3).
- If using the denylist (§16): consult it as a gate **after** resolution (not inside `resolve`), fail closed, couple it with per-token/scoped revocation.
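One of the collaborators above, the deterministic positive 63-bit `derived_id`, lends itself to a short sketch. The derivation in §5 is not reproduced in this section, so the choice of SHA-256 over `client_subject` and the zero-to-one fallback are assumptions made for illustration; only the "deterministic, positive 63-bit integer" contract comes from the text.

```ruby
require "digest"

# Largest value that fits a signed 64-bit (bigint) column as a positive number.
MASK_63_BITS = (1 << 63) - 1

# Assumed derivation: hash the subject, read the first 8 bytes as an unsigned
# big-endian integer, and mask off the top bit so the result is never negative.
def derived_id(client_subject)
  first_eight_bytes = Digest::SHA256.digest(client_subject).byteslice(0, 8)
  value = first_eight_bytes.unpack1("Q>") & MASK_63_BITS
  value.zero? ? 1 : value # keep the id strictly positive
end

id_a = derived_id("subject-a")
id_b = derived_id("subject-b")
```

Because 63 bits is far smaller than the hash output, collisions are possible in principle, which is why the document treats `client_subject`, not `derived_id`, as the isolation key.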

The following are design sketches, NOT implemented or validated. Treat them as direction, not normative guidance.

**Per-client identity revocation (a bounded-state security extension).** Full statelessness gives up the single-client off switch (§4.5). Reintroduce a *bounded* amount of state, a small `client_subject` denylist, as a deliberate hybrid (stateless-core identity + a bounded denylist):

- Consult the denylist as a **gate after** signature verification (not inside `resolve`, which stays a pure verify); a denied subject is treated as no client.
- The lookup **MUST** fail closed (a denylist-store failure denies; it does not open).
- **Couple it with token revocation.** The denylist blocks *future* authorizations for a subject; already-issued tokens are killed only by revoking them by `(tenant, user, client_subject)` (§10). Neither alone suffices.
- A denylist entry must live at least `max(client_id exp, longest token TTL)`, then be reclaimed by its own TTL-based GC; fold that into the §8 reclamation story so the hybrid's state stays genuinely bounded (§8.3's closure does not otherwise cover it).
- Driving it from the Active Session record (revoking a session revokes its tokens *and* denylists its subject) is one natural shape, and it is what a per-row revoke action in a session list (§9.2) would call. Unimplemented and unverified here; validate growth, expiry, and read-path cost before adoption.
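The gate, fail-closed, and bounded-lifetime points above could take a shape like the following. Like the section itself this is a direction, not a validated design: the `SubjectDenylist` class, the `gate` helper, and the hash-backed store are hypothetical, and the entry lifetime is read as the later of the `client_id` expiry and "now plus the longest token TTL", which is one interpretation of the `max(...)` rule.

```ruby
# Hypothetical denylist keyed by client_subject; the store is any object
# responding to [], []= and delete_if (a Hash here).
class SubjectDenylist
  def initialize(store = {})
    @store = store
  end

  # The entry outlives both the client_id and the longest-lived token.
  def deny!(subject, now:, client_id_exp:, longest_token_ttl:)
    @store[subject] = [client_id_exp, now + longest_token_ttl].max
  end

  def denied?(subject, now:)
    expires_at = @store[subject]
    !expires_at.nil? && expires_at > now
  rescue StandardError
    true # fail closed: a store failure denies, it does not open
  end

  # TTL-based GC keeps the denylist's state bounded.
  def prune!(now:)
    @store.delete_if { |_subject, expires_at| expires_at <= now }
  end
end

Identity = Struct.new(:subject)

# Gate applied after resolve: a denied subject is treated as no client.
def gate(identity, denylist, now:)
  return nil if identity.nil?

  denylist.denied?(identity.subject, now: now) ? nil : identity
end

now = Time.utc(2026, 6, 21)
denylist = SubjectDenylist.new
denylist.deny!("sub-bad", now: now, client_id_exp: now + 3600, longest_token_ttl: 14 * 24 * 3600)

blocked = gate(Identity.new("sub-bad"), denylist, now: now)
allowed = gate(Identity.new("sub-ok"), denylist, now: now)
```

The gate only blocks future authorizations; tokens already issued to the subject still need the scoped revocation from §10.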

**CIMD compatibility** (§2.1). Accept an HTTPS-URL `client_id` alongside JWS client_ids, fetching the client's metadata document, as CIMD adoption grows. The resolution dispatch (§7) is the natural seam.

**Token introspection** (RFC 7662). If exposed, answer consistently across client types: return `client_subject` as the client id and align `active` with the call-time boundary, not the session view (§7).

- RFC 6749: The OAuth 2.0 Authorization Framework
- RFC 7591: OAuth 2.0 Dynamic Client Registration Protocol
- RFC 7636: Proof Key for Code Exchange (PKCE)
- RFC 7662: OAuth 2.0 Token Introspection
- RFC 8252: OAuth 2.0 for Native Apps (redirect_uri rules, §7)
- RFC 8414: OAuth 2.0 Authorization Server Metadata (AS discovery)
- RFC 8707: Resource Indicators for OAuth 2.0
- RFC 9728: OAuth 2.0 Protected Resource Metadata (the canonical resource identifier / audience)
- RFC 2119: Key words for use in RFCs
- OAuth 2.0 Security Best Current Practice (short-lived access tokens, refresh rotation)
- Model Context Protocol: Authorization (2025-11-25 revision; DCR → MAY, CIMD direction)
- Doorkeeper 5.9.x: `application_class` / `by_uid`, `custom_access_token_attributes`, `custom_access_token_expires_in`, `StaleRecordsCleaner`, `rake doorkeeper:db:cleanup`
- Solid Queue: recurring tasks (`config/recurring.yml`)

*"Verified fact" notes were checked against Doorkeeper 5.9.1 source and SQLite behavior; the instantiate no-autosave behavior is an ActiveRecord property — re-verify the relevant sections against your Doorkeeper and Rails versions and your database adapter before relying on them. JWT option names follow ruby-jwt; confirm against your library.*