Commit 2e6a460

Merge pull request #17 from drogers0/fix-codex-refresh-expiry
fix(codex): gate refresh on access-token expiry, not the id_token
2 parents 484c005 + df88099 commit 2e6a460

9 files changed

Lines changed: 248 additions & 106 deletions

CLAUDE.md

Lines changed: 3 additions & 3 deletions
@@ -35,9 +35,9 @@ JSON is the default; exit codes: 0 success / 1 any-provider-failed / 2 usage err
 - [internal/providers/usagecache/](internal/providers/usagecache/) — provider-neutral 90 s file-backed usage cache. `New(provider, nowFn, warnFn)` validates the provider char-set and writes `$CACHE/aistat/usage/<provider>-v1.json` with `<provider>.cache.lock`. Warn strings include the provider name.
 - [internal/providers/multiaccount/](internal/providers/multiaccount/) — provider-neutral helpers consumed by both Claude and Codex multi-account `Fetch`: `SortAccountResults`, `RecordFetchOutcome`, `RecomputeResetAfter`, `Budget(base, perAccount, count)`.
 - [internal/providers/claude/](internal/providers/claude/) — Claude's multi-account machinery: [claude.go](internal/providers/claude/claude.go) (Fetch + FetchForSwitch + cache wiring), [profile.go](internal/providers/claude/profile.go) (identity endpoint), [refresh.go](internal/providers/claude/refresh.go) (OAuth refresh), [reconcile.go](internal/providers/claude/reconcile.go) (pure decision tree: byte-match → profile lookup → fallback), [account.go](internal/providers/claude/account.go) (`StoredAccessToken` / `StoredRefreshToken` / `StoredExpiresAt` token-parsing helpers operating on the opaque `accounts.Account`).
-- [internal/providers/codex/](internal/providers/codex/) — Codex's multi-account machinery, structurally a mirror of Claude's: [codex.go](internal/providers/codex/codex.go) (Fetch + FetchForSwitch + cache wiring + `rotateRawBlob`), [refresh.go](internal/providers/codex/refresh.go) (OAuth refresh against `https://auth.openai.com/oauth/token` with client_id confirmed via Codex binary inspection), [reconcile.go](internal/providers/codex/reconcile.go) (pure decision tree: byte-match → JWT `sub` lookup → live-unstored), [account.go](internal/providers/codex/account.go) (Codex-shaped `Stored*` helpers). Identity is the `sub` claim of the OIDC `id_token` — no network endpoint, JWT-payload decode only. Slot-vs-duration window labelling is by `limit_window_seconds`, NOT by slot position, so free-account weekly windows that land in the primary slot are not mislabelled as `five_hour`.
+- [internal/providers/codex/](internal/providers/codex/) — Codex's multi-account machinery, structurally a mirror of Claude's: [codex.go](internal/providers/codex/codex.go) (Fetch + FetchForSwitch + cache wiring + `rotateRawBlob`), [refresh.go](internal/providers/codex/refresh.go) (OAuth refresh against `https://auth.openai.com/oauth/token` with client_id confirmed via Codex binary inspection), [reconcile.go](internal/providers/codex/reconcile.go) (pure decision tree: byte-match → JWT `sub` lookup → live-unstored), [account.go](internal/providers/codex/account.go) (Codex-shaped `Stored*` helpers). `StoredExpiresAt` decodes expiry from the **access_token** JWT's `exp` (via `cred.ParseJWTExp`) — the long-lived API credential — NOT the short-lived OIDC `id_token`, which is identity-only (gating on the id_token's ~1 h `exp` made the refresh gate fire on every run). Identity is the `sub` claim of the OIDC `id_token` — no network endpoint, JWT-payload decode only. Slot-vs-duration window labelling is by `limit_window_seconds`, NOT by slot position, so free-account weekly windows that land in the primary slot are not mislabelled as `five_hour`.
 - [internal/render/](internal/render/) — `json` and `text` renderers. The JSON shape is the public contract; the text renderer is a thin presentation layer over the same model. Provider-agnostic: when `len(result.Accounts) > 0` for ANY provider the renderer emits the nested per-account view, otherwise the legacy flat form (still in use for Copilot, and as a Claude/Codex fallback).
-- [internal/cred/](internal/cred/) — credential read/write and JWT decoding. `Credential.Raw []byte` preserves the verbatim provider blob (every byte the upstream CLI wrote) so a switch re-publishes byte-for-byte. `ReadClaudeCredential` / `WriteClaudeLiveBlob` (macOS Keychain item `Claude Code-credentials` via a `runSecurity` test seam; Linux `~/.claude/.credentials.json`). `ReadCodexCredential` / `WriteCodexLiveBlob` (file-only on both OSes: `~/.codex/auth.json`, mode 0600, atomic rename + fsync). `ParseCodexIDToken` (exported here to avoid a `cred` ↔ `providers/codex` import cycle) decodes the OIDC `id_token` payload for `sub` / `email` / `exp` without signature verification.
+- [internal/cred/](internal/cred/) — credential read/write and JWT decoding. `Credential.Raw []byte` preserves the verbatim provider blob (every byte the upstream CLI wrote) so a switch re-publishes byte-for-byte. `ReadClaudeCredential` / `WriteClaudeLiveBlob` (macOS Keychain item `Claude Code-credentials` via a `runSecurity` test seam; Linux `~/.claude/.credentials.json`). `ReadCodexCredential` / `WriteCodexLiveBlob` (file-only on both OSes: `~/.codex/auth.json`, mode 0600, atomic rename + fsync). `ParseCodexIDToken` (exported here to avoid a `cred` ↔ `providers/codex` import cycle) decodes the OIDC `id_token` payload for `sub` / `email` / `exp` without signature verification; `ParseJWTExp` similarly decodes just the `exp` claim of any JWT (no signature check) and is used by `codex.StoredExpiresAt` to read the access-token expiry.
 - [internal/httpx/](internal/httpx/) — shared HTTP transport: `Doer.GetJSON` (Authorization-reserved) + `Doer.PostForm` (no Authorization by default — used by both refresh clients) sharing an unexported `setCommonHeaders` / `do` split. `do` runs a bounded retry loop (max 3 attempts) that honors `Retry-After` (capped at 10 s) on transient classifications, falling back to exponential backoff with ±20 % jitter. `Classifier` takes `*http.Response` so callers can inspect headers.
 - [internal/orchestrate/](internal/orchestrate/) — parallel fan-out across providers; one failing provider does not block the others. Preserves per-account rows on provider-level error (D8 contract).
 - [internal/testutil/](internal/testutil/) — shared test helpers.
@@ -76,7 +76,7 @@ These are the principles every change should respect. When in doubt, optimize fo
 
 These are upstream OAuth-provider behaviors aistat cannot work around without re-authentication. Both fail closed with actionable errors. **`codex login --device-auth` is the right flow on remote / headless machines** (no browser required on the host), but per upstream [code](https://github.com/openai/codex/blob/main/codex-rs/login/src/device_code_auth.rs) and [docs](https://developers.openai.com/codex/auth) device-auth uses the same persistence path and the same single-cached-login model as the browser flow — the revocation semantics below apply equally.
 
-- **Refresh-token rotation race.** OpenAI's `/oauth/token` endpoint single-uses each refresh_token; the Codex CLI rotates it on every refresh and writes the new value to `~/.codex/auth.json`. If the Codex CLI runs between aistat reading the file and aistat sending its refresh request, aistat's in-memory copy is stale and the server returns a 401 whose body reads `Your refresh token has already been used to generate a new access token.` aistat tightens this to `stale refresh token (codex CLI rotated it); retry or run codex login to recover` (matched on `already been used` in `refreshErrorMessage`). Recovery: `codex login` (or just wait for the next aistat run — the cache will catch up next pass).
+- **Refresh-token rotation race.** OpenAI's `/oauth/token` endpoint single-uses each refresh_token; the Codex CLI rotates it on every refresh and writes the new value to `~/.codex/auth.json`. If the Codex CLI runs between aistat reading the file and aistat sending its refresh request, aistat's in-memory copy is stale and the server returns a 401 whose body reads `Your refresh token has already been used to generate a new access token.` aistat tightens this to `stale refresh token (codex CLI rotated it); retry or run codex login to recover` (matched on `already been used` in `refreshErrorMessage`). Recovery: `codex login` (or just wait for the next aistat run — the cache will catch up next pass). Note: since `StoredExpiresAt` now gates on the long-lived access_token's `exp`, routine reporting no longer triggers a refresh on every run — only when the access token is genuinely near expiry — so this race is hit far less often.
 - **Switch-side token revocation.** When a new account logs in on the same OAuth client via the browser flow (`codex logout && codex login`), OpenAI's server invalidates the previous account's tokens. aistat's `switch` re-publishes the stored blob byte-for-byte, but server-revoked tokens stay revoked — the next usage call returns a 401 whose body carries either `token_revoked` or `token_invalidated` (OpenAI uses both interchangeably for this condition). aistat tightens both to `tokens revoked by upstream (likely a codex login for another account); run codex login to recover` (matched by `isRevokedTokenErr`). Recovery: `codex login` for the now-active account.
 
 ## Working in this repo
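
Reviewer note: the two error rewrites described in the CLAUDE.md bullets above come down to substring matching on the body of an upstream 401. The sketch below illustrates that idea only. `tightenAuthError` is a hypothetical helper, and the JSON body in `main` is an invented example; the real logic lives in `refreshErrorMessage` and `isRevokedTokenErr`, whose signatures are not part of this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// tightenAuthError is a hypothetical helper, not the repository's code. It maps
// the raw body of an upstream 401 to the shorter messages quoted in CLAUDE.md
// and returns the body unchanged when no known marker is present.
func tightenAuthError(body string) string {
	switch {
	case strings.Contains(body, "already been used"):
		return "stale refresh token (codex CLI rotated it); retry or run codex login to recover"
	case strings.Contains(body, "token_revoked"), strings.Contains(body, "token_invalidated"):
		return "tokens revoked by upstream (likely a codex login for another account); run codex login to recover"
	default:
		return body
	}
}

func main() {
	// Body text quoted in CLAUDE.md for the rotation race.
	fmt.Println(tightenAuthError("Your refresh token has already been used to generate a new access token."))
	// Invented body shape; only the marker string matters to the matcher.
	fmt.Println(tightenAuthError(`{"error":{"code":"token_revoked"}}`))
}
```

Matching on a short marker rather than the full sentence keeps the mapping stable if the upstream wording around it changes, at the cost of occasionally matching an unrelated body that happens to contain the same phrase.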

internal/cred/claude.go

Lines changed: 6 additions & 2 deletions
@@ -27,8 +27,12 @@ var ErrClaudeWriteUnsupported = errors.New("writing Claude live credential is no
 type Credential struct {
 	AccessToken  string
 	RefreshToken string
-	ExpiresAt    int64 // ms since epoch; 0 if absent
-	Raw          []byte
+	// ExpiresAt is ms since epoch; 0 if absent. Claude only — populated from the
+	// claudeAiOauth.expiresAt field. Codex leaves this 0 (its auth.json has no
+	// expiry field); the codex refresh gate decodes the access-token JWT exp on
+	// demand in codex.StoredExpiresAt.
+	ExpiresAt int64
+	Raw       []byte
 }
 
 // parseClaudeCredFull parses the JSON payload used by both the macOS Keychain

internal/cred/codex.go

Lines changed: 31 additions & 14 deletions
@@ -64,19 +64,47 @@ func ParseCodexIDToken(idToken string) (sub, email string, expSec int64, err err
 	return claims.Sub, claims.Email, int64(claims.Exp), nil
 }
 
+// ParseJWTExp decodes the exp claim (seconds since epoch) from a JWT payload
+// without verifying the signature — same posture as ParseCodexIDToken (the Codex
+// CLI already accepted the token). Returns (0, false) when token is not a
+// parseable 3-segment JWT or the exp claim is absent/zero. Used by
+// codex.StoredExpiresAt to read the access-token expiry that drives the refresh
+// gate.
+func ParseJWTExp(token string) (expSec int64, ok bool) {
+	parts := strings.Split(token, ".")
+	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
+		return 0, false
+	}
+	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
+	if err != nil {
+		return 0, false
+	}
+	var claims struct {
+		Exp float64 `json:"exp"` // JSON number; may be integer or float
+	}
+	if err := json.Unmarshal(payload, &claims); err != nil {
+		return 0, false
+	}
+	if claims.Exp <= 0 {
+		return 0, false
+	}
+	return int64(claims.Exp), true
+}
+
 // rawCodexAuth is the minimal shape of ~/.codex/auth.json for credential extraction.
 type rawCodexAuth struct {
 	Tokens struct {
 		AccessToken  string `json:"access_token"`
 		RefreshToken string `json:"refresh_token"`
-		IDToken      string `json:"id_token"`
 	} `json:"tokens"`
 }
 
 // parseCodexCredFull parses the JSON payload of ~/.codex/auth.json.
 // access_token is required; its absence returns ErrCodexTokenNotFound.
-// id_token is optional: if absent or malformed, ExpiresAt is 0.
-// Raw is set to bytes.Clone(data) so the caller's buffer can be reused.
+// Credential.ExpiresAt is always 0 for codex (auth.json has no expiry field);
+// the refresh gate decodes the access-token JWT exp on demand in
+// codex.StoredExpiresAt. Raw is set to bytes.Clone(data) so the caller's buffer
+// can be reused.
 func parseCodexCredFull(data []byte) (Credential, error) {
 	var raw rawCodexAuth
 	if err := json.Unmarshal(data, &raw); err != nil {
@@ -86,20 +114,9 @@ func parseCodexCredFull(data []byte) (Credential, error) {
 		return Credential{}, ErrCodexTokenNotFound
 	}
 
-	var expiresAt int64
-	if raw.Tokens.IDToken != "" {
-		if _, _, expSec, err := ParseCodexIDToken(raw.Tokens.IDToken); err == nil {
-			expiresAt = expSec * 1000
-		}
-		// malformed id_token → ExpiresAt stays 0; not an error for ReadCodexCredential.
-		// T3 reconcile calls ParseCodexIDToken independently on the live credential
-		// to extract sub/email for identity; it will surface the error there if needed.
-	}
-
 	return Credential{
 		AccessToken:  raw.Tokens.AccessToken,
 		RefreshToken: raw.Tokens.RefreshToken,
-		ExpiresAt:    expiresAt,
 		Raw:          bytes.Clone(data),
 	}, nil
 }
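
Reviewer note: to make the behaviour of the new decoder concrete, here is a minimal standalone sketch. `parseJWTExp` is a local copy of `cred.ParseJWTExp` from the diff above, condensed so the file compiles on its own. `fakeJWT` is a hypothetical helper that wraps a JSON payload in an unsigned, JWT-shaped string; it is not the `makeTestJWT` helper used by the test suite, whose body is not shown in this diff.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// parseJWTExp mirrors cred.ParseJWTExp: it reads the exp claim (seconds since
// epoch) from the payload segment and never checks the signature.
func parseJWTExp(token string) (int64, bool) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return 0, false
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return 0, false
	}
	var claims struct {
		Exp float64 `json:"exp"` // JSON number; may be integer or float
	}
	if err := json.Unmarshal(payload, &claims); err != nil || claims.Exp <= 0 {
		return 0, false
	}
	return int64(claims.Exp), true
}

// fakeJWT is a hypothetical helper for illustration. The header and signature
// segments are placeholders; only the payload segment is meaningful.
func fakeJWT(payloadJSON string) string {
	enc := base64.RawURLEncoding.EncodeToString
	return enc([]byte(`{"alg":"none"}`)) + "." + enc([]byte(payloadJSON)) + ".sig"
}

func main() {
	exp, ok := parseJWTExp(fakeJWT(`{"sub":"u1","exp":1700000000}`))
	fmt.Println(exp, ok) // 1700000000 true

	exp, ok = parseJWTExp("test-access-token-abc") // opaque token, not a JWT
	fmt.Println(exp, ok)                           // 0 false
}
```

Returning `(0, false)` for an opaque token is what lets the caller treat "expiry unknown" as a distinct case instead of as "already expired".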

internal/cred/codex_test.go

Lines changed: 63 additions & 16 deletions
@@ -193,6 +193,62 @@ func TestParseCodexIDToken(t *testing.T) {
 	}
 }
 
+// --- ParseJWTExp tests ---
+
+func TestParseJWTExp(t *testing.T) {
+	tests := []struct {
+		name    string
+		run     func(t *testing.T) (int64, bool)
+		wantExp int64
+		wantOK  bool
+	}{
+		{"present", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"sub":"u1","exp":1700000000}`))
+		}, 1700000000, true},
+		{"past exp still returned", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"exp":1000000000}`))
+		}, 1000000000, true},
+		{"float exp truncated", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"exp":1700000000.9}`))
+		}, 1700000000, true},
+		{"exp absent", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"sub":"u1"}`))
+		}, 0, false},
+		{"exp zero", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"exp":0}`))
+		}, 0, false},
+		{"exp negative", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp(makeTestJWT(t, `{"exp":-1}`))
+		}, 0, false},
+		{"opaque non-jwt", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("test-access-token-abc")
+		}, 0, false},
+		{"two segments", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("header.payload")
+		}, 0, false},
+		{"empty segment", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("a..c")
+		}, 0, false},
+		{"bad base64 payload", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("header.!!!.sig")
+		}, 0, false},
+		{"bad json payload", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("header." + base64.RawURLEncoding.EncodeToString([]byte("not json")) + ".sig")
+		}, 0, false},
+		{"empty", func(t *testing.T) (int64, bool) {
+			return ParseJWTExp("")
+		}, 0, false},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			expSec, ok := tt.run(t)
+			if expSec != tt.wantExp || ok != tt.wantOK {
+				t.Errorf("ParseJWTExp = (%d, %t), want (%d, %t)", expSec, ok, tt.wantExp, tt.wantOK)
+			}
+		})
+	}
+}
+
 // --- ReadCodexCredential tests ---
 
 const testIDToken = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJjb2RleC10ZXN0LXN1YiIsImVtYWlsIjoidGVzdEBleGFtcGxlLmNvbSIsImV4cCI6OTk5OTk5OTk5OX0.testsig"
@@ -203,6 +259,9 @@ func TestReadCodexCredential(t *testing.T) {
 		run  func(t *testing.T)
 	}{
 		{"happy path", func(t *testing.T) {
+			// Codex Credential.ExpiresAt is always 0 — even with a valid id_token
+			// present, parseCodexCredFull does not derive expiry (the refresh gate
+			// reads the access-token JWT exp on demand in codex.StoredExpiresAt).
 			body := `{"tokens":{"access_token":"tok","refresh_token":"ref","id_token":"` + testIDToken + `"}}`
 			writeAuth(t, body)
 			cred, err := ReadCodexCredential(context.Background())
@@ -213,19 +272,19 @@ func TestReadCodexCredential(t *testing.T) {
 			if cred.RefreshToken != "ref" {
 				t.Errorf("RefreshToken = %q, want ref", cred.RefreshToken)
 			}
-			if cred.ExpiresAt != 9999999999000 {
-				t.Errorf("ExpiresAt = %d, want 9999999999000", cred.ExpiresAt)
+			if cred.ExpiresAt != 0 {
+				t.Errorf("ExpiresAt = %d, want 0 (codex never derives ExpiresAt)", cred.ExpiresAt)
 			}
 			if len(cred.Raw) == 0 {
 				t.Error("Raw should be non-empty")
 			}
 		}},
-		{"no id token", func(t *testing.T) {
+		{"no id token still zero expiry", func(t *testing.T) {
 			writeAuth(t, `{"tokens":{"access_token":"tok","refresh_token":"ref"}}`)
 			cred, err := ReadCodexCredential(context.Background())
 			testutil.WantNoErr(t, err)
 			if cred.ExpiresAt != 0 {
-				t.Errorf("ExpiresAt = %d, want 0 when id_token absent", cred.ExpiresAt)
+				t.Errorf("ExpiresAt = %d, want 0", cred.ExpiresAt)
 			}
 			if cred.AccessToken != "tok" {
 				t.Errorf("AccessToken = %q, want tok", cred.AccessToken)
@@ -264,18 +323,6 @@ func TestReadCodexCredential(t *testing.T) {
 				t.Errorf("Raw = %q, want %q", cred.Raw, body)
 			}
 		}},
-		{"malformed id token sets expires at zero", func(t *testing.T) {
-			// D4: malformed-but-present id_token sets ExpiresAt=0 without error.
-			writeAuth(t, `{"tokens":{"access_token":"tok","id_token":"not.a.valid.jwt.at.all"}}`)
-			cred, err := ReadCodexCredential(context.Background())
-			testutil.WantNoErr(t, err)
-			if cred.ExpiresAt != 0 {
-				t.Errorf("ExpiresAt = %d, want 0 for malformed id_token", cred.ExpiresAt)
-			}
-			if cred.AccessToken != "tok" {
-				t.Errorf("AccessToken = %q, want tok", cred.AccessToken)
-			}
-		}},
 		{"api key mode fails", func(t *testing.T) {
 			// D7 / A7: auth_mode=="api_key" with no tokens object → fail-closed with ErrCodexTokenNotFound.
 			writeAuth(t, `{"auth_mode":"api_key","OPENAI_API_KEY":"sk-test"}`)

internal/providers/codex/account.go

Lines changed: 9 additions & 11 deletions
@@ -13,7 +13,6 @@ type rawCodexTokens struct {
 	Tokens struct {
 		AccessToken  string `json:"access_token"`
 		RefreshToken string `json:"refresh_token"`
-		IDToken      string `json:"id_token"`
 	} `json:"tokens"`
 }
 
@@ -43,16 +42,15 @@ func StoredRefreshToken(a accounts.Account) string {
 }
 
 // StoredExpiresAt returns the access-token expiry as milliseconds since epoch,
-// derived from the exp claim of tokens.id_token in a.RawBlob.
-// Returns 0 if id_token is absent, the JWT is malformed, or exp is absent.
+// decoded from the exp claim of tokens.access_token in a.RawBlob. Returns 0 if
+// the access token is absent or not a JWT carrying an exp claim — in which case
+// the caller performs no proactive refresh and relies on the usage call (an
+// expired-but-opaque token surfaces via the usage endpoint's 401, which maps to
+// an actionable `codex login` hint). The OIDC id_token is short-lived and is
+// NOT used here — it is identity-only (sub/email); see reconcile.go.
 func StoredExpiresAt(a accounts.Account) int64 {
-	idTok := parseStoredRaw(a).Tokens.IDToken
-	if idTok == "" {
-		return 0
+	if expSec, ok := cred.ParseJWTExp(StoredAccessToken(a)); ok {
+		return expSec * 1000
 	}
-	_, _, expSec, err := cred.ParseCodexIDToken(idTok)
-	if err != nil {
-		return 0
-	}
-	return expSec * 1000
+	return 0
 }
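
Reviewer note: the caller that consumes `StoredExpiresAt` is not part of this diff, so the following is only a sketch of how a refresh gate could interpret its return value under the contract stated in the doc comment (milliseconds since epoch, 0 meaning "expiry unknown, do not refresh proactively"). `shouldRefresh` and the five-minute `refreshSkewMs` margin are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// refreshSkewMs is an assumed safety margin; the diff does not show the real one.
const refreshSkewMs = 5 * 60 * 1000

// shouldRefresh is a hypothetical gate. expiresAtMs follows the StoredExpiresAt
// convention: milliseconds since epoch, with 0 meaning "expiry unknown".
func shouldRefresh(expiresAtMs, nowMs int64) bool {
	if expiresAtMs == 0 {
		// Unknown expiry: skip the proactive refresh and let the usage call
		// surface a 401 if the token has actually expired.
		return false
	}
	return nowMs+refreshSkewMs >= expiresAtMs
}

func main() {
	nowMs := time.Now().UnixMilli()
	hourMs := time.Hour.Milliseconds()
	minuteMs := time.Minute.Milliseconds()

	fmt.Println(shouldRefresh(0, nowMs))              // false: opaque token
	fmt.Println(shouldRefresh(nowMs+hourMs, nowMs))   // false: well inside its lifetime
	fmt.Println(shouldRefresh(nowMs+minuteMs, nowMs)) // true: within the skew window
}
```

With the old id_token-based expiry, a gate of this shape would come up true whenever the id_token's roughly one-hour lifetime had lapsed, which is the behaviour the commit message describes as firing on every run.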
