Summary
OpenAI Responses WebSocket v2 passthrough mode can stay silent on the downstream client connection during long Codex tasks, for example Codex image_generation. When Sub2API is behind an idle-sensitive proxy/load balancer/CDN, the downstream WebSocket may be closed before response.completed, even though the upstream OpenAI WebSocket is still alive.
This is different from the existing SSE keepalive issues and from ctx_pool continuation issues. The affected path is the WS v2 passthrough relay.
Observed behavior
A Codex client using Responses WebSocket v2 through Sub2API passthrough may report errors similar to:
stream disconnected before completion: websocket closed by server before response.completed
This is easiest to reproduce with long-running image generation or other turns where the upstream can be busy for a while without producing downstream business frames.
Root cause
The passthrough relay forwards client/upstream frames, but it does not actively keep the downstream WebSocket alive when no downstream business frames have been written for a while.
If an intermediate proxy has an idle timeout, it can close the client-facing WS connection during the long upstream turn. Sub2API then sees a graceful client-side EOF/close and has to drain upstream only for usage/accounting.
Expected behavior
For WS v2 passthrough mode, Sub2API should optionally send WebSocket Ping control frames to the downstream client after the first downstream business frame, when the downstream side has been idle for a configured interval.
Suggested behavior:
- Default interval around 20 seconds.
- Configurable timeout for waiting on Pong, around 5 seconds.
- 0 interval disables the behavior.
- If the downstream connection does not support active Ping, log/trace this and keep existing relay behavior.
- If Ping/Pong fails, treat it as a graceful client disconnect and preserve the existing upstream drain/usage capture behavior.
- Do not start downstream keepalive before the first downstream business write, so the relay does not keep never-started/failed handshakes alive artificially.
Validation from a downstream deployment
After adding downstream WS Ping/Pong keepalive, long Codex image_generation requests behind a proxy showed repeated successful downstream keepalive traces during the same long session, for example:
stage=downstream_ping_ok direction=downstream_keepalive graceful=true wrote_downstream=true
The same deployment continued to drain upstream usage correctly when the client connection closed.
Proposed fix
I can open a PR with a minimal implementation that:
- Adds Ping(ctx) support to the passthrough downstream frame connection.
- Starts a downstream keepalive goroutine after the first downstream business write.
- Adds config keys:
  - gateway.openai_ws.passthrough_downstream_ping_interval_seconds
  - gateway.openai_ws.passthrough_downstream_ping_timeout_seconds
- Adds relay tests for:
  - ping starts only after the first downstream write;
  - successful Ping/Pong during idle periods;
  - Ping failure is handled as graceful client disconnect while preserving upstream drain;
  - idle timeout behavior is avoided while pings succeed.
Related issues checked
Related but not exact duplicates: