Skip to content

feat(openai): native content-block streaming for the Responses API#38004

Open
Nick Hollon (nick-hollon-lc) wants to merge 1 commit into
nh/native-content-block-streaming-openaifrom
nh/native-content-block-streaming-openai-responses
Open

feat(openai): native content-block streaming for the Responses API#38004
Nick Hollon (nick-hollon-lc) wants to merge 1 commit into
nh/native-content-block-streaming-openaifrom
nh/native-content-block-streaming-openai-responses

Conversation

@nick-hollon-lc

Copy link
Copy Markdown
Contributor

Stacked on #37999 (OpenAI Chat Completions), which stacks on #37995 (core BlockStreamTracker + Anthropic). Review/merge those first; this retargets to master as they land.

Completes native content-block streaming for ChatOpenAI by covering the Responses API (the other API surface alongside Chat Completions in #37999), so stream_events(version="v3") no longer rides the compat bridge for Responses streaming.

What it does

A thin converter (convert_openai_responses_stream / async twin) reuses the existing _convert_responses_chunk_to_generation_chunk (injected to avoid a circular import) to map each raw Responses event to an indexed chunk, feeding the shared BlockStreamTracker. The BaseChatOpenAI hooks now route the Responses path to it; structured output (response_format) and raw-header mode still defer to the bridge, since those rely on the final-completion handling only _stream performs.

The win over the bridge: true mid-stream boundaries

The Responses API streams sequentially with a monotonic content index, so a block is complete the moment the stream advances to a higher index. The converter emits content-block-finish at that point — delivering true mid-stream block completion (text, tool calls, reasoning) rather than the bridge's finalize-everything-at-message-end. Verified: every lower-index finish precedes the next block's start.

Consistency

message-start carries the stream's LangChain run id (threaded from core), matching the bridge and the other native paths. Converters re-apply the caller's provider. Sync/async are faithful twins.

Worth careful review

  • The monotonic-index boundary logic (the core mechanic).
  • The 3-way hook routing: response_format/headers → bridge; Responses API → native; else Completions native.
  • Parity is anchored by the pre-existing test_responses_stream_events_v3_emits_reasoning_lifecycle (4 reasoning blocks), which now routes through the native converter unchanged; a sync + async parity pair plus a boundary-ordering test cover the rest. The Responses VCR integration suite also exercises this path.

@github-actions github-actions Bot added feature For PRs that implement a new feature; NOT A FEATURE REQUEST integration PR made that is related to a provider partner package integration internal openai `langchain-openai` package issues & PRs size: M 200-499 LOC labels Jun 10, 2026

@open-swe open-swe Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Open SWE Review: No issues found

Open SWE reviewed this PR and found no potential bugs to report.

Open in WebView Open SWE trace

@nick-hollon-lc Nick Hollon (nick-hollon-lc) force-pushed the nh/native-content-block-streaming-openai-responses branch from f5ca419 to 535219d Compare June 11, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature For PRs that implement a new feature; NOT A FEATURE REQUEST integration PR made that is related to a provider partner package integration internal openai `langchain-openai` package issues & PRs size: M 200-499 LOC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant