feat(openai): native content-block streaming for the Responses API #38004
Open
Nick Hollon (nick-hollon-lc) wants to merge 1 commit into
Completes native content-block streaming for `ChatOpenAI` by covering the Responses API (the other API surface alongside Chat Completions in #37999), so `stream_events(version="v3")` no longer rides the compat bridge for Responses streaming.

What it does
A thin converter (`convert_openai_responses_stream` / async twin) reuses the existing `_convert_responses_chunk_to_generation_chunk` (injected to avoid a circular import) to map each raw Responses event to an indexed chunk, feeding the shared `BlockStreamTracker`. The `BaseChatOpenAI` hooks now route the Responses path to it; structured output (`response_format`) and raw-header mode still defer to the bridge, since those rely on the final-completion handling only `_stream` performs.

The win over the bridge: true mid-stream boundaries
The Responses API streams sequentially with a monotonic content index, so a block is complete the moment the stream advances to a higher index. The converter emits `content-block-finish` at that point, delivering true mid-stream block completion (text, tool calls, reasoning) rather than the bridge's finalize-everything-at-message-end. Verified: every lower-index finish precedes the next block's start.

Consistency
`message-start` carries the stream's LangChain run id (threaded from core), matching the bridge and the other native paths. Converters re-apply the caller's `provider`. Sync/async are faithful twins.

Worth careful review
- Routing order: `response_format`/headers → bridge; Responses API → native; else Completions native.
- Tests: `test_responses_stream_events_v3_emits_reasoning_lifecycle` (4 reasoning blocks) now routes through the native converter unchanged; a sync + async parity pair plus a boundary-ordering test cover the rest. The Responses VCR integration suite also exercises this path.
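
For reviewers, the boundary rule and the ordering property it is checked against can be sketched roughly as below. This is a standalone illustration, not the converter or `BlockStreamTracker` code from this PR: `emit_block_events` and `finishes_precede_next_start` are hypothetical helpers, chunks are simplified to `(index, text)` pairs, and the `content-block-start` / `content-block-delta` event names are assumed by analogy with `content-block-finish`.

```python
from typing import Iterable, Iterator, Optional

Event = tuple[str, int, str]  # (event name, block index, text payload)


def emit_block_events(chunks: Iterable[tuple[int, str]]) -> Iterator[Event]:
    """Yield lifecycle events for a stream whose block index never decreases."""
    current: Optional[int] = None  # index of the block that is currently open
    for index, text in chunks:
        if current is not None and index < current:
            raise ValueError("content index must not decrease")
        if index != current:
            if current is not None:
                # The stream advanced to a higher index, so the previous
                # block is complete and can be finished mid-stream.
                yield ("content-block-finish", current, "")
            yield ("content-block-start", index, "")
            current = index
        yield ("content-block-delta", index, text)
    if current is not None:
        # Only the last block has to wait for the end of the stream.
        yield ("content-block-finish", current, "")


def finishes_precede_next_start(events: Iterable[Event]) -> bool:
    """Return True if no block starts while an earlier block is unfinished."""
    open_blocks: set[int] = set()
    for name, index, _ in events:
        if name == "content-block-start":
            if open_blocks:
                return False
            open_blocks.add(index)
        elif name == "content-block-finish":
            open_blocks.discard(index)
    return not open_blocks


# Hypothetical stream: a text block (index 0) followed by a tool call (index 1).
events = list(emit_block_events([(0, "Hel"), (0, "lo"), (1, '{"q":'), (1, "1}")]))
```

In this sketch the finish for block 0 is emitted as soon as the first index-1 chunk arrives, which is the mid-stream behaviour described above. A bridge-style sequence that opens every block and finishes them all at message end would fail `finishes_precede_next_start`.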