Problem
A single destination response can be large enough that the resulting LogEntry exceeds the log queue's per-message size limit. When that happens the publish fails, the attempt never lands in the logstore, and the event retries forever (no prior attempt found in logstore, will retry).
The webhook driver reads the response body with io.ReadAll and stores it verbatim — there is no cap:
- internal/destregistry/providers/destwebhook/httphelper.go:172: ParseHTTPResponse → io.ReadAll(resp.Body) → delivery.Response["body"]
- internal/destregistry/registry.go:205,261: attempt.ResponseData = deliveryData.Response
- LogEntry { Event, Attempt, Destination } → json.Marshal → published to the log MQ
SQS limits are 256 KB per message and 1 MB per batch. #845 capped batch aggregation (MaxBatchByteSize in internal/mqs/queue_awssqs.go), but a single oversized LogEntry still exceeds the per-message limit and is rejected — so #845 does not cover this case. It is also an unbounded-memory read regardless of queue backend.
Reported in #663 and in the PR #845 discussion.
Proposal
Add a configurable max response body size for the webhook driver.
- Default: no cap — opt-in, fully backward-compatible.
- When set and the response body exceeds the limit, replace the body with a placeholder instead of storing it, e.g. "Response larger than <limit> not stored", and flag it (response_truncated: true plus the number of bytes we observed, which is only a lower bound on the real size). Do not store partial content.
- Enforce at read time in ParseHTTPResponse via io.LimitReader, so we never buffer the full body. Read up to limit + 1 bytes; if the extra byte is present, the response is over the limit. (Note: because we stop reading early we can only report > limit, not the exact original size. That is an acceptable trade-off for bounding memory.)
- Apply the placeholder to both the logstore record and the customer-visible attempt response so the two stay consistent.
Config
New knob, env-configurable, unset = no cap. Naming TBD, e.g. DESTINATIONS_WEBHOOK_MAX_RESPONSE_BODY_BYTES.
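As a rough sketch of the "unset = no cap" behaviour, the value could be parsed like this. The variable name is the example from above and is not final. parseMaxBodyBytes is a hypothetical helper, and rejecting zero or negative values (rather than treating them as "no cap") is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Example name from the proposal; final naming and units are still open.
const envMaxBody = "DESTINATIONS_WEBHOOK_MAX_RESPONSE_BODY_BYTES"

// parseMaxBodyBytes is a hypothetical helper. An empty value means the knob
// is unset and returns 0, which callers treat as "no cap". Any other value
// must be a positive integer number of bytes.
func parseMaxBodyBytes(raw string) (int64, error) {
	if raw == "" {
		return 0, nil
	}
	n, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", envMaxBody, err)
	}
	if n <= 0 {
		return 0, fmt.Errorf("%s must be positive, got %d", envMaxBody, n)
	}
	return n, nil
}

func main() {
	limit, err := parseMaxBodyBytes(os.Getenv(envMaxBody))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if limit == 0 {
		fmt.Println("webhook response body cap: disabled")
		return
	}
	fmt.Printf("webhook response body cap: %d bytes\n", limit)
}
```

Failing fast on a malformed value keeps a typo from silently disabling the cap, which would reintroduce the unbounded read.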
Scope
This exposes a knob for the webhook response body. We are deliberately not solving the general "any LogEntry can exceed the queue limit" problem here — operators set the limit to fit their backend.
Cases we're knowingly leaving to the operator for now:
- Other destination types: every provider's response funnels through registry.go:205,261, so a central cap would cover all of them. But only the webhook driver produces large responses (others store message IDs / ack metadata), so we're not adding a central backstop.
- Large event payloads — the event itself also counts toward the per-message limit. That's customer input accepted at ingest; capping it is a separate decision, not addressed here.
Open questions
- Final config name and units (bytes vs KB).
- Placeholder wording and exact extra fields on response_data.