@opentelemetry/otlp-exporter-base createHttpAgent() ignores HTTPS_PROXY env / EnvHttpProxyAgent — silent telemetry loss for users behind env-proxies #6638

Description

@mayank6136

@opentelemetry/otlp-exporter-base createHttpAgent() constructs new Agent() and ignores HTTPS_PROXY / NODE_USE_ENV_PROXY

What happened

In environments where outbound HTTP must traverse a proxy configured through environment variables (HTTPS_PROXY / https_proxy set in the process env, or NODE_USE_ENV_PROXY=1), the OTel HTTP exporters in this package silently bypass the proxy. They construct their own new https.Agent(agentOptions) (or new http.Agent(...)), which doesn't pick up Node's EnvHttpProxyAgent (the agent that consults HTTPS_PROXY). DNS resolution for the OTLP endpoint then goes through the runtime's default getaddrinfo and fails with EAI_AGAIN, and the export keeps failing across retries with no error surfaced at default log levels.

We hit this running an OTel-instrumented Node process inside a sandbox that injects HTTPS_PROXY=http://10.200.0.1:3128 into the env. From that env, every other https.request(...) call in the process — including ones we wrote ourselves to validate the path — uses the proxy correctly via globalAgent / EnvHttpProxyAgent. Only the OTel exporter goes around it. After tracing, we found the divergence in:

  • File: experimental/packages/otlp-exporter-base/src/transport/http-transport-utils.ts
  • Compiled: node_modules/@opentelemetry/otlp-exporter-base/build/src/transport/http-transport-utils.js

function createHttpAgent(rawUrl: string, agentOptions: http.AgentOptions) {
    const parsedUrl = new URL(rawUrl);
    const Agent = parsedUrl.protocol === 'http:' ? http.Agent : https.Agent;
    return new Agent(agentOptions);          // ← always a fresh Agent, no env-proxy
}
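
To make the divergence concrete, here is a minimal sketch (not the package's code; `freshAgent` and `sharesGlobalAgent` are illustrative names). It shows only that an agent built with `new https.Agent(...)` is a separate object from `https.globalAgent`, so anything the runtime configures on the global agent is not inherited by it.

```typescript
import * as https from 'node:https';

// Mirrors what createHttpAgent does today: build a brand-new agent.
const freshAgent = new https.Agent({ keepAlive: true });

// The fresh agent is a distinct object from the process-wide default agent,
// so configuration applied to https.globalAgent does not carry over to it.
const sharesGlobalAgent: boolean = freshAgent === https.globalAgent;
console.log(`fresh agent is the global agent: ${sharesGlobalAgent}`);

// Release any pooled sockets held by the illustrative agent.
freshAgent.destroy();
```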

Repro

Minimal:

# 1. Stand up a server that's only reachable via your env-proxy.
#    (e.g., put it behind an HTTPS proxy you control, or reproduce in a
#     sandbox/k3s environment where HTTPS_PROXY is required for egress.)

# 2. Point an OTel SDK exporter at that server with HTTPS_PROXY set:
HTTPS_PROXY=http://your-proxy:3128 \
  NODE_USE_ENV_PROXY=1 \
  node my-exporter-test.js

# Where my-exporter-test.js does:
#   const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
#   const exporter = new OTLPTraceExporter({ url: 'https://your-server/v1/traces' });
#   ... export a fake span ...

Result: EAI_AGAIN <hostname> even though HTTPS_PROXY is correctly set and the proxy IS reachable.

For comparison, the SAME endpoint can be reached from the same process using:

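const https = require('https');

// Same endpoint, sent through the global agent instead of a fresh one.
const req = https.request(
  { hostname: 'your-server', path: '/v1/traces', method: 'POST',
    agent: https.globalAgent /* picks up EnvHttpProxyAgent */ },
  (res) => console.log(res.statusCode) // → 200 in our environment
);
req.on('error', console.error);
req.end();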

The exporter's failure is silent at default OTel log levels — only after manually wiring diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG) did we see the error surface.

Root cause

createHttpAgent constructs a fresh https.Agent(agentOptions) per export. That agent doesn't have any awareness of HTTPS_PROXY. Node's standard pattern for env-proxy support is to use globalAgent (which Node's runtime replaces with EnvHttpProxyAgent when NODE_USE_ENV_PROXY=1 is set, and which respects the env vars).

By instantiating its own Agent, this package opts out of env-proxy support — which AFAICT was unintentional. The exporter probably wanted custom socket-keepalive / pool settings (which is what agentOptions is for) but didn't account for the env-proxy use case.

Suggested fix

function createHttpAgent(rawUrl: string, agentOptions: http.AgentOptions) {
    const parsedUrl = new URL(rawUrl);

    // If the user has env-proxy configured, defer to globalAgent so Node's
    // EnvHttpProxyAgent can route through HTTPS_PROXY / http_proxy / etc.
    if (
        process.env.NODE_USE_ENV_PROXY === '1' ||
        process.env.HTTPS_PROXY ||
        process.env.https_proxy ||
        process.env.HTTP_PROXY ||
        process.env.http_proxy
    ) {
        return parsedUrl.protocol === 'http:' ? http.globalAgent : https.globalAgent;
    }

    const Agent = parsedUrl.protocol === 'http:' ? http.Agent : https.Agent;
    return new Agent(agentOptions);
}

This preserves the existing Agent construction for users who don't have env-proxy set (they get the per-export keepalive/pool behavior they had before) and adds the env-proxy fallback for users who need it.
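
A possible variant of the same idea, sketched under assumptions: the environment is passed in as an argument so the decision can be exercised without mutating `process.env`. The names `hasEnvProxyConfig`, `createHttpAgentFor`, and `Env` are hypothetical and do not exist in the package.

```typescript
import * as http from 'node:http';
import * as https from 'node:https';

// Shape of an environment map; process.env is assignable to it.
type Env = Record<string, string | undefined>;

// Environment variable names checked by the proposed fix.
const PROXY_ENV_KEYS = ['HTTPS_PROXY', 'https_proxy', 'HTTP_PROXY', 'http_proxy'];

// Hypothetical helper: true when the given environment asks for env-proxy routing.
function hasEnvProxyConfig(env: Env): boolean {
  if (env['NODE_USE_ENV_PROXY'] === '1') {
    return true;
  }
  return PROXY_ENV_KEYS.some((key) => Boolean(env[key]));
}

// Hypothetical variant of createHttpAgent that takes the environment explicitly.
function createHttpAgentFor(
  rawUrl: string,
  agentOptions: http.AgentOptions,
  env: Env = process.env
): http.Agent | https.Agent {
  const isHttp = new URL(rawUrl).protocol === 'http:';
  if (hasEnvProxyConfig(env)) {
    // Defer to the global agent, as in the suggested fix.
    return isHttp ? http.globalAgent : https.globalAgent;
  }
  // No env-proxy configured: keep the existing behavior.
  return isHttp ? new http.Agent(agentOptions) : new https.Agent(agentOptions);
}
```

Passing the environment explicitly is a design choice for testability only; the selection logic is the same as in the suggested fix.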

Alternatives considered

A. Always use globalAgent

return parsedUrl.protocol === 'http:' ? http.globalAgent : https.globalAgent;

Simpler, but loses the per-Agent keepalive customization that the current code provides. Probably too breaking.
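
A small sketch of what would be lost, assuming illustrative option values (`keepAlive: true`, `maxSockets: 5` are not taken from the package): options passed to a dedicated agent apply to that agent only, and the global agent keeps its own settings.

```typescript
import * as http from 'node:http';

// Options of the kind an exporter might pass through agentOptions (values are illustrative).
const customAgent = new http.Agent({ keepAlive: true, maxSockets: 5 });

// The dedicated agent carries its own pool limit...
console.log(`custom agent maxSockets: ${customAgent.maxSockets}`);
// ...while the global agent keeps whatever the process configured for it.
console.log(`global agent maxSockets: ${http.globalAgent.maxSockets}`);

// Release any pooled sockets held by the illustrative agent.
customAgent.destroy();
```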

B. Allow user-supplied agent in exporter constructor

new OTLPTraceExporter({ url, httpAgentOverride: https.globalAgent });

More configurable but pushes the work onto every user; the env-proxy case is common enough to deserve a default.

C. Document that env-proxy isn't supported

Effectively the status quo. Not great — most users will hit this only when their backend is unreachable, and the OTel SDK's silent-retry-on-failure makes it hard to diagnose.

Test plan

Unit:

  • createHttpAgent('http://example.com/foo', {}) → returns a fresh http.Agent (existing behavior).
  • With HTTPS_PROXY=http://proxy:3128 set: same call → returns http.globalAgent (new behavior).
  • Same for https://... URLs and https.globalAgent.
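
Because the proposed fix reads `process.env` directly, the unit tests would need to set and restore environment variables around each case. A minimal sketch of such a helper follows; `withEnv` is a hypothetical name, not an existing test utility in the repository.

```typescript
// Hypothetical test helper: apply env overrides, run fn, then restore the previous values.
function withEnv<T>(overrides: Record<string, string | undefined>, fn: () => T): T {
  const saved: Record<string, string | undefined> = {};
  for (const key of Object.keys(overrides)) {
    saved[key] = process.env[key];
    const value = overrides[key];
    if (value === undefined) {
      // Assigning undefined would store the string "undefined", so delete instead.
      delete process.env[key];
    } else {
      process.env[key] = value;
    }
  }
  try {
    return fn();
  } finally {
    // Restore every touched variable, even if fn throws.
    for (const key of Object.keys(saved)) {
      const previous = saved[key];
      if (previous === undefined) {
        delete process.env[key];
      } else {
        process.env[key] = previous;
      }
    }
  }
}

// Usage: the override is visible only inside the callback.
const seenInside = withEnv({ HTTPS_PROXY: 'http://proxy:3128' }, () => process.env['HTTPS_PROXY']);
console.log(`HTTPS_PROXY inside callback: ${seenInside}`);
```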

Integration:

  • OTel exporter pointed at a server that's only reachable via the configured HTTPS_PROXY. Without fix: EAI_AGAIN. With fix: 200 OK.

Risk / blast radius

Small. The change is conditional — only kicks in when env-proxy is configured. Users without HTTPS_PROXY set get exactly the existing behavior (same Agent, same options).

Edge case: a user who has HTTPS_PROXY set in their env but DOESN'T want OTel to use it (e.g., to bypass a corporate proxy for telemetry specifically). Today that works by accident, because the exporter never honors env-proxy. After the fix, those users would need an opt-out; this could be paired with alternative B (the httpAgentOverride constructor option) for an explicit override.
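
One way the opt-out could fit together with the env-proxy fallback, sketched under assumptions: an explicitly supplied agent wins, then env-proxy, then a fresh agent. `resolveAgent`, its `envProxyConfigured` flag, and the `httpAgentOverride` parameter are hypothetical; none of them exist in the package today.

```typescript
import * as http from 'node:http';
import * as https from 'node:https';

type AnyAgent = http.Agent | https.Agent;

// Hypothetical resolution order: explicit override, then env-proxy, then a fresh agent.
function resolveAgent(
  rawUrl: string,
  agentOptions: http.AgentOptions,
  envProxyConfigured: boolean,
  httpAgentOverride?: AnyAgent
): AnyAgent {
  // 1. An agent supplied by the user always wins, which gives an opt-out from env-proxy.
  if (httpAgentOverride) {
    return httpAgentOverride;
  }
  const isHttp = new URL(rawUrl).protocol === 'http:';
  // 2. Otherwise defer to the global agent when env-proxy is configured.
  if (envProxyConfigured) {
    return isHttp ? http.globalAgent : https.globalAgent;
  }
  // 3. Otherwise keep the existing behavior.
  return isHttp ? new http.Agent(agentOptions) : new https.Agent(agentOptions);
}

// Usage: a user who wants telemetry to skip the proxy passes their own agent.
const directAgent = new https.Agent({ keepAlive: true });
const chosen = resolveAgent('https://your-server/v1/traces', {}, true, directAgent);
console.log(`override used: ${chosen === directAgent}`);
directAgent.destroy();
```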

Open questions for maintainers

  1. Default behavior: does the conditional fallback feel right, or would you prefer the simpler always-use-globalAgent path?
  2. Custom agent override: is it worth adding an httpAgentOverride option (alternative B) to the exporter constructor for users who want to bypass env-proxy explicitly?
  3. Other transport packages: the same pattern (createHttpAgent constructing new Agent(...)) likely exists in @opentelemetry/otlp-grpc-exporter-base and @opentelemetry/otlp-proto-exporter-base. Should the fix land in all three together?
  4. Backports: @opentelemetry/otlp-exporter-base@0.203.0 is the version we hit. Would this fix be backported to any LTS branches, or strictly forward?

Tested-against

  • Node v22.22.2
  • @opentelemetry/otlp-exporter-base v0.203.0
  • @opentelemetry/exporter-trace-otlp-http (HTTP exporter, the one we use)
  • Reproduced in a sandbox env with HTTPS_PROXY injected. The issue is not specific to that sandbox and should reproduce anywhere HTTPS_PROXY routing is required.

Severity

Medium-high. Doesn't crash or misbehave loudly — but produces silent telemetry loss for any user who deploys behind an env-proxy, which is a common production pattern (corporate proxies, sandboxed runtimes, restricted egress environments). The combination of "exporter failure" + "silent at default log level" makes this hard to diagnose; users typically discover it only by checking their backend and seeing zero data.

Labels: bug (Something isn't working), pkg:otlp-exporter-base, priority:p2 (Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect)