
[feature request] Add a privacy preset to hide or mask all attributes that may contain user-generated content #3203

Description

@jordanbertasso

Is your feature request related to a problem? Please describe.

OpenInference currently exposes a number of useful masking controls for span attributes, including environment variables and TraceConfig options such as:

  • OPENINFERENCE_HIDE_INPUTS
  • OPENINFERENCE_HIDE_OUTPUTS
  • OPENINFERENCE_HIDE_INPUT_MESSAGES
  • OPENINFERENCE_HIDE_OUTPUT_MESSAGES
  • OPENINFERENCE_HIDE_INPUT_IMAGES
  • OPENINFERENCE_HIDE_INPUT_TEXT
  • OPENINFERENCE_HIDE_OUTPUT_TEXT
  • OPENINFERENCE_HIDE_LLM_PROMPTS
  • OPENINFERENCE_HIDE_LLM_TOOLS

The current approach works, but it requires users to reason about and enable several individual flags to avoid exporting attributes that may contain user-generated content (UGC). That is easy to get wrong, especially as OpenInference adds more instrumentors and semantic attributes over time.

For privacy-sensitive applications, a safer option would be a single opt-in setting that hides or masks every span attribute that may contain UGC, so users do not have to keep track of each individual attribute family.

Docs reference: https://arize.com/docs/phoenix/tracing/how-to-tracing/advanced/masking-span-attributes

Describe the solution you'd like

Please add a single configuration option, available both as an environment variable and in TraceConfig, that applies a comprehensive UGC privacy preset.

For example:

Environment variable:

OPENINFERENCE_HIDE_UGC=true

Python:

from openinference.instrumentation import TraceConfig

config = TraceConfig(
    hide_ugc=True,
)

TypeScript:

const traceConfig = {
  hideUGC: true,
}

At least initially, this preset could simply enable the existing UGC-related masking toggles together, such as hiding inputs, outputs, input/output messages, input/output text, input images, LLM prompts, and any other existing flags that may expose user-provided content.

The important behavior is that users can choose a privacy posture instead of maintaining a checklist of individual masking flags. If new masking flags are added later for additional UGC-bearing attributes, this preset could include those as well.
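
As a rough illustration of the "enable the existing toggles together" idea, here is a minimal sketch. It uses a stand-in dataclass rather than the real TraceConfig, so every name in it (SketchTraceConfig, UGC_PRESET, apply_ugc_preset, and the field names, which are simply derived from the environment variables listed above) is hypothetical. Including the tools flag in the preset is also an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchTraceConfig:
    """Stand-in for TraceConfig; field names here are assumptions."""

    hide_ugc: bool = False  # the proposed preset switch
    hide_inputs: bool = False
    hide_outputs: bool = False
    hide_input_messages: bool = False
    hide_output_messages: bool = False
    hide_input_images: bool = False
    hide_input_text: bool = False
    hide_output_text: bool = False
    hide_llm_prompts: bool = False
    hide_llm_tools: bool = False


# Hypothetical preset membership. New UGC-related flags would be
# appended here as they are introduced.
UGC_PRESET = (
    "hide_inputs",
    "hide_outputs",
    "hide_input_messages",
    "hide_output_messages",
    "hide_input_images",
    "hide_input_text",
    "hide_output_text",
    "hide_llm_prompts",
    "hide_llm_tools",
)


def apply_ugc_preset(config: SketchTraceConfig) -> SketchTraceConfig:
    """Return a copy with every preset flag enabled when hide_ugc is set."""
    if not config.hide_ugc:
        return config
    return replace(config, **{name: True for name in UGC_PRESET})
```

With this shape, a flag that is already enabled individually stays enabled, and leaving hide_ugc unset changes nothing, so existing configurations would keep their current behavior.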

Why this matters

In many applications, user prompts, uploaded documents, retrieved snippets, tool arguments, and model outputs can contain sensitive user-provided data. Today, a user can try to approximate this behavior by enabling several existing options one by one, but that has a few drawbacks:

  • It is easy to miss a flag.
  • The required flag set may change as the semantic conventions evolve.
  • The same privacy intent has to be repeated across languages and instrumentors.
  • Users must understand OpenInference's full attribute taxonomy before they can safely configure instrumentation.

A single hide_ugc / OPENINFERENCE_HIDE_UGC option would make the safe path much simpler.

Describe alternatives you've considered

The current workaround is to enable the existing masking flags individually, for example hiding inputs, outputs, input messages, output messages, input text, output text, images, prompts, and potentially tools. That works only if the user knows the complete set of attributes that can carry UGC.
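
For reference, the workaround today looks roughly like the sketch below, which sets each flag before instrumentation starts. The flag names are the ones listed in this issue; the helper name enable_all_masking_flags is hypothetical, and the list is only as complete as the author's knowledge of the flags, which is the problem this request is about.

```python
import os
from typing import MutableMapping

# Flag names copied from the list earlier in this issue. Anything added
# to OpenInference later would have to be appended here by hand.
MASKING_FLAGS = [
    "OPENINFERENCE_HIDE_INPUTS",
    "OPENINFERENCE_HIDE_OUTPUTS",
    "OPENINFERENCE_HIDE_INPUT_MESSAGES",
    "OPENINFERENCE_HIDE_OUTPUT_MESSAGES",
    "OPENINFERENCE_HIDE_INPUT_IMAGES",
    "OPENINFERENCE_HIDE_INPUT_TEXT",
    "OPENINFERENCE_HIDE_OUTPUT_TEXT",
    "OPENINFERENCE_HIDE_LLM_PROMPTS",
    "OPENINFERENCE_HIDE_LLM_TOOLS",
]


def enable_all_masking_flags(
    environ: MutableMapping[str, str] = os.environ,
) -> MutableMapping[str, str]:
    """Hypothetical helper: set every known masking flag to "true"."""
    for name in MASKING_FLAGS:
        environ[name] = "true"
    return environ
```

Calling enable_all_masking_flags() before any instrumentor is set up would apply the flags to the current process environment.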

Another alternative would be field-level masking or regex-based redaction. That would be useful too, and appears related to #2550, but this request is specifically for a maintained preset that covers all known UGC-bearing OpenInference semantic attributes.

Additional context

Ideally, this configuration would be implemented consistently across the Python, TypeScript, and Go instrumentors, with the same precedence model as the existing masking configuration:

  1. Values set in TraceConfig
  2. Environment variables
  3. Defaults
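
The precedence order above could be sketched as follows for the proposed option. The function name resolve_hide_ugc is hypothetical, and the parsing rule (only "true" or "false", case-insensitive, with anything else falling through to the default) is an assumption rather than a statement of how the existing flags are parsed.

```python
import os
from typing import Mapping, Optional


def resolve_hide_ugc(
    explicit: Optional[bool] = None,
    environ: Mapping[str, str] = os.environ,
    default: bool = False,
) -> bool:
    """Resolve the proposed setting: explicit value, then env var, then default."""
    # 1. A value set in code (for example on TraceConfig) wins outright.
    if explicit is not None:
        return explicit

    # 2. Otherwise consult the environment variable proposed in this issue.
    raw = environ.get("OPENINFERENCE_HIDE_UGC")
    if raw is not None:
        value = raw.strip().lower()
        if value == "true":
            return True
        if value == "false":
            return False
        # Unrecognized values fall through to the default (an assumption).

    # 3. Fall back to the default.
    return default
```

Under this sketch, an explicit hide_ugc=False in code would override OPENINFERENCE_HIDE_UGC=true in the environment.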

It would also be helpful for the documentation to list which existing flags are enabled by the UGC preset, so privacy-sensitive users can understand the tradeoff between observability detail and data minimization.
