[feature request] Add a privacy preset to hide or mask all attributes that may contain user-generated content

**Is your feature request related to a problem? Please describe.**

OpenInference currently exposes a number of useful masking controls for span attributes, including environment variables and `TraceConfig` options such as:

- `OPENINFERENCE_HIDE_INPUTS`
- `OPENINFERENCE_HIDE_OUTPUTS`
- `OPENINFERENCE_HIDE_INPUT_MESSAGES`
- `OPENINFERENCE_HIDE_OUTPUT_MESSAGES`
- `OPENINFERENCE_HIDE_INPUT_IMAGES`
- `OPENINFERENCE_HIDE_INPUT_TEXT`
- `OPENINFERENCE_HIDE_OUTPUT_TEXT`
- `OPENINFERENCE_HIDE_LLM_PROMPTS`
- `OPENINFERENCE_HIDE_LLM_TOOLS`

The current approach works, but it requires users to reason about and enable several individual flags to avoid exporting attributes that may contain user-generated content (UGC). That is easy to get wrong, especially as OpenInference adds more instrumentors and semantic attributes over time.

For privacy-sensitive applications, a safer option would be a single opt-in setting that hides or masks every span attribute that may contain UGC, without requiring users to keep track of each individual attribute family.

Docs reference: https://arize.com/docs/phoenix/tracing/how-to-tracing/advanced/masking-span-attributes

**Describe the solution you'd like**

Please add a single configuration option, available both as an environment variable and in `TraceConfig`, that applies a comprehensive UGC privacy preset.

For example:

```bash
OPENINFERENCE_HIDE_UGC=true
```

```python
from openinference.instrumentation import TraceConfig

config = TraceConfig(
    hide_ugc=True,
)
```

```ts
const traceConfig = {
  hideUGC: true,
}
```

At least initially, this preset could simply enable the existing UGC-related masking toggles together, such as hiding inputs, outputs, input/output messages, input/output text, input images, LLM prompts, and any other existing flags that may expose user-provided content.

The important behavior is that users can choose a privacy posture instead of maintaining a checklist of individual masking flags. If new masking flags are added later for additional UGC-bearing attributes, this preset could include those as well.
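To make the intended behavior concrete, here is a minimal sketch of how such a preset could fan out to the individual toggles. `SketchTraceConfig` is a hypothetical stand-in, not the real `TraceConfig`; its field names are guessed from the environment variable names listed above, and including `hide_llm_tools` in the preset is my assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchTraceConfig:
    """Hypothetical stand-in for illustration; not the real openinference TraceConfig."""

    # Field names mirror the environment variables listed in this issue (assumed mapping).
    hide_inputs: bool = False
    hide_outputs: bool = False
    hide_input_messages: bool = False
    hide_output_messages: bool = False
    hide_input_images: bool = False
    hide_input_text: bool = False
    hide_output_text: bool = False
    hide_llm_prompts: bool = False
    hide_llm_tools: bool = False
    # Proposed preset flag (does not exist today).
    hide_ugc: bool = False

    def __post_init__(self) -> None:
        # When the preset is on, switch on every individual hide_* toggle.
        if self.hide_ugc:
            for f in fields(self):
                if f.name.startswith("hide_"):
                    setattr(self, f.name, True)


config = SketchTraceConfig(hide_ugc=True)
print(config.hide_inputs, config.hide_llm_prompts)
```

In this sketch the preset overrides any individual toggle that was explicitly left off. Whether an explicit `hide_inputs=False` should win over the preset is a design decision for the maintainers.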

**Why this matters**

In many applications, user prompts, uploaded documents, retrieved snippets, tool arguments, and model outputs can contain sensitive user-provided data. Today, a user can try to approximate this behavior by enabling several existing options one by one, but that has a few drawbacks:

- It is easy to miss a flag.
- The required flag set may change as the semantic conventions evolve.
- The same privacy intent has to be repeated across languages and instrumentors.
- Users must understand OpenInference's full attribute taxonomy before they can safely configure instrumentation.

A single `hide_ugc` / `OPENINFERENCE_HIDE_UGC` option would make the safe path much simpler.

**Describe alternatives you've considered**

The current workaround is to enable the existing masking flags individually, for example hiding inputs, outputs, input messages, output messages, input text, output text, images, prompts, and potentially tools. That works only if the user knows the complete set of attributes that can carry UGC.
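As an illustration of how fragile the workaround is, the following sketch audits an environment for flags that were left unset. The helper `missing_masking_flags` is hypothetical, the flag list is just the one from this issue (so it may already be incomplete), and treating only the literal value `true` as enabled is an assumption about how the variables are parsed.

```python
import os
from typing import List, Mapping

# Flags named in this issue; the real set may be larger or change over time.
KNOWN_MASKING_FLAGS = (
    "OPENINFERENCE_HIDE_INPUTS",
    "OPENINFERENCE_HIDE_OUTPUTS",
    "OPENINFERENCE_HIDE_INPUT_MESSAGES",
    "OPENINFERENCE_HIDE_OUTPUT_MESSAGES",
    "OPENINFERENCE_HIDE_INPUT_IMAGES",
    "OPENINFERENCE_HIDE_INPUT_TEXT",
    "OPENINFERENCE_HIDE_OUTPUT_TEXT",
    "OPENINFERENCE_HIDE_LLM_PROMPTS",
    "OPENINFERENCE_HIDE_LLM_TOOLS",
)


def missing_masking_flags(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the known masking flags that are not set to 'true' in env."""
    return [
        name
        for name in KNOWN_MASKING_FLAGS
        if env.get(name, "").strip().lower() != "true"
    ]


# Example: a deployment that only remembered the two most obvious flags.
deployed = {
    "OPENINFERENCE_HIDE_INPUTS": "true",
    "OPENINFERENCE_HIDE_OUTPUTS": "true",
}
for flag in missing_masking_flags(deployed):
    print("not enabled:", flag)
```

Every user who wants this guarantee has to maintain a list like `KNOWN_MASKING_FLAGS` themselves, which is the maintenance burden a built-in preset would remove.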

Another alternative would be field-level masking or regex-based redaction. That would be useful too, and appears related to #2550, but this request is specifically for a maintained preset that covers all known UGC-bearing OpenInference semantic attributes.

**Additional context**

Ideally, this configuration would be implemented consistently across the Python, TypeScript, and Go instrumentors, with the same precedence model as the existing masking configuration, from highest to lowest priority:

1. Values set in `TraceConfig`
2. Environment variables
3. Defaults
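The precedence order above could be resolved roughly as follows. This is a sketch only: `resolve_hide_ugc` is a hypothetical helper, `OPENINFERENCE_HIDE_UGC` is the variable proposed in this issue, and parsing only `true` (case-insensitive) as enabled is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_hide_ugc(
    explicit: Optional[bool] = None,
    env: Mapping[str, str] = os.environ,
    default: bool = False,
) -> bool:
    """Resolve the proposed preset: explicit value, then environment, then default."""
    # 1. A value passed in code (e.g. via TraceConfig) wins outright.
    if explicit is not None:
        return explicit
    # 2. Otherwise fall back to the proposed environment variable, if set.
    raw = env.get("OPENINFERENCE_HIDE_UGC")
    if raw is not None:
        return raw.strip().lower() == "true"
    # 3. Otherwise use the default.
    return default


# An explicit False in code overrides an environment variable set to true.
print(resolve_hide_ugc(False, {"OPENINFERENCE_HIDE_UGC": "true"}))
```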

It would also be helpful for the documentation to list which existing flags are enabled by the UGC preset, so privacy-sensitive users can understand the tradeoff between observability detail and data minimization.

