Add anonymous usage telemetry to H2O-3 clients (Python, R, JVM) #16875

Description

@valenad1

Add opt-out anonymous usage telemetry to the H2O-3 clients so we can see which features and platforms are used in the field.

Scope

  • Python & R clients: one event per h2o.init() / h2o.connect() and per major action (train, score/predict, MOJO & model download, upload, import, parse, AutoML, model save/load), on the v2.1 wire contract.
  • JVM: standalone java -jar h2o.jar and hadoop jar h2odriver.jar clusters emit a single init event when the cluster forms. Only the leader node sends it, and it fires even when no Python/R client is attached.
  • Opt-out via H2O_DISABLE_TELEMETRY / DO_NOT_TRACK env vars, telemetry=False (Python) / telemetry = FALSE (R), or -Dsys.ai.h2o.telemetry.disabled=true.
  • Only coarse bucketed metrics plus version / OS / cluster-topology metadata are sent. Code, paths, dataset or model names, column names, parameter values, and any other user content are never sent.
  • Sending is fire-and-forget: it never blocks, raises, or retries.
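
To make the opt-out and fire-and-forget requirements above concrete, here is a minimal Python sketch. It is not the proposed implementation: the set of values treated as "true", the helper names (telemetry_enabled, bucket_count, send_event), the power-of-ten bucketing scheme, and the JSON payload shape are all assumptions made for illustration. Only the env var names and the telemetry=False argument come from this issue.

```python
import json
import os
import threading
import urllib.request

# Assumption: the issue does not say which values count as "set".
_TRUTHY = {"1", "true", "yes", "on"}


def telemetry_enabled(telemetry=True, env=None):
    """Return False if any opt-out signal named in the issue is present."""
    env = os.environ if env is None else env
    if not telemetry:  # h2o.init(telemetry=False)
        return False
    for name in ("H2O_DISABLE_TELEMETRY", "DO_NOT_TRACK"):
        if env.get(name, "").strip().lower() in _TRUTHY:
            return False
    return True


def bucket_count(n):
    """Map an exact count to a coarse power-of-ten range (hypothetical scheme)."""
    if n <= 0:
        return "0"
    upper = 10
    while n >= upper:
        upper *= 10
    return f"{upper // 10}-{upper - 1}"


def send_event(url, payload, timeout=2.0):
    """Post one event on a daemon thread; swallow every error, never retry."""

    def _post():
        try:
            req = urllib.request.Request(
                url,
                data=json.dumps(payload).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            urllib.request.urlopen(req, timeout=timeout).close()
        except Exception:
            pass  # fire-and-forget: failures are intentionally ignored

    thread = threading.Thread(target=_post, daemon=True)
    thread.start()
    return thread
```

The daemon thread keeps the caller from blocking on the network, and the broad except keeps a failed send from surfacing in user code. A real client would probably also cap the number of in-flight sends.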

Docs

  • Add a Telemetry page under the User Guide and a Privacy & Telemetry section in the README (separate PR).

The wire contract and receiver are open source in h2o-3-telemetry; a private receiver can be targeted via H2O_TELEMETRY_URL.
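
One way a client could resolve the receiver address is sketched below. Only the H2O_TELEMETRY_URL variable name comes from this issue. The default URL is a placeholder (the real endpoint is not stated here), and falling back to the default on a malformed override is an assumption.

```python
import os
from urllib.parse import urlsplit

# Placeholder only: the real default receiver URL is not given in this issue.
DEFAULT_TELEMETRY_URL = "https://telemetry.example.invalid/events"


def resolve_endpoint(env=None):
    """Return the H2O_TELEMETRY_URL override if it is a usable http(s) URL."""
    env = os.environ if env is None else env
    override = env.get("H2O_TELEMETRY_URL", "").strip()
    if override:
        parts = urlsplit(override)
        if parts.scheme in ("http", "https") and parts.netloc:
            return override
    # Assumption: a missing or malformed override falls back to the default.
    return DEFAULT_TELEMETRY_URL
```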
