ml-datasets

Here are 22 public repositories matching this topic...

Privacy Act 2020 compliance auditor for ML datasets — scans CSV / Parquet / HuggingFace datasets and flags New Zealand-specific PII (IRD, NHI, driver licence, phone, address, te reo names) using a hybrid regex + NER + LLM verification pipeline.

  • Updated Jun 22, 2026
  • Python
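
The regex stage of a hybrid pipeline like the one described above might look roughly like the following. This is a minimal sketch, not the repository's code: the `PATTERNS` table and the `find_candidates` helper are hypothetical names, and the two patterns are simplified assumptions (the legacy NHI layout of three letters followed by four digits, and an IRD number written as 8 or 9 digits with optional separators). Neither pattern validates check digits, so hits are only candidates for the later NER and LLM verification steps.

```python
import re

# Simplified, illustrative patterns (assumptions, not the repository's actual rules).
PATTERNS = {
    # Legacy NHI layout: three letters (I and O excluded) followed by four digits.
    "NHI": re.compile(r"\b[A-HJ-NP-Z]{3}\d{4}\b"),
    # IRD number: 8 or 9 digits, optionally grouped with hyphens or spaces.
    "IRD": re.compile(r"\b\d{2,3}[- ]?\d{3}[- ]?\d{3}\b"),
}


def find_candidates(text):
    """Return (label, matched_text) pairs for every pattern hit, in pattern order."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits
```

Keeping this stage cheap and high-recall is a plausible design choice: false positives such as unrelated 9-digit numbers can be filtered by the slower verification stages.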

HuggingFace Datasets for Elixir - A native Elixir port of the popular HuggingFace datasets library. Stream, load, and process ML datasets from the HuggingFace Hub with full BEAM/OTP integration. Supports Parquet streaming, dataset splitting, shuffling, and seamless integration with Nx tensors for machine learning workflows.

  • Updated Apr 4, 2026
  • Elixir
