stonnx

High-performance, cross-platform ONNX inference runtime and quantization/serving toolkit. Run ONNX models on Apple Silicon NPU/GPU, arm64 and amd64 CPUs (with or without AVX-512 vectorization on Intel/AMD), Nvidia GPUs, and WebGPU (transformers.js).

Convert PyTorch and other model formats into ONNX, then chop up, reformat, assemble, and serve ONNX machine learning models directly in your Go application. Our goal is to make it easy to containerize and serve (or run locally) small to medium-sized machine learning workloads at low cost and complexity, across runtimes and compute environments, in a format that is amenable to finetuning, model composition, and distributed inference/machine learning.

Contents will be populated incrementally from other Accretional internal and external inference repositories. Please stand by.

