ONNX Runtime is a high-performance inference engine for machine learning models that runs on Windows, Linux, and macOS. Developers can train AI models in the framework of their choice and use ONNX Runtime to deploy those models to production in the cloud and at the edge. The Open Neural Network Exchange (ONNX) defines an open standard format for AI models that works across frameworks. Its main features include:
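To make the inference workflow concrete, here is a minimal sketch of loading and running an ONNX model with ONNX Runtime's Python API. The file name "model.onnx" and the input shape are illustrative assumptions, not details from the announcement.

```python
import numpy as np
import onnxruntime as ort

# Create an inference session; the CPU execution provider is requested
# explicitly here, but ONNX Runtime also supports GPU and other providers.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Inspect the model's declared inputs to build a matching feed dictionary.
input_meta = session.get_inputs()[0]
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed shape

# Run inference; passing None for the output names returns all outputs.
outputs = session.run(None, {input_meta.name: dummy_input})
print(outputs[0].shape)
```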
“Framework interoperability: Developers can more easily move between frameworks and use the best tool for the task at hand. Each framework is optimized for specific characteristics such as fast training, supporting flexible network architectures, inferencing on mobile devices, etc. Many times, the characteristic most important during research and development is different than the one most important for shipping to production. This leads to inefficiencies from not using the right framework or significant delays as developers convert models between frameworks. Frameworks that use the ONNX representation simplify this and enable developers to be more agile.

Shared optimization: Hardware vendors and others with optimizations for improving the performance of neural networks can impact multiple frameworks at once by targeting the ONNX representation. Frequently optimizations need to be integrated separately into each framework which can be a time-consuming process. The ONNX representation makes it easier for optimizations to reach more developers.”
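The interoperability workflow described above typically starts with exporting a trained model to the ONNX format. Below is a hedged sketch using PyTorch's built-in exporter; the tiny model, tensor names, and output file name are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A toy model standing in for any trained network.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
model.eval()

# An example input is used to trace the computation graph during export.
dummy_input = torch.randn(1, 10)
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])
```

The resulting "model.onnx" file can then be loaded by any ONNX-compatible runtime, such as the ONNX Runtime session shown earlier, regardless of the framework used for training.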
ONNX Growth
Numerous frameworks have already adopted ONNX, including Microsoft’s own Cognitive Toolkit as well as Caffe2, Apache MXNet, PyTorch, and NVIDIA’s TensorRT.