Let’s make your AI
tiny and fast.

Try out our inference engine for Arm Cortex-M. It is the fastest and smallest in the world. MobileNetV2 runs with 40% lower latency and 49% less RAM than with TensorFlow Lite for Microcontrollers. Model accuracy does not change: no binarization, no additional quantization, no pruning.

Drag .tflite file here


Please submit a valid TF Lite model (.tflite extension and <10MB).
There was an error validating the form input. Please verify the content and try again.
The service is currently very busy. Please try again in 5 minutes.
An internal error occurred. This error has been reported to the Plumerai team.
The public benchmarker is currently disabled. We are working on getting this up and running again as soon as possible.

We will not reuse, reverse engineer, or steal your model, and will not sell or provide your data to third parties.
For full details, see our privacy policy.

Here's how it works

1. Upload your .tflite model.
2. We compile and run it on Arm Cortex-M4 (STM32L4R9) and Arm Cortex-M7 (STM32H7B3) microcontrollers.
3. We show the speedups and memory savings with our inference engine compared to TensorFlow Lite for Microcontrollers.
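Before uploading, you can check locally that a file meets the stated requirements (a .tflite extension and under 10 MB). A minimal sketch in Python; the check for the "TFL3" file identifier at offset 4 is an extra assumption based on the TFLite FlatBuffer format, not a requirement stated by the service:

```python
import os

MAX_SIZE = 10 * 1024 * 1024  # 10 MB, per the stated requirement


def check_model(path: str) -> bool:
    """Return True if the file looks like an acceptable TFLite model."""
    # Requirement: .tflite extension
    if not path.endswith(".tflite"):
        return False
    # Requirement: file exists and is smaller than 10 MB
    if not os.path.isfile(path) or os.path.getsize(path) >= MAX_SIZE:
        return False
    # Assumption: TFLite FlatBuffers carry the identifier "TFL3" at offset 4
    with open(path, "rb") as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b"TFL3"
```

This only catches obvious mistakes before upload; the service performs its own validation.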

Model requirements

Contact us and use our inference engine in your product

Get started