
Let’s make your AI tiny and fast.

Try out our inference engine for Arm Cortex-M, the fastest and smallest in the world. On average it delivers a 2.6x speedup, a 2.0x reduction in RAM, and a 3.6x reduction in code size compared to TensorFlow Lite for Microcontrollers, with no change in accuracy: no binarization, no pruning.
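
To see what those averages mean in practice, here is a tiny Python sketch. The baseline numbers are invented for illustration; only the 2.6x, 2.0x, and 3.6x factors come from the figures above.

```python
# Illustration only: apply the average factors quoted above to made-up
# TensorFlow Lite for Microcontrollers baseline numbers.
TFLM_LATENCY_MS = 120.0  # hypothetical baseline inference latency
TFLM_RAM_KB = 256.0      # hypothetical baseline RAM usage
TFLM_CODE_KB = 180.0     # hypothetical baseline code size

SPEEDUP, RAM_FACTOR, CODE_FACTOR = 2.6, 2.0, 3.6  # averages quoted above

print(f"latency:   {TFLM_LATENCY_MS / SPEEDUP:6.1f} ms (was {TFLM_LATENCY_MS} ms)")
print(f"RAM:       {TFLM_RAM_KB / RAM_FACTOR:6.1f} kB (was {TFLM_RAM_KB} kB)")
print(f"code size: {TFLM_CODE_KB / CODE_FACTOR:6.1f} kB (was {TFLM_CODE_KB} kB)")
```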

We will not reuse, reverse engineer, or steal your model, and will not sell or provide your data to third parties.
For full details, see our privacy policy.

Here's how it works

1. Upload your .tflite model (if you need a model to test with, see the sketch below).
2. We compile and run it on Arm Cortex-M4 (STM32L4R9), Arm Cortex-M7 (STM32H7B3), and Arm Cortex-M33 (STM32U585) microcontrollers.
3. We show the speedup and the RAM and code size savings of our inference engine compared to TensorFlow Lite for Microcontrollers.
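
If you don't have a .tflite file at hand, here is a minimal sketch for producing one with the standard TensorFlow converter. The model architecture, input shape, and filename are arbitrary examples, and the int8 quantization shown is a common (but not required) choice for Cortex-M deployment.

```python
import numpy as np
import tensorflow as tf

# A small example Keras model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Representative data drives full-integer (int8) quantization; random
# samples are used here purely as a placeholder.
def representative_data():
    for _ in range(10):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```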

Model requirements

Your model must be a valid TensorFlow Lite file (.tflite extension) and smaller than 10MB.
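
A quick local check before uploading can save a round trip. This sketch verifies the extension, the 10MB limit, and that the file loads in the TensorFlow Lite interpreter; the filename is an example.

```python
import os
import tensorflow as tf

path = "model.tflite"  # example filename

assert path.endswith(".tflite"), "model must have the .tflite extension"
assert os.path.getsize(path) < 10 * 1024 * 1024, "model must be smaller than 10MB"

# Loading and allocating tensors fails if the flatbuffer is invalid.
interpreter = tf.lite.Interpreter(model_path=path)
interpreter.allocate_tensors()
print("inputs:", interpreter.get_input_details())
```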

Contact us to use our inference engine in your product

Get started