
Let’s make your AI tiny and fast.

Try out our inference engine for Arm Cortex-M, the fastest and smallest in the world. On average it delivers a 2.6x speedup, a 2.0x reduction in RAM, and a 3.6x reduction in code size compared to TensorFlow Lite for Microcontrollers, with no change in accuracy: no binarization, no pruning.
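
To see what those averages mean in practice, here is a tiny Python sketch. The baseline numbers are invented for illustration; only the 2.6x, 2.0x, and 3.6x factors come from the figures above.

```python
# Illustration only: apply the average factors quoted above to made-up
# TensorFlow Lite for Microcontrollers baseline numbers.
TFLM_LATENCY_MS = 120.0  # hypothetical baseline inference latency
TFLM_RAM_KB = 256.0      # hypothetical baseline RAM usage
TFLM_CODE_KB = 180.0     # hypothetical baseline code size

SPEEDUP, RAM_FACTOR, CODE_FACTOR = 2.6, 2.0, 3.6  # averages quoted above

print(f"latency:   {TFLM_LATENCY_MS / SPEEDUP:6.1f} ms (was {TFLM_LATENCY_MS} ms)")
print(f"RAM:       {TFLM_RAM_KB / RAM_FACTOR:6.1f} kB (was {TFLM_RAM_KB} kB)")
print(f"code size: {TFLM_CODE_KB / CODE_FACTOR:6.1f} kB (was {TFLM_CODE_KB} kB)")
```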

We will not reuse, reverse engineer, or steal your model, and will not sell or provide your data to third parties.
For full details, see our privacy policy.

Here's how it works

1. Upload your .tflite model (if you need a model to test with, see the sketch below).
2. We compile and run it on Arm Cortex-M4 (STM32L4R9), Arm Cortex-M7 (STM32H7B3), and Arm Cortex-M33 (STM32U585) microcontrollers.
3. We show the speedup and the RAM and code size savings of our inference engine compared to TensorFlow Lite for Microcontrollers.
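
If you don't have a .tflite file at hand, here is a minimal sketch for producing one with the standard TensorFlow converter. The model architecture, input shape, and filename are arbitrary examples, and the int8 quantization shown is a common (but not required) choice for Cortex-M deployment.

```python
import numpy as np
import tensorflow as tf

# A small example Keras model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Representative data drives full-integer (int8) quantization; random
# samples are used here purely as a placeholder.
def representative_data():
    for _ in range(10):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```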

Model requirements

Your model must be a valid TensorFlow Lite file (.tflite extension) and smaller than 10MB.
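
A quick local check before uploading can save a round trip. This sketch verifies the extension, the 10MB limit, and that the file loads in the TensorFlow Lite interpreter; the filename is an example.

```python
import os
import tensorflow as tf

path = "model.tflite"  # example filename

assert path.endswith(".tflite"), "model must have the .tflite extension"
assert os.path.getsize(path) < 10 * 1024 * 1024, "model must be smaller than 10MB"

# Loading and allocating tensors fails if the flatbuffer is invalid.
interpreter = tf.lite.Interpreter(model_path=path)
interpreter.allocate_tensors()
print("inputs:", interpreter.get_input_details())
```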

Contact us to use our inference engine in your product

Get started