deep learning
tiny. Finally.

Radically more efficient deep learning to enable inference on low-power hardware.

We are working on a future in which a forest tells us there’s a fire. A warehouse directs us to the missing box. And a home lets us know if our elderly need help. It’s a future that is safer and more thoughtful thanks to intelligent sensors inside all the things around us.

Yesterday, you had to sit at your PC.

Today, you take your
smartphone with you.

Tomorrow, intelligent sensors
turn computing ambient.

AI everywhere.

Making computing ambient requires a large number of intelligent chips, so they have to be small, cheap and low-power. Our technology works on $1 chips that consume less than 10 milliwatts. These chips are so efficient that they can be powered by a coin battery or a small solar cell.

Runs on chips that cost less than $1

Consumes less than 10 mW


Better without
the cloud.

More reliable

Processing data on the device is inherently more reliable than a connection with the cloud. Intelligence shouldn’t have to depend on weak WiFi.


Low latency

Sending data from the sensor to the cloud, processing the data, and sending it back again takes time, sometimes whole seconds. This latency is problematic for products that need to respond to sensor input in real time.

Private & secure

Sending sensor data such as audio and video to the cloud increases privacy and security risks. To reduce abuse and give people confidence to let intelligent sensors into their lives, the data should not leave the device.

Bandwidth friendly

Ubiquitous connected sensors would overwhelm the network. Plumerai chips use the network only when they have something to report, which keeps bandwidth and mobile data costs low.

Energy efficient

The farther we move data, the more energy we use. Sending data to the cloud uses a lot of energy. Processing data on-chip is more efficient by orders of magnitude. If a device needs a battery life of months or years, data needs to be processed locally.

Use cases

Putting our tiny
technology to work.

Person detection

Our computer vision models for person detection are small enough to run in real-time on affordable, battery-powered cameras. They can be deployed in large numbers to cover every corner of the room. This allows a store to better allocate staff and improve product placements, an office to detect free desks, and a building to optimize heating and cooling.

Speech recognition

By removing price and energy concerns, our speech recognition and sound detection technology can be embedded in the microwave in your kitchen, the lightbulb in your garage and the thermostat in your bedroom. All processing happens on the device, so no data has to leave it, which makes it inherently secure.

Binarized Neural Networks

Radically more efficient with only 1 bit

Deep learning models can have millions of parameters, and each parameter is encoded in bits. Where others require 32, 16 or 8 bits, Binarized Neural Networks (BNNs) use a single bit for each parameter. This makes it possible to calibrate the model precisely and get the maximum performance out of every bit.

Why we use BNNs

A BNN needs drastically less memory to store its weights and activations than an 8-bit deep learning model. This saves energy by reducing the need to access off-chip memory, and it makes it feasible to deploy deep learning on more affordable, memory-constrained devices.
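As a rough illustration of the memory argument, the sketch below computes the storage needed for weights alone at different bit-widths. The model size of one million parameters is a hypothetical figure chosen for the arithmetic, not a description of any Plumerai model, and activations and metadata are ignored.

```python
def weight_storage_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to store num_params weights at the given bit-width, rounded up."""
    return (num_params * bits_per_param + 7) // 8


if __name__ == "__main__":
    num_params = 1_000_000  # hypothetical model size, for illustration only
    for bits in (32, 16, 8, 1):
        size = weight_storage_bytes(num_params, bits)
        print(f"{bits:>2}-bit weights: {size:,} bytes")
```

Under these assumptions, moving from 8 bits to 1 bit per weight cuts weight storage by a factor of eight, which is what lets a model fit in the on-chip memory of a small microcontroller.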

In addition to this, BNNs are radically more efficient computationally. Convolutions are an essential building block of deep learning models, and they consist of multiplications and additions. Because the complexity of a hardware multiplier is proportional to the square of the bit-width, multiplications are costly at 8 bits. In a BNN, where weights and activations are 1 bit, the multiplications and additions can be replaced by the simple XNOR and POPCOUNT operations.
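To make the XNOR and POPCOUNT replacement concrete, here is a minimal sketch of a binary dot product in plain Python. It assumes the common encoding in which +1 is stored as bit 1 and -1 as bit 0; the helper names `pack_signs` and `binary_dot` are hypothetical and are not part of Larq or any Plumerai software.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int: bit i is 1 for +1 and 0 for -1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n vectors of +1/-1 values stored as packed bits."""
    mask = (1 << n) - 1
    # XNOR sets a bit wherever the two signs agree, i.e. where the product is +1.
    xnor = ~(a_bits ^ b_bits) & mask
    # POPCOUNT counts the agreeing positions.
    matches = bin(xnor).count("1")
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    print(binary_dot(pack_signs(a), pack_signs(b), len(a)))
    print(sum(x * y for x, y in zip(a, b)))  # reference result using multiplications
```

On real hardware the same idea is applied to whole machine words at once, so a single XNOR followed by a single POPCOUNT instruction stands in for dozens of multiply-accumulate operations.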

Challenges we had to overcome

Information is lost when the weights and activations are encoded using 1 bit instead of 8 bits. This affects the accuracy of the model.

Furthermore, the activation functions inside BNNs do not have a meaningful derivative: their gradient is zero almost everywhere. This is a problem because deep learning models are trained using gradient descent.
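One widely used workaround for the missing derivative is the straight-through estimator, sketched below for a single scalar. This illustrates the general technique from the BNN literature and is not necessarily the method Plumerai uses; the function names and the clipping threshold of 1.0 are assumptions made for the example.

```python
def sign(x: float) -> float:
    """Forward pass: binarize to +1 or -1 (0 maps to +1 by convention here)."""
    return 1.0 if x >= 0 else -1.0


def sign_ste_grad(x: float, upstream_grad: float, clip: float = 1.0) -> float:
    """Backward pass of the straight-through estimator.

    The true derivative of sign is 0 almost everywhere, so the gradient is
    instead passed through unchanged when |x| <= clip and set to 0 otherwise.
    """
    return upstream_grad if abs(x) <= clip else 0.0


if __name__ == "__main__":
    for x in (-2.0, -0.3, 0.0, 0.7, 1.5):
        print(x, sign(x), sign_ste_grad(x, upstream_grad=0.5))
```

The forward pass still produces strictly binary values, while the backward pass gives gradient descent a usable signal for inputs near zero.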

We solved these issues through our research and technology:

Helwegen et al., Latent weights do not exist: Rethinking binarized neural network optimization, NeurIPS (2019)

Bannink et al., Larq Compute Engine: Design, Benchmark, and Deploy State-of-the-Art Binarized Neural Networks, MLSys (2021)

Larq: Plumerai's ecosystem of open-source Python packages for Binarized Neural Networks


Vertical integration
makes our BNNs work

We combine our BNN inference stack with our collection of trained BNN models to provide a turnkey solution on off-the-shelf chips. Where applicable, we also provide our IP-core for FPGAs.

Software and algorithms for training BNNs

To prevent the loss in accuracy that occurs when BNNs are trained in the same way as 8-bit deep learning models, we developed new software for training BNNs. Combined with our world-class research on state-of-the-art BNN architectures and training algorithms, this results in tiny but highly accurate BNN models.

Software stack for BNN inference

We built inference software to enable BNNs to run efficiently on microcontrollers. Our compute engines are optimized for ARM Cortex-M, ARM Cortex-A and RISC-V architectures.

Datasets optimized for the real world

We collect, label and build our own datasets. Our internal data pipeline identifies failure cases to ensure that our models are highly reliable and accurate.

IP-core for BNN inference on FPGAs

For customers who use FPGAs and require the most energy-efficient solution, we provide a custom IP-core that is highly optimized for our BNN models and software.

Plumerai’s BNNs vs. 8-bit deep learning
for person presence detection

Plumerai’s technology is more accurate, faster and smaller than the best publicly available 8-bit deep learning models running on TensorFlow Lite for Microcontrollers. This is the case on ARM Cortex-M based microcontrollers as well as on many other platforms. These improvements result in better products with longer battery life and lower memory requirements.

Get started with Plumerai today

Plumerai's collection of trained BNN models is available to our customers, along with our inference software stack and our IP-core for FPGAs. Learn more about the impact our technology will have on your product and specific application.

