Enot: Shaping the Future of Neural Network Compression and Acceleration

From robotics and quality control to self-checkouts – neural networks are being deployed everywhere, taking over the legwork of visual inspection from humans. But especially on edge devices – hardware that processes data close to where it is generated, at the network's edge – fast execution, low resource usage, and low energy consumption are crucial for operating neural networks. 

Founded by Sergey Alyamkin in 2021 and based in Riga, Latvia, the AI startup Enot addresses exactly these challenges. After raising their pre-seed round from New Nordic Ventures in 2021, they launched their AutoML framework for neural network compression and acceleration this spring. 

Learn more about the future of neural network compression and acceleration in our interview with CEO Sergey Alyamkin:

Why Did You Start Enot?

I had worked as a chief data scientist for several enterprises before and repeatedly encountered the same problem with machine learning: in research and development, the goal is always to make a neural network as accurate as possible. But to build an actual product around neural networks, one must also consider the computational resources the network consumes and its cost-efficiency. 

A few years ago, most machine learning projects were pure research and development – now, neural networks are being deployed everywhere. However, in my experience, more than 90% of machine learning projects fail on the way from development to production – neural networks are still ineffective for many use cases, and evaluating whether they work for a particular use case takes too much time and money. 

We started Enot to fix this problem, especially when running neural networks on edge devices and processing distributed data. Also, from a personal point of view, I wanted to start something myself and implement my own ideas rather than being an employee again. 

How Does Neural Network Compression Work?

Our goal is to accelerate and compress neural networks so they work at the edge, where energy resources are limited but everything still needs to run reliably. We achieve this in two ways: first, a neural network is essentially a series of matrix and tensor operations. We accelerate the execution of these operations with our runtime library, which translates them to low-level code and runs not only on CPUs but also on GPUs. 
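The idea that a network is just a chain of matrix operations can be made concrete with a minimal sketch. The weights and shapes below are invented for illustration, and this is plain NumPy – not Enot's runtime library:

```python
import numpy as np

# A tiny two-layer network's forward pass, written out as the raw
# matrix operations a runtime would ultimately execute.
# All shapes and weights here are illustrative only.

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # first layer weights
b1 = np.zeros(8)
W2 = rng.standard_normal((8, 2))   # second layer weights
b2 = np.zeros(2)

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # matmul + bias + ReLU
    return h @ W2 + b2                # final matmul + bias

y = forward(rng.standard_normal((1, 4)))
print(y.shape)  # (1, 2)
```

A runtime library speeds this up by compiling each of these operations to optimized low-level kernels for the target CPU or GPU instead of dispatching them through an interpreter.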

Second, we built an automated machine learning framework that simplifies the architecture of neural networks, e.g. by reducing the size of matrices and tensors. It removes unnecessary network layers and idle connections – pruning up to 80% of a neural network while monitoring its performance. Through this compression, neural networks can run inference up to 20x faster, which is crucial e.g. for fast image or video processing or for low-latency chatbots. 
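Magnitude-based pruning is one common way to remove idle connections: weights closest to zero contribute least, so they can be zeroed out while accuracy is monitored. The sketch below is a generic illustration of that idea with an arbitrary 80% sparsity target – it is not Enot's framework:

```python
import numpy as np

# Illustrative magnitude-based pruning: zero out the smallest-magnitude
# weights of a layer. In practice one would re-evaluate accuracy after
# each pruning step; here we only show the weight removal itself.

def prune(weights, sparsity):
    """Zero the fraction `sparsity` of smallest-magnitude weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_pruned, mask = prune(W, sparsity=0.8)

print(f"weights kept: {mask.mean():.0%}")  # roughly 20%
```

The speedup comes from downstream: sparse or shrunken weight matrices mean fewer multiply-accumulate operations per inference.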

Clients use our framework once to optimize the architecture of every new neural network they build, and again every time they retrain a network and want to reduce its RAM consumption through a process called quantization. 
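Quantization reduces RAM consumption by storing weights in a narrower numeric type, typically int8 instead of float32 for a 4x saving. The following is a minimal sketch of symmetric post-training quantization; the function names are illustrative and not Enot's API:

```python
import numpy as np

# Illustrative symmetric linear quantization: map float32 weights to
# int8 plus a single scale factor, cutting memory per weight by 4x.

def quantize(weights):
    """Quantize float32 weights to int8 with a shared scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize(W)

print(W.nbytes // q.nbytes)  # 4x smaller in memory
```

Production frameworks additionally calibrate scales per channel and quantize activations, but the memory arithmetic is the same.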

How Did You Evaluate Your Startup Idea?

We skipped formal evaluation and customer interviews and just started building the product we wanted ourselves. We knew the problem well – and knew at least ten other people who definitely needed such a framework – so we were confident it would deliver value. 

Who doesn’t like speedy neural networks? The risk lies less in the market and more in the technology, i.e. whether you can deliver performance without compromising accuracy. We’re technical founders serving technical customers who don’t want to see slides – they want to see how the product works. 

What Advice Would You Give Fellow Deep Tech Founders?

If you have a great idea, just get started working on it. If you face more market risk, it makes sense to talk to users first. But getting started changes everything – you have to put in the work to build and evaluate your product.