Neurolabs: Shaping the Future of Image Recognition for Consumer Packaged Goods

Supermarkets and convenience stores are increasingly equipped with cameras that speed up the checkout process by automatically recognizing the products you’re buying through computer vision. But what if a producer decides to change its packaging, e.g., for a promotional campaign?

Usually, this would mean taking lots of photos of the new packaging design, labeling these images, and then retraining the computer vision models—a manual and arduous process. By using synthetic, computer-generated images, however, Neurolabs can cut this process from weeks to hours. Founded in the fall of 2018 by Paul Pop, Remus Pop, and Patric Fulop in Edinburgh, the company raised a $3.5M seed round in May 2022 from 7percent Ventures, LAUNCHub Ventures, Techstart Ventures, and Lunar Ventures.

Learn more about the future of computer vision and synthetic data from our interview with the CTO, Patric Fulop:

Why Did You Start Neurolabs?

My two co-founders, Paul and Remus, met in their first year of studies at the University of Edinburgh. Remus and I had been colleagues before, so the three of us started working together on side projects. We looked a lot into computer vision and graphics, and a whole new world of complex challenges and elegant maths opened up to us.

We soon learned how much money and effort is spent on data annotation in computer vision and thought there must be a better way. That’s when we got more serious about building and de-risking our startup: going through the Fast Track Malmö startup accelerator program, conducting extensive customer research, and securing our first pilot projects.

Building a startup involves a lot of creativity—creative problem solving—and we were eager to create something of our own.

How Does Image Recognition Work?

As we had identified, the availability of labeled data was the bottleneck for creating really good computer vision models and enabling image recognition, e.g., for AI-powered checkouts in supermarkets. The solution we pursued is synthetic data—AI-generated, pixel-perfect labeled images based on real-world objects that can then be used to train computer vision algorithms.

To do this, we take 3D scans or a few images of real-world objects, like a Coke bottle, and create 3D assets that capture their geometry, textures, and properties like light reflection. We then place these assets programmatically in 3D scenes, just as people do manually in Blender, and thereby create lots of labeled sample images, which can be used to train a computer vision model. It’s a very fast and scalable process that allows us to generate a wide variety of sample images—with different camera angles or light conditions—that would require a lot of effort to obtain in real life.
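
To make this concrete, here is a minimal sketch of what such programmatic scene randomization can look like using Blender’s Python API (bpy). The object names and the pose-only JSON labels are illustrative assumptions; a production pipeline like the one described would emit pixel-perfect annotations such as segmentation masks rather than just pose metadata:

```python
# Run inside Blender: randomize object pose, lighting, and camera,
# then render a batch of synthetic images with ground-truth labels.
import bpy
import json
import math
import os
import random

scene = bpy.context.scene
product = bpy.data.objects["CokeBottle"]  # hypothetical 3D asset name
light = bpy.data.objects["Light"]
camera = bpy.data.objects["Camera"]
os.makedirs(bpy.path.abspath("//renders"), exist_ok=True)

for i in range(100):
    # Randomize the product's position and rotation on the shelf plane.
    product.location = (random.uniform(-0.3, 0.3), random.uniform(-0.3, 0.3), 0.0)
    product.rotation_euler = (0.0, 0.0, random.uniform(0.0, 2 * math.pi))

    # Vary light intensity and camera height to cover many conditions.
    light.data.energy = random.uniform(200.0, 1500.0)
    camera.location = (0.0, -1.2, random.uniform(0.4, 1.0))

    # Render the image and store the ground-truth label alongside it.
    scene.render.filepath = f"//renders/sample_{i:04d}.png"
    bpy.ops.render.render(write_still=True)
    label = {"class": "coke_bottle", "location": list(product.location),
             "rotation_z": product.rotation_euler.z}
    with open(bpy.path.abspath(f"//renders/sample_{i:04d}.json"), "w") as f:
        json.dump(label, f)
```

Because every image is rendered from a known scene description, each label comes for free—which is exactly what makes the process fast and scalable.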

But we also had to solve several challenges. First, rendering the images. Second, bridging the domain gap: real and synthetic images have different pixel distributions, and we needed to close this gap by making the rendered images highly photorealistic using a technique called neural radiance fields, so as not to distort the training data. Third, we had to create lots of computer vision models based on synthetic images for solving specific problems, such as recognizing Coke bottles.
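
The domain gap mentioned in the second challenge can be made tangible with a toy measurement: comparing the pixel-intensity distributions of a real photo and a synthetic render of the same product. This is only an illustrative proxy, not Neurolabs’ method (the file paths are placeholders, and in practice the gap also lives in feature space, which is why photorealistic rendering matters so much):

```python
# Toy illustration of the "domain gap": compare the grayscale pixel
# distributions of a real and a synthetic image of the same product.
import numpy as np
from PIL import Image

def pixel_histogram(path: str, bins: int = 64) -> np.ndarray:
    """Normalized grayscale intensity histogram of an image."""
    pixels = np.asarray(Image.open(path).convert("L"), dtype=np.float64).ravel()
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 255))
    return hist / hist.sum()

def jensen_shannon(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """Jensen-Shannon divergence: 0 means identical distributions."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

real = pixel_histogram("real_coke.jpg")            # placeholder path
synthetic = pixel_histogram("synthetic_coke.png")  # placeholder path
print(f"JS divergence (domain gap proxy): {jensen_shannon(real, synthetic):.4f}")
```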

Placing these assets in a scene also comes with a challenge of its own: How do the objects deform once you place them into a scene? Hard objects like a Coke bottle barely deform at all, whereas a bag of chips can deform in various ways depending on how it falls into the scene. We built a separate engine that places assets and uses physics to determine their deformation and illumination accurately.
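
As a rough illustration of such physics-based placement, here is a sketch using the off-the-shelf PyBullet engine to drop a rigid asset into a scene and read back a physically plausible resting pose. This is an assumed stand-in, not Neurolabs’ actual engine; the bottle’s URDF file is hypothetical, and deformable objects like chip bags would additionally require soft-body simulation:

```python
# Sketch: drop a rigid product asset into a scene and read back the
# physically plausible resting pose (rigid bodies only; a bag of chips
# would need soft-body simulation instead).
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)  # headless physics simulation, no GUI
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")  # the shelf / floor

# Hypothetical URDF describing the bottle's collision geometry.
bottle = p.loadURDF("coke_bottle.urdf", basePosition=[0, 0, 0.5])

# Step the simulation until the object settles.
for _ in range(240):  # one second at the default 240 Hz timestep
    p.stepSimulation()

position, orientation = p.getBasePositionAndOrientation(bottle)
print("Resting pose:", position, orientation)  # feed this pose to the renderer
p.disconnect()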

Ultimately, our vision is that everyone can digitize real-world objects, e.g., on their phone, which would support many new commercial applications. The sheer amount of research addressing this challenge is mind-boggling. 

How Did You Evaluate Your Startup Idea?

We did several pilot projects early on to benchmark, from a technical perspective, how much better synthetic images could be compared to real-world images. That’s how we learned how much faster, cheaper, and more scalable they are. Having better data leads to better machine learning models—that’s why the data-centric movement treats data as the cornerstone of machine learning.

Still, you need real-world data to adapt to a new domain—the more real-world images you have, the more robustly you can specify your ground truth. And as you serve more clients, you gather more data, so the synthetic data improves too. Looking back, I would have gotten more real-world data sooner, in order to build better synthetic data earlier. Proving something in the lab and proving it in the real world are two very different stories.

Especially since real-world data keeps changing: Coke bottles get promo labels, and it’s our value proposition to update our computer vision models in a matter of hours instead of weeks. No new image capture or painful data annotation is necessary. Once you upload the 3D model of the promo Coke bottle, the whole process is automatic: we create the 3D assets, generate new synthetic images for the promo design, and update the models in a process called few-shot learning: instead of retraining the entire model, we create a new representation for the new design.
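
One common way to realize this kind of few-shot update is a prototype-based classifier: a frozen encoder maps images to embeddings, each product is represented by the mean embedding of a few (synthetic) examples, and onboarding a new packaging design just means adding one more prototype. The sketch below uses a stand-in encoder and random arrays as placeholder images; it illustrates the general technique, not Neurolabs’ specific models:

```python
# Minimal prototype-based few-shot classifier: adding a new product
# means computing one new prototype, not retraining the encoder.
import numpy as np

def encode(images: np.ndarray) -> np.ndarray:
    """Stand-in for a frozen CNN encoder; returns L2-normalized embeddings."""
    flat = images.reshape(len(images), -1).astype(np.float64)
    return flat / np.linalg.norm(flat, axis=1, keepdims=True)

class PrototypeClassifier:
    def __init__(self):
        self.prototypes: dict[str, np.ndarray] = {}

    def add_class(self, name: str, support_images: np.ndarray) -> None:
        """Register a (new) product from a few labeled example images."""
        emb = encode(support_images).mean(axis=0)
        self.prototypes[name] = emb / np.linalg.norm(emb)

    def predict(self, image: np.ndarray) -> str:
        """Assign the class of the nearest prototype by cosine similarity."""
        query = encode(image[None])[0]
        return max(self.prototypes, key=lambda c: query @ self.prototypes[c])

# Usage: random arrays stand in for synthetic renders of each design.
clf = PrototypeClassifier()
clf.add_class("coke_classic", np.random.rand(20, 32, 32, 3))
clf.add_class("coke_promo", np.random.rand(20, 32, 32, 3))  # hours, not weeks
print(clf.predict(np.random.rand(32, 32, 3)))
```

Onboarding the promo design touches only the prototype table; the encoder, and with it the expensive training, stays untouched.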

What Advice Would You Give Fellow Deep Tech Founders?

Talk to your customers early on! As a technical founder, you usually think about the value proposition mainly from a technical perspective. But it needs to be validated with customers, which leads you to discover new things, e.g., how important the onboarding process really is.
