LightOn: Shaping the Future of Large Language Models for Enterprises
For several years, machine learning has been used to generate human-quality text. Yet it was not until ChatGPT that everyone became aware of and excited about this technology and started looking into how to employ large language models to automate tedious tasks like writing emails, document summaries, or marketing material.
Founded in 2016, LightOn started out developing optical chips for machine learning but soon recognized the opportunity around machine learning itself. LightOn now helps enterprises develop and host large language models, keeping all of their data and prompts private and fine-tuning the models for their specific use cases.
Learn more about the future of large language models for enterprises from our interview with the co-founder and co-CEO, Laurent Daudet:
Why Did You Start LightOn?
Before LightOn, I was a university professor with 15 years of experience leading a research group and writing publications. Unfortunately, in academia, a lot of research projects get buried after publication or stuck with the university’s tech transfer office. So we had to take the initiative ourselves to found a startup and use our research findings to build a product that customers would love. That’s important for tech startups: the vision for a new invention has to come from the people who created the technology.
So originally, LightOn developed a photonic co-processor using light to make the computations underlying machine learning very fast. From our back-of-the-envelope calculation, we expected our technology to be a thousand times more efficient than GPUs, and even if it were just one or two orders of magnitude in practice, it would still have been worth pursuing.
Hardware development was slow, and customers were fiddling with their two-year budgets, hesitant to replace GPUs with an optical co-processor. At the same time, large language models emerged, and customers were asking how they could pay us to help them develop and host their own large language models. Ultimately, they didn’t care whether it was a photonic or an electronic chip running their models; they wanted to use these models for their particular use case. So we pivoted, and today we focus fully on helping customers develop and host their large language models.
How Can Enterprises Use Large Language Models?
Thanks to ChatGPT, everyone now knows about large language models: large neural networks trained on large amounts of text that have amazed us with their ability to generate new text, from emails to poems or even scientific papers.
Most enterprise use cases are around customer support and autocompletion, for example, writing emails and answering questions about internal documentation. We help our clients install and host their own proprietary large language models locally – as powerful as GPT but tailored toward use in a business environment. All data and prompts remain private, and our business model is a scalable, license-based one.
Large language models are not only good for autocompletion but can also address many use cases that we haven’t thought of before. One cool application of our large language models is in language tests, assessing how well you speak, for example, French. So we can help not only with the language level assessment itself but also with assessing how professors’ grades vary over time and with the quality control of the assessment.
We started out offering pre-trained models, but now we also fine-tune these models for specific use cases and individual customers. Ultimately, these use cases depend very much on the customers, and we don’t want to focus on just one of them through vertical integration. Our plan is instead to license the tools for model fine-tuning so that sophisticated customers can be autonomous in their AI development.
Research has found that larger models generally perform better. This is known as the scaling hypothesis. But it’s not just about the number of parameters; it’s also about the amount and the quality of the training data, and a lot of content on the internet is redundant. So we developed our own de-duplication mechanism to build a high-quality training corpus, as data quality has a huge impact on performance.
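To give a rough idea of what de-duplicating a text corpus can involve, here is a minimal Python sketch. It is not LightOn's mechanism, which the interview does not describe; it is a generic illustration that drops exact duplicates via a hash of the normalized text and near-duplicates via Jaccard similarity on word shingles. The function names, the shingle size, and the default threshold are all assumptions made for this example.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs, threshold: float = 0.8):
    """Keep the first occurrence of each document, dropping exact and near duplicates.

    This pairwise comparison is quadratic in the number of kept documents,
    so it only suits small corpora (assumption: illustrative use only).
    """
    seen_hashes = set()
    kept = []
    kept_shingles = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate after normalization
        doc_shingles = shingles(doc)
        if any(jaccard(doc_shingles, other) >= threshold for other in kept_shingles):
            continue  # near duplicate of a document we already kept
        seen_hashes.add(digest)
        kept.append(doc)
        kept_shingles.append(doc_shingles)
    return kept
```

Calling `deduplicate(list_of_texts, threshold=0.7)` returns the texts that survive both filters, in their original order. At web scale, the pairwise comparison would typically be replaced by an approximate technique such as MinHash with locality-sensitive hashing.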
There are also several ways to make large language models smarter and better geared toward a particular use case. One is reinforcement learning from human feedback. Another is using knowledge graphs. And it’s not just about training; these models will form the backbone of further layers and rule-based systems built on top of them.
It’s not about ‘artificial general intelligence’, but really about whether these models are able to solve more and more tasks. Soon machine learning models will be capable enough to automate many white-collar jobs that involve a lot of text.
How Did You Evaluate Your Startup Idea?
What we did with optical computing was a world first: we demonstrated the first optical AI accelerator integrated into a French supercomputer. Researchers and R&D departments loved it, but in practice, what people spent on cloud providers was a bigger pain than what they spent on their energy bill.
In the long run, these specialized AI chips will become important, but in the short term, natural language processing is the need our customers have today, and that is where we can help them.
We have a roadmap and can try this and that, but ultimately, a big driver is still talking to customers and building a product based on a large language model that customers want, for example, to organize knowledge within a company. Talking to customers really helps to understand how large language models are changing their lives. Always start from the customer: How could this product be a game-changer for the company? How does it solve a problem in a better way?
What Advice Would You Give Fellow Deep Tech Founders?
First: Do It! Take the leap and found a deep tech startup. It’s an amazing, bumpy journey, and if you feel ready, you should really do it! Think big about your vision, and don’t keep it to yourself: share it with as many people as possible. Then figure out whether it’s just a dream or whether it could meet an actual market demand. Last but not least, hire people smarter than you – and listen to them!
LightOn: The $1,000 GPT-3. Progress usually comes from a steady technology bootstrap…until it doesn’t – Read more on LightOn’s blog about the road towards making the use of large language models affordable in practice.
LightOn AI Meetup 10: “Rethinking Attention with Performers” with Krzysztof Choromanski – Check out the LightOn YouTube channel for more insightful recordings of their AI meetup.
Europeans scramble in AI race – TechXplore article quoting Laurent Daudet: “… the battle for generative AI isn’t over”
Angry Bing Chatbot Just Mimicking Humans, Experts Say – Laurent Daudet in an interview with Agence France-Presse.
LightOn raises $3.3 million for optics-based AI data processing – News article from 2018 by VentureBeat on LightOn’s seed round.