What’s been happening very recently (over the past few years) is that state-of-the-art techniques have started to become accessible to people outside of the research community. They’ve reached the point where they work well enough that you don’t need to be an expert to use them.
There are also now a lot of off-the-shelf models that are already trained and are general-purpose enough that they can be applied to many different problems.
This has been kind of happening under the radar and hasn’t had much attention.
The release of ChatGPT made all this very visible – though I do think we might be reaching peak hype (and maybe already falling into the trough of disillusionment…).
For a bit of history, the current state of the world has been building up for some time.
Deep learning kicked off quite a few years back, driven by the realisation that there was now a lot of data available along with a lot of computing power.
This meant that you could build and train very large neural networks – the “deep” in deep learning comes from the fact that there are many, many layers in the neural network.
There were two problems with building these deep networks before:
Firstly, there just wasn’t enough training data available. This lack of data makes it very hard for a very large network to generalise – so it simply memorises the training data and can’t handle anything new.
Secondly, it takes a very long time to train – there are a lot of calculations required so you need a lot of computing power.
The internet solved the problem of there not being enough data. There were suddenly a lot of pictures being made publicly available and it was fairly easy to gather enough people to manually catalogue them.
Gaming GPUs and cloud infrastructure solved the computation problem. By a happy accident, it turned out that the calculations needed to make amazing 3D games are very similar to the calculations needed to train a neural network.
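To make the "happy accident" a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration: the rotation matrix, the weights and the inputs are all made-up numbers, and real graphics and training code runs this kind of calculation on the GPU over millions of values at once. The point is simply that rotating a point in 3D and running one layer of a neural network both come down to the same matrix-times-vector multiplication.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Graphics: rotate a 3D point by 90 degrees around the z-axis.
rotate_z_90 = [
    [0, -1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
point = [1, 0, 0]
rotated = matvec(rotate_z_90, point)

# Neural network: one layer with 3 inputs and 2 outputs.
# The weights are invented for illustration; training is what would set them.
weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
]
inputs = [1.0, 2.0, 3.0]
pre_activation = matvec(weights, inputs)

# ReLU activation: negative values are clipped to zero.
activations = [max(0.0, x) for x in pre_activation]

print(rotated)      # [0, 1, 0]
print(activations)  # [4.5, 0.0]
```

Both halves call the same `matvec` function, which is why hardware built to do one job quickly turned out to be good at the other.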
A lot of the focus on deep learning has been around computer vision and object detection.
Generative Models: Progress in Image and Text Generation
Over the past few years, we’ve started to see generative models getting better. These are deep models that can generate or transform content.
The image generation models were the first ones to start getting good enough for people to get excited. The initial versions were pretty poor and only really of interest to researchers trying to improve them, but the more recent ones have reached the point where they are usable and can produce amazing results (take a look at the output that Midjourney can generate).
Text generation seemed to be lagging behind a bit. The models felt like fairly crappy toys. You could create fun things but they would quickly start generating drivel.
These text generation and transformation models are called Large Language Models – the “Large” in the name comes from the fact that they are very big and are trained on a lot of data.
What’s interesting about these LLMs is that the training is very simple. They are presented with some text in which one of the words is hidden, and the model is trained to predict what the missing word is, e.g. “The ___ sat on the mat” – the missing word is probably “cat”. (For GPT-style models, the hidden word is simply the next word in the text.)
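As a rough illustration of the "guess the missing word" idea, here is a toy sketch in Python. It is nothing like a real LLM: there is no neural network, the four-sentence corpus is made up, and the function name `fill_mask` is my own invention. It just counts which words have been seen between a given pair of neighbours and picks the most common one.

```python
from collections import Counter, defaultdict

# A tiny made-up "training set".
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
    "the cat slept on the mat",
]

# For each (previous word, next word) pair, count the words seen in between.
context_counts = defaultdict(Counter)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]  # mark start and end
    for prev, word, nxt in zip(words, words[1:], words[2:]):
        context_counts[(prev, nxt)][word] += 1


def fill_mask(sentence, mask="?"):
    """Guess the masked word from the words on either side of it."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    i = words.index(mask)
    candidates = context_counts.get((words[i - 1], words[i + 1]))
    if not candidates:
        return None  # this context never appeared in the corpus
    return candidates.most_common(1)[0][0]


print(fill_mask("the ? sat on the mat"))  # cat
print(fill_mask("the cat sat on the ?"))  # mat
```

A real model replaces the counting table with a very large neural network and the four sentences with a big chunk of the internet, but the training signal is the same kind of thing: hide a word, guess it, check the answer.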
What’s been surprising is that this simple training seems to lead to the model gaining an understanding of language. It also leads to some emergent behaviour and the trained models seem to be capable of doing much more than simply predicting missing words.
OpenAI releasing ChatGPT opened up a can of worms – it was very, very good compared to what had been seen before.
They had trained a very big Large Language Model that generalised well: it could solve a lot of problems that previously would each have needed a very specific model.
They then took this model and fine-tuned it to produce responses that people preferred. This is called Reinforcement Learning from Human Feedback (RLHF) – basically, someone asks the LLM to do something and then they tell it how good the response was.
The clever bit here is that you train another model to do the scoring based on how people would score the response and then you fine-tune the original LLM using this model.
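Here is a very simplified sketch of that scoring-model idea in Python. Everything in it is an assumption made for illustration: the feedback data is invented, the single "word overlap" feature is a crude stand-in for what a real reward model learns, and the names `overlap` and `fit_reward_model` are hypothetical. The fine-tuning step itself is not shown; picking the highest-scoring candidate stands in for it.

```python
def overlap(question, response):
    """Toy feature: fraction of the question's words the response mentions."""
    q = set(question.lower().split())
    r = set(response.lower().split())
    return len(q & r) / len(q)


# Made-up human feedback: (question, response, score out of 10).
feedback = [
    ("how do i reset my password", "click reset my password on the login page", 9),
    ("how do i reset my password", "turtles are great", 1),
    ("what time do you open", "we open at nine", 8),
    ("what time do you open", "please hold", 2),
]


def fit_reward_model(feedback):
    """Fit score = slope * overlap + intercept by ordinary least squares."""
    xs = [overlap(q, r) for q, r, _ in feedback]
    ys = [score for _, _, score in feedback]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda question, response: slope * overlap(question, response) + intercept


reward = fit_reward_model(feedback)

# The learned scorer can now rate responses that no human has looked at.
question = "where is my order"
candidates = ["your order is on its way", "have a nice day"]
best = max(candidates, key=lambda response: reward(question, response))
print(best)  # your order is on its way
```

The useful property is that once the scorer exists, you no longer need a person in the loop for every response, so the original LLM can be tuned against far more examples than people could ever rate by hand.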
They released this fine-tuned model as ChatGPT and everyone went slightly nuts.
A lot of AI researchers were caught slightly on the hop. Meanwhile, OpenAI were busy training their next model and predicting that it would be even better and more capable (GPT-4 – which is very good).
Suddenly it seemed like AGI (Artificial General Intelligence) might be a lot closer than people had been thinking. The worry is that you can just keep making the model bigger and it will keep getting more and more clever – it’s not clear how true this is, but it’s got everyone sweating.
The current AI excitement is pretty much all focused on these LLMs and leveraging what they can do.
I talk to quite a few small start-ups and they are all getting pressure from investors to explain their “AI Strategy” or how they are going to “Embed AI in their business”.
One of the problems with this is that it’s already kind of happening: AI is being embedded in the tools these businesses already use.
In software engineering there are tools like GitHub Copilot – this is like having someone sit next to you as you work, helping you do your job and making useful suggestions. For some people, this is massively increasing their productivity.
Microsoft (and Google) have been adding AI to their products without people noticing.
Most of the CRM systems have also been adding AI functionality, so sales and marketing are getting a boost too.
Some things that LLMs (particularly OpenAI’s) are good at, and this is barely scratching the surface:
– Extracting structured data from unstructured data
– Augmenting data
– Turning very vague requests into concrete actions
– Extrapolating new information from existing data
Any business that deals with data (which is pretty much everyone) should be able to find some problem or process that could be improved.
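To give a feel for the first item in that list, here is a minimal sketch in Python of pulling structured data out of a free-text message. The `call_llm` function is a hypothetical stand-in that returns a canned reply so the example runs on its own; in practice you would swap in a call to whichever LLM service you use. The prompt wording and the field names are also just assumptions.

```python
import json

REQUIRED_KEYS = {"name", "product", "quantity"}


def build_prompt(text):
    """Ask for the fields we want, as JSON, so the reply is easy to parse."""
    return (
        "Extract the customer's name, the product and the quantity from the "
        "message below. Reply with JSON only, using the keys "
        '"name", "product" and "quantity".\n\n'
        f"Message: {text}"
    )


def call_llm(prompt):
    """Hypothetical stand-in for a real LLM call; returns a canned reply."""
    return '{"name": "Sam", "product": "blue widgets", "quantity": 3}'


def extract_order(text):
    """Return the extracted fields as a dict, or None if the reply is unusable."""
    reply = call_llm(build_prompt(text))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None  # the model did not reply with valid JSON
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None  # valid JSON, but not the shape we asked for
    return data


order = extract_order("Hi, it's Sam here, could I get three of the blue widgets please?")
print(order)
```

The checks after the call matter more than they look: the model's reply is just text, so it is worth validating it before anything downstream relies on it.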
Main Points:
- AI techniques have become accessible to non-experts and offer off-the-shelf models for various problems.
- The release of ChatGPT by OpenAI brought attention to the advancements in AI, raising AGI concerns.
- Deep learning has thrived due to the availability of data and computing power, enabling training of large neural networks.
- Generative models, especially LLMs, have shown significant progress in text and image generation.
- LLMs gain an understanding of language through simple training and display emergent behavior.
- OpenAI’s fine-tuned ChatGPT demonstrated impressive problem-solving abilities, fueling excitement about AGI.
- AI integration is already happening in various products and services, enhancing productivity and functionality.
- LLMs offer vast possibilities for improving processes and problem-solving in businesses dealing with data.