ChatGPT’s New Rival: Google’s Gemini
Google has introduced a revamped AI model that is said to outperform ChatGPT. Let’s learn more.
Image by Author
For a while now, ChatGPT has been in the limelight. Everyone is talking about it, and a lot of people are using it. What could possibly go wrong?
Google has always aimed to maintain its reputation as an AI-first company, and so far it has been doing well. However, over the last year, it’s fair to say that OpenAI has taken the lead with ChatGPT, and it was only a matter of time before Google tried to take that lead back.
CEO Sundar Pichai stated that:
One of the reasons we got interested in AI from the very beginning is that we always viewed our mission as a timeless mission.
Introducing Gemini from Google.
If you haven’t already had the chance to look at the trailer, I’d prompt you to watch it here.
What is Gemini?
Gemini is Google's largest language model, which CEO Pichai first teased at a conference in June and which is now officially launching to the public. So what is so great about Gemini, and why does it have ChatGPT shaking in its boots?
Gemini is not just a single AI model. It comes in different variations to meet different demands. For example, there is the lighter version, Gemini Nano, which is designed to run on Android devices. There is also Gemini Pro, which now serves as the backbone of Bard and will be used to power a lot of Google AI services.
But it doesn’t end there. There is also Gemini Ultra, which is Google’s most capable and most powerful LLM yet. Gemini Ultra seems to be designed specifically for data centers and enterprise applications.
A quick breakdown:
- Gemini Ultra - largest and most capable model for highly complex tasks.
- Gemini Pro - best model for scaling across a wide range of tasks.
- Gemini Nano - most efficient model for on-device tasks.
This family of three large language models has been built to understand and operate across different types of information, including text, code, images, audio and video. Multimodality at its finest.
So how good is it?
Google has put a lot of work into rigorously evaluating the Gemini models on a variety of tasks. According to Google, Gemini Ultra exceeded current state-of-the-art results on 30 of the 32 widely used academic benchmarks in LLM research, including a whopping score of 90.0% on the MMLU benchmark.
Image from Google Gemini
Gemini Ultra is reported to be the first model to outperform human experts on MMLU (massive multitask language understanding). MMLU combines 57 subjects, including math, history, law, medicine and physics, to test both world knowledge and problem-solving abilities.
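To make a headline number like this a little more concrete, here is a minimal sketch of how accuracy on a multiple-choice benchmark such as MMLU can be tallied. This is not Google's evaluation code: the function names and the toy answers below are made up for illustration, and published scores also depend on the prompting setup used.

```python
from collections import defaultdict


def accuracy_by_subject(records):
    """Return {subject: fraction correct} for (subject, predicted, correct) tuples."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for subject, predicted, correct in records:
        totals[subject] += 1
        if predicted == correct:
            hits[subject] += 1
    return {subject: hits[subject] / totals[subject] for subject in totals}


def overall_accuracy(records):
    """Return the fraction of all questions answered correctly."""
    records = list(records)
    correct = sum(1 for _, predicted, answer in records if predicted == answer)
    return correct / len(records)


# Hypothetical toy data: each question has one correct choice (A-D).
toy_records = [
    ("math", "A", "A"),
    ("math", "C", "B"),  # the only wrong answer in this toy set
    ("history", "D", "D"),
    ("law", "B", "B"),
    ("medicine", "A", "A"),
]

per_subject = accuracy_by_subject(toy_records)
overall = overall_accuracy(toy_records)
print(per_subject)
print(f"Overall accuracy: {overall:.1%}")
```

A real MMLU run covers thousands of questions across all 57 subjects, but the idea is the same: count the questions answered correctly and divide by the total.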
Looking into these benchmarks, we can see that the biggest advantage that Gemini has is its ability to understand and interact with videos and audio.
We have seen OpenAI aim for this with the creation of DALL-E and Whisper. However, Google went one step further by building a multimodal model from the beginning. Google also highlighted improvements in coding with a new code-generating system called AlphaCode 2, which is said to perform better than an estimated 85% of coding competition participants.
That being said, benchmarks are just benchmarks. We will only understand Gemini's full capabilities once everyday users interact with it.
If you would like to learn more about the capabilities of Gemini, watch this video:
How to Access Gemini
Pixel 8 Pro users may have already seen some new features, such as auto-summarisation in the Recorder app and Smart Reply in the Gboard keyboard, thanks to Gemini Nano.
If you’re eager to try out Gemini Pro, you can do so now with Bard. Developers and enterprise customers will also be able to access Gemini Pro through Google AI Studio or Vertex AI in Google Cloud from December 13th.
If you’re intrigued by Gemini Ultra, you may have to wait a little longer, as it will be available next year.
It is worth noting that Gemini is currently available only in English. More languages are expected to follow, and CEO Pichai stated that the company aims to integrate the model into Google’s search engine, ad products, the Chrome browser, and more.
Wrapping it Up
This is looking like Google’s time to take back the crown and show us why they were at the forefront of AI innovation. What do you think will pop up next?
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice, tutorials, and theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence can benefit the longevity of human life. She is a keen learner, seeking to broaden her tech knowledge and writing skills whilst helping guide others.