
According to Google, Gemini promises to surpass OpenAI's GPT-4 technology in terms of text generation, natural language understanding and problem-solving capabilities, revolutionizing the way we search and find information on the web.
The technology multinational launched its new AI model, called Gemini, on December 6, stating that it is capable of learning from a wide variety of information sources, adapting to different contexts and domains, and generating coherent and relevant responses to any query.
Google Gemini is the result of several years of research and development at the Google DeepMind laboratory, where the latest deep learning techniques, neural networks and natural language processing have been applied. The company says that its new AI model is the most intelligent and powerful on the market to date, and that it represents a great qualitative leap in the AI race, where it competes with other companies such as OpenAI, Meta and Microsoft.
Gemini is currently available in three versions: Ultra, Pro and Nano. According to Google, it can be tested through its chatbot Bard, which rivals OpenAI's ChatGPT.
What is Google Gemini?
Gemini is an artificial intelligence (AI) model based on deep neural networks, which can process information of different types and sources, such as text, images, audio or video. In addition, Gemini can understand the context and purpose of a query, and generate relevant and complete answers, even if the query is complex or ambiguous.
According to Demis Hassabis, CEO and co-founder of Google DeepMind, Gemini is “the most capable and general model we have ever built”.
If you ask Gemini “What do I need to climb Mount Fuji in winter?”, the AI model could analyze information from different websites, blogs, videos or images on the topic and give you a list of tips, equipment, routes and precautions to take into account when undertaking the activity. In addition, the AI could compare Mount Fuji with other mountains you have climbed before and suggest how to adapt your previous experience to the conditions on Mount Fuji.
In this regard, Hassabis emphasized that Gemini was designed and built from scratch to be multimodal, which means that “it can generalize and understand, operate and combine, without problems, different types of information.”
How does Google's new AI work?
Gemini is able to do all this because it uses an architecture called Transformer, which allows it to learn multiple tasks and skills at the same time, such as natural language understanding, text generation, image recognition, translation or speech synthesis. In this way, Gemini can transfer what it learns from one task to another, and improve its performance over time.
According to Google, Gemini is 1,000 times more powerful than its predecessor, BERT, which was already able to understand the meaning of words based on context.
With Gemini, Google aims to create an intelligent assistant that can answer any type of query, and that can interact with everyone in a natural and fluid way.
Natively multimodal AI with next-generation performance
Gemini is not only an intelligent and creative chatbot, but it also has advanced programming capabilities, including generating high-quality code. This AI can also solve complex programming problems and collaborate with developers.
According to Google, the new model scored 90% on the Massive Multitask Language Understanding (MMLU) benchmark, outperforming human experts. MMLU uses a combination of 57 subjects, including math, physics, history, law, medicine and ethics, to test both world knowledge and problem-solving skills, the company said.
“Our new benchmark approach for MMLU allows Gemini to use its reasoning capabilities to think more carefully before answering difficult questions.”
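To make the 90% figure more concrete, here is a minimal Python sketch of how a multiple-choice benchmark spanning several subjects can be scored. It is not Google's evaluation code: the function names and the sample answers are hypothetical, and it assumes each question has a single correct option and that the headline score is plain accuracy.

```python
from collections import defaultdict


def score_by_subject(results):
    """Return accuracy per subject.

    `results` is a list of (subject, predicted_choice, correct_choice)
    tuples. This layout is an assumption made for the example.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for subject, predicted, answer in results:
        totals[subject] += 1
        if predicted == answer:
            correct[subject] += 1
    return {subject: correct[subject] / totals[subject] for subject in totals}


def overall_accuracy(results):
    """Return the share of all questions answered correctly."""
    if not results:
        raise ValueError("no results to score")
    hits = sum(1 for _, predicted, answer in results if predicted == answer)
    return hits / len(results)


# Made-up answers for illustration only; these are not real benchmark data.
sample = [
    ("physics", "A", "A"),
    ("physics", "C", "B"),
    ("law", "D", "D"),
    ("law", "B", "B"),
    ("ethics", "A", "A"),
]

per_subject = score_by_subject(sample)
overall = overall_accuracy(sample)
print(per_subject)
print(f"Overall accuracy: {overall:.0%}")
```

A per-subject breakdown like this shows whether a high overall score hides weak areas, which is why benchmarks covering many subjects are often reported both ways.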
AI expert Rowan Cheung had said that Gemini was the most powerful chatbot available on the market. However, after Google admitted that it had staged the initial Gemini demo, he questioned the company’s transparency and the capabilities of this AI model. “Was it just a simple PR miscommunication, or is it further behind ChatGPT than initially thought?” Cheung asked.
Despite this, other experts believe that Gemini could become the foundation that further deepens the integration of AI into everyday tasks and activities.