The world is witnessing a remarkable surge in the realm of artificial intelligence (AI), with a growing demand for cutting-edge, customer-centric AI techniques that enhance efficiency and productivity.
Among the various domains gaining widespread popularity, natural language processing (NLP) stands out prominently. The advent of commercially successful models such as ChatGPT has revolutionised how people across the globe harness the power of language models for an array of tasks.
From ensuring impeccable grammar to crafting intricate production-ready algorithms, ChatGPT has proven to be a versatile and indispensable tool.
These language models are transforming the landscape of repetitive tasks, significantly reducing time consumption and providing ample space for fostering creativity and ideation. Moreover, they serve as a valuable second opinion, offering meticulous proofreading and error-checking capabilities, catching elusive mistakes often overlooked by human eyes.
The explosion of language models has prompted tech giants to race toward innovation, releasing successive iterations of large language models.
Subsequently, the second wave of innovation has ushered in advanced enhancements, such as improved question-answering abilities, contextual comprehension, real-time information retrieval from the internet, and even image generation based on user prompts. What was once considered theoretical speculation has now become a tangible, almost magical reality.
As AI and product professionals, we’re continually amazed by the expanding possibilities within the field. Leveraging Generative AI, we can enhance chatbots, streamline document summaries, create supporting content, and even generate visual product designs. The transformative potential of Generative AI in our work is truly remarkable, and we anticipate even greater innovations ahead.
Sumedha Rai, a Senior Data Scientist and AI researcher based out of New York, and Praneel Midha, a sophomore at the University of Illinois at Urbana-Champaign studying Systems Engineering & Design and currently working with Alcon Inc in Georgia, conducted a comprehensive performance analysis comparing two prominent large language models: OpenAI’s ChatGPT and Google’s Gemini.
They assessed the models’ effectiveness across diverse subjects by analysing their responses to multiple prompts.
COST ANALYSIS
The pricing for both LLMs looks very similar after Bard’s recent rebranding to Gemini.
Plan | Price |
ChatGPT (3.5 only) | Free |
ChatGPT Plus (3.5 and 4) | $20/month |
Gemini | Free |
Gemini Advanced | $19.99/month |
ENGLISH AND GENERATIVE AI
When it comes to rectifying fundamental grammatical errors, both models exhibit similar performance. Gemini does a slightly better job of explaining the corrections made. The presentation of the answer is also more readable.
Next, they tested the creative skills of these LLMs by supplying both with similar keywords and prompts and asking each to generate a short story.
Both narratives exhibit exceptional craftsmanship, characterised by well-structured compositions featuring introductions, subtle climaxes, and satisfying conclusions. Their descriptive prowess enhances the overall impact, showcasing the remarkable ability of AI to weave coherent narratives from seemingly disparate prompts. Notably, both pieces surpass the prescribed word limit of 100, with ChatGPT exceeding it by a mere 4 words and Gemini by over 100.
Yet Gemini efficiently trims its narrative upon user request through the ‘Modify response’ feature, demonstrating adaptability and responsiveness to user needs. This disregard for word limits may reflect a preference for storytelling over strict adherence to instructions, raising questions about AI’s creative freedom and our expectations of human-made versus machine-generated narratives.
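The word-limit comparison above can be reproduced with a simple whitespace-based count. The sketch below is a minimal illustration in Python, not the authors’ method; the function name `word_overrun` and the sample text are hypothetical, and other counting rules (for example, for hyphenated words) would give slightly different totals.

```python
def word_overrun(text: str, limit: int = 100) -> int:
    """Return how many words `text` exceeds `limit` by (0 if within the limit)."""
    # str.split() with no arguments splits on any run of whitespace.
    word_count = len(text.split())
    return max(0, word_count - limit)


# Hypothetical stand-in for a generated story; not output from either model.
sample_story = " ".join(["word"] * 104)
overrun = word_overrun(sample_story)
print(f"Over the limit by {overrun} words")
```

Pasting each model’s story in place of `sample_story` gives a quick, like-for-like check of how far each response strays from the requested length.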
In their assessment, ChatGPT serves as an exceptional writing companion, adeptly paraphrasing text while grasping the subtleties of language. It proficiently handles spell-checking and grammar correction, offering polished rephrases. Conversely, Gemini tries to take creative liberties and embellishes your text, occasionally introducing content that diverges from the original intent.
Their verdict:
ChatGPT: Precise and direct, maintaining main ideas.
Gemini: Inventive, with versatile storytelling.
IMAGE GENERATION
Presently, both ChatGPT 4.0 and Gemini (including the freely accessible version) possess image generation capabilities, a feature absent in ChatGPT 3.5.
While Gemini’s image generation capabilities are still subject to scrutiny, the ability to produce images using a freely available model remains intriguing. They then asked ChatGPT 4.0 and Gemini to generate images and compared the results.
Prompt: Generate an image of AI in neurosurgery
DALL·E, the image model behind ChatGPT, produces captivating and futuristic images with meticulous attention to detail, seamlessly transforming imaginative concepts into visually stunning creations. Conversely, Gemini’s generated images lean towards a more humanised aesthetic, exhibiting a less futuristic tone and a comparatively lower level of detail.
While Gemini offers the advantage of generating four images per request, expanding user choices, the output tends to lack the same degree of innovation and creative flair seen in DALL·E’s outputs. This distinction likely arises because DALL·E’s training focuses intensely on text and image pairs for image generation, endowing it with a superior capability in this area. In contrast, Gemini’s training, though more diverse, covering text, images, audio, and more, might not offer the same level of specialisation in image generation as DALL·E.
Their Verdict:
ChatGPT: Detailed, imaginative imagery, often favouring creativity over human elements in visuals.
Gemini: Offers a variety of images in a collage framework, and often emphasises human involvement in creation.
OVERALL USER EXPERIENCE
Appearance: Both AI models offer standard light and dark themes, featuring a left panel that showcases the conversation history with the LLM, while the main page remains central. ChatGPT enhances the user experience with its streamlined interface: compact line spacing and minimal indents make conversations easier to scroll through, significantly boosting readability.
Conversely, Gemini attempts to achieve a cleaner aesthetic by increasing white space and line spacing. However, this design choice inadvertently diminishes readability, particularly in the context of chat-based conversations.
Ability to ‘Modify response’: Gemini has a ‘Modify response’ button which allows it to shorten or lengthen the response as per user requirements. This feature also enables Gemini to alter the tone of the response to make it sound simpler, more casual, or more professional.
They found these features especially useful while writing articles or drafting emails. With ChatGPT, you can either enter a follow-up command to edit the length and tone, or you can choose to regenerate the response altogether. However, Gemini lets you do the same in very few clicks, without entering a new command.
A second opinion: Gemini also has a ‘Show drafts’ feature which essentially shows you other responses that it prepared but ultimately rejected in favour of the response shown. They found this feature especially useful when approaching a problem from different angles.
Easy share: Gemini also lets you export the chats to a Google Doc, which we feel is extremely time-efficient. Instead of copying and pasting and editing the formatting, you can simply export it to a Google Doc without any hassle.
Speed and Latency: ChatGPT 3.5 boasts remarkable speed, outpacing Gemini in delivering answers for queries that don’t necessitate internet access, effectively operating in ‘turbo’ mode. While ChatGPT 4 introduces a delay due to its real-time internet browsing for enriched responses, its performance remains efficient for standard inquiries.
Conversely, Gemini’s noticeable response latency can be seen as a limitation.
Their Verdict:
ChatGPT 3.5: Lightning-fast, doesn’t always need the latest information.
Gemini: Internet-dependent, speed takes a backseat.
CONCLUDING THOUGHTS
While both LLMs offer significant benefits, ChatGPT provides a straightforward and expansive experience, excelling particularly in conversational AI, contextual understanding, and rapid output. Gemini, as a newer entrant, prioritises aesthetics while incorporating numerous useful and engaging features and gathering valuable feedback.
Moreover, ChatGPT is likely to enjoy a degree of customer loyalty by now. We have both been ardent users of ChatGPT for over a year. Adjusting to a new user interface poses its own challenges for customers, notwithstanding the comparable features of both products.
OpenAI’s early entry into the market could grant it an advantage over Google’s Gemini. Notably, though, Gemini’s integration with various Google services, such as Docs, suggests the potential for further expansion into Sheets, Colab, and Drive, which could significantly enhance its ecosystem and might substantially increase adoption. The landscape of LLMs remains vast and exploratory, with ample room for innovation.
The market is far from saturated, with major tech players racing to develop increasingly advanced models, as Meta’s launch of Llama 3 shows. We expect many more LLMs to enter the market. However, success will hinge on finding a unique niche and a strong unique selling proposition (USP). Multiple models could coexist, much like analogous products from major competitors. Alternatively, market integration could propel one to dominance. Time alone will determine the ultimate victor!