ChatGPT Introduces Video and Screenshare Features

OpenAI has updated ChatGPT’s Advanced Voice Mode (AVM) with video and screensharing capabilities. The mode originally supported only audio interaction, but with the latest version users can point their phone cameras at something and let ChatGPT "see" what they see. This long-anticipated feature, first teased with the launch of GPT-4o in May 2024, adds a new layer of interaction, taking the model beyond text and voice conversations.

Live Demo: ChatGPT Assists with Making Coffee

In a recent livestream, OpenAI's Chief Product Officer Kevin Weil and other team members showcased the new features in action. During the demo, ChatGPT helped the team make pour-over coffee, using the phone’s camera to follow along. As the camera focused on each action, the model recognized what was happening and guided the team through the next step, demonstrating its ability to understand and assist with visual tasks.

The team also demonstrated screensharing, with ChatGPT reading and interpreting a message open on a phone. Weil wore a Santa Claus beard during the demonstration, adding a fun touch.

Competition and New Developments: Google’s Gemini 2.0

This release from OpenAI comes shortly after Google unveiled its Gemini 2.0 model, which also integrates visual and audio processing capabilities. Google is positioning Gemini 2.0 as a foundation for agents that carry out complex tasks on behalf of users: Project Astra (a universal assistant), Project Mariner (an agent that completes tasks in the browser), and Jules (a coding agent for developers). Despite the competition, ChatGPT stood out in the demo for identifying objects in real time and adapting to interruptions.

Santa AI Has Arrived!

To add a festive touch, OpenAI has included a Santa Claus voice option in ChatGPT's new voice mode. With this feature, users can talk to a ChatGPT version of Santa, who speaks in a deep, jolly voice with plenty of "ho-ho-hos." The feature can be accessed by tapping the snowflake icon within the app. OpenAI has added an age restriction: the voice option is only available to users aged 13 or older. Whether the real Santa Claus's voice was used to train the model remains a mystery.

Availability and Next Steps

Starting today, video and screensharing features are available to ChatGPT Plus and Pro users. Enterprise and Edu users are set to get access in January 2025. With this update, ChatGPT can take on tasks that depend on what the user is seeing, not only on what they say or type.

Get ready for a more immersive experience with ChatGPT, where voice and vision work together more closely than ever before!
