OpenAI has launched upgrades for ChatGPT that enable voice and image interactions, marking a step toward its goal of artificial general intelligence. The enhanced ChatGPT Plus now includes voice chat powered by a human-like text-to-speech model and the ability to discuss images using integrated image understanding models. These features draw on OpenAI’s GPT-4 with vision (GPT-4V) and represent key components of the multimodal version of GPT-4.
The upgrade follows the unveiling of DALL-E 3, OpenAI’s advanced text-to-image generator, which will also be integrated into ChatGPT Plus. Microsoft, OpenAI’s largest backer, is building OpenAI’s models into its own products, furthering the development of AI assistants that can perceive the world more like humans do. OpenAI plans to make the new functionality available to Plus and Enterprise users over the next two weeks, with access for developers to follow. Competition in the AI industry is intensifying: Google has announced its own multimodal LLM, Gemini.