What Is Multimodal AI?

How-To Geek

By Tim Brookes

Published 4 days ago

There’s a new AI buzzword in town.

Multimodal AI combines multiple types of input (text, images, audio, sensor data) to produce better results and enable more advanced applications.
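Because it draws on more than one type of input, multimodal AI has more information to work with than a single-input model and can associate different inputs with one another to produce better outcomes.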
Examples of multimodal AI models include Google Gemini, OpenAI's GPT-4V, Runway Gen-2, and Meta ImageBind.