Multimodal
Module: fundamentals
What it is
Multimodal AI can process and generate more than one type of content: not just text, but also images, audio, and video. For example, a multimodal model might accept an uploaded image and describe it, or generate an image from a text description. A single model handles several modalities, though which ones it accepts as input and which it can produce as output vary from model to model.
Why it matters
Multimodal capabilities broaden the range of tasks AI can help with. You can share a screenshot for debugging, upload a document for analysis, or describe an image you want created. Knowing which modalities a model supports tells you what inputs it can accept and what outputs it can produce.
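As a rough illustration of checking inputs against what a model accepts, here is a minimal Python sketch. Everything in it is hypothetical: the modality names, the ModelCapabilities class, the unsupported_inputs helper, and the "example-vision-chat" model are invented for this example and do not describe any real product or API.

```python
from dataclasses import dataclass

# Hypothetical modality labels, used only for this illustration.
TEXT, IMAGE, AUDIO, VIDEO = "text", "image", "audio", "video"


@dataclass(frozen=True)
class ModelCapabilities:
    """Which modalities a (made-up) model accepts and produces."""
    name: str
    inputs: frozenset
    outputs: frozenset


def unsupported_inputs(model, parts):
    """Return, sorted, the modalities in a request that the model cannot accept."""
    requested = {modality for modality, _ in parts}
    return sorted(requested - model.inputs)


# An invented model that reads text and images but only writes text.
vision_chat = ModelCapabilities(
    name="example-vision-chat",
    inputs=frozenset({TEXT, IMAGE}),
    outputs=frozenset({TEXT}),
)

# A request that mixes modalities: a question plus a screenshot.
request = [
    (TEXT, "Why does this page fail to load?"),
    (IMAGE, "screenshot.png"),
]

print(unsupported_inputs(vision_chat, request))
print(unsupported_inputs(vision_chat, request + [(AUDIO, "note.wav")]))
```

The first call prints an empty list because the model accepts both text and images; the second reports "audio" as unsupported. Keeping inputs and outputs as separate sets reflects that a model may read a modality it cannot generate.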