Vision / Image Input
Module: beginner tips
What it is
Vision or image input is the ability for AI to understand and describe images. You can upload screenshots, photos, diagrams, or any image and ask questions about it. Multimodal models can "see" the image and discuss its contents, extract text, or identify objects.
Why it matters
Vision dramatically expands what you can ask AI about. Share a screenshot of an error message, a photo of handwritten notes, a chart you need explained, or a design you want feedback on. The AI understands visual content just like it understands text.