From the course: Introduction to Multimodal Prompting for Generative AI
Gemini video inputs
- [Instructor] Take a look at this video. Although a video is technically a series of images, it can often represent things that a still image cannot. Google's Gemini 1.5 does quite an impressive job of reasoning about video inputs. Let's head over to AI Studio and check this out. Now, if you're using Gemini through the chat interface, you can add YouTube videos using their URLs. Here, on the other hand, you can insert video files, so I'm going to select Video. I'll click Upload, then Browse, then magic, and I'll go ahead and select this file. After the video is processed, I'm told that this particular clip will take up 2,065 tokens of my context window, and I'm ready to add my prompt: "In this video, the ball does not vanish. What object does disappear? Why is that surprising?" Gemini is able to tell me that the object that disappears is the magic wand, and that this is surprising because I clearly hold the wand in one hand and the ball in the other. So the attention…
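
The token count shown after upload can be roughly anticipated before you upload a clip. The sketch below is only an estimate under assumed rates that the video does not state: one sampled frame per second, 258 tokens per frame, and 32 tokens per second of audio. The function name `estimate_video_tokens` and the constants are hypothetical, so check the current Gemini documentation for the real figures.

```python
# Assumed rates, not taken from the transcript: verify against current docs.
FRAMES_PER_SECOND = 1          # assumed frame sampling rate
TOKENS_PER_FRAME = 258         # assumed token cost of one sampled frame
AUDIO_TOKENS_PER_SECOND = 32   # assumed token cost of one second of audio


def estimate_video_tokens(duration_seconds: int, has_audio: bool = True) -> int:
    """Return a rough context-window cost for a clip of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Frames contribute a fixed cost each; audio is billed per second.
    tokens = duration_seconds * FRAMES_PER_SECOND * TOKENS_PER_FRAME
    if has_audio:
        tokens += duration_seconds * AUDIO_TOKENS_PER_SECOND
    return tokens


if __name__ == "__main__":
    # Print estimates for a few clip lengths, with and without audio.
    for seconds in (8, 60, 600):
        silent = estimate_video_tokens(seconds, has_audio=False)
        with_audio = estimate_video_tokens(seconds)
        print(f"{seconds:>4}s  silent={silent}  with_audio={with_audio}")
```

Under these assumptions, a silent eight-second clip comes to 2,064 tokens, which is close to the 2,065 reported above, although the actual length of the clip is not given in the walkthrough.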