From the course: Multimodal Prompting with Google's Project Gemini

Exploring Gemini's roadmap

- Gemini isn't just one model, but a whole family of models that come in different types and with varying capabilities. One of the things that makes Gemini unique is that it was trained from the start to process different file types simultaneously, and it can make inferences from all of the information available. For video, it takes a recording and converts it to a series of stills that it can process, but for audio, it's able to understand the data natively instead of first converting it to text to feed in as a prompt. It can also output responses with text and images. So potentially, you should be able to record some audio of a bird you hear in the woods and ask it to show you a picture of what that bird looks like. Not all of these capabilities and features are currently available, but they are rolling out with different versions of the models over time. There are three main versions of Gemini: Nano, Pro, and Ultra. They…
