From the course: Introduction to Multimodal Prompting for Generative AI
Textual and auditory modality
- [Instructor] Text as a modality has been front and center in the way many people have discovered the power of generative AI. There are GPT-based systems such as ChatGPT and Copilot, as well as Google's Gemini and Anthropic's Claude. These systems largely began by receiving text and producing text, and were widely used for text generation and question answering. Some of these models are used for what's called "semantic similarity search," where we look for similar text based on meaning, rather than trying to match words. Many of these text-based systems now support tasks that leverage multimodality, and we'll have a look at this very soon. Finally, there's audio as a modality, and we now have systems that can take a text-based prompt requesting some music and produce music based on that prompt. There's also speech synthesis and enhancement, and there's speech recognition. Both of these have been around for a while, but have really improved lately. Finally, there's…
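
To make the idea of semantic similarity search mentioned in the transcript a little more concrete, here is a minimal Python sketch. It is not taken from the course. The three-dimensional vectors are hand-written stand-ins for the embeddings a real model would produce, and the names `TOY_EMBEDDINGS`, `QUERY_VECTOR`, `cosine_similarity`, and `rank_by_similarity` are hypothetical, chosen only for illustration. The point it shows is that texts are compared by the direction of their vectors (cosine similarity), not by shared words.

```python
import math

# Assumption: hand-written toy vectors standing in for real model embeddings.
# In practice an embedding model would produce much longer vectors.
TOY_EMBEDDINGS = {
    "Steps to recover account access": [0.9, 0.1, 0.0],
    "Tuning a guitar by ear": [0.1, 0.9, 0.1],
    "Best pasta recipes for dinner": [0.0, 0.1, 0.9],
}

# Assumption: toy vector for the query "I forgot my password".
QUERY_VECTOR = [0.8, 0.2, 0.1]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, embeddings):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [
        (text, cosine_similarity(query_vector, vector))
        for text, vector in embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for text, score in rank_by_similarity(QUERY_VECTOR, TOY_EMBEDDINGS):
        print(f"{score:.3f}  {text}")
```

With these toy vectors, the account-recovery text ranks first for the password query even though the two share no words, which is the behavior the instructor describes as matching on meaning rather than on words.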