From the course: Introduction to Multimodal Prompting for Generative AI

Textual and auditory modality

- [Instructor] Text as a modality has been front and center in the way many people have discovered the power of generative AI. There are GPT-based systems such as ChatGPT and Copilot, as well as Google's Gemini and Anthropic's Claude. These systems largely began by receiving text and producing text, and were widely used for text generation and question answering. Some of these models are used for what's called "semantic similarity search," where we look for similar text based on meaning, rather than trying to match words. Many of these text-based systems now support tasks that leverage multimodality, and we'll have a look at this very soon. Then there's audio as a modality: we now have systems that can take a text-based prompt requesting some music and produce music based on that prompt. There's also speech synthesis and enhancement, and there's speech recognition. Both of these have been around for a while, but have really improved lately. Finally, there's…
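
To make the idea of semantic similarity search a little more concrete, here is a minimal sketch in Python. It is not taken from the course. The sentences and their three-number vectors are hypothetical and hand-assigned purely for illustration (the three axes loosely stand for vehicles, food, and weather); a real system would obtain such vectors from a trained embedding model. The helper names `cosine_similarity` and `most_similar` are likewise made up for this example.

```python
import math

# Hypothetical, hand-assigned "embeddings" (axes: vehicle, food, weather).
# Assumption: a real system would compute these with a trained embedding model.
TOY_EMBEDDINGS = {
    "I need to fix my car": [0.9, 0.1, 0.0],
    "Where can I get my automobile repaired?": [0.85, 0.05, 0.1],
    "Best recipe for apple pie": [0.0, 0.95, 0.05],
    "Will it rain tomorrow?": [0.05, 0.0, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, corpus):
    """Return the text in corpus (other than query) closest in meaning to query."""
    query_vec = corpus[query]
    scores = {
        text: cosine_similarity(query_vec, vec)
        for text, vec in corpus.items()
        if text != query
    }
    return max(scores, key=scores.get)


best = most_similar("I need to fix my car", TOY_EMBEDDINGS)
print(best)
```

With these toy vectors, the closest match to "I need to fix my car" is the "automobile" sentence, even though the two share almost no words. That is the sense in which the search works on meaning rather than on word matching.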
