From the course: Building a Project with the ChatGPT API


Turn audio into text using Whisper



- Do you want to learn about the most robust and accurate automatic speech recognition system on the planet? We're talking about a model trained on 680,000 hours of multilingual and multitask data collected from the web. Through OpenAI's Whisper API, you can turn audio or speech into text. It provides easy access to the open source speech-to-text Whisper model. The model takes in an audio file and performs transcription in multiple languages, as well as translation from those languages into English. Whisper accepts certain input file types, such as MP3 and MP4, among others. There are nine models in the Whisper family to choose from, with different sizes and capabilities available. The Whisper model is a transformer-based encoder-decoder model, also called a sequence-to-sequence model. I encourage you to pause this video and learn more about sequence-to-sequence learning to deepen your understanding of machine…
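To make the transcription and translation steps described above concrete, here is a minimal Python sketch. It is not code from the course. It assumes the official `openai` package (version 1 or later) is installed, that an `OPENAI_API_KEY` environment variable is set, and that the hosted model is named `whisper-1`. The helper names `is_supported_audio` and `transcribe` are hypothetical, and the extension list is an assumption that should be checked against the current API documentation.

```python
from pathlib import Path

# Assumed set of accepted file extensions; verify against the current API docs.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is in the assumed supported set."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe(path: str, translate_to_english: bool = False) -> str:
    """Send an audio file to the Whisper API and return the resulting text.

    With translate_to_english=True, the translations endpoint is used, which
    returns English text for non-English speech.
    """
    if not is_supported_audio(path):
        raise ValueError(f"Unsupported audio file type: {path}")

    # Imported here so the extension check works without the openai package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio_file:
        if translate_to_english:
            result = client.audio.translations.create(
                model="whisper-1", file=audio_file
            )
        else:
            result = client.audio.transcriptions.create(
                model="whisper-1", file=audio_file
            )
    return result.text


# Example usage ("interview.mp3" is a placeholder file name):
# print(transcribe("interview.mp3"))
# print(transcribe("interview.mp3", translate_to_english=True))
```

The extension check runs before any network call, so an unsupported file fails fast without using API credits.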
