BERT for question/answering
Let's turn our attention to something a little bit more complex: question answering. The task on its face is quite simple. Given a question and some context, can BERT extract the answer as a direct substring, a span, of the context? That is extractive question answering. An abstractive answer, by contrast, might be something like "a great guy": the phrase "a great guy" does not appear anywhere in the context, but it is based on the context itself. BERT's architecture for question answering leans more on the extractive side; we will talk about abstractive question answering when we get to our later transformer-based models like GPT. For BERT, extractive question answering is formulated as follows. Starting from the bottom, we feed into a pre-trained language model, a pre-trained BERT, a question and a context separated by the SEP token. Recall that the SEP token is meant to separate a sentence A and a sentence B. Just as BERT was pre-trained on next sentence prediction, we reuse that formula of a sentence A and a sentence B,…
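To make that formulation concrete, here is a minimal sketch of extractive question answering with a BERT checkpoint fine-tuned on SQuAD, using the Hugging Face transformers library. The checkpoint name, question, and context below are illustrative placeholders rather than anything from the course; any BERT question-answering checkpoint would work the same way.

```python
# Minimal sketch: extractive question answering with a SQuAD-fine-tuned BERT.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Example checkpoint; swap in any BERT QA checkpoint you prefer.
model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "Where do I live?"
context = "My name is Sinan and I live in California."

# Question and context are packed into one sequence separated by [SEP]:
# [CLS] question tokens [SEP] context tokens [SEP]
inputs = tokenizer(question, context, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# The model scores every token as a possible start and end of the answer;
# the predicted answer is the substring of the context between the two.
start_idx = int(outputs.start_logits.argmax())
end_idx = int(outputs.end_logits.argmax())
answer_tokens = inputs["input_ids"][0][start_idx : end_idx + 1]
print(tokenizer.decode(answer_tokens))  # e.g. "california"
```

Note that the answer is always a span copied out of the context, which is exactly what makes this extractive rather than abstractive question answering.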