From the course: Introduction to Transformer Models for NLP
BERT for sequence classification
- Section 6.2, BERT for sequence classification. Let's get started fine-tuning our first BERT model using our own dataset. Recall that when we're doing sequence classification, we'll approach it by adding a feedforward layer on top of the pooler layer, which has been pre-trained on the next sentence prediction task, to predict however many labels we have. It's worth noting that this is only one way to structure sequence classification with BERT, but it is also the most common way, most likely because it's the easiest and because the transformers library ships a prebuilt class for it. Other ways of performing classification include averaging several encoders' hidden states and using those as the vectors instead of the pooler output, but we are going to use the more common approach of a feedforward layer after the pooler output, as sketched below. So let's go ahead and get right into the code. Now we can get our hands-on look at BERT for sequence classification. To get started, let's go…
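The prebuilt class referred to here is BertForSequenceClassification from the Hugging Face transformers library, which attaches a classification head to the pooled [CLS] output. Below is a minimal sketch of loading that class and running a forward pass; the checkpoint name, the example sentence, and the three-label setup are illustrative assumptions, not necessarily the dataset or label count used later in the course.

```python
# Minimal sketch: BERT with a feedforward classification head on the
# pooler output, using the Hugging Face transformers library.
# The checkpoint and num_labels=3 are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load a pre-trained BERT encoder plus a freshly initialized
# classification head on top of the pooled [CLS] representation.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,  # hypothetical number of classes
)

# Tokenize a sample sentence and run a forward pass without gradients.
inputs = tokenizer("Transformers make fine-tuning easy.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits has shape (batch_size, num_labels). The head is still
# untrained here, so the argmax is arbitrary until fine-tuning.
predicted_class = outputs.logits.argmax(dim=-1)
print(predicted_class)
```

During fine-tuning, these logits are compared against the true labels (for example by passing a labels tensor into the model so it returns a cross-entropy loss) and the whole network, encoder and head together, is updated by backpropagation.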