From the course: Introduction to Transformer Models for NLP
BERT for sequence classification
- Section 6.2, BERT for sequence classification. Let's get started fine-tuning our first BERT model using our own dataset. Recall that when we're doing sequence classification, we'll approach it by adding a feedforward layer on top of the pooler layer, which has been pre-trained on the next sentence prediction task, to predict however many labels we have. It's worth noting that this is only one way to structure sequence classification with BERT, but it is also the most common way, most likely because it's the easiest and because the transformers library ships a prebuilt class for it. Other ways of performing classification include averaging several encoders' hidden states and using those as the vectors instead of the pooler output, but we are going to use the more common approach of a feedforward layer after the pooler output, as sketched below. So let's go ahead and get right into the code. Now we can get our hands-on look at BERT for sequence classification. To get started, let's go…
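The prebuilt class referred to here is BertForSequenceClassification from the Hugging Face transformers library, which attaches a classification head to the pooled [CLS] output. Below is a minimal sketch of loading that class and running a forward pass; the checkpoint name, the example sentence, and the three-label setup are illustrative assumptions, not necessarily the dataset or label count used later in the course.

```python
# Minimal sketch: BERT with a feedforward classification head on the
# pooler output, using the Hugging Face transformers library.
# The checkpoint and num_labels=3 are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load a pre-trained BERT encoder plus a freshly initialized
# classification head on top of the pooled [CLS] representation.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,  # hypothetical number of classes
)

# Tokenize a sample sentence and run a forward pass without gradients.
inputs = tokenizer("Transformers make fine-tuning easy.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits has shape (batch_size, num_labels). The head is still
# untrained here, so the argmax is arbitrary until fine-tuning.
predicted_class = outputs.logits.argmax(dim=-1)
print(predicted_class)
```

During fine-tuning, these logits are compared against the true labels (for example by passing a labels tensor into the model so it returns a cross-entropy loss) and the whole network, encoder and head together, is updated by backpropagation.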