Intro to Sentiment Analysis
“FAST, NEAT, AVERAGE, FRIENDLY, GOOD, GOOD” was the author’s first sentiment.
aka Opinion Mining
 Sentiment analysis is opinion mining.
 Uses Natural Language Processing.
 Dives deep into text analysis.
 Leverages computational linguistics.
 Develops meta data with business intelligence.
Basic Opinion Mining
 Construct a range of polarity for opinion markers.
 Classify statements by their polarity.
 Analyse several levels deep.
 Websites are one level.
 Authors are another level.
 Web page is a third level.
 A sentence is a fourth level.
Ranges of Polarity
 Classify emotional states.
 “Angry” can be codified as “upset” or “cross”.
 “Sad” may be “disappointed” or “confused”.
 “Happy” may be “amazing” or “gorgeous”.
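The codification on this slide can be sketched as a small lookup table. This is a minimal Python illustration rather than a real emotion classifier: the mapping reuses only the six marker words from the slide, and the names EMOTION_MARKERS and classify_emotions are invented for the example.

```python
import re
from collections import Counter

# Marker words taken from the slide; a working lexicon would be far larger.
EMOTION_MARKERS = {
    "upset": "angry", "cross": "angry",
    "disappointed": "sad", "confused": "sad",
    "amazing": "happy", "gorgeous": "happy",
}

def classify_emotions(text):
    """Count how often each emotional state is signalled in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(EMOTION_MARKERS[w] for w in words if w in EMOTION_MARKERS)
```

A plain lookup like this cannot tell "cross" the mood from "cross" the noun, which is one reason the later slides turn to disambiguation.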
Scaling Systems
 Some words are negative and can score as low as minus 10.
 Some words are neutral and should be equal to zero.
 Some words are positive and can range up to plus 10.
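A scaling system of this kind can be sketched in a few lines of Python. The sketch assumes a symmetric scale from -10 to +10 with zero as neutral, and every word score in WORD_SCORES is invented for illustration; nothing here comes from a published lexicon.

```python
import re

# Hypothetical word scores on an assumed -10..+10 scale (0 = neutral).
WORD_SCORES = {"fast": 4, "neat": 5, "average": 0, "friendly": 6, "good": 6,
               "slow": -4, "rude": -7, "awful": -9}

def mean_score(text):
    """Average the scores of known words; unknown words are ignored."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [WORD_SCORES[w] for w in words if w in WORD_SCORES]
    return sum(scores) / len(scores) if scores else 0.0

def polarity(score):
    """Collapse a numeric score into a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these invented scores, the six-word comment from the opening slide averages 4.5 and lands on the positive side of the scale.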
Subjective and Objective
Subjectivity and Objectivity
 Starts with classifying a given text (no more than a paragraph).
 Mark the media text as objective or subjective.
 The challenge lies in the subtlety of expression or the compound effect of multiple authors.
 Proper analysis normally means removing objective statements from the given text.
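Removing objective statements before scoring can be sketched as a filter. This Python example is a rough stand-in for a trained subjectivity classifier: it treats a sentence as subjective when it contains a word from OPINION_CUES, a short cue list invented for the illustration.

```python
import re

# Hypothetical cue words; real systems learn subjectivity from annotated text.
OPINION_CUES = {"love", "hate", "great", "awful", "disappointing", "friendly"}

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def is_subjective(sentence):
    """Treat a sentence as subjective if it contains any cue word."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & OPINION_CUES)

def subjective_only(text):
    """Drop the objective sentences so only opinions are passed on for scoring."""
    return [s for s in split_sentences(text) if is_subjective(s)]
```

Given "The hotel has 40 rooms. I love the breakfast. Check-in is at 3pm.", only the middle sentence survives the filter.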
Aspect-Based Sentiment Analysis
 Determine opinions based on features.
 Identify the relevant entities, such as a cell phone or a camera.
 Extract their features, such as the screen or the picture quality.
 Decide whether the opinion on each feature is positive, negative or neutral.
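Determining opinions by feature can be sketched with a word window. The Python below is an assumption-laden illustration: OPINION_SCORES is an invented lexicon, aspects must be single words, and an opinion word counts toward an aspect only if it sits within two tokens of it.

```python
import re

# Hypothetical opinion words and scores, invented for this sketch.
OPINION_SCORES = {"great": 7, "sharp": 5, "awful": -9, "poor": -6}

def aspect_sentiment(text, aspects, window=2):
    """Sum the opinion scores found within `window` tokens of each aspect word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    result = {}
    for i, token in enumerate(tokens):
        if token in aspects:
            nearby = tokens[max(0, i - window): i + window + 1]
            score = sum(OPINION_SCORES.get(w, 0) for w in nearby)
            result[token] = result.get(token, 0) + score
    return result
```

For "The screen is great but the battery is awful", the screen scores 7 and the battery scores -9, a split that a single document-level score would hide. The window size is a tuning choice: a wider window catches more distant opinions but also lets one feature's opinion leak onto another.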
Ambiguous and Disambiguation
When Something is Ambiguous
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”
Disambiguation
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”
Entity-Level
 Detect entity within text, such as person, place or company.
 Get detailed view at entity level, not document-level.
 “I love Ireland but I hate traveling on Irish roads.”
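The Ireland sentence shows why entity-level scoring matters, and a short Python sketch makes the point concrete. It assumes a two-word invented lexicon, a hand-written entity list, and a naive rule that splits clauses on "but", semicolons and full stops; a real service would use proper entity recognition and parsing.

```python
import re

LEXICON = {"love": 9, "hate": -8}        # hypothetical scores
ENTITIES = ["Ireland", "Irish roads"]    # hand-picked for the example

def clause_score(clause):
    """Sum the lexicon scores of the words in a clause."""
    words = re.findall(r"[a-z']+", clause.lower())
    return sum(LEXICON.get(w, 0) for w in words)

def entity_sentiment(sentence):
    """Attribute each clause's score to the entities that clause mentions."""
    clauses = re.split(r"\bbut\b|[;.]", sentence, flags=re.IGNORECASE)
    result = {}
    for clause in clauses:
        for entity in ENTITIES:
            if entity.lower() in clause.lower():
                result[entity] = result.get(entity, 0) + clause_score(clause)
    return result
```

Scored as a whole, the sentence nets out at 1, which looks almost neutral. Scored by entity, Ireland gets 9 and Irish roads get -8.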
Keyword-Level Sentiment
 Gleans sentiment for every detected keyword.
 Much more detailed than view at document-level.
 BMW can determine that positive comments about its cars mention quality of handling.
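Once posts carry a polarity label, keyword-level reporting is a counting exercise. The Python sketch below assumes the labels already exist (assigned by hand or by an earlier step) and uses a simple substring match; keyword_share and the sample posts are invented for illustration.

```python
def keyword_share(posts, keyword, label="positive"):
    """Fraction of posts with the given label whose text mentions the keyword."""
    matching = [text for text, post_label in posts if post_label == label]
    if not matching:
        return 0.0
    hits = sum(1 for text in matching if keyword.lower() in text.lower())
    return hits / len(matching)

# Invented sample data: (text, polarity label) pairs.
SAMPLE_POSTS = [
    ("Great road handling on the new model", "positive"),
    ("Love the handling and the seats", "positive"),
    ("Lovely interior", "positive"),
    ("Road tax is too high", "negative"),
]
```

Calling keyword_share(SAMPLE_POSTS, "handling") reports that two of the three positive posts mention handling.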
User-Specified Sentiment
 You, the analyst, target specific words or phrases.
 So you specify a restaurant’s name and return sentiment scores based on that name.
 You cull various media texts for sentiment about a specific hotel.
Directional Sentiment
 Identifies the commentator and emotional range.
 First, discover the incident where emotion is expressed.
 Second, determine the degree of positive or negative response.
 Third, conclude who is mentioning the product and how negatively.
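The three steps can be sketched in Python as one pass over a list of comments. The Comment class, the word scores and the product name used in the usage note are all invented for the example; the point is only to show the sentiment being tied back to the person who expressed it.

```python
import re
from dataclasses import dataclass

@dataclass
class Comment:
    author: str
    text: str

# Hypothetical word scores on an assumed -10..+10 scale.
WORD_SCORES = {"love": 9, "good": 6, "poor": -6, "hate": -8}

def score(text):
    """Sum the scores of known words in the text."""
    return sum(WORD_SCORES.get(w, 0) for w in re.findall(r"[a-z']+", text.lower()))

def negative_mentions(comments, product):
    """Return (author, score) pairs for negative comments about the product."""
    hits = []
    for c in comments:
        if product.lower() in c.text.lower():   # step 1: find the incident
            s = score(c.text)                    # step 2: measure the degree
            if s < 0:
                hits.append((c.author, s))       # step 3: attribute it
    return sorted(hits, key=lambda pair: pair[1])  # most negative first
```

For example, negative_mentions([Comment("anna", "I hate the Acme kettle")], "Acme kettle") returns a list holding anna and her score of -8.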
Disambiguation by Location
 Identifies the exact point on the earth.
 Use contextual cues.
 Perhaps where something is posted or where commentator is based.
Disambiguation: Meta Data
 Meta data provides data about data.
 Links can remove ambiguity.
 Past geographical movements clarify reach of commentators.
 Simple internet searches can provide accurate profile data.
Entity Subtypes
 Author is a real person.
 Author is a man.
 Man’s name is Paul O’Connell.
 This Paul O’Connell plays rugby for Munster.
Exact Quotations
 What was said.
 Who said what.
 When it was said.
 Where it was said.
 This exactness provides context.
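The "who said what" part can be sketched with a regular expression. This Python example is deliberately narrow: it only recognises the pattern of a quotation followed by "said" and a capitalised name, it does not attempt the "when" or "where", and the speakers in the usage note are invented.

```python
import re

# A capitalised name of one or more words, allowing apostrophes (O’Shea).
NAME = r"[A-Z][\w’']+(?:\s+[A-Z][\w’']+)*"
# A quotation in straight or curly quotes, followed by "said <Name>".
QUOTE_RE = re.compile(r'[“"]([^”"]+)[”"],?\s+said\s+(' + NAME + r')')

def extract_quotes(text):
    """Return a list of {'speaker', 'quote'} dicts found in the text."""
    return [
        {"speaker": m.group(2), "quote": m.group(1).rstrip(",")}
        for m in QUOTE_RE.finditer(text)
    ]
```

Run on the invented sentence '"The roads need work," said Jane Murphy.', it attributes the quote "The roads need work" to Jane Murphy. Other attribution styles ("according to", "Name said:") would need further patterns.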
Author Profile
 Analyse the text.
 Validate the context.
 Extract the concept.
 Extract the keywords.
 Apply to author profile.
 Determine what authors write about.
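Building an author profile from extracted keywords can be sketched as a per-author word count. The Python below assumes the posts already arrive as (author, text) pairs and uses a tiny invented stop-word list in place of real keyword extraction.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "on", "in", "of", "to", "i", "was"}

def build_profiles(posts, top_n=3):
    """Map each author to the words they use most often."""
    counts = defaultdict(Counter)
    for author, text in posts:
        words = re.findall(r"[a-z']+", text.lower())
        counts[author].update(w for w in words if w not in STOP_WORDS)
    return {author: [w for w, _ in c.most_common(top_n)]
            for author, c in counts.items()}
```

The profile is only as good as the keyword step feeding it, so in practice the word count would be replaced by the concept and keyword extraction listed above.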
References
 Turney and Pang applied methods for detecting polarity at the document level.
 Pang and Snyder classified documents on a multi-way scale, such as “five stars”.
 Katie Paine wrote “Measure What Matters”.
Useful Links
 For Immediate Release G+ Community
 Marketing Over Coffee Podcast
 KD Paine’s Blog
 The Alchemy Blog
Continue the Discussion
 Use the Google Doc.
 Consult Moodle.
 Shout to @topgold


Editor's Notes

  • #2: This is the first look at sentiment analysis during a discussion with business students in the Limerick Institute of Technology in October 2013. It is based on professional experience shared by Bernard @topgold Goldbach, Katie @kdpaine Paine, Neville @jangles Hobson, Christopher @cspenn Penn and The Alchemy Group. The author of this deck lives at http://www.insideview.ie.
  • #3: Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).
  • #4: A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level: whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as "angry," "sad," and "happy."
  • #5: Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as "angry," "sad," and "happy."
  • #6: A different method for determining sentiment is the use of a scaling system. Words commonly associated with a negative, neutral or positive sentiment are given a number on a -10 to +10 scale (most negative up to most positive). When a piece of unstructured text is analyzed using natural language processing, the resulting concepts are analyzed for an understanding of these words and how they relate to the concept. Each concept is then given a score based on the way sentiment words relate to the concept, and their associated score. This allows movement to a more sophisticated understanding of sentiment based on a 21-point scale. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text.
  • #7: Another research direction is subjectivity/objectivity identification. According to Wikipedia, this task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification: the subjectivity of words and phrases may depend on their context and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Results are largely dependent on the definition of subjectivity used when annotating texts. (Su) As Pang’s research shows, removing objective sentences from a document before classifying its polarity helped improve performance.
  • #8: Another research direction is subjectivity/objectivity identification. This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification: the subjectivity of words and phrases may depend on their context and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Results largely depend on the definition of subjectivity used when annotating texts. (Su) Removing objective sentences from a document before classifying its polarity helped improve performance. (Pang)
  • #9: A more fine-grained analysis model is called feature/aspect-based sentiment analysis. It refers to determining the opinions or sentiments expressed on different features or aspects of entities, e.g., of a cell phone, a digital camera, or a bank. A feature or aspect is an attribute or component of an entity, e.g., the screen of a cell phone, or the picture quality of a camera. This problem involves several sub-problems, e.g., identifying relevant entities, extracting their features/aspects, and determining whether an opinion expressed on each feature/aspect is positive, negative or neutral. More detailed discussions about this level of sentiment analysis can be found in Liu's NLP Handbook chapter, “Sentiment Analysis and Subjectivity”.
  • #10: Ambiguous: open to more than one interpretation. Disambiguation: clarification that follows from the removal of ambiguity.
  • #11: AMBIGUOUS. You need to provide sentiment data for every detected entity within text, such as person, place, organization. You need to give clients a more detailed view than document-level sentiment analysis.
  • #12: REMOVE AMBIGUITY WITH DISAMBIGUATION TACTICS.
  • #13: Entity-Level Sentiment Analysis provides sentiment data for every detected entity within text, such as person, place, organization. Alchemy algorithms do this kind of work.
  • #14: Keyword-Level Sentiment Analysis provides sentiment data for every detected keyword so that instead of generating sentiment by document, it’s possible to generate sentiment for keywords within the document. For example, when analyzing car posts, determine that of the 70% of posts that were positive, 80% mentioned road handling and 30% complained about the road tax.
  • #15: User-Specified Sentiment Analysis allows the user to target specific words or phrases. For instance, specifying a movie title returns sentiment scores based on that phrase. This can be done by hand or by Alchemy API.
  • #16: Directional Sentiment Analysis reveals who is emitting the sentiment. For example, if a person spoke negatively about a product, determine not only that the product was mentioned negatively, but who mentioned the product negatively.
  • #17: Disambiguation: Dominos in Limerick or Dominos all across Ireland? Since one business can have multiple locations, you need to be able to distinguish by location. This effectively means you are using a disambiguation technique to ferret out the various locations. You can often locate contextual cues within the text or use the geolocation in a Foursquare tip.
  • #18: Disambiguation: Additional Information. Disambiguation provides additional information for the people, places and things mentioned in a document such as links to their official websites, Wikipedia pages, geographical coordinates and more.
  • #19: Entity Subtypes: Paul O’Connell, a Person and an Athlete. In addition to the most common entity types, such as person or organization, you should seek to identify subtypes. For example, your basic text analysis services will identify Paul O’Connell as a man but you need to know he is a prominent rugby player for Munster. That way, you know he is an influencer.
  • #20: Quotations Extraction: What Was Said and Who Said It. Entity extraction determines what was said, but quotations extraction tells you who said what by extracting a quote and attributing it back to the person or organization responsible. Knowing that a company was mentioned in a piece of text is important. However, finding out who mentioned the company gives a fuller story. For example, entity extraction can provide you with a list of news articles where a topic and Willie O’Dea were both mentioned, but quotations extraction can provide you with a list of news articles where Willie O’Dea was quoted mentioning that topic.
  • #21: Author Extraction. For data to be meaningful, your text analysis service must be able to contribute to building an author profile. Comments on web pages, tweets, image collections, and site critiques provide excellent data sets. Author extraction combined with concept extraction, keyword extraction, and entity extraction provides information on what topics specific authors write about.
  • #22: Early work in that area includes Turney and Pang, who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder. Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predicting star ratings on either a 3 or a 4 star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). Sources: Peter Turney (2002). "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews". Proceedings of the Association for Computational Linguistics. pp. 417–424. Bo Pang; Lillian Lee and Shivakumar Vaithyanathan (2002). "Thumbs up? Sentiment Classification using Machine Learning Techniques". Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). pp. 79–86. Bo Pang; Lillian Lee (2005). "Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales". Proceedings of the Association for Computational Linguistics (ACL). pp. 115–124. Benjamin Snyder; Regina Barzilay (2007). "Multiple Aspect Ranking using the Good Grief Algorithm". Proceedings of the Joint Human Language Technology/North American Chapter of the ACL Conference (HLT-NAACL). pp. 300–307.
  • #23: The FIR Community is at https://plus.google.com/communities/112349929544876511942 MOC is http://marketingovercoffee.com KD Paine blogs at http://kdpaine.blogs.com/ Alchemy’s blog is at http://www.alchemyapi.com/blog/
  • #24: The Moodle Document concerning sentiment analysis is at http://bit.ly/crm-document04 but that might change as the years go on. MOC is http://marketingovercoffee.com KD Paine blogs at http://kdpaine.blogs.com/ Alchemy’s blog is at http://www.alchemyapi.com/blog/ You can contact the author by using the nic “topgold” on all good social networks. This document was written to support the business curriculum in LIT.ie on 11 October 2013.