Sentiment analysis:
Incremental learning to build domain-models
Raimon Bosch (@raimonbosch)
TALN, DTIC, UPF
What is sentiment analysis?
[Liu, 2010] proposes a quintuple (o_j, f_jk, oo_ijkl, h_i, t_j) that
turns unstructured text into structured data.
o_j: Object
f_jk: Object features (aspects)
oo_ijkl: Opinion orientations (positive/negative),
(calm/anger/joy/happiness), intensity, ...
h_i: Opinion holder
t_j: Time frame
What is sentiment analysis?
(o_j, f_jk, oo_ijkl, h_i, t_j) examples:
("easyjet", "baggage", "too expensive" => -5, "John", "01-07-2013")
("rentaz", "house rent", "horrible people" => -10, "John", "02-07-2013")
...
("jazztel", "internet", "no problems" => +4, "John", "03-07-2013")
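As a rough illustration, the three example quintuples could be held in a small record type. This is only a sketch: the class name `Opinion`, its field names, and the helper `net_orientation` are hypothetical, and the dates are assumed to be day-month-year.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One (o_j, f_jk, oo_ijkl, h_i, t_j) quintuple (field names are illustrative)."""
    obj: str          # o_j: object the opinion is about
    aspect: str       # f_jk: feature (aspect) of the object
    expression: str   # opinion words as written
    orientation: int  # oo_ijkl: signed intensity, negative means a negative opinion
    holder: str       # h_i: opinion holder
    when: date        # t_j: time frame (dates assumed to be day-month-year)


opinions = [
    Opinion("easyjet", "baggage", "too expensive", -5, "John", date(2013, 7, 1)),
    Opinion("rentaz", "house rent", "horrible people", -10, "John", date(2013, 7, 2)),
    Opinion("jazztel", "internet", "no problems", 4, "John", date(2013, 7, 3)),
]


def net_orientation(items, obj):
    """Sum the signed intensities of all opinions about one object."""
    return sum(o.orientation for o in items if o.obj == obj)
```

Once opinions are structured like this, aggregating them per object or per aspect becomes a simple filter and sum.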
State-of-the-art
- Twitter as a corpus [Pak and Paroubek, 2010]: Text-classification
problem. Features for machine learning techniques.
- Emoticons :)
- N-grams
- Negations
- POS tagging
- Syntax
- Twitter specific features.
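A minimal sketch of how some of the features listed above might be turned into a feature dictionary for a text classifier. The emoticon and negation lists and the function name `extract_features` are assumptions made for illustration; POS tagging and syntax are left out because they need an external tagger or parser.

```python
import re

# Tiny illustrative lists; a real system would use much larger ones.
EMOTICONS = {":)": "pos", ":-)": "pos", ":(": "neg", ":-(": "neg"}
NEGATIONS = {"not", "no", "never"}


def extract_features(tweet, n=2):
    """Return binary features: emoticons, negations, Twitter markers, word n-grams."""
    tokens = tweet.lower().split()
    features = {}
    for tok in tokens:
        if tok in EMOTICONS:
            features["emoticon_" + EMOTICONS[tok]] = 1
        if tok in NEGATIONS:
            features["has_negation"] = 1
        if tok.startswith("#") or tok.startswith("@"):
            # Twitter-specific markers: hashtags and mentions
            features["twitter_" + tok[0]] = 1
    # Keep only plain words for the n-grams
    words = [t for t in tokens if re.fullmatch(r"[a-z']+", t)]
    for i in range(len(words) - n + 1):
        features["ngram_" + " ".join(words[i:i + n])] = 1
    return features
```

For example, `extract_features("Ryanair is not cheap :( #fail")` yields a negative-emoticon flag, a negation flag, a hashtag flag, and the bigrams "ryanair is", "is not" and "not cheap".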
State-of-the-art
- Pointwise Mutual Information [Su and Xiang, 2006]: We can
estimate the probability that certain words in a phrase are
positive or negative from their co-occurrences on the
WWW.
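A sketch of the standard PMI calculation from co-occurrence counts. It uses the common "PMI with a positive context minus PMI with a negative context" formulation, which is an assumption here and not necessarily the exact variant of [Su and Xiang, 2006]. The function names and any counts passed in are hypothetical; all counts must be non-zero, so real search-engine hit counts would need smoothing.

```python
import math


def pmi(joint, count_x, count_y, total):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), estimated from raw counts."""
    return math.log2((joint * total) / (count_x * count_y))


def orientation(phrase_with_pos, phrase_with_neg, phrase, pos, neg, total):
    """Positive result: the phrase co-occurs more with positive contexts."""
    return (pmi(phrase_with_pos, phrase, pos, total)
            - pmi(phrase_with_neg, phrase, neg, total))
```

With 1000 documents, a phrase seen 100 times, 20 of them next to positive contexts (100 in total) and 10 next to negative contexts (100 in total), the two PMI values are 1 and 0, so the orientation is positive.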
State-of-the-art
- Sentiment dictionaries: SentiWordNet [Baccianella and Esuli,
2010]. A positive score and a negative score for each meaning
(#N), calculated with a random-walk algorithm.
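To make the dictionary idea concrete, here is a minimal sketch of scoring a text against a word#N lexicon. The entries and their scores are made up for illustration and are not real SentiWordNet values; always picking sense #1 is a simplifying assumption.

```python
# Hypothetical entries in "word#sense" form: (pos_score, neg_score).
LEXICON = {
    "good#1": (0.75, 0.0),
    "horrible#1": (0.0, 0.625),
    "joke#1": (0.0, 0.25),
}


def word_score(word, lexicon, sense=1):
    """Positive minus negative score of one sense; unknown words score 0."""
    pos, neg = lexicon.get(f"{word}#{sense}", (0.0, 0.0))
    return pos - neg


def text_score(text, lexicon):
    """Sum of word scores over a whitespace-tokenized text."""
    return sum(word_score(w, lexicon) for w in text.lower().split())
```

Under these made-up scores, `text_score("ryanair is a joke", LEXICON)` is negative because only "joke" is in the lexicon.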
State-of-the-art
- Cross-domain models [Pan, 2010]: Bipartite graph.
State-of-the-art
- Twitter prediction [O’Connor, 2010]: Correlation between
tweets and polls. Real-time information.
Not developed in state-of-the-art
Structured N-grams.
Most of the work is done with N-grams.
Buzz detection.
Aspect identification is not a main focus.
Technology stack
Technology stack
- Simplicity. Ruby.
- Integration with Java (JRuby, Hadoop Streaming).
- Big Data ready. Hadoop.
Hypothesis
H1: We can create groups of N-grams that bear specifically on
one aspect with a negative or positive orientation. We call these
groups sentigrams.
H2: With incremental learning the system improves in
each iteration. User interaction increases precision.
H3: After a certain number of iterations we can assign
sentigrams to a tweet automatically.
Hypothesis (H1) - Sentigrams
We define a sentigram as the relation between sentiwords and
aspects that determines whether a tweet is positive or negative.
- A sentigram is an evolution of the N-gram: it can be
considered a structured N-gram.
- Detect aspects and sentiwords inside a text.
Hypothesis (H1) - Sentigrams
- Mark opinion orientations: not only whether they are positive or
negative, but also which aspect they refer to.
Hypothesis (H2) - Incremental learning
With incremental learning the system improves in each
iteration, increasing precision.
- The original SentiWordNet version was poorly adapted to our
domain.
- We add new sentiwords from annotations to our dictionary
with initial scores (pos_score: 0, neg_score: 0).
- A random walk updates word scores until accuracy converges.
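One way to picture the random-walk update is as a random search that keeps a perturbed score only when accuracy on the annotated tweets does not get worse. The sketch below follows that reading. The acceptance rule, step size, stopping condition, and the function names `accuracy` and `random_walk` are all assumptions, not the system's actual procedure.

```python
import random


def accuracy(lexicon, labelled):
    """Share of tweets whose summed (pos - neg) score has the annotated sign."""
    hits = 0
    for words, label in labelled:
        score = 0.0
        for w in words:
            pos, neg = lexicon.get(w, (0.0, 0.0))
            score += pos - neg
        predicted = 1 if score > 0 else -1
        hits += int(predicted == label)
    return hits / len(labelled)


def random_walk(lexicon, labelled, steps=2000, step_size=0.25, seed=0):
    """Randomly nudge one word's scores per step; keep the change if accuracy holds."""
    rng = random.Random(seed)
    best = dict(lexicon)
    best_acc = accuracy(best, labelled)
    words = sorted(best)
    for _ in range(steps):
        if best_acc == 1.0:
            break  # nothing left to improve on this annotated set
        word = rng.choice(words)
        pos, neg = best[word]
        candidate = dict(best)
        candidate[word] = (
            min(1.0, max(0.0, pos + rng.uniform(-step_size, step_size))),
            min(1.0, max(0.0, neg + rng.uniform(-step_size, step_size))),
        )
        acc = accuracy(candidate, labelled)
        if acc >= best_acc:
            best, best_acc = candidate, acc
    return best, best_acc
```

New sentiwords would enter `lexicon` as `(0.0, 0.0)`, matching the initial scores on the slide, and `labelled` would hold pairs of tokenized tweet and annotated orientation (+1 or -1).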
Hypothesis (H3) - Automatization
After a certain number of iterations we can assign
sentigrams to a tweet automatically, without manual
intervention.
- Multiclass problem!! Each tweet has several words to guess.
Text-classification problem!!
Hypothesis (H3) - ML
- Convert a multiclass problem into a binary problem
(e.g. "ryanair is a joke").
0,801829636,-545403680,1561023766,2119008529,1
1,801829636,-545403680,1561023766,2119008529,0
2,801829636,-545403680,1561023766,2119008529,0
3,801829636,-545403680,1561023766,2119008529,2
- Focus the problem by position: (0..N). N partial observations
from each tweet.
- Numerical codes for words. Three classes available {0,1,2}
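The row layout on this slide (position, one numeric code per word, class) could be produced along these lines. This is a sketch under assumptions: `zlib.crc32` stands in for whatever hash function the system uses, so the codes will differ from those on the slide, and the class meanings (0 = not relevant, 1 = aspect, 2 = sentiword) are inferred from the example rows rather than stated in the deck.

```python
import zlib

# Class codes inferred from the slide's example rows (an assumption).
NOT_RELEVANT, ASPECT, SENTIWORD = 0, 1, 2


def word_code(word):
    """Stable numeric code for a word (CRC32 used as a stand-in hash)."""
    return zlib.crc32(word.encode("utf-8"))


def partial_observations(tokens, labels):
    """One row per position: [position, code of every word..., class of that position]."""
    codes = [word_code(t) for t in tokens]
    return [[position, *codes, label] for position, label in enumerate(labels)]


rows = partial_observations(
    ["ryanair", "is", "a", "joke"],
    [ASPECT, NOT_RELEVANT, NOT_RELEVANT, SENTIWORD],
)
for row in rows:
    print(",".join(str(value) for value in row))
```

Each tweet of N words yields N training rows that share the same word codes and differ only in the position and the class to predict.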
Hypothesis (H3) - Dependency parsing
- Mate Tools
1 ryanair _ ryanair _ NN _ _ -1 2 _ SBJ _ _
2 is _ be _ VBZ _ _ -1 0 _ ROOT _ _
3 a _ a _ DT _ _ -1 4 _ NMOD _ _
4 joke _ joke _ NN _ _ -1 2 _ PRD _ _
- Still noisy. Work in progress.
- ML approach: Accuracy is 85% against our gold standard.
Focusing only on aspects we can get 94% accuracy.
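The Mate Tools rows above can be read into simple token records as sketched below. The column interpretation is an assumption: the 14 fields look like the CoNLL-2009 layout, with the parser's output in the predicted columns (PLEMMA, PPOS, PHEAD, PDEPREL). Picking the subject and the predicate of the root is only an illustration of how a parse might suggest aspect and sentiword candidates, not the system's actual rule.

```python
from typing import NamedTuple


class Token(NamedTuple):
    index: int
    form: str
    lemma: str
    pos: str
    head: int
    deprel: str


def parse_conll09(lines):
    """Read whitespace-separated rows, taking the predicted CoNLL-2009 columns."""
    tokens = []
    for line in lines:
        cols = line.split()
        if len(cols) < 12:
            continue  # skip blank or malformed lines
        tokens.append(
            Token(int(cols[0]), cols[1], cols[3], cols[5], int(cols[9]), cols[11])
        )
    return tokens


PARSE = """\
1 ryanair _ ryanair _ NN _ _ -1 2 _ SBJ _ _
2 is _ be _ VBZ _ _ -1 0 _ ROOT _ _
3 a _ a _ DT _ _ -1 4 _ NMOD _ _
4 joke _ joke _ NN _ _ -1 2 _ PRD _ _
"""

tokens = parse_conll09(PARSE.splitlines())
root = next(t for t in tokens if t.head == 0)
subject = next(t for t in tokens if t.deprel == "SBJ" and t.head == root.index)
predicate = next(t for t in tokens if t.deprel == "PRD" and t.head == root.index)
```

In this example the subject of the root verb is "ryanair" and its predicate is "joke", which lines up with the aspect and sentiword of the sentence.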
Conclusions
- The SentiWordNet version was poorly adapted to our domain.
Accuracy 47%. Random walk necessary.
- Design of an interface to perform interactive annotations.
Semi-supervised approach.
- With words from annotations, pos scores and neg scores are
changed randomly until accuracy is optimized. Convergence
reached. Accuracy 89%.
Conclusions
- Focus on aspect identification. Not only +/-. We detect what
the user is complaining about.
- Convert a multiclass problem into a binary problem. Divide &
conquer!!
- Machine-learning & dependency parsing of tweets to detect
patterns. Accuracy 85%
What's next?
- Finish integration with dependency parsing.
- Data visualization. Comparison between several topics.
Positive aspects and negative aspects of each topic.
- Train the system for several domains: airlines, politics, TV,
telecommunications, etc.
Thanks!
Questions?

Editor's Notes

  • #3: I want to start this presentation with a little bit of thinking. I want you to read this quote and think about it for a few seconds. Is this really true? Suppose, for instance, that we are in a supermarket and have to choose between two products with similar prices. Normally we buy from the brand that gives us a better feeling, and this feeling is connected with its advertising campaign and its power to create that good feeling. But is this good feeling real? Behind a nice and inspiring ad there could be thousands of equally important reasons not to buy the product: values such as how well the company treats its workers, how well it interacts with its clients, or how many unresolved complaints it has. SA can give us access to this information. Aggregating opinions gives people the power to make more informed decisions, because they can analyze what opinions other users have about a brand and whether it is worth buying from it. I also see SA as a way of creating real change. If we buy from brands with better social values, we will be able to evolve toward a better society.
  • #4: Bing Liu gives a very good definition of SA. He defines it as a quintuple with five fields: an object or main topic; the object aspects the opinion refers to; a set of opinion orientations, which can be positive or negative and have a given degree of intensity; and finally an opinion holder and a specific time.
  • #5: We can see some examples of these quintuples here. We could have an opinion about easyjet that considers the baggage too expensive; another about a house-renting company that says they are horrible people; and some positive opinions, such as this one about jazztel that says there are no problems. Note how each opinion is given a different degree of intensity.
  • #6: But what can we do with those tuples of information? What if we aggregate all of them in one place? What if we have one place where, in seconds, we can find out how a brand treats its clients? This idea is very powerful because it would be a way to force companies to be more human and to respond to certain values if they want to survive. Informed citizens are smart citizens.
  • #8: One of the main techniques is Pointwise Mutual Information. This method uses the World Wide Web as a database. If we want to know whether a "phrase" is positive or negative, we take the first N search-engine results for this "phrase" and count how many co-occurrences it has in positive contexts and how many in negative contexts. From those counts we can guess the orientation of the "phrase".
  • #9: Another state-of-the-art technique is the sentiment dictionary: a database of words where each word has a positive and a negative score. We can use this information in our programs to build a sentiment score for any text. One of the main ones is SentiWordNet which, as you will see, has a pos score, a neg score, a short gloss explaining the meaning, and the specific words covered by that meaning. Obviously the same word can appear under different meanings. This is why we use the hash sign and number at the end of each word.
  • #10: But what do we do when texts come from different domains? The negative words in the airline domain are not the same as in the politics domain. Can we build cross-domain solutions? Pan proposed a solution that divides the words into two groups. On the left we have domain-independent words and on the right domain-dependent words. As you will see, this organization creates small groups, such as never_buy with blurry and boring. With a system like that we can detect new domain-dependent opinion words by checking their co-occurrence with words on the left side.
  • #11: And yes, SA has been used to predict. In 2008 it was used for Obama's election, and it has also been used in Germany and to predict the stock market. It is possible to find indicators that anticipate the trends seen in polls, so Twitter allows us to see those trends in real time. Sometimes, to find these trends, we need to work in sentiment spaces with dimensions other than positive/negative, such as calm/anger/joy/happiness/...
  • #12: These slides explain what is not yet well developed in the state of the art of SA. We saw that N-grams are heavily exploited to detect opinions, but combinations of N-grams are not exploited as new units. Finding correlations between similar N-grams is a very interesting line of research. Another thing not yet seen is treating opinions as a problem. What if we want to read a newspaper without the writer's opinion? What if we only want to read the facts, the data? SA has not been much exploited to remove opinions from texts, which I think would be interesting in some cases.
  • #13: After reviewing the material we chose this architecture. We download tweets from the Twitter API (about any topic that interests us) and merge them with a dictionary through Hadoop, so each tweet gets a score that depends on how many dictionary words it contains. Finally, we can show this information in a Rails interface to compare different topics, create statistics, and so on. At the same time, we can improve the system by annotating tweets and making small corrections. Those corrections are reused to add new words to the dictionary and improve the tweet score. The annotations can also be used to build Weka models that help produce the statistics we want to show in the interface.
  • #14: We chose Ruby because of its simplicity. We do not need to compile or deploy, and maintenance is simple. At the same time we can use Java when needed, through JRuby or the Hadoop Streaming library. Hadoop allows us to join tweets with the dictionary without wasting memory, so the whole "GROUP BY" can be done on disk (writing sequentially). In an iterative version we would need to keep all tweets and the dictionary in memory and check them there. If we have 10 million tweets, will they fit in memory?
  • #15: We work with three hypotheses here. 1/ That it is possible to create groups of N-grams, called sentigrams, which indicate whether a tweet is positive or negative and refer to a specific aspect. 2/ That the system supports incremental learning and improves the tweet score in each iteration. 3/ That we can learn sentigrams as the number of iterations increases, and at a certain point we will be able to detect whether a tweet is positive or negative, and why.
  • #16: Read text. As we can see, we mark the aspects in black and the sentiwords in red. So we have that ryanair is a nightmare, and that it is ridiculous to pay extra for baggage. Those two sentigrams tell us that the message is basically negative.
  • #17: After that we have to mark the opinion orientation independently of the score given by our system (which could be wrong). Here we have a positive message saying that these two airlines are always on time, so we mark it as good. And the negative message we mark as negative.
  • #18: The second hypothesis is about the idea of "incremental learning". It was needed because the original dictionary had an accuracy below 50%. To fix that we can use a random-walk algorithm to rebalance the word scores.
  • #19: Third hypothesis: automating sentigram detection. As we will see, this is a multiclass problem, because we have to choose between several strings. Working with text is not like working with numbers.
  • #20: To solve this we transform the multiclass problem into a binary problem. We create 4 partial observations, each at a different position of the text: first, second, third and fourth. We turn words into numbers through hash codes. Then we determine whether a word is an aspect, a sentiword, or not relevant by adding one of three codes (0,1,2). This idea is similar to the Viterbi algorithm, which works with partial observations to guess the next state.
  • #21: We are currently investigating other techniques such as dependency parsing. We want to see whether providing a surface structure can help to classify those sentigrams. We are still working on it. For now the ML approach is giving us better results (85%), (94% if we focus individually on aspects or sentiwords).
  • #22: The original dictionary was useless and we needed to perform a random walk. So we designed a screen to perform interactive corrections.
  • #23: And in this third iteration, which is not finished yet, we are working on sentigram identification through machine learning and dependency parsing. Our accuracy right now is 85%.
  • #24: Read text.