www.decideo.fr/bruley
Natural Language Processing
June 2013
Michel Bruley
Natural Language Processing (NLP)
NLP is the branch of computer science focused on developing systems
that allow computers to communicate with people using everyday
language.
NLP is considered a subfield of artificial intelligence and has
significant overlap with the field of computational linguistics. It is
concerned with the interactions between computers and human (natural)
languages.
– Natural language generation systems convert information from
computer databases into readable human language.
– Natural language understanding systems convert human language
into representations that are easier for computer programs to
manipulate.
NLP encompasses both text and speech, but work on speech processing
has evolved into a separate field.
Where does it fit in the CS* taxonomy?
Computers
  Databases
  Artificial Intelligence
    Robotics
    Natural Language Processing
      Information Retrieval
      Machine Translation
      Language Analysis
        Semantics
        Parsing
    Search
  Algorithms
  Networking
* CS = Computer Science
Why Natural Language Processing?
Applications that process large amounts of text require NLP expertise:
– Classifying text into categories, indexing and searching large texts:
classifying documents by topic, language, or author; spam filtering;
information retrieval (relevant, not relevant); sentiment classification
(positive, negative)
– Extracting data from text: converting unstructured text into structured data
– Information extraction: discovering the names of people and the events they
participate in from a document, …
– Automatic summarization: condensing 1 book into 1 page, …
– Speech processing, artificial voice: getting flight information or booking a
hotel over the phone, …
– Question answering: finding answers to natural language questions in a text
collection or database
– Spelling and grammar correction
– Plagiarism detection
– Automatic translation
– Etc.
The problem
When people see text, they understand its meaning (by and large)
According to research, it deosn’t mttaer in what oredr the ltteers in a
wrod are, the olny iprmoetnt tihng is that the frist and lsat ltteer are in
the rghit pclae. The rset can be a toatl mses and you can sitll raed it
wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter by islelf
but the wrod as a wlohe.
When computers see text, they get only character strings (and perhaps
HTML tags)
We'd like computer agents to see meanings and be able to intelligently
process text
These desires have led to many proposals for structured, semantically
marked up formats
But often human beings still resolutely make use of text in human
languages
This problem isn’t likely to just go away
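The scrambled paragraph on this slide can be reproduced with a few lines of Python. This is only a sketch: the helper names `scramble_word` and `scramble_text` are made up for illustration, and words are assumed to be separated by whitespace with any punctuation treated as part of the word. It shows the slide's point from the computer's side: a person still reads the scrambled sentence, while a program just sees a different character string.

```python
import random


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        return word  # zero or one interior letter: nothing to shuffle
    interior = list(word[1:-1])
    rng.shuffle(interior)
    return word[0] + "".join(interior) + word[-1]


def scramble_text(text: str, seed: int = 0) -> str:
    """Scramble every whitespace-separated word; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return " ".join(scramble_word(word, rng) for word in text.split())


original = "reading scrambled words is easy for people"
scrambled = scramble_text(original)

print(scrambled)
# A plain string comparison has no notion of "same meaning":
print(scrambled == original)
```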
Example: Natural language understanding
Raw speech signal
• Speech recognition
Sequence of words spoken
• Syntactic analysis using knowledge of the grammar
Structure of the sentence
• Semantic analysis using info. about meaning of words
Partial representation of meaning of sentence
• Pragmatic analysis using info. about context
Final representation of meaning of sentence
Natural language understanding process – Prof. Carolina Ruiz
Example detail: Syntactic Analysis
• Syntactic analysis involves organizing the words and phrases of a sentence
into a hierarchical structure, allowing the study of its constituents.
• For example, the sentence “the big cat is drinking milk” can be broken
up into the following constituents:
Sentence: The big cat is drinking milk
  Noun Phrase: The big cat
    Determiner: The
    Adjective Phrase: big
    Noun: cat
  Verb Phrase: is drinking milk
    Auxiliary: is
    Verb: drinking
    Noun Phrase: milk
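One way to hold such a constituent structure in a program is as nested tuples. The sketch below hand-codes the tree for “The big cat is drinking milk” using the labels from the slide; it does not parse anything, and the helper names `leaves` and `show` are illustrative only.

```python
# A node is (label, children) for a phrase, or (label, word) for a leaf.
tree = ("Sentence", [
    ("Noun Phrase", [
        ("Determiner", "The"),
        ("Adjective Phrase", "big"),
        ("Noun", "cat"),
    ]),
    ("Verb Phrase", [
        ("Auxiliary", "is"),
        ("Verb", "drinking"),
        ("Noun Phrase", "milk"),
    ]),
])


def leaves(node):
    """Return the words of the sentence in left-to-right order."""
    _label, content = node
    if isinstance(content, str):
        return [content]
    words = []
    for child in content:
        words.extend(leaves(child))
    return words


def show(node, depth=0):
    """Return the tree as indented lines, one constituent per line."""
    label, content = node
    indent = "  " * depth
    if isinstance(content, str):
        return [f"{indent}{label}: {content}"]
    lines = [f"{indent}{label}"]
    for child in content:
        lines.extend(show(child, depth + 1))
    return lines


print("\n".join(show(tree)))
print(" ".join(leaves(tree)))  # The big cat is drinking milk
```

Reading the leaves back in order recovers the original sentence, which is a quick sanity check that the hierarchy covers every word exactly once.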
Why NLP is difficult
Language is flexible
– New words, new meanings
– Different meanings in different contexts
Language is subtle
– He arrived at the lecture
– He chuckled at the lecture
– He chuckled his way through the lecture
– **He arrived his way through the lecture
Language is complex!
Why NLP is difficult
Many hidden variables
– Knowledge about the world
– Knowledge about the context
– Knowledge about human communication techniques
• “Can you tell me the time?” is a request for the time, not a question
about ability
Problem of scale
– Many (infinite?) possible words, meanings, contexts
Problem of sparsity
– Statistical analysis is very difficult because most things (words,
concepts) have never been seen before
Long-range correlations
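The sparsity problem is easy to see even on a toy text. The sketch below counts word frequencies with the standard library, measures what fraction of the distinct words occur only once, and lists the words of a new sentence that were never seen in the first one. The sample sentences and the function names `hapax_ratio` and `unseen_words` are invented for illustration.

```python
from collections import Counter


def hapax_ratio(text: str) -> float:
    """Fraction of distinct words that occur exactly once in the text."""
    counts = Counter(text.lower().split())
    if not counts:
        return 0.0
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)


def unseen_words(seen_text: str, new_text: str) -> list[str]:
    """Words in new_text that never occur in seen_text, in order of appearance."""
    vocabulary = set(seen_text.lower().split())
    return [w for w in new_text.lower().split() if w not in vocabulary]


seen = "the cat sat on the mat and the dog sat on the log"
new = "the bird sat on the fence"

print(hapax_ratio(seen))        # 0.625: 5 of the 8 distinct words occur once
print(unseen_words(seen, new))  # ['bird', 'fence']
```

On real corpora the same pattern holds at much larger scale: a few words are very frequent, and a long tail of words is rare or absent, so any statistics gathered from one text say little about many words in the next.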
Why NLP is difficult
Key problems:
– Representation of meaning
– Language presupposes knowledge about the world
– Language only reflects the surface of meaning
– Language presupposes communication between people
Patented Natural Language Processing (NLP) “Reads” Every Communication
• Each data feed is parsed through one or more of the 7 NLP engines
• It is then deconstructed to provide context, subject, and other
information regarding the customer (gender, name, etc.)
• Finally, each identified customer is matched back to the Discovery
platform data to gain a full view
Natural language processing (NLP) is the study of the
interactions between computers and natural languages
(e.g., English, Polish). The crucial challenge that NLP
addresses is in deriving meaning from human or natural
language input and allowing consumers to analyze
parsed meanings in large volumes.
For Example…
I bought an iPad2 for my mom last week. She loves the weight, but doesn’t like the color. She
wishes it came in blue. She says if it came in blue, then she’d buy one for all her friends.
– Entities (brands, people, locations, times, products…)
– Events and relationships (purchasing event, my mom…)
– Sentiment (product specifications)
– Suggestions (feature specifications)
– Intent (to purchase, to leave)
– Geo/Temporal
QUESTION: Why is this a big deal?
NLP takes a simple English statement, parses it into the categories above (and more),
and voilà: we get STRUCTURED DATA.
Architecture
[Diagram: Aster Discovery Platform architecture. Labeled components:
– Attensity Pipeline: real-time annotated social media data feed (150+ million
social and online sources)
– Other unstructured data: emails, surveys, CRM notes, …
– Pipeline Connector
– ASAS Wrapper (SQL MR)
– Aster Discovery Platform: “now-structured” data; customers / sales / other
data; Churn Score (SQL MR)
– NLP, ETL, visualization (e.g., Tableau, MSTR), predictive]
Aster + Attensity = Competitive Advantage
• This integration provides types, subtypes, and super types (“Savings”, “Checking”,
“Investment”)
• Anaphora resolution: connecting pronouns (“He”, “Him”) to a subject (George Harrison)
without repeating the full name
• Includes other languages besides English
• Attensity’s Semantic Annotation Server (ASAS) capabilities
– Entity Extraction: automatic detection and extraction of more than 35 entities such as Name,
Place
– Uses Attensity Triples to create context on entities and identify verbs, relationships, actions
– Auto Classification: uses custom classification rules to classify articles by content, sort by
relevance, and discover repeated information
– Exhaustive Extraction: application of linguistic principles to extract context, entities, and
relationships similar to how the human mind would
– Voice Tags: identify types of statements and auto-classify them (Question, Intent,
Conditional)
– Creates a unique identifier for each entity for cross-reference
Structuring Unstructured Data: Process Flow
The flight was delayed and flight attendant would not give us
any new information.
How Triples are Extracted & Structured

Database Record from a Customer Survey
date      region  rec?  source
10-02-06  0006    4     telephone
Why would you recommend/not recommend?
The flight was delayed and flight attendant would
not give us any new information.

Extract: extract relational facts & triples from the Notes field

New Table: Customer Reactions
Who/What  Behavior  Fact/Triple
flight    delay     flight : delay

Then Fuse: populate the new table with attribute values and fuse with
structured data.

Same Record with Relational Facts Extracted from Notes Field
date     region  source     rec?  who-what     Behavior     Fact/Triple
10-2-12  0006    telephone  4     flight       delay        flight : delay
10-2-12  0006    telephone  4     information  give [not]   information : give [not]
1-1-13   0007    e-mail     8     i            happy [not]  i : happy [not]
1-1-13   0007    e-mail     8     rep          rude         rep : rude
1-1-13   0007    e-mail     8     flight       cancel       flight : cancel
Columns date to rec?: Original Structured Data
Columns who-what to Fact/Triple: Newly Structured Data, Provided by Attensity
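The “then fuse” step can be sketched in a few lines of Python. This is a minimal illustration, not Attensity's or Aster's actual implementation: the `Triple` class, the `fuse` function, and the field names are hypothetical, and the triples are written by hand here, whereas in the slides they come out of the NLP extraction step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A relational fact pulled from free text (hypothetical structure)."""
    who_what: str
    behavior: str

    def as_fact(self) -> str:
        return f"{self.who_what} : {self.behavior}"


def fuse(record: dict, triples: list[Triple]) -> list[dict]:
    """Return one row per triple, each carrying the record's structured fields."""
    structured = {key: value for key, value in record.items() if key != "notes"}
    return [
        {
            **structured,
            "who_what": t.who_what,
            "behavior": t.behavior,
            "fact": t.as_fact(),
        }
        for t in triples
    ]


record = {
    "date": "10-2-12",
    "region": "0006",
    "source": "telephone",
    "rec": 4,
    "notes": "The flight was delayed and flight attendant would "
             "not give us any new information.",
}

# Hand-written for this sketch; the extraction itself is not shown.
triples = [Triple("flight", "delay"), Triple("information", "give [not]")]

rows = fuse(record, triples)
for row in rows:
    print(row)
```

Each output row repeats the structured fields (date, region, source, rec) next to one extracted fact, which is what lets the facts be filtered and aggregated together with the rest of the customer data.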
Team Power

Big Data and Natural Language Processing


Editor's Notes

  • #17: Here’s an example of how this process works. In the upper right you can see some of the feedback captured by a call center agent taking a complaint call from a customer. From this sentence, the facts about the flight and the details of the customer’s opinions are extracted into the relational table below. The newly structured facts are fused with the available structured data (customer ID/segment, date, flight number, etc.) so that any of these facts can be analyzed along with the structured data.