ENTITY CO-OCCURRENCE AND ENTITY REPUTATION FROM
UNSTRUCTURED DATA USING KNOWLEDGE GRAPH
By Venkatraman.J
Senior data software engineer, Metapack, London
AGENDA
• Motivation
• Knowledge graph basic introduction
• Problem statement
• Data flow architecture
• Conclusion
MOTIVATION
• Most data is unstructured.
• Not every problem needs ML or deep learning models.
• Many real-world datasets are connected to one another.
• Connected (linked) data can yield valuable insights quickly.
KNOWLEDGE GRAPH
• Knowledge graphs encode structured information about entities and the
relations between them.
• They capture relationships between individual items, providing a model and
access patterns that machines can process automatically.
• Entities are represented as nodes and relationships as edges between
them.
• Each fact is stored as a triple: (Subject, Predicate, Object).
• Closely related to ontologies, which capture relationships between
concepts, data and entities within a particular domain. E.g. DBpedia,
YAGO, WordNet.
• Use cases powered by knowledge graphs: improving search relevance,
question answering applications, recommendation engines.
KNOWLEDGE GRAPH REPRESENTATION
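A minimal sketch of the (Subject, Predicate, Object) representation, assuming plain Python data structures rather than any particular graph database. The facts, node names and predicate names below are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical facts, invented purely for illustration.
TRIPLES = [
    Triple("alice", "LIKES", "phone_x"),
    Triple("bob", "DISLIKES", "phone_x"),
    Triple("alice", "FOLLOWS", "bob"),
    Triple("phone_x", "SOLD_IN", "london"),
]


def build_adjacency(triples):
    """Index triples by subject: node -> list of (predicate, object) edges."""
    adjacency = defaultdict(list)
    for t in triples:
        adjacency[t.subject].append((t.predicate, t.obj))
    return dict(adjacency)


def nodes_of(triples):
    """Every subject or object of a triple is a node in the graph."""
    return {t.subject for t in triples} | {t.obj for t in triples}


GRAPH = build_adjacency(TRIPLES)
```

Each triple is one directed, labeled edge: the subject and object are nodes and the predicate is the edge label.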
PROBLEM STATEMENT
• Individuals review products on social media.
• Goal: derive actionable insight into how a product is doing in the market.
KNOWLEDGE GRAPH CONSTRUCTION
• NLP information extraction techniques pull entities and the
relationships between them out of the text.
• Entities are represented as nodes and the relationships among them
as edges.
• Entities in Twitter feeds are persons, products and locations.
• Relationships are an individual's likes and dislikes of a product,
and that person's relations to other users.
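The construction step above can be sketched roughly as follows. This is a toy rule-based extractor, not the NLP pipeline used in the talk: the product lexicon, the sentiment word lists and the relation names (MENTIONS, LIKES, DISLIKES) are all assumptions standing in for a trained entity recognizer and sentiment model, and negation ("don't like") is not handled.

```python
import re

# Hypothetical lexicons standing in for a trained NER / sentiment model.
KNOWN_PRODUCTS = {"phonex", "tabz"}
LIKE_WORDS = {"love", "like", "great"}
DISLIKE_WORDS = {"hate", "dislike", "terrible"}


def extract_triples(author, text):
    """Return (person, relation, target) triples found in one tweet."""
    tokens = re.findall(r"[@\w]+", text.lower())
    words = set(tokens)
    triples = []

    # Mentions such as @bob become person-to-person edges.
    for tok in tokens:
        if tok.startswith("@") and len(tok) > 1:
            triples.append((author, "MENTIONS", tok[1:]))

    # A sentiment word decides the relation type for the whole tweet.
    if words & LIKE_WORDS:
        relation = "LIKES"
    elif words & DISLIKE_WORDS:
        relation = "DISLIKES"
    else:
        relation = None

    # Every known product in the tweet gets a person-to-product edge.
    if relation is not None:
        for product in sorted(words & KNOWN_PRODUCTS):
            triples.append((author, relation, product))
    return triples
```

For example, `extract_triples("alice", "I love my PhoneX, thanks @bob!")` yields one MENTIONS edge to `bob` and one LIKES edge to `phonex`, which can then be written to the graph as nodes and edges.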
GRAPH INFERENCE
• Centrality algorithms: Degree, PageRank, Closeness.
• Degree centrality is used to determine popular nodes in the graph.
• It measures the number of incoming and outgoing relationships of a
node. Entities with the highest degree centrality scores are
considered the most popular.
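As a rough illustration of degree centrality, the sketch below counts incoming plus outgoing edges per node over a small edge list. The edges are hypothetical, and a graph database or graph library would normally compute this; the plain-Python version is only meant to show what is being counted.

```python
from collections import Counter

# Hypothetical (source, relation, target) edges for illustration only.
EDGES = [
    ("alice", "LIKES", "phonex"),
    ("bob", "DISLIKES", "phonex"),
    ("carol", "LIKES", "phonex"),
    ("alice", "MENTIONS", "bob"),
    ("carol", "LIKES", "tabz"),
]


def degree_centrality(edges):
    """Count incoming plus outgoing edges for every node."""
    degree = Counter()
    for source, _relation, target in edges:
        degree[source] += 1  # outgoing edge
        degree[target] += 1  # incoming edge
    return degree


def most_popular(edges, top_n=1):
    """Return the top_n nodes with the highest degree, as (node, degree) pairs."""
    return degree_centrality(edges).most_common(top_n)
```

With these edges, `phonex` has degree 3 (three incoming edges) and is the most popular node. Note that degree alone does not distinguish likes from dislikes; it measures how much an entity is talked about, not how favorably.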
DATA FLOW ARCHITECTURE
LEARNING
• Tweets are easy to get, but the data is noisy and of poor quality.
• Graph querying is not the same as SQL: queries work by pattern
matching. Understanding the internals of the query language is needed
to write efficient queries and debug problems.
• Understand the data model represented in the graph.
• Start with a small graph and iterate before building a bigger one.
• Neo4j is ACID compliant like an RDBMS; watch out for multiple writers
writing to the database at the same time.
• Deploy containerized applications and orchestrate them with Docker
Swarm or Kubernetes to scale up.
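To make "querying works as pattern matching" concrete, here is a small sketch of a triple-pattern matcher in plain Python. It is not how Neo4j or its query language is implemented; the `?name` variable convention, the function names and the sample data are assumptions chosen for illustration.

```python
# Hypothetical (subject, predicate, object) facts for illustration only.
TRIPLES = [
    ("alice", "LIKES", "phonex"),
    ("bob", "DISLIKES", "phonex"),
    ("carol", "LIKES", "phonex"),
    ("carol", "LIKES", "tabz"),
]


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable binds on first use and must agree afterwards.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    results = [{}]
    for pattern in patterns:
        next_results = []
        for bindings in results:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_results.append(extended)
        results = next_results
    return results


# Who likes a product that bob dislikes?
ANSWER = query([("?p", "LIKES", "?x"), ("bob", "DISLIKES", "?x")], TRIPLES)
```

This naive version rescans every triple for every partial match, so the order of the patterns and the selectivity of each one drive the cost. That is the kind of internal behavior worth understanding when a graph query turns out to be slow.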
CONCLUSION
• Identify the problems that can be solved using graph theory and
connected data.
• Scoring via a graph model can augment or support the results obtained
from ML/DL models.
• Papers related to knowledge graphs:
http://ceur-ws.org/Vol-2306/paper9.pdf
https://aclweb.org/anthology/D18-2024
Questions?





Odsc 2019 entity_reputation_knowledge_graph
