Deep Learning for Information Extraction in Natural Language Text
Pankaj Gupta
CT RDA BAM MIC-DE
Young Research Forum 2017 | Siemens AG, Germany
Siemens Corporate Technology
Restricted © Siemens AG 2016
31.05.2016
About Me @CT RDA BAM MIC-DE !!
➢ LinkedIn: https://www.linkedin.com/in/pankaj-gupta-6b95bb17/
➢ Google Scholar: https://scholar.google.com/citations?user=_YjIJF0AAAAJ&hl=en
Research Focus: Machine Learning (Deep Learning) techniques to solve Natural Language Processing (NLP) tasks
Oct 2006 - April 2010
• Bachelor of Technology in Computer Science, Amity University, India
• Bachelor Internship: Queen’s University Belfast, Northern Ireland, UK
• Publications: 2
Jun 2010 - Sept 2013
• Senior Software Engineer at Wipro and Aricent Technologies, India
Oct 2013 - Nov 2015
• Master of Science in Informatics, Technical University of Munich (TUM), Germany
• Master’s Thesis (Siemens + LMU): Deep Learning Methods for the Extraction of Relations in Natural Language Text
• Publications: 2 Deep Learning/NLP focused and 2 Machine Learning based
Nov 2015 - Present
• PhD at CT-RDA-BAM-MIC-DE, Siemens AG, and at LMU (CIS), Munich, Germany
• Advisor(s): Dr. Bernt Andrassy (Siemens AG) and Prof. Dr. Hinrich Schütze (CIS LMU)
• PhD Thesis: Deep Learning Methods for Information Extraction in Natural Language Text
• Publications: 1 published, 3 under review; 3 patents filed
Information Extraction (IE) in Natural Language Text
Information Extraction
• Entity Extraction
• Relation Extraction
• Structure the unstructured text
• Knowledge Graph Construction
• In web search, retrieval, Q&A, etc.
Entity Extraction: Detect entities such as person, organization, location, product, technology, sensor, etc.
Relation Extraction: Detect the relation between the given entities or nominals
End-to-End Knowledge Base Population
[Figure: Text Documents -> IE Engine -> Knowledge Graph; the example graph carries the labels "Sensor" and "Competitor-of"]
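The step from extracted relations to a knowledge graph can be pictured with a minimal Python sketch. It assumes each extraction is stored as a (head, relation, tail) record and that the graph is a plain adjacency map; the names `Triple` and `build_knowledge_graph` are hypothetical and not taken from the slides, and the "CompanyA"/"CompanyB" entry is made-up illustration data.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted fact: two entities and the relation between them."""
    head: str
    relation: str
    tail: str


def build_knowledge_graph(triples):
    """Group triples into an adjacency map: head -> list of (relation, tail)."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.head].append((t.relation, t.tail))
    return dict(graph)


# Illustrative input only; the second triple uses placeholder entity names.
triples = [
    Triple("Siemens", "Competitor-of", "ABB"),
    Triple("CompanyA", "Supplier-of", "CompanyB"),
]
kg = build_knowledge_graph(triples)
print(kg["Siemens"])
```

An adjacency map is only one possible representation; a graph database or an RDF store would serve the same purpose at larger scale.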
Supervised Deep Learning Techniques in Information Extraction
Motivation to build the state-of-the-art Deep Learning system(s) for Smart Data Web project
Challenges and Motivation
• Natural language is sparse and noisy
• Better Representation Learning
• Build state-of-the-art entity and relation extraction systems with Neural Networks to extract triples (entity1, entity2, relation)
Benefits of Deep Learning in NLP
• Learn from noisy text
• Better approximate the highly non-linear arbitrary function
• Pattern and Representation Learning, especially in Language Models with no explicit feature extraction
Our Pipelined Deep Learning System for Entity and Relation Extraction
[Figure: Text -> Entity/Concept Extraction -> Relation Extraction -> Triples]
Models shown in the figure:
• Ranking Recurrent Neural Network (R-RNN) (1)
• Extended Convolutional Neural Network (2)
• Connectionist Bi-directional Neural Network (2)
Example: entities "Siemens" and "ABB" with relation "Competitor-of" give the triple (Siemens, Competitor-of, ABB)
(1) N. T. Vu, P. Gupta, H. Adel, H. Schütze. Bi-directional RNN with Ranking Loss for SLU. In ICASSP 2016.
(2) N. T. Vu, H. Adel, P. Gupta, H. Schütze. Combining Recurrent and Convolutional Neural Networks for Relation Classification. In NAACL 2016.
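The control flow of a pipelined system, where entities are found first and every entity pair is then passed to a relation classifier, can be sketched as follows. This is an illustration under loud assumptions: the gazetteer lookup and the keyword rule are toy stand-ins for the neural models named on the slide, and every function name and the example sentence are hypothetical.

```python
from itertools import combinations

# Toy gazetteer standing in for a trained entity extractor (assumption).
KNOWN_ENTITIES = {"Siemens": "ORG", "ABB": "ORG"}


def extract_entities(tokens):
    """Stage 1: return (position, token, type) for each recognised entity."""
    return [
        (i, tok, KNOWN_ENTITIES[tok])
        for i, tok in enumerate(tokens)
        if tok in KNOWN_ENTITIES
    ]


def classify_relation(tokens, e1, e2):
    """Stage 2: toy rule standing in for a trained relation classifier."""
    between = tokens[e1[0] + 1 : e2[0]]
    if "competes" in between:
        return "Competitor-of"
    return None


def extract_triples(sentence):
    """Run both stages in sequence and collect (entity1, relation, entity2)."""
    tokens = sentence.split()
    entities = extract_entities(tokens)
    triples = []
    for e1, e2 in combinations(entities, 2):
        relation = classify_relation(tokens, e1, e2)
        if relation is not None:
            triples.append((e1[1], relation, e2[1]))
    return triples


result = extract_triples("Siemens competes with ABB")
print(result)
```

Because stage 2 only sees what stage 1 returns, an entity missed in the first stage can never appear in a triple, which is the usual weakness of a pipelined design.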
Supervised Deep Learning in Joint Entity and Relation Extraction
Our State-of-the-Art system based on Neural Architectures for Joint Entity and Relation Extraction
Motivation: Joint/Multi-task Learning
• Entity and relation inter-dependencies
• Multi-tasking to jointly learn entity and relation representations and patterns
• State-of-the-art system published (3) for joint entity and relation extraction
Joint/Multi-task Neural Learning for End-to-End Entity and Relation Extraction
[Figure: Text Documents -> Neural Information Extraction System (3) -> knowledge graph with "Sensor" nodes linked by "Supplier-of" and "Competitor-of" edges]
(3) P. Gupta, H. Schütze, B. Andrassy. Table Filling Multi-Task Recurrent Neural Network for Joint Entity and Relation Extraction. In COLING 2016.
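The phrase "table filling" in reference (3) suggests representing entities and relations of one sentence in a single table. The sketch below shows one common way to lay out such a table, assuming diagonal cells hold the entity tag of each token and off-diagonal cells hold the relation between two tokens. The layout, the `fill_table` name, the tag set, and the example sentence are assumptions for illustration and are not taken from the slides; the sketch only builds the target table and contains no neural model.

```python
def fill_table(tokens, entity_tags, relations, none_label="NONE"):
    """Build an n x n table: cell (i, i) is the entity tag of token i,
    cell (i, j) is the relation between tokens i and j, if any."""
    n = len(tokens)
    table = [[none_label] * n for _ in range(n)]
    for i, tag in enumerate(entity_tags):
        table[i][i] = tag
    for (i, j), relation in relations.items():
        table[i][j] = relation
    return table


# Hypothetical annotated sentence.
tokens = ["Siemens", "competes", "with", "ABB"]
entity_tags = ["ORG", "O", "O", "ORG"]
relations = {(0, 3): "Competitor-of"}

table = fill_table(tokens, entity_tags, relations)
for token, row in zip(tokens, table):
    print(f"{token:10s}", row)
```

Holding both label types in one structure is what lets a single model predict them together, so that entity and relation decisions can inform each other.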
Deep Learning and NLP Applications at Siemens
Application of Information Extraction in Public and Industrial domains
Public Domain: Web Semantic Search and Retrieval
Industrial Domain: Slot Filling for Product in Tender Documents
[Figure: Query-Input -> IE System (over Tender Documents, Service Reports) -> Query-Output]
Query-Input: Rectifier
• RATED CURRENT: ??
• OUTPUT VOLTAGE: ??
• OVERLOAD: ??
Query-Output: Rectifier
• RATED CURRENT: 2666 A
• OUTPUT VOLTAGE: 1500 V
• OVERLOAD: 2 h
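The slot-filling query above can be illustrated with a small Python sketch that fills the three slots from a piece of text. Regular expressions stand in for the IE system here, which is an assumption made only to keep the example short; the sample sentence, the pattern table, and the `fill_slots` name are hypothetical, while the slot names and values mirror the query shown on the slide.

```python
import re

# One pattern per slot: the slot keyword, then the first number with its unit.
SLOT_PATTERNS = {
    "RATED CURRENT": re.compile(r"rated current\D*?(\d+(?:\.\d+)?\s*A)\b", re.IGNORECASE),
    "OUTPUT VOLTAGE": re.compile(r"output voltage\D*?(\d+(?:\.\d+)?\s*V)\b", re.IGNORECASE),
    "OVERLOAD": re.compile(r"overload\D*?(\d+(?:\.\d+)?\s*h)\b", re.IGNORECASE),
}


def fill_slots(text, patterns=SLOT_PATTERNS):
    """Return slot -> extracted value, or None when the slot is not found."""
    filled = {}
    for slot, pattern in patterns.items():
        match = pattern.search(text)
        filled[slot] = match.group(1) if match else None
    return filled


# Made-up tender sentence for illustration.
text = (
    "The rectifier has a rated current of 2666 A and an output voltage "
    "of 1500 V. Overload capability: 2 h."
)
slots = fill_slots(text)
print(slots)
```

Hand-written patterns like these break as soon as the wording changes, which is the kind of brittleness a learned extraction system is meant to avoid.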
Deep Learning and NLP Applications for TimeLines at Siemens
Application of Information Extraction in Public and Industrial domains
Public Domain: TimeLine Generation from Biographies
[Figure: Bloomberg-Biographies, European Data Forum, Digitaleweltmagazin.de -> Intelligent TimeLine Extraction System]
Industrial Domain: TimeLine of Product for Historical Analysis and Monitoring (Future Work)
Timeline of:
➢ Industrial products
➢ Industrial re-organization
➢ Business Strategy vs Profit

