Pingar - The Future of Text Analytics
The future of Text Analytics
Agenda
Who is Chris
The problem
What is text analytics
Why use it
How text analytics evolved
Use cases
The FUTURE
Who is Pingar?
Who is Chris
Just learned what Cricket is!
VP Marketing @ Pingar
Author in the area of
Content Management
Twitter: @HoardingInfo
Unstructured
Data Problem
Unstructured content makes up 80% of all digital
content *
The value of unstructured content diminishes
exponentially after it is published
Metadata is key to making any use of a document
after it is published
* Ref: AIIM.org, 2012
Why use it?
Without metadata, the time spent producing
content is lost, and the content poses a risk to the
organization
Extracting metadata without text analytics is a
manual process, which is expensive and prone to
human error and inconsistency
What is Text Analytics
Technology that extracts value from unstructured
content
Turns documents into Keywords and Entities -
Metadata
Transforms unstructured content into transactional data
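
As a rough illustration of turning a document into keywords and entities, here is a minimal sketch. It assumes simple word counting for keywords and a capitalized-phrase pattern for entities; the stopword list and the function names (`extract_keywords`, `extract_entities`, `to_metadata`) are hypothetical and do not represent any particular engine.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real engines use far larger ones).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
             "for", "on", "with", "that", "its", "at"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms, lowercased."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def extract_entities(text):
    """Naive entity guess: runs of two or more capitalized words."""
    return sorted(set(re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+", text)))


def to_metadata(text):
    """Bundle keywords and entities into a metadata record for the document."""
    return {"keywords": extract_keywords(text),
            "entities": extract_entities(text)}


if __name__ == "__main__":
    sample = ("Acme Chemical signed a contract with Global Bank. "
              "The contract covers the supply of solvents. "
              "The contract runs for two years.")
    print(to_metadata(sample))
```

Frequency counts and capitalization patterns are only a stand-in here; the disambiguation and machine learning mentioned later in the deck are well beyond this sketch.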
Evolution of text analytics
Started appearing around 2003
Initial engines were statistical
Accurate but lots of work
Modern engines use machine learning
Power of disambiguation & Linked Data
Several general purpose engines but mostly vertical
solutions
Use Cases
Content Migration and Discovery
Content Classification and Organization
Internal Content Publishing
Content Migration &
Discovery - Problem
A large oil and gas company in the US was
recently sued and lost (millions of dollars).
Because of poor content control, documents
left the organization that should not have.
So the company decided to implement an ECM
system. But 90% of the organization's content
is stored in a file share, the “Z” drive, and no
one knows what is there.
To move to ECM, they need to quickly analyze
the file share, isolate relevant content,
remove what is not relevant, and prepare for
migration.
Content Migration &
Discovery - Solution
Analyze the file share to produce a list of content by type and
relationships to other content.
Determine what content is relevant, what content should be
removed, and build an information architecture for a proper
ECM platform.
Visualize the content based on location, people, etc. to help gain
insight and decide how to deal with the content to avoid
future litigation.
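
The first step above, listing the content of a file share by type, can be sketched as follows. This is a minimal example that assumes file extension is a good enough proxy for content type; `inventory_by_type` is a hypothetical helper, and relationships between documents are out of scope.

```python
import os
from collections import Counter


def inventory_by_type(root):
    """Walk root and count files per lowercase extension.

    Files without an extension are counted under "(none)".
    """
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
    return counts


if __name__ == "__main__":
    # Point this at the share to analyze; "." is only a placeholder.
    for ext, n in inventory_by_type(".").most_common():
        print(f"{ext:10} {n}")
```

An inventory like this only says what is there. Deciding what is relevant would still need the text-level analysis the slides describe.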
Content Migration &
Discovery - Result
• New ECM system with relevant content only
• Purged non-relevant content
• Better control, which means less legal risk
• Ability to make better business decisions
Content Classification &
Organization - Problem
One of the US’s largest commercial banks
produces regular collateral and promotional
materials. Because the resulting scripts and
media files are poorly organized, the bank
duplicates effort on later campaigns and
loses valuable, expensive content.
They need to improve organization of these
assets and cross-pollination of information.
Content Classification &
Organization - Solution
Build a hierarchy of content: a taxonomy used to file content. As
content is saved to the rich media content repository, have it
automatically filed according to the taxonomy.
Automatically generate search filters so navigation of the content is
more efficient, and fewer documents are missed by the team.
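
A minimal sketch of filing content against a taxonomy and deriving search filters from the result. The taxonomy entries, trigger terms, and function names (`file_under`, `search_filters`) are all hypothetical; matching on trigger terms is a simplification, not the method used in the case study.

```python
import re
from collections import Counter

# Hypothetical two-level taxonomy: category path -> trigger terms.
TAXONOMY = {
    "Campaigns/Mortgages": {"mortgage", "home", "loan"},
    "Campaigns/Credit Cards": {"card", "rewards", "cashback"},
}


def file_under(text, taxonomy=TAXONOMY, default="Unfiled"):
    """Pick the category whose trigger terms overlap the text the most."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {path: len(terms & tokens) for path, terms in taxonomy.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def search_filters(docs):
    """Count documents per category, usable as facet filters in search."""
    return Counter(file_under(text) for text in docs)


if __name__ == "__main__":
    docs = [
        "Script for the spring home loan TV spot",
        "New rewards card radio ad",
        "Office party photos",
    ]
    print(search_filters(docs))
```

Counting documents per category is what turns the taxonomy into navigation: each category and its count can be shown as a filter next to search results.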
Content Classification &
Organization - Result
• Users spend 50% less time finding content
• Content is now organized by topic automatically
• Save $750,000 a year in duplicated effort
• Improve idea sharing
Internal Content
Publishing - Problem
One of the world's largest chemical
manufacturers has many R&D departments. As
new chemicals are invented, scientists publish
documents discussing the intellectual property
of these inventions. The articles are to be
published to other scientists so they can use the
knowledge to further their research and
development.
The system for publishing this content is manual
and costly. A highly paid chemical scientist has to
manually tag and summarize articles before
they are saved to a content management
system. Scientists have to “search” for content
they might find interesting, but they don’t
always know what to look for. This is costly and
prone to human error, and information is lost.
Internal Content
Publishing - Solution
Automatically tag, classify, and summarize content as
it’s being published by scientists.
Generate emails with summaries and links to articles.
Send the emails to scientists based on their profile,
showing only content that is relevant to them.
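
The flow above (summarize an article, then send it only to scientists whose profile matches) can be sketched like this. It assumes a crude extractive summary that keeps the highest-scoring sentences by word frequency, and profiles expressed as sets of interest tags; the names `summarize` and `recipients` and the sample data are hypothetical.

```python
import re
from collections import Counter


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        return sum(freq[w] for w in re.findall(r"[a-z]+", sentence.lower()))

    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in ranked)


def recipients(article_tags, profiles):
    """Names of people whose interests share at least one tag with the article."""
    return sorted(name for name, interests in profiles.items()
                  if interests & article_tags)


if __name__ == "__main__":
    article = ("Polymer X resists heat. "
               "Polymer X also resists acid and heat in lab tests. "
               "Funding is pending.")
    profiles = {"ana": {"polymers"}, "raj": {"catalysts"}, "li": {"heat", "safety"}}
    print(summarize(article))
    print(recipients({"polymers", "heat"}, profiles))
```

Scoring by summed word frequency favors longer sentences, so this is only a starting point. The routing step is the part that replaces manual search: content is pushed to a scientist whenever tags and profile overlap.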
Internal Content
Publishing - Result
• 70% cost reduction in publishing process
• Content is published 150x faster
• Scientists no longer have to search; content is pushed to
them
• The content auditors can focus on other responsibilities
Text Analytics is increasing the
value of unstructured content,
reducing risk, and making
organizations more efficient
The future
Text Analytics will be mandatory for all organizations doing
unified information access
Machine Learning Engines take over
BigData and BigContent join forces
The need for Language Scientists and Data Scientists increases
Buzz Words: Unified Information, Content Intelligence,
BigContent
Who is Pingar?
• The Text Analytics Subject
Matter Experts
• Helping you make money
with a Text Analytics practice
