Semantic Search and Content Management
Case Studies in Successful Software Implementations
Marjorie M.K. Hlava
Founder/President
mhlava@accessinn.com
✓ Content management and content discovery needed major improvements.
✓ Users were not getting the results they needed.
✓ The content production team, including editorial and managing editors (the whole team), could no longer cope with the volume and variety.
✓ Content quality was suffering.
✓ Brief discussions of each organization’s challenges set the stage for AI-based, human-curated solutions.
✓ What worked, what didn’t, and the how and the why will be presented.
What do PLI, Met Opera, ASCO, McGraw-Hill and PLOS have in
common?
Who are they?
✓ Practising Law Institute
  o Non-profit continuing legal education (CLE) organization
  o Chartered by the Regents of the University of the State of New York
  o Founded in 1933
  o Organizes and provides CLE programs around the world
✓ How to make their content discoverable
✓ How to reuse and cross-reference training materials
✓ Time crunches, with staff already overloaded
✓ A lot of content, so automation is needed to make this happen
First Auto-tag the Content When Ingested
Keep Track of Every Record and the Level of Tagging
Everyone Gets a Login to the Metadata Platform
– Levels of Permission Differ
✓ Tailored the taxonomy to the content
✓ Auto-tagged both the backfile and forward-flow production
✓ Ensured tagging accuracy with a QA program of spot checking and automatic monitors
✓ Enriched data quality and coverage
✓ Allowed creation of new CLE offerings through content reuse
✓ Significantly improved search for both staff and customers
Metropolitan Opera
Financed by programs and donations
DOS-based database implemented in the 1980s
Performers use the archives database as their resume referral system
Keep track of every performer in every scene of every opera ever performed at The Met
Time-critical data input needed
Archives mostly staffed by volunteers, so the database must be easy to use
Metropolitan Opera
✓ Add images and videos
✓ Connect to costume database
✓ Support the archives website
✓ Allow immediate entry of data after each performance in season
✓ Submit comments and corrections online
✓ Make search fast, accurate, and easy
American Society for Clinical Oncology
✓ Web search returned inconsistent results
✓ Taxonomy to support synonyms and variant term usage
✓ Ensure comprehensive search
✓ Type-ahead in search
✓ Sort conference papers into tracks
✓ Tag to user profiles for better matching of talks to attendees
✓ Create meta-titles for better search SEO
✓ Use in journal production tracking
✓ Help match peer reviewers to potential papers
American Society for Clinical Oncology
Metastatic 170,000,000
Stage 4 296,000,000
Invasive 92,000,000
Consistent Search Based on Taxonomic Metadata
✓ Search for
  o Invasive breast cancer
  o Metastatic breast cancer
  o Stage IV breast cancer
✓ Get a single search result set, whichever term you enter
Funding – 2
Conferences – 3862
Journals – 2993
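The "single result set" behaviour described on this slide can be sketched as a lookup from variant terms to one taxonomy concept, with search running against the concept rather than the typed words. This is a minimal illustration, not the system shown in the talk: the concept IDs, the synonym table, and the sample records are all made up, and the grouping of the three terms simply follows the slide.

```python
# Illustrative only: concept IDs, synonym table and records are hypothetical.
SYNONYM_TO_CONCEPT = {
    "invasive breast cancer": "C-BREAST-ADV",
    "metastatic breast cancer": "C-BREAST-ADV",
    "stage iv breast cancer": "C-BREAST-ADV",
    "stage 4 breast cancer": "C-BREAST-ADV",
}

# Each record carries the concept IDs assigned when it was tagged.
RECORDS = [
    {"id": 1, "title": "Outcomes in stage IV breast cancer", "concepts": {"C-BREAST-ADV"}},
    {"id": 2, "title": "Metastatic breast cancer trial update", "concepts": {"C-BREAST-ADV"}},
    {"id": 3, "title": "Melanoma screening", "concepts": {"C-MELANOMA"}},
]


def normalize(query: str) -> str:
    """Lower-case the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def search(query: str) -> list[int]:
    """Return IDs of records tagged with the concept the query maps to."""
    concept = SYNONYM_TO_CONCEPT.get(normalize(query))
    if concept is None:
        return []
    return [record["id"] for record in RECORDS if concept in record["concepts"]]
```

Because matching happens on the concept tag, `search("Invasive breast cancer")` and `search("Stage IV breast cancer")` return the same records even when the typed words never appear in a title.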
Well-formed vocabulary control
Rich synonymy and cross references
Deep automatic tagging
Taxonomy – Tagging – Data Enriched
- Content Analysis
- Term Identification
- Multiple Taxonomies with Weighted Terms
- Enriched Content (XML, Video Transcript, Excel)
Automated Tagging
- Taxonomy Development
- AI Training Set
- Classifying
- Metadata Enrichment
“Before we did this upgrade project, we heard all kinds of complaints about our search. Since we relaunched, there have been no complaints at all!”
Lauren Sapira, Director, Sci/Tech Digital Products
Public Library of Science
✓ Tag every document on ingestion
✓ Website search platform
✓ The search hierarchy is visible three levels deep for enhanced search
✓ Users can vote on keywords used to train the system
✓ Index to the most specific level
✓ Editors can view the entire hierarchy
✓ Search is keyword-driven, based on the taxonomy metadata
✓ Track every usage of data by searchers
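"Index to the most specific level" can be read as: when a document matches both a broad term and one of its narrower terms, keep only the narrower one. The sketch below shows that rule under stated assumptions. The four-term hierarchy and the `PARENT` table are hypothetical and are not taken from the PLOS taxonomy.

```python
# Hypothetical hierarchy: each term maps to its broader (parent) term.
PARENT = {
    "Biology": None,
    "Genetics": "Biology",
    "Gene expression": "Genetics",
    "Ecology": "Biology",
}


def ancestors(term: str) -> set[str]:
    """Collect every broader term above `term` in the hierarchy."""
    found = set()
    parent = PARENT.get(term)
    while parent is not None:
        found.add(parent)
        parent = PARENT.get(parent)
    return found


def most_specific(matched: set[str]) -> list[str]:
    """Drop any matched term that is an ancestor of another matched term."""
    covered = set()
    for term in matched:
        covered |= ancestors(term)
    return sorted(term for term in matched if term not in covered)
```

For example, a document matching "Biology", "Genetics", "Gene expression" and "Ecology" would be indexed under "Ecology" and "Gene expression" only. The broader terms remain reachable through the hierarchy at search time.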
Public Library of Science
✓ Need to keep track of all incoming papers as well as usage by customers
✓ What new journals should we create, and which are not doing well?
✓ Who uses what on the website?
✓ What works, and what does not?
✓ If we change something, what is the impact?
Show keywords and allow users to suggest additions or deletions
Automatic term suggestions for incoming documents
Automatic weighting of taxonomy terms based on source content
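The slide does not say how the weights are computed. One plausible approach, shown here purely as an assumption, is to count how often each taxonomy term occurs in the document, boost occurrences in more prominent fields, and scale so the strongest term gets weight 1.0. The field names and boost values below are hypothetical.

```python
import re

# Hypothetical boosts: a hit in the title counts more than a hit in the body.
FIELD_BOOST = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def count_phrase(phrase: str, text: str) -> int:
    """Count whole-word, case-insensitive occurrences of `phrase` in `text`."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))


def weight_terms(doc: dict[str, str], terms: list[str]) -> dict[str, float]:
    """Return a relative weight (0 to 1) for each term found in the document."""
    raw = {}
    for term in terms:
        score = sum(
            boost * count_phrase(term, doc.get(field, ""))
            for field, boost in FIELD_BOOST.items()
        )
        if score > 0:
            raw[term] = score
    top = max(raw.values(), default=0.0)
    # Scale so the highest-scoring term has weight 1.0.
    return {term: round(score / top, 2) for term, score in raw.items()}
```

Terms that never occur in the document are left out of the result rather than given a zero weight, which keeps the tag list short.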
Summary
Content enrichment gives better data
Use subject metadata for better search
Don't launch at less than 85% accuracy (precision and recall combined)
Unstructured data is still structured and can be tagged
Keep track of the data usage
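The 85% launch threshold combines precision and recall, but the slide does not say how. The sketch below assumes the F1 score (the harmonic mean of the two), measured on a spot-check sample where human editors have supplied the correct tags. The function names, the sample data, and the choice of F1 are illustrative assumptions.

```python
LAUNCH_THRESHOLD = 0.85  # from the slide; the F1 interpretation is an assumption


def precision_recall_f1(
    auto_tags: dict[str, set[str]], gold_tags: dict[str, set[str]]
) -> tuple[float, float, float]:
    """Compare automatic tags with editor-approved tags over a QA sample."""
    tp = fp = fn = 0
    for doc_id, gold in gold_tags.items():
        auto = auto_tags.get(doc_id, set())
        tp += len(auto & gold)   # tags the system got right
        fp += len(auto - gold)   # tags the system added wrongly
        fn += len(gold - auto)   # tags the system missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def ready_to_launch(auto_tags: dict[str, set[str]], gold_tags: dict[str, set[str]]) -> bool:
    """True when the combined score meets the launch threshold."""
    return precision_recall_f1(auto_tags, gold_tags)[2] >= LAUNCH_THRESHOLD
```

A usage example: with gold tags `{"d1": {"a", "b"}, "d2": {"c"}}` and automatic tags `{"d1": {"a", "b"}, "d2": {"c", "d"}}`, precision is 0.75 and recall is 1.0, so the combined score is about 0.857 and the check passes.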
About
Albuquerque
Philadelphia
Cambridge
• Established in 1978
• Experts in semantic solutions
• Over 300 clients on 4 continents
• Over 2,000 project engagements
• 95% YOY client retention
• Data Harmony Suite
• Award Winning
• Patented Technology
• Semantic Services
• Managed Services
• Project Services
Clients
Publishing & Media
Education
Government
Non-profits & Societies
Health/Pharma
Manufacturing & Retail
Services Overview
We are experts in semantic solutions
✓ Metadata Creation and Enhancement
✓ AI Training Set Development
✓ Custom Data Classification
✓ Develop Controlled Vocabularies
✓ Abstract and Index Documents
✓ Automated Indexing
✓ Capture and Convert Data
Not only do we build solutions that are compliant; in many cases, we built or influenced the data standard!
Semantic Environment
ANALYZE
✓ Concept Extraction
✓ Term Recommendations
✓ Sentiment Analysis
✓ Visualization Tools
✓ NLP Precision Monitoring
MODEL
✓ Taxonomy Building
✓ Thesaurus Building
✓ AI Training-Set Building
ENRICH
✓ Abstracting
✓ Classifying
✓ Metadata Enrichment
✓ Inline Tagging
✓ Meta-Titles
Marjorie MK Hlava
Founder/President
mhlava@accessinn.com
(m): 505-975-5578
Change Search to Found
We are the intelligence and the technology behind world-class semantic solutions

AI-SDV 2021 - Marjorie Hlava - Semantic Search and Content Management – Case Studies in Successful Software Implementations
