When Metadata is the Content
From Articles to Knowledge
SSP 2009 Annual Meeting
Chris Beguel – Director of Sales – TEMIS
Baltimore, MD – May 09
Where are we? Semantic Age!




Copyright © 2009 TEMIS – All rights reserved
From Words to Meaning…
Example sentence: "Trimilax 500 mg makes me feel dizzy after ingestion"

Term       Trimilax (Prop.) · 500 (Num.) · mg (Abbrev.) · makes (Verb/3rd) · me (Pron.) · feel (Verb) · dizzy (Adj.) · after (Prep.) · ingestion (Noun)
Entity     Product · Dosing · Action · Target · State · Event · Action
Fact       Drug · Symptom · Condition
Knowledge  Potential Adverse Effect
             Drug = Trimilax
             Dosing = 500 mg
             Symptom = Dizziness
             When = After administration
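As a rough illustration of how a sentence can be lifted from words to a knowledge record, here is a minimal Python sketch. The regular expression, the `extract_adverse_effect` function, and the adjective-to-symptom table are hypothetical stand-ins for real linguistic analysis; the deck does not describe how TEMIS implements this step.

```python
import re

# Hypothetical pattern for sentences shaped like the example:
# "<Drug> <amount> <unit> makes me feel <state> after <event>"
PATTERN = re.compile(
    r"(?P<drug>[A-Z]\w+)\s+(?P<amount>\d+)\s*(?P<unit>mg|g|ml)\s+"
    r"makes me feel\s+(?P<state>\w+)\s+after\s+(?P<event>\w+)"
)

# Assumed normalisation table from adjective to symptom name.
SYMPTOMS = {"dizzy": "Dizziness", "tired": "Tiredness", "sick": "Nausea"}


def extract_adverse_effect(sentence):
    """Return a knowledge record for a matching sentence, else None."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    state = match.group("state").lower()
    return {
        "type": "Potential Adverse Effect",
        "drug": match.group("drug"),
        "dosing": f"{match.group('amount')} {match.group('unit')}",
        "symptom": SYMPTOMS.get(state, state),
        "when": f"after {match.group('event')}",
    }


record = extract_adverse_effect("Trimilax 500 mg makes me feel dizzy after ingestion")
print(record)
```

A single pattern like this only covers one sentence shape; the point is the output structure, where each field is a queryable piece of metadata.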
Metadata? Understand!
Metadata
  Title:   Google gives drivers a hand at the gas pumps
  Source:  InformationWeek
  Author:  Antone Gonsalves
  Date:    November 7, 2007
Entities
Facts
Metadata? Understand!
Metadata
Entities
  Companies:      Gilbarco Veeder-Root, Gilbarco, Google, InformationWeek, T-Mobile, HTC, Qualcomm, Motorola
  Persons:        Lucy Sackett, Sackett
  Locations:      Atlanta, United States
  Organizations:  National Association of Conveni…
  Technologies:   Internet, Linux, Open-source …
  Product:        New Service, Google Service
Facts
Metadata? Understand!
Metadata
Entities
Facts
  Announcement (Gilbarco, New Service)
    Who: Gilbarco
    Whom: unknown
    What: New Service
    When: unknown
  Announcement (Sackett, InformationWeek)
    Who: Sackett
    Whom: InformationWeek
    What: unknown
    When: unknown
  Launch (Gilbarco, Google Service)
    Who: Gilbarco
    What: Google Service
    When: early next week
  Function (Sackett, Gilbarco)
    Who: Sackett
    Company: Gilbarco
    Function: spokeswoman
  Partnership (Gilbarco, Google)
    Who: Gilbarco
    With whom: Google
    When: unknown
    State: Negative
  Alliance (Google; T-Mobile, HTC, Qualcomm, Motorola)
    Who: Google
    With whom: T-Mobile, HTC, Qualcomm, Motorola
    When: unknown
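Once facts are stored as structured records, they can be queried directly instead of being re-read from the article. The sketch below is a minimal illustration, assuming a plain list of dictionaries; the `FACTS` layout and the `find_facts` helper are hypothetical, and the values are the ones shown on this slide.

```python
# Facts from the slide, stored as role -> value records (layout is an assumption).
FACTS = [
    {"type": "Announcement", "who": "Gilbarco", "what": "New Service"},
    {"type": "Announcement", "who": "Sackett", "whom": "InformationWeek"},
    {"type": "Launch", "who": "Gilbarco", "what": "Google Service",
     "when": "early next week"},
    {"type": "Function", "who": "Sackett", "company": "Gilbarco",
     "function": "spokeswoman"},
    {"type": "Partnership", "who": "Gilbarco", "with_whom": ["Google"],
     "state": "Negative"},
    {"type": "Alliance", "who": "Google",
     "with_whom": ["T-Mobile", "HTC", "Qualcomm", "Motorola"]},
]


def find_facts(facts, fact_type=None, entity=None):
    """Return facts of the given type that mention the entity in any role."""
    results = []
    for fact in facts:
        if fact_type is not None and fact["type"] != fact_type:
            continue
        if entity is not None:
            mentioned = []
            for role, value in fact.items():
                if role == "type":
                    continue
                # Roles may hold one value or a list of values.
                mentioned.extend(value if isinstance(value, list) else [value])
            if entity not in mentioned:
                continue
        results.append(fact)
    return results


google_facts = find_facts(FACTS, entity="Google")
print([fact["type"] for fact in google_facts])
```

Matching is exact here, so "Google Service" does not count as a mention of "Google"; a real system would resolve such variants during extraction.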
From Metadata to Knowledge!




What is Text Mining?

 • Text Mining is an information access technology…
 • Text Mining generates Knowledge
 • Text Mining serves information consumers & producers

[Diagram labels: Text Mining Back-End; Data Repository; Text Mining Front-End (Text Analytics)]
1. Enhanced Search Experience
From standard keyword search…
  Simple recognition of words…
1. Enhanced Search Experience
… to Entity & Fact search!

End-User Benefits
  • Run comprehensive and precise searches
  • Get more relevant documents
  • Find what you don’t know!
2. Faceted Navigation
From “narrow your search”…
2. Faceted Navigation
… to multi-dimensional faceted navigation
  • Point & click filtering
  • Ability to combine several filters at once (and/or)
  • Self-adjusting filters to refine the search

End-User Benefits
  • Get a quick view of document content
  • Navigate within context-relevant information
  • Rapidly focus on targeted documents
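The two mechanisms named on this slide, combining filters with and/or and letting facet counts adjust to the current result set, can be sketched in a few lines of Python. The document set, the `apply_filters` function, and the `facet_counts` function are hypothetical illustrations, not the interface of any TEMIS product.

```python
from collections import Counter

# Hypothetical documents tagged with extracted entities (facet -> values).
DOCS = [
    {"id": 1, "company": {"Google", "Gilbarco"}, "location": {"Atlanta"}},
    {"id": 2, "company": {"Google", "HTC"}, "location": {"United States"}},
    {"id": 3, "company": {"Motorola"}, "location": {"United States"}},
]


def apply_filters(docs, filters, mode="and"):
    """Keep docs matching all ("and") or any ("or") of the (facet, value) filters."""
    if not filters:
        return list(docs)
    combine = all if mode == "and" else any
    return [
        doc for doc in docs
        if combine(value in doc.get(facet, set()) for facet, value in filters)
    ]


def facet_counts(docs, facet):
    """Count facet values over the current result set only."""
    return Counter(value for doc in docs for value in doc.get(facet, set()))


hits = apply_filters(DOCS, [("company", "Google"), ("location", "United States")])
print([doc["id"] for doc in hits])
print(facet_counts(hits, "company"))
```

Recomputing `facet_counts` on `hits` rather than on `DOCS` is what makes the filters self-adjusting: values that no longer occur in the narrowed result set drop out of the list.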
3. Data Analysis and Reporting
From a bug’s-eye view…
3. Data Analysis and Reporting


… to a bird’s-eye view!

End-User Benefits
  • Visualize key Entities & Facts (pie/bar charts)
  • Detect dependencies between Entities & Facts (matrix charts)
  • Zoom in & out by drilling anywhere
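A matrix chart of entity dependencies is, at its simplest, a table of how often two entities appear in the same document. The sketch below assumes that reading; the `DOC_ENTITIES` data and the `cooccurrence` function are hypothetical and only show the counting step that would feed such a chart.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-document entity sets.
DOC_ENTITIES = [
    {"Google", "Gilbarco"},
    {"Google", "HTC", "Motorola"},
    {"Google", "HTC"},
]


def cooccurrence(doc_entities):
    """Count how many documents mention each unordered entity pair."""
    counts = Counter()
    for entities in doc_entities:
        # Sorting gives every pair a single canonical (a, b) key.
        for a, b in combinations(sorted(entities), 2):
            counts[(a, b)] += 1
    return counts


pairs = cooccurrence(DOC_ENTITIES)
for (a, b), n in pairs.most_common():
    print(f"{a} x {b}: {n}")
```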
4. Information Discovery
From a flat list of documents…
4. Information Discovery
… to an information network

[Screenshot labels: Search Panel; Discovery Tools; Entities; Facts; Proofs]

End-User Benefits
  • Search in knowledge, not in documents
  • Get a graphical representation of knowledge
  • Discover information by navigating within Facts
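Navigating within facts can be pictured as walking a graph whose nodes are entities and whose edges are extracted facts. The following is a minimal sketch under that assumption; `EDGES`, `build_graph`, and `find_path` are hypothetical names, and the relations reuse the Gilbarco/Google example from earlier in the deck.

```python
from collections import defaultdict, deque

# (entity, fact type, entity) triples; the graph is treated as undirected.
EDGES = [
    ("Sackett", "Function", "Gilbarco"),
    ("Gilbarco", "Partnership", "Google"),
    ("Google", "Alliance", "HTC"),
    ("Google", "Alliance", "Motorola"),
]


def build_graph(edges):
    """Index every edge from both of its endpoints."""
    graph = defaultdict(list)
    for source, label, target in edges:
        graph[source].append((label, target))
        graph[target].append((label, source))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; return the shortest list of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, label, neighbour)]))
    return None


path = find_path(build_graph(EDGES), "Sackett", "HTC")
for source, label, target in path:
    print(f"{source} --{label}--> {target}")
```

Each hop in the returned path corresponds to a fact that can be traced back to its supporting text, which is presumably the role of the "Proofs" panel.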
Semantic Enrichment at the Core

[Diagram labels]
  Core: Text Mining; Content Enrichment
  Original Content: Journal Scans; Expert Interviews; Event Reports
  Content Annotation; Metadata Extraction
  Automatic Categorization; Entity & Facts Extraction; Taxonomy Management; Content Editors
  Editorial & Content Management; Web Content Management; Product Management
  Related Topics Extraction; Similarity Detection; Smart Linking; Trends Analysis & Charting; Sentiment Analysis
  Visitors & customers
Benefits to Information Producers
Increase website stickiness to maximize ad revenue or subscription utilization!
 • Create more engaging, longer-lasting user visits
     • Richer user experience with context-sensitive information
     • More page views per visit
     • Exposing the “long tail” through suggestions and linking
     • Integrate more content at a fraction of the cost
 • Establish your web properties as a community gateway
     • “70% of all searches do NOT start on Google/MSN/Yahoo,” says Sue Feldman at IDC Research
     • Smart search and navigation are critical to the user’s experience
Re-Packaging Content – Elsevier
 • Objective
     • Develop a revolutionary database indexing the last 28 years of chemistry patents
     • Provide an exceptional user experience by using “smart content”
 • Results
     • ~20 million chemistry patent documents
     • Searchable by chemical reactions, solvents, and reactants extracted directly from the documents
     • Released by Elsevier-MDL in Nov. 2004
 • Currently
     • TEMIS distributes the Chemical Entities Relationships Annotator in partnership with Elsevier
Exposing the Long Tail – Springer
 • Objective
     • Map meaningful words and phrases in journal articles to encyclopedia entries
     • Identify related documents in a pool of over three million journal articles
 • Solution
     • Index incoming journal articles to link each article with the related encyclopedia entry
     • Create a semantic fingerprint for each journal article so the search engine can calculate the degree of relatedness
     • Integrate with Springer’s search engine
 • Benefits
     • Increased product sales by improving content linking
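The slide does not say how a semantic fingerprint is computed or compared. One common way to picture it, assumed here purely for illustration, is a vector of concept counts per article, compared with cosine similarity. The `fingerprint` and `relatedness` functions and the sample concepts are hypothetical.

```python
import math
from collections import Counter


def fingerprint(concepts):
    """Assumed fingerprint: a count vector over concepts found in a text."""
    return Counter(concepts)


def relatedness(fp_a, fp_b):
    """Cosine similarity of two fingerprints; 0.0 if either one is empty."""
    dot = sum(weight * fp_b[term] for term, weight in fp_a.items())
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


article = fingerprint(["enzyme", "protein", "protein", "catalysis"])
entry = fingerprint(["protein", "enzyme"])
other = fingerprint(["galaxy", "telescope"])

print(round(relatedness(article, entry), 3))
print(relatedness(article, other))
```

Because the score depends only on the fingerprints, it can be precomputed at indexing time and handed to the search engine as metadata.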
Answering Burning Questions – EFL

 • Objectives
     • Extract numerical data from case law to enhance information access for lawyers
 • Solution
     • Luxid® with custom annotators (address, activity, compensation, age, turnover…)
     • Export numerical data as metadata to a search engine
 • Benefits
     • Productivity gains in extracting and validating metadata
     • Makes it possible to process huge amounts of case law
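To make the idea of exporting numerical data as metadata concrete, here is a minimal Python sketch using regular expressions. It is not Luxid® and does not reflect how its annotators work; the patterns, the `extract_metadata` function, the field names, and the sample sentence are all hypothetical.

```python
import re

# Hypothetical patterns for two of the annotator types named on the slide.
AGE = re.compile(r"\baged?\s+(\d{1,3})\b", re.IGNORECASE)
COMPENSATION = re.compile(r"compensation of\s+(?:EUR|€)\s?([\d,]+)", re.IGNORECASE)


def extract_metadata(text):
    """Return numeric fields found in the text as a flat metadata record."""
    metadata = {}
    age = AGE.search(text)
    if age:
        metadata["age"] = int(age.group(1))
    compensation = COMPENSATION.search(text)
    if compensation:
        # Strip thousands separators before converting to an integer.
        metadata["compensation_eur"] = int(compensation.group(1).replace(",", ""))
    return metadata


sample = "The claimant, aged 47, was awarded compensation of EUR 12,500."
print(extract_metadata(sample))
```

Stored as typed fields, such values let a search engine answer range queries (for example, compensation above a threshold) that keyword search cannot express.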
Questions?
Thank you!

