User Experiences of Enterprise Semantic Content Management
Amit Sheth, University of Georgia
Panel at Symposium on the User Experience of Business Intelligence & Knowledge Management, IBM Almaden Research Center, San Jose, March 18, 2000.

Advanced Content Management Challenges

The Problem: Massive, disparate information everywhere
- Multiple isolated sources of information that are not shared or integrated
- Large variety of open source, partner, proprietary and extranet information
- Multiple formats (Text, HTML, XML, PDF, etc.)
- Diverse structure (structured, semi-structured, unstructured)
- Multiple media (Text, Audio, Video, Images, etc.)
- Diverse communication channels (FTP, extraction for source, etc.)

The Difficulty & Challenges:
- Inability to have timely actionable information
- Overwhelming amount of information -> in-context, relevant information
- Timely, accurate, personalized & actionable decisions

Knowledge Discovery/Management Requirements

The Problem:
- Aggregation and correlation of passenger/flight information
- Correlate/link huge volumes of information
- Integrated knowledge applications with diverse response to different end users
- Response in near real-time

The Challenge:
- To build a knowledge linking and discovery system that automatically detects hidden relationships
- Intelligent analysis of multiple available sources of information
- Customized knowledge applications targeting diverse needs of different users
- Intelligent analysis of valuable information to provide actionable insight
- Scalable and near real-time system

User Class 1: End Users
Different types of users have different information needs
[Diagram; labels recovered from the figure: Visionics AcSys, Security Portal, Check-in, Interrogation, Boarding Gate, Airport Airspace, Voquette Knowledgebase, Metabase, Threat Scoring, Gov’t Watchlists, News Media, Web Info, LexisNexis, RiskWise, Passenger Records, Reservation Data, Airline Data, Airport Data, Airline and Airport Data, Future and Current Risks, Airport LEO, ARC, AvSec Manager, Data Management, Data Mining, IPG]

Voquette’s Solution for NASA
Example passenger: Smith John

Voquette’s Semantic Technology enables flight authorities to:
- take a quick look at the passenger’s history
- check quickly if the passenger is on any official watchlist
- interpret and understand passenger’s links to other organizations (possibly terrorist)
- verify if the passenger has boarded the flight from a “high risk” region
- verify if the passenger originally belongs to a “high risk” region
- check if the passenger’s name has been mentioned in any news article along with the name of a known bad guy

Threat Score Components of APITAS (APITAS = Airline Passenger Identification and Threat Assessment System)
Example passenger: Smith John

WATCHLIST ANALYSIS
- Action: Voquette’s rich knowledgebase is automatically searched for the possible appearance of this name on any of the watchlists
- Ability Proven: Ability to automatically aggregate relevant rich domain knowledge, correlate it, and rank the threat factors to indicate the threat level of the passenger on the watchlist front

METABASE SEARCH
- Action: Voquette’s rich metabase is searched for this name, and associated content stories mentioning the passenger’s name are retrieved
- Ability Proven: Ability to automatically aggregate and retrieve relevant content stories, field reports, etc. about the passenger that can be used by flight officials to determine if the passenger has any connections with known bad people or organizations

KNOWLEDGEBASE SEARCH
- Action: Voquette’s rich knowledgebase is searched for this name, and associated information such as position, aliases, and relationships (past or present) of this name to other organizations, watchlists, country, etc. is retrieved
- Ability Proven: Ability to automatically aggregate relevant rich domain knowledge about a passenger and correlate it with other data in the knowledgebase to present a visual association picture to the flight official
- [Diagram label: appearsOn watchList: FBI]

LEXIS NEXIS ANNOTATION
- Action: Information about or related to the passenger returned by Lexis Nexis is enhanced by linking important entities to Voquette’s rich knowledgebase
- Ability Proven: Ability to automatically aggregate relevant rich domain knowledge, recognize entities in a piece of text, and correlate them with other data in the knowledgebase to present a clear picture of the passenger to the flight official

LINK ANALYSIS
- Action: Semantic analysis of the various components (watchlist, Lexis Nexis, knowledgebase search, metabase search, etc.) to come up with an aggregate threat score for the passenger
- Ability Proven: Ability to automatically aggregate relevant rich domain knowledge, recognize entities in a piece of text, correlate them with other data in the knowledgebase, and search for relevant content to present an overall idea of the threat level of the passenger, allowing the flight official to take quick action

Link analysis checks (two unlabeled values per check, as shown on the slide):
- Flight Country Check: 45, 0.15
- Person Country Check: 25, 0.15
- Nested Organizations Check: 75, 0.8
Aggregate Link Analysis Score: 17.7

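The slide lists three link-analysis checks with two numbers each and an aggregate score of 17.7, but it does not say how the aggregate is computed. The sketch below assumes the first number is a raw score and the second a weight (the columns are unlabeled, so this reading is a guess) and shows two common ways to combine them. Neither reproduces 17.7, so the actual APITAS formula presumably uses a normalization or components not shown on the slide; this is only an illustration of weighted score aggregation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Check:
    name: str
    score: float   # assumption: first number on the slide is a raw score
    weight: float  # assumption: second number on the slide is a weight


# Values copied from the slide; their interpretation is assumed.
CHECKS = [
    Check("Flight Country Check", 45, 0.15),
    Check("Person Country Check", 25, 0.15),
    Check("Nested Organizations Check", 75, 0.8),
]


def weighted_sum(checks: List[Check]) -> float:
    """Sum of score * weight over all checks."""
    return sum(c.score * c.weight for c in checks)


def weighted_mean(checks: List[Check]) -> float:
    """Weighted sum divided by the total weight."""
    total_weight = sum(c.weight for c in checks)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return weighted_sum(checks) / total_weight
```

Under these assumptions the weighted sum is about 70.5 and the weighted mean about 64.1, which confirms that the slide's 17.7 comes from a formula the deck does not disclose.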
Intelligence Analysis Browsing Scenario
- Knowledge Browser Demo
- Automatic Content Enhancement Demo

Semantic Application Example – Financial Research Dashboard
Voquette Research Dashboard: http://www.voquette.com/demo
- Focused relevant content organized by topic (semantic categorization)
- Automatic Content Aggregation from multiple content providers and feeds
- Related relevant content not explicitly asked for (semantic associations)
- Competitive research inferred automatically
- Automatic 3rd party content integration

Innovations that affect User Experience

BSBQ: Blended Semantic Browsing and Querying
- Ability to query and browse relevant desired content in a highly contextual manner

Seamless access/processing of Content, Metadata and Knowledge
- Ability to retrieve relevant content, view related metadata, access relevant knowledge and switch between all the above, allowing user to follow his train of thought

dACE: dynamic Automatic Content Enhancement
- Ability to provide enhanced annotation features, allowing the user to retrieve relevant knowledge about significant pieces of content during content consumption

Semantic Engine APIs with XML output
- Ability to create customized APIs for the Semantic Engine involving Semantic Associations with XML output to cater to any user application

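The deck does not show what the Semantic Engine's XML output looks like. As a rough sketch of the idea, the snippet below serializes an entity and its semantic associations to XML with Python's standard library. The element and attribute names (`result`, `association`, `relation`) and the function name are invented for illustration; the sample values are borrowed from the Cisco example used elsewhere in the deck.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def associations_to_xml(entity: str, associations: Iterable[Tuple[str, str]]) -> str:
    """Serialize (relation, target) pairs for one entity as an XML string.

    The schema here is hypothetical, not Voquette's actual format.
    """
    root = ET.Element("result", {"entity": entity})
    for relation, target in associations:
        node = ET.SubElement(root, "association", {"relation": relation})
        node.text = target
    return ET.tostring(root, encoding="unicode")


xml_text = associations_to_xml(
    "Cisco Systems",
    [("Competition", "Nortel Networks"), ("Executive", "John Chambers")],
)
```

A client application could parse `xml_text` with any XML library, which is presumably the point of exposing associations in XML rather than in an engine-specific format.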
SCORE System Architecture
[Architecture diagram; labels recovered from the figure: Knowledge Browser, Analyst WB, Dashboard, Search, Personalization, Metadata Extractor Agents, Knowledge Extractor Agents, CACS, Semantic Engine (Automated Maintenance), KnowledgeBase, Metabase (Database of Richly Indexed Metadata), WorldModel, Knowledge Toolkit, Extractor Toolkit, Analysis, Reports, Mining, XML, Content Enhancement, Domain Experts, Metadata, Enhanced Metadata, ENTERPRISE USERS, Custom Content and Knowledge APIs, Std. Content APIs]
[Content sources shown: Structured & Semi-Structured Content (XML Documents, Web Sites, Corporate Repositories); Unstructured Content (Email, Word Documents, PowerPoint Presentations); Proprietary Content; Corporate Web Sites; Public Domain Web Sites; Subscription Content; Trusted Knowledge Sources]

Semantic Web – Intelligent Content
[Diagram centered on COMPANY; content types shown: Related Stock News, Industry News, Technology Products, EPA Regulations, Competition, SEC]
[Relationship labels: COMPANIES in Same or Related INDUSTRY; COMPANIES in INDUSTRY with Competing PRODUCTS; Impacting INDUSTRY or Filed By COMPANY; Important to INDUSTRY or COMPANY]
Intelligent Content = What You Asked for + What you need to know!

User Class 2: Enterprise Application Developer

Automation:
- KnowledgeBase (creation and maintenance)
- Dynamic content (metadata extraction and scheduled updates)
- Multiple techniques/technologies (DB, machine learning, knowledgebase, lexical/NLP, statistical, etc.)
- Content Enhancement (value-added metatagging and indexing)

Toolkits:
- About 30 integrated tools for content/knowledge creation, processing, maintenance and management

Discussion/Questions?
Case Studies available: http://www.voquette.com/demo

Voquette SCORE Technology Architecture
- Distributed agents that automatically extract relevant semantic metadata from structured and unstructured content
- Fast main-memory based query engine with APIs and XML output
- CACS provides automatic classification (w.r.t. WorldModel) from unstructured text and extracts contextually relevant metadata
- Distributed agents that automatically extract/mine knowledge from trusted sources
- Toolkit to design and maintain the Knowledgebase
- Knowledgebase represents the real-world instantiation (entities and relationships) of the WorldModel
- WorldModel specifies enterprise’s normalized view of information (ontology)

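The last two points distinguish the WorldModel (the ontology: which kinds of entities and relationships exist) from the Knowledgebase (concrete entities and relationships that instantiate it). A minimal sketch of that split follows, assuming a WorldModel can be reduced to a set of entity classes plus typed relationships. All class, method, and relationship names are hypothetical and do not come from SCORE itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class WorldModel:
    """Schema level: entity classes and typed relationships (hypothetical)."""
    entity_classes: Set[str]
    # relationship name -> (subject class, object class)
    relationships: Dict[str, Tuple[str, str]]


@dataclass
class Knowledgebase:
    """Instance level: entities and facts validated against a WorldModel."""
    model: WorldModel
    entities: Dict[str, str] = field(default_factory=dict)  # name -> class
    facts: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_entity(self, name: str, cls: str) -> None:
        if cls not in self.model.entity_classes:
            raise ValueError(f"unknown entity class: {cls}")
        self.entities[name] = cls

    def add_fact(self, subject: str, relation: str, obj: str) -> None:
        if relation not in self.model.relationships:
            raise ValueError(f"unknown relationship: {relation}")
        subject_cls, object_cls = self.model.relationships[relation]
        if self.entities.get(subject) != subject_cls:
            raise ValueError(f"{subject} is not a known {subject_cls}")
        if self.entities.get(obj) != object_cls:
            raise ValueError(f"{obj} is not a known {object_cls}")
        self.facts.add((subject, relation, obj))


model = WorldModel(
    entity_classes={"Company", "Person"},
    relationships={
        "CEO of": ("Person", "Company"),
        "Competes with": ("Company", "Company"),
    },
)
kb = Knowledgebase(model)
kb.add_entity("Cisco Systems", "Company")
kb.add_entity("Nortel Networks", "Company")
kb.add_entity("John Chambers", "Person")
kb.add_fact("John Chambers", "CEO of", "Cisco Systems")
kb.add_fact("Cisco Systems", "Competes with", "Nortel Networks")
```

Keeping the schema separate from the instances means the Knowledgebase can reject facts that do not fit the enterprise's normalized view, which is one plausible reason for the two-layer design.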
Content Enhancement Workflow
[Diagram; labels recovered from the figure: Semantic Metadata, Syntax Metadata]

Content Asset Index Evolution

Extractor Agent for Bloomberg
- Scans text for analysis
- Metadata extracted automatically
- Creates asset (index) out of extracted metadata

Asset after extraction
- Syntax Metadata: Producer: BusinessWire; Source: Bloomberg; Date: Sept. 10 2001; Location: San Jose, CA; URL: http://bloomberg.com/1.htm; Media: Text
- Semantic Metadata: Company: Cisco Systems, Inc.

Categorization & Auto-Cataloging System (CACS)
- Scans text for analysis
- Classifies document into pre-defined category/topic
- Appends topic metadata to asset

Asset after classification
- Syntax Metadata: Producer: BusinessWire; Source: Bloomberg; Date: Sept. 10 2001; Location: San Jose, CA; URL: http://bloomberg.com/1.htm; Media: Text
- Semantic Metadata: Company: Cisco Systems, Inc.; Topic: Company News

Knowledge Base
- Leverages knowledge to enhance metatagging
- Entry for Cisco Systems: Company: Cisco Systems; Ticker: CSCO; Exchange: NASDAQ; Industry: Telecomm.; Sector: Computer Hardware; Executives: John Chambers (CEO of); Competition: Nortel Networks (Competes with); Headquarters: San Jose

Enhanced Content Asset (indexed)
- Syntax Metadata: Producer: BusinessWire; Source: Bloomberg; Date: Sept. 10 2001; Location: San Jose, CA; URL: http://bloomberg.com/1.htm; Media: Text
- Semantic Metadata: Company: Cisco Systems, Inc.; Topic: Company News; Ticker: CSCO; Exchange: NASDAQ; Industry: Telecomm.; Sector: Computer Hardware; Executive: John Chambers; Competition: Nortel Networks; Headquarters: San Jose, CA

[Remaining diagram labels: XML Feed, Semantic Engine]

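The three stages on this slide (extraction, classification, knowledge-based enhancement) can be sketched as a small pipeline over a metadata record. The field values below are taken from the slide; the function names, the keyword rule standing in for CACS, and the sample sentence are assumptions made purely for illustration, since the deck does not describe how classification is actually performed.

```python
from copy import deepcopy

# Output of the extractor stage, using the values shown on the slide.
EXTRACTED_ASSET = {
    "syntax": {
        "Producer": "BusinessWire",
        "Source": "Bloomberg",
        "Date": "Sept. 10 2001",
        "Location": "San Jose, CA",
        "Media": "Text",
    },
    "semantic": {"Company": "Cisco Systems, Inc."},
}

# Knowledge base entry, keyed by company name (values from the slide).
KNOWLEDGE_BASE = {
    "Cisco Systems, Inc.": {
        "Ticker": "CSCO",
        "Exchange": "NASDAQ",
        "Industry": "Telecomm.",
        "Sector": "Computer Hardware",
        "Executive": "John Chambers",
        "Competition": "Nortel Networks",
        "Headquarters": "San Jose, CA",
    }
}

# Hypothetical keyword rule; a toy stand-in for the CACS classifier.
TOPIC_KEYWORDS = {"Company News": ("announced", "earnings")}


def classify(asset: dict, text: str) -> dict:
    """Return a copy of the asset with a Topic appended if a keyword matches."""
    result = deepcopy(asset)
    lowered = text.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            result["semantic"]["Topic"] = topic
            break
    return result


def enhance(asset: dict, knowledge_base: dict) -> dict:
    """Return a copy of the asset enriched with knowledge base facts."""
    result = deepcopy(asset)
    facts = knowledge_base.get(result["semantic"].get("Company"), {})
    for key, value in facts.items():
        result["semantic"].setdefault(key, value)  # never overwrite extracted values
    return result


# Made-up sample sentence, only to drive the toy classifier.
SAMPLE_TEXT = "Cisco Systems, Inc. announced quarterly earnings."
enhanced_asset = enhance(classify(EXTRACTED_ASSET, SAMPLE_TEXT), KNOWLEDGE_BASE)
```

Each stage returns a new record instead of mutating its input, so the intermediate assets on the slide (after extraction, after classification, after enhancement) stay available for inspection.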
Intelligent Content Empowers the User
- Content which does contain the words the user asked for (Extractor Agents)
- Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)
- Content the user did not think to ask for, but which he needs to know (Semantic Associations)
The three together make up the Intelligent Content delivered to the End-User.

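The three kinds of content on this slide can be illustrated with a toy retrieval function. The documents, the association table, and the function name are made up for the example; a real engine would draw associations from the Knowledgebase and not from a hard-coded dictionary.

```python
from typing import Dict, List

# Made-up documents with extracted company metadata.
DOCS = [
    {"id": "d1", "text": "Cisco Systems ships new router", "company": "Cisco Systems"},
    {"id": "d2", "text": "CSCO shares rise on NASDAQ", "company": "Cisco Systems"},
    {"id": "d3", "text": "Nortel Networks cuts prices", "company": "Nortel Networks"},
    {"id": "d4", "text": "Weather update for San Jose", "company": None},
]

# Hypothetical semantic associations (here: competitors).
ASSOCIATIONS = {"Cisco Systems": {"Nortel Networks"}}


def intelligent_content(entity: str) -> Dict[str, List[str]]:
    """Split matching documents into the three tiers named on the slide."""
    tiers: Dict[str, List[str]] = {"keyword": [], "metadata": [], "association": []}
    related = ASSOCIATIONS.get(entity, set())
    for doc in DOCS:
        if entity.lower() in doc["text"].lower():
            tiers["keyword"].append(doc["id"])      # contains the words asked for
        elif doc["company"] == entity:
            tiers["metadata"].append(doc["id"])     # about the entity, different words
        elif doc["company"] in related:
            tiers["association"].append(doc["id"])  # not asked for, but related
    return tiers


tiers = intelligent_content("Cisco Systems")
```

With this data, `d1` is a keyword hit, `d2` is found only through its metadata, `d3` surfaces through the competitor association, and `d4` is not returned at all.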
