Collaborative Ontology Building Project
- a multiagent-based ontology editing and discovery environment
Jie Bao
Artificial Intelligence Research Laboratory
Dept of Computer Science, Iowa State University, Ames IA 50010
[email_address]
http://www.cs.iastate.edu/~baojie
Project homepage: http://boole.cs.iastate.edu:9090/COB/
A research proposal, Dec 02, 2003
COB
Without SHOE how can you be a RACER?
Without Sesame how can you make OIL?
Semantic Web is a plan of good
But with no ontology it's only a nil.
Everyone makes a small piece of brick
Not in one day can we make Rome real.
Let's build ontology together and hard
Just like ants build their hill.
Outline
- Objectives
- Key difficulties
- Background review
- A tentative framework
What is the problem
- The Semantic Web needs a general and open ontology library, but ontology building is a time-consuming, knowledge-intensive process.
  - Domain experts are needed, and nobody has full knowledge.
  - Intellectual property/copyright issues also hinder the wide use of commercial ontologies (e.g. Cyc).
  - Automatic ontology discovery and mapping are still impossible in general.
- Existing ontology editing and discovery tools are standalone and too complex.
  - They are not suitable for team ontology generation.
  - The jargon is daunting for ordinary people who know little about ontology.
- Data sources are distributed, heterogeneous, and dynamic.
  - New concepts appear every day: Election2004
Related problems
- Distributed learning
  - Learning from distributed, heterogeneous, dynamic, multiple datasets
- Software engineering
  - Concurrent version control and management
  - Open source issues (copyright vs. copyleft)
- Knowledge management
  - Knowledge sharing in a group/project
  - Automatic knowledge aggregation
Design Philosophy (1): about people
- Teamwork is needed
  - Nobody can know everything
- But everyone is an expert in something
  - Everybody knows something: your dog, your department, your favorite TV show
- You can build big things from small pieces
  - One expert can write several articles for an encyclopedia
  - And hundreds of experts can work together
- However, people always have different viewpoints
  - Conflict: does the 21st century begin in 2000 or 2001?
  - Redundancy: IraqWar, WarInIraq, GulfWarII
Design Philosophy (2): about agents and software
- Small pieces of ontology are generated by agents
  - Those agents are domain experts or trained software agents
  - A lightweight, browser-based ontology editor that requires minimal user effort
  - Automatic and controllable information collection by software robots
- The ontology repository is maintained by machine learning algorithms
  - Ontology mapping on controlled topics
  - Detect and reduce redundancy and conflicts by inference
A Desirable Case -- Pop Music Ontology (1)
- Suppose we want to build an ontology and knowledge base about pop music, called PopOnt.
  - (Callout) Even kids know
- John is a teenage student who knows nothing about ontology, but he knows a lot about pop music. He'd like to share his knowledge with PopOnt.
  - (Callout) "I'm willing to spend 5 minutes for you"
- There are millions of pop music fans like John, and their knowledge is complementary.
- Some of them may go to the PopOnt website and write one or two simple sentences, like [M. Jackson] [isn't] a [country music artist].
- They may also correct others' mistakes.
A Desirable Case -- Pop Music Ontology (2)
- (Callout) You don't even need to go to the website
- There are also mailing lists, newsgroups, weblogs, p2p applications and websites about pop music, which can be used for validation or mining.
  - For example, if [M. Jackson] rarely co-occurs with [country music], it is more likely that [M. Jackson] [isn't] a [country music artist] is true.
  - (Callout) An agent can be an expert, too
- It is even better if those articles have a subject, an abstract, or keywords, which can be used as labeled instances for machine learning.
- New concepts can be mined, and cross-validated by people, too.
- Finally, PopOnt is built in a couple of months and is free for everyone to use.
Key Difficulties 1: Logic breakdown
- How to make ontology editing as easy as writing a diary?
- [Figure: an ontology, drawn as a Class / SubClass / SubSubClass hierarchy with classes, slots and instances, is broken down into a list of [subject][predicate][object] sentences]
- Can a complex ontology be broken down into a group of single sentences?
  - In other words, how to decompose a complex description logic statement into very simple FOPL sentences?
  - The inverse composition is also needed.
- Each single sentence is as simple as "A is B" or "A has B".
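The breakdown and its inverse can be sketched for the simplest case, a pure class hierarchy. This is only an illustration under stated assumptions: the nested-dictionary input, the predicate name `subClassOf`, and the function names `decompose` and `compose` are hypothetical and do not come from COB. Full description logic statements would need much more than this.

```python
from collections import defaultdict


def decompose(tree, parent=None):
    """Flatten a nested {class: {subclass: {...}}} hierarchy into
    (subject, predicate, object) sentences."""
    triples = []
    for name, children in tree.items():
        if parent is not None:
            triples.append((name, "subClassOf", parent))
        triples.extend(decompose(children, name))
    return triples


def compose(triples):
    """Inverse step: rebuild the nested hierarchy from subClassOf sentences.
    A root class without any subclass leaves no sentence and cannot be rebuilt."""
    children = defaultdict(list)
    names, has_parent = set(), set()
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            children[obj].append(subject)
            has_parent.add(subject)
            names.update((subject, obj))

    def build(name):
        return {child: build(child) for child in children[name]}

    roots = sorted(n for n in names if n not in has_parent)
    return {root: build(root) for root in roots}


# Hypothetical hierarchy mirroring the Class / SubClass / SubSubClass figure.
tree = {
    "Class": {
        "SubClassA": {"SubSubA1": {}, "SubSubA2": {}},
        "SubClassB": {"SubSubB1": {}},
    }
}
sentences = decompose(tree)
rebuilt = compose(sentences)
```

Each element of `sentences` is one "A is B" style statement that a contributor could write or correct on its own, and `compose` shows that the hierarchy is recoverable from them.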
Key Difficulties 2: Ontology Evolution
- How to refine an ontology through the cooperation of experts and software agents?
- People and agents are all error-prone.
  - Interactive and iterative cross-validation is central.
- People are "lazy" and "natural".
  - An ontology piece may first be written in short natural language and later be refined by other people or agents into a more formal and more complex piece.
- Inference is needed to rule out conflicting information and to detect malicious or wrong information.
Key Difficulties 3: Ontology Mining
- Where to collect source information?
  - Google search? No.
  - Pull: agents search and know where the "good" sources are. That can be verified by whether the source is well cited (referenced) or not.
  - Push: information is automatically pushed to agents via credible channels.
- Automatic extraction is still impossible
  - It depends on NLP
  - Article summaries/keywords are helpful, especially when the summary overlaps with the existing ontology. Such summarized text can be used as labeled instances.
- Simplified tasks are feasible
  - Is the keyword a consistent concept?
  - Are some keywords related?
- Comparison: in content-based retrieval from video databases, automatic discovery of semantics based on image processing / pattern recognition has proven not very successful. Semantics from expert knowledge are needed in an MPEG-7 stream.
Key Difficulties 4: Ontology Mapping
- People always give the same thing different names, or divide concepts into groups in multiple ways.
- Automatic general ontology mapping is still hard.
- Simplified mapping is more feasible while still useful
  - Check whether a pair of concepts (with instances) is the same or not
  - Detect redundancy and suggest merges
Beyond INDUS
- INDUS is a distributed learning system, while COB is a MAS (multiagent system) learning system
  - Agents in different channels have different focuses for learning
  - They work together toward the same goal
- INDUS has a heavyweight database mechanism, while COB aims at a lightweight implementation
  - The ontology/KB is stored as atomic sentences
  - Interface for dummies, not for gurus
  - Data sources are usually small but change quickly, and their number is huge
  - Queries use the inference power of the ontology language
Semantic Web meets MAS
- COB is an application of MAS learning from data on the web
  - Learn new concepts from instances
  - Validate concepts from other agents/humans
  - The learner can take any form: BayesNet, neural net, decision tree, KNN
- Everything is about semantics
  - Agents share an ontology but also have dialect issues
  - Small pieces of semantics are carried by agents and aggregated at "home"
  - Guess semantics from labeled instances
- An application that shows how to implement proof and trust on the Semantic Web
Ready Techniques
- Dynamic knowledge sharing
  - RSS (RDF Site Summary): answers questions like "Who wrote this?", "When was this published?", and "What is/are the topic(s) of discussion?"
  - RSS is widely used for news aggregation and automatic news discovery.
- Grid/social computation
  - Grid: distribute a computation task across the internet and combine the results.
  - Blog and wiki: easy-to-use site building tools, instead of an HTML editor. Topics are refined by the effort of a community.
- Peer-to-peer communication
  - A local repository can be shared with other peers
  - The other peer can be an agent in COB!
- However, they all lack semantics in some way. The unfiltered information may flood the user.
Collaborative Ontology Building Example: FOAF
- http://xml.mfd-consult.dk/foaf/explorer/
- FoaF is an acronym for Friend of a Friend, an experimental project and vocabulary for the Semantic Web.
- It is based on the idea of a machine-readable version of the current World Wide Web, with homepages, mailing lists, travel itineraries, calendars, address books and the like.
- Everyone can join and add their own information
- It's RDF-based
Collaborative Ontology Building Example: Wikipedia
- 170,000 concepts in English alone, more in other languages
- An open encyclopedia
  - Everyone can edit any page
  - Based on the assumption that most people are nice
  - And it has proven true!
- Limitation: the relations between items are not formal, and it is for human reading only (at least for now)
Collaborative Ontology Building Example: Open Directory Project
- http://www.dmoz.org/
- 60,000 editors
- 460,000 concepts
- Collaborative taxonomy building
- Open to everyone
- Limitation: taxonomy only
System design
[Figure: system architecture in four parts, labeled A to D]
- A: OntoWiki with an OWL-like syntax, edited by human experts through a browser
- B: Semantic RSS-aware channels fed by email lists, newsgroups, forums, blogs, wikis and P2P nodes
- C: Agents: ontology mining, ontology alignment
- D: Ontology repository: version control, redundancy check, conflict check, cross validation
Part A (1): OntoWiki
- Everyone can edit any concept
- Version control is enabled
- Ontology-guided editing
- Should have an ontology visualizer
Part A (2): OWL-like syntax
- A subset of OWL is used
- A single statement is an RDF-like triple: [subject] [predicate] [object]
- Namespaces are used: cob:instanceOf, owl:Class, rdfs:subClassOf
- The core COB language is defined in its own namespace (see the term list below)

// COB terms
cob:equals  cob:documentation
// OWL terms
owl:AllDifferent  owl:allValuesFrom  owl:backwardCompatibleWith  owl:cardinality  owl:Class  owl:complementOf
owl:DatatypeProperty  owl:DeprecatedClass  owl:DeprecatedProperty  owl:differentFrom  owl:disjointWith  owl:distinctMembers
owl:equivalentClass  owl:equivalentProperty  owl:FunctionalProperty  owl:hasValue  owl:imports  owl:incompatibleWith
owl:intersectionOf  owl:InverseFunctionalProperty  owl:inverseOf  owl:maxCardinality  owl:minCardinality  owl:Nothing
owl:ObjectProperty  owl:oneOf  owl:onProperty  owl:Ontology  owl:priorVersion  owl:Restriction
owl:sameAs  owl:someValuesFrom  owl:SymmetricProperty  owl:Thing  owl:TransitiveProperty  owl:unionOf  owl:versionInfo
rdf:List  rdf:nil  rdf:type
rdfs:comment  rdfs:Datatype  rdfs:domain  rdfs:label  rdfs:Literal  rdfs:range  rdfs:subClassOf  rdfs:subPropertyOf
Part A (3): Instance Example

Source:
# [cob:Instance]
# [cob:instanceOf] [Student]
# [cob:instanceOf] [Chinese]
# [cob:equals] [鲍捷]
# [hasSurname] Bao
# [hasFirstname] Jie
# [worksOn] [semanticWeb]
# [worksOn] [MAS]
# [worksOn] [complexSystem]
# [advisedBy] [Honavar]
# [memberOf] [aiLab]
# [hasEmail] baojie@cs.iastate.edu
# [hasHomepage] http://www.cs.iastate.edu/~baojie
# [cob:documentation] Hi, I love cats

Screen shows (wiki page BaoJie; "?" marks a concept that has no page yet):
cob:Instance
cob:instanceOf Student ?
cob:instanceOf Chinese ?
cob:equals 鲍捷
hasSurname Bao
hasFirstname Jie
worksOn semanticWeb ?
worksOn MAS ?
worksOn complexSystem ?
advisedBy Honavar ?
memberOf aiLab ?
hasEmail baojie@cs.iastate.edu
hasHomepage http://www.cs.iastate.edu/~baojie
cob:documentation Hi, I love cats
(Wiki controls: Edit this page, More info..., Attach file...)
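How such a page source might be turned into triples can be sketched as follows. This is a minimal sketch, not the OntoWiki parser: the function name `parse_page`, the `("concept", ...)` / `("literal", ...)` tagging, and the choice to represent a bare marker such as [cob:Instance] with an object of `None` are all assumptions made for illustration.

```python
import re

# "# [predicate] rest-of-line", tolerating spaces inside the brackets.
LINE = re.compile(r"^#\s*\[\s*([^\]]+?)\s*\]\s*(.*)$")
# An object written as "[Concept]" refers to another concept page.
CONCEPT = re.compile(r"\[\s*(.+?)\s*\]")


def parse_page(subject, source):
    """Turn wiki source lines into (subject, predicate, object) triples.
    The page name plays the role of the subject (an assumption)."""
    triples = []
    for raw in source.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # not a statement line
        predicate, rest = match.group(1), match.group(2).strip()
        if not rest:
            # Bare marker such as [cob:Instance]; kept with no object.
            triples.append((subject, predicate, None))
            continue
        ref = CONCEPT.fullmatch(rest)
        if ref:
            triples.append((subject, predicate, ("concept", ref.group(1))))
        else:
            triples.append((subject, predicate, ("literal", rest)))
    return triples


page_source = """\
# [cob:Instance]
# [cob:instanceOf] [Student]
# [cob:equals][ 鲍捷 ]
# [hasSurname] Bao
# [worksOn] [semanticWeb]
"""
page_triples = parse_page("BaoJie", page_source)
```

Keeping concept references and literals apart matters for the screen view, where a referenced concept without its own page is rendered with a "?".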
Part A (4): Name Space
- Java-like package naming, which shows the relatedness of concepts even when they don't inherit from the same concept
- Packages form a DAG
- Internationalization is enabled

//cob:Thing.Country.US.Iowa.Ames.ISU
//cob:Thing.Education.University.Iowa.ISU
[cob:instanceOf] [PublicUniversity]
[cob:instanceOf] [dmoz:University]
[cob:equals] [Iowa State University]

// cobZH:事物.美国大学.艾奥瓦州立大学 (Chinese: Thing . US University . Iowa State University)
[cob:language] zh // Chinese
[cob:equals] [cob:Thing.Country.US.Iowa.Ames.ISU]

//cob:Thing.Education.University.Idaho.ISU
[cob:instanceOf] [PublicUniversity]
[cob:instanceOf] [dmoz:University]
[cob:equals] [Idaho State University]
Part B: Semantic RSS
- RSS has no semantics
- We can use Dublin Core to enhance RSS
- Keywords are concepts or concept candidates in the ontology
- Agents listen to S-RSS channels and discover new concepts

<channel rdf:about="http://boole.cs.iastate.edu:9090/COB/">
  <title>COB Project</title>
  <link>http://boole.cs.iastate.edu:9090/COB/</link>
  <description>AI Ontology</description>
  <language>en-us</language>
  <items>
    <rdf:Seq>
      <rdf:li rdf:resource="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main" />
    </rdf:Seq>
  </items>
</channel>
<item rdf:about="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main">
  <title>Main</title>
  <link>http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main</link>
  <description>129.186.93.7 changed this page on Wed Dec 03 19:18:23 CST 2003:&lt;br />&lt;hr />&lt;br /></description>
  <wiki:version>27</wiki:version>
  <wiki:diff>http://boole.cs.iastate.edu:9090/COB/Diff.jsp?page=Main&amp;r1=-1</wiki:diff>
  <dc:date>2003-12-04T01:18:23Z</dc:date>
  <dc:contributor>
    <rdf:Description>
      <rdf:value>129.186.93.7</rdf:value>
    </rdf:Description>
  </dc:contributor>
  <wiki:history>http://boole.cs.iastate.edu:9090/COB/PageInfo.jsp?page=Main</wiki:history>
</item>
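A listening agent could be sketched as below. The slide does not say which Dublin Core element carries the keywords; this sketch assumes `dc:subject`, and the feed content, the example.org URLs, and the function name `candidate_concepts` are hypothetical. Only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, RSS 1.0 and Dublin Core elements.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical S-RSS feed: an RSS 1.0 item annotated with dc:subject keywords.
FEED = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/post/1">
    <title>Example post</title>
    <link>http://example.org/post/1</link>
    <dc:subject>SemanticWeb</dc:subject>
    <dc:subject>OntologyMapping</dc:subject>
  </item>
</rdf:RDF>
"""


def candidate_concepts(feed_xml, known_concepts):
    """Return {keyword: [links]} for keywords not yet in the ontology."""
    root = ET.fromstring(feed_xml.strip())
    found = {}
    for item in root.findall("rss:item", NS):
        link = item.findtext("rss:link", default="", namespaces=NS)
        for subject in item.findall("dc:subject", NS):
            keyword = (subject.text or "").strip()
            if keyword and keyword not in known_concepts:
                found.setdefault(keyword, []).append(link)
    return found


new_concepts = candidate_concepts(FEED, known_concepts={"SemanticWeb"})
```

Keywords already in the ontology are skipped, so only concept candidates are reported, together with the items that mention them for later cross-validation.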
Part C (1): Agent
- Each agent
  - Traces information back to its source and checks its credibility
  - Does filtering and text normalization
  - Extracts new concepts from instances
  - Extracts possible general relationships (like [cob:alsoSee]) between concepts
- And they may differ
  - They need not use the same learning algorithm
  - Learning from email headers is different from learning from free-text content
- Dialect
  - Agent 1: I listen to the Idaho S.U. mailing list and know ISU = Idaho State University
  - Agent 2: I watch a blog in Iowa and know ISU = Iowa State University
- Communication helps
  - Agent 1: P([M. Jackson]^[CountryMusic])=0.1
  - Agent 2: P([M. Jackson]^[CountryMusic])=0.03
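One way communication could help is for agents to exchange the counts behind their estimates rather than the estimates alone. The sketch below assumes each agent estimates the co-occurrence probability as a simple document frequency; the counts (10 of 100 and 30 of 1000) are invented to reproduce the 0.1 and 0.03 above, and `Report` and `pooled_estimate` are hypothetical names. The slides do not specify how COB agents combine their beliefs.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """What one agent observed in its own channel."""
    agent: str
    both: int   # documents mentioning both concepts
    total: int  # documents observed in the channel

    @property
    def estimate(self):
        return self.both / self.total


def pooled_estimate(reports):
    """Pool the raw counts, so agents with more evidence weigh more."""
    both = sum(r.both for r in reports)
    total = sum(r.total for r in reports)
    if total == 0:
        raise ValueError("no documents observed")
    return both / total


reports = [
    Report("Agent 1", both=10, total=100),    # local estimate 0.1
    Report("Agent 2", both=30, total=1000),   # local estimate 0.03
]
pooled = pooled_estimate(reports)  # 40 / 1100
```

Pooling counts instead of averaging the two probabilities keeps a small channel from outweighing a large one.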
Part C (2): Ontology Alignment
- Do mapping on restricted cases
  - When an agent or expert doubts whether some concepts are the same, it asks the OntologyAlignmenter, supplying an instance set
- Merge detected duplicate concepts, like IraqWar and WarInIraq
  - Be careful: UniversityOfWashington and WashingtonUniversity are different. This can be learned from instances.
- Manual alignment is enabled, too
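A restricted, instance-based check of this kind can be sketched with set overlap. Jaccard similarity and the 0.5 threshold are choices made here for illustration, not the method of the proposal, and the instance sets are made up.

```python
def jaccard(instances_a, instances_b):
    """Share of instances that two concepts have in common (0.0 to 1.0)."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # no evidence at all, so no basis for a merge
    return len(a & b) / len(a | b)


def suggest_merge(instances_a, instances_b, threshold=0.5):
    """Suggest merging two concepts when their instance sets overlap enough."""
    return jaccard(instances_a, instances_b) >= threshold


# Hypothetical instance sets collected by agents.
iraq_war = {"Baghdad2003", "OperationIraqiFreedom", "Basra2003"}
war_in_iraq = {"Baghdad2003", "OperationIraqiFreedom", "Fallujah"}
univ_of_washington = {"Seattle", "Huskies"}
washington_univ = {"StLouis", "Bears"}

merge_wars = suggest_merge(iraq_war, war_in_iraq)
merge_universities = suggest_merge(univ_of_washington, washington_univ)
```

Similar names with disjoint instances, as in the two universities, give no merge suggestion, while differently named concepts with shared instances do. A suggestion would still go to cross-validation or manual alignment.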
Part D: Ontology Repository
- Version control
  - Keep versions for each concept, lock mature concepts, detect malicious changes
- Redundancy check
  [I.S.U] [cob:instanceOf] [University]
  [I.S.U] [cob:alsoSee] [Cyclone]
  [Iowa State University] [cob:instanceOf] [PublicUniversity]
  [Iowa State University] [cob:alsoSee] [Cyclone]
  [PublicUniversity] [cob:subClassOf] [University]
- Conflict check
  [ISU] [locatedIn] [Ames]
  [ISU] [locatedIn] [Des Moines]
- Cross validation
  - Score each agent and expert for its credibility
  - Check the soundness of an input against its peer inputs
- Refactoring (rename, remove, merge)
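The two checks can be sketched over plain triples. The rules used here are assumptions for illustration: a conflict is a property declared single-valued that has two different values for one subject, and a redundancy candidate is a pair of concepts that share a class (after following cob:subClassOf) and a cob:alsoSee target. The slides do not define the actual inference rules.

```python
from collections import defaultdict


def find_conflicts(triples, functional):
    """Report (subject, predicate) pairs that have more than one value
    for a predicate assumed to be single-valued."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    return {key: sorted(v) for key, v in values.items() if len(v) > 1}


def superclasses(cls, triples):
    """The class itself plus everything reachable via cob:subClassOf."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if p == "cob:subClassOf" and s == current and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def redundancy_candidates(triples):
    """Pairs of concepts sharing a (possibly inherited) class and an alsoSee target."""
    types, related = defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        if p == "cob:instanceOf":
            types[s] |= superclasses(o, triples)
        elif p == "cob:alsoSee":
            related[s].add(o)
    names = sorted(types)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if types[a] & types[b] and related[a] & related[b]
    ]


repository = [
    ("I.S.U", "cob:instanceOf", "University"),
    ("I.S.U", "cob:alsoSee", "Cyclone"),
    ("Iowa State University", "cob:instanceOf", "PublicUniversity"),
    ("Iowa State University", "cob:alsoSee", "Cyclone"),
    ("PublicUniversity", "cob:subClassOf", "University"),
    ("ISU", "locatedIn", "Ames"),
    ("ISU", "locatedIn", "Des Moines"),
]
duplicates = redundancy_candidates(repository)
conflicts = find_conflicts(repository, functional={"locatedIn"})
```

Both functions only flag candidates. Deciding which value is right, or whether to merge, is left to cross-validation and refactoring.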
Summary
- What's new
  - Lightweight ontology editor for a community
  - Collaborative, distributed ontology learning based on logic decomposition
  - Semantic extension to RSS
  - Multiagent ontology mining from trusted channels
  - Ontology management based on proof and trust
- COB doesn't aim to
  - Solve ontology mapping in general
  - Solve ontology extraction from free text in general

Collaborative Ontology Building Project

  • 1. Collaborative Ontology Building Project - a multiagent-based ontology editing and discovery environment Jie Bao Artificial Intelligence Research Laboratory Dept of Computer Science Iowa State University Ames IA 50010 [email_address] http://www.cs.iastate.edu/~baojie Project homepage: http://boole.cs.iastate.edu:9090/COB/ A Research proposal Dec 02, 2003
  • 2. COB Without SHOE how can you be a RACER ? Without Sesame how can you make OIL ? Semantic Web is a plan of good But with no ontology it’s only a nil. Everyone makes a small piece of brick Not in one day can we make Rome real. Let’s build ontology together and hard Just like ants build their hill.
  • 3. Outline Objectives Key difficulties Background review A tentative framework
  • 4. What is the problem Semantic web needs general and open ontology library, but ontology building is a time-consuming, knowledge sensitive process. Domain experts are needed, and nobody has full knowledge Also, intellectual asset/copyright issue hinders the wide usage of commercial ontology (e.g. Cyc) Automatic ontology discovery and mapping are still impossible in general Existent ontology editing and discovery tools are standalone and too complex Not suitable for team ontology generation. Jargons are horrible for common people who knows little about ontology. Data sources are distributed, heterogonous, dynamic New concept appears everyday: Election2004
  • 5. Related problems Distributed Learning Learning from distributed, heterogonous, dynamic, multiple dataset Software engineering Concurrent version control and management Open Source Issue (copyright vs. copyleft) Knowledge Management Knowledge sharing in group/project Automatic knowledge aggregation
  • 6. Design Philosophy (1) ----- about people Teamwork is needed Nobody can know everything But everyone is an expert somehow Everybody knows something: your dog, your department, your favorite TV show You can build big things from small pieces One expert can write several articles for an encyclopedia And hundreds of experts can work together. However, People always have different viewpoints Conflict: 21 st century begins at 2000/2001 Redundancy: IraqWar, WarInIraq, GulfWarII
  • 7. Design Philosophy (2) ----- about agent and software Small pieces of ontologies are generated by agents Those agents are domain experts or trained agents Light-weight ontology editor which requires minimal user effort: browser-based Automatic and controllable information collection by software robots. Ontology repository is maintained by machine learning algorithms Ontology mapping on controlled topics. Detect and reduce redundancy and conflicts by inference
  • 8. A Desirable Case -- Pop Music Ontology (1) Suppose we want to build an ontology and knowledge base about pop music called PopOnt Even kids know John is a teenager student and knows nothing about ontology. But he knows much about pop music. He’d like to share his knowledge to PopOnt. I’m willing to spend 5 minutes for you There are millions of pop music fans like John, their knowledge is complementary each other. Some of them may go to the website of PopOnt and write one or two pieces of simple sentences, like [ M. Jackson] [isn’t] a [country music artist]. They may also correct others’ mistakes
  • 9. A Desirable Case -- Pop Music Ontology (2) You even don’t need to go to the website There are also mailing lists, newsgroups, weblogs, p2p applications and websites about pop music, which can be used for validation or mining. For example, if [M. Jackson] hardly coincides with [country music], it’s more possible [ M. Jackson] [isn’t] a [country music artist] is true Agent can be expert, too. It will be more desirable if those articles have subject, abstract, or even keywords, which can be used as labeled instances for machine learning. New concepts can be mined and cross-validated by people, too. Finally, PopOnt is built in a couple of months and free to use for everyone.
  • 10. Outline Objectives Key difficulties Background review A tentative framework
  • 11. Key Difficulties 1 : Logic breakdown How to make ontology editing as easy as writing diary? Ontology [subject][predicate][object] [subject][predicate][object] [subject][predicate][object] [subject][predicate][object] Class SubClass SubSubClass SubSubClass SubClass SubSubClass SubSubClass Classes and Slots Instances Can complex ontology be broken down into group of single sentences? Or say, how to decompose complex description logic statement into very simple FOPL sentences? And inverse composition is also needed. Each single sentences is as simple as A is B , A has B
  • 12. Key Difficulties 2 : Ontology Evolution How to refine an ontology by cooperation of experts and software agents? People and agents are all error-prone. Interactive and iterative cross-validation are central. People are “lazy” and “natural”. An ontology piece may be firstly written in short natural language and be refined latterly by other people or agents into a former and more complex piece. Inference are needed to rule out conflict information, to detect malicious/wrong information
  • 13. Key Difficulties 3 : Ontology Mining Where to collect source information? Google search? No Pull: agents search and know where are “good” sources. That can be verified by whether the source is well cited(referenced) or not. Push: information are automatic pushed to agent via credible channels. Automatic extraction is still impossible Depends on NLP Article summary/keywords are helpful, especially when the summary overlaps with existent ontology. Such summarized text can be used as labeled instance. Simplified tasks are feasible It the keyword a consistent concept? Do some keywords are related? Comparison: In content-based retrieval of video database, automatic discovery of semantics based on image processing / pattern recognition are proven not quite successful. Semantics from expert knowledge are needed in MPEG 7 stream.
  • 14. Key Difficulties 4 : Ontology Mapping People always name same thing with different names, or divide concepts into groups in multiple ways. Automatic general ontology mapping is still hard. Simplified mapping is more feasible while still useful Check concept pair (with instances) are same or not Detect redundancy and suggest merge.
  • 15. Outline Objectives Key difficulties Background review A tentative framework
  • 16. Beyond INDUS INDUS is a distributed learning system, while COB is a MAS learning system Agents in different channels have different focus for learning They work together for the same goal. INDUS have a heavy-weight database mechanism while COB aims at light-weight implementation Ontology/KB are stored in atom sentences Interface for dummies, not for gurus. Data sources are usually small but change quickly, and their number is huge. In query, uses the inference power of ontology language.
  • 17. Semantic Web meets MAS COB is an application of MAS learning from data on web Learn new concept from instances Validate concept of other agents/human Learner can be any form: BayesNet, Neural Net, Decision Tree, KNN Everything is about semantics Agents share an ontology but also have dialect issue Small pieces of semantics are carried by agents and aggregated in the “home” Guess semantics from labeled instance. An application shows how to implement proof and trust on semantic web
  • 18. Ready Techniques Dynamic knowledge sharing RSS(RDF site summary): answering questions like &quot;Who wrote this?&quot;, &quot;When was this published?&quot;, and &quot;What is/are the topic(s) of discussion?&quot; RSS is widely used for news aggregation and automatic news discovery. Grid/Social Computation Grid: distribute the compuation task across the internet and compose result together. Blog and Wiki: easy to use site building tools, instead of HTML editor. Topics are refined by the effort of a community. Peer-to-peer communication Local repository can be shared to other peer The other peer can be a agent in COB ! However, they are all somehow missing of semantics. The unfiltered information may flood the user.
  • 19. Collaborative Ontology Building Example: FOAF. http://xml.mfd-consult.dk/foaf/explorer/ FOAF is an acronym for Friend of a Friend, an experimental project and vocabulary for the Semantic Web. It is based on the idea of a machine-readable version of the current World Wide Web, with homepages, mailing lists, travel itineraries, calendars, address books and the like. Everyone can join and add their own information. It is RDF-based.
  • 20. Collaborative Ontology Building Example: Wikipedia. An open encyclopedia with 170,000 concepts in English alone, and more in other languages. Everyone can edit any page. It is based on the assumption that most people are nice, and this has proven true! Limitation: the relations between items are not formal, and the content is for human reading only (at least for now).
  • 21. Collaborative Ontology Building Example: Open Directory Project. http://www.dmoz.org/ 60,000 editors; 460,000 concepts. Collaborative taxonomy building, open to everyone. Limitation: taxonomy only.
  • 22. Outline Objectives Key difficulties Background review A tentative framework
  • 23. System design. [Architecture diagram with parts labeled A, B, C, D. Labels in the diagram: Human Expert; OntoWiki; OWL-like syntax; Email list, Newsgroup, Forum, Blog, Wiki, P2P node; three Semantic RSS-aware Channels; Agents: Ontology Mining, Browser, Ontology Alignment; Ontology Repository: Version Control, Redundancy Check, Conflict Check, Cross Validation.]
  • 24. Part A (1): OntoWiki. Everyone can edit any concept. Version control is enabled. Editing is ontology-guided. It should have an ontology visualizer.
  • 25. Part A (2): OWL-like syntax
      A subset of OWL is used.
      A single statement is an RDF-like triple: [subject] [predicate] [object]
      Namespaces are used: cob:instanceOf, owl:Class, rdfs:subClassOf
      The core COB language is defined in its own namespace (terms listed below).
      // COB terms
        cob:equals cob:documentation
      // OWL terms
        owl:AllDifferent owl:allValuesFrom owl:backwardCompatibleWith owl:cardinality owl:Class owl:complementOf owl:DatatypeProperty owl:DeprecatedClass owl:DeprecatedProperty owl:differentFrom owl:disjointWith owl:distinctMembers owl:equivalentClass owl:equivalentProperty owl:FunctionalProperty owl:hasValue owl:imports owl:incompatibleWith owl:intersectionOf owl:InverseFunctionalProperty owl:inverseOf owl:maxCardinality owl:minCardinality owl:Nothing owl:ObjectProperty owl:oneOf owl:onProperty owl:Ontology owl:priorVersion owl:Restriction owl:sameAs owl:someValuesFrom owl:SymmetricProperty owl:Thing owl:TransitiveProperty owl:unionOf owl:versionInfo
        rdf:List rdf:nil rdf:type
        rdfs:comment rdfs:Datatype rdfs:domain rdfs:label rdfs:Literal rdfs:range rdfs:subClassOf rdfs:subPropertyOf
  • 26. Part A (3): Instance Example
      Source:
        # [cob:Instance]
        # [cob:instanceOf] [Student]
        # [cob:instanceOf] [Chinese]
        # [cob:equals] [鲍捷]
        # [hasSurname] Bao
        # [hasFirstname] Jie
        # [worksOn] [semanticWeb]
        # [worksOn] [MAS]
        # [worksOn] [complexSystem]
        # [advisedBy] [Honavar]
        # [memberOf] [aiLab]
        # [hasEmail] baojie@cs.iastate.edu
        # [hasHomepage] http://www.cs.iastate.edu/~baojie
        # [cob:documentation] Hi, I love cats
      Screen shows:
        BaoJie
        cob:Instance
        cob:instanceOf Student ?
        cob:instanceOf Chinese ?
        cob:equals 鲍捷
        hasSurname Bao
        hasFirstname Jie
        worksOn semanticWeb ?
        worksOn MAS ?
        worksOn complexSystem ?
        advisedBy Honavar ?
        memberOf aiLab ?
        hasEmail baojie@cs.iastate.edu
        hasHomepage http://www.cs.iastate.edu/~baojie
        cob:documentation Hi, I love cats
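The source format above can be read mechanically into triples. The following is a minimal Python sketch, assuming that the page name (here BaoJie) is the subject of every line, that bracketed values are concepts, and that unbracketed values are literals. The function names `parse_statement` and `parse_page` are hypothetical and not part of COB.

```python
import re

# Matches one bracketed token such as [cob:instanceOf] or [Student].
TOKEN = re.compile(r"\[\s*([^\]]+?)\s*\]")

def parse_statement(subject, line):
    """Turn one wiki source line into a (subject, predicate, object) triple.

    Returns None for lines that carry no predicate/object pair.
    """
    text = line.lstrip("#").strip()
    first = TOKEN.match(text)
    if first is None:
        return None
    predicate = first.group(1)
    rest = text[first.end():].strip()
    if not rest:
        return None  # e.g. "# [cob:Instance]" has no object
    # The object is either one bracketed concept or a plain literal.
    bracketed = TOKEN.fullmatch(rest)
    obj = bracketed.group(1) if bracketed else rest
    return (subject, predicate, obj)

def parse_page(subject, source):
    """Parse every line of a page; the page name is assumed to be the subject."""
    triples = []
    for line in source.splitlines():
        triple = parse_statement(subject, line)
        if triple is not None:
            triples.append(triple)
    return triples
```

Calling `parse_page("BaoJie", source)` on the lines shown in the slide would yield triples such as `("BaoJie", "hasSurname", "Bao")`, which could then be stored as the atomic sentences mentioned earlier.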
  • 27. Part A (4): Name Space
      Java-like package naming, which shows the relatedness of concepts even when they don’t inherit from the same concept.
      Packages form a DAG.
      Internationalization is enabled.
        //cob:Thing.Country.US.Iowa.Ames.ISU
        //cob:Thing.Education.University.Iowa.ISU
        [cob:instanceOf] [PublicUniversity]
        [cob:instanceOf] [dmoz:University]
        [cob:equals] [Iowa State University]

        // cobZH: 事物.美国大学.艾奥瓦州立大学 (in English: Thing.US University.Iowa State University)
        [cob:language] zh // Chinese
        [cob:equals] [cob:Thing.Country.US.Iowa.Ames.ISU]

        //cob:Thing.Education.University.Idaho.ISU
        [cob:instanceOf] [PublicUniversity]
        [cob:instanceOf] [dmoz:University]
        [cob:equals] [Idaho State University]
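One way to read "relatedness" from package names is to count how many leading package segments two names share. This Python sketch is only an illustration under that assumption; the slide does not define a relatedness measure, and `package_path` and `shared_prefix_depth` are hypothetical helpers.

```python
def package_path(qualified_name):
    """Split 'cob:Thing.Country.US' into ('cob', ['Thing', 'Country', 'US'])."""
    namespace, _, dotted = qualified_name.partition(":")
    return namespace, dotted.split(".")

def shared_prefix_depth(name_a, name_b):
    """Count the leading package segments two names have in common.

    Names from different namespaces are treated as unrelated (depth 0).
    """
    ns_a, path_a = package_path(name_a)
    ns_b, path_b = package_path(name_b)
    if ns_a != ns_b:
        return 0
    depth = 0
    for seg_a, seg_b in zip(path_a, path_b):
        if seg_a != seg_b:
            break
        depth += 1
    return depth
```

Under this measure the two ISU entries under `Thing.Education.University` share three segments, while the `Country` and `Education` paths for Iowa State share only `Thing`, even though both end in `ISU`.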
  • 28. Part B: Semantic RSS
      RSS has no semantics.
      We can use Dublin Core to enhance RSS.
      Keywords are concepts or concept candidates in the ontology.
      Agents listen to S-RSS channels and discover new concepts.
        <channel rdf:about="http://boole.cs.iastate.edu:9090/COB/">
          <title>COB Project</title>
          <link>http://boole.cs.iastate.edu:9090/COB/</link>
          <description>AI Ontology</description>
          <language>en-us</language>
          <items>
            <rdf:Seq>
              <rdf:li rdf:resource="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main" />
            </rdf:Seq>
          </items>
        </channel>
        <item rdf:about="http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main">
          <title>Main</title>
          <link>http://boole.cs.iastate.edu:9090/COB/Wiki.jsp?page=Main</link>
          <description>129.186.93.7 changed this page on Wed Dec 03 19:18:23 CST 2003:&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;</description>
          <wiki:version>27</wiki:version>
          <wiki:diff>http://boole.cs.iastate.edu:9090/COB/Diff.jsp?page=Main&amp;r1=-1</wiki:diff>
          <dc:date>2003-12-04T01:18:23Z</dc:date>
          <dc:contributor>
            <rdf:Description>
              <rdf:value>129.186.93.7</rdf:value>
            </rdf:Description>
          </dc:contributor>
          <wiki:history>http://boole.cs.iastate.edu:9090/COB/PageInfo.jsp?page=Main</wiki:history>
        </item>
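An agent listening to such a channel could pull concept candidates out of each item. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes that keywords are carried in Dublin Core `dc:subject` elements; the slide's feed does not show that element, so both the sample feed (with an example.org URL) and the function `concept_candidates` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RSS 1.0, RDF and Dublin Core.
NS = {
    "rss": "http://purl.org/rss/1.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical feed: dc:subject is assumed to hold the keywords.
FEED = """<rdf:RDF xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/COB/Wiki.jsp?page=Main">
    <title>Main</title>
    <dc:date>2003-12-04T01:18:23Z</dc:date>
    <dc:subject>Ontology</dc:subject>
    <dc:subject>SemanticWeb</dc:subject>
  </item>
</rdf:RDF>"""

def concept_candidates(feed_xml):
    """Map each item title to the list of its dc:subject keywords."""
    root = ET.fromstring(feed_xml)
    found = {}
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        subjects = [s.text for s in item.findall("dc:subject", NS) if s.text]
        found[title] = subjects
    return found
```

Each returned keyword would then be checked against the ontology: known names are concepts, unknown names are candidates for a new concept.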
  • 29. Part C (1): Agent. Each agent: traces back the information source and checks its credibility; does filtering and text normalization; extracts new concepts from instances; extracts possible general relationships (like [cob:alsoSee]) between concepts. Agents may differ: they need not use the same learning algorithm, since learning from email headers is different from learning from free-text content. Dialect: Agent 1: I listen to the Idaho S.U. mailing list and know ISU = Idaho State University. Agent 2: I watch a blog in Iowa and know ISU = Iowa State University. Communication helps: Agent 1: P([M. Jackson]^[CountryMusic])=0.1. Agent 2: P([M. Jackson]^[CountryMusic])=0.03.
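The slide does not say how agents reconcile differing estimates. One simple option, shown here purely as an assumption, is a credibility-weighted average (a linear opinion pool); the function `pooled_estimate` and the credibility weights are hypothetical.

```python
def pooled_estimate(estimates, credibility):
    """Combine per-agent probabilities, weighting each by the agent's credibility.

    estimates:   {agent_name: probability}
    credibility: {agent_name: non-negative weight}
    """
    total = sum(credibility[agent] for agent in estimates)
    if total <= 0:
        raise ValueError("at least one agent needs positive credibility")
    weighted = sum(credibility[agent] * p for agent, p in estimates.items())
    return weighted / total
```

With the two estimates from the slide (0.1 and 0.03) and equal credibility, the pooled value is their plain average; a more credible agent would pull the result toward its own estimate.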
  • 30. Part C (2): Ontology Alignment. Mapping is done on restricted cases. When an agent or expert doubts whether some concepts are the same, it asks OntologyAlignmenter, supplying the instance sets. Detected duplicate concepts, like IraqWar and WarInIraq, are merged. Be careful: UniversityOfWashington and WashingtonUniversity are different; this can be learnt from instances. Manual alignment is enabled, too.
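A restricted, instance-based check of this kind could compare the instance sets of two concepts. The sketch below uses Jaccard overlap with a threshold; both the measure and the default threshold of 0.5 are assumptions for illustration, not the method used by OntologyAlignmenter, and the instance identifiers are made up.

```python
def jaccard(instances_a, instances_b):
    """Overlap of two instance sets: |A intersect B| / |A union B|."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # no evidence either way
    return len(a & b) / len(a | b)

def suggest_merge(instances_a, instances_b, threshold=0.5):
    """Suggest merging two concepts when their instances overlap enough."""
    return jaccard(instances_a, instances_b) >= threshold

# Hypothetical instance sets (article identifiers) for four concepts.
iraq_war = {"article1", "article2", "article3"}
war_in_iraq = {"article2", "article3", "article4"}
university_of_washington = {"seattle_campus", "huskies"}
washington_university = {"st_louis_campus", "bears"}
```

Here `IraqWar` and `WarInIraq` share most of their instances and would be proposed for a merge, while the two universities share none, so their similar names alone do not trigger one.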
  • 31. Part D: Ontology Repository
      Version control: keep versions for each concept, lock mature concepts, detect malicious changes.
      Redundancy check:
        [I.S.U] [cob:instanceOf] [University]
        [I.S.U] [cob:alsoSee] [Cyclone]
        [Iowa State University] [cob:instanceOf] [PublicUniversity]
        [Iowa State University] [cob:alsoSee] [Cyclone]
        [PublicUniversity] [cob:subClassOf] [University]
      Conflict check:
        [ISU] [locatedIn] [Ames]
        [ISU] [locatedIn] [Des Moines]
      Cross validation: score each agent and expert for its credibility; check the soundness of an input against its peer inputs.
      Refactoring (rename, remove, merge).
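The conflict check above can be sketched as a scan for single-valued predicates that receive more than one value for the same subject. Treating `locatedIn` as single-valued is an assumption made for this example, and `find_conflicts` is a hypothetical helper, not part of the COB repository.

```python
from collections import defaultdict

def find_conflicts(triples, functional_predicates):
    """Report (subject, predicate) pairs that have more than one distinct object.

    Only predicates listed in functional_predicates are assumed single-valued.
    """
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in functional_predicates:
            values[(subject, predicate)].add(obj)
    return {key: sorted(objs) for key, objs in values.items() if len(objs) > 1}

triples = [
    ("ISU", "locatedIn", "Ames"),
    ("ISU", "locatedIn", "Des Moines"),
    ("ISU", "cob:instanceOf", "University"),
]
conflicts = find_conflicts(triples, {"locatedIn"})
```

The reported pair could then be passed to cross validation, where the credibility of the contributing agents or experts helps decide which value to keep.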
  • 32. Summary. What’s new: a light-weight ontology editor for the community; collaborative, distributed ontology learning based on logic decomposition; a semantic extension to RSS; multiagent ontology mining from trusted channels; ontology management based on proof and trust. What COB does not try to do: solve ontology mapping in general; solve ontology extraction from free text in general.