Ontopia Code Camp
  TMRA 2009-11-11
  Lars Marius Garshol & Geir Ove Grønmo
Agenda
  About you
    who are you?
    what do you want from the code camp?
  About Ontopia
  The product
  The future
  Participating in the project
  Writing some code!
About Ontopia
  Some background
Brief history
  1999-2000
    private hobby project for Geir Ove
  2000-2009
    commercial software sold by Ontopia AS
    lots of international customers in diverse fields
  2009-
    open source project
The project
  Open source hosted at Google Code
  Contributors
    Lars Marius Garshol, Bouvet
    Geir Ove Grønmo, Bouvet
    Thomas Neidhart, SpaceApps
    Lars Heuer, Semagia
    Hannes Niederhausen, TMLab
    Stig Lau, Bouvet
    Baard H. Rehn-Johansen, Bouvet
    Peter-Paul Kruijssen, Morpheus
    Quintin Siebers, Morpheus
Current activity (toward 5.1)
  tolog updates
    added by LMG
  Various fixes and optimizations
    by everyone
  Toma implementation (in sandbox)
    by Thomas
  TMQL implementation (in sandbox)?
    by Sven Krosse
The product
  Architecture and modules
The big picture
  [Diagram labels: Engine, Ontopoly, Web service, Portlet support, Auto-class., CMS integration, Escenic, OKP, Other CMSs, Data integration, DB2TM, XML2TM, Taxon. import, A.N. other (x4)]
The engine
  Core API
  TMAPI 2.0 support
  Import/export
  RDF conversion
  TMSync
  Fulltext search
  Event API
  tolog query language
  tolog update language
Query Engine
  Implementation of Ontopia’s tolog language (based on Prolog and SQL)
  Allows powerful queries on the topic map data structure
  Simplifies application development and improves performance
  Example:
    select $B, count($A) from
      instance-of($B, city),
      { premiere($A : opera, $B : place) |
        premiere($A : opera, $C : place),
        located-in($C : containee, $B : container) }
    order by $A desc?
  returns all B's and the corresponding number of A's where
    B is a city AND
    EITHER B is the place where A was premiered
    OR the place where A was premiered is located in B
    in decreasing order of A
TMSync
  Configurable module for synchronizing one TM against another
    define subset of source TM to sync (using tolog)
    define subset of target TM to sync (using tolog)
    the module handles the rest
  Can also be used with non-TM sources
    create a non-updating conversion from the source to some TM format
    then use TMSync to sync against the converted TM instead of directly against the source
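To make the semantics of the tolog example above concrete, here is a minimal Java sketch that computes the same kind of result over a small in-memory data set. It does not use Ontopia or tolog: the class name, method name, and sample data are assumptions made for illustration, and the sketch assumes each opera has exactly one premiere place.

```java
import java.util.*;

class PremiereCountSketch {
    /**
     * For every city, counts the distinct operas that premiered either in the
     * city itself or in a place located in that city, most operas first.
     * premiere maps opera -> place; locatedIn maps containee -> container.
     */
    static List<Map.Entry<String, Integer>> operasPerCity(
            Set<String> cities,
            Map<String, String> premiere,
            Map<String, String> locatedIn) {
        Map<String, Set<String>> operasByCity = new TreeMap<>();
        for (Map.Entry<String, String> e : premiere.entrySet()) {
            String opera = e.getKey();
            String place = e.getValue();
            // First branch of the OR: the premiere place is itself a city.
            if (cities.contains(place)) {
                operasByCity.computeIfAbsent(place, k -> new HashSet<>()).add(opera);
            }
            // Second branch: the premiere place is located in a city.
            String container = locatedIn.get(place);
            if (container != null && cities.contains(container)) {
                operasByCity.computeIfAbsent(container, k -> new HashSet<>()).add(opera);
            }
        }
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : operasByCity.entrySet()) {
            rows.add(new AbstractMap.SimpleEntry<String, Integer>(e.getKey(), e.getValue().size()));
        }
        // Decreasing order of the opera count.
        rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample data, simplified for the example.
        Set<String> cities = new HashSet<>(Arrays.asList("Milano", "Roma"));
        Map<String, String> premiere = new HashMap<>();
        premiere.put("Otello", "Teatro alla Scala");
        premiere.put("Falstaff", "Teatro alla Scala");
        premiere.put("Tosca", "Roma");
        Map<String, String> locatedIn = new HashMap<>();
        locatedIn.put("Teatro alla Scala", "Milano");
        for (Map.Entry<String, Integer> row : operasPerCity(cities, premiere, locatedIn)) {
            System.out.println(row.getKey() + "\t" + row.getValue());
        }
    }
}
```

The two `if` blocks correspond to the two branches inside the `{ ... | ... }` clause of the query. In tolog the engine works this out from the declarative query, which is the point of the slide.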
How TMSync works
  Define which part of the target topic map you want,
  Define which part of the source topic map it is the master for, and
  The algorithm does the rest
If the source is not a topic map
  Simply do a normal one-time conversion
    let TMSync do the update for you
  In other words, TMSync reduces the update problem to a conversion problem
  [Diagram labels: source.xml, convert.xslt, TMSync]
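The idea behind the TMSync slides can be sketched in a few lines of Java. This is not Ontopia's implementation: it models a topic map as a plain set of statement strings, and the `select` and `sync` methods are hypothetical names, with a Java predicate standing in for the tolog queries that define the subsets.

```java
import java.util.*;
import java.util.function.Predicate;

class SyncSketch {
    /** Picks the part of a topic map that takes part in the sync (stand-in for a tolog query). */
    static Set<String> select(Set<String> topicMap, Predicate<String> filter) {
        Set<String> out = new TreeSet<>();
        for (String statement : topicMap) {
            if (filter.test(statement)) out.add(statement);
        }
        return out;
    }

    /**
     * Makes the selected part of the target equal to the selected part of the
     * source, leaving everything else in the target alone.
     * Returns the number of statements added or removed.
     */
    static int sync(Set<String> sourceSubset, Set<String> targetSubset, Set<String> target) {
        int changes = 0;
        // Remove statements the source is master for but no longer contains.
        for (String statement : targetSubset) {
            if (!sourceSubset.contains(statement) && target.remove(statement)) changes++;
        }
        // Add statements that exist in the source but are missing from the target.
        for (String statement : sourceSubset) {
            if (target.add(statement)) changes++;
        }
        return changes;
    }

    public static void main(String[] args) {
        // Invented sample data.
        Set<String> source = new TreeSet<>(Arrays.asList("service:passport", "service:parking", "unit:it"));
        Set<String> target = new TreeSet<>(Arrays.asList("service:passport", "service:old", "article:news"));
        Predicate<String> services = s -> s.startsWith("service:");
        int changes = sync(select(source, services), select(target, services), target);
        System.out.println(changes + " changes, target is now " + target);
    }
}
```

Statements outside the selected subsets (here `unit:it` in the source and `article:news` in the target) are never touched, which is what lets one topic map be the master for only part of another.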
The City of Bergen use case
  [Diagram labels: Norge.no, City of Bergen, Service, Unit, Person, LOS]
The backends
  In-memory
    no persistent storage
    thread-safe
    no setup
  RDBMS
    transactions
    persistent
    thread-safe
    uses caching
    clustering
  Remote
    uses web service
    read-only
    unofficial
  [Diagram labels: Engine, Memory, RDBMS, Remote]
RDBMS Backend
  Allows the Engine to use topic maps stored in a relational database
  Based on a generic topic map schema
  Necessary when working with very large topic maps
  Transparent to applications
  Features
    Automatically loads data when needed
    Caches frequently used data
    Full support for RDBMS transactions
    Supports tolog-to-SQL compilation
    Statistical reports for performance tuning
  Platform support
    Oracle, MySQL, PostgreSQL, MS SQL Server
    Test suite available for verifying compatibility with other JDBC-enabled RDBMSes
DB2TM
  Upconversion to TMs
    from RDBMS via JDBC
    or from CSV
  Uses XML mapping
    can call out to Java
  Supports sync
    either full rescan
    or change table
  [Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]
DB2TM example
  [Diagram labels: Ontopia, United Nations, Bouvet, +, =]
  <relation name="organizations.csv" columns="id name url">
    <topic type="ex:organization">
      <item-identifier>#org${id}</item-identifier>
      <topic-name>${name}</topic-name>
      <occurrence type="ex:homepage">${url}</occurrence>
    </topic>
  </relation>
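What the mapping above expresses can be sketched in plain Java: each CSV row is bound to the column names, and every `${column}` reference in the mapping is replaced by that row's value. This is only an illustration of the idea, not DB2TM's code. The class and method names are hypothetical, the CSV parsing is deliberately naive (no quoting), and the sample rows are invented.

```java
import java.util.*;

class Db2tmSketch {
    /** Replaces each ${column} reference in the template with the row's value. */
    static String expand(String template, Map<String, String> row) {
        String result = template;
        for (Map.Entry<String, String> e : row.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    /** Binds the values of one comma-separated line to the given column names. */
    static Map<String, String> toRow(String[] columns, String csvLine) {
        String[] values = csvLine.split(",", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.length; i++) {
            row.put(columns[i], i < values.length ? values[i] : "");
        }
        return row;
    }

    /** Applies the three templates from the mapping to one row. */
    static Map<String, String> toTopic(Map<String, String> row) {
        Map<String, String> topic = new LinkedHashMap<>();
        topic.put("type", "ex:organization");
        topic.put("item-identifier", expand("#org${id}", row));
        topic.put("topic-name", expand("${name}", row));
        topic.put("ex:homepage", expand("${url}", row));
        return topic;
    }

    public static void main(String[] args) {
        String[] columns = {"id", "name", "url"};
        // Invented sample rows; the second URL is a placeholder.
        String[] lines = {"1,Ontopia,http://www.ontopia.net", "2,Bouvet,http://www.example.org/bouvet"};
        for (String line : lines) {
            System.out.println(toTopic(toRow(columns, line)));
        }
    }
}
```

Because the item identifier is derived from the `id` column, re-running the conversion yields the same identifiers for the same rows, which is what makes a later sync against the source possible.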
TMRAP
  Web service interface
    via SOAP
    via plain HTTP
  Requests
    get-topic
    get-topic-page
    get-tolog
    delete-topic
    ...
Navigator framework
  Servlet-based API
    manage topic maps
    load/scan/delete/create
  JSP tag library
    XSLT-like
    based on tolog
    JSTL integration
Ontopia Navigator Framework
  Java API for interacting with TM repository
  JSP tag library
    based on tolog
    kind of like XSLT in JSP with tolog instead of XPath
    has JSTL integration
  Undocumented parts
    web presentation components
    some wrapped as JSP tags
    want to build proper portlets from them
http://www.ontopia.net/operamap
Navigator tag library example
  <%-- assume variable 'composer' is already set --%>
  <p><b>Operas:</b><br/>
  <tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera),
                        { premiere-date($OPERA, $DATE) }?">
    <li>
      <a href="opera.jsp?id=<tolog:id var="OPERA"/>"
        ><tolog:out var="OPERA"/></a>
      <tolog:if var="DATE">
        <tolog:out var="DATE"/>
      </tolog:if>
    </li>
  </tolog:foreach>
  </p>
Elmer Preview
Automated classification
  Undocumented
    experimental
  Extracts text
    autodetects format
    Word, PDF, XML, HTML
  Processes text
    detects language
    stemming, stop-words
  Extracts keywords
    ranked by importance
    uses existing topics
    supports compound terms
Example of keyword extraction
  topic maps                 1.0
  metadata                   0.57
  subject-based class.       0.42
  Core metadata              0.42
  faceted classification     0.34
  taxonomy                   0.22
  monolingual thesauri       0.19
  controlled vocabulary      0.19
  Dublin Core                0.16
  thesauri                   0.16
  Dublin                     0.15
  keywords                   0.15
Example #2
  Automated classification   1.0    5
  Topic Maps                 0.51   14
  XSLT                       0.38   11
  compound keywords          0.29   2
  keywords                   0.26   20
  Lars                       0.23   1
  Marius                     0.23   1
  Garshol                    0.22   1
  ...
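In both examples above the top-ranked term scores 1.0, which suggests the scores are scaled relative to the best term. The Java sketch below shows one very simple way to get a ranking of that shape: count terms after dropping stop-words, then divide by the highest count. It is an assumption-laden illustration, not the classifier's actual algorithm. It has no stemming, language detection, or compound-term support, and the class name, method name, and stop-word list are made up.

```java
import java.util.*;

class KeywordSketch {
    // Tiny illustrative stop-word list; a real one would be much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("the", "a", "an", "of", "and", "to", "in", "is", "are", "for"));

    /** Returns terms mapped to scores in (0, 1], best term first. */
    static LinkedHashMap<String, Double> rank(String text) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) continue;
            freq.merge(token, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        // Highest count first; ties are broken alphabetically so the output is deterministic.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        LinkedHashMap<String, Double> scores = new LinkedHashMap<>();
        if (entries.isEmpty()) return scores;
        double max = entries.get(0).getValue();
        for (Map.Entry<String, Integer> e : entries) {
            scores.put(e.getKey(), e.getValue() / max);
        }
        return scores;
    }

    public static void main(String[] args) {
        String text = "Topic maps and metadata: topic maps are maps of topics";
        for (Map.Entry<String, Double> e : rank(text).entrySet()) {
            System.out.printf("%-10s %.2f%n", e.getKey(), e.getValue());
        }
    }
}
```

Note that this sketch treats "topic" and "topics" as different terms, which is exactly the kind of problem the stemming step on the classification slide addresses.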
So how could this be used?
  To help users classify new documents in a CMS interface
    suggest appropriate keywords, screened by user before approval
  Automate classification of incoming documents
    this means lower quality, but also lower cost
  Get an overview of interesting terms in a document corpus
    classify all documents, extract the most interesting terms
    this can be used as the starting point for building an ontology
    (keyword extraction only)
Example user interface
  The user creates an article
    this screen then used to add keywords
    user adjusts the proposals from the classifier
Vizigator
  Graphical visualization
  VizDesktop
    Swing app to configure
    filter/style/...
  Vizlet
    Java applet for web
    uses configuration
    loads via TMRAP
    uses “Remote” backend
  [Diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]
The Vizigator
  Graphical visualization of Topic Maps
  Two parts
    VizDesktop: Swing desktop app for configuration
    Vizlet: Java applet for web deployment
  Configuration stored in XTM file
Without configuration
With configuration
The Vizigator
  The Vizigator uses TMRAP
    the Vizlet runs in the browser (on the client)
    a fragment of the topic map is downloaded from the server
    the fragment is grown as needed
  [Diagram labels: Server, TMRAP]
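The "fragment is grown as needed" idea can be illustrated with a small Java sketch. It is not the Vizlet's code: the server is simulated by an in-memory adjacency map, where the real client would issue a TMRAP request, and all class and method names are hypothetical.

```java
import java.util.*;

class FragmentSketch {
    // Stands in for the full topic map on the server (topic -> associated topics).
    private final Map<String, Set<String>> server;
    // What the client has downloaded so far.
    private final Map<String, Set<String>> fragment = new TreeMap<>();

    FragmentSketch(Map<String, Set<String>> server) {
        this.server = server;
    }

    /**
     * Downloads a topic and its direct neighbours unless they are already in
     * the local fragment. Returns true if a (simulated) request was made.
     */
    boolean expand(String topic) {
        if (fragment.containsKey(topic)) return false; // already loaded, no request needed
        Set<String> neighbours = server.getOrDefault(topic, Collections.emptySet());
        fragment.put(topic, new TreeSet<>(neighbours));
        return true;
    }

    Set<String> loadedTopics() {
        return fragment.keySet();
    }

    Set<String> neighboursOf(String topic) {
        return fragment.getOrDefault(topic, Collections.emptySet());
    }

    public static void main(String[] args) {
        // Invented sample data.
        Map<String, Set<String>> server = new HashMap<>();
        server.put("Puccini", new HashSet<>(Arrays.asList("Tosca", "Lucca")));
        server.put("Tosca", new HashSet<>(Arrays.asList("Puccini", "Roma")));
        FragmentSketch client = new FragmentSketch(server);
        client.expand("Puccini"); // initial fragment around the start topic
        client.expand("Tosca");   // user clicks a neighbour, fragment grows
        System.out.println(client.loadedTopics());
    }
}
```

The client only ever holds the topics the user has actually visited, so the size of the full topic map on the server does not limit what can be browsed.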
Ontopoly
  Generic editor
    web-based, AJAX
    meta-ontology in TM
  Ontology designer
    create types and fields
    control user interface
    build views
    incremental dev
  Instance editor
    guided by ontology
Ontopoly
  A generic Topic Maps editor, in two parts
    ontology editor: used to create the ontology and schema
    instance editor: used to enter instances based on ontology
  Built with the Web Editor Framework
    works with both XTM files and topic maps stored in RDBMS backend
    supports access control to administrative functions, ontology, and instance editors
    existing topic maps can be imported
    parts of the ontology can be marked as read-only, or hidden
Typical deployment
  [Diagram labels: Users, Viewing application, Editors, Ontopoly, Frameworks, Engine, DB Backend, DB, TMRAP, DB2TM, HTTP, External application, Application server]
CMS integration
  The best way to add content functionality to Ontopia
    the world doesn’t need another CMS
    better to reuse those which already exist
  So far two integrations exist
    Escenic
    OfficeNet Knowledge Portal
    more are being worked on
Implementation
  A CMS event listener
    the listener creates topics for new CMS articles, folders, etc
    the mapping is basically the design of the ontology used by this listener
  Presentation integration
    it must be possible to list all topics attached to an article
    conversely, it must be possible to list all articles attached to a topic
    how close the integration needs to be here will vary, as will the difficulty of the integration
  User interface integration
    it needs to be possible to attach topics to an article from within the normal CMS user interface
    this can be quite tricky
  Search integration
    the Topic Maps search needs to also search content in the CMS
    can be achieved by writing a tolog plug-in
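The first two points above (an event listener that creates topics, and lookups in both directions) might look roughly like the following Java sketch. Every name here is hypothetical: real CMSs each have their own event APIs, and a real integration would create topics through the Ontopia engine rather than in a plain map.

```java
import java.util.*;

class CmsListenerSketch {
    /** Hypothetical CMS callback interface. */
    interface CmsEventListener {
        void articleCreated(String articleId, String title);
        void articleDeleted(String articleId);
    }

    /** Minimal stand-in for an article topic. */
    static class Topic {
        final String id;
        final String type;
        final String name;
        final Set<String> about = new TreeSet<>(); // subjects this article "is about"

        Topic(String id, String type, String name) {
            this.id = id;
            this.type = type;
            this.name = name;
        }
    }

    /** Keeps the (simulated) topic map in step with the CMS. */
    static class TopicMapUpdater implements CmsEventListener {
        final Map<String, Topic> topics = new LinkedHashMap<>();

        public void articleCreated(String articleId, String title) {
            String id = "article-" + articleId;
            topics.put(id, new Topic(id, "article", title)); // title becomes the topic name
        }

        public void articleDeleted(String articleId) {
            topics.remove("article-" + articleId);
        }

        /** User interface integration: attach a subject to an article. */
        void attach(String articleId, String subject) {
            Topic topic = topics.get("article-" + articleId);
            if (topic == null) throw new IllegalArgumentException("unknown article " + articleId);
            topic.about.add(subject);
        }

        /** Presentation integration: all subjects attached to an article. */
        Set<String> topicsFor(String articleId) {
            Topic topic = topics.get("article-" + articleId);
            return topic == null ? Collections.<String>emptySet() : topic.about;
        }

        /** Presentation integration, other direction: all articles attached to a subject. */
        List<String> articlesFor(String subject) {
            List<String> result = new ArrayList<>();
            for (Topic topic : topics.values()) {
                if (topic.about.contains(subject)) result.add(topic.id);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        TopicMapUpdater updater = new TopicMapUpdater();
        updater.articleCreated("42", "New city council appointed");
        updater.attach("42", "Elections");
        System.out.println(updater.topicsFor("42") + " " + updater.articlesFor("Elections"));
    }
}
```

The choices hard-coded in `articleCreated` (topic type "article", title as name) are exactly the mapping decisions the slide refers to as the design of the ontology.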
Articles as topics
  [Diagram labels: “New city council appointed”, is about, “Elections”]
  Goal: associate articles with topics
    mainly to say what they are about
    typically also want to include other metadata
  Need to create topics for the articles to do this
    in fact, a general CMS-to-TM mapping is needed
    must decide what metadata and structures to include
Mapping issues
  Article topics
    what topic type to use?
    title becomes name? (do you know the title?)
    include author? include last modified? include workflow state?
    should all articles be mapped?
  Folders/directories/sections/...
    should these be mapped, too?
    one topic type for all folders/.../.../...?
    if so, use associations to connect articles to folders
    use associations to reproduce hierarchical folder structure
  Multimedia objects
    should these be included?
    what topic type? what name? ...
Two styles of mappings
  Articles as articles
    Topic represents only the article
    Topic type is some subclass of “article”
    “Is about” association connects article into topic map
    Fields are presentational
      title, abstract, body
  Articles as concepts
    Topic represents some real-world subject (like a person)
      article is just the default content about that subject
    Type is the type of the subject (person)
    Semantic associations to the rest of the topic map
      works in department, has competence, ...
    Fields can be semantic
      name, phone no, email, ...
Article as article
  Article about building of a new school
  Is about association to “Primary schools”
  Topic type is “article”
Article as concept
  Article about a sports hall
  Article really represents the hall
  Topic type is “Location”
  Associations to
    city borough
    events in the location
    category “Sports”
Two projects
The project
  A new citizen’s portal for the city administration
    strategic decision to make portal main interface for interaction with citizens
    as many services as possible are to be moved online
  Big project
    started in late 2004, to continue at least into 2008
    ~5 million Euro spent by launch date
    1.7 million Euro budgeted for 2007
    Topic Maps development is a fraction of this (less than 25%)
  Many companies involved
    Bouvet/Ontopia
    Avenir
    KPMG
    Karabin
    Escenic
Simplified original ontology
  [Diagram labels: Service catalog, Escenic (CMS), LOS, Form, Article, nearly everything, Category, Service, Subject, Department, Borough, External resource, Employee, Payroll++]
Data flow
  [Diagram labels: Ontopoly, Ontopia, Escenic, LOS, Integration, TMSync, DB2TM, Fellesdata, Payroll (Agresso), Dexter/Extens, Service Catalog]
Conceptual architecture
  [Diagram labels: Data sources, Oracle Portal, Application, Ontopia, Escenic, Oracle Database]
The portal
Technical architecture
NRK/Skole
  Norwegian National Broadcasting (NRK)
    media resources from the archives
    published for use in schools
    integrated with the National Curriculum
  In production
    delayed by copyright wrangling
  Technologies
    OKS
    Polopoly CMS
    MySQL database
    Resin application server
Curriculum-based browsing (1)
  [Screenshot labels: Curriculum, Social studies, High school]
Curriculum-based browsing (2)
  [Screenshot label: Gender roles]
Curriculum-based browsing (3)
  [Screenshot labels: Feminist movement in the 70s and 80s, Changes to the family in the 70s, The prime minister’s husband, Children choosing careers, Gay partnerships in 1993]
One video (prime minister’s husband)
  [Screenshot labels: Metadata, Subject, Person, Related resources, Description]
Conceptual architecture
  [Diagram labels: Polopoly, HTTP, Ontopia, MediaDB, Grep, DB2TM, TMSync, RDBMS backend, MySQL, Editors]
Implementation
  Domain model in Java
    Plain old Java objects built on
      Ontopia’s Java API
      tolog
  JSP for presentation
    using JSTL on top of the domain model
  Subversion for the source code
  Maven2 to build and deploy
  Unit tests
The future
  What we’d like to see
The big picture
  [Diagram labels: Engine, Ontopoly, Web service, Portlet support, Auto-class., CMS integration, Escenic, OKP, Other CMSs, Data integration, DB2TM, XML2TM, Taxon. import, A.N. other (x4)]
CMS integrations
  The more of these, the better
  Candidate CMSs
    Liferay (being worked on at Bouvet)
    Alfresco (might be started soon)
    Magnolia
    Inspera (possible project here)
    JSR-170 Java Content Repository
    CMIS (OASIS web service standard)
Portlet toolkit
  Subversion contains a number of “portlets”
    basically, Java objects doing presentation tasks
    some have JSP wrappers as well
  Examples
    display tree view
    list of topics filterable by facets
    show related topics
    get-topic-page via TMRAP component
  Not ready for prime-time yet
    undocumented
    incomplete
Ontopoly plug-ins
  Plugins for getting more data from externals
    TMSync import plugin
    DB2TM plugin
    Subj3ct.com plugin
    adapted RDF2TM plugin
    classify plugin
    ...
  Plugins for ontology fragments
    menu editor, for example
TMCL
  Now implementable
  We’d like to see
    an object model for TMCL (supporting changes)
    a validator based on the object model
    Ontopoly import/export from TMCL (initially)
    refactor Ontopoly API to make it more portable
    Ontopoly ported to use TMCL natively (eventually)
Things we’d like to remove
  OSL support
    Ontopia Schema Language
  Web editor framework
    unfortunately, still used by some major customers
  Fulltext search
    the old APIs for this are not really of any use
Management interface
  Import topic maps (to file or RDBMS)
What do you think?
  Suggestions?
  Questions?
  Plans?
  Ideas?
Getting started
  Setting up the developer environment
If you are using Ontopia...
  ...simply download the zip, then
    unzip,
    set the classpath,
    start the server, ...
  ...and you’re good to go
If you are developing Ontopia...
  You must have
    Java 1.5 (not 1.6 or 1.7 or ...)
    Ant 1.6 (or later)
    Ivy 2.0 (or later)
    Subversion
  Then
    check out the source from Subversion
      svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only
    ant bootstrap
    ant dist.jar.ontopia
    ant test
    ant dist.ontopia
Beware
  This is fun, because
    you can play around with anything you want
      e.g., my build has a faster TopicIF.getRolesByType
    you can track changes as they happen in svn
  However, you’re on your own
    if it fails it’s kind of hard to say why
    maybe it’s your changes, maybe not
  For production use, official releases are best
The project
  Participating etc
Our goal
  To provide the best toolkit for building Topic Maps-based applications
  We want it to be
    actively maintained,
    bug-free,
    scalable,
    easy to use,
    well documented,
    stable,
    reliable
Our philosophy
  We want Ontopia to provide as much useful more-or-less generic functionality as possible
  New contributions are generally welcome as long as
    they meet the quality requirements, and
    they don’t cause problems for others
The sandbox
  There’s a lot of Ontopia-related code which does not meet those requirements
    some of it can be very useful,
    someone may pick it up and improve it
  The sandbox is for these pieces
    some are in Ontopia’s Subversion repository,
    others are maintained externally
  To be “promoted” into Ontopia a module needs
    an active maintainer,
    to be generally useful, and
    to meet certain quality requirements
Communications
  Join the mailing list(s)!
    http://groups.google.com/group/ontopia
    http://groups.google.com/group/ontopia-dev
  Google Code page
    http://code.google.com/p/ontopia/
    note the “updates” feed!
  Blog
    http://ontopia.wordpress.com
  Twitter
    http://twitter.com/ontopia
Committers
  These are the people who run the project
    they can actually commit to Subversion
    they can vote on decisions to be made etc
  Everyone else can
    use the software as much as they want,
    report and comment on issues,
    discuss on the mailing list, and
    submit patches for inclusion
How to become a committer
  Participate in the project!
    that is, get involved first
    let people get to know you, show some commitment
  Once you’ve gotten some way into the project you can ask to become a committer
    best if you have provided some patches first
  Unless you’re going to commit changes there’s no need to be a committer
Finding a task to work on
  Report bugs!
    they exist. if you find any, please report them.
  Look at the open issues
    there is always testing/discussion to be done
  Look for issues marked “newbie”
    http://code.google.com/p/ontopia/issues/list?q=label:Newbie
  Look at what’s in the sandbox
    most of these modules need work
  Scratch an itch
    if there’s something you want fixed/changed/added...
How to fix a bug
  First figure out why you think it fails
  Then write a test case
    based on your assumption
    make sure the test case fails (test before you fix)
  Then fix the bug
    follow the coding guidelines (see wiki)
  Then run the test suite
    verify that you’ve fixed the bug
    verify that you haven’t broken anything
  Then submit the patch
The test suite
  Lots of *.test packages in the source tree
    3148 test cases as of right now
    test data in ontopia/src/test-data
    some tests are generators based on files
    some of the test files come from cxtm-tests.sf.net
  Run with
    ant test
    java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
Source tree structure
  net.ontopia.
    utils          various utilities
    test           various test support code
    infoset        LocatorIF code + cruft
    persistence    OR-mapper for RDBMS backend
    product        cruft
    xml            various XML-related utilities
    topicmaps      next slides
Source tree structure
  net.ontopia.topicmaps.
    core           core engine API
    impl           engine backends + utils
    utils          utilities (see next slide)
    cmdlineutils   command-line tools
    entry          TM repository
    nav + nav2     navigator framework
    query          tolog engine
    viz
    classify
    db2tm
    webed          cruft
Source tree structure
  net.ontopia.topicmaps.utils
    *              various utility classes
    ltm            LTM reader and writer
    ctm            CTM reader
    rdf            RDF converter (both ways)
    tmrap          TMRAP implementation
Let’s write some code!
The engine
  The core API corresponds closely to the TMDM
    TopicMapIF, TopicIF, TopicNameIF, ...
  Compile with
    ant init compile.ontopia
    .class files go into ontopia/build/classes
    ant dist.ontopia.jar # makes a jar
The importers
  Main class implements TopicMapReaderIF
    usually, this lets you set up configuration, etc
    then uses other classes to do the real work
  XTM importers
    use an XML parser
    main work done in XTM(2)ContentHandler
    some extra code for validation and format detection
  CTM/LTM importers
    use Antlr-based parsers
    real code in ctm.g/ltm.g
  All importers work via the core API
Fixing a real bug
  There is a failing test case in the TM/XML importer
  So let’s fix that right now...
Find an issue in the issue tracker
  (Picking one with “Newbie” might be good, but isn’t necessary)
  Get set up
    check out the source code
    build the code
    run the test suite
  Then dig in
    we’ll help you with any questions you have
  At the end, submit a patch to the issue tracker
    remember to use the test suite!

Ontopia Code Camp

  • 1. Ontopia Code CampTMRA 2009-11-11Lars Marius Garshol & Geir Ove Grønmo
  • 2. AgendaAbout youwho are you?what do you want from the code camp?About OntopiaThe productThe futureParticipating in the projectWriting some code!
  • 4. Brief history1999-2000private hobby project for Geir Ove2000-2009commercial software sold by Ontopia ASlots of international customers in diverse fields2009-open source project
  • 5. The projectOpen source hosted at Google CodeContributorsLars Marius Garshol, BouvetGeir Ove Grønmo, BouvetThomas Neidhart, SpaceAppsLars Heuer, SemagiaHannes Niederhausen, TMLabStig Lau, BouvetBaard H. Rehn-Johansen, BouvetPeter-Paul Kruijssen, MorpheusQuintin Siebers, Morpheus
  • 6. Current activity (toward 5.1)tolog updatesadded by LMGVarious fixes and optimizationsby everyoneToma implementation (in sandbox)by ThomasTMQL implementation (in sandbox)?by Sven Krosse
  • 8. The big pictureAuto-class.A.N.otherA.N.otherOtherCMSsA.N.otherA.N.otherDB2TMPortlet supportOKPXML2TMEngineCMSintegrationData integrationEscenicTaxon.importOntopolyWebservice
  • 9. The engineCore APITMAPI 2.0 supportImport/exportRDF conversionTMSyncFulltext searchEvent APItolog query languagetolog update languageEngine
  • 10. Query EngineImplementation of Ontopia’s tolog language (based on Prolog and SQL)Allows powerful queries on the topic map data structureSimplifies application development and improves performanceExample:select $B, count($A) from instance-of($B, city),{ premiere($A : opera, $B : place) | premiere($A : opera, $C : place), located-in($C : containee, $B : container) } order by $A desc?returns all B's and the corresponding number of A's whereB is a city ANDEITHER B is the place where A was premieredOR the place where A was premiered is located in B in decreasing order of ATMSyncConfigurable module for synchronizing one TM against anotherdefine subset of source TM to sync (using tolog)define subset of target TM to sync (using tolog)the module handles the restCan also be used with non-TM sourcescreate a non-updating conversion from the source to some TM formatthen use TMSync to sync against the converted TM instead of directly against the source
  • 11. How TMSync worksDefine which part of the target topic map you want,Define which part of the source topic map it is the master for, andThe algorithm does the rest
  • 12. If the source is not a topic mapTMSyncconvert.xsltSimply do a normal one-time conversionlet TMSync do the update for youIn other words, TMSync reduces the update problem to a conversion problemsource.xml
  • 13. The City of Bergen usecaseNorge.noServiceUnitPersonLOSCity of BergenLOS
  • 14. The backendsIn-memoryno persistent storagethread-safeno setupRDBMStransactionspersistentthread-safeuses cachingclusteringRemoteuses web serviceread-onlyunofficialEngineMemoryRDBMSRemote
  • 15. RDBMS BackendAllows the Engine to use topic maps stored in a relational databaseBased on a generic topic map schemaNecessary when working with very large topic mapsTransparent to applicationsFeaturesAutomatically loads data when neededCaches frequently used dataFull support for RDBMS transactionsSupports tolog-to-SQL compilationStatistical reports for performance tuningPlatform supportOracle, MySQL, PostgreSQL, MS SQL ServerTest suite available for verifying compatibility with other JDBC-enabled RDBMSes
  • 16. DB2TMUpconversion to TMsfrom RDBMS via JDBCor from CSVUses XML mappingcan call out to JavaSupports synceither full rescanor change tableTMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 17. DB2TM exampleOntopia+=United NationsBouvet<relation name="organizations.csv" columns="id name url"> <topic type="ex:organization"> <item-identifier>#org${id}</item-identifier> <topic-name>${name}</topic-name> <occurrence type="ex:homepage">${url}</occurrence> </topic></relation>
  • 18. TMRAPWeb service interfacevia SOAPvia plain HTTPRequestsget-topicget-topic-pageget-tologdelete-topic...TMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 19. Navigator frameworkServlet-based APImanage topic mapsload/scan/delete/createJSP tag libraryXSLT-likebased on tologJSTL integrationTMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 20. Ontopia Navigator FrameworkJava API for interacting with TM repositoryJSP tag librarybased on tologkind of like XSLT in JSP with tolog instead of XPathhas JSTL integrationUndocumented partsweb presentation componentssome wrapped as JSP tagswant to build proper portlets from them
  • 22. Navigator tag library example <%-- assume variable 'composer' is already set --%><p><b>Operas:</b><br/><tolog:foreach query=”composed-by(%composer% : composer, $OPERA : opera), { premiere-date($OPERA, $DATE) }?”> <li> <a href="opera.jsp?id=<tolog:id var="OPERA"/>” ><tolog:out var="OPERA"/></a> <tolog:if var="DATE"> <tolog:out var="DATE"/> </tolog:if> </li></tolog:foreach></p>
  • 27. Automated classificationUndocumentedexperimentalExtracts textautodetects formatWord, PDF, XML, HTMLProcesses textdetects languagestemming, stop-wordsExtracts keywordsranked by importanceuses existing topicssupports compound termsTMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 28. Example of keyword extractiontopic maps 1.0metadata 0.57subject-based class. 0.42Core metadata 0.42faceted classification 0.34taxonomy 0.22monolingual thesauri 0.19controlled vocabulary 0.19Dublin Core 0.16thesauri 0.16Dublin 0.15keywords 0.15
  • 29. Example #2Automated classification 1.0 5Topic Maps 0.51 14XSLT 0.38 11compound keywords 0.29 2keywords 0.26 20Lars 0.23 1Marius 0.23 1Garshol 0.22 1...
  • 30. So how could this be used?To help users classify new documents in a CMS interfacesuggest appropriate keywords, screened by user before approvalAutomate classification of incoming documentsthis means lower quality, but also lower costGet an overview of interesting terms in a document corpusclassify all documents, extract the most interesting termsthis can be used as the starting point for building an ontology(keyword extraction only)
  • 31. Example user interfaceThe user creates an articlethis screen then used to add keywordsuser adjusts the proposals from the classifier
  • 32. VizigatorVizOntopolyGraphical visualizationVizDesktopSwing app to configurefilter/style/...VizletJava applet for webuses configurationloads via TMRAPuses “Remote” backendTMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 33. The VizigatorGraphical visualization of Topic MapsTwo partsVizDesktop: Swing desktop app for configurationVizlet: Java applet for web deploymentConfiguration stored in XTM file
  • 36. The VizigatorThe Vizigator uses TMRAPthe Vizlet runs in the browser (on the client)a fragment of the topic map is downloaded from the serverthe fragment is grown as neededServerTMRAP
  • 37. OntopolyVizOntopolyGeneric editorweb-based, AJAXmeta-ontology in TMOntology designercreate types and fieldscontrol user interfacebuild viewsincremental devInstance editorguided by ontologyTMRAPNavDB2TMClassifyEngineMemoryRDBMSRemote
  • 38. OntopolyA generic Topic Maps editor, in two partsontology editor: used to create the ontology and schemainstance editor: used to enter instances based on ontologyBuilt with the Web Editor Frameworkworks with both XTM files and topic maps stored in RDBMS backendsupports access control to administrative functions, ontology, and instance editorsexisting topic maps can be importedparts of the ontology can be marked as read-only, or hidden
  • 41. CMS integration: The best way to add content functionality to Ontopia: the world doesn’t need another CMS; better to reuse those which already exist. So far two integrations exist: Escenic; OfficeNet Knowledge Portal; more are being worked on.
  • 42. Implementation: A CMS event listener: the listener creates topics for new CMS articles, folders, etc.; the mapping is basically the design of the ontology used by this listener. Presentation integration: it must be possible to list all topics attached to an article; conversely, it must be possible to list all articles attached to a topic; how close the integration needs to be here will vary, as will the difficulty of the integration. User interface integration: it needs to be possible to attach topics to an article from within the normal CMS user interface; this can be quite tricky. Search integration: the Topic Maps search needs to also search content in the CMS; can be achieved by writing a tolog plug-in.
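The event-listener part of this design could look roughly like the sketch below. Everything in it is hypothetical: `CmsEvent`, `Kind`, and `Listener` are invented for illustration, the `cms:` identifier prefix is an assumption, and a plain map stands in for the topic map. A real integration would use the CMS's own event API and Ontopia's engine API instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the Escenic or Ontopia API.
class CmsListenerSketch {

  enum Kind { ARTICLE, FOLDER }

  // A "new object created" notification from the CMS.
  static final class CmsEvent {
    final Kind kind;
    final String objectId;
    final String title;

    CmsEvent(Kind kind, String objectId, String title) {
      this.kind = kind;
      this.objectId = objectId;
      this.title = title;
    }
  }

  // Creates one topic per new CMS object. The choice of topic type and
  // name here is the "mapping" the slide refers to.
  static final class Listener {
    // Stand-in topic map: topic id -> { topic type, topic name }.
    private final Map<String, String[]> topics = new LinkedHashMap<>();

    void onCreated(CmsEvent event) {
      String type = (event.kind == Kind.ARTICLE) ? "article" : "folder";
      topics.put("cms:" + event.objectId, new String[] { type, event.title });
    }

    // Topic type recorded for a CMS object, or null if none exists.
    String typeOf(String objectId) {
      String[] topic = topics.get("cms:" + objectId);
      return topic == null ? null : topic[0];
    }

    // Topic name recorded for a CMS object, or null if none exists.
    String nameOf(String objectId) {
      String[] topic = topics.get("cms:" + objectId);
      return topic == null ? null : topic[1];
    }
  }

  public static void main(String[] args) {
    Listener listener = new Listener();
    listener.onCreated(new CmsEvent(Kind.ARTICLE, "42", "New city council appointed"));
    System.out.println(listener.typeOf("42") + ": " + listener.nameOf("42"));
  }
}
```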
  • 43. Articles as topics: [Diagram labels: is about, Elections, New city council appointed] Goal: associate articles with topics: mainly to say what they are about; typically also want to include other metadata. Need to create topics for the articles to do this: in fact, a general CMS-to-TM mapping is needed; must decide what metadata and structures to include.
  • 44. Mapping issues: Article topics: what topic type to use? title becomes name? (do you know the title?) include author? include last modified? include workflow state? should all articles be mapped? Folders/directories/sections/...: should these be mapped, too? one topic type for all folders/.../.../...? if so, use associations to connect articles to folders; use associations to reproduce hierarchical folder structure. Multimedia objects: should these be included? what topic type? what name? ...
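One way to read "use associations to reproduce hierarchical folder structure" is to emit one parent/child association per level of a CMS path. The sketch below does this with plain strings; the `contained-in` association type and the `FolderMapping` class are assumptions for illustration, not names used by Ontopia or any CMS integration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the association type name is invented.
class FolderMapping {

  // Turns a CMS path such as "/news/politics/a1" into one
  // "contained-in(child, parent)" association per nesting level.
  static List<String> associations(String path) {
    List<String> result = new ArrayList<>();
    String parent = null;
    String current = "";
    for (String part : path.split("/")) {
      if (part.isEmpty()) {
        continue; // skip the empty segment before the leading slash
      }
      current = current + "/" + part;
      if (parent != null) {
        result.add("contained-in(" + current + ", " + parent + ")");
      }
      parent = current;
    }
    return result;
  }

  public static void main(String[] args) {
    for (String association : associations("/news/politics/a1")) {
      System.out.println(association);
    }
  }
}
```

The same association type connects articles to folders and folders to their parent folders here, which corresponds to the "one topic type for all folders" option; a mapping with distinct types per level would need a different association for each.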
  • 45. Two styles of mappings: Articles as articles: Topic represents only the article; Topic type is some subclass of “article”; “Is about” association connects article into topic map; Fields are presentational (title, abstract, body). Articles as concepts: Topic represents some real-world subject (like a person), article is just the default content about that subject; Type is the type of the subject (person); Semantic associations to the rest of the topic map (works in department, has competence, ...); Fields can be semantic (name, phone no, email, ...).
  • 46. Article as article: Article about building of a new school; Is about association to “Primary schools”; Topic type is “article”.
  • 47. Article as concept: Article about a sports hall; Article really represents the hall; Topic type is “Location”; Associations to city borough
  • 48. events in the location
  • 55. The project: A new citizen’s portal for the city administration: strategic decision to make portal main interface for interaction with citizens; as many services as possible are to be moved online. Big project: started in late 2004, to continue at least into 2008; ~5 million Euro spent by launch date; 1.7 million Euro budgeted for 2007; Topic Maps development is a fraction of this (less than 25%). Many companies involved: Bouvet/Ontopia; Avenir; KPMG; Karabin; Escenic.
  • 56. Simplified original ontology [Diagram labels: Service catalog, Escenic (CMS), LOS, Form, Article, nearly everything, Category, Service, Subject, Department, Borough, External resource, Employee, Payroll++]
  • 61. NRK/Skole: Norwegian National Broadcasting (NRK): media resources from the archives; published for use in schools; integrated with the National Curriculum. In production: delayed by copyright wrangling. Technologies: OKS; Polopoly CMS; MySQL database; Resin application server.
  • 64. Curriculum-based browsing (3): Feminist movement in the 70s and 80s; Changes to the family in the 70s; The prime minister’s husband; Children choosing careers; Gay partnerships in 1993.
  • 65. One video (prime minister’s husband) [Screenshot labels: Metadata, Subject, Person, Related resources, Description]
  • 67. Implementation: Domain model in Java: plain old Java objects built on Ontopia’s Java API and tolog. JSP for presentation: using JSTL on top of the domain model. Subversion for the source code. Maven2 to build and deploy. Unit tests.
  • 68. The future: What we’d like to see
  • 69. The big picture [Diagram labels: Auto-class., A.N.other, Other CMSs, DB2TM, Portlet support, OKP, XML2TM, Engine, CMS integration, Data integration, Escenic, Taxon. import, Ontopoly, Web service]
  • 70. CMS integrations: The more of these, the better. Candidate CMSs: Liferay (being worked on at Bouvet); Alfresco (might be started soon); Magnolia; Inspera (possible project here); JSR-170 Java Content Repository; CMIS (OASIS web service standard).
  • 71. Portlet toolkit: Subversion contains a number of “portlets”: basically, Java objects doing presentation tasks; some have JSP wrappers as well. Examples: display tree view; list of topics filterable by facets; show related topics; get-topic-page via TMRAP component. Not ready for prime-time yet: undocumented; incomplete.
  • 72. Ontopoly plug-ins: Plugins for getting more data from externals: TMSync import plugin; DB2TM plugin; Subj3ct.com plugin; adapted RDF2TM plugin; classify plugin; ... Plugins for ontology fragments: menu editor, for example.
  • 73. TMCL: Now implementable. We’d like to see: an object model for TMCL (supporting changes); a validator based on the object model; Ontopoly import/export from TMCL (initially); refactor Ontopoly API to make it more portable; Ontopoly ported to use TMCL natively (eventually).
  • 74. Things we’d like to remove: OSL support: Ontopia Schema Language. Web editor framework: unfortunately, still used by some major customers. Fulltext search: the old APIs for this are not really of any use.
  • 75. Management interface: Import topic maps (to file or RDBMS)
  • 76. What do you think? Suggestions? Questions? Plans? Ideas?
  • 77. Getting started: Setting up the developer environment
  • 78. If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go
  • 79. If you are developing Ontopia... You must have: Java 1.5 (not 1.6 or 1.7 or ...); Ant 1.6 (or later); Ivy 2.0 (or later); Subversion. Then check out the source from Subversion and build: svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only; ant bootstrap; ant dist.jar.ontopia; ant test; ant dist.ontopia
  • 80. Beware: This is fun, because: you can play around with anything you want (e.g., my build has a faster TopicIF.getRolesByType); you can track changes as they happen in svn. However, you’re on your own: if it fails it’s kind of hard to say why; maybe it’s your changes, maybe not. For production use, official releases are best.
  • 82. Our goal: To provide the best toolkit for building Topic Maps-based applications. We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable.
  • 83. Our philosophy: We want Ontopia to provide as much useful more-or-less generic functionality as possible. New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others.
  • 84. The sandbox: There’s a lot of Ontopia-related code which does not meet those requirements: some of it can be very useful; someone may pick it up and improve it. The sandbox is for these pieces: some are in Ontopia’s Subversion repository, others are maintained externally. To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements.
  • 85. Communications: Join the mailing list(s)! (http://groups.google.com/group/ontopia and http://groups.google.com/group/ontopia-dev) Google Code page (http://code.google.com/p/ontopia/): note the “updates” feed! Blog (http://ontopia.wordpress.com) Twitter (http://twitter.com/ontopia)
  • 86. Committers: These are the people who run the project: they can actually commit to Subversion; they can vote on decisions to be made, etc. Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion.
  • 87. How to become a committer: Participate in the project! That is, get involved first; let people get to know you, show some commitment. Once you’ve gotten some way into the project you can ask to become a committer: best if you have provided some patches first. Unless you’re going to commit changes there’s no need to be a committer.
  • 88. Finding a task to work on: Report bugs! They exist; if you find any, please report them. Look at the open issues: there is always testing/discussion to be done. Look for issues marked “newbie” (http://code.google.com/p/ontopia/issues/list?q=label:Newbie). Look at what’s in the sandbox: most of these modules need work. Scratch an itch: if there’s something you want fixed/changed/added...
  • 89. How to fix a bug: First figure out why you think it fails. Then write a test case: based on your assumption; make sure the test case fails (test before you fix). Then fix the bug: follow the coding guidelines (see wiki). Then run the test suite: verify that you’ve fixed the bug; verify that you haven’t broken anything. Then submit the patch.
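As a rough illustration of the "test before you fix" step, the sketch below pairs a small method with the regression check that would have been written first. The bug and the `sameName` method are invented for this example and are not taken from the Ontopia code base; a real test would live in one of the *.test packages and use the project's test framework rather than a hand-rolled check.

```java
// Hypothetical example; the method and the bug are invented.
class BugFixSketch {

  // The method after the fix. Suppose the reported bug was that names
  // with stray surrounding whitespace were treated as different.
  static boolean sameName(String a, String b) {
    return a.trim().equals(b.trim());
  }

  // The regression check, written before the fix. Against the unfixed
  // method (plain a.equals(b)) the first check throws, which confirms
  // that the test really captures the bug.
  static void testPaddedNamesMatch() {
    if (!sameName("Oslo ", "Oslo")) {
      throw new AssertionError("padded name should match");
    }
    if (sameName("Oslo", "Bergen")) {
      throw new AssertionError("different names must not match");
    }
  }

  public static void main(String[] args) {
    testPaddedNamesMatch();
    System.out.println("regression test passed");
  }
}
```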
  • 90. The test suite: Lots of *.test packages in the source tree: 3148 test cases as of right now; test data in ontopia/src/test-data; some tests are generators based on files; some of the test files come from cxtm-tests.sf.net. Run with: ant test; java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
  • 91. Source tree structure: net.ontopia: utils (various utilities); test (various test support code); infoset (LocatorIF code + cruft); persistence (OR-mapper for RDBMS backend); product (cruft); xml (various XML-related utilities); topicmaps (next slides)
  • 92. Source tree structure: net.ontopia.topicmaps: core (core engine API); impl (engine backends + utils); utils (utilities, see next slide); cmdlineutils (command-line tools); entry (TM repository); nav + nav2 (navigator framework); query (tolog engine); viz; classify; db2tm; webed (cruft)
  • 93. Source tree structure: net.ontopia.topicmaps.utils: * (various utility classes); ltm (LTM reader and writer); ctm (CTM reader); rdf (RDF converter, both ways); tmrap (TMRAP implementation)
  • 95. The engine: The core API corresponds closely to the TMDM: TopicMapIF, TopicIF, TopicNameIF, ... Compile with: ant init compile.ontopia (.class files go into ontopia/build/classes); ant dist.ontopia.jar # makes a jar
  • 96. The importers: Main class implements TopicMapReaderIF: usually, this lets you set up configuration, etc.; then uses other classes to do the real work. XTM importers: use an XML parser; main work done in XTM(2)ContentHandler; some extra code for validation and format detection. CTM/LTM importers: use Antlr-based parsers; real code in ctm.g/ltm.g. All importers work via the core API.
  • 97. Fixing a real bug: There is a failing test case in the TM/XML importer. So let’s fix that right now...
  • 98. Find an issue in the issue tracker (picking one with “Newbie” might be good, but isn’t necessary). Get set up: check out the source code; build the code; run the test suite. Then dig in: we’ll help you with any questions you have. At the end, submit a patch to the issue tracker: remember to use the test suite!