Ontopia Tutorial
TMRA 2010-09-29
Lars Marius Garshol & Geir Ove Grønmo

Agenda
About you
  who are you?
About Ontopia
The product
The future
Participating in the project

Some background
About Ontopia

Brief history
1999-2000
  private hobby project for Geir Ove
2000-2009
  commercial software sold by Ontopia AS
  lots of international customers in diverse fields
2009-
  open source project

The project
Open source hosted at Google Code
Contributors
  Lars Marius Garshol, Bouvet
  Geir Ove Grønmo, Bouvet
  Thomas Neidhart, SpaceApps
  Lars Heuer, Semagia
  Hannes Niederhausen, TMLab
  Stig Lau, Bouvet
  Baard H. Rehn-Johansen, Bouvet
  Peter-Paul Kruijssen, Morpheus
  Quintin Siebers, Morpheus
  Matthias Fischer, HTW Berlin

Recent work
Ontopia/Liferay integration
  Matthias Fischer & LMG
Various fixes and optimizations
  everyone
Tropics (RESTful web service interface)
  SpaceApps
Porting build system to Maven2
  Morpheus

Architecture and modules
Product overview
The big picture
[Diagram: the Engine surrounded by its modules and integrations: Ontopoly, auto-classification, DB2TM, TMSync, web service, portlet support, OKP, taxonomy import, data integration, and CMS integration (Escenic, other CMSs, A.N.other)]
The engine
Core API
TMAPI 2.0 support
Import/export
RDF conversion
TMSync
Fulltext search
Event API
tolog query language
tolog update language

The backends
In-memory
  no persistent storage
  thread-safe
  no setup
RDBMS
  transactions
  persistent
  thread-safe
  uses caching
  clustering
Remote
  uses web service
  read-only
  unofficial
[Diagram: the Engine on top of the Memory, RDBMS, and Remote backends]

DB2TM
Upconversion to TMs
  from RDBMS via JDBC
  or from CSV
Uses XML mapping
  can call out to Java
Supports sync
  either full rescan
  or change table
[Diagram: module stack with TMRAP, Nav, DB2TM, and Classify on top of the Engine and its Memory, RDBMS, and Remote backends]

TMRAP
Web service interface
  via SOAP
  via plain HTTP
Requests
  get-topic
  get-topic-page
  get-tolog
  delete-topic
  ...

Navigator framework
Servlet-based API
  manage topic maps
  load/scan/delete/create
JSP tag library
  XSLT-like
  based on tolog
  JSTL integration

Automated classification
Undocumented
  experimental
Extracts text
  autodetects format
  Word, PDF, XML, HTML
Processes text
  detects language
  stemming, stop-words
Extracts keywords
  ranked by importance
  uses existing topics
  supports compound terms

Vizigator
Graphical visualization
VizDesktop
  Swing app to configure
  filter/style/...
Vizlet
  Java applet for web
  uses configuration
  loads via TMRAP
  uses “Remote” backend

Ontopoly
Generic editor
  web-based, AJAX
  meta-ontology in TM
Ontology designer
  create types and fields
  control user interface
  build views
  incremental dev
Instance editor
  guided by ontology
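Typical deployment
[Diagram: an application server hosting the viewing application, Ontopoly, and the frameworks on top of the engine and its DB backend; users and editors connect over HTTP; TMRAP serves an external application; DB2TM reads from external DBs]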
APIs
The engine

Core APIs
net.ontopia.topicmaps.core.*
Fairly direct mapping from TMDM
  TopicIF
  AssociationIF
  TopicMapIF
  ...
Set/get methods reflect TMDM properties

TopicIF
Interface, not a class
getTopicNames()
addTopicName(TopicNameIF)
removeTopicName(TopicNameIF)
getOccurrences() + add + remove
getSubjectIdentifiers() + add + remove
getItemIdentifiers() + add + remove
getSubjectLocators() + add + remove
getRoles()
getRolesByType(TopicIF)

Core interfaces
TopicMapStoreIF
TopicMapIF
TopicIF
AssociationIF
TopicNameIF
OccurrenceIF
AssociationRoleIF
VariantNameIF

How to get a TopicMapIF
Create one directly
  new net...impl.basic.InMemoryTopicMapStore()
Load one from file
  using an importer (next slide)
Connect to an RDBMS
  covered later
Use a topic map repository
  covered later
TopicMapReaderIF
import net.ontopia.topicmaps.core.TopicMapIF;
import net.ontopia.topicmaps.core.TopicMapReaderIF;
import net.ontopia.topicmaps.utils.ImportExportUtils;

public class TopicCounter {
  public static void main(String[] argv) throws Exception {
    // pick a reader based on the file name, then load the whole topic map
    TopicMapReaderIF reader = ImportExportUtils.getReader(argv[0]);
    TopicMapIF tm = reader.read();
    System.out.println("TM contains " + tm.getTopics().size() + " topics");
  }
}

[larsga@c716c5ac1 tmp]$ java TopicCounter ~/data/bilder/privat/metadata.xtm
TM contains 17035 topics
[larsga@c716c5ac1 tmp]$
Supported syntaxes
The utility classes
A set of classes outside the core interfaces that perform common tasks
  a number of these utilities are obsolete now that tolog is here
They are all built on top of the core interfaces
Some important utilities
  ImportExportUtils          creates readers and writers
  MergeUtils                 merges topics and topic maps
  PSI                        contains important PSIs
  DeletionUtils              cascading delete of topics
  DuplicateSuppressionUtils  removes duplicates
  TopicStringifiers          find names for topics

Topic Maps repository
Uses a set of topic map sources to build a set of topic maps
  topic maps can be looked up by ID
Many kinds of sources
  scan directory for files matching pattern
  download from URL
  connect to RDBMS
  ...
Configurable using an XML file
  tm-sources.xml
Used by Navigator Framework
Event API
Allows clients to receive notification of changes
Must implement TopicMapListenerIF

  static class TestListener extends AbstractTopicMapListener {
    public void objectAdded(TMObjectIF snapshot) {
      System.out.println("Topic added: " + snapshot.getObjectId());
    }
    public void objectModified(TMObjectIF snapshot) {
      System.out.println("Topic modified: " + snapshot.getObjectId());
    }
    public void objectRemoved(TMObjectIF snapshot) {
      System.out.println("Topic removed: " + snapshot.getObjectId());
    }
  }

Using the API
    // 'ref' is the TopicMapReferenceIF for the topic map, obtained earlier
    // register to listen for events
    TestListener listener = new TestListener();
    TopicMapEvents.addTopicListener(ref, listener);
    // get the store through the reference so the listener is registered
    TopicMapIF tm = ref.createStore(false).getTopicMap();
    // let's add a topic
    System.out.println("Off we go");
    TopicMapBuilderIF builder = tm.getBuilder();
    TopicIF newbie = builder.makeTopic(tm);
    System.out.println("Let's name this topic");
    builder.makeTopicName(newbie, "Newbie topic");
    // then let's remove it
    System.out.println("And now, the exit");
    DeletionUtils.remove(newbie);
    System.out.println("Goodbye, short-lived topic");

[larsga@dhcp-98 tmp]$ java EventTest bkclean.xtm
Off we go
Topic added: 3409
Let's name this topic
Topic modified: 3409
Topic modified: 3409
And now, the exit
Topic removed: 3409
Goodbye, short-lived topic
For more information
See the Engine Developer's Guide
  http://www.ontopia.net/doc/current/doc/engine/devguide.html

Persistence & transactions
RDBMS backend

RDBMS backend
Stores Topic Maps in an RDBMS
  generic schema
  access via JDBC
Provides full ACID
  transactions
  concurrency
  ...
Supports several databases
  Oracle, MySQL, PostgreSQL, MS SQL Server, hsql
Clustering support

Core API implementation
Implements same API as in-memory impl
  theoretically, a switch requires only config change
Lazy loading of objects
  objects loaded from DB as needed
Considerable internal caching
  for performance reasons
Separate objects for separate transactions
  in order to provide isolation
Shared cache between transactions

Configuration
A Java property file
Specifies
  database type
  JDBC URL
  username + password
  cache settings
  clustering settings
  ...

jdbcspy
A built-in SQL profiler
Useful for identifying cause of performance issues

tolog
The Query Engine

tolog
A logic-based query language
  a mix of Prolog and SQL
  effectively equivalent to Datalog
Two parts
  queries (data retrieval)
  updates (data modification)
Developed by Ontopia
  not an ISO standard
  eventually to be replaced by TMQL

tolog
The recommended way to interact with the data
  API programming is slow and cumbersome
  tolog queries perform better
Available via
  Java API
  Web service API
  Forms interface in Omnigator
tolog queries return API objects
Finding all operas by a composer
    Collection operas = new ArrayList();
    TopicIF composer = getTopicById("puccini");
    TopicIF composed_by = getTopicById("composed-by");
    TopicIF work = getTopicById("work");
    TopicIF creator = getTopicById("composer");
    for (AssociationRoleIF role1 : composer.getRolesByType(creator)) {
      AssociationIF assoc = role1.getAssociation();
      if (assoc.getType() != composed_by)
        continue;
      for (AssociationRoleIF role2 : assoc.getRoles()) {
        if (role2.getType() != work)
          continue;
        operas.add(role2.getPlayer());
      }
    }

Finding all operas by a composer
composed-by(puccini : composer, $O : work)?
composed-by($C : composer, tosca : work)?
composed-by($C : composer, $O : work)?
composed-by(puccini : composer, tosca : work)?
Features
Access all aspects of a topic map
Generic queries independent of ontology
AND, OR, NOT, OPTIONAL
Count
Sort
LIMIT/OFFSET
Reusable inference rules

Chaining predicates (AND)
Predicates can be chained
  born-in($PERSON : person, $PLACE : place),
  located-in($PLACE : containee, italy : container)?
The comma between the predicates means AND
This query finds all the people born in Italy
  It first builds a two-column table of all born-in associations
  Then, those rows where the place is not located-in Italy are removed
  (Note that reusing the $PLACE variable means that the birthplace and the location must be the same topic in each match)
Any number of predicates can be chained
  Their order is insignificant
  Actually, the optimizer reorders the predicates
  It will start with located-in because it has a topic constant
Thinking in predicatesMost of you are probably used to functions, which work like this:
function(arg1, arg2, arg3) -> result
Predicates, however, are in a sense bidirectional, because of the way the pattern matching works
predicate(topic : role1, $VAR : role2)
predicate($VAR : role1, topic : role2)
The order of the roles is, on the other hand, insignificant
predicate(topic : role1, $VAR : role2)
predicate($VAR : role2, topic : role1)

Projection
Sometimes queries make use of temporary variables that we are not really interested in
The way to get rid of unwanted variables is projection
Syntax:
  select $variable1, $variable2, ... from <query>?
The query is first run, then projected down to the requested variables
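A small Python sketch may help make the "bidirectional" behaviour and projection concrete. It is a toy pattern matcher under stated assumptions, not tolog itself: associations are modelled as role-to-player dicts, the helper names match and select are invented here, and projection is assumed to drop duplicate rows.

```python
def is_var(term):
    """Variables are written with a leading '$', as in tolog."""
    return isinstance(term, str) and term.startswith("$")

def match(facts, pattern):
    """Yield one variable binding per fact that fits the pattern.
    Facts and patterns both map role names to players, so the order
    in which roles are written does not matter."""
    for fact in facts:
        if set(fact) != set(pattern):
            continue
        binding = {}
        for role, term in pattern.items():
            if is_var(term):
                # A variable used twice must bind to the same player.
                if binding.setdefault(term, fact[role]) != fact[role]:
                    break
            elif term != fact[role]:
                break
        else:
            yield binding

def select(variables, bindings):
    """Projection: keep only the requested variables, without duplicates."""
    rows = []
    for binding in bindings:
        row = tuple(binding[v] for v in variables)
        if row not in rows:
            rows.append(row)
    return rows

# Hypothetical composed-by associations.
COMPOSED_BY = [
    {"composer": "puccini", "work": "tosca"},
    {"composer": "puccini", "work": "la-boheme"},
    {"composer": "verdi", "work": "aida"},
]

# Same predicate, queried in both directions.
print(list(match(COMPOSED_BY, {"composer": "puccini", "work": "$O"})))
print(list(match(COMPOSED_BY, {"work": "tosca", "composer": "$C"})))
# select $C from composed-by($C : composer, $O : work)?
print(select(["$C"], match(COMPOSED_BY, {"composer": "$C", "work": "$O"})))
```

The same match function answers "which works did Puccini compose?" and "who composed Tosca?"; only the position of the variable changes, which is the contrast with ordinary functions drawn above.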
The instance-of predicateinstance-of has the following form:
instance-of (instance,class)
NOTE: the order of the arguments is significant
Like players, instance and class may be specified in two ways:
using a variable ($name)
using a topic reference
e.g. instance-of ( $A, city )
instance-of makes use of the superclass-subclass associations in the topic map
this means that composers will be considered musicians, and musicians will be considered persons

Cities with the most premieres
using o for i"http://psi.ontopedia.net/"
select $CITY, count($OPERA) from
  instance-of($CITY, o:City),
  { o:premiere($OPERA : o:Work, $CITY : o:Place) |
    o:premiere($OPERA : o:Work, $THEATRE : o:Place),
    o:located_in($THEATRE : o:Containee, $CITY : o:Container) }
order by $OPERA desc?
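To see what the "cities with the most premieres" query computes, here is a rough Python equivalent over made-up data. It is a sketch of the query's logic only: the two branches of the { ... | ... } clause become the two ways an opera can be attributed to a city, and the names CITIES, PREMIERES, LOCATED_IN, and premieres_per_city are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical data standing in for the topic map.
CITIES = {"milano", "roma"}                      # instance-of($CITY, o:City)
PREMIERES = [                                    # (opera, place of premiere)
    ("tosca", "teatro-costanzi"),
    ("otello", "teatro-alla-scala"),
    ("falstaff", "teatro-alla-scala"),
    ("turandot", "milano"),
]
LOCATED_IN = {                                   # theatre -> containing city
    "teatro-costanzi": "roma",
    "teatro-alla-scala": "milano",
}

def premieres_per_city():
    operas_by_city = defaultdict(set)
    for opera, place in PREMIERES:
        # First OR branch: the premiere is recorded directly on the city.
        if place in CITIES:
            operas_by_city[place].add(opera)
        # Second OR branch: the premiere is in a theatre located in the city.
        city = LOCATED_IN.get(place)
        if city in CITIES:
            operas_by_city[city].add(opera)
    # count($OPERA) per $CITY, then order by the count, descending.
    rows = [(city, len(operas)) for city, operas in operas_by_city.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

print(premieres_per_city())
```

Cities without any premiere never get a row, which mirrors the query: the braces group alternatives that must match, they do not make the clause optional.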
All non-hidden photos
select $PHOTO from
  instance-of($PHOTO, op:Photo),
  not(ph:hide($PHOTO : ph:hidden)),
  not(ph:taken-at($PHOTO : op:Image, $PLACE : op:Place),
      ph:hide($PLACE : ph:hidden)),
  not(ph:taken-during($PHOTO : op:Image, $EVENT : op:Event),
      ph:hide($EVENT : ph:hidden)),
  not(ph:depicted-in($PHOTO : ph:depiction, $PERSON : ph:depicted),
      ph:hide($PERSON : ph:hidden))?
Demo
Show running queries in Omnigator
Also show query tracing
  Shakespeare
  /* #OPTION: optimizer.reorder = false */

tologspy
tolog query profiler
shares code with jdbcspy

Using the query engine API
The query engine API is really simple to use
  1. get a QueryProcessorIF object
  2. run a query in the QueryProcessorIF and get a QueryResultIF
  3. loop over the results and use them
  4. close the result object
  5. go back to step 2, or do something else
There are two different QueryProcessorIF implementations
the API lets you write code without worrying about that, however
the two implementations behave identically

Running a query with the API
TopicMapIF tm = ...;
QueryProcessorIF processor = QueryUtils.getQueryProcessor(tm);
QueryResultIF result = processor.execute("instance-of($P, person)?");
try {
  while (result.next()) {
    TopicIF person = (TopicIF) result.getValue(0);
    // do something useful with 'person'
  }
} finally {
  result.close();
}
Advanced optionsIt is possible to parse a query once, and then run it many times
the processor returns a ParsedQueryIF object, which can be executed
parameters can be passed to the query on each execution
It is possible to make declarations and use them across executions

Using a parsed query
ParsedQueryIF parsedQuery = processor.parse("instance-of($P, %TYPE%)?");
Map params = Collections.singletonMap("TYPE", person);
QueryResultIF result = parsedQuery.execute(params);
try {
  while (result.next()) {
    // ...
  }
} finally {
  result.close();
}

QueryWrapper
Designed to make all of this easier
QueryWrapper qw = new QueryWrapper(tm);
TopicIF topic = qw.queryForTopic(...);
List topics = qw.queryForList(...);
List<Person> people = qw.queryForList(..., mapper);
tolog updates
Greatly simplifies TM modification
Also means you can do modification without API programming
  useful with RDBMS topic maps
  useful with TMs in running web servers
By performing a sequence of updates, just about any change can be made
Potentially allows much more powerful architecture

DELETE
Static form
  delete lmg
Dynamic form
  delete $person from instance-of($person, person)
Delete a value
  delete subject-identifier(topic, "http://ex.org/tst")

MERGE
Static form
  MERGE topic1, topic2
Dynamic form
  MERGE $p1, $p2 FROM
    instance-of($p1, person),
    instance-of($p2, person),
    email($p1, $email),
    email($p2, $email)

INSERT
Static form
  INSERT lmg isa person; - "Lars Marius Garshol" .
Dynamic form
  INSERT
    tmcl:belongs-to-schema(tmcl:container : theschema, tmcl:containee: $c)
  FROM instance-of($c, tmcl:constraint)

INSERT again
INSERT
  ?y $psi .
  event-in-year(event: $e, year: ?y)
FROM
  start-date($e, $date),
  str:substring($y, $date, 4),
  str:concat($psi, "http://psi.semagia.com/iso8601", $y)

UPDATE
Static form
  UPDATE value(@3421, "New name")
Dynamic form
  UPDATE value($TN, "Ontopia")
  FROM topic-name(oks, $TN)

More information
Look at sample queries in Omnigator
tolog tutorial
  http://www.ontopia.net/doc/current/doc/query/tutorial.html
tolog built-in predicate reference
  http://www.ontopia.net/doc/current/doc/query/predicate-reference.html
Conversion from RDBMS data
DB2TM

DB2TM
Upconversion of relational data
  either from CSV files or
  over JDBC
Based on an XML file describing the mapping
  very highly configurable
Support for
  all of Topic Maps (except variants)
  value transformations
  synchronization

Standard use case
Pull in data from external source
  turn it into Topic Maps following some ontology
Enrich it
  usually manually, but not necessarily
Resync from source at intervals
DB2TM example
[Diagram: table rows (Ontopia, United Nations, Bouvet) + mapping = topics]
<relation name="organizations.csv" columns="id name url">
  <topic type="ex:organization">
    <item-identifier>#org${id}</item-identifier>
    <topic-name>${name}</topic-name>
    <occurrence type="ex:homepage">${url}</occurrence>
  </topic>
</relation>

Creating associations
<relation name="people.csv" columns="id given family employer phone">
  <topic id="employer">
    <item-identifier>#org${employer}</item-identifier>
  </topic>
  <topic type="ex:person">
    <item-identifier>#person${id}</item-identifier>
    <topic-name>${given} ${family}</topic-name>
    <occurrence type="ex:phone">${phone}</occurrence>
    <player atype="ex:employed-by" rtype="ex:employee">
      <other rtype="ex:employer" player="#employer"/>
    </player>
  </topic>
</relation>
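What the two mappings above do to each row can be sketched in Python. This is not DB2TM code, only an approximation of the row-to-topic idea: the CSV contents, the dict shapes, and the function names are invented for illustration, and Python's string.Template happens to use the same ${column} placeholder syntax as the mapping file.

```python
import csv
import io
from string import Template

# Hypothetical, header-less CSV data matching the 'columns' attributes.
ORGS_CSV = "1,Ontopia,http://example.org/ontopia\n2,Bouvet,http://example.org/bouvet\n"
PEOPLE_CSV = "1,Lars Marius,Garshol,2,555-0100\n"

def read_relation(text, columns):
    """Parse CSV text into dicts keyed by the mapping's column names."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=columns.split()))

def map_organization(row):
    """Mirror of the <topic type="ex:organization"> mapping."""
    return {
        "type": "ex:organization",
        "item-identifier": Template("#org${id}").substitute(row),
        "topic-name": Template("${name}").substitute(row),
        "occurrences": {"ex:homepage": Template("${url}").substitute(row)},
    }

def map_person(row):
    """Mirror of the <topic type="ex:person"> mapping and its <player> element."""
    topic = {
        "type": "ex:person",
        "item-identifier": Template("#person${id}").substitute(row),
        "topic-name": Template("${given} ${family}").substitute(row),
        "occurrences": {"ex:phone": Template("${phone}").substitute(row)},
    }
    # The employer topic is only referenced by its item identifier,
    # so it lines up with the topic created from organizations.csv.
    association = {
        "type": "ex:employed-by",
        "roles": {
            "ex:employee": topic["item-identifier"],
            "ex:employer": Template("#org${employer}").substitute(row),
        },
    }
    return topic, association

orgs = [map_organization(r) for r in read_relation(ORGS_CSV, "id name url")]
person, employment = map_person(
    read_relation(PEOPLE_CSV, "id given family employer phone")[0])
print(orgs[1]["item-identifier"], employment["roles"])
```

The key design point the sketch tries to show is that identifiers are derived from primary keys, so rows from different relations meet on the same topic without any explicit join.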
Value transformations
<relation name="SCHEMATA" columns="SCHEMA_NAME">
  <function-column name='SCHEMA_ID'
                   method='net.ontopia.topicmaps.db2tm.Functions.makePSI'>
    <param>${SCHEMA_NAME}</param>
  </function-column>
  <topic type="mysql:schema">
    <item-identifier>#${SCHEMA_ID}</item-identifier>
    <topic-name>${SCHEMA_NAME}</topic-name>
  </topic>
</relation>

Running DB2TM
java net.ontopia.topicmaps.db2tm.Execute
  command-line tool
  also works with RDBMS topic maps
net.ontopia.topicmaps.db2tm.DB2TM
  API class to run transformations
  methods "add" and "sync"

More information
DB2TM User's Guide
  http://www.ontopia.net/doc/current/doc/db2tm/user-guide.html
Synchronizing with other sources
TMSync

TMSync
Configurable module for synchronizing one TM against another
  define subset of source TM to sync (using tolog)
  define subset of target TM to sync (using tolog)
  the module handles the rest
Can also be used with non-TM sources
  create a non-updating conversion from the source to some TM format
  then use TMSync to sync against the converted TM instead of directly against the source

How TMSync works
Define which part of the target topic map you want,
Define which part of the source topic map it is the master for, and
The algorithm does the rest
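The slides do not spell out the algorithm, so the following Python sketch is only one plausible reading of "define the two subsets and let the module do the rest". Topic maps are reduced to dicts from topic id to a set of statements, the scope functions stand in for the tolog queries, and the function name sync is hypothetical; the real module is certainly more careful about what it overwrites.

```python
def sync(target, source, in_target_scope, in_source_scope):
    """Return a new target whose scoped part mirrors the scoped part of source.

    target, source: dict mapping topic id -> set of statements.
    in_target_scope, in_source_scope: predicates (topic_id, statements) -> bool,
    standing in for the tolog queries that select the subsets."""
    wanted = {tid: st for tid, st in source.items() if in_source_scope(tid, st)}
    result = {}
    for tid, statements in target.items():
        # In scope for syncing but gone from the source: treat as stale.
        if in_target_scope(tid, statements) and tid not in wanted:
            continue
        result[tid] = statements
    # Add new topics and replace changed ones with the source's version.
    result.update(wanted)
    return result

# Hypothetical data: only 'los:' topics are mastered by the source.
def los_only(topic_id, statements):
    return topic_id.startswith("los:")

target = {
    "los:water": {"name: Water"},
    "los:old": {"name: Retired category"},
    "local:note": {"name: Local note"},
}
source = {
    "los:water": {"name: Water supply"},
    "los:new": {"name: New category"},
    "other:x": {"name: Not synced"},
}
print(sync(target, source, los_only, los_only))
```

Topics outside both scopes are left alone on either side, which is what lets locally added content survive repeated syncs.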
If the source is not a topic map
Simply do a normal one-time conversion
  let TMSync do the update for you
In other words, TMSync reduces the update problem to a conversion problem
[Diagram: source.xml, convert.xslt, TMSync]

The City of Bergen use case
[Diagram labels: Norge.no, Service, Unit, Person, LOS, City of Bergen, LOS]
Web service interface
TMRAP

TMRAP basics
Abstract interface
  that is, independent of any particular technology
  coarse-grained operations, to reduce network traffic
Protocol bindings exist
  plain HTTP binding
  SOAP binding
Supports many syntaxes
  XTM 1.0
  LTM
  TM/XML
  custom tolog result-set syntax

get-topic
Retrieves a single topic from the remote server
  topic map may optionally be specified
  syntax likewise
Main use
  to build client-side fragments into a bigger topic map
  to present information about a topic on a different server

get-topic
Parameters
  identifier: a set of URIs (subject identifiers of wanted topic)
  subject: a set of URIs (subject locators of wanted topic)
  item: a set of URIs (item identifiers of wanted topic)
  topicmap: identifier for topic map being queried
  syntax: string identifying desired Topic Maps syntax in response
  view: string identifying TM-Views view used to define fragment
Response
  topic map fragment representing topic in requested syntax
  default is XTM fragment with all URI identifiers, names, occurrences, and associations
  in default view types and scopes on these constructs are only identified by one <*Ref xlink:href="..."/> XTM element
  the same goes for associated topics

get-topic-page
Returns link information about a topic
  that is, where does the server present this topic
  mainly useful for realizing the portal integration scenario
  result information contains metadata about server setup

get-topic-page
Parameters
  identifier: a set of URIs (subject identifiers of wanted topic)
  subject: a set of URIs (subject locators of wanted topic)
  item: a set of URIs (item identifiers of wanted topic)
  topicmap: identifier for topic map being queried
  syntax: string identifying desired Topic Maps syntax in response
Response is a topic map fragment
[oks : tmrap:server = "OKS Samplers local installation"]
[opera : tmrap:topicmap = "The Italian Opera Topic Map"]
  {opera, tmrap:handle, [[opera.xtm]]}
tmrap:contained-in(oks : tmrap:container, opera : tmrap:containee)
tmrap:contained-in(opera : tmrap:container, view : tmrap:containee)
tmrap:contained-in(opera : tmrap:container, edit : tmrap:containee)
[view : tmrap:view-page %"http://localhost:8080/omnigator/models/..."]
[edit : tmrap:edit-page %"http://localhost:8080/ontopoly/enter.ted?..."]
[russia = "Russia" @"http://www.topicmaps.org/xtm/1.0/country.xtm#RU"]

get-tolog
Returns query results
  main use is to extract larger chunks of the topic map to the client for presentation
  more flexible than get-topic
  can achieve more with less network traffic

get-tolog
Parameters
  tolog: tolog query
  topicmap: identifier for topic map being queried
  syntax: string identifying desired syntax of response
  view: string identifying TM-Views view used to define fragment
Response
  if syntax is “tolog”
    an XML representation of the query result
    useful if order of results matters
  otherwise, a topic map fragment containing multiple topics is returned
    as for get-topic

add-fragment
Adds information to topic map on the server
  does this by merging in a fragment
Parameters
  fragment: topic map fragment
  topicmap: identifier for topic map being added to
  syntax: string identifying syntax of request fragment
Result
  fragment imported into named topic map

update-topic
Can be used to update a topic
  add-fragment only adds information
  update sets the topic to exactly the uploaded information
Parameters
  topicmap: the topic map to update
  fragment: fragment containing the new topic
  syntax: syntax of the uploaded fragment
  identifier: a set of URIs (subject identifiers of wanted topic)
  subject: a set of URIs (subject locators of wanted topic)
  item: a set of URIs (item identifiers of wanted topic)
Update happens using TMSync

delete-topic
Removes a topic from the server
Parameters
  identifier: a set of URIs (subject identifiers of wanted topic)
  subject: a set of URIs (subject locators of wanted topic)
  item: a set of URIs (item identifiers of wanted topic)
  topicmap: identifier for topic map being queried
Result
  deletes the identified topic
  includes all names, occurrences, and associations

tolog-update
Runs a tolog update statement
Parameters
  topicmap: topic map to update
  statement: tolog statement to run
Runs the statement & commits the change

HTTP binding basics
The mapping requires a base URL
  e.g. http://localhost:8080/tmrap/
This is used to send requests
  http://localhost:8080/tmrap/method?param1=value1&...
GET is used for requests that do not cause state changes
POST for requests that do
Responses returned in response body
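The HTTP binding rules above are simple enough to capture in a short helper. The sketch below only builds the request, it does not send it; the function name build_request and the choice to put POST parameters in a form-encoded body are assumptions made for illustration, while the request names and the GET/POST split come from the slides.

```python
from urllib.parse import urlencode

# Requests from the slides that change server state and therefore use POST.
STATE_CHANGING = {"add-fragment", "update-topic", "delete-topic", "tolog-update"}

def build_request(base_url, method, **params):
    """Return (http_method, url, body) for a TMRAP request over plain HTTP.

    Values may be lists, since identifier, subject, and item are sets of URIs.
    Assumption: POST parameters are sent form-encoded in the request body."""
    # urlencode percent-encodes '#', ':' and '/' in values, so a '#' in a
    # subject identifier is not mistaken for a URL fragment.
    query = urlencode(params, doseq=True)
    if method in STATE_CHANGING:
        return "POST", base_url + method, query
    return "GET", base_url + method + "?" + query, None

http_method, url, body = build_request(
    "http://localhost:8080/tmrap/",
    "get-topic",
    identifier="http://www.topicmaps.org/xtm/1.0/country.xtm#RU",
    topicmap="geography.xtm",   # hypothetical topic map ID
)
print(http_method, url)
```

Letting a library do the encoding avoids hand-escaping individual characters, which is easy to get wrong once identifiers contain more than one reserved character.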
Exercise #1: Retrieve a topic
Use the get-topic request to retrieve a topic from the server
  base URL is http://localhost:8080/tmrap/
  find the identifying URI in Omnigator
  just print the retrieved fragment to get a look at it
Note: you must escape the “#” character in URIs
  otherwise it is interpreted as the anchor and not transmitted at all
  escape sequence: %23
Note: you must specify the topic map ID
  otherwise results will only be returned from loaded topic maps
  in other words: if the topic map isn’t loaded, you get no results

Solution #1 (in Python)
import urllib.request

BASE = "http://localhost:8080/tmrap/tmrap/"
psi = "http://www.topicmaps.org/xtm/1.0/country.xtm%23RU"

# fetch the fragment for the topic with this subject identifier and print it
inf = urllib.request.urlopen(BASE + "get-topic?identifier=" + psi)
print(inf.read().decode("utf-8"))
inf.close()
Solution #1 (response)
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="id458">
    <instanceOf>
      <subjectIndicatorRef xlink:href="http://psi.ontopia.net/geography/#country"/>
    </instanceOf>
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/>
      <topicRef xlink:href="file:/.../WEB-INF/topicmaps/geography.xtmm#russia"/>
    </subjectIdentity>
    <baseName>
      <baseNameString>Russia</baseNameString>
    </baseName>
  </topic>
</topicMap>
Processing XTM with XSLT
This is possible, but unpleasant
  the main problem is that the XML is phrased in terms of Topic Maps, not in domain terms
  this means that all the XPath will talk about “topic”, “association”, ... and not “person”, “works-for” etc
The structure is also complicated
  this makes queries complicated
  for example, the XPath to traverse an association looks like this:
//xtm:association
  [xtm:member[xtm:roleSpec / xtm:topicRef / @xlink:href = '#employer']
             [xtm:topicRef / @xlink:href = concat('#', $company)]]
  [xtm:instanceOf / xtm:topicRef / @xlink:href = '#employed-by']

TM/XML
Non-standard XML syntax for Topic Maps
  defined by Ontopia (presented at TMRA’05)
  implemented in the OKS
XSLT-friendly
  much easier to process with XSLT than XTM
  can be understood by developers who do not understand Topic Maps
  dynamic domain-specific syntaxes instead of generic syntax
  predictable (can generate XML Schema from TM ontology)

TM/XML example
<topicmap ... reifier="tmtopic">
  <topicmap id="tmtopic">
    <iso:topic-name><tm:value>TM/XML example</tm:value></iso:topic-name>
    <dc:description>An example of the use of TM/XML.</dc:description>
  </topicmap>
  <person id="lmg">
    <iso:topic-name><tm:value>Lars Marius Garshol</tm:value>
      <tm:variant scope="core:sort">garshol, lars marius</tm:variant>
    </iso:topic-name>
    <homepage datatype="http://www.w3.org/2001/XMLSchema#anyURI"
      >http://www.garshol.priv.no</homepage>
    <created-by role="creator" topicref="tmtopic" otherrole="work"/>
    <presentation role="presenter">
      <presented topicref="tmxml"/>
      <event topicref="tmra05"/>
    </presentation>
  </person>
</topicmap>
tmphoto
http://www.garshol.priv.no/tmphoto/
A topic map to organize my personal photos
  contains ~15,000 photos
A web gallery runs on Ontopia
  on www.garshol.priv.no
[Diagram: ontology with Category, Person, Photo, Event, Location]

tmtools
http://www.garshol.priv.no/tmtools/
An index of Topic Maps tools
  organized as shown on the right
Again, web application for browsing
  screenshots below
[Diagram: ontology with Organization, Person, Software product, Platform, Category, Technology]

The person page
Boring! No content.

And in tmphoto...

get-illustration
A web service in tmphoto
  receives the PSI of a person
  then automatically picks a suitable photo of that person
Based on
  vote score for photos,
  categories (portrait),
  other people in photo
  ...
The service returns
  a topic map fragment with links to the person page and a few different sizes of the selected photo
http://www.garshol.priv.no/blog/183.html

get-illustration
[Diagram: tmtools asks tmphoto "Do you have a photo of http://psi.ontopedia.net/Benjamin_Bock ?" via http://www.garshol.priv.no/tmphoto/get-illustration?identifier=http://psi.on.... ; tmphoto thinks "Hmmm. Scores, categories, people in photo, ..." and returns a topic map fragment]

Voila...

Points to note
No hard-wiring of links
  just add identifiers when creating people topics
  photos appear automatically
  if a better photo is added later, it’s replaced automatically
No copying of data
  no duplication, no extra maintenance
Very loose binding
  nothing application-specific
Highly extensible
  once the identifiers are in place we can easily pull in more content from other sources

My blog
Has more content about
  people (tmphoto & tmtools),
  events (tmphoto),
  tools (tmtools),
  technologies (tmtools)
Should be available in those applications

Solution
My blog posts are tagged
  but the tags are topics, which can have PSIs
  these PSIs are used in tmphoto and tmtools, too
The get-topic-page request lets tmphoto & tmtools ask the blog for links to relevant posts
  given identifiers for a topic, returns links to pages about that topic
http://www.garshol.priv.no/blog/145.html

get-topic-page
[Diagram: tmphoto asks the blog "Do you have pages about http://psi.ontopedia.net/TMRA_2008 ?" via http://www.garshol.priv.no/blog/get-topic-page?identifier=http://psi.on.... ; the blog returns a topic map fragment with topics linking to individual blog posts]

In tmphoto
Making web applications
Navigator Framework

Ontopia Navigator Framework
Java API for interacting with TM repository
JSP tag library
  based on tolog
  kind of like XSLT in JSP with tolog instead of XPath
  has JSTL integration
Undocumented parts
  web presentation components
  some wrapped as JSP tags
  want to build proper portlets from them

How it works
[Diagram: browsers request JSP pages from a web server with a JSP container (e.g. Apache Tomcat); the JSP pages use the tag libraries, which access the topic maps through the engine]

The two tag libraries
tolog
makes up nearly the entire framework
used to extract information from topic maps
lets you execute tolog queries to extract information from the topic map
looping and control flow structures
template
used to create template pages
separates layout and structure from content
not Topic Maps-aware
optional, but recommended

How the tag libraries work
The topic map engine holds a registry of topic maps
collected from the tm-sources.xml configuration file
each topic map has its own id (usually the file name)
Each page also holds a set of variable bindings
each variable holds a collection of objects
objects can be topics, base names, locators, strings, ...
Tags access variables
some tags set the values of variables, while others use them

Building a JSP page
The <%@ taglib ... %> tags declare your tag libraries
Tells the page which tag library to include and binds it to a prefix
Prefixes are used to qualify the tags (and avoid name collisions)
Use the <tolog:context> tag around the entire page
The "topicmap" attribute specifies the ID of the current topic map
The first time you access the page in your browser the page gets compiled
If you modify the page then it will be recompiled the next time it is accessed

http://www.ontopia.net/operamap
Navigator tag library example
<%-- assume variable 'composer' is already set --%>
<p><b>Operas:</b><br/>
<tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera),
                      { premiere-date($OPERA, $DATE) }?">
  <li>
    <a href="opera.jsp?id=<tolog:id var="OPERA"/>"
      ><tolog:out var="OPERA"/></a>
    <tolog:if var="DATE">
      <tolog:out var="DATE"/>
    </tolog:if>
  </li>
</tolog:foreach></p>
Elmer Preview
Possible configuration
Application directories
webapps/
  myApp/
    *.jsp
  omnigator/
    WEB-INF/
      config/*.xml
      i18n/
      topicmaps/*.xtm, *.ltm
      web.xml

The navigator configuration files
web.xml
where to find the other files, plus plug-ins
tm-sources.xml
tells the navigator where to find topic maps
log4j.properties
configuration of the log4j logging
More details in the "Configuration Guide" document

More information
Navigator Framework Configuration Guide
  http://www.ontopia.net/doc/current/doc/navigator/config.html
Navigator Framework Developer's Guide
  http://www.ontopia.net/doc/current/doc/navigator/navguide.html
Navigator Framework Tag Library Reference
  http://www.ontopia.net/doc/current/doc/navigator/tolog-taglib.html
...
Automated classification

What is automated classification?
Create parts of a topic map automatically
  using the text in existing content as the source
  not necessarily 100% automatic; user may help out
A hard task
  natural language processing is very complex
  result is never perfect
However, it’s possible to achieve some results

Why automate classification?
Creating a topic map requires intellectual effort
  that is, it requires work by humans
Human effort = cost
  added value must be sufficient to justify the cost
  in some cases either
    the cost is too high, or
    the value added is too limited
The purpose of automation is to lower the cost
  this increases the number of cases where the use of Topic Maps is justified

Automatable tasks
Ontology
  hard
  depends on requirements
  one time only
Instance data
  hard
  usually exists in other sources
Document keywords
  easier
  frequent operation
  usually no other sources
[Diagram: ontology (Project, Person, Department; worked on, employed in) and instance data (Jane Doe worked on XYZ Project, employed in IT group)]

Two kinds of categorization
Broad: Environment, Crisis management
Narrow: Water, Norway, drought, Drought Act, Cloud seeding, Morecambe Bay
Broad categorization
  categories are broadly defined
  include many different subjects
Narrow categorization
  uses very specific keywords
  each keyword is a single subject

What it does
Extract keywords from content
  goal is to use these for classification
Not entity recognition
  we only care about identifying what the content is about
Uses statistical approach
  no attempt at full formal parsing of the text

Steps of operation
Identify format
  then, extract the text
Identify language
  then, remove stop words
  stem remaining words
Classify
  can use terms from preexisting Topic Maps
  exploits knowledge of the language
Return proposed keywords
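The processing steps listed above can be illustrated with a deliberately naive Python sketch. It is not Ontopia's classifier: format and language detection are skipped, the stop-word list and the one-rule stemmer are toy stand-ins, and the boost for terms already present in a topic map is an assumed way of "using existing topics". It only shows the shape of a statistical approach that returns keywords ranked on a 0 to 1 scale.

```python
import re
from collections import Counter

# Toy English stop-word list; a real one is much longer and per-language.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "to", "for", "with"}

def stem(word):
    """Crude plural stripping; real stemmers are language-specific."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def extract_keywords(text, known_terms=(), boost=2.0):
    """Return (term, score) pairs, best first, with the top score scaled to 1.0.

    known_terms: terms assumed to come from a preexisting topic map;
    their frequency is multiplied by 'boost' (an illustrative heuristic)."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = [stem(w) for w in words if w not in STOP_WORDS]
    scores = Counter(stems)
    for term in known_terms:
        key = stem(term.lower())
        if key in scores:
            scores[key] *= boost
    top = max(scores.values(), default=1)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(term, round(score / top, 2)) for term, score in ranked]

TEXT = "Topic maps are maps of topics. The taxonomy is a map."
print(extract_keywords(TEXT))
print(extract_keywords(TEXT, known_terms=["taxonomy"]))
```

Even in this toy form, stemming matters: without it "map", "maps", "topic", and "topics" would be counted as four separate, weaker keywords.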
Example of keyword extraction
topic maps              1.0
metadata                0.57
subject-based class.    0.42
Core metadata           0.42
faceted classification  0.34
taxonomy                0.22
monolingual thesauri    0.19
controlled vocabulary   0.19
Dublin Core             0.16
thesauri                0.16
Dublin                  0.15
keywords                0.15

Example #2
Automated classification  1.0   5
Topic Maps                0.51  14
XSLT                      0.38  11
compound keywords         0.29  2
keywords                  0.26  20
Lars                      0.23  1
Marius                    0.23  1
Garshol                   0.22  1
...

So how could this be used?
To help users classify new documents in a CMS interface
  suggest appropriate keywords, screened by user before approval
Automate classification of incoming documents
  this means lower quality, but also lower cost
Get an overview of interesting terms in a document corpus
  classify all documents, extract the most interesting terms
  this can be used as the starting point for building an ontology
  (keyword extraction only)

Example user interface
The user creates an article
  this screen then used to add keywords
  user adjusts the proposals from the classifier

Interfaces
java net.ontopia.topicmaps.classify.Chew <topicmapuri> <inputfile>
  produces textual output only
net.ontopia.topicmaps.classify.SimpleClassifier
  classify(uri, topicmap) -> TermDatabase
  classify(uri) -> TermDatabase

Supported formats and languages
Formats
  XML (any schema)
  HTML (non-XML)
  PDF
  Word (.doc, .docx)
  PowerPoint (.ppt, .pptx)
  Plain text
Languages
  English
  Norwegian
Visualization of Topic Maps: Vizigator
The Vizigator
  • Graphical visualization of Topic Maps
  • Two parts
      • VizDesktop: Swing desktop app for configuration
      • Vizlet: Java applet for web deployment
  • Configuration stored in XTM file
The uses of visualization
  • Not really suitable for navigation
      • doesn't work for all kinds of data
  • Great for seeing the big picture
Without configuration
With configuration
VizDesktop
The Vizigator
  • The Vizigator uses TMRAP
      • the Vizlet runs in the browser (on the client)
      • a fragment of the topic map is downloaded from the server
      • the fragment is grown as needed
  • [Diagram labels: Server, TMRAP]
Embedding the Vizlet
  • Set up TMRAP service
  • Add ontopia-vizlet.jar
  • Add necessary HTML

    <applet code="net.ontopia.topicmaps.viz.Vizlet.class"
            archive="ontopia-vizlet.jar">
      <param name="tmrap" value="/omnigator/plugins/viz/">
      <param name="config" value="/omnigator/plugins/viz/config.jsp?tm=<%= tmid %>">
      <param name="tmid" value="<%= tmid %>">
      <param name="idtype" value="<%= idtype %>">
      <param name="idvalue" value="<%= idvalue %>">
      <param name="propTarget" value="VizletProp">
      <param name="controlsVisible" value="true">
      <param name="locality" value="1">
      <param name="max-locality" value="5">
    </applet>
Topic Maps debugger: Omnigator
Omnigator
  • Generic Topic Maps browser
      • very useful for seeing what's in a topic map
      • the second-oldest part of Ontopia
  • Contains other features beyond simple browsing
      • statistics
      • management console
      • merging
      • tolog querying/updates
      • export
Ontology designer and editor: Ontopoly
Ontopoly
  • A generic Topic Maps editor, in two parts
      • ontology editor: used to create the ontology and schema
      • instance editor: used to enter instances based on ontology
  • Features
      • works with both XTM files and topic maps stored in RDBMS backend
      • supports access control to administrative functions, ontology, and instance editors
      • existing topic maps can be imported
      • parts of the ontology can be marked as read-only, or hidden
Ontology designer
  • Create ontology based on
      • topic, association, name, occurrence, and role types
  • Supports iterative ontology development
      • modify and prototype the ontology until it's right
  • Supports ontology annotation
      • add fields to topic types, for example
  • Supports views
      • define restricted views of certain topic types
Instance editor
  • Configured by the ontology editor
      • shows topics as defined by the ontology
  • Has several ways to pick associations
      • drop-down list
      • by search
      • from hierarchy
  • Avoids conflicts
      • pages viewed by one user are locked to others
Ontopoly is embeddable
  • The Ontopoly instance editor can be embedded
      • basically, the main panel can be inserted into another web application
      • uses an iframe
  • Requires only ID of topic being edited
      • can also be restricted to a specific view
  • Makes it possible to build easier-to-use editors
      • so users don't have to learn all of Ontopoly
Adding content features: CMS integrations
CMS integration
  • The best way to add content functionality to Ontopia
      • the world doesn’t need another CMS
      • better to reuse those which already exist
  • So far two integrations exist
      • Escenic
      • OfficeNet Knowledge Portal
      • more are being worked on
Implementation
  • A CMS event listener
      • the listener creates topics for new CMS articles, folders, etc
      • the mapping is basically the design of the ontology used by this listener
  • Presentation integration
      • it must be possible to list all topics attached to an article
      • conversely, it must be possible to list all articles attached to a topic
      • how close the integration needs to be here will vary, as will the difficulty of the integration
  • User interface integration
      • it needs to be possible to attach topics to an article from within the normal CMS user interface
      • this can be quite tricky
  • Search integration
      • the Topic Maps search needs to also search content in the CMS
      • can be achieved by writing a tolog plug-in
Articles as topics
  • [Diagram: "New city council appointed" is about "Elections"]
  • Goal: associate articles with topics
      • mainly to say what they are about
      • typically also want to include other metadata
  • Need to create topics for the articles to do this
      • in fact, a general CMS-to-TM mapping is needed
      • must decide what metadata and structures to include
Mapping issues
  • Article topics
      • what topic type to use?
      • title becomes name? (do you know the title?)
      • include author? include last modified? include workflow state?
      • should all articles be mapped?
  • Folders/directories/sections/...
      • should these be mapped, too?
      • one topic type for all folders/.../.../...?
      • if so, use associations to connect articles to folders
      • use associations to reproduce hierarchical folder structure
  • Multimedia objects
      • should these be included?
      • what topic type? what name? ...
Two styles of mappings
  • Articles as articles
      • Topic represents only the article
      • Topic type is some subclass of “article”
      • “Is about” association connects article into topic map
      • Fields are presentational
          • title, abstract, body
  • Articles as concepts
      • Topic represents some real-world subject (like a person)
          • article is just the default content about that subject
      • Type is the type of the subject (person)
      • Semantic associations to the rest of the topic map
          • works in department, has competence, ...
      • Fields can be semantic
          • name, phone no, email, ...
Article as article
  • Article about building of a new school
  • Is about association to “Primary schools”
  • Topic type is “article”
Article as concept
  • Article about a sports hall
  • Article really represents the hall
  • Topic type is “Location”
  • Associations to
      • city borough
      • events in the location
      • category “Sports”
Ontopia/Liferay
  • An integration with the Liferay CMS and portal is in progress
      • presented Friday 1130-1150 in Schiller 2
Two projects: Real-life usage
The project
  • A new citizen’s portal for the city administration
      • strategic decision to make portal main interface for interaction with citizens
      • as many services as possible are to be moved online
  • Big project
      • started in late 2004, to continue at least into 2008
      • ~5 million Euro spent by launch date
      • 1.7 million Euro budgeted for 2007
      • Topic Maps development is a fraction of this (less than 25%)
  • Many companies involved
      • Bouvet/Ontopia
      • Avenir
      • KPMG
      • Karabin
      • Escenic
Simplified original ontology
  • [Diagram labels: Service catalog, Escenic (CMS), LOS, Form, Article, nearly everything, Category, Service, Subject, Department, Borough, External resource, Employee, Payroll++]

Ontopia tutorial

  • 15. Vizigator
      • Graphical visualization
      • VizDesktop
          • Swing app to configure
          • filter/style/...
      • Vizlet
          • Java applet for web
          • uses configuration
          • loads via TMRAP
          • uses “Remote” backend
      • [Architecture diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]
  • 16. Ontopoly
      • Generic editor
          • web-based, AJAX
          • meta-ontology in TM
      • Ontology designer
          • create types and fields
          • control user interface
          • build views
          • incremental dev
      • Instance editor
          • guided by ontology
      • [Architecture diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote]
  • 19. Core APIs
      • net.ontopia.topicmaps.core.*
      • Fairly direct mapping from TMDM
          • TopicIF
          • AssociationIF
          • TopicMapIF
          • ...
      • Set/get methods reflect TMDM properties
  • 20. TopicIF
      • Interface, not a class
      • getTopicNames()
      • addTopicName(TopicNameIF)
      • removeTopicName(TopicNameIF)
      • getOccurrences() + add + remove
      • getSubjectIdentifiers() + add + remove
      • getItemIdentifiers() + add + remove
      • getSubjectLocators() + add + remove
      • getRoles()
      • getRolesByType(TopicIF)
  • 22. How to get a TopicMapIF
      • Create one directly
          • new net...impl.basic.InMemoryTopicMapStore()
      • Load one from file
          • using an importer (next slide)
      • Connect to an RDBMS
          • covered later
      • Use a topic map repository
          • covered later
  • 23. TopicMapReaderIF

        import net.ontopia.topicmaps.core.TopicMapIF;
        import net.ontopia.topicmaps.core.TopicMapReaderIF;
        import net.ontopia.topicmaps.utils.ImportExportUtils;

        public class TopicCounter {
          public static void main(String[] argv) throws Exception {
            // pick a reader for the file named on the command line
            TopicMapReaderIF reader = ImportExportUtils.getReader(argv[0]);
            TopicMapIF tm = reader.read();
            System.out.println("TM contains " + tm.getTopics().size() + " topics");
          }
        }

        [larsga@c716c5ac1 tmp]$ java TopicCounter ~/data/bilder/privat/metadata.xtm
        TM contains 17035 topics
        [larsga@c716c5ac1 tmp]$

  • 25. The utility classes
      • A set of classes outside the core interfaces that perform common tasks
          • a number of these utilities are obsolete now that tolog is here
      • They are all built on top of the core interfaces
      • Some important utilities
          • ImportExportUtils: creates readers and writers
          • MergeUtils: merges topics and topic maps
          • PSI: contains important PSIs
          • DeletionUtils: cascading delete of topics
          • DuplicateSuppressionUtils: removes duplicates
          • TopicStringifiers: find names for topics
  • 26. Topic Maps repository
      • Uses a set of topic map sources to build a set of topic maps
          • topic maps can be looked up by ID
      • Many kinds of sources
          • scan directory for files matching pattern
          • download from URL
          • connect to RDBMS
          • ...
      • Configurable using an XML file
          • tm-sources.xml
      • Used by Navigator Framework
  • 27. Event API
      • Allows clients to receive notification of changes
      • Must implement TopicMapListenerIF

        static class TestListener extends AbstractTopicMapListener {
          public void objectAdded(TMObjectIF snapshot) {
            System.out.println("Topic added: " + snapshot.getObjectId());
          }
          public void objectModified(TMObjectIF snapshot) {
            System.out.println("Topic modified: " + snapshot.getObjectId());
          }
          public void objectRemoved(TMObjectIF snapshot) {
            System.out.println("Topic removed: " + snapshot.getObjectId());
          }
        }

  • 28. Using the API

        // register to listen for events
        TestListener listener = new TestListener();
        TopicMapEvents.addTopicListener(ref, listener);
        // get the store through the reference so the listener is registered
        ref.createStore(false);

        // let's add a topic
        System.out.println("Off we go");
        TopicMapBuilderIF builder = tm.getBuilder();
        TopicIF newbie = builder.makeTopic(tm);
        System.out.println("Let's name this topic");
        builder.makeTopicName(newbie, "Newbie topic");

        // then let's remove it
        System.out.println("And now, the exit");
        DeletionUtils.remove(newbie);
        System.out.println("Goodbye, short-lived topic");

        [larsga@dhcp-98 tmp]$ java EventTest bkclean.xtm
        Off we go
        Topic added: 3409
        Let's name this topic
        Topic modified: 3409
        Topic modified: 3409
        And now, the exit
        Topic removed: 3409
        Goodbye, short-lived topic
  • 29. For more information
      • See the Engine Developer's Guide
          • http://www.ontopia.net/doc/current/doc/engine/devguide.html
  • 31. RDBMS backend
      • Stores Topic Maps in an RDBMS
          • generic schema
          • access via JDBC
      • Provides full ACID
          • transactions
          • concurrency
          • ...
      • Supports several databases
          • Oracle, MySQL, PostgreSQL, MS SQL Server, hsql
      • Clustering support
  • 32. Core API implementation
      • Implements same API as in-memory impl
          • theoretically, a switch requires only config change
      • Lazy loading of objects
          • objects loaded from DB as needed
      • Considerable internal caching
          • for performance reasons
      • Separate objects for separate transactions
          • in order to provide isolation
      • Shared cache between transactions
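The combination of lazy loading and caching mentioned above can be illustrated with a small sketch. It is not taken from the Ontopia RDBMS backend: the class LazyCacheSketch and its loader function (which stands in for a database lookup) are hypothetical, and the sketch ignores transactions, isolation, and eviction entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of lazy loading behind a cache; not Ontopia's RDBMS code.
class LazyCacheSketch<K, V> {
  private final Map<K, V> cache = new HashMap<>();
  private final Function<K, V> loader; // stands in for a database lookup
  private int loads = 0;

  LazyCacheSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  // Loads the object on first access only; later calls are served from the cache.
  V get(K id) {
    V value = cache.get(id);
    if (value == null) {
      value = loader.apply(id);
      loads++;
      cache.put(id, value);
    }
    return value;
  }

  // Number of times the loader was actually invoked.
  int loadCount() {
    return loads;
  }
}
```

Asking for the same ID twice triggers only one load, which is the effect the slide attributes to the internal caching.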
  • 33. Configuration
      • A Java property file
      • Specifies
          • database type
          • JDBC URL
          • username + password
          • cache settings
          • clustering settings
          • ...
  • 34. jdbcspy
      • A built-in SQL profiler
      • Useful for identifying cause of performance issues
  • 36. tolog
      • A logic-based query language
          • a mix of Prolog and SQL
          • effectively equivalent to Datalog
      • Two parts
          • queries (data retrieval)
          • updates (data modification)
      • Developed by Ontopia
          • not an ISO standard
          • eventually to be replaced by TMQL
  • 37. tolog
      • The recommended way to interact with the data
          • API programming is slow and cumbersome
          • tolog queries perform better
      • Available via
          • Java API
          • Web service API
          • Forms interface in Omnigator
      • tolog queries return API objects
  • 38. Finding all operas by a composer

        Collection<TopicIF> operas = new ArrayList<TopicIF>();
        TopicIF composer = getTopicById("puccini");
        TopicIF composed_by = getTopicById("composed-by");
        TopicIF work = getTopicById("work");
        TopicIF creator = getTopicById("composer");

        // walk every role the composer plays, keeping only composed-by associations
        for (AssociationRoleIF role1 : composer.getRolesByType(creator)) {
          AssociationIF assoc = role1.getAssociation();
          if (assoc.getType() != composed_by)
            continue;
          // the player of the "work" role in that association is an opera
          for (AssociationRoleIF role2 : assoc.getRoles()) {
            if (role2.getType() != work)
              continue;
            operas.add(role2.getPlayer());
          }
        }

  • 39. Finding all operas by a composer

        composed-by(puccini : composer, $O : work)?
        composed-by($C : composer, tosca : work)?
        composed-by($C : composer, $O : work)?
        composed-by(puccini : composer, tosca : work)?

  • 40. Features
      • Access all aspects of a topic map
      • Generic queries independent of ontology
      • AND, OR, NOT, OPTIONAL
      • Count
      • Sort
      • LIMIT/OFFSET
      • Reusable inference rules
  • 41. Chaining predicates (AND)
      • Predicates can be chained

            born-in($PERSON : person, $PLACE : place),
            located-in($PLACE : containee, italy : container)?

      • The comma between the predicates means AND
      • This query finds all the people born in Italy
          • It first builds a two-column table of all born-in associations
          • Then, those rows where the place is not located-in Italy are removed
          • (Note that when the PLACE variable is reused above that means that the birthplace and the location must be the same topic in each match)
      • Any number of predicates can be chained
          • Their order is insignificant
          • Actually, the optimizer reorders the predicates
          • It will start with located-in because it has a topic constant
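The "build a table, then remove rows" reading of a chained query can be sketched in plain Java. This is only an illustration of the idea, not how the tolog engine is implemented: the class ChainSketch and its two hard-coded fact tables are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of how two chained predicates combine; not the tolog engine.
class ChainSketch {
  // Each row is {player of first role, player of second role}; illustrative data.
  static final List<String[]> BORN_IN = Arrays.asList(
      new String[] {"puccini", "lucca"},
      new String[] {"verdi", "busseto"},
      new String[] {"wagner", "leipzig"});
  static final List<String[]> LOCATED_IN = Arrays.asList(
      new String[] {"lucca", "italy"},
      new String[] {"busseto", "italy"},
      new String[] {"leipzig", "germany"});

  // Mirrors: born-in($PERSON, $PLACE), located-in($PLACE, <container>)?
  static List<String> bornIn(String container) {
    List<String> people = new ArrayList<>();
    for (String[] birth : BORN_IN) {        // table of all born-in rows
      for (String[] loc : LOCATED_IN) {
        // the shared $PLACE variable forces both rows to name the same place
        if (birth[1].equals(loc[0]) && loc[1].equals(container)) {
          people.add(birth[0]);
        }
      }
    }
    return people;
  }
}
```

Starting from the located-in rows instead would give the same result, which is why the order of the predicates does not affect the answer, only the cost.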
  • 42. Thinking in predicates
      • Most of you are probably used to functions, which work like this: [illustration not preserved in the transcript]
      • Predicates, however, are in a sense bidirectional, because of the way the pattern matching works

            predicate(topic : role1, $VAR : role2)
            predicate($VAR : role1, topic : role2)

      • The order of the roles is, on the other hand, insignificant

            predicate(topic : role1, $VAR : role2)
            predicate($VAR : role2, topic : role1)

  • 49. Projection
      • Sometimes queries make use of temporary variables that we are not really interested in
      • The way to get rid of unwanted variables is projection
      • Syntax: select $variable1, $variable2, ... from <query>?
      • The query is first run, then projected down to the requested variables
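Bidirectional matching and projection can both be mimicked with a few lines of Java. The sketch below is an assumption-laden illustration rather than tolog itself: PredicateSketch, its fact table, and the convention that a null argument stands for a $VARIABLE are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of bidirectional matching and projection; not the tolog engine.
class PredicateSketch {
  // composed-by facts as {composer, work}; illustrative data.
  static final List<String[]> COMPOSED_BY = Arrays.asList(
      new String[] {"puccini", "tosca"},
      new String[] {"puccini", "turandot"},
      new String[] {"verdi", "aida"});

  // A null argument plays the part of a $VARIABLE; a non-null one is a topic constant.
  static List<String[]> match(String composer, String work) {
    List<String[]> rows = new ArrayList<>();
    for (String[] fact : COMPOSED_BY) {
      boolean composerOk = composer == null || composer.equals(fact[0]);
      boolean workOk = work == null || work.equals(fact[1]);
      if (composerOk && workOk) rows.add(fact);
    }
    return rows;
  }

  // Projection: keep only one column of the matched rows.
  static List<String> project(List<String[]> rows, int column) {
    List<String> values = new ArrayList<>();
    for (String[] row : rows) values.add(row[column]);
    return values;
  }
}
```

The same match method answers "which works did puccini compose?" and "who composed aida?", depending only on which argument is left open.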
  • 50. The instance-of predicate
      • instance-of has the following form:

            instance-of(instance, class)

      • NOTE: the order of the arguments is significant
      • Like players, instance and class may be specified in two ways:
          • using a variable
          • using a topic reference
          • e.g. instance-of ( $A, city )
      • instance-of makes use of the superclass-subclass associations in the topic map
          • this means that composers will be considered musicians, and musicians will be considered persons
  • 58. Cities with the most premieres

        using o for i"http://psi.ontopedia.net/"
        select $CITY, count($OPERA) from
          instance-of($CITY, o:City),
          { o:premiere($OPERA : o:Work, $CITY : o:Place) |
            o:premiere($OPERA : o:Work, $THEATRE : o:Place),
            o:located_in($THEATRE : o:Containee, $CITY : o:Container) }
        order by $OPERA desc?
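The way instance-of follows superclass-subclass associations, as described for the predicate above, can be sketched like this. The class InstanceOfSketch and its two maps are hypothetical; the sketch assumes each type has at most one superclass and that the hierarchy has no cycles, which a real topic map does not guarantee.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of type inference through superclass-subclass links.
class InstanceOfSketch {
  static final Map<String, String> SUPERCLASS = new HashMap<>();  // subclass -> superclass
  static final Map<String, String> DIRECT_TYPE = new HashMap<>(); // topic -> declared type

  static {
    SUPERCLASS.put("composer", "musician");
    SUPERCLASS.put("musician", "person");
    DIRECT_TYPE.put("puccini", "composer");
  }

  // True if the topic's declared type is cls or a (transitive) subclass of cls.
  // Assumes single inheritance and an acyclic hierarchy.
  static boolean instanceOf(String topic, String cls) {
    String type = DIRECT_TYPE.get(topic);
    while (type != null) {
      if (type.equals(cls)) return true;
      type = SUPERCLASS.get(type); // climb one level up the hierarchy
    }
    return false;
  }
}
```

With this data a composer is also reported as a musician and as a person, matching the behaviour the slide describes.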
  • 59. All non-hidden photos

        select $PHOTO from
          instance-of($PHOTO, op:Photo),
          not(ph:hide($PHOTO : ph:hidden)),
          not(ph:taken-at($PHOTO : op:Image, $PLACE : op:Place),
              ph:hide($PLACE : ph:hidden)),
          not(ph:taken-during($PHOTO : op:Image, $EVENT : op:Event),
              ph:hide($EVENT : ph:hidden)),
          not(ph:depicted-in($PHOTO : ph:depiction, $PERSON : ph:depicted),
              ph:hide($PERSON : ph:hidden))?

  • 60. Demo
      • Show running queries in Omnigator
      • Also show query tracing
      • Shakespeare
      • /* #OPTION: optimizer.reorder = false */
  • 62. Using the query engine API
      • The query engine API is really simple to use
          1. get a QueryProcessorIF object
          2. run a query in the QueryProcessorIF and get a QueryResultIF
          3. loop over the results and use them
          4. close the result object
          5. go back to step 2, or do something else
      • There are two different QueryProcessorIF implementations
          • the API lets you write code without worrying about that, however
          • the two implementations behave identically
  • 64. Running a query with the API

        TopicMapIF tm = ...;
        QueryProcessorIF processor = QueryUtils.getQueryProcessor(tm);
        QueryResultIF result = processor.execute("instance-of($P, person)?");
        try {
          while (result.next()) {
            TopicIF person = (TopicIF) result.getValue(0);
            // do something useful with 'person'
          }
        } finally {
          result.close();
        }
  • 65. Advanced options
      • It is possible to parse a query once, and then run it many times
          • the processor returns a ParsedQueryIF object, which can be executed
          • parameters can be passed to the query on each execution
      • It is possible to make declarations and use them across executions
  • 68. Using a parsed query

        ParsedQueryIF parsedQuery = processor.parse("instance-of($P, %TYPE%)?");
        Map params = Collections.singletonMap("TYPE", person);
        QueryResultIF result = parsedQuery.execute(params);
        try {
          while (result.next()) {
            // ...
          }
        } finally {
          result.close();
        }

  • 69. QueryWrapper
      • Designed to make all of this easier

        QueryWrapper qw = new QueryWrapper(tm);
        TopicIF topic = qw.queryForTopic(...);
        List topics = qw.queryForList(...);
        List<Person> people = qw.queryForList(..., mapper);
  • 70. tolog updates
      • Greatly simplifies TM modification
      • Also means you can do modification without API programming
          • useful with RDBMS topic maps
          • useful with TMs in running web servers
      • By performing a sequence of updates, just about any change can be made
      • Potentially allows much more powerful architecture
  • 71. DELETE
      • Static form

            delete lmg

      • Dynamic form

            delete $person from instance-of($person, person)

      • Delete a value

            delete subject-identifier(topic, "http://ex.org/tst")

  • 72. MERGE
      • Static form

            MERGE topic1, topic2

      • Dynamic form

            MERGE $p1, $p2 FROM
              instance-of($p1, person), instance-of($p2, person),
              email($p1, $email), email($p2, $email)
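The dynamic MERGE above selects pairs of persons that share an email value. A rough way to picture which topics end up merged is to group persons by email, as in the sketch below. MergeSketch and its input map are hypothetical and only model the selection step, not the merge itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of selecting merge candidates; not the tolog update engine.
class MergeSketch {
  // Groups person IDs by email; any group with two or more members would be merged.
  static List<List<String>> mergeGroups(Map<String, String> emailByPerson) {
    Map<String, List<String>> byEmail = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : emailByPerson.entrySet()) {
      byEmail.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
    }
    List<List<String>> groups = new ArrayList<>();
    for (List<String> people : byEmail.values()) {
      if (people.size() > 1) groups.add(people);
    }
    return groups;
  }
}
```

Persons whose email is unique form a group of one and are left untouched.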
  • 73. INSERT
      • Static form

            INSERT lmg isa person; - "Lars Marius Garshol" .

      • Dynamic form

            INSERT tmcl:belongs-to-schema(tmcl:container : theschema, tmcl:containee: $c)
            FROM instance-of($c, tmcl:constraint)

  • 74. INSERT again

        INSERT ?y $psi . event-in-year(event: $e, year: ?y)
        FROM start-date($e, $date),
             str:substring($y, $date, 4),
             str:concat($psi, "http://psi.semagia.com/iso8601", $y)

  • 75. UPDATE
      • Static form

            UPDATE value(@3421, "New name")

      • Dynamic form

            UPDATE value($TN, "Ontopia") FROM topic-name(oks, $TN)

  • 76. More information
      • Look at sample queries in Omnigator
      • tolog tutorial
          • http://www.ontopia.net/doc/current/doc/query/tutorial.html
      • tolog built-in predicate reference
          • http://www.ontopia.net/doc/current/doc/query/predicate-reference.html
  • 78. DB2TM
      • Upconversion of relational data
          • either from CSV files or
          • over JDBC
      • Based on an XML file describing the mapping
          • very highly configurable
      • Support for
          • all of Topic Maps (except variants)
          • value transformations
          • synchronization
  • 79. Standard use case
      • Pull in data from external source
          • turn it into Topic Maps following some ontology
      • Enrich it
          • usually manually, but not necessarily
      • Resync from source at intervals
  • 80. DB2TM example
      • [Diagram labels: Ontopia, +, =, United Nations, Bouvet]

        <relation name="organizations.csv" columns="id name url">
          <topic type="ex:organization">
            <item-identifier>#org${id}</item-identifier>
            <topic-name>${name}</topic-name>
            <occurrence type="ex:homepage">${url}</occurrence>
          </topic>
        </relation>
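The ${column} placeholders in the mapping above are filled in once per row of the relation. The sketch below shows that substitution step in isolation. It is not DB2TM code: TemplateSketch, its naive comma split, and the choice to replace unknown names with an empty string are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${column} substitution; not the DB2TM implementation.
class TemplateSketch {
  private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

  // Pairs the space-separated column names with the values of one CSV row.
  // Uses a naive comma split, so quoted fields are not supported.
  static Map<String, String> row(String columns, String csvLine) {
    String[] names = columns.trim().split("\\s+");
    String[] values = csvLine.split(",", -1);
    Map<String, String> row = new LinkedHashMap<>();
    for (int i = 0; i < names.length && i < values.length; i++) {
      row.put(names[i], values[i].trim());
    }
    return row;
  }

  // Replaces each ${name} with the row's value; unknown names become "".
  static String expand(String template, Map<String, String> row) {
    Matcher m = PLACEHOLDER.matcher(template);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = row.getOrDefault(m.group(1), "");
      m.appendReplacement(out, Matcher.quoteReplacement(value));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

For a row with id 1, the template #org${id} expands to #org1, which is the kind of item identifier the mapping produces for each organization.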
  • 81. Creating associations

        <relation name="people.csv" columns="id given family employer phone">
          <topic id="employer">
            <item-identifier>#org${employer}</item-identifier>
          </topic>
          <topic type="ex:person">
            <item-identifier>#person${id}</item-identifier>
            <topic-name>${given} ${family}</topic-name>
            <occurrence type="ex:phone">${phone}</occurrence>
            <player atype="ex:employed-by" rtype="ex:employee">
              <other rtype="ex:employer" player="#employer"/>
            </player>
          </topic>
        </relation>

  • 82. Value transformations

        <relation name="SCHEMATA" columns="SCHEMA_NAME">
          <function-column name='SCHEMA_ID'
                           method='net.ontopia.topicmaps.db2tm.Functions.makePSI'>
            <param>${SCHEMA_NAME}</param>
          </function-column>
          <topic type="mysql:schema">
            <item-identifier>#${SCHEMA_ID}</item-identifier>
            <topic-name>${SCHEMA_NAME}</topic-name>
          </topic>
        </relation>

  • 83. Running DB2TM
      • java net.ontopia.topicmaps.db2tm.Execute
          • command-line tool
          • also works with RDBMS topic maps
      • net.ontopia.topicmaps.db2tm.DB2TM
          • API class to run transformations
          • methods "add" and "sync"
  • 84. More information
      • DB2TM User's Guide
          • http://www.ontopia.net/doc/current/doc/db2tm/user-guide.html
  • 85. Synchronizing with other sources: TMSync
  • 86. TMSync
      • Configurable module for synchronizing one TM against another
          • define subset of source TM to sync (using tolog)
          • define subset of target TM to sync (using tolog)
          • the module handles the rest
      • Can also be used with non-TM sources
          • create a non-updating conversion from the source to some TM format
          • then use TMSync to sync against the converted TM instead of directly against the source
  • 87. How TMSync works
      • Define which part of the target topic map you want,
      • Define which part of the source topic map it is the master for, and
      • The algorithm does the rest
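One way to picture "the algorithm does the rest" is as a set difference between the selected part of the source and the selected part of the target. The sketch below works on plain strings standing in for statements. It is a guess at the general idea, not the TMSync implementation, and SyncSketch with its two methods is entirely hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the synchronization step; not the TMSync implementation.
class SyncSketch {
  // Statements selected from the source but absent in the target: to be added.
  static Set<String> toAdd(Set<String> source, Set<String> target) {
    Set<String> result = new TreeSet<>(source);
    result.removeAll(target);
    return result;
  }

  // Statements in the selected target part that the source no longer has: to be removed.
  static Set<String> toRemove(Set<String> source, Set<String> target) {
    Set<String> result = new TreeSet<>(target);
    result.removeAll(source);
    return result;
  }
}
```

Anything outside the two selected subsets is never compared, so manual enrichment made elsewhere in the target topic map would be left alone under this reading.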
  • 88. If the source is not a topic map
      • [Diagram labels: source.xml, convert.xslt, TMSync]
      • Simply do a normal one-time conversion
          • let TMSync do the update for you
      • In other words, TMSync reduces the update problem to a conversion problem
  • 89. The City of Bergen usecase
      • [Diagram labels: Norge.no, Service, Unit, Person, LOS, City of Bergen, LOS]
  • 91. TMRAP basics
      • Abstract interface
          • that is, independent of any particular technology
          • coarse-grained operations, to reduce network traffic
      • Protocol bindings exist
          • plain HTTP binding
          • SOAP binding
      • Supports many syntaxes
          • XTM 1.0
          • LTM
          • TM/XML
          • custom tolog result-set syntax
  • 92. get-topic
      • Retrieves a single topic from the remote server
          • topic map may optionally be specified
          • syntax likewise
      • Main use
          • to build client-side fragments into a bigger topic map
          • to present information about a topic on a different server
  • 93. get-topic
      • Parameters
          • identifier: a set of URIs (subject identifiers of wanted topic)
          • subject: a set of URIs (subject locators of wanted topic)
          • item: a set of URIs (item identifiers of wanted topic)
          • topicmap: identifier for topic map being queried
          • syntax: string identifying desired Topic Maps syntax in response
          • view: string identifying TM-Views view used to define fragment
      • Response
          • topic map fragment representing topic in requested syntax
          • default is XTM fragment with all URI identifiers, names, occurrences, and associations
          • in default view types and scopes on these constructs are only identified by one <*Ref xlink:href="..."/> XTM element
          • the same goes for associated topics
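For the plain HTTP binding, a get-topic request could be assembled as a URL with the parameters listed above. The sketch below only builds the string. The parameter names come from the slide, while the base URL, the way the operation name is appended to it, and the class GetTopicUrlSketch are assumptions, and valid values for the syntax parameter should be taken from the TMRAP documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch of a plain-HTTP get-topic request URL.
class GetTopicUrlSketch {
  // baseUrl is an assumed endpoint prefix; parameter names come from the slide.
  static String build(String baseUrl, String topicmap, String identifier, String syntax) {
    return baseUrl + "get-topic"
        + "?topicmap=" + encode(topicmap)
        + "&identifier=" + encode(identifier)
        + "&syntax=" + encode(syntax);
  }

  // Percent-encodes a parameter value so URIs can be passed safely.
  private static String encode(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always available
    }
  }
}
```

Encoding matters here because the identifier is itself a URI and would otherwise clash with the separators of the request URL.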
  • 94. get-topic-pageReturns link information about a topicthat is, where does the server present this topicmainly useful for realizing the portal integration scenarioresult information contains metadata about server setup
  • 95. get-topic-page parameters: identifier: a set of URIs (subject identifiers of wanted topic); subject: a set of URIs (subject locators of wanted topic); item: a set of URIs (item identifiers of wanted topic); topicmap: identifier for topic map being queried; syntax: string identifying desired Topic Maps syntax in response. The response is a topic map fragment:
      [oks : tmrap:server = "OKS Samplers local installation"]
      [opera : tmrap:topicmap = "The Italian Opera Topic Map"]
      {opera, tmrap:handle, [[opera.xtm]]}
      tmrap:contained-in(oks : tmrap:container, opera : tmrap:containee)
      tmrap:contained-in(opera : tmrap:container, view : tmrap:containee)
      tmrap:contained-in(opera : tmrap:container, edit : tmrap:containee)
      [view : tmrap:view-page %"http://localhost:8080/omnigator/models/..."]
      [edit : tmrap:edit-page %"http://localhost:8080/ontopoly/enter.ted?..."]
      [russia = "Russia" @"http://www.topicmaps.org/xtm/1.0/country.xtm#RU"]
  • 96. get-tolog: Returns query results. Main use is to extract larger chunks of the topic map to the client for presentation. More flexible than get-topic; can achieve more with less network traffic.
  • 97. get-tolog parameters: tolog: tolog query; topicmap: identifier for topic map being queried; syntax: string identifying desired syntax of response; view: string identifying TM-Views view used to define fragment. Response: if syntax is “tolog”, an XML representation of the query result (useful if the order of results matters); otherwise, a topic map fragment containing multiple topics is returned, as for get-topic.
  • 98. add-fragment: Adds information to a topic map on the server by merging in a fragment. Parameters: fragment: topic map fragment; topicmap: identifier for topic map being added to; syntax: string identifying syntax of request fragment. Result: the fragment is imported into the named topic map.
  • 99. update-topic: Can be used to update a topic. add-fragment only adds information; update sets the topic to exactly the uploaded information. Parameters: topicmap: the topic map to update; fragment: fragment containing the new topic; syntax: syntax of the uploaded fragment; identifier: a set of URIs (subject identifiers of wanted topic); subject: a set of URIs (subject locators of wanted topic); item: a set of URIs (item identifiers of wanted topic). The update happens using TMSync.
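The difference between add-fragment (merge) and update-topic (replace) can be shown with a toy model. This is a sketch under stated assumptions, not the TMRAP implementation: a topic map is represented as a hypothetical dict from topic id to a set of (property, value) statements, and both function names are invented for the example.

```python
def add_fragment(topicmap, fragment):
    """Merge semantics: statements in the fragment are added, nothing is removed."""
    merged = {tid: set(stmts) for tid, stmts in topicmap.items()}
    for tid, stmts in fragment.items():
        merged.setdefault(tid, set()).update(stmts)
    return merged


def update_topic(topicmap, topic_id, fragment):
    """Replace semantics: the topic ends up with exactly the uploaded statements."""
    updated = {tid: set(stmts) for tid, stmts in topicmap.items()}
    updated[topic_id] = set(fragment.get(topic_id, ()))
    return updated


tm = {"russia": {("name", "Russia")}}
frag = {"russia": {("capital", "Moscow")}}

after_add = add_fragment(tm, frag)
after_update = update_topic(tm, "russia", frag)
```

After `add_fragment` the topic carries both the old name and the new capital; after `update_topic` only the uploaded statement remains.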
  • 100. delete-topic: Removes a topic from the server. Parameters: identifier: a set of URIs (subject identifiers of wanted topic); subject: a set of URIs (subject locators of wanted topic); item: a set of URIs (item identifiers of wanted topic); topicmap: identifier for topic map being queried. Result: deletes the identified topic, including all its names, occurrences, and associations.
  • 101. tolog-update: Runs a tolog update statement. Parameters: topicmap: topic map to update; statement: tolog statement to run. Runs the statement and commits the change.
  • 102. HTTP binding basics: The mapping requires a base URL, e.g. http://localhost:8080/tmrap/. This is used to send requests of the form http://localhost:8080/tmrap/method?param1=value1&... GET is used for requests that do not cause state changes, POST for requests that do. Responses are returned in the response body.
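The URL pattern and the GET/POST rule above can be sketched as a small helper. This is a minimal sketch, not part of any Ontopia client library: `build_request` and `READ_ONLY` are hypothetical names, and the classification of requests into read-only and state-changing is inferred from the request descriptions in this deck. How a real server expects large fragments to be transmitted in a POST is not covered here.

```python
from urllib.parse import urlencode

# Requests that only read data, according to the slides above.
READ_ONLY = {"get-topic", "get-topic-page", "get-tolog"}


def build_request(base_url, method, params):
    """Return (http_method, url) for a TMRAP request in the plain HTTP binding."""
    # urlencode percent-escapes reserved characters, including "#" (as %23).
    query = urlencode(params, doseq=True)
    url = base_url.rstrip("/") + "/" + method + "?" + query
    http_method = "GET" if method in READ_ONLY else "POST"
    return http_method, url


BASE = "http://localhost:8080/tmrap/"
PSI = "http://www.topicmaps.org/xtm/1.0/country.xtm#RU"

get_method, get_url = build_request(BASE, "get-topic", {"identifier": PSI})
del_method, del_url = build_request(BASE, "delete-topic", {"identifier": PSI})
```

Letting `urlencode` do the escaping avoids hand-writing `%23` and also handles identifiers that contain `&` or `?`.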
  • 103. Exercise #1: Retrieve a topic. Use the get-topic request to retrieve a topic from the server. The base URL is http://localhost:8080/tmrap/; find the identifying URI in Omnigator; just print the retrieved fragment to get a look at it. Note: you must escape the “#” character in URIs, otherwise it is interpreted as the anchor and not transmitted at all (escape sequence: %23). Note: you must specify the topic map ID, otherwise results will only be returned from loaded topic maps. In other words: if the topic map isn’t loaded, you get no results.
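  • 104. Solution #1 (in Python)
      from urllib.request import urlopen  # Python 3; the original slide used Python 2's urllib.urlopen

      BASE = "http://localhost:8080/tmrap/tmrap/"
      # "#" is sent as %23, otherwise it is treated as an anchor and never transmitted
      psi = "http://www.topicmaps.org/xtm/1.0/country.xtm%23RU"
      with urlopen(BASE + "get-topic?identifier=" + psi) as inf:
          print(inf.read().decode("utf-8"))  # just print the retrieved fragment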
  • 105. Solution #1 (response)
      <topicMap xmlns="http://www.topicmaps.org/xtm/1.0/" xmlns:xlink="http://www.w3.org/1999/xlink">
        <topic id="id458">
          <instanceOf>
            <subjectIndicatorRef xlink:href="http://psi.ontopia.net/geography/#country"/>
          </instanceOf>
          <subjectIdentity>
            <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/>
            <topicRef xlink:href="file:/.../WEB-INF/topicmaps/geography.xtmm#russia"/>
          </subjectIdentity>
          <baseName>
            <baseNameString>Russia</baseNameString>
          </baseName>
        </topic>
      </topicMap>
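Instead of only printing the fragment, a client will usually want to pull out names and identifiers. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `summarize_topics` and the shortened `SAMPLE` document are made up for the illustration, while the element names and namespaces are those of the XTM 1.0 response shown above.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
NS = {"xtm": "http://www.topicmaps.org/xtm/1.0/"}

# Shortened copy of the response above, used as test input.
SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
                     xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="id458">
    <subjectIdentity>
      <subjectIndicatorRef
          xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/>
    </subjectIdentity>
    <baseName><baseNameString>Russia</baseNameString></baseName>
  </topic>
</topicMap>"""


def summarize_topics(xtm_text):
    """Return id, base names and subject identifiers for each topic in a fragment."""
    root = ET.fromstring(xtm_text)
    topics = []
    for topic in root.findall("xtm:topic", NS):
        names = [el.text for el in
                 topic.findall("xtm:baseName/xtm:baseNameString", NS)]
        # xlink:href is a namespaced attribute, so it needs the {uri}name form.
        psis = [el.get("{%s}href" % XLINK) for el in
                topic.findall("xtm:subjectIdentity/xtm:subjectIndicatorRef", NS)]
        topics.append({"id": topic.get("id"), "names": names, "identifiers": psis})
    return topics


summary = summarize_topics(SAMPLE)
```

Even in this small case the paths are phrased in Topic Maps terms (`baseName`, `subjectIdentity`) rather than domain terms.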
  • 106. Processing XTM with XSLT: This is possible, but unpleasant. The main problem is that the XML is phrased in terms of Topic Maps, not in domain terms; this means that all the XPath will talk about “topic”, “association”, ... and not “person”, “works-for” etc. The structure is also complicated, which makes queries complicated. For example, the XPath to traverse an association looks like this:
      //xtm:association
        [xtm:member[xtm:roleSpec / xtm:topicRef / @xlink:href = '#employer']
                   [xtm:topicRef / @xlink:href = concat('#', $company)]]
        [xtm:instanceOf / xtm:topicRef / @xlink:href = '#employed-by']
  • 107. TM/XML: A non-standard XML syntax for Topic Maps, defined by Ontopia (presented at TMRA’05) and implemented in the OKS. XSLT-friendly: much easier to process with XSLT than XTM; can be understood by developers who do not understand Topic Maps; dynamic domain-specific syntaxes instead of a generic syntax; predictable (can generate XML Schema from TM ontology).
  • 108. TM/XML example
      <topicmap ... reifier="tmtopic">
        <topicmap id="tmtopic">
          <iso:topic-name><tm:value>TM/XML example</tm:value></iso:topic-name>
          <dc:description>An example of the use of TM/XML.</dc:description>
        </topicmap>
        <person id="lmg">
          <iso:topic-name><tm:value>Lars Marius Garshol</tm:value>
            <tm:variant scope="core:sort">garshol, lars marius</tm:variant>
          </iso:topic-name>
          <homepage datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://www.garshol.priv.no</homepage>
          <created-by role="creator" topicref="tmtopic" otherrole="work"/>
          <presentation role="presenter">
            <presented topicref="tmxml"/>
            <event topicref="tmra05"/>
          </presentation>
        </person>
      </topicmap>
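To see why a domain-specific syntax is easier to process, here is a sketch that reads a cut-down version of the example. It uses Python's `xml.etree.ElementTree` rather than XSLT, purely to keep the illustration short. The slide elides the namespace declarations, so the two namespace URIs below are placeholders invented for the example, as is the function name `people`.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the slide does not show the real declarations.
NS = {
    "iso": "http://example.org/placeholder/iso",
    "tm": "http://example.org/placeholder/tm",
}

DOC = """<topicmap xmlns:iso="http://example.org/placeholder/iso"
                  xmlns:tm="http://example.org/placeholder/tm">
  <person id="lmg">
    <iso:topic-name><tm:value>Lars Marius Garshol</tm:value></iso:topic-name>
    <homepage>http://www.garshol.priv.no</homepage>
    <presentation role="presenter">
      <presented topicref="tmxml"/>
      <event topicref="tmra05"/>
    </presentation>
  </person>
</topicmap>"""


def people(doc):
    """List persons with name, homepage and presented works, using domain terms."""
    root = ET.fromstring(doc)
    result = []
    for person in root.findall("person"):
        result.append({
            "id": person.get("id"),
            "name": person.findtext("iso:topic-name/tm:value", namespaces=NS),
            "homepage": person.findtext("homepage"),
            "presented": [p.get("topicref")
                          for p in person.findall("presentation/presented")],
        })
    return result


persons = people(DOC)
```

The paths mention `person`, `homepage` and `presentation` directly, which is the contrast with the association-traversal XPath on the previous slide.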
  • 109. tmphoto (http://www.garshol.priv.no/tmphoto/): A topic map to organize my personal photos; contains ~15,000 photos. A web gallery runs on Ontopia on www.garshol.priv.no. [Diagram: Category, Person, Photo, Event, Location]
  • 110. tmtools (http://www.garshol.priv.no/tmtools/): An index of Topic Maps tools, organized as shown on the right. Again, a web application for browsing; screenshots below. [Diagram: Organization, Person, Software product, Platform, Category, Technology]
  • 111. The person page: Boring! No content.
  • 113. get-illustration: A web service in tmphoto. It receives the PSI of a person, then automatically picks a suitable photo of that person, based on vote score for photos, categories (portrait), other people in photo, ... The service returns a topic map fragment with links to the person page and a few different sizes of the selected photo. http://www.garshol.priv.no/blog/183.html
  • 114. get-illustration [diagram]: tmtools asks tmphoto “Do you have a photo of http://psi.ontopedia.net/Benjamin_Bock ?” via http://www.garshol.priv.no/tmphoto/get-illustration?identifier=http://psi.on.... tmphoto: “Hmmm. Scores, categories, people in photo, ...” tmphoto returns a topic map fragment.
  • 116. Points to note: No hard-wiring of links: just add identifiers when creating people topics; photos appear automatically; if a better photo is added later, it’s replaced automatically. No copying of data: no duplication, no extra maintenance. Very loose binding: nothing application-specific. Highly extensible: once the identifiers are in place we can easily pull in more content from other sources.
  • 117. My blog: Has more content about people (tmphoto & tmtools), events (tmphoto), tools (tmtools), technologies (tmtools). This should be available in those applications.
  • 118. Solution: My blog posts are tagged, but the tags are topics, which can have PSIs; these PSIs are used in tmphoto and tmtools, too. The get-topic-page request lets tmphoto & tmtools ask the blog for links to relevant posts: given identifiers for a topic, it returns links to pages about that topic. http://www.garshol.priv.no/blog/145.html
  • 119. get-topic-page [diagram]: tmphoto asks the blog “Do you have pages about http://psi.ontopedia.net/TMRA_2008 ?” via http://www.garshol.priv.no/blog/get-topic-page?identifier=http://psi.on.... The blog returns a topic map fragment: topics linking to individual blog posts.
  • 122. Ontopia Navigator Framework: Java API for interacting with the TM repository. JSP tag library: based on tolog; kind of like XSLT in JSP with tolog instead of XPath; has JSTL integration. Undocumented parts: web presentation components, some wrapped as JSP tags; want to build proper portlets from them.
  • 123. How it works [diagram]: Web server with JSP container (e.g. Apache Tomcat); Browser, JSP page (repeated for several browsers); Tag Libraries; Topic Map Engine; Topic Map.
  • 124. The two tag libraries. tolog: makes up nearly the entire framework; used to extract information from topic maps; lets you execute tolog queries to extract information from the topic map; looping and control flow structures. template: used to create template pages; separates layout and structure from content; optional, but recommended.
  • 133. How the tag libraries work: The topic map engine holds a registry of topic maps, collected from the tm-sources.xml configuration file; each topic map has its own id (usually the file name). Each page also holds a set of variable bindings; each variable holds a collection of objects; objects can be topics, base names, locators, strings, ... Some tags set the values of variables, while others use them.
  • 140. Building a JSP page: The <%@ taglib ... %> tags declare your tag libraries; each tells the page which tag library to include and binds it to a prefix. Prefixes are used to qualify the tags (and avoid name collisions). Use the <tolog:context> tag around the entire page; the "topicmap" attribute specifies the ID of the current topic map. The first time you access the page in your browser the page gets compiled; if you modify the page, it will be recompiled the next time it is accessed. http://www.ontopia.net/operamap
  • 147. Navigator tag library example
      <%-- assume variable 'composer' is already set --%>
      <p><b>Operas:</b><br/>
      <tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera), { premiere-date($OPERA, $DATE) }?">
        <li>
          <a href="opera.jsp?id=<tolog:id var="OPERA"/>"><tolog:out var="OPERA"/></a>
          <tolog:if var="DATE"> <tolog:out var="DATE"/> </tolog:if>
        </li>
      </tolog:foreach>
      </p>
  • 154. where to find the other files, plus plug-ins
  • 156. tells the navigator where to find topic maps
  • 158. configuration of the log4j logging
  • 159. More details in the "Configuration Guide" document.
  • More information: Navigator Framework Configuration Guide, http://www.ontopia.net/doc/current/doc/navigator/config.html; Navigator Framework Developer's Guide, http://www.ontopia.net/doc/current/doc/navigator/navguide.html; Navigator Framework Tag Library Reference, http://www.ontopia.net/doc/current/doc/navigator/tolog-taglib.html
  • 161. What is automated classification? Creating parts of a topic map automatically, using the text in existing content as the source; not necessarily 100% automatic, the user may help out. A hard task: natural language processing is very complex, and the result is never perfect. However, it’s possible to achieve some results.
  • 162. Why automate classification? Creating a topic map requires intellectual effort, that is, it requires work by humans. Human effort = cost: the added value must be sufficient to justify the cost; in some cases either the cost is too high, or the value added is too limited. The purpose of automation is to lower the cost; this increases the number of cases where the use of Topic Maps is justified.
  • 163. Automatable tasks: Ontology: hard; depends on requirements; one time only. Instance data: hard; usually exists in other sources. Document keywords: easier; frequent operation; usually no other sources. [Diagram: ontology with Project, Person, Department, "Worked on"; instances Jane Doe "worked on" XYZ Project, "employed in" IT group]
  • 164. Two kinds of categorization: Broad categorization: categories are broadly defined and include many different subjects (example: Environment, Crisis management). Narrow categorization: uses very specific keywords, where each keyword is a single subject (example: Water, Norway, drought, Drought Act, Cloud seeding, Morecambe Bay).
  • 165. What it does: Extracts keywords from content; the goal is to use these for classification. Not entity recognition: we only care about identifying what the content is about. Uses a statistical approach: no attempt at full formal parsing of the text.
  • 166. Steps of operation: Identify format, then extract the text. Identify language, then remove stop words and stem remaining words. Classify: can use terms from preexisting Topic Maps; exploits knowledge of the language. Return proposed keywords.
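The middle steps (stop-word removal, stemming, ranking) can be sketched in a few lines. This is a toy illustration and not Ontopia's classifier: the stop-word list, the crude suffix-stripping `stem` function and the frequency-based score are all assumptions made for the example, and format detection, language identification and the use of existing topics are left out.

```python
import re
from collections import Counter

# Tiny English stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for"}


def stem(word):
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text, top=5):
    """Return (term, score) pairs, with scores relative to the top term (1.0)."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = [stem(w) for w in words if w not in STOP_WORDS]
    counts = Counter(stems)
    if not counts:
        return []
    best = max(counts.values())
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, round(n / best, 2)) for term, n in ranked[:top]]


keywords = extract_keywords(
    "Topic maps and the topic map standard. Maps are for indexing topics.",
    top=3,
)
```

Scores are normalized so that the best term gets 1.0, which mirrors the shape of the ranked lists shown in the examples that follow, although the real scoring is certainly more involved than raw frequency.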
  • 167. Example of keyword extraction: topic maps 1.0; metadata 0.57; subject-based class. 0.42; Core metadata 0.42; faceted classification 0.34; taxonomy 0.22; monolingual thesauri 0.19; controlled vocabulary 0.19; Dublin Core 0.16; thesauri 0.16; Dublin 0.15; keywords 0.15
  • 168. Example #2: Automated classification 1.0 5; Topic Maps 0.51 14; XSLT 0.38 11; compound keywords 0.29 2; keywords 0.26 20; Lars 0.23 1; Marius 0.23 1; Garshol 0.22 1; ...
  • 169. So how could this be used? To help users classify new documents in a CMS interface: suggest appropriate keywords, screened by the user before approval. To automate classification of incoming documents: this means lower quality, but also lower cost. To get an overview of interesting terms in a document corpus: classify all documents and extract the most interesting terms; this can be used as the starting point for building an ontology (keyword extraction only).
  • 170. Example user interface: The user creates an article; this screen is then used to add keywords; the user adjusts the proposals from the classifier.
  • 171. Interfaces: Command line: java net.ontopia.topicmaps.classify.Chew <topicmapuri> <inputfile> (produces textual output only). API: net.ontopia.topicmaps.classify.SimpleClassifier, with classify(uri, topicmap) -> TermDatabase and classify(uri) -> TermDatabase.
  • 172. Supported formats and languages: Formats: XML (any schema), HTML (non-XML), PDF, Word (.doc, .docx), PowerPoint (.ppt, .pptx), plain text. Languages: English, Norwegian.
  • 173. Visualization of Topic Maps: Vizigator
  • 174. The Vizigator: Graphical visualization of Topic Maps. Two parts: VizDesktop, a Swing desktop app for configuration; Vizlet, a Java applet for web deployment. The configuration is stored in an XTM file.
  • 175. The uses of visualization: Not really suitable for navigation; doesn't work for all kinds of data. Great for seeing the big picture.
  • 179. The Vizigator uses TMRAP: the Vizlet runs in the browser (on the client); a fragment of the topic map is downloaded from the server; the fragment is grown as needed. [Diagram: Server, TMRAP]
  • 180. Embedding the Vizlet: Set up the TMRAP service, add ontopia-vizlet.jar, and add the necessary HTML:
      <applet code="net.ontopia.topicmaps.viz.Vizlet.class" archive="ontopia-vizlet.jar">
        <param name="tmrap" value="/omnigator/plugins/viz/">
        <param name="config" value="/omnigator/plugins/viz/config.jsp?tm=<%= tmid %>">
        <param name="tmid" value="<%= tmid %>">
        <param name="idtype" value="<%= idtype %>">
        <param name="idvalue" value="<%= idvalue %>">
        <param name="propTarget" value="VizletProp">
        <param name="controlsVisible" value="true">
        <param name="locality" value="1">
        <param name="max-locality" value="5">
      </applet>
  • 182. Omnigator: Generic Topic Maps browser; very useful for seeing what's in a topic map; the second-oldest part of Ontopia. Contains other features beyond simple browsing: statistics, management console, merging, tolog querying/updates, export.
  • 183. Ontology designer and editor: Ontopoly
  • 184. Ontopoly: A generic Topic Maps editor, in two parts: the ontology editor, used to create the ontology and schema; and the instance editor, used to enter instances based on the ontology. Features: works with both XTM files and topic maps stored in the RDBMS backend; supports access control to administrative functions, ontology, and instance editors; existing topic maps can be imported; parts of the ontology can be marked as read-only, or hidden.
  • 185. Ontology designer: Create an ontology based on topic, association, name, occurrence, and role types. Supports iterative ontology development: modify and prototype the ontology until it's right. Supports ontology annotation: add fields to topic types, for example. Supports views: define restricted views of certain topic types.
  • 186. Instance editor: Configured by the ontology editor; shows topics as defined by the ontology. Has several ways to pick associations: drop-down list, by search, from hierarchy. Avoids conflicts: pages viewed by one user are locked to others.
  • 187. Ontopoly is embeddable: The Ontopoly instance editor can be embedded; basically, the main panel can be inserted into another web application, using an iframe. Requires only the ID of the topic being edited; can also be restricted to a specific view. Makes it possible to build easier-to-use editors, so users don't have to learn all of Ontopoly.
  • 189. CMS integration: The best way to add content functionality to Ontopia; the world doesn’t need another CMS, and it is better to reuse those which already exist. So far two integrations exist: Escenic, and OfficeNet Knowledge Portal; more are being worked on.
  • 190. Implementation: A CMS event listener: the listener creates topics for new CMS articles, folders, etc; the mapping is basically the design of the ontology used by this listener. Presentation integration: it must be possible to list all topics attached to an article; conversely, it must be possible to list all articles attached to a topic; how close the integration needs to be here will vary, as will the difficulty of the integration. User interface integration: it needs to be possible to attach topics to an article from within the normal CMS user interface; this can be quite tricky. Search integration: the Topic Maps search needs to also search content in the CMS; this can be achieved by writing a tolog plug-in.
  • 191. Articles as topics: Goal: associate articles with topics, mainly to say what they are about; typically you also want to include other metadata. Need to create topics for the articles to do this; in fact, a general CMS-to-TM mapping is needed; must decide what metadata and structures to include. [Diagram: article "New city council appointed" is about topic "Elections"]
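The two-way lookup that presentation integration needs (topics for an article, articles for a topic) can be sketched as a small in-memory index. This is only a conceptual sketch: in a real integration the "is about" associations would live in the topic map and be queried through the engine, and the class name `IsAboutIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class IsAboutIndex:
    """Toy index of "is about" associations between articles and topics."""

    def __init__(self):
        self._topics_by_article = defaultdict(set)
        self._articles_by_topic = defaultdict(set)

    def attach(self, article_id, topic_id):
        # Record the association in both directions.
        self._topics_by_article[article_id].add(topic_id)
        self._articles_by_topic[topic_id].add(article_id)

    def topics_for(self, article_id):
        """All topics attached to an article."""
        return sorted(self._topics_by_article.get(article_id, ()))

    def articles_for(self, topic_id):
        """All articles attached to a topic."""
        return sorted(self._articles_by_topic.get(topic_id, ()))


index = IsAboutIndex()
index.attach("new-city-council-appointed", "elections")
index.attach("new-city-council-appointed", "city-council")
index.attach("turnout-figures", "elections")
```

A CMS event listener, as described above, would be the natural place to call something like `attach` whenever an editor tags an article.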
  • 192. Mapping issues: Article topics: what topic type to use? title becomes name? (do you know the title?) include author? include last modified? include workflow state? should all articles be mapped? Folders/directories/sections/...: should these be mapped, too? one topic type for all folders/.../.../...? if so, use associations to connect articles to folders, and use associations to reproduce the hierarchical folder structure. Multimedia objects: should these be included? what topic type? what name? ...
  • 193. Two styles of mappings: Articles as articles: the topic represents only the article; the topic type is some subclass of “article”; an “is about” association connects the article into the topic map; fields are presentational (title, abstract, body). Articles as concepts: the topic represents some real-world subject (like a person), and the article is just the default content about that subject; the type is the type of the subject (person); semantic associations to the rest of the topic map (works in department, has competence, ...); fields can be semantic (name, phone no, email, ...).
  • 194. Article as article: An article about the building of a new school. “Is about” association to “Primary schools”. Topic type is “article”.
  • 195. Article as concept: An article about a sports hall. The article really represents the hall. Topic type is “Location”. Associations to city borough and to events in the location.
  • 202. Ontopia/Liferay: An integration with the Liferay CMS and portal is in progress; presented Friday 1130-1150 in Schiller 2.
  • 204. The project: A new citizen’s portal for the city administration; a strategic decision to make the portal the main interface for interaction with citizens; as many services as possible are to be moved online. Big project: started in late 2004, to continue at least into 2008; ~5 million Euro spent by launch date; 1.7 million Euro budgeted for 2007; Topic Maps development is a fraction of this (less than 25%). Many companies involved: Bouvet/Ontopia, Avenir, KPMG, Karabin, Escenic.
  • 205. Simplified original ontology [diagram]: Service catalog, Escenic (CMS), LOS, Form, Article, "nearly everything", Category, Service, Subject, Department, Borough, External resource, Employee, Payroll++
  • 210. NRK/Skole: Norwegian National Broadcasting (NRK): media resources from the archives, published for use in schools, integrated with the National Curriculum. In production; delayed by copyright wrangling. Technologies: OKS, Polopoly CMS, MySQL database, Resin application server.
  • 213. Curriculum-based browsing (3): Feminist movement in the 70s and 80s; Changes to the family in the 70s; The prime minister’s husband; Children choosing careers; Gay partnerships in 1993
  • 214. One video (prime minister’s husband) [screenshot labels]: Metadata, Subject, Person, Related resources, Description
  • 216. Implementation: Domain model in Java: plain old Java objects built on Ontopia’s Java API and tolog. JSP for presentation, using JSTL on top of the domain model. Subversion for the source code. Maven2 to build and deploy. Unit tests.
  • 217. The future: What we’d like to see
  • 218. The big picture [diagram]: Engine; Auto-class.; DB2TM; XML2TM; Taxon. import; Ontopoly; Web service; Portlet support; Data integration (A.N.other, A.N.other); CMS integration (Escenic, OKP, Other CMSs, A.N.other, A.N.other)
  • 219. CMS integrations: The more of these, the better. Candidate CMSs: Liferay (being worked on at Bouvet), Alfresco, Magnolia, Inspera, JSR-170 Java Content Repository, CMIS (OASIS web service standard).
  • 220. Portlet toolkit: Subversion contains a number of “portlets”: basically, Java objects doing presentation tasks; some have JSP wrappers as well. Examples: display tree view; list of topics filterable by facets; show related topics; get-topic-page via TMRAP component. Not ready for prime time yet: undocumented, incomplete.
  • 221. Ontopoly plug-ins: Plugins for getting more data from externals: TMSync import plugin, DB2TM plugin, Subj3ct.com plugin, adapted RDF2TM plugin, classify plugin, ... Plugins for ontology fragments: menu editor, for example.
  • 222. TMCL: Now implementable. We’d like to see: an object model for TMCL (supporting changes); a validator based on the object model; Ontopoly import/export from TMCL (initially); refactoring of the Ontopoly API to make it more portable; Ontopoly ported to use TMCL natively (eventually).
  • 223. Things we’d like to remove: OSL support (Ontopia Schema Language). Web editor framework: unfortunately, still used by some major customers. Fulltext search: the old APIs for this are not really of any use.
  • 224. Management interface: Import topic maps (to file or RDBMS)
  • 225. What do you think? Suggestions? Questions? Plans? Ideas?
  • 226. Getting started: Setting up the developer environment
  • 227. If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go.
  • 228. If you are developing Ontopia... You must have Java 1.5 (not 1.6 or 1.7 or ...), Ant 1.6 (or later), Ivy 2.0 (or later), and Subversion. Then check out the source from Subversion and build:
      svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only
      ant bootstrap
      ant dist.jar.ontopia
      ant test
      ant dist.ontopia
  • 229. Beware: This is fun, because you can play around with anything you want (e.g., my build has a faster TopicIF.getRolesByType) and you can track changes as they happen in svn. However, you’re on your own: if it fails it’s kind of hard to say why; maybe it’s your changes, maybe not. For production use, official releases are best.
  • 231. Our goal: To provide the best toolkit for building Topic Maps-based applications. We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable.
  • 232. Our philosophy: We want Ontopia to provide as much useful more-or-less generic functionality as possible. New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others.
  • 233. The sandbox: There’s a lot of Ontopia-related code which does not meet those requirements; some of it can be very useful, and someone may pick it up and improve it. The sandbox is for these pieces: some are in Ontopia’s Subversion repository, others are maintained externally. To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements.
  • 234. Communications: Join the mailing list(s)! http://groups.google.com/group/ontopia and http://groups.google.com/group/ontopia-dev. Google Code page: http://code.google.com/p/ontopia/ (note the “updates” feed!). Blog: http://ontopia.wordpress.com. Twitter: http://twitter.com/ontopia
  • 235. Committers: These are the people who run the project; they can actually commit to Subversion; they can vote on decisions to be made etc. Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion.
  • 236. How to become a committer: Participate in the project! That is, get involved first; let people get to know you, show some commitment. Once you’ve gotten some way into the project you can ask to become a committer; best if you have provided some patches first. Unless you’re going to commit changes there’s no need to be a committer.
  • 237. Finding a task to work on: Report bugs! They exist; if you find any, please report them. Look at the open issues: there is always testing/discussion to be done. Look for issues marked “newbie”: http://code.google.com/p/ontopia/issues/list?q=label:Newbie. Look at what’s in the sandbox: most of these modules need work. Scratch an itch: if there’s something you want fixed/changed/added...
  • 238. How to fix a bug: First figure out why you think it fails. Then write a test case based on your assumption; make sure the test case fails (test before you fix). Then fix the bug; follow the coding guidelines (see wiki). Then run the test suite; verify that you’ve fixed the bug and that you haven’t broken anything. Then submit the patch.
  • 239. The test suite: Lots of *.test packages in the source tree; 3795 test cases as of right now; test data in ontopia/src/test-data; some tests are generators based on files; some of the test files come from cxtm-tests.sf.net. Run with:
      ant test
      java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
  • 240. Source tree structure, net.ontopia: utils (various utilities); test (various test support code); infoset (LocatorIF code + cruft); persistence (OR-mapper for RDBMS backend); product (cruft); xml (various XML-related utilities); topicmaps (next slides)
  • 241. Source tree structure, net.ontopia.topicmaps: core (core engine API); impl (engine backends + utils); utils (utilities, see next slide); cmdlineutils (command-line tools); entry (TM repository); nav + nav2 (navigator framework); query (tolog engine); viz; classify; db2tm; webed (cruft)
  • 242. Source tree structure, net.ontopia.topicmaps.utils: * (various utility classes); ltm (LTM reader and writer); ctm (CTM reader); rdf (RDF converter, both ways); tmrap (TMRAP implementation)
  • 244. The engine: The core API corresponds closely to the TMDM: TopicMapIF, TopicIF, TopicNameIF, ... Compile with "ant init compile.ontopia"; .class files go into ontopia/build/classes. "ant dist.ontopia.jar" makes a jar.
  • 245. The importers: The main class implements TopicMapReaderIF; usually, this lets you set up configuration, etc, then uses other classes to do the real work. XTM importers: use an XML parser; main work done in XTM(2)ContentHandler; some extra code for validation and format detection. CTM/LTM importers: use Antlr-based parsers; real code in ctm.g/ltm.g. All importers work via the core API.
  • 246. Find an issue in the issue tracker (picking one with “Newbie” might be good, but isn’t necessary). Get set up: check out the source code, build the code, run the test suite. Then dig in: we’ll help you with any questions you have. At the end, submit a patch to the issue tracker; remember to use the test suite!