3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse

3rd status report of degree project

Integrating SWI-Prolog
for semantic reasoning in Bioclipse
Samuel Lampa, 2010-04-07
Project blog: http://saml.rilspace.com

Research question

How do biochemical questions formulated as Prolog queries compare to other solutions available in Bioclipse in terms of speed and expressiveness?

Compared Semantic Tools

● Jena
  – General RDF querying (via SPARQL)
● Pellet
  – OWL-DL reasoning (via SPARQL)
  – General querying via Jena (via SPARQL)
● SWI-Prolog
  – Access to RDF triples (both assertion and querying) via the rdf( Subject, Predicate, Object ) method
  – Complex wrapper/convenience methods can be built
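
The triple access listed above can be illustrated with a minimal sketch. It assumes SWI-Prolog's semweb library is installed. The `ex` namespace, the resource names and the predicates `load_example/0` and `mol_peak/2` are hypothetical and exist only for this illustration.

```prolog
:- use_module(library(semweb/rdf_db)).

% Hypothetical namespace, registered only for this example
:- rdf_register_ns(ex, 'http://example.org/onto#').

% Assertion: add two triples to the in-memory RDF store
load_example :-
    rdf_assert(ex:mol1, ex:hasSpectrum, ex:spec1),
    rdf_assert(ex:spec1, ex:hasPeak, ex:peak1).

% Querying: a convenience wrapper built on top of rdf/3,
% relating a molecule to the peaks of its spectra
mol_peak(Mol, Peak) :-
    rdf(Mol, ex:hasSpectrum, Spectrum),
    rdf(Spectrum, ex:hasPeak, Peak).
```

After calling `load_example`, the query `mol_peak(Mol, Peak)` binds both variables to the full IRIs of the asserted resources.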

Use Case: NMRShiftDB

Interesting use case: Querying NMRShiftDB data
● Characteristics:
  – Rather shallow RDF graph
  – Numeric (float value) interval matching

NMR Spectrum Similarity Search

What to test:
Given a spectrum, represented as a list of shift values, find spectra with the same shifts (allowing intensity variation within a limit).

Shift → “Dereferencing” spectra

Example Data

<http://pele.farmbio.uu.se/nmrshiftdb/?moleculeId=234>
    :hasSpectrum <http://pele.farmbio.uu.se/nmrshiftdb/?spectrumId=4735> ;
    :moleculeId "234" .
<http://pele.farmbio.uu.se/nmrshiftdb/?spectrumId=4735>
    :hasPeak <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p0> ,
             <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p1> ,
             <http://pele.farmbio.uu.se/nmrshiftdb/?s4735p2> .
<http://pele.farmbio.uu.se/nmrshiftdb/?s4735p0>
    :hasShift "17.6"^^xsd:decimal .

Prolog code

:- use_module(library(semweb/rdf_db)).

% Register RDF namespaces, for use in the convenience methods at the end
:- rdf_register_ns(nmr, 'http://www.nmrshiftdb.org/onto#').
:- rdf_register_ns(xsd, 'http://www.w3.org/2001/XMLSchema#').

% Collect in 'Mols' every 'Mol' whose peak shift values contain a
% near-match for each of the values in SearchShiftVals.
find_mol_with_peak_vals_near( SearchShiftVals, Mols ) :-
    setof( Mol,
           MolShiftVals^( list_peak_shifts_of_mol( Mol, MolShiftVals ),                % A Mol's shift values are collected
                          contains_list_elems_near( SearchShiftVals, MolShiftVals ) ), % and compared against the given SearchShiftVals
           Mols ).                                                                     % All matching 'Mol's end up in 'Mols'

% Given a 'Mol', give its shift values in list form, in 'ListOfPeaks'
list_peak_shifts_of_mol( Mol, ListOfPeaks ) :-
    has_spectrum( Mol, Spectrum ),
    findall( ShiftVal,
             ( has_peak( Spectrum, Peak ),
               has_shift_val( Peak, ShiftVal ) ),
             ListOfPeaks ).

% Compare two lists to see if the second list has near-matches for each of the values in the first
contains_list_elems_near( [ElemHead|ElemTail], List ) :-
    member_close_to( ElemHead, List ),
    (   ElemTail == []
    ;   contains_list_elems_near( ElemTail, List )
    ).

%%%%%%%%%%%%%%%%%%%%%%%%
% Recursive construct: %
%%%%%%%%%%%%%%%%%%%%%%%%
% Test first the end criterion:
member_close_to( X, [ Y | _ ] ) :-
    closeTo( X, Y ).
% but if the above doesn't validate, then recursively continue with the tail of the list:
member_close_to( X, [ _ | Tail ] ) :-
    member_close_to( X, Tail ).

% Numerical near-match
closeTo( Val1, Val2 ) :-
    abs(Val1 - Val2) =< 0.3.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Convenience accessory methods %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
has_shift_val( Peak, ShiftVal ) :-
    rdf( Peak, nmr:hasShift, literal(type(xsd:decimal, ShiftValLiteral)) ),
    atom_number_create( ShiftValLiteral, ShiftVal ).
has_spectrum( Mol, Spectrum ) :-
    rdf( Mol, nmr:hasSpectrum, Spectrum ).
has_peak( Spectrum, Peak ) :-
    rdf( Spectrum, nmr:hasPeak, Peak ).

% Wrapper method for the atom_number/2 method, which converts atoms (string constants) to numbers.
% The wrapper method avoids exceptions on empty atoms, instead converting them into a zero.
atom_number_create( Atom, Number ) :-
    (   atom_length( Atom, AtomLength ), AtomLength > 0   % IF atom is not empty
    ->  atom_number( Atom, Number )                       % THEN convert the atom to a numerical value
    ;   atom_number( '0', Number )                        % ELSE convert to a zero
    ).

SPARQL code

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX nmr: <http://www.nmrshiftdb.org/onto#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s
WHERE {
  ?s nmr:hasPeak [ nmr:hasShift ?s1 ] ,
                 [ nmr:hasShift ?s2 ] ,
                 [ nmr:hasShift ?s3 ] ,
                 [ nmr:hasShift ?s4 ] ,
                 [ nmr:hasShift ?s5 ] ,
                 [ nmr:hasShift ?s6 ] ,
                 [ nmr:hasShift ?s7 ] ,
                 [ nmr:hasShift ?s8 ] ,
                 [ nmr:hasShift ?s9 ] ,
                 [ nmr:hasShift ?s10 ] ,
                 [ nmr:hasShift ?s11 ] ,
                 [ nmr:hasShift ?s12 ] ,
                 [ nmr:hasShift ?s13 ] ,
                 [ nmr:hasShift ?s14 ] ,
                 [ nmr:hasShift ?s15 ] ,
                 [ nmr:hasShift ?s16 ] .
  FILTER ( fn:abs(?s1 - 17.6) < 0.3 ) .
  FILTER ( fn:abs(?s2 - 18.3) < 0.3 ) .
  FILTER ( fn:abs(?s3 - 22.6) < 0.3 ) .
  FILTER ( fn:abs(?s4 - 26.5) < 0.3 ) .
  FILTER ( fn:abs(?s5 - 31.7) < 0.3 ) .
  FILTER ( fn:abs(?s6 - 33.5) < 0.3 ) .
  FILTER ( fn:abs(?s7 - 33.5) < 0.3 ) .
  FILTER ( fn:abs(?s8 - 41.8) < 0.3 ) .
  FILTER ( fn:abs(?s9 - 42.0) < 0.3 ) .
  FILTER ( fn:abs(?s10 - 42.2) < 0.3 ) .
  FILTER ( fn:abs(?s11 - 78.34) < 0.3 ) .
  FILTER ( fn:abs(?s12 - 140.99) < 0.3 ) .
  FILTER ( fn:abs(?s13 - 158.3) < 0.3 ) .
  FILTER ( fn:abs(?s14 - 193.4) < 0.3 ) .
  FILTER ( fn:abs(?s15 - 203.0) < 0.3 ) .
  FILTER ( fn:abs(?s16 - 0) < 0.3 ) .
}

“Expressivity”: SPARQL vs Prolog

[Side-by-side comparison: SPARQL query vs. Prolog predicate taking variables]

How to change “input parameters”?
● SPARQL: Modify the SPARQL query
● Prolog: Change the input parameter
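
The difference can be made concrete with a small, self-contained sketch. It does not use the RDF store: the `mol_shifts/2` facts are hypothetical stand-in data, and `find_mols/3`, `all_near/3` and the three-argument `member_close_to/3` are illustrative variants of the predicates shown earlier, with the tolerance passed as an extra argument.

```prolog
% Hypothetical stand-in data: mol_shifts(Molecule, ShiftValues)
mol_shifts(mol_a, [17.6, 18.3, 22.6]).
mol_shifts(mol_b, [17.5, 40.0]).

% True if X lies within Tol of some element of the list
member_close_to(X, [Y|_], Tol) :-
    abs(X - Y) =< Tol.
member_close_to(X, [_|Tail], Tol) :-
    member_close_to(X, Tail, Tol).

% True if every search value has a near-match among Shifts
all_near([], _, _).
all_near([S|Ss], Shifts, Tol) :-
    member_close_to(S, Shifts, Tol),
    all_near(Ss, Shifts, Tol).

% Collect all matching molecules; sort/2 removes duplicates
% that arise when a value has more than one near-match
find_mols(SearchVals, Tol, Mols) :-
    findall(Mol,
            ( mol_shifts(Mol, Shifts),
              all_near(SearchVals, Shifts, Tol) ),
            Mols0),
    sort(Mols0, Mols).
```

Changing the search then only means changing the arguments: `find_mols([17.6], 0.3, Mols)` gives `Mols = [mol_a, mol_b]`, while `find_mols([17.6, 18.3], 0.3, Mols)` gives `Mols = [mol_a]`. The predicate definitions stay untouched.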

Observations

● SPARQL
  – Fewer lines of code
  – Easier to understand the code
● Prolog
  – Easier to change input parameters
  – Easier to re-use existing logic (call a method rather than cut and paste SPARQL code)
  – Easier to change aspects of the execution logic

Prolog vs Jena vs JenaTDB vs Pellet

Observations

● Prolog is the fastest (in-memory only)
● Jena is faster with a disk-based than with an in-memory RDF store!
● Pellet with an in-memory store is slow
● Pellet with a disk-based store is out of the question
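
For readers who want to time a Prolog query themselves, one possible approach is sketched below. This is not necessarily how the measurements behind the observations above were taken. `time_goal/2` is a hypothetical helper name; `statistics/2` with the `cputime` key and `ignore/1` are SWI-Prolog built-ins.

```prolog
% Hypothetical helper: run Goal once and report the CPU time used, in seconds.
% ignore/1 makes the helper succeed even if Goal itself fails.
time_goal(Goal, Seconds) :-
    statistics(cputime, T0),
    ignore(Goal),
    statistics(cputime, T1),
    Seconds is T1 - T0.
```

With the RDF data loaded, a call such as `time_goal(find_mol_with_peak_vals_near([17.6, 18.3], Mols), Seconds)` would report the CPU time of a single search. Repeating the call several times and averaging gives a steadier figure.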

Project plan from last

Planned final presentation: 28 April 2010 (BMC B7:101a)
Everybody is welcome!

Thank you!
Project blog: http://saml.rilspace.com
