10-12-2014 
SAFE 
Policy Aware SPARQL Query Federation 
Over RDF Data Cubes 
Dr. Ratnesh Sahay 
Semantics in eHealth & Life Sciences (SeLS) 
Insight Centre for Data Analytics 
NUI Galway, Ireland 
SWAT4LS-2014, Berlin 
Germany
Enabling networked knowledge 
Linked2Safety - Showcases 
1. Showcase #1 – Phase III Clinical Trial: Subject Selection Criteria
   • The unbiased randomised selection of subjects in phase III clinical trials
   • e.g. return subjects with diabetesValue > 4 and weight > 80 and hasCancer
2. Showcase #2 – Phase IV Post-Marketing Surveillance Trial
   • The pharmacovigilance of a drug after it receives permission to be sold
   • e.g. test DrugX association with headaches
3. Showcase #3 – Chemoinformatics
   • Identification of relations between molecular fragments and specific adverse side effect categories
   • e.g. test chemicalFragmentX (of DrugX) with rash
The Problem 
Return the number of patients who have been administered the drug Insulin and exhibit BMI > 25, Hypertension and Diabetes as adverse events.
Switzerland, Cyprus, Greece
Safety First – Ethical & Legal Aspects 
Patients’ Anonymity, Data Ownership & Privacy
• Anonymised clinical data cubes
• Non-sensitive clinical parameters without personal information
• Access-control-based query federation
SAFE - Secure SPARQL Query Federation
[Architecture diagram: three data providers, CHUV, CING and ZEINCRO, each publish RDF Data Cubes. Components: Index, Access Policy Model. Flow: SPARQL Query → Source Selection → Access Policy Filter → Query Re-writer → Results.]
SAFE
Pipeline: SPARQL Query + User Info → Source Selection → Access Policy Filtering → Query Re-writing
User: Oya, Clinical Researcher, Expertise – Diabetes
SELECT ?diabetes ?bmi ?hypertension ?cases
WHERE {
  ?observation a qb:Observation .
  ?observation l2s-dim:Diabetes ?diabetes .
  ?observation l2s-dim:BMI ?bmi .
  ?observation l2s-dim:Hypertension ?hypertension .
  ?observation sdmx-measure:Cases ?cases .
}
SAFE – Source Selection
INDEX
  S1 = { Diabetes, BMI, Hypertension, Cases }
  S2 = { Diabetes, BMI, Hypertension, Cases }
  S3 = { Diabetes, BMI, Hypertension, HIV, Cases }
  S4 = { Smoking, Gender, Cases }
Triple Patterns → Capable Sources
  ?observation l2s-dim:Diabetes ?diabetes .          → {S1, S2, S3}
  ?observation l2s-dim:BMI ?bmi .                    → {S1, S2, S3}
  ?observation l2s-dim:Hypertension ?hypertension .  → {S1, S2, S3}
  ?observation sdmx-measure:Cases ?cases .           → {S1, S2, S3, S4}
Join Awareness: S4 matches only the Cases pattern and cannot contribute to the join.
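The index lookup and join-aware pruning on this slide can be sketched roughly as follows. This is a minimal illustration, not SAFE's actual implementation: the `INDEX` dictionary, the function names, and the rule "a source must be able to answer every pattern that shares the join variable" are assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical index: source name -> predicates found in its data cubes,
# copied from the S1..S4 sets shown on the slide.
INDEX: Dict[str, Set[str]] = {
    "S1": {"Diabetes", "BMI", "Hypertension", "Cases"},
    "S2": {"Diabetes", "BMI", "Hypertension", "Cases"},
    "S3": {"Diabetes", "BMI", "Hypertension", "HIV", "Cases"},
    "S4": {"Smoking", "Gender", "Cases"},
}


def capable_sources(predicates: List[str],
                    index: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Triple-pattern-wise selection: sources whose index lists the predicate."""
    return {p: {s for s, preds in index.items() if p in preds}
            for p in predicates}


def join_aware(selection: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Keep only sources that appear for every pattern.

    Assumption: all patterns join on one subject variable (?observation),
    so a source missing any predicate cannot produce a joined result.
    """
    if not selection:
        return {}
    common = set.intersection(*selection.values())
    return {p: sources & common for p, sources in selection.items()}


query_predicates = ["Diabetes", "BMI", "Hypertension", "Cases"]
selected = capable_sources(query_predicates, INDEX)
pruned = join_aware(selected)
print(sorted(selected["Cases"]))  # S4 is still a candidate for Cases
print(sorted(pruned["Cases"]))    # S4 is dropped after pruning
```

No ASK request is sent in this sketch; every decision is made from the local index.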
SAFE – Access Policy 
[Diagram: Access Policy Framework. Inputs: the user profile (Oya, Clinical Researcher, Expertise – Diabetes) and the requested data (S1, S2, S3) from triple pattern-based source selection. Output: access granted or denied for each of S1, S2, S3.]
SAFE – Access Policy 
• Example Access Policy
  AP1 type Access_Policy
  AP1 applies_to {S1, S2}
  AP1 grants_access Read
  AP1 assigned_to Oya
• Example User Profile
  Oya type User
  Oya hasLocation Galway
  Oya hasPurpose "Perform p-value analysis"
  Oya hasRole Clinical Researcher
  Oya hasDomain Diabetes
• SPARQL Query
  ASK WHERE {
    ?accessPolicy a :AccessPolicy .
    ?accessPolicy :appliesToNamedGraph <S1> .
    ?accessPolicy :grantsAccess acl:Read_l2s .
    ?accessPolicy :hasUser :Oya .
  }
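The access policy check can be illustrated with a small in-memory sketch. It assumes policies are held as plain Python objects rather than RDF triples; the `AccessPolicy` class, `is_allowed` and `filter_sources` are hypothetical names that mirror the AP1 example and the ASK query, not part of SAFE.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical in-memory mirror of the AP1 triples on the slide."""
    name: str
    applies_to: FrozenSet[str]  # named graphs / sources the policy covers
    grants_access: str          # access mode, e.g. "Read"
    assigned_to: str            # user the policy is assigned to


def is_allowed(user: str, source: str, mode: str,
               policies: Sequence[AccessPolicy]) -> bool:
    """Counterpart of the ASK query: true if one policy matches all three."""
    return any(
        p.assigned_to == user
        and source in p.applies_to
        and p.grants_access == mode
        for p in policies
    )


def filter_sources(user: str, sources: Sequence[str],
                   policies: Sequence[AccessPolicy],
                   mode: str = "Read") -> List[str]:
    """Drop every selected source the user has no matching policy for."""
    return [s for s in sources if is_allowed(user, s, mode, policies)]


ap1 = AccessPolicy("AP1", frozenset({"S1", "S2"}), "Read", "Oya")
allowed = filter_sources("Oya", ["S1", "S2", "S3"], [ap1])
print(allowed)
```

With AP1 as the only policy, S3 is filtered out for Oya because AP1 applies only to S1 and S2.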
SAFE – Query Rewriting 
• Graph information is added to the query triples
  – SELECT …. WHERE { GRAPH <S1> { …. } }
  – SELECT …. WHERE { GRAPH <S2> { …. } }
• Sub-queries are sent to the relevant sources
  – S1
  – S2
• Results obtained from each source are integrated

Diabetes  BMI  Hypertension  Cases
0         0    0             40
1         0    1             50
0         1    1             120
1         1    1             90
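The rewriting step can be sketched as a string transformation that wraps the WHERE group in a GRAPH clause, once per permitted source. This is a deliberately naive illustration: `add_graph` is a hypothetical helper, and it assumes a single SELECT query with one top-level group and no braces inside literals or IRIs. A real engine would rewrite the parsed query algebra instead.

```python
from typing import Dict, List


def add_graph(query: str, graph: str) -> str:
    """Wrap the outermost WHERE group of `query` in GRAPH <graph> { ... }.

    Naive string handling (an assumption for this sketch): the first "{"
    opens the WHERE group and the last "}" closes it.
    """
    start = query.index("{")
    end = query.rindex("}")
    head = query[:start]
    body = query[start + 1:end].strip()
    tail = query[end + 1:]
    return head + "{ GRAPH <" + graph + "> { " + body + " } }" + tail


def rewrite_for_sources(query: str, sources: List[str]) -> Dict[str, str]:
    """One sub-query per source that passed the access policy filter."""
    return {s: add_graph(query, s) for s in sources}


query = "SELECT ?cases WHERE { ?observation sdmx-measure:Cases ?cases . }"
subqueries = rewrite_for_sources(query, ["S1", "S2"])
for source, text in subqueries.items():
    print(source, "->", text)
```

Each sub-query would then be sent to its source and the returned rows combined; how SAFE merges the rows is not specified on the slide, so that part is left out.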
Evaluation - DataSets 
Dataset       # triples  # obs  # sub  # pred  # obj  Size     Index size  Index generation time
Internal datasets
CHUV          0.8 M      96 K   96 K   36      88     31 MB    -           -
CING          0.1 M      17 K   17 K   21      51     5 MB     -           -
ZEINCRO       0.4 M      49 K   49 K   24      59     15 MB    -           -
Total         1.3 M      162 K  162 K  81      198    51 MB    8 KB        10 sec
External datasets
World Bank    77 M       10 M   10 M   58      40 K   19 GB    -           -
IMF           18 M       1.8 M  1.8 M  30      3151   3.51 GB  -           -
Eurostat      0.3 M      38 K   44 K   31      5717   205 MB   -           -
Trans. Int.   43 K       3939   4286   64      5290   9.2 MB   -           -
Total         95 M       12 M   2 M    183     54 K   23 GB    12 KB       571 sec
Evaluation - Berlin SPARQL Benchmark 
Characteristics       Q1  Q2  Q3   Q4  Q5  Q6    Q7  Q8  Q9    Q10    Q11  Q12
# of Triple Patterns  9   7   9    16  7   8     11  10  7     7      3    7
# of Sources          3   4   4    3   4   3     3   4   3     3      3    3
# of Results          41  50  348  41  62  1983  5   10  1701  19656  570  41

Query features (number of check marks per row; their per-query positions are not recoverable):
Filters             ✓
> 9 Patterns        ✓ ✓ ✓ ✓ ✓ ✓ ✓
Negation            ✓
LIMIT Modifier      ✓ ✓ ✓ ✓
Order By Modifier   ✓ ✓ ✓
DISTINCT Modifier   ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
REGEX Operator      ✓
UNION Operator      ✓
Evaluation – Source Selection
• Sum of triple-pattern-wise sources selected for each query
Systems  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  Q11  Q12  Avg
SAFE     8   10  13  16  15  13  15  16  7   7    9    7    11
FedX     9   13  16  24  20  14  16  19  15  17   9    16   16
• Number of SPARQL ASK requests used for source selection
Systems  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  Q11  Q12  Avg
SAFE     0   0   0   0   0   0   0   0   0   0    0    0    0
FedX     36  28  40  64  48  40  44  40  21  21   9    21   35
Evaluation - Source Selection Time
[Chart: source selection time]
Evaluation - Query Execution Time
[Chart: query execution time. Annotation: query times out for FedX.]
SAFE – Highlights 
• Source Selection
  – SPARQL SERVICE
  – Using SPARQL ASK queries
  – Using a catalog/index
  – SAFE: Hybrid (catalog/index + ASK)
• Lightweight Cache
  – RDF Cube Data Structure
  – AccessPolicy
• Join Aware
  – Excludes ineligible sources before actual query join
• Provenance – via RDF Named Graphs
• Self-contained Data Cubes
  – Creator
  – Location
  – Date
  – Access rights
Conclusion & Future Work 
• Efficient source selection with lightweight indexing
• Policy-aware query execution
• Evaluated against internal and external datasets
• Performance is significantly improved compared to FedX
• Cooking at the moment!
  – Evaluation extended to federation engines (ANAPSID, HiBISCuS)
  – Benchmarking for query federation over statistical data cubes
  – SAFE extension for normal RDF data
http://linked2safety.hcls.deri.org:8080/SAFE-Demo/
SAFE - Team 
• Yasar Khan 
• Muhammad Saleem 
• Aftab Iqbal 
• Muntazir Mehdi 
• Aidan Hogan 
• Panagiotis Hasapis 
• Axel-Cyrille Ngonga Ngomo 
• Stefan Decker 
• Ratnesh Sahay 
Thank You
