Jarrar © 2013 1
Dr. Mustafa Jarrar
University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Mustafa Jarrar
Lecture Notes on Architectural Solutions
Birzeit University, Palestine
2013
Architectural Solutions
in Data Integration
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2014/01/web-data-management.html
Most information adapted from [1]
Outline
Two families of solutions for the integration issue:
- Application-driven Integration
- Data-driven Integration
- Architectures of application-driven Integration
- Information Integration Architectures
- The integration problem
- Criteria to be adopted
Keywords: Data Integration, Application-driven Integration, Data-driven Integration, Web Services, RPC,
Publish & Subscribe, Consolidation, Data Warehouse, Service Oriented Architecture, Virtual Data Integration,
Query complexity, Heterogeneity
Different Solutions
Two families of solutions for the integration issue:
– Application-driven Integration
• Various types of middleware (e.g. Web Services, Remote
Procedure Call (RPC), Publish & Subscribe) that achieve
reconciliation through application-to-middleware communication
– Data-driven Integration
• Various types of data reconciliation and integration
– Consolidation
– Data Warehouse
– Data Integration
Architectures of application-driven Integration
Service Oriented Architecture
[Figure: pairs of Adapter Servers (AS) and Security Servers (SS) attached to an enterprise service bus that carries data messages MSG-1 … MSG-N.]
Legend
SS = Security Server
AS = Adapter Server
MSG = Data Message
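The roles in the legend can be illustrated with a minimal sketch. It assumes that an Adapter Server converts a local record into a common message format and that a Security Server signs and verifies each message before it crosses the bus; the slide does not spell out these duties. The function names, the JSON message format, and the shared key are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would more likely use
# per-member keys or certificates.
SHARED_KEY = b"demo-key"


def adapter_to_message(record: dict) -> bytes:
    """Adapter Server (AS): serialize a local record into a common format."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def secure_send(message: bytes) -> dict:
    """Security Server (SS), sending side: attach a signature to the message."""
    signature = hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()
    return {"body": message, "signature": signature}


def secure_receive(envelope: dict) -> dict:
    """Security Server (SS), receiving side: verify, then decode the message."""
    expected = hmac.new(SHARED_KEY, envelope["body"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["signature"]):
        raise ValueError("message rejected: bad signature")
    return json.loads(envelope["body"])


# MSG-1 travels from one member's AS/SS pair to another's.
msg_1 = secure_send(adapter_to_message({"id": 1, "name": "Alice"}))
received = secure_receive(msg_1)
```

In this sketch the applications never talk to each other directly; they only see records that have passed through an adapter and a security check.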
Architectures of application-driven Integration
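Publish-Subscribe Architecture
[Figure: Application 1, Application 2, …, Application n, each with its own source (Source 1, Source 2, …, Source n), connected through a Middleware. One application publishes an update of an object O; the other applications subscribe to it (steps 1 to 7).]
Typical application-driven integration architecture for integration of updates.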
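The flow of an update can be sketched in a few lines. This is a toy in-memory broker, not a real middleware product: the `Middleware` class, the topic name `object-O`, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Middleware:
    """Hypothetical in-memory broker routing updates to subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a handler to be called for every update on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, update: dict) -> int:
        """Deliver an update to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(update)
        return len(handlers)


# Applications 2 and n each keep their own copy of object O in their own source.
source_2: dict = {}
source_n: dict = {}

bus = Middleware()
bus.subscribe("object-O", source_2.update)
bus.subscribe("object-O", source_n.update)

# Application 1 updates object O and publishes the change.
delivered = bus.publish("object-O", {"id": "O", "address": "Birzeit"})
```

The publisher does not know who the subscribers are; adding another application only requires one more `subscribe` call.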
Information Integration Architectures
Consolidation
[Figure: Source 1, Source 2, …, Source n are merged, once and for all, into a Unique DB (the new architecture).]
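A minimal sketch of consolidation, assuming each source is a list of client rows and that records with the same id refer to the same client (a simplification; real consolidation needs proper record matching). The table and column names are hypothetical.

```python
import sqlite3

# Hypothetical contents of two legacy sources; client c2 appears in both.
source_1 = [("c1", "Alice"), ("c2", "Bob")]
source_2 = [("c2", "Bob"), ("c3", "Carol")]


def consolidate(*sources):
    """One-off migration: copy every source into a single new database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE client (id TEXT PRIMARY KEY, name TEXT)")
    for rows in sources:
        # Rows whose id is already present are skipped.
        db.executemany("INSERT OR IGNORE INTO client VALUES (?, ?)", rows)
    db.commit()
    return db


unique_db = consolidate(source_1, source_2)
clients = unique_db.execute("SELECT id, name FROM client ORDER BY id").fetchall()
```

After the migration the applications would work against the unique database and the original sources could be retired.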
Information Integration Architectures
Data Warehouse
[Figure: Source 1, Source 2, …, Source n feed, through middleware, a new unique database (the Data Warehouse). New architecture: periodically updated.]
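The periodic update can be sketched as an append-only load, assuming each load is stamped with its date so that earlier snapshots are kept. The `sales_fact` table, the source names, and the figures are invented for illustration.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (load_date TEXT, source TEXT, product TEXT, amount REAL)"
)


def periodic_load(load_date, extracts):
    """Append one snapshot per source; earlier snapshots are never deleted."""
    for source, rows in extracts.items():
        warehouse.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
            [(load_date, source, product, amount) for product, amount in rows],
        )
    warehouse.commit()


# Two monthly loads from two hypothetical sources.
periodic_load("2013-01-31", {"retail": [("book", 10.0)], "online": [("book", 4.0)]})
periodic_load("2013-02-28", {"retail": [("book", 12.0)], "online": [("book", 6.0)]})

# Historical query across all loads.
totals = warehouse.execute(
    "SELECT load_date, SUM(amount) FROM sales_fact "
    "GROUP BY load_date ORDER BY load_date"
).fetchall()
```

The sources stay in operation; the warehouse only reflects their state as of the last load.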
Information Integration Architectures
Virtual Data Integration
[Figure: Source 1, Source 2, …, Source n each expose a Local schema; a Mediator maps the local schemas to a Global schema.]
New architecture: no new database!
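A mediator can be sketched as a function that answers queries over the global schema by translating rows from each source at query time, without storing anything. The two sources, their attribute names, and the mappings below are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Two hypothetical sources with different local schemas.
source_1 = [{"cust_name": "Alice", "town": "Ramallah"}]
source_2 = [{"fullName": "Bob", "city": "Birzeit"}]

SOURCES: Dict[str, List[dict]] = {"source_1": source_1, "source_2": source_2}

# Mapping from each local schema to the global schema (name, city).
MAPPINGS: Dict[str, Dict[str, str]] = {
    "source_1": {"cust_name": "name", "town": "city"},
    "source_2": {"fullName": "name", "city": "city"},
}


def query_global(predicate: Callable[[dict], bool]) -> List[dict]:
    """Mediator: evaluate a query on the global schema against live sources."""
    results = []
    for source_name, rows in SOURCES.items():
        mapping = MAPPINGS[source_name]
        for row in rows:
            # Rename local attributes to their global counterparts.
            global_row = {mapping[attr]: value for attr, value in row.items()}
            if predicate(global_row):
                results.append(global_row)
    return results


before = query_global(lambda r: r["city"] == "Birzeit")

# A change in a source is visible to the next query; nothing is reloaded.
source_1.append({"cust_name": "Carol", "town": "Birzeit"})
after = query_global(lambda r: r["city"] == "Birzeit")
```

Because the data stays in the sources, queries always see current values, at the price of translating and combining rows on every query.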
The integration problem…
[Figure: Source 1, Source 2, Source 3, Source 4, …, Source n, holding for example Registry of clients 1, Registry of clients 2, Retail sales, On line sales, and Other data, are to be combined in a new architecture.]
Which kind of integration?
How to decide?
Criteria to be adopted
• Autonomy, the degree of independence between the different
database administrators in their design choices;
• Relevance of historical data, and consequent need to
periodically store new data without deleting the old ones;
• Query complexity, in terms of amount of data and tables visited
and number of operators on them, and consequent time
complexity in query execution;
• Relevance of currency in queries, the need for queries to extract
current data;
• Economic value of integration, the relevance of having
integrated information as input to operational and
decision-making business processes in order to produce effective outputs.
Criteria to be adopted
• Volatility of sources, frequency of adding or deleting sources,
and frequency of change of source schemas;
• Relevance of queries w.r.t. transactions, relative importance and
frequency of queries with respect to changes in data;
• Management complexity, the effort to be spent in management
activities related to databases and hardware and software infrastructures, due to
the corresponding complexity of the organizations using the
databases;
• Costs of heterogeneity, hidden and explicit costs related to
business processes that are due to making use of
heterogeneous data.
References and Acknowledgements
• Carlo Batini: Course on Data Integration. BZU IT Summer School 2011.
• Stefano Spaccapietra: Information Integration. Presentation at the IFIP
Academy. Porto Alegre. 2005.
• Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI
International, Artificial Intelligence Center. Menlo Park, USA. 2009.
Appreciation is extended to Anton Deik for his help in preparing this lecture.
