Dear students get fully solved assignments
Send your semester & Specialization name to our mail id :
“ help.mbaassignments@gmail.com ”
or
Call us at : 08263069601
ASSIGNMENT
PROGRAM Master of Science in Information Technology (MScIT) Revised Fall 2011
SEMESTER 4
SUBJECT CODE & NAME MIT401 – Data Warehousing and Data Mining
CREDIT 4
BK ID B1633
MAX.MARKS 60
Note: Answer all questions. Answers to 10-mark questions should be approximately 400 words.
Each question is followed by an evaluation scheme.
Q1 Differentiate between OLTP and Data Warehouses.
Answer: The data warehouse and the OLTP database are both relational databases. However, the
objectives of these two databases are different.
The OLTP database records transactions in real time and aims to automate the clerical data entry
processes of a business entity. Addition, modification and deletion of data in the OLTP database are
essential, and the semantics of the front-end application affect how the data is organized in the
database.
The data warehouse, on the other hand, does not cater to the real-time operational requirements of the
enterprise. It is more a storehouse of current and historical
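The contrast in Q1 can be sketched with a small in-memory example. The table names, columns and rows below are hypothetical and only for illustration: the first table is modified row by row as transactions occur, while the second is loaded in bulk and then only read for summaries over current and historical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP side (hypothetical schema): rows are inserted, updated and deleted
# one at a time as business transactions happen.
cur.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
    "order_date TEXT, amount REAL)"
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "2011-01-15", 120.0),
        (2, "bob", "2011-01-20", 80.0),
        (3, "alice", "2011-02-03", 50.0),
    ],
)
cur.execute("UPDATE orders SET amount = 90.0 WHERE order_id = 2")
cur.execute("DELETE FROM orders WHERE order_id = 3")
conn.commit()

# Warehouse side (hypothetical schema): data is loaded periodically from the
# operational table and kept alongside rows from earlier loads.
cur.execute("CREATE TABLE sales_history (month TEXT, customer TEXT, amount REAL)")
cur.execute(
    "INSERT INTO sales_history "
    "SELECT substr(order_date, 1, 7), customer, amount FROM orders"
)
cur.executemany(
    "INSERT INTO sales_history VALUES (?, ?, ?)",
    [("2010-12", "alice", 200.0), ("2010-12", "bob", 40.0)],  # earlier load
)
conn.commit()

# Analytical query: read-only aggregation across current and historical rows.
monthly_totals = cur.execute(
    "SELECT month, SUM(amount) FROM sales_history GROUP BY month ORDER BY month"
).fetchall()
print(monthly_totals)
```

A real warehouse would normally live in a separate system with its own load process; a single SQLite connection is used here only to keep the sketch self-contained.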
Q2 Explain the Data Warehouse Kimball life cycle.
Answer: The Kimball Lifecycle methodology was conceived during the mid-1980s by members of the
Kimball Group and other colleagues at Metaphor Computer Systems, a pioneering decision support
company. Since then, it has been successfully utilized by thousands of data warehouse and business
intelligence (DW/BI) project teams across virtually
Q3 Describe Hypercube and Multicube.
Answer: Multidimensional databases can present their data to an application using two types of cubes:
hypercubes and multicubes. In the hypercube model, as shown in the following illustration, all data
appears logically as a single cube. All parts of the manifold represented by this hypercube have identical
dimensionality.
In the multicube model, data is segmented into a set of smaller cubes, each of which is composed of a
subset of the available dimensions, as shown in the following illustration:
Hypercubes and multicubes differ in terms of available
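The two models in Q3 can be compared with a minimal sketch. The dimension names, members and values are hypothetical, and treating the measure as one more dimension of the hypercube is an assumption of this example rather than something the answer states.

```python
from itertools import product

# Hypothetical dimensions and their members.
products = ["laptop", "phone"]
regions = ["north", "south"]
months = ["jan", "feb"]
measures = ["sales", "headcount"]

# Hypercube: one logical cube in which every cell is addressed by ALL
# dimensions, so every cell has the same dimensionality.
hypercube = {key: 0.0 for key in product(products, regions, months, measures)}
hypercube[("laptop", "north", "jan", "sales")] = 1200.0
# Headcount does not depend on product, yet each cell still needs a product
# coordinate, so the value is repeated for every product.
for p in products:
    hypercube[(p, "north", "jan", "headcount")] = 5.0

# Multicube: a set of smaller cubes, each built over only the dimensions
# that are relevant to it.
multicube = {
    "sales": {key: 0.0 for key in product(products, regions, months)},
    "headcount": {key: 0.0 for key in product(regions, months)},
}
multicube["sales"][("laptop", "north", "jan")] = 1200.0
multicube["headcount"][("north", "jan")] = 5.0

hypercube_cells = len(hypercube)
multicube_cells = sum(len(cube) for cube in multicube.values())
print(hypercube_cells, multicube_cells)
```

In this toy case the multicube needs fewer cells because the headcount cube omits the product dimension, while the hypercube offers a single uniform addressing scheme.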
Q4 List and explain the Strategies for data reduction.
Answer: Data reduction is the process of minimizing the amount of data that needs to be stored in a
data storage environment. Data reduction can increase storage efficiency and reduce costs.
Strategies for data reduction:
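As a hedged illustration of the definition above, the sketch below reduces a set of detailed records by aggregation, which is only one possible way to store less data and is chosen here as an assumption. The sensor readings are hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed readings: (sensor, day, value).
readings = [
    ("s1", "2011-01-01", 10.0),
    ("s1", "2011-01-01", 14.0),
    ("s1", "2011-01-02", 12.0),
    ("s2", "2011-01-01", 7.0),
    ("s2", "2011-01-01", 9.0),
    ("s2", "2011-01-01", 8.0),
]

# Accumulate a count and a running sum per (sensor, day).
summary = defaultdict(lambda: [0, 0.0])
for sensor, day, value in readings:
    cell = summary[(sensor, day)]
    cell[0] += 1
    cell[1] += value

# Reduced form: one row per (sensor, day) holding the count and the mean.
reduced = {key: (count, total / count) for key, (count, total) in summary.items()}
print(len(readings), "rows reduced to", len(reduced))
```

The reduced table is smaller, but individual readings can no longer be recovered from it, so this kind of reduction suits data that is only queried at the aggregated level.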
TAKE ADVANTAGE OF EXISTING INFORMATION: First of all, we don't want to reinvent the wheel.
There's a lot of existing information out there for community health coalitions to take advantage of.
Know your community's history! Has this initiative or something similar been tried here before? Even a
failed attempt has valuable information to offer. Take advantage of existing knowledge on risk reduction
before you work like crazy to come up with strategies and
Q5 Describe the K-means method for clustering. List its advantages and drawbacks.
Answer: k-means clustering is a method of vector quantization, originally from signal processing, that is
popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k
clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype
of the cluster. This results in a partitioning of the data space into Voronoi cells.
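The definition above can be made concrete with a small sketch. It uses the common alternating heuristic (assign each point to its nearest mean, then recompute the means), often called Lloyd's algorithm; the answer does not name a specific algorithm, so that choice, the function names, the random initialization and the sample points are all assumptions of this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Partition points into k clusters; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial means
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the means are stable
        assignment = new_assignment
        # Update step: each mean becomes the centroid of its current members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous mean
                centroids[j] = mean(members)
    return centroids, assignment


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment)
```

The heuristic converges to a local optimum that depends on the initial means, which is why practical use often repeats the run with several seeds and keeps the best partition.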
The problem is computationally difficult (NP-hard);
Q6 Describe Multilevel Databases and Web Query Systems.
Answer: Multilevel Databases: The main idea behind this approach is that the lowest level of the
database contains semi-structured information stored in various Web repositories, such as hypertext
documents. At the higher level(s), metadata or generalizations are extracted from lower levels and
organized in structured collections, i.e. relational or object-oriented databases. For example, Han et al.
use a multilayered database where each layer is obtained via generalization and transformation
operations performed on the lower layers. Kholsa et al. propose the creation and maintenance of
meta-databases at each information-providing domain and the use of a global schema for the
meta-database. King & Novak propose the incremental integration of a portion of the schema from each
information source, rather than relying on a global heterogeneous database schema. The
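The layering idea in the multilevel approach can be sketched as follows. This is a toy illustration only, not the method of any of the cited authors: the documents, field names and the `generalize` and `find_by_keyword` helpers are hypothetical.

```python
# Layer 0: semi-structured Web documents (hypothetical records whose fields
# vary from one document to the next).
layer0 = [
    {"url": "http://example.org/a.html", "title": "Intro to OLAP",
     "text": "cube cube rollup", "author": "R. Rao"},
    {"url": "http://example.org/b.html",
     "text": "k-means clustering of web logs"},
    {"url": "http://example.org/c.html", "title": "Data cubes",
     "text": "cube slice dice"},
]


def generalize(doc):
    """Transform one raw document into a fixed-schema metadata row."""
    words = doc["text"].lower().split()
    return {
        "url": doc["url"],
        "title": doc.get("title", "(untitled)"),
        "word_count": len(words),
        "keywords": sorted(set(words)),
    }


# Layer 1: a structured collection obtained by generalizing the lower layer.
layer1 = [generalize(doc) for doc in layer0]


def find_by_keyword(rows, keyword):
    """Query the structured layer instead of scanning the raw documents."""
    return [row["url"] for row in rows if keyword in row["keywords"]]


hits = find_by_keyword(layer1, "cube")
print(hits)
```

Queries run against the uniform upper layer, and the raw documents are consulted only when the full content is needed.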