A HOLISTIC APPROACH TO DISTRIBUTED DIMENSIONALITY REDUCTION OF BIG DATA

With the exponential growth of data volume, big data have placed an unprecedented
burden on current computing infrastructure.
Dimensionality reduction of big data has attracted a great deal of attention in recent years
as an efficient method to extract the core data, which are smaller to store and faster to
process.
This paper addresses three fundamental problems closely related to
distributed dimensionality reduction of big data: big data fusion, the dimensionality
reduction algorithm, and construction of a distributed computing platform.
A chunk tensor method is presented to fuse unstructured, semi-structured and
structured data into a unified model in which all characteristics of the heterogeneous
data are appropriately arranged along the tensor orders.

A Lanczos-based High Order Singular Value Decomposition (L-HOSVD) algorithm is
proposed to reduce the dimensionality of the unified model.
Theoretical analyses of the algorithm are provided in terms of storage
scheme, convergence property and computation cost.
To execute the dimensionality reduction task, this paper employs the
Transparent Computing paradigm to construct a distributed computing
platform and uses a linear predictive model to partition the data
blocks.
Experimental results demonstrate that the proposed holistic approach is
efficient for distributed dimensionality reduction of big data.
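
The Lanczos step behind a Lanczos-based HOSVD can be illustrated with a minimal sketch. It assumes the standard symmetric Lanczos iteration with full reorthogonalization, applied to a small dense symmetric matrix. The names `lanczos` and `dense_matvec`, the example matrix and the random start vector are illustrative assumptions, not the paper's implementation. In an HOSVD setting the symmetric operator would typically be the Gram matrix of a mode-n unfolding of the tensor.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def lanczos(matvec, n, k, seed=0):
    """Run up to k Lanczos steps on a symmetric n-by-n operator.

    Returns (alphas, betas, basis): the diagonal and off-diagonal entries of
    the tridiagonal matrix T, plus the orthonormal Lanczos vectors.
    """
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(q, q))
    basis = [[x / norm for x in q]]
    alphas, betas = [], []
    for j in range(k):
        w = matvec(basis[j])
        alphas.append(dot(w, basis[j]))
        # Full reorthogonalization (two passes) against every earlier vector.
        # In exact arithmetic this equals the three-term recurrence; it is
        # used here only to keep the basis orthogonal in floating point.
        for _ in range(2):
            for v in basis:
                c = dot(w, v)
                w = [wi - c * vi for wi, vi in zip(w, v)]
        beta = math.sqrt(dot(w, w))
        if j == k - 1 or beta < 1e-12:
            break  # last requested step, or an invariant subspace was found
        betas.append(beta)
        basis.append([x / beta for x in w])
    return alphas, betas, basis


# Hypothetical 4x4 symmetric matrix standing in for a Gram matrix.
A = [
    [4.0, 1.0, 2.0, 0.0],
    [1.0, 3.0, 0.0, 1.0],
    [2.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
]


def dense_matvec(x):
    """Multiply the example matrix A by the vector x."""
    return [dot(row, x) for row in A]


alphas, betas, Q = lanczos(dense_matvec, n=4, k=4)
```

The iteration only touches the matrix through `matvec`, so the same routine works with any storage scheme that can multiply the symmetric matrix by a vector. The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` approximate those of the original operator.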


 Decomposition
 Storage Scheme for Symmetric Matrix during Lanczos Iteration
 Convergence and Accuracy of the L-HOSVD Algorithm
 Computation Cost and Memory Usage
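
To make the storage topic above concrete, here is a minimal sketch of one common way to store a symmetric matrix: packed upper-triangular storage, which keeps n(n+1)/2 entries instead of n*n. This is a generic scheme chosen for illustration; the scheme analysed in the paper may differ, and the names `pack_upper`, `packed_index` and `packed_matvec` are hypothetical.

```python
def pack_upper(a):
    """Store the upper triangle of a symmetric matrix row by row."""
    n = len(a)
    return [a[i][j] for i in range(n) for j in range(i, n)]


def packed_index(i, j, n):
    """Position of entry (i, j) inside the packed upper triangle."""
    if i > j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
    # Rows 0..i-1 hold n, n-1, ..., n-i+1 entries respectively.
    return i * n - i * (i - 1) // 2 + (j - i)


def packed_matvec(packed, x):
    """Compute y = S x using only the packed upper triangle of S."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(i, n):
            v = packed[packed_index(i, j, n)]
            y[i] += v * x[j]
            if i != j:
                y[j] += v * x[i]  # mirrored lower-triangle contribution
    return y


# Hypothetical 3x3 symmetric matrix and vector.
S = [[4, 1, 2], [1, 3, 0], [2, 0, 5]]
packed = pack_upper(S)
y = packed_matvec(packed, [1, 2, 3])
```

A Lanczos iteration needs only matrix-vector products, so a routine like `packed_matvec` is enough to run it without ever expanding the full matrix.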

This paper provides a holistic approach to distributed dimensionality reduction of
big data. First, a chunk tensor model is proposed to fuse heterogeneous data from
multiple sources into a unified tensor model.
Concepts and operations of the chunk tensor model are established in this paper.
Second, a Lanczos-based High Order Singular Value Decomposition (L-HOSVD) algorithm
is proposed to obtain the core data, which are small but contain valuable information.
The storage and convergence properties of the L-HOSVD algorithm are studied.
Third, the transparent computing paradigm is employed to construct a distributed
computing platform, and a linear predictive model is used to partition and distribute
data blocks to autonomic devices.
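
The last step, partitioning data blocks with a linear predictive model, might look like the following sketch. It assumes that each device's processing time is modelled as a straight line in the number of blocks, fitted by ordinary least squares from a few profiling runs, and that blocks are then shared out in proportion to the predicted speed. The profiling numbers and the names `fit_line` and `partition_blocks` are hypothetical; this is an illustration under those assumptions, not the model used in the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def partition_blocks(total_blocks, slopes):
    """Split blocks across devices in proportion to predicted speed.

    slopes[d] is the fitted time per block on device d, so 1 / slopes[d]
    is taken as that device's predicted throughput.
    """
    speeds = [1.0 / s for s in slopes]
    total_speed = sum(speeds)
    shares = [int(total_blocks * s / total_speed) for s in speeds]
    # Hand any blocks lost to rounding down to the fastest devices first.
    leftover = total_blocks - sum(shares)
    fastest_first = sorted(range(len(speeds)), key=lambda d: speeds[d], reverse=True)
    for d in fastest_first[:leftover]:
        shares[d] += 1
    return shares


# Hypothetical profiling data: blocks processed vs. seconds taken.
blocks = [1, 2, 3, 4]
times_a = [2.0, 4.0, 6.0, 8.0]   # slower device
times_b = [1.5, 2.5, 3.5, 4.5]   # faster device with a fixed overhead

_, slope_a = fit_line(blocks, times_a)
_, slope_b = fit_line(blocks, times_b)
shares = partition_blocks(12, [slope_a, slope_b])
```

This sketch uses only the fitted slope to split the work and ignores the intercept, which represents fixed per-device overhead. A fuller scheduler could use both to equalise the predicted finish times.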

