SUBSPACE LEARNING AND IMPUTATION FOR STREAMING
BIG DATA MATRICES AND TENSORS
ABSTRACT
Extracting latent low-dimensional structure from high-dimensional data is of paramount
importance in timely inference tasks encountered with “Big Data” analytics. However,
increasingly noisy, heterogeneous, and incomplete datasets, as well as the need for real-time
processing of streaming data, pose major challenges to this end. In this context, the present
paper extends the benefits of rank minimization to scalable imputation of missing data, by
tracking low-dimensional subspaces and unraveling latent (possibly multi-way) structure from
incomplete streaming data. For low-rank matrix data, a subspace estimator is proposed based on
an exponentially weighted least-squares criterion regularized with the nuclear norm. After
recasting the nonseparable nuclear norm into a form amenable to online optimization, real-time
algorithms with complementary strengths are developed, and their convergence is established
under simplifying technical assumptions. In a stationary setting, the asymptotic estimates
obtained offer the well-documented performance guarantees of the batch nuclear-norm
regularized estimator. Under the same unifying framework, a novel online (adaptive) algorithm
is developed to obtain multi-way decompositions of low-rank tensors with missing entries and
perform imputation as a byproduct. Simulated tests with both synthetic and real Internet
and cardiac magnetic resonance imagery (MRI) data confirm the efficacy of the proposed
algorithms and their superior performance relative to state-of-the-art alternatives.
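
The matrix part of the approach can be illustrated with a small sketch. It assumes that the separable form alluded to above is the standard factorization identity, in which the nuclear norm of X equals the minimum of (||L||_F^2 + ||Q||_F^2)/2 over all factorizations X = L Q^T. Under that assumption, each incoming column with missing entries can be handled by a ridge regression for its coefficients q followed by a gradient step on the subspace L. The function name `track_subspace`, the normalized step size, and all parameter values below are hypothetical choices made for illustration; this is not the paper's algorithm, and the exponential weighting of past data is left out for brevity.

```python
import numpy as np

def track_subspace(Y, mask, rank, lam=0.1, step=0.5, seed=0):
    """Illustrative online subspace tracking with missing entries.

    Y    : (P, T) data matrix; values at unobserved positions are ignored.
    mask : (P, T) boolean array, True where an entry is observed.
    Returns the subspace estimate L (P, rank) and the imputed matrix.
    """
    rng = np.random.default_rng(seed)
    P, T = Y.shape
    L = rng.standard_normal((P, rank)) / np.sqrt(P)  # random initial subspace
    Y_hat = np.zeros((P, T))
    for t in range(T):
        obs = mask[:, t]
        L_obs, y_obs = L[obs], Y[obs, t]
        # Ridge regression for the coefficients of column t; the lam * I term
        # comes from the Frobenius penalty on Q in the factorized surrogate.
        q = np.linalg.solve(L_obs.T @ L_obs + lam * np.eye(rank), L_obs.T @ y_obs)
        # Impute: keep observed entries, fill the rest with the prediction L q.
        Y_hat[:, t] = np.where(obs, Y[:, t], L @ q)
        # Normalized gradient step on the observed rows of L (assumed step rule).
        resid = y_obs - L_obs @ q
        L[obs] += step * np.outer(resid, q) / (q @ q + 1e-12)
        # Mild shrinkage standing in for the Frobenius penalty on L (assumed scaling).
        L *= 1.0 - step * lam / (t + 1)
    return L, Y_hat

# Usage on synthetic rank-2 data with about half of the entries missing.
rng = np.random.default_rng(1)
P, T, r = 40, 3000, 2
Y_true = rng.standard_normal((P, r)) @ rng.standard_normal((r, T))
mask = rng.random((P, T)) < 0.5
L_est, Y_imputed = track_subspace(np.where(mask, Y_true, 0.0), mask, rank=r)
```

Each column is processed once and then discarded, so the per-column cost depends on the ambient dimension and the rank but not on the number of columns seen so far, which is what makes this kind of update suitable for streaming data.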
