Computing for Data Analysis: Theory and Practices 1st Edition Sanjay Chakraborty


Sanjay Chakraborty · Lopamudra Dey
Data-Intensive Research
Computing for
Data Analysis:
Theory and
Practices

Series Editors
Nilanjan Dey, Techno International New Town, Kolkata, West Bengal, India
Bijaya Ketan Panigrahi, Indian Institute of Technology Delhi, New Delhi, India
Vincenzo Piuri, University of Milan, Milano, Italy

This book series provides a comprehensive and up-to-date collection of research
and experimental works, summarizing state-of-the-art developments in the fields
of data science and engineering. It covers the trends, technologies, and
state-of-the-art research related to data collection, storage, representation,
visualization, processing, interpretation, analysis, and management, including
concepts, taxonomy, techniques, designs, approaches, systems, algorithms, tools,
engines, applications, best practices, bottlenecks, perspectives, policies,
properties, practicalities, quality control, usage, validation, workflows,
assessment, evaluation, metrics, and more.
The series will publish monographs, edited volumes, textbooks and proceedings
of important conferences, symposia and meetings in the field of autonomic and
data-driven computing.


Sanjay Chakraborty
Department of Computer Science
and Engineering
Techno International New Town
Kolkata, West Bengal, India
Lopamudra Dey
Department of Computer Science
and Engineering
Heritage Institute of Technology
Kolkata, West Bengal, India
ISSN 2731-555X ISSN 2731-5568 (electronic)
ISBN 978-981-19-8003-9 ISBN 978-981-19-8004-6 (eBook)
https://doi.org/10.1007/978-981-19-8004-6
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

To our Parents, Sister and my Son Arohan for
their love and inspiration.
—Dr. Sanjay Chakraborty
—Dr. Lopamudra Dey

Preface
Data Analytics (DA) is the practice of analyzing data collections to identify trends
and draw conclusions about the information they contain. It matters because it helps
enterprises optimize their performance: by incorporating analytics into their business
strategy, firms can find more cost-effective ways to do business, make use of the
large amounts of data they retain, and cut expenses. Data analytics is increasingly
carried out with the aid of specialized hardware and software.
This book covers various cutting-edge computing technologies and their
applications to data. We discuss big data and cloud computing, the Internet of
Things, augmented and virtual reality, quantum computing, cognitive computing, and
computational biology in depth, with respect to different kinds of data analysis and
their applications. We describe some interesting models in the cloud, IoT, AR/VR,
quantum, cognitive, and computational biology domains that have a useful impact on
intelligent data analysis (bulk time series data, emotional data, image data, etc.).
We also explain how data analysis approaches based on these computing technologies
are used in various real-life applications. We believe this book will benefit readers
who are interested in working in these areas in the future.
Chapter 1 gives an overall introduction to the basics of big data analytics, cloud
data analytics, quantum and IoT-based data analytics, biological data analytics, and
so on. It briefly describes the impact of data analysis on all these frameworks.
Then, under Part I, Chap. 2 discusses the roles and different techniques of big
data analysis on a cloud platform. It first discusses the various types of big data
analysis and how such analysis can be performed in the Hadoop architecture through
a cloud framework. To find patterns in data and derive fresh insights, cloud analytics
combines scalable cloud computing with robust analytical tools. Data analysis is
being used by corporations to gain a competitive edge, enhance scientific research,
and improve people’s lives in a variety of ways. The chapter therefore describes the
basics of the cloud, its different models and architectures, and explains how they
support effective data analysis.
Chapters 3 and 4 focus on edge computing with the notions of the Internet of
Things (IoT) and augmented/virtual reality (AR/VR). Chapter 3 introduces the basic
concepts of IoT along with the related technologies, protocols, and architecture.
It then describes the impact of IoT on various industrial applications and on big
data analysis in a cloud framework. Chapter 4 discusses the types of augmented
reality along with some specific system architectures. It also explains the
different hardware and software components of AR/VR systems and presents real-life
applications of data analysis and future research directions in this area.
Chapter 5, under Part II, takes a more in-depth look at data analysis in the
biocomputing domain. Here we discuss the basic concepts of computational biology
and its various data types. The chapter also describes the different data analysis
processes for DNA/RNA sequences, microarray data sequences, and protein sequences.
Chapter 6 discusses data analysis through cognitive computing. In this chapter,
we describe the basics of brain–computer interfacing techniques for feature extraction
and its various components. It also presents the methodology for the classification
of emotional data through the analysis of EEG signals collected from the human
brain. This has important applications for people who are distressed due to
work pressure or other issues in their day-to-day lives.
In Part III, Chaps. 7 and 8 deal with the concepts of quantum computing that
help to perform various machine learning and image processing operations on
real-life data and image matrices. Chapter 7 discusses the basics of quantum
machine learning and how it can be used to solve some complex clustering and
classification problems more efficiently than classical computing. Similarly,
Chap. 8 discusses two important and complex image processing operations (denoising
and edge detection) that can be performed faster and more efficiently in the
quantum framework.
Finally, Chap. 9 under Part IV summarizes the concepts presented in this book and
discusses applications and trends in data analysis. Social impacts of data analysis,
such as privacy and data security issues, are discussed, in addition to challenging
research issues.

This book has several strong features that set it apart from other texts on computing
for data analysis. It presents very broad yet in-depth coverage of the spectrum of data
analysis over various popular computing domains, especially regarding several recent
research topics on data computing.
Dr. Sanjay Chakraborty
Associate Professor
Department of Computer Science and Engineering
Techno International New Town
Kolkata, India
Dr. Lopamudra Dey
Assistant Professor
Department of Computer Science and Engineering
Heritage Institute of Technology
Kolkata, India

Acknowledgements
We express our great pleasure, sincere thanks, and gratitude to the people who
significantly helped, contributed, and supported the completion of this book. We
are sincerely thankful to Dr. Radha Tamal Goswami, Professor and Director, Techno
International Newtown, Kolkata, India, for his encouragement, support, guidance,
advice, and suggestions to complete this book. Our sincere thanks to Dr. Amlan
Chakrabarti, Professor and Head, AKCSIT, University of Calcutta, India, and Dr.
Anirban Mukhopadhyay, Professor, Department of Computer Science and Engi-
neering, University of Kalyani, Kalyani, India, for their continuous support, advice,
and cordial guidance from the beginning to the completion of this book.
We would also like to express our honest appreciation to our colleagues at the
Techno International Newtown, India, and Heritage Institute of Technology, Kolkata,
for their guidance and support.
We are also very thankful to the reviewers for reviewing the book chapters. This
book would not have been possible without their continuous support and commitment
toward completing the review on time.
To complete this book, the entire staff at Springer extended their kind cooperation,
timely response, expert comments, and guidance, and we are very thankful to them.
Finally, we sincerely express our special and heartfelt respect, gratitude, and
gratefulness to our family members and parents for their endless support and
blessings.
Kolkata, India Dr. Sanjay Chakraborty
Dr. Lopamudra Dey

Contents
1 Introduction
1.1 Data and Analysis
1.1.1 Types of Data
1.1.2 Analysis of Data
1.2 Big Data and Data Analytics
1.2.1 Big Data Architecture and Data Analysis
1.3 Cloud Computing and Data Analysis
1.4 Internet of Things (IoT) and Data Analysis
1.5 AR/VR and Data Analysis
1.6 Biological Computing and Data Analysis
1.6.1 Steps in Data Analysis
1.7 Cognitive Computing and Data Analysis
1.8 Quantum Computing and Data Analysis
1.8.1 Quantum-Inspired Data Analytics
1.9 Conclusion
References
Part I Integration of Cloud, Internet of Things, Virtual Reality and Big Data Analytics
2 Impact of Big Data and Cloud Computing on Data Analysis
2.1 Big Data Architecture with Hadoop and MapReduce
2.1.1 Hadoop Architecture
2.2 Big Data Analytics: Emerging Applications in Industry
2.3 Cloud Computing: Definition, Models, and Architectures
2.4 Comparison of Cloud with Other Computing
2.4.1 Cloud Versus Grid Versus Utility Computing
2.4.2 Cloud Models
2.4.3 Cloud Architecture
2.5 Load Balancing and Virtualization in Cloud Computing
2.5.1 Virtualization
2.5.2 Load Balancing
2.6 Cloud Computing Systems for Data-Intensive Applications
2.7 Analytical and Perspective Approach of Big Data in Cloud Computing
2.8 Conclusion
References
3 Edge Computing with Internet of Things (IoT) and Data Analysis
3.1 Introduction
3.2 Related Technologies, Architectures, and Protocols of IoT
3.2.1 IoT Architectures
3.3 Industry Applications of IoT
3.4 Big Data Analytics via IoT with Cloud Service
3.4.1 Data Acquisition, Preprocessing, and Storage
3.4.2 Computing in Cloud Framework for IoT
3.5 Conclusion
References
4 Virtual and Augmented Reality with Embedded Systems
4.1 Introduction
4.2 Types of Augmented Reality Systems
4.3 Overview of Augmented Reality System Organization
4.3.1 History of Augmented Reality
4.3.2 Embedded Systems Design Approaches
4.3.3 Custom AR
4.4 Augmented Reality Components
4.4.1 Hardware Components
4.4.2 Required Software
4.4.3 Remote Servers
4.5 Relation of 5G/6G with AR/VR Systems
4.6 Applications and Future Research Directions
4.6.1 Applications in AR
4.6.2 Future Research Directions
4.7 Conclusion
References
Part II Biological Applications of Data Analytics
5 Computational Biology Toward Data Analysis
5.1 Introduction
5.2 History of Computational Biology
5.3 Biological Data Types
5.4 Biological Databases
5.5 Data Analysis
5.5.1 DNA/RNA Sequence Data Analysis
5.5.2 Microarray Data Analysis and Preprocessing
5.5.3 Protein Sequences Data Analysis
5.6 Conclusion
References
6 Data Classification Through Cognitive Computing
6.1 Introduction
6.2 Background
6.2.1 Basic Components of BCI
6.2.2 Feature Extraction
6.2.3 Feature Classification
6.3 Result Analysis
6.4 Open Research Problems
6.5 EEG Signal-Based Emotional Data Classification
6.5.1 Introduction
6.5.2 Methodology
6.5.3 Result Analysis
6.6 Conclusion
References
Part III Quantum Computing for Data Analysis
7 Quantum Computing in Machine Learning
7.1 Introduction
7.2 Quantum Hybrid Data Clustering
7.2.1 Methodology: Pseudo-steps of Proposed Quantum Clustering Process
7.2.2 Analysis of Results
7.2.3 Computational Complexity Analysis
7.2.4 Pros and Cons
7.3 Quantum Hybrid Feature Subset Selection
7.3.1 Methodology (HQFSA)
7.3.2 Result Analysis
7.3.3 Complexity Analysis
7.4 Conclusion
References
8 Quantum Computing in Image Processing
8.1 Introduction
8.2 Quantum Image Denoising
8.2.1 Background
8.2.2 Methodology
8.3 Quantum Image Edge Detection
8.3.1 Background
8.4 Conclusion
References
Part IV Computations for Various Data Applications and Future Work
9 Challenges and Future Research Directions on Data Computation
9.1 Introduction
9.2 Challenges and Future Research Directions for Big Data Analysis
9.2.1 Challenges
9.2.2 Future Research Directions
9.3 Challenges and Future Research Directions for IoT Data Analysis
9.4 Challenges and Future Research Directions for AR–VR Embedded Data Analysis
9.5 Challenges and Future Research Directions for Big Biological Data Analysis
9.6 Challenges and Future Research Directions for Quantum Data Analysis
9.7 Conclusion
References

About the Authors
Dr. Sanjay Chakraborty is currently an Associate Professor in the Department of
Computer Science and Engineering, Techno International New Town, Kolkata, India.
He received his B.Tech. in Information Technology from West Bengal University of
Technology, India, in 2009 and his Master of Technology (M.Tech.) from the National
Institute of Technology, Raipur, India, in 2011. He completed his Ph.D. at AKCSIT,
University of Calcutta, in 2022. Dr. Chakraborty is the recipient of the University
Silver Medal from NIT Raipur in 2011 for ranking first class second in M.Tech. He
has 11 years of teaching and research experience. He has published over 55 research
papers in various international journals, conferences, and book chapters. He has
authored two books, published by Lap Lambert, Germany, and in the Springer EAI
series, respectively. Dr. Chakraborty has attended many national and international
conferences in India and abroad. His research interests include Data Mining,
Machine Learning, and Quantum Computing. He is a professional member of IAENG
and UACEE. Dr. Chakraborty is an active member of the board of reviewers of
various international journals, transactions, and conferences. He is the recipient of
the “INNOVATION AWARD” for outstanding achievement in the field of innovation
from Techno India Institution’s Innovation Council in 2019. He is also the recipient
of the “IEEE Young Professional Best Paper Award” in 2017. He also received a
top-five best paper recognition from Ain Shams Engineering Journal, Elsevier, and a
most cited author award from Biomedical Journal, Elsevier, in 2021.
Dr. Lopamudra Dey received her B.Tech. in Computer Science and Engineering from
West Bengal University of Technology, Kolkata, India, in 2009, earning a bronze
medal in her bachelor’s degree. In 2011, she completed her M.Tech. at the University
of Kalyani, West Bengal, India. She obtained her Ph.D. in Computer Science from
Kalyani University in 2021. She is working as an Assistant Professor in the
Department of Computer Science and Engineering at Heritage Institute of Technology,
Kolkata, India. Her areas of interest include Bioinformatics, Data Mining, and
Network Security. She has published more than 15 research articles in journals,
conferences, and books.

Chapter 1
Introduction
1.1 Data and Analysis
Data, which is shorthand for “information”, has always been gathered, reviewed,
and/or analyzed as part of the running of the Head Start program. For children
to enroll in the program, numerous pieces of information are needed. The delivery
of health and dental services includes information from screenings and any
subsequent services. Every aspect of a Head Start program, including content and
management, requires gathering and using a significant amount of information [1].
Whether or not they identify as “data analysts”, everyone in today’s world must
cope with mountains of data. However, those who have a toolbox of data analysis
skills have a huge advantage over everyone else because they know what to do with
all that information. They are skilled at turning data into knowledge that motivates
practical action, and at deconstructing and organizing complicated problems and
datasets to get at the root of issues in their industry.
1.1.1 Types of Data
The relative benefits of quantitative and qualitative data have been the subject of
a protracted argument in the research community. Key factors in this discussion
include the researchers’ educational backgrounds, compounded by individual
differences and people’s preferences for relating to things in words or in figures.
In practice, this argument matters little for Head Start: we need to gather
both kinds of data if we want to have a high-quality program.
• Qualitative Data is information that is conversational or narrative in nature.
Focus groups, interviews, open-ended questions on questionnaires, and other less
structured methods are used to gather these kinds of data. A straightforward way
to think of qualitative data is as words.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Chakraborty and L. Dey, Computing for Data Analysis: Theory and Practices,
Data-Intensive Research, https://doi.org/10.1007/978-981-19-8004-6_1
• Quantitative Data is information that is expressed numerically and can take
either large or small numeric values. A given category or label may be associated
with a number of values.
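The difference between the two kinds of data can be illustrated with a minimal Python sketch. The records, the field names, and the helper classify_fields below are hypothetical and invented purely for illustration; they are not taken from any real program dataset. The sketch simply treats a field as quantitative when all of its values are numbers and as qualitative otherwise.

```python
from numbers import Number

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"age": 4, "screening_score": 87.5, "parent_comment": "Enjoys story time"},
    {"age": 5, "screening_score": 92.0, "parent_comment": "Shy in large groups"},
    {"age": 4, "screening_score": 78.0, "parent_comment": "Very curious"},
]


def classify_fields(rows):
    """Label each field 'quantitative' if every value is numeric, else 'qualitative'."""
    kinds = {}
    for field in rows[0]:
        values = [row[field] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        numeric = all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        kinds[field] = "quantitative" if numeric else "qualitative"
    return kinds


kinds = classify_fields(records)
for field, kind in kinds.items():
    print(f"{field}: {kind}")
```

A type check of this kind is only a first approximation: numeric codes that stand for categories (for example, a classroom number) would still need to be treated as labels rather than as measurements.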
1.1.2 Analysis of Data
Data analysis is the study of unstructured data to uncover patterns and relevant
knowledge. This procedure may also involve gathering, organizing, preprocessing,
transforming, modeling, and interpreting the data. Knowledge in the field of
analytics comes from several sources. The concept of extrapolating information
originates in the long-established field of inductive learning, a subfield of
statistics. With the development of personal computers, computational resources
have been used increasingly frequently to address problems in inductive learning.
Computing power has been used to create novel techniques, and new problems have
arisen that call for a solid understanding of computer science. For instance,
computational statisticians now explore ways to carry out a specific task more
efficiently from a computational standpoint.
Several scientists have also dreamed of being able to simulate human behavior on
machines. They came from the field of artificial intelligence. In addition to
drawing on statistics, they used computers to simulate biological and human
behavior, which was a major source of inspiration for their work. For instance,
artificial neural networks have been investigated since the 1940s to mimic the
human brain, and ant colony optimization algorithms were developed in the 1990s to
mimic the behavior of ants. The term machine learning (ML) was introduced by Arthur
Samuel in 1959 [2] as the “area of study of computer algorithms that convert data
into intelligent tasks”. A new term with a slightly different connotation surfaced
in the 1990s: data mining (DM). Business intelligence tools also first became
available in the 1990s, as a result of more affordable, large-capacity data
centers [2].
Companies began to gather an increasing amount of data with the intention of
either resolving or improving business operations, for example by identifying
credit card fraud or by enhancing client relationships through more effective
relational marketing strategies. The main question was whether it was possible to
dig through the data to draw out the knowledge required for a given purpose.
1.2 Big Data and Data Analytics
The phrase “big data” came into wide use in the early twenty-first century. Big data
was initially defined by the “three Vs” of data processing technology. Since then,
other Vs have been suggested. We may create a taxonomy of big data using the first
three Vs: volume, variety, and velocity. Volume concerns the data repositories
needed to store massive amounts of data. How to combine data
from several sources is a question of variety. Velocity refers to the capacity to handle
data arriving quickly and continuously in what are called data streams. Learning from
streaming data is an aspect of analytics that goes beyond big data’s velocity. A newer
term, data science, has also emerged and is sometimes used. Large datasets require the
development of new techniques and tools for data storage, computing, and distribution
because they cannot be handled by conventional data processing technologies
[3]. Big data, however, can be described in more ways than just the amount of data. The
term “big” can be used to describe a variety of factors, including the quantity of
data sources, the significance of the data, the demand for new processing methods,
the speed at which data is received, the combination of various datasets to enable
real-time analysis, and the accessibility of the data, which today is available to any
business, non-profit organization, or individual. Big data is therefore more focused
on technology. It offers a computing platform for various data processing operations
in addition to analytics.
Processing financial transactions, processing online data, and processing
georeferenced data are some of these operations. Data science focuses on the
development of models that can recognize patterns in large amounts of complex data and
the application of these models to practical issues. Data science uses the right tech-
nology to extract meaningful and practical knowledge from data. It is closely related
to data mining and analytics. By offering a framework for knowledge extraction that
incorporates statistics and visualization, data science goes beyond data mining.
As a result, while big data supports data administration and collection, data
science discovers new knowledge by applying procedures to these data. The concept
of data analytics that we use encompasses all of these techniques for drawing
knowledge from data [4, 5].
1.2.1 Big Data Architecture and Data Analysis
New computing technologies are required when data grow in volume, velocity, and
variety. These emerging technologies, which comprise hardware and software, must
remain highly flexible as more data is processed. This quality is called scalability. Distributing
the data processing jobs among a number of computers, which may then be grouped
together to form computer clusters, is one technique to achieve scalability. The reader
should not conflate computer clusters with clusters created by analytics techniques
called clustering, which partition a dataset to locate groupings within it. Even though
a distributed system can be created by grouping numerous computers into a cluster,
conventional distributed system software typically struggles to handle massive data.
The effective division of data among the various computing and storage units is one
of the restrictions. New software tools and approaches have been created to handle
these requirements. MapReduce was one of the first methods created for processing
big data on clusters. The two steps in the MapReduce programming model
are map and reduce. Hadoop is the most well-known MapReduce implementation.
MapReduce separates the dataset into pieces, or “chunks”, and stores on each
cluster computer the chunk of the dataset that it requires [4]. The average salary of
a billion people might be calculated using a cluster of a thousand computers, each
of which has a computing unit and storage capacity. The population can be broken
down into 1000 subgroups, or chunks, each comprising data from one million
individuals. Each chunk can be processed independently by one of the computers.
Because the chunks are equal in size, averaging the outputs of these computers, each
of which is the average salary of one million individuals, yields the overall average
salary. The following conditions must be met by a distributed system in order to
effectively tackle a big data problem:
• The entire task must be completed and no data may be lost. If one or more
computers fail, another computer in the cluster must take over their tasks, along
with the affected data chunks.
• Redundancy must be supported: the same task and its associated data chunk are
assigned to several cluster computers, so that a redundant computer continues
the work even if one or more computers fail.
• Faulty computers can rejoin the cluster once they have been repaired.
• As the processing demand varies, it is simple to remove computers from the
cluster or to add more.
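The redundancy condition above can be illustrated with a minimal Python sketch. It is not how Hadoop or any specific framework places data; the node names, the round-robin placement rule, and the function names `assign_replicas` and `live_owner` are all assumptions made for illustration.

```python
def assign_replicas(n_chunks, nodes, replicas=2):
    """Place each chunk on `replicas` distinct nodes (hypothetical round-robin rule)."""
    if replicas > len(nodes):
        raise ValueError("cannot place more replicas than there are nodes")
    return {
        chunk: [nodes[(chunk + k) % len(nodes)] for k in range(replicas)]
        for chunk in range(n_chunks)
    }


def live_owner(placement, failed):
    """For every chunk, pick the first replica holder that has not failed."""
    owners = {}
    for chunk, holders in placement.items():
        alive = [node for node in holders if node not in failed]
        owners[chunk] = alive[0] if alive else None  # None means the chunk is lost
    return owners


if __name__ == "__main__":
    placement = assign_replicas(n_chunks=4, nodes=["n0", "n1", "n2"], replicas=2)
    # With two replicas per chunk, a single failed node loses no data.
    print(live_owner(placement, failed={"n1"}))
```

With two replicas per chunk, any single failure leaves every chunk with a live holder; losing both holders of a chunk is reported as `None`, which is the data-loss case the first condition rules out.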
A solution that complies with these requirements must conceal from the data
analyst the mechanics of how the program functions, such as how the jobs and data
blocks are allocated among the cluster computers [6]. Chapter 2 describes in detail
how big data analysis can be performed in a distributed cluster environment.
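The average-salary example in this section can be sketched in a few lines of Python. This is a single-machine illustration of the map and reduce steps, not Hadoop code; the function names and the sample salaries are assumptions. Each map call returns a (sum, count) pair rather than a chunk average, so the result stays correct even when the chunks are not all the same size.

```python
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Split the dataset into n_chunks pieces of (almost) equal size."""
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def map_chunk(chunk):
    """Map step: one (hypothetical) cluster computer summarizes its own chunk."""
    return (sum(chunk), len(chunk))


def reduce_pairs(a, b):
    """Reduce step: merge two partial (sum, count) summaries."""
    return (a[0] + b[0], a[1] + b[1])


def average_salary(salaries, n_nodes):
    chunks = split_into_chunks(salaries, n_nodes)
    partials = [map_chunk(c) for c in chunks]  # would run in parallel on a cluster
    total, count = reduce(reduce_pairs, partials, (0, 0))
    return total / count


if __name__ == "__main__":
    sample = [30_000, 45_000, 52_000, 61_000, 75_000, 38_000, 99_000]
    print(average_salary(sample, n_nodes=3))
```

Averaging the per-chunk averages directly, as in the text, gives the same answer only when every chunk holds the same number of records; carrying the counts along removes that restriction.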
1.3 Cloud Computing and Data Analysis
Cloud analytics combines scalable cloud computing with robust analytical tools to
find patterns in data and derive fresh insights. Data analysis is
being used by corporations to gain a competitive edge, enhance scientific research,
and improve people’s lives in a variety of ways. Consequently, as the amount and
value of data continue to rise, data analytics has grown in importance as a tool.
Artificial intelligence (AI), machine learning (ML), and deep learning (DL) are
frequently linked to cloud analytics. Additionally, it is frequently utilized in
commercial applications, including business intelligence, security, Internet of Things
(IoT), genomics research, and work in the oil and gas industry. In fact, data analytics may boost
organizational performance and create new value in every sector. A subset of cloud
analytics called cloud infrastructure analytics is concerned with the analysis of data
related to IT infrastructure, whether on-premises or in the cloud. Its objectives are
the identification of input–output patterns, performance evaluation of applications,
detection of policy compliance, and support for capacity management and
infrastructure resilience [7, 8].
Data analytics, or the process of analyzing and drawing conclusions from massive
datasets, has become easier thanks to the development of analytics software such as
Apache Hadoop. Analytics workloads and technologies that have been migrated to the
cloud are now referred to as cloud analytics. The capability, accessibility, and ease of
executing complicated data analysis on very big datasets have all grown significantly
thanks to cloud analytics. For a number of reasons, cloud analytics is particularly
intriguing:
• The amount of data being gathered globally is increasing at startling rates, and a
large portion of it is being created and gathered at IoT endpoints or in the cloud.
• Because cloud services are supplied as automated services and do not involve
the installation and upkeep of physical hardware, they are significantly simpler to
deploy.
• A user can activate and deactivate services as necessary thanks to the cloud busi-
ness model. With this consumption-based pricing model, clients only pay for the
services they actually use, eliminating the need to purchase and manage expensive
hardware and saving money on data center space.
• Users can use the cloud to deploy the ideal number of IT resources based on the
current issue. Users may quickly apply computing and storage and grow them
as needed thanks to dynamic resource sizing. Users are relieved of the need to
purchase a fixed capacity of physical IT equipment for each project involving data
analysis.
• For users who want to use the cloud to test a new analytics project as a proof of
concept (POC) before investing on-premises, a hybrid analytics solution is effective [9].
Organizations are empowered by cloud analytics to:
• Analyze genomic data to learn more about hereditary disorders and how to develop
treatments.
• Look for patterns in voice, photographs, and videos to enhance customer
satisfaction and customer service.
• Study purchasing patterns to improve product availability and delivery.
• Determine disease reporting patterns to increase the accessibility of medications
and immunizations.
• Analyze hybrid cloud infrastructures to reduce IT spending and enhance
application performance.
Some of the best uses of cloud data analytics are given below.
A. Social Media
Aggregating and interpreting social media activity is a common application of
cloud data analytics. Processing activity across numerous social networking sites
was challenging before cloud drives became widely used, especially if the data
was stored on different servers. Cloud drives enable data from several social
media sites to be analyzed simultaneously, so that results can be quantified
quickly and resources allocated according to where attention is focused.
B. Tracking of Products
It should come as no surprise that Amazon.com, long regarded as one of the
leaders in efficiency and foresight, employs data analytics on cloud storage to
track items across its chain of warehouses and to ship items wherever
necessary, regardless of the items’ proximity to customers. In addition to using
cloud drives and remote analysis, Amazon is a pioneer in big data analysis
services through its Redshift project. Redshift serves as a data warehouse
and provides smaller organizations with many of the same analysis
tools and storage capacities as Amazon. This saves smaller companies from
having to invest in expensive hardware.
C. Tracking Preference
For the past 10 years or so, Netflix has drawn a lot of attention because of
its DVD delivery service and the movie library it hosts online. One of its
website’s highlights is its movie suggestions, which keep track of the films users
view and suggest similar ones they might like, serving customers while also
promoting the use of the product. All user information is kept remotely
on cloud drives, so users’ preferences do not change from computer to computer.
Because it was able to keep all of its users’ preferences and tastes in movies
and television, Netflix was able to produce a television program that
statistically appealed to a sizable section of its audience.
D. Records Keeping Strategy
Data may be recorded and processed simultaneously using cloud analytics,
regardless of how far away local servers are. Businesses can monitor the sales of
a product across all of their locations or franchisees in the USA and modify their
production and shipments as necessary. If a product is not selling well, they can
manage inventories remotely using information that is automatically uploaded to
cloud drives instead of waiting for inventory reports from local stores.
Businesses can operate more effectively and better understand their
customers’ behavior thanks to the data stored in the cloud [10].
Chapter 2 describes in detail how data analysis can be performed in a cloud
computing environment.
1.4 Internet of Things (IoT) and Data Analysis
IoT analytics evaluates the vast amount of data gathered from IoT devices and
extracts useful information from it. IoT analytics and the Industrial IoT (IIoT)
are frequently discussed together. Numerous sensors are used in manufacturing
infrastructure, weather stations, smart meters, delivery vans, and other types of
machinery to gather data. Data center management and applications for the retail
and healthcare industries can also benefit from IoT analytics. IoT data resembles
big data. The main distinction between the two is not simply the amount of data,
but also the variety of sources from which it is gathered. All of this information
must be transformed into
a single, understandable data stream. Data integration becomes quite challenging
when there are so many different types of information sources. This is where IoT
analytics may help, even though it might be challenging to build and deploy [11].
Data flows continuously and in large volumes from a variety of devices.
IoT analytics assists in the analysis of this data across all connected devices
without requiring dedicated hardware or infrastructure. Computing power and data
storage scale up or down with changes in the organization’s needs, ensuring that
the IoT analysis has the necessary capacity [12]. The analysis proceeds in the
following stages:
(a) The first stage is collecting data from many sources, in a variety of formats,
and at various frequencies.
(b) This data is then processed with the help of a variety of outside sources.
(c) After that, the data is stored as a time series for analysis.
(d) The analysis can be carried out in a variety of ways, including machine
learning approaches, ordinary SQL queries, or specialized analysis
tools. Numerous predictions can be made using the findings.
(e) Organizations can create a variety of systems and applications to streamline
business procedures using the information they have acquired.
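Stages (a) to (d) above can be sketched with Python's standard `sqlite3` module. This is only an illustration on made-up data, not the design of any particular IoT analytics product; the device names, the `device_site` lookup standing in for an outside source, and the table layout are all assumptions.

```python
import sqlite3

# Stage (a): hypothetical raw readings from two devices, in two different formats.
raw_readings = [
    {"device": "meter-1", "ts": 0, "temp_c": 20.0},
    {"device": "meter-1", "ts": 60, "temp_c": 22.0},
    {"device": "van-7", "ts": 30, "temp_f": 68.0},
    {"device": "van-7", "ts": 90, "temp_f": 71.6},
]

# Stage (b): a hypothetical outside source used to enrich each reading.
device_site = {"meter-1": "plant", "van-7": "fleet"}


def normalize(reading):
    """Convert every reading to one schema: (device, site, ts, temp_c)."""
    if "temp_c" in reading:
        temp_c = reading["temp_c"]
    else:
        temp_c = (reading["temp_f"] - 32.0) * 5.0 / 9.0
    site = device_site[reading["device"]]
    return (reading["device"], site, reading["ts"], round(temp_c, 2))


def build_store(readings):
    """Stage (c): keep the normalized readings as a time series."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (device TEXT, site TEXT, ts INTEGER, temp_c REAL)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?, ?)",
        [normalize(r) for r in readings],
    )
    return conn


def mean_temp_by_site(conn):
    """Stage (d): an ordinary SQL query over the stored time series."""
    rows = conn.execute(
        "SELECT site, AVG(temp_c) FROM readings GROUP BY site ORDER BY site"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(mean_temp_by_site(build_store(raw_readings)))
```

The normalization step is where the variety of sources is reduced to a single, understandable stream; everything after it can treat the data uniformly.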
A wide range of IoT devices capture data and help to analyze it, including
wearable devices such as smart watches and smart glasses, as well as smart cars.
Data analysis through IoT offers the following benefits [13, 14].
• Greater control and visibility, which speed up decision-making.
• Growth into new markets and adaptable scaling of business requirements.
• Reduced operating expenses through automation and improved resource use.
• New revenue streams as a result of operational issues being resolved.
• Quicker answers from precisely identifying the issues.
• Earlier problem resolution and recurrence avoidance.
• Improved client experience based on research of past purchases.
• More efficient and pertinent product development.
Chapter 3 describes in detail how data analysis can be performed through IoT
devices. It describes the various data collection strategies through IoT devices and
their architecture and protocols for data communication. Chapter 3 also discusses
the relation between IoT and cloud services for the purpose of big data analysis.
1.5 AR/VR and Data Analysis
Bar graphs and pie charts, once the standard tools for data visualization,
are simply unable to capture the intricacy of the data that we now gather. To fully
utilize the enormous amount of data we acquire every day, people other than data
scientists must also be able to extract insights. One way that artificial intelligence
and augmented/virtual reality (AR/VR) might truly alter data analytics in an
organization is by making data simpler to interpret, even for those who do not have
a background in data. Visualization helps users understand data and spot trends in
it, and AR and VR let users gain insights more easily by interacting with the data.
In conventional 2D data visualizations, it is frequently impossible to detect
critical information, such as data clusters at the intersection of several dimensions.
Users of AR and VR can engage with the data because it can literally surround them:
in front of, behind, above, and to either side of them. VR and AR can also
facilitate collaboration between teams that are spread across different locations
[15].
Multivariate datasets are common today. Without VR and AR, it is difficult for
humans to effectively assess the complexity of such data; they must manually
assemble 2D representations, reports, graphs, and the like in order to try
to guide decision-making. With VR/AR, users can view everything at once, giving
them a comprehensive perspective on the data that is not possible with conventional
methods of data presentation. Another advantage of using VR and AR for data
visualization is that it draws on the innate human ability to perceive and
interpret data in several dimensions. Because the user is immersed in the data
representation, data properties can be conveyed through position in the
surrounding space. Thanks to AR/VR technology, data analytics is now accessible to
a wider user base than data scientists alone. It can allow more people to be
involved in monitoring neural networks and machine learning models to make sure
that the decisions the machines make remain ethical, fair, and logical. AR and VR
may also make data analysis more enjoyable. People need to embrace data analysis
because making decisions based on data is how businesses stay competitive. By
“walking into the data”, data analysis becomes an immersive experience that can
even be enjoyable rather than a task of poring over spreadsheets and reports [16].
Analyzing data using AR and VR makes it easier to understand the data
completely. In addition, we now have the ability to display complicated data struc-
tures in ways that are easier to comprehend than previously. In AR–VR systems,
massive data visualization is preferred for three reasons.
• We can reduce the complexity of the data by visualizing its structures in VR and
AR.
• It is a novel medium with enormous promise for data visualization. It provides
more natural interactions, more space, multidimensionality, fewer distractions,
and many other benefits.
• Not just data scientists but also other people can access data analytics thanks
to AR and VR technologies.
The next major use of VR technology will be combining it with big data to address
the problems caused by the limits of human perception. The enormous amount
of data generated by user interaction is an extremely valuable asset if it can be
filtered into usable information. In the highly competitive environment of internet
enterprises, sorting this data is crucial to making wise decisions. When it comes to
processing massive datasets, traditional two-dimensional visual representations
such as pie charts and diagrams fall short. VR thus offers an alternative way to
review material by exploiting its immersive capabilities to handle complicated
problems. Data visualization in VR involves creating an immersive experience in
which the information models surround the user. It makes use of intelligent
mapping, intelligent routines, machine learning, and natural language processing to
identify important patterns and display them in the virtual world, which users may
subsequently customize. The main justification and purpose for combining VR and
big data is to make the analysis of the enormous volume of data more thorough.
One business in particular has created a platform that enables users to study up to
ten data pieces by fusing artificial intelligence, virtual reality, and big data [17].
In this book, Chap. 4 discusses the various types of AR–VR systems and their
organization. Besides that, it also shows how the different tools and technologies of
AR–VR systems help to do an effective and efficient data analysis.
1.6 Biological Computing and Data Analysis
Data analytics is the science of analyzing raw data in order to draw conclusions
from it. These techniques can be applied to any type of data to gain insights that
can be used to improve things. Data analytics techniques can reveal trends
and indicators that might otherwise be buried in a sea of data. The
overall efficiency of any model can be improved by optimizing the dataset features.
1.6.1 Steps in Data Analysis
Data analysis is the process of grouping, acquiring, cleaning, transforming, and
analyzing raw data to turn it into useful, pertinent information that can help
businesses make informed decisions. It can be explained with the following steps:
1. The first step is to understand the data requirements and how the data should
be grouped.
2. The second step is data collection. Computers, online resources, cameras,
environmental sources, and people can all serve as sources of data, among
others.
3. After collection, the data is organized using software.
4. In the fourth step, the data is cleaned to eliminate duplicates, missing values,
and errors.
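The cleaning step (step 4) can be sketched in plain Python. The record layout, the `age` field, and the valid range used to detect errors are assumptions chosen only to make the example concrete; a real project would define its own validity rules.

```python
def clean(records, valid_range=(0, 120)):
    """Drop records with missing values, out-of-range ages, and exact duplicates."""
    low, high = valid_range
    seen = set()
    cleaned = []
    for rec in records:
        # Missing values: any field set to None.
        if any(value is None for value in rec.values()):
            continue
        # Errors: here, an age outside the (assumed) plausible range.
        if not (low <= rec["age"] <= high):
            continue
        # Duplicates: identical field/value pairs already seen.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


if __name__ == "__main__":
    raw = [
        {"id": 1, "age": 34},
        {"id": 1, "age": 34},    # duplicate
        {"id": 2, "age": None},  # missing value
        {"id": 3, "age": 430},   # entry error
        {"id": 4, "age": 58},
    ]
    print(clean(raw))
```

Dropping a record is the simplest policy; imputing a missing value or correcting an obvious entry error are alternatives that keep more data at the cost of extra assumptions.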
1.6.1.1 Biological Data
Biological data refers to the information gathered from biological organisms.
There are many different types of biological data, for example, gene sequences,
protein structures, mutations, gene expression, amino acids, linkages, and pathways.
All of these data formats are extremely complicated, and traditional database
management systems (DBMS) do not adequately address their need for complex data
structures, which is greater than in most other applications. Bioinformaticists
collect these biological data, mainly DNA, RNA, and protein data, from computational
and laboratory experiments and from the published literature, and store them in
databases.
Biological data has a number of unique properties that make it difficult to manage.
It is highly variable and wide-ranging. Moreover, different biologists represent
the same data differently. For example, the name and ID of a protein differ
between databases. The protein SUMO1 has several aliases, such as DAP1,
GMP1, OFC10, PIC1, SENP2, SMT3, SMT3C, SMT3H3, and UBL1. It has ID 7341 in the
NCBI database and ID P63165 in the UniProt database. Furthermore, the schemas of
biological databases change rapidly. There should be support for schema
evolution and data object migration so that information can move more freely between
database generations or releases. As most biologists have very little knowledge
about the internal schema design, the interface to the biological database/resource
should display information to the user in a manner appropriate for the problem
being addressed and that reflects the underlying data structures. Access to past
versions of existing data is frequently required by biological data users. Therefore,
when an existing database is updated, the old data must be handled carefully.
Finally, most users of biological databases need only read access and do not
require write access. Write access is restricted to authorized users known as
curators. Although only a small number of users require write access, users generate
a wide range of read access patterns in the databases.
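The naming problem described above, one protein known under many aliases and different IDs, is commonly handled with an alias index that maps every known name to a single canonical record. The sketch below uses only the SUMO1 names and IDs quoted in the text; the function names and record layout are assumptions, and no external database is queried.

```python
# Names and identifiers as quoted in the text above.
SUMO1_ALIASES = [
    "DAP1", "GMP1", "OFC10", "PIC1", "SENP2", "SMT3", "SMT3C", "SMT3H3", "UBL1",
]
RECORDS = {
    "SUMO1": {"ncbi_id": "7341", "uniprot_id": "P63165"},
}


def build_alias_index(canonical_name, aliases):
    """Map the canonical name and every alias (case-insensitively) to one key."""
    index = {canonical_name.upper(): canonical_name}
    for alias in aliases:
        index[alias.upper()] = canonical_name
    return index


def resolve(name, index, records):
    """Return (canonical name, record) for any known name, or None if unknown."""
    canonical = index.get(name.strip().upper())
    if canonical is None:
        return None
    return canonical, records[canonical]


if __name__ == "__main__":
    index = build_alias_index("SUMO1", SUMO1_ALIASES)
    print(resolve("ubl1", index, RECORDS))
```

A read-only lookup of this kind fits the access pattern described above: most users only query, while curators alone would extend the alias list or update the records.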
1.6.1.2 Types of Biological Data and Databases
Biological databases can essentially be divided into the following groups based on
the sorts of data stored in them: (1) DNA, (2) RNA, (3) protein, (4) expression, (5)
pathway, (6) gene ontology. Different biological databases contain different
biological data. For example, nucleic acid databases contain DNA information,
genomic databases contain gene-level information, protein databases contain
protein information, and databases of protein families, domains, and functional
sites contain classifications of proteins and domain-related data. These databases
serve as repositories of biological data for researchers. Each entry in a database
contains information such as the nucleotide sequence, protein sequence, or 3D
structure. A well-defined algorithm is required to analyze the contents of a database.
1.6.1.3 Data Analysis on Biological Data
Over the last decade, biological data has been growing rapidly. Human genomes can
now be sequenced 50,000 times faster than they could in 2000. As biological data
volumes increase, existing analysis techniques and environments can no longer keep
up with the demand for data analysis activities to be completed quickly in the life
sciences. Three
key characteristics of biological datasets are enormous data volume, extraordinarily
long running time, and application reliance. Each day, hundreds of TB of data are
created. Such a vast volume of data presents problems for hardware support as well
as computer scientists’ ability to analyze data effectively and efficiently. Therefore,
the development of efficient and effective biological data analytics technologies has
required significant research investment [18].
High-performance computing (HPC) platforms and effective, scalable algorithms
can provide an efficient way to solve these problems. Data science makes it
possible to mine large-scale data for useful insights. Principal component
analysis, linear regression, and linear discriminant analysis were introduced by
several of the founders of modern statistics, such as Galton, Pearson, and Fisher,
who were themselves preoccupied with the analysis of significant volumes of
biological data [19]. Methods including logistic regression, clustering, random
forests, and neural networks were envisioned or developed more recently by
scientists seeking to solve biological problems. Apart from that, in order to take
advantage of the various forms of parallelism and scalability on computer platforms,
biological data researchers have used programming models such as OpenMP,
CUDA/OpenCL, message passing (MPI), and MapReduce (Hadoop, Spark) in many
applications [20]. For networked computing, MPI is the most widely used
programming paradigm. Researchers employ MPI to build high-performance biological
data analytics tools on supercomputers. However, because of the strict
requirements of scalability and fault tolerance, newer programming models, such as
MapReduce and Spark, have been proposed for large-scale distributed computing [21].
1.7 Cognitive Computing and Data Analysis
Cognitive computing systems are frequently employed to complete tasks that call
for the analysis of enormous volumes of data. For instance, cognitive computing in
computer science helps with big data analytics, identifying trends and patterns,
comprehending human language, and interacting with clients. Cognitive analytics
combines several cognitive technologies, such as semantics, artificial intelligence
algorithms, deep learning, and machine learning, to perform certain tasks with
intelligence akin to that of a human [22].
The development of big data has been the subject of numerous studies that have
drawn on a variety of academic sources. When big data analytics is applied,
cognitive computing can help minimize its drawbacks. Cognitive computing uses a
computational model to simulate both the human thought process and the errors the
system makes repeatedly. This learning method can greatly improve how enormous
amounts of data are analyzed for better decision-making. Applying cognitive
computing to the evaluation of big data is a first step toward further progress,
so researching and understanding this topic is crucial [23]. These systems deliver
higher-quality services including emotional interaction, cognitive health care, and
automated driving. Cognitive computing did not receive much attention prior to the
big data era. However, its development has since benefited from the growth of
cloud-based AI [24]. While big data analytics offers ways to explore new
data-related opportunities, cloud computing and the Internet of Things can provide
the software and hardware on which cognitive computing depends. Human-like
thinking about big data is one of the connections between big data analysis and
cognitive computing. The primary distinction between big data analysis and
cognitive computing is that the latter processes data in a manner modeled on the
human brain. Here, the machine must possess the same concepts of data as people in
order to comprehend information about its surroundings [25]. The cognitive system
architecture with the notion of cloud and big data frameworks is shown in Fig. 1.1.
The features of big data and cognitive computing can be mapped to each other
(Table 1.1).
[Figure: “Cognitive and Cloud Based Framework”. Labeled components: 4G/5G/6G;
Internet; IoT Devices; Robotics; Cognitive Platform (TensorFlow, PyTorch, Theano
etc.); Database; Big Data; Library; Cognitive System Application Interface (Smart
healthcare etc.)]
Fig. 1.1 Cognitive system architecture with cloud and big data frameworks
Table 1.1 Mapping features between big data and cognitive system
Cognitive computing features    Big data features
Observations                    Volume
Interpretation                  Variety
Evaluation                      Velocity
Decision                        Veracity
A cognitive computing system must be able to observe a given volume of data.
To improve data analysis, a cognitive computing system can manage, cleanse, and
normalize the data. In the presence of several information sources, interpretation
helps with understanding and solving difficult situations. Variety means that data
may be obtained in many different ways, including through social media, IoT, GPS
trackers, email services, and other channels.
Evaluation is part of a human being’s innate capacity to generate knowledge. The
cognitive computing system must evaluate massive amounts of data in a relatively
short amount of time. Big data has the characteristic of velocity, wherein control
of data creation and processing speed are crucial. Meanwhile, the effectiveness of
data analysis must be taken into account in order to produce a trustworthy and
correct evaluation. Veracity is concerned with data reliability, uncertainty, and
quality prediction. The decision feature describes a cognitive computing system’s
capacity to decide in accordance with the data under analysis. The presence of
evidence is one of the key factors in decision-making. Finally, the value
characteristic reflects the fact that massive amounts of data are of little use
until they are transformed into knowledge. This feature can enhance processing for
knowledge development and allow data to be repurposed.
Volume, variety, velocity, veracity, and value are the five categories into which the
features needed for the effective use of big data analysis are divided. These five
categories are referred to as the 5 Vs. These characteristics apply to both big data
and cognitive computing. Businesses may be more productive and enterprise-ready
if they can use cognitive computing to manage the 5 Vs of big data. While
conceptual modeling of structural equations has been employed, industrial big data
analysis is crucial for the cognitive IoT, incorporating wireless sensor networks
(WSN), intelligent computing methods, and ML approaches. A cognitive computing
system built on big data, called the hybrid fuzzy multiobjective optimization
algorithm, is used to optimize social media analysis [26]. This approach is proposed
to address the E-projects portfolio selection (EPPS) problem. Web development
environments place a great deal of importance
on big data decision-making in EPPS. It has been demonstrated that big data and
cognitive computing are useful in the process of learning [27]. In this study, an intel-
ligent model is utilized to investigate how big data and cognitive systems might be
improved in order to redesign the labor market and have an impact on educational
processes. In addition, suggestions have been made to enhance the performance of
universities in order to address the issues that currently present in education. The
devised remedy is predicated on a novel paradigm known as the Smart University,
where knowledge expands quickly, is freely distributed, and is seen as a shared
heritage of instructors and students. The key finding for a great demand for compe-
tences and expertise motivates the educational system to incorporate other disciplines
into the curriculum. Here, big data use and cognitive computing systems help speed
up the process of restoring key academic community components. Figure 1.2 illus-
trates how the advantages of big data are used to link the characteristics of cognitive
computing.

Fig. 1.2 Cognitive computing and big data-based conceptual model (diagram elements:
General Applications; Big Data; Generate Data; Cognitive computing + AI + Machine
Learning; Data Analytics; Insights; Explore New Knowledge)
Data classification through cognitive computing (brain–computer interfacing)
techniques is discussed and analyzed in Chap. 6 of this book. That chapter is fully
focused on EEG signal-based emotional data classification.
1.8 Quantum Computing and Data Analysis
Quantum computing is one of the most powerful emerging concepts for handling large
volumes of complex data collected from different scenarios efficiently and effectively.
Handling “big data” is a growing concern, and it requires more powerful systems,
tools, and technologies. The term “big data” is frequently used interchangeably with
“artificial intelligence”, which is misleading because big data describes a problem
rather than a solution. In medicine, “big data” becomes a computational challenge
specifically when the size of a given dataset exceeds the processing capabilities of
current computers. For instance, genetic data contains millions of SNPs and other
biomarker data, necessitating a sizable amount of storage space and computational
power to execute studies with a semblance of efficiency. This problem is only made
worse by the expanding volume of multidimensional data that is now available to study
intricate phenotypes, risk factors, and outcomes. A significant gap exists in the
development of cutting-edge genetic and/or molecular epidemiological research due to
the limitations of traditional computing. Fortunately, these limitations are being
overcome by better computing approaches such as parallel processing and
supercomputing. Research into quantum phenomena and optimization theory has also
aided the development of computing theory, which is now starting to come to fruition
[28]. Quantum computing follows the rules of quantum mechanics, which govern behavior
at atomic scales and differ radically from the world as we know it. The smallest unit
of information in a classical computer is a bit, a binary digit that is
deterministically represented as either “0” or “1”, whereas the closest equivalent
unit in a quantum computer is the qubit, a two-level quantum system probabilistically
represented as a coherent superposition of both “0” and “1”. A number of quantum
algorithms are very popular and extensively used in various data analytics and
predictive analytics applications. All these algorithms build on basic quantum
phenomena and primitives such as superposition, parallelism, entanglement, Grover’s
operation, and quantum operators [29]. The most widely used quantum algorithms are
listed below:
• Supervised Quantum Learning: The best-known illustration of a supervised quantum
algorithm is the quantum neural network (QNN). Researchers have proposed the
concept of a quantum neuron, built on a quantum circuit that can naturally
imitate the threshold activation of neurons and the feedback of various ANN
configurations. The suggested model can be used to build a variety of classical
network configurations, including supervised, unsupervised, and reinforcement
learning, while retaining intrinsic quantum benefits such as superposition of
inputs, coherence, and entanglement. To connect machine learning and quantum
computation, a decision tree classifier has also been proposed in the quantum realm.
That work introduces the quantum entropy impurity criterion for selecting the split
node. The training data is then clustered into subclasses using a fidelity measure
between two quantum states, which enables the quantum decision tree to manipulate
quantum states.
In a quantum SVM, the classical data $\vec{x}$ is translated into quantum states
using a quantum feature map $V(\Phi(\vec{x}))$, and the kernel of the SVM is
constructed from these quantum states. Once the kernel matrix has been computed on
the quantum computer, the quantum SVM can be trained in the same manner as a
conventional SVM. The quantum kernel concept is identical to the classical one,
except that the quantum feature map is used to calculate the inner product of the
feature maps, $K(\vec{x}, \vec{z}) = |\langle \Phi(\vec{x}) | \Phi(\vec{z}) \rangle|^2$.
The idea is that a quantum advantage might be gained by selecting a quantum feature
map that is difficult to simulate with a classical computer. Every internal node in a
quantum decision tree divides the training dataset into two or more subgroups based
on a particular discrete function [30]. The term “quantum decision tree” is
occasionally used to describe a quantum query algorithm or quantum black-box
algorithm that uses quantum superposition to calculate a function
$f: \{0, 1\}^n \to \{0, 1\}$. In reality, these quantum algorithms are not trees.
Because they can handle nonlinearity and pooling operations, quantum convolutional
neural networks (QCNN) can emulate the behavior of a traditional CNN while handling
larger or deeper inputs and providing more sophisticated kernels. The method is
distinctive because it uses a novel quantum tomography technique that reduces system
complexity by more reliably extracting the most important data.
• Unsupervised Quantum Learning: The two main types of unsupervised quantum
machine learning techniques are dimensionality reduction and clustering algorithms.
Privacy enhancement is one application where quantum clustering methods can be
useful: because quantum algorithms require fewer calls to the database containing
the vectors to be grouped, the user of the algorithm is exposed to less data from
the database. Since QML algorithms can handle these problems in time logarithmic in
both the number of vectors and their dimension, they can offer an exponential
speedup over traditional methods.
Three quantum subroutines have been proposed that could replace elements of
classical clustering algorithms and outperform them in speed. They assume the
existence of a black-box quantum circuit that serves as a distance oracle and
provides the distance between vector inputs. These subroutines can be used to:
(1) find the two points of a vector dataset that are the furthest apart from one
another; (2) find the n points of a vector dataset that are the closest to a given
point; and (3) produce neighborhood graphs of vector datasets, all in times faster
than their classical counterparts. Based on these capabilities, the authors suggest
strategies for quantizing (1) divisive clustering, (2) K-medians clustering, and
(3) unsupervised learning algorithms. The subroutines are based on Grover’s
algorithm and use Grover iterations to separate desirable outputs from the outcomes
of computations on superposed inputs.
Dynamic quantum clustering (DQC) is a visual technique that is effective for large,
high-dimensional datasets. Its hallmark is its ability to exploit differences in the
density of the data (in feature space) and reveal subsets of the data. The result of
a DQC analysis is a movie that demonstrates how and why sets of data points are
categorized as members of simple clusters when they display correlations among all
the measured variables [31]. Support vector clustering (SVC) links data points to
Hilbert space states, which are represented by Gaussian wave functions. These states
allow specific locations to be weighted to give them more prominence, presumably as
candidate cluster centers. This is useful for a method like SVC, which can be unduly
influenced by outliers: with this additional information, the influence of outlier
sites on the computation of cluster centers can be weighted accordingly [31].
• Variational Quantum Eigensolver (VQE): The Variational Quantum Eigensolver (VQE)
is a hybrid quantum/classical approach that can be used to determine the eigenvalues
of a (typically enormous) matrix H. When this approach is applied in quantum
simulations, H is often the Hamiltonian of some system. In this hybrid algorithm, a
quantum subroutine is run inside a classical optimization loop [32]. The overall
circuit diagram is shown in Fig. 1.3.
There are two essential steps in the quantum subroutine:
– Prepare the ansatz, i.e., the quantum state $|\Psi(\vec{\theta})\rangle$.
– Calculate the expectation value $\langle \Psi(\vec{\theta}) | H | \Psi(\vec{\theta}) \rangle$.
Fig. 1.3 Overall circuit diagram of VQE
By the variational principle, this expectation value is always greater than or equal
to the smallest eigenvalue of H. This bound allows the smallest eigenvalue to be
found by running an optimization loop on a classical computer:
– Use a classical nonlinear optimizer to minimize the expectation value by adjusting
the ansatz parameters $\vec{\theta}$.
– Iterate until convergence.
Applications
– Solve electronic structure problems.
– In quantum chemistry to find the ground energy state of a molecule (reaction
rates, binding strengths, or molecular pathways).
– Traveling salesman problem. Variational principle.
– Solve coloring puzzle (graph coloring).
• Quantum Approximate Optimization Algorithm (QAOA): The quantum approximate
optimization algorithm (QAOA) is a variational quantum technique used to
approximately solve discrete combinatorial optimization problems. The QAOA
implementation directly extends the optimization framework of VQE. However, in
contrast to VQE, which may be configured with any number of ansatzes, QAOA employs
its own finely calibrated ansatz, which consists of parameterized global rotations
and various parameterizations of the problem Hamiltonian. QAOA is a broad method for
approximating solutions to combinatorial optimization problems, especially those
that can be recast as the search for an optimal bit string [33].
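The VQE loop described in the list above (prepare an ansatz, evaluate the expectation value of H, and let a classical optimizer adjust the parameters) can be sketched classically for a single qubit. This is only an illustrative sketch under stated assumptions: the toy Hamiltonian H = Z + 0.5 X, the one-parameter rotation ansatz, the function names, and the use of plain gradient descent as the classical optimizer are all hypothetical choices, and the expectation value is computed by ordinary arithmetic rather than measured on quantum hardware.

```python
import math

# Hypothetical toy Hamiltonian H = Z + 0.5 X as a real symmetric 2x2 matrix.
H = [[1.0, 0.5],
     [0.5, -1.0]]


def ansatz(theta):
    """Trial state Ry(theta)|0> = [cos(theta/2), sin(theta/2)] (real amplitudes)."""
    return [math.cos(theta / 2.0), math.sin(theta / 2.0)]


def expectation(theta, h=H):
    """Expectation value <psi(theta)|H|psi(theta)>, computed classically here."""
    psi = ansatz(theta)
    h_psi = [h[0][0] * psi[0] + h[0][1] * psi[1],
             h[1][0] * psi[0] + h[1][1] * psi[1]]
    return psi[0] * h_psi[0] + psi[1] * h_psi[1]


def minimize(cost, theta=0.1, step=0.2, tol=1e-10, eps=1e-6, max_iter=10_000):
    """Classical loop: gradient descent with a central finite-difference gradient."""
    for _ in range(max_iter):
        grad = (cost(theta + eps) - cost(theta - eps)) / (2.0 * eps)
        new_theta = theta - step * grad
        if abs(new_theta - theta) < tol:
            break
        theta = new_theta
    return theta


def exact_ground_energy(h=H):
    """Closed-form smallest eigenvalue of a real symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    return (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)


theta_opt = minimize(expectation)
vqe_energy = expectation(theta_opt)
print(f"VQE estimate: {vqe_energy:.6f}, exact: {exact_ground_energy():.6f}")
```

For this real symmetric matrix the one-parameter ansatz can reach the ground state, so the minimized expectation value approaches the smallest eigenvalue from above, as the variational principle requires. Larger Hamiltonians generally need richer ansatzes and more capable optimizers.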
1.8.1 Quantum-Inspired Data Analytics
Although quantum computing is a relatively new technology, data analytics is already
utilizing it. Here are a few ways that big data analysts are using quantum computing
to their advantage [30, 34]:
• Quantum computing can provide high-speed detection, analysis, integration, and
diagnosis capabilities when working with large, dispersed datasets.
• Quantum computers can discover patterns quickly in large, unsorted datasets by
examining the items of a massive database in superposition rather than one at a time.
• For certain problems, quantum computers are expected to complete in a matter of
seconds incredibly complex computations that could take non-quantum computers
hundreds of years.
• Artificial intelligence applications are frequently used nowadays to manage huge
data and assist in the analysis of datasets to find regularities. Despite the
technology’s rapid advancement, conventional computers can only process a limited
amount of data in a practical time. Quantum computers, by contrast, are expected to
relax this restriction for suitable problems.

Three domains of artificial intelligence benefit from the speed and power of
quantum computing:
• Natural Language Processing: The first natural language processing operation
using quantum technology was completed in 2020. Scientists successfully converted
grammatical statements into quantum circuits. Once these circuits were run on a
quantum computer, they were able to answer questions, which has significant
implications for big data.
• Quantum Machine Learning: Machine learning algorithms are carried out on a
quantum computer. This can significantly increase processing speed, since the
algorithms can access more computational power than they could on a conventional
computer.
• Data Analytics for Prediction: Using artificial intelligence, predictive analytics
can extract pertinent historical and current information from databases. When it is
integrated with quantum computing, more data is processed, producing pertinent data
that can then be utilized to generate predictions. However, a predictive model, which
must take into consideration multiple choices, features, and variables, may at times
find the vast amount of accessible data to be too much. Quantum computing makes it
possible to build more scalable predictive models without slowing the process down.
Environmental factors such as temperature changes or vibrations can prevent most
quantum computers from reaching their full potential and can cause decoherence,
which renders them essentially useless. Because of this, it might still be some time
before quantum computing enters the majority of businesses or turns into a
commonplace tool for data analytics. Quantum computing was still a fairly young
technology in 2021. Machine learning algorithms are currently improving thanks to
developments in quantum computing. There is still a lot to be discovered about the
potential of quantum computing and its implications [32].
In this book, Chaps. 7 and 8 deal with the applications of quantum computing to
machine learning algorithms and image processing techniques, respectively. These two
chapters focus on some well-known machine learning and image processing algorithms
that are frequently used for different kinds of general data or image matrix
analysis, and they discuss how quantum computing techniques help to achieve
effective, efficient, and fast data analysis.
1.9 Conclusion
With predictive analytics, data stream ingestion, and recommendations for critical
modifications, these cutting-edge technologies are transforming data-driven
enterprises. This chapter gives an overview of data analysis techniques and of how
cutting-edge computing platforms enhance the efficiency, flexibility, reliability,
and security of this analysis. In this book, we concentrate on helping executives
who already have extensive experience using analytics for important business
decisions to develop these advanced abilities.

References
1. Moreira J, Carvalho A, Horvath T (2018) A general introduction to data analytics. Wiley
2. Richmond B (2006) Introduction to data analysis handbook. Academy for Educational
Development
3. Kambatla K, Kollias G, Kumar V, Grama A (2014) Trends in big data analytics. J Parallel
Distrib Comput 74(7):2561–2573
4. Prabhu CSR, Chivukula AS, Mogadala A, Ghosh R, Livingston LM (2019) Big data analytics.
In: Big data analytics: systems, algorithms, applications. Springer, Singapore, pp 1–23
5. Azeem M, Haleem A, Bahl S, Javaid M, Suman R, Nandan D (2021) Big data applications to
take up major challenges across manufacturing industries: a brief review. Mater Today Proc
6. Shehab N, Badawy M, Arafat H (2021) Big data analytics and preprocessing. In: Machine
learning and big data analytics paradigms: analysis, applications and challenges. Springer,
Cham, pp 25–43
7. Ageed ZS, Zeebaree SR, Sadeeq MM, Kak SF, Yahia HS, Mahmood MR, Ibrahim IM (2021)
Comprehensive survey of big data mining approaches in cloud systems. Qubahan Acad J
1(2):29–38
8. Duan L, Da Xu L (2021) Data analytics in industry 4.0: a survey. Inf Syst Front 1–17
9. Mushtaq MS, Mushtaq MY, Iqbal MW, Hussain SA (2022) Security, integrity, and privacy of
cloud computing and big data. In: Security and privacy trends in cloud computing and big data.
CRC Press, pp 19–51
10. Mohan PM (2021) Challenges in big data analytics and cloud computing. Int J Bus Manag Res
9(2):156–161
11. Talebkhah M, Sali A, Marjani M, Gordan M, Hashim SJ, Rokhani FZ (2021) IoT and big
data applications in smart cities: recent advances, challenges, and critical issues. IEEE Access
9:55465–55484
12. Li W, Chai Y, Khan F, Jan SRU, Verma S, Menon VG, Li X (2021) A comprehensive survey
on machine learning-based big data analytics for IoT-enabled smart healthcare system. Mobile
Netw Appl 26(1):234–252
13. Bi Z, Jin Y, Maropoulos P, Zhang WJ, Wang L (2021) Internet of things (IoT) and big data
analytics (BDA) for digital manufacturing (DM). Int J Prod Res 1–18
14. Sharma R, Sharma D (2022) New trends and applications in internet of things (IoT) and big
data analytics. ISBN: 978-3-030-99329-0
15. Sharma L, Anand S, Sharma N, Routry SK (2021) Visualization of big data with augmented
reality. In: 2021 5th international conference on intelligent computing and control systems
(ICICCS). IEEE, pp 928–932
16. Olshannikova E, Ometov A, Koucheryavy Y et al (2015) Visualizing big data with augmented
and virtual reality: challenges and research agenda. J Big Data 2:22. https://doi.org/10.1186/
s40537-015-0031-2
17. Khalid ZM, Zeebaree SR (2021) Big data analysis for data visualization: a review. Int J Sci
Bus 5(2):64–75
18. Venter JC (2010) Multiple personal genomes await. Nature 464(7289):676–677
19. Yin Z, Lan H, Tan G, Lu M, Vasilakos AV, Liu W (2017) Computing platforms for big biological
data analytics: perspectives and challenges. Comput Struct Biotechnol J 15:403–411
20. Fienberg SE (1992) A brief history of statistics in three and one-half chapters: a review essay.
Stat Sci 7:208–225
21. Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL (2009) Searching for SNPs with cloud
computing. Genome Biol 10(11)
22. Hurwitz JS, Kaufman M, Bowles A (2015) Cognitive computing and big data analytics. Wiley,
p 288. ISBN: 978-1-118-89662-4
23. Mishra S, Tripathy HK, Mallick PK, Sangaiah AK, Chae GS (eds) (2021) Cognitive big data
intelligence with a metaheuristic approach. Academic Press
24. Sechin Matoori S, Nourafza N (2021) Big data analytics and cognitive computing: a review
study. J Bus Data Sci Res 1(1):23–32

25. Sreedevi AG, Harshitha TN, Sugumaran V, Shankar P (2022) Application of cognitive
computing in healthcare, cybersecurity, big data and IoT: a literature review. Inf
Process Manage 59(2):102888
26. Sangaiah AK, Goli A, Tirkolaee EB, Ranjbar-Bourani M, Pandey HM, Zhang W (2020)
Big data-driven cognitive computing system for optimization of social media analytics. IEEE
Access 8:82215–82226
27. Coccoli M, Maresca P, Stanganelli L (2017) The role of big data and cognitive computing in
the learning process. J Vis Lang Comput 38:97–103
28. Mallow GM, Hornung A, Barajas JN, Rudisill SS, An HS, Samartzis D (2022) Quantum
computing: the future of big data and artificial intelligence in spine. Spine Surg Relat Res
6(2):93–98
29. Shaikh TA, Ali R (2016) Quantum computing in big data analytics: a survey. In: 2016 IEEE
international conference on computer and information technology (CIT). IEEE, pp 112–115
30. Chen SYC, Wei TC, Zhang C, Yu H, Yoo S (2022) Quantum convolutional neural networks
for high energy physics data analysis. Phys Rev Res 4(1):013231
31. Ramezani SB, Sommers A, Manchukonda HK, Rahimi S, Amirlatifi A (2020) Machine learning
algorithms in quantum computing: a survey. In: 2020 international joint conference on neural
networks (IJCNN). IEEE, pp 1–8
32. Ostaszewski M, Trenkwalder LM, Masarczyk W, Scerri E, Dunjko V (2021) Reinforcement
learning for optimization of variational quantum circuit architectures. Adv Neural Inf Process
Syst 34:18182–18194
33. Wang H, Zhao J, Wang B, Tong L (2021) A quantum approximate optimization algorithm with
metalearning for MaxCut problem and its simulation via TensorFlow quantum. Math Probl
Eng
34. Pandey A, Ramesh V (2015) Quantum computing for big data analysis. Indian J Sci 14(43):98–
104

Part I
Integration of Cloud, Internet of Things,
Virtual Reality and Big Data Analytics

Chapter 2
Impact of Big Data and Cloud
Computing on Data Analysis
2.1 Big Data Architecture with Hadoop and MapReduce
Big data deals with large volumes of multidimensional data, such as the data generated
by Google, Yahoo, LinkedIn, eBay, etc. Big data analytics comprises the techniques by
which this huge amount of data can be processed and analyzed in a rapid and
cost-effective manner. A traditional database management system (DBMS) fails to handle
such big data. Therefore, Google developed its own MapReduce technique, which works
efficiently on the Google File System. With the BigTable system used alongside the
Google MapReduce framework, it becomes easy to search millions of records and return
the result in milliseconds. The characteristics of big data rest on three pillars:
velocity, variety, and volume (stored in data warehouses). The benefits of using the
big data concept are listed below.
• You can get more comprehensive answers thanks to big data because you have
access to more data.
• More thorough responses increase data confidence, which calls for an entirely
different strategy for approaching issues.
• Big data’s capacity to assist businesses in product innovation and redesign is a
tremendous advantage.
• Big data analytics is utilized to provide marketing insights and solve problems
for advertisers.
• Businesses can detect a variety of customer-related patterns and trends thanks
to the utilization of big data. By analyzing customers’ purchasing behavior, a
business can identify the most popular products and develop items in line with
this pattern.
• Big data tools can process and analyze customer feedback about the company
through sentiment analysis, which helps in managing and growing the business.
• Hadoop and MapReduce tools can find new data sources that assist firms in speedy
data analysis and knowledge-based decision-making.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Chakraborty and L. Dey, Computing for Data Analysis: Theory and Practices,
Data-Intensive Research, https://doi.org/10.1007/978-981-19-8004-6_2
There are three varieties of big data [1].
i. Structured Big Data: Structured data has a specific format and can be
processed, saved, and retrieved in that format. It refers to highly organized data
that can be quickly and easily stored in a database and accessed from it using basic
search methods. For example, the student table in an institute database is
structured and organized.
ii. Unstructured Big Data: Data that has no form or organization at all is
referred to as unstructured data. As a result, processing and analyzing unstructured
data is extremely challenging and time-consuming. An example of unstructured data is
email.
iii. Semi-structured Big Data: Data that contains both of the aforementioned
formats, i.e., structured and unstructured data, is referred to as semi-structured
data. To be exact, it refers to data that carries essential information or tags that
separate different data items even though it has not been organized under a
particular repository (database).
The activities performed on big data can be classified into three categories: storing
the big data in a distributed environment; processing it (cleaning, modifying,
transforming, and running algorithms), which is a complex job; and, lastly, accessing
(searching and retrieving) the data. There are also several types of analytics on big
data:
i. Descriptive Analytics
It is focused fully on historical data. In this case, data warehousing plays a vital
role to store the last 10–15 years of historical data. Data aggregation and data
mining tools are used in this kind of analytics to discover patterns from the
historical data.
ii. Predictive Analytics
It is a collection of statistical methods that uses machine learning algorithms to
find trends in data and forecast future behavior and actions. Predictive analytics
software is no longer just for statisticians; it is now more readily available and
less expensive for a variety of sectors and industries, including the field of
learning and development.
iii. Prescriptive Analytics
Prescriptive analytics is a statistical technique for formulating recommendations
and making decisions based on the results of computations from algorithmic models.
Recommendations cannot be generated without knowing what to look for or what problem
has to be fixed, so prescriptive analytics starts with a problem. For example, using
predictive analysis, a training manager learns that the majority of students who
lack a specific skill will not finish the recently launched course. What can be
done? Prescriptive analytics can now help choose options for action. Perhaps an
algorithm can identify students who need the new course but lack the specific skill
and automatically suggest that they use a different training resource to pick up the
deficient skill. However, the correctness of a given conclusion or suggestion
depends on how well the computational models and the data were developed. What makes
sense for one company’s training requirements might not make sense when implemented
in the training department of another organization. It is generally advised that
models be customized for each particular circumstance and requirement.
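The three kinds of analytics can be contrasted on the training example used above. The following is a minimal sketch only: the records, the function names, the naive frequency-based predictor, and the 0.5 threshold are all hypothetical and chosen purely for illustration, not taken from any real training system.

```python
# Hypothetical training records: (has_skill, completed_course). Illustrative values only.
records = [
    (True, True), (True, True), (True, False), (True, True),
    (False, False), (False, False), (False, True), (False, False),
]


def completion_rate(rows, has_skill):
    """Descriptive analytics: summarize what happened in the historical data."""
    outcomes = [completed for skill, completed in rows if skill == has_skill]
    return sum(outcomes) / len(outcomes)


def predict_completion(rows, has_skill):
    """Predictive analytics (naive): reuse the historical rate of the matching
    group as the forecast for a new student."""
    return completion_rate(rows, has_skill)


def recommend(rows, has_skill, threshold=0.5):
    """Prescriptive analytics: turn the forecast into a suggested action."""
    if predict_completion(rows, has_skill) < threshold:
        return "take the prerequisite training resource first"
    return "enroll in the course"


for skill in (True, False):
    print(skill, completion_rate(records, skill), recommend(records, skill))
```

A real predictive model would use more features than a single flag, and the threshold would have to be tuned for each organization, in line with the advice above that models be customized for each circumstance.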
2.1.1 Hadoop Architecture
Hadoop is a Java-based open-source software framework from Apache, built around
MapReduce, that efficiently handles big data, especially unstructured data. It runs
on a cluster of machines to manage and store big data in a distributed manner.
Besides Google, many companies today use Hadoop with MapReduce to handle big data
activities inside their organizations. Figure 2.1 shows the important components of
the Hadoop architecture, which mainly consists of:
• MapReduce.
• Hadoop distributed file system (HDFS).
• Yet Another Resource Negotiator (YARN) framework.
• Hadoop Common.
The Hadoop ecosystem plays a vital role in meeting the needs of big data processing.
It includes HDFS, HBase, Hive, Sqoop, Flume, Spark, MapReduce, Pig, Impala,
Cloudera, Oozie, and Hue. In this chapter, we mainly focus on the MapReduce
component [2].
A. MapReduce
MapReduce is a programming model and processing framework that runs on top of
YARN. The primary function of MapReduce is to carry out parallel distributed
processing in a Hadoop cluster, which is what makes Hadoop operate so quickly.
Serial processing is no longer practical when working with big data. MapReduce
divides the work into two phases, Map and Reduce.
Fig. 2.1 Components in Hadoop architecture (HDFS for distributed storage, MapReduce
as the distributed framework for computation, Hadoop YARN, and Hadoop Common)
Fig. 2.2 Overall MapReduce operation phases (big data as input, several Map
functions, Reduce(), output)
From Fig. 2.2, it is clearly visible that the big data input is initially accepted
by the Map() function, which divides the data into key-value pair-based tuples with
the help of the RecordReader module. These tuples act as input to the Reduce()
function. Reduce() merges the individual tuples into sets of tuples grouped by key,
optionally with the help of a combiner module. Basic operations such as shuffling,
sorting, and summation can be executed on these sets of tuples as required before
the result is sent to the output. The primary duty of Reduce is thus to gather the
tuples produced by Map and perform some sort of aggregation on those key-value pairs
based on their key. In the output phase, with the aid of the record writer, the
key-value pairs are written to the output file, with each record starting on a new
line and the key and value separated by a delimiter (a tab by default in Hadoop’s
text output format) [3].
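The Map, shuffle, and Reduce steps described above can be imitated in a few lines of ordinary Python. This is a single-process sketch of the programming model only, not Hadoop's API: the function names and the word-count task are illustrative assumptions, and a real cluster would run many map and reduce tasks in parallel on different nodes. The tab-separated output format is likewise an assumption made for the example.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one input record (a line of text) into (key, value) tuples."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle and sort: group all values by key between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Reduce: aggregate the values that share a key (here, by summation)."""
    return key, sum(values)


def map_reduce(records):
    """Run the three steps in sequence on an in-memory list of records."""
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle_phase(mapped)
    return [reduce_phase(key, values) for key, values in grouped.items()]


lines = ["big data needs big clusters", "data drives decisions"]
result = map_reduce(lines)

# Output phase: one record per line, key and value separated by a tab.
output = "\n".join(f"{key}\t{value}" for key, value in result)
print(output)
```

Because each call to `map_phase` depends only on its own record, and each call to `reduce_phase` only on its own key, both steps can be distributed across machines, which is the property the framework exploits.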
B. Hadoop Distributed File System (HDFS)
Based on the Google File System (GFS), the Hadoop distributed file system
(HDFS) is a distributed file system designed to run on commodity hardware. It is
meant to be deployed on inexpensive hardware and is highly fault-tolerant.
It supports applications with massive datasets and offers high-throughput access
to application data. Alongside HDFS, Hadoop includes two further modules: Hadoop
Common, a set of Java libraries used by the other modules, and Hadoop YARN, a
framework for managing cluster resources and scheduling tasks. Instead of relying
on expensive, high-configuration servers, Hadoop helps to build a single functional
distributed system in which the cluster computers read the data in parallel and
deliver high throughput. In this cluster, the NameNode and the DataNodes work as
master and slaves, respectively [4] (Fig. 2.3).
The NameNode is mainly responsible for storing metadata, including transaction
logs, and the DataNodes are responsible for storing the data in the Hadoop cluster.
The more DataNodes a Hadoop cluster has, the more data it can hold. Therefore, it is
recommended that each DataNode have a high storage capacity in order to store a
large number of file blocks.
Fig. 2.3 Hadoop HDFS architecture (a NameNode acting as master and resource monitor,
with slave DataNodes running Map and Reduce tasks)
HDFS always stores data in terms of blocks. Hadoop performs the following tasks:
• Initially, data is organized into directories and files. Each file is made up of
uniformly sized blocks of 64 MB or 128 MB (preferably 128 MB).
• These files are then distributed across the cluster nodes for further
processing.
• HDFS, which sits on top of the local file system, supervises the processing.
• Block replication is done to handle hardware failure.
• Verifying that the code was successfully executed.
• Executing the sort that comes after the map phase and before the reduce phase.
• Delivering the sorted data to a certain machine.
• Generating debugging logs for every task.
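The block and replication arithmetic behind the list above can be worked through with a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, the 128 MB block size follows the preferred value mentioned in the text, and the replication factor of 3 is assumed here as the commonly used HDFS default, since the text does not give one.

```python
MB = 1024 * 1024


def hdfs_block_plan(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, size of the last block, total raw bytes stored).

    Assumes the final block only occupies the bytes it actually holds,
    and that every block is copied `replication` times across DataNodes.
    """
    if file_size_bytes <= 0:
        return 0, 0, 0
    # Integer ceiling division avoids floating-point rounding for large files.
    n_blocks = -(-file_size_bytes // block_size_bytes)
    last_block = file_size_bytes - (n_blocks - 1) * block_size_bytes
    total_raw = file_size_bytes * replication
    return n_blocks, last_block, total_raw


blocks, last_block, raw = hdfs_block_plan(500 * MB)
print(blocks, last_block // MB, raw // MB)
```

Under these assumptions, a 500 MB file is split into three full 128 MB blocks and one 116 MB block, and the cluster stores 1500 MB of raw data in total. This is why the storage capacity of the DataNodes matters as much as their number.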
C. YARN (Yet Another Resource Negotiator)
MapReduce runs on a framework called YARN. The two tasks that YARN carries
out are resource management and job scheduling. The goal of job scheduling is
to break large tasks down into smaller ones so that each job can be distributed
across different slaves in a Hadoop cluster, maximizing processing. The job
scheduler also keeps track of the jobs’ priorities, dependencies on one another,
importance levels, and other details like job timing. To manage all the resources
made available for running a Hadoop cluster, Resource Manager is used.
D. Hadoop Common
Hadoop Common, often known as the “common utilities”, is the set of Java
libraries, Java files, and scripts required by all the other components found in a
Hadoop cluster. HDFS, YARN, and MapReduce use these utilities for the cluster to
function. Hadoop Common assumes that hardware failure in a Hadoop cluster is
frequent and must therefore be handled automatically in software by the Hadoop
framework.
2.2 Big Data Analytics: Emerging Applications in Industry
We live in an era of big data. The notable improvement in Internet bandwidth and the
use of the Internet of Things (IoT) have driven the rapid growth of big data. These
days, every firm is amassing multidimensional data, including infobytes from media
to journal articles, tweets to YouTube videos, social networking updates, and blog
conversations. Multiple industries use big data applications on the cloud
platform.
1. Financial Institution and Banking
Big data is now used extensively to monitor the activity of financial markets such
as the stock exchange. Retail traders, big banks, and hedge funds in the financial
markets use big data for trade analytics such as high-frequency trading, sentiment
analysis, and predictive analytics. Risk analytics, such as anti-money laundering,
demand enterprise risk management, “KYC”, and fraud detection, also relies
extensively on big data.
2. Healthcare Industry
Some hospitals collect feedback about doctors, infrastructure, facilities, and
so on from patients and their guardians through mobile applications. Based on
that feedback, they can improve their services to patients. Some medical
institutes have combined free public health data with Google Maps to provide
visual data that enables quicker diagnosis and effective analysis of healthcare
information used in tracing the development of chronic diseases.
3. Education Sector
Big data is now used extensively in the education sector. Some institutes
monitor the overall progress of students over time through big data-based
learning management systems. It is also used to measure the teaching quality,
performance, and effectiveness of teachers or trainers to ensure steady growth
for the students.
4. Media, Entertainment and Social Networking
Social media and entertainment companies of all kinds (YouTube, Facebook,
Netflix, Amazon Prime, etc.) use big data techniques to generate content
for various target audiences, provide on-demand content, and monitor the
quality of that content. Big data also makes real-time sentiment analysis
possible during a football or cricket match. A great deal of current work
focuses on “recommendation systems”, where big data applications play a
significant role.

5. Retail and Online Shopping
Big data from social media is used for product promotion, gauging customer
interest, and customer retention. Big data technology also helps online
shopping stores (Flipkart, eBay, Amazon, etc.) reduce fraudulent activity,
analyze inventory from time to time, and detect customers’ shopping patterns.
These are directly connected with decision-making and profit and loss in the
shopping industry.
6. Government Sector and Insurance
Big data is being used by the food authority to identify and research disease
and sickness trends that are related to food. This enables a quicker response,
which has resulted in quicker treatment and fewer fatalities. For security
purposes (including border security), big data technologies are used
extensively by various government agencies and military forces. Big data also
enables insurance businesses to better retain their customers by analyzing the
patterns and behavior of their existing customers from historical records
collected from social media, CCTV footage, CIBIL scores, and records of
fraudulent activity.
7. Manufacturing and Energy Sector
Big data makes it possible to use predictive modeling to support decision-making
after absorbing and combining significant amounts of text, temporal, graphical,
and geospatial data. The manufacturing industry faces many challenges today,
and big data technology can handle them efficiently.
2.3 Cloud Computing: Definition, Models, and Architectures
Inspired by the grid computing and utility computing movements, cloud computing
emerged as a way to deliver hardware and software resources efficiently from a
large data center over the high-speed Internet. Customers pay according to
their use of computing, storage, and communication resources. The term “cloud”
represents the Internet, and “computing” refers to the processing performed on
those Internet resources. The cloud computing concept is based on a single
question: “Why purchase resources if we can rent them?” Cloud computing can
therefore be defined as Internet-based computing in which on-demand
(pay-as-you-go) access to a pool of resources is provided without significant
intervention by the service provider [5, 6].
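
The rent-versus-purchase question can be made concrete with a small arithmetic sketch. The prices and usage figures below are hypothetical and chosen only for illustration; the comparison also ignores operating costs such as power, maintenance, and staff, which a real comparison would have to include.

```python
def rental_cost(hours_used, rate_per_hour):
    """Pay-as-you-go cost: the customer pays only for the hours actually used."""
    return hours_used * rate_per_hour


def break_even_hours(purchase_cost, rate_per_hour):
    """Hours of use at which renting costs as much as buying outright."""
    return purchase_cost / rate_per_hour


if __name__ == "__main__":
    purchase_cost = 6000.0  # hypothetical price of owning a server
    rate = 0.50             # hypothetical rental price per hour
    hours = 2000            # hypothetical usage over the period considered
    print("rental cost:", rental_cost(hours, rate))
    print("break-even hours:", break_even_hours(purchase_cost, rate))
```

Under these assumed numbers, renting stays cheaper than purchasing as long as usage remains below the break-even point.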
Case Study-1
European researchers switched from supercomputers to cloud computing.
High-performance computing (HPC) uses powerful computers to solve high-end
complex problems, and it generates high-wage jobs. On average, 95% of the
computing capacity of desktop computers in universities is wasted. By using
Windows Azure, up to 99% of this capacity can be put to use. The traditional
European supercomputing industry has largely vanished. The main objective of
adopting cloud computing here is therefore better resource utilization. An
overview of a cloud computing system is shown in Fig. 2.4.

Fig. 2.4 Overview of cloud computing
Case Study-2 (Amazon Web Services)
Actual Analytics creates automated, assisted video content analysis tools that
make it possible to index and search video content based on what is happening
in the video. In the pharmaceutical sector, its products make it possible to
identify illnesses and medication side effects. The business decided early to
deploy its application on a cloud platform because of the large and fluctuating
processing needs associated with video processing. The main objective of
adopting cloud computing here is therefore rapid elasticity.
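
Rapid elasticity means that capacity follows demand. The sketch below shows one simple way to express that idea; it is not how Amazon Web Services or Actual Analytics implement scaling. The function `workers_needed`, its parameters, and the queue lengths are hypothetical.

```python
import math


def workers_needed(queued_jobs, jobs_per_worker, min_workers=1, max_workers=20):
    """Number of workers for the current queue, clamped to the allowed range."""
    wanted = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))


if __name__ == "__main__":
    # Hypothetical queue lengths over time for a video-processing service.
    for queued in [0, 3, 12, 250, 40, 0]:
        print(queued, "queued ->", workers_needed(queued, jobs_per_worker=5), "workers")
```

The lower bound keeps the service responsive when the queue is empty, and the upper bound caps spending during a spike; both limits are assumptions of this sketch.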
Case Study-3 (Microsoft Azure)
Anyone in the fishing sector should be extremely concerned about the high
incidence of maritime fatalities: an estimated 24,000 crew members drown
worldwide each year. Both MOB (man overboard) Guardian and GeoPoint use
personal safety equipment that automatically sets off on-vessel alarms and
transmits a signal via satellite to the search-and-rescue agency in the event
of a man overboard. The objective met by adopting cloud computing here is no
capital expenditure. Through the use of this cloud technology, many lives have
been saved at sea in the UK.

Another Random Scribd Document
with Unrelated Content

Koiria ei oltu ollenkaan muistettu. Oli ne siinä kyllä liehakoineet,
kun ampumaan ruvettiin, ja jälempää sanoi Jussi kuulleensa niinkuin
haukkua joltakin suunnalta, vaan ei ollut hänkään asiasta varma.
Lähdettiin takaisin Hinkkalaan päin.
Metsässä tavattiin polulla pieni poikanen, joka oli käynyt
satimillaan ja sattunut saamaan jäniksen, niinkuin ne joskus
ruishalmeen aitovieriä kierrellessään menevät satimeen.
Poika seisattui jänis selässä ja olisi ehkä livistänyt metsään, mutta
tulijat olivat jo liian lähellä.
Jänis ostettiin ja heitettiin kiiltävä markka pojalle. Poika seisoi siinä
vielä hetken katsellen ihmeissään komeita jahtiherroja, kun he
poistuivat, mutta sitten kävi kuin pieni nykäys hänen kädessään, hän
kurkisti rahaansa, puristi sen lujasti kouraansa ja läksi juosta
lyllertämään kotiinsa päin.
Illalla kun Joppi palasi herroja rautatielle saattamasta, vannoi hän
itseksensä vakavasti paikalla hankkivansa oikeat jahtikojeet ja
koiran, sillä niin hauskalta oli hänestä tämä ensimmäinen jahti
tuntunut. Ja ne olivat hankittavat pian, sillä herrat olivat luvanneet
lähtiessään tulla vielä ennen lunta uuteen jahtiin.
Raha vaan siinä asiassa teki pientä kiusaa, eikä ollut muistanut
herroiltakaan kysyä, mitä ne pyssyt ja koirat oikeastaan maksaa.
Ja hän teki puolihumalaisena laskujaan, istuessaan yksinään
rattailla. Antoi hevosen kävellä ja tuumi.
Jos hän veisi kauroja kaupunkiin, onhan niitä. Metsästä kyllä saisi
rahaa, mutta ei saa niin äkkiä kuin tarvitsisi. Mutta hän vie kauroja

— ja sen ison sonnin. Mikä sitäkään enää ruokkii. Sen hän tekee, ja
paikalla. Huomenillalla hän jo voi tavata samat herrat Helsingissä —
ja käskiväthän ne käymään luonansa, kun kaupunkiin tulee.
— Nooh! hän läimäytti piiskalla hevosta. — Pitää rientää, niin ehtii
vielä ennen maatapanoa hakea lahtarin aamuksi.
Sonni lyötiin aamulla penkkiin ja aitasta mitattiin kymmenen
tynnöriä kauroja.
— Mitähän se talvella syöttää karjallaan, kun nyt jo näitäkin vähiä
kauroja myömään rupeaa? arveli Eerikki, kun rengin kanssa pani
kauroja säkkiin.
— Myö aina molempia saman verran, niinkuin nyt näkyy
alkavankin.
Sillä tavallahan sitä pysyy tasapainossa, sanoi toinen.
— Kyllä se on vaivaisen tasapainoa, kun hoikkenee kummastakin
päästä, päätti Eerikki. Mutta sen tekee nyt jo mieli niitä eilisiä
herrojaan katsomaan, koska kuuluu Helsinkiin menevän. Ei riittänyt
kotoinen hummastus. — Ja menköön, suotta minä tässä loruan,
lopetti Eerikki, vetäisten säkin sidettä kiinni.
Joppi meni. Junassa meni ja junaan pani tavaratkin. Möi tavaransa
ja osti jahtikalua, kaikkea mitä tarvittiin, pyssyn, laukun, torven,
komean patruunavyön ja nauhasta kannettavan jahtipullon, isointa
laatua.
Siltä tutulta konjakkikauppiaalta osti satamarkkasella koiran ja
muuta metsämiehen tarvetta. Koiran piti olla maan parhaita. Eikä se
enään lampaitakaan aja, kun oikein maalle asettuu, vakuutti myöjä.

— Ja nyt sinne vasta hauska on jahtiin tulla, kun on sinullakin
koira — oli juotu jo veljenmaljat ja he sinuttelivat toisiaan. — Nyt
siitä vasta lystiä tulee, hoki kauppias vielä, kun hän moneen kertaan
jäähyväisiä otettaissa puristeli tiskinsä takaa Jopin kättä. Ja hän
antoi vakavana, käsi leu'alla, katseensa kulkea pitkin pulloilla
täytetyitä hyllyriviä, kääntyi sitten, otti sieltä muutamalta hyllyltä
paperiin käärityn pullon ja pisti itse sen Jopin taskuun
kaupanpäällisiksi — ett'ei Jopin tarvitsisi tiellä kannunpulloaan avata,
lisäsi hän.
Raha riitti osavasti ja Joppi tuli kotiin täytenä jahtimiehenä.
Ja niin se alkoi metsänkäynti.
Vaan kun aluksi ei ollut oikeata jahtikumppania, niin piti komentaa
edes omat miehet rankain hakkuuseen Teuron kulmalle,
metsäpalstaan. Sillä pitihän sitä olla joku näkemässä, miten
jahtimiehenä liikutaan, ja jonkunlainen toveri, jonka ryypylle huutaisi
näkemään kuinka siitä viheriänauhaisesta kylkipullosta ryyppy
kaadetaan kirkkaaseen metallipikariin.
3.
Eräänä iltana, joku viikko jälkeen Helsingin herrain ison jahdin,
istui Joppi kamarissaan porstuan perässä ja poltti pystyvalkeata.
Hän oli alkupäästä, herroja uuteen jahtiin odotellessaan,
kuleksinut metsiä yksin. Oli tänäänkin ollut jahdissa, mutta nyt oli
ollut seuraa. Kirkonkylän kauppamies oli tullut aamulla aikaiseen

Hinkkalaan, oli lähdetty yhdessä metsään ja Joppi oli kaatanut
jäniksen. Metsällä kulkiessa oli keskusteltu kaupungin herrain
jahdista ja kauppias oli päätellyt, että ei ne kaupungin puotimiehet
otusta saa koskaan, vaikka olkoon kuinka hyvät pyssyt ja laitokset
heillä tahansa. Muuten ovat kyllä iloisia miehiä ja hauskaahan niiden
kanssa on pyhäpäivä mellastaa. Lampuria koiria niillä oli melkein
aina, eikä tiennyt vielä Jopinkaan koiraa taata, kun vaan olisi ollut
lampaita näkyvissä, mutta ne oli kaikkialla jo otettu sisään. Jäniksen
ajaja siitä kyllä tulee, kun vaan oikeitten pyssymiesten seurassa
kulkee.
Varsin tyytyväisenä ja hauskalla päällä oli Joppi eronnut
kumppanistaan. Kotiin tullessa oli alkanut sataa ja hän oli kastunut
aika lailla. Oli sentähden, saatuaan märät vaatteet päältään,
sytyttänyt tulen pesään ja istuutunut siihen valkean loisteeseen.
Hän istui kädet polvilla ja katseli kuinka kuusipuut räiskyen
paloivat pesässä. Nousi ylös, kohensi puita, heitti nurkasta
muutaman puun lisää ja asettui taas tuolille takkavalkean eteen.
Hän näytti alakuloiselta. Vaikka oli iloisena, joskin märkänä, tullut
kotiin, istui hän nyt tavallista miettivämmän näköisenä, sillä hän oli
tuskin päässyt kamariinsa metsältä tultua, kun jo vanhempi renki oli
pyrkinyt pakinoille ja pyytänyt vähän rahaa. Ja se oli vienyt kuin
puhaltamalla Jopin päästä kaikki jahtimietelmät pois.
Kun ei ollut rahaa tällä kerralla yhtään.
Kirottua rahaa, ett'ei sitä pidä saaman tarpeeksi! tuumaili hän.
Ensin se niin pikainen pyynti oli vähän vavahuttanut, mutta heti
piti ottaa kova muoto päällensä, ja hän oli sanonut tiukasti rengille,

että en minä nyt märkänä sinulle rahoja kaiva. Anna aikaa
riisuutuakseni edes — ja kyllähän huomenna saat, mitä sillä yöllä
teet. Ja renki oli mennyt pois.
Mutta Jopin se juttu pani tuumimaan. Hän tuumi sinne, tuumi
tänne, ja hautoi päässänsä, mistä sitä saisi, ja vielä ihan huomiseksi.
Ei hän ennen ollut ikänä tällaiseen pulaan sattunut, ei isän eläissä
koskaan, sillä isä oli aina maksanut palkollisten palkat ja kaikki mitä
talosta meni. Eikä hän ollenkaan ollut muistanut, että muutto-aika oli
kohta käsissä ja pitihän niille maksaa palkka, kellä vielä oli saatavata.
Ja saatavaa niillä oli kaikilla, piioillakin paljon vielä.
Hän alkoi aatoksessaan kokoilla, paljonko ylisummaan menisi ja
muistikin, että Jussille ei mene enään mitään, se on palkkansa
hävinnyt pelissä ja ehkä vähän enemmänkin.
— On se edes hyvä, tuumi Joppi, päästen hiukkaista paremmalle
päälle.
— Ja Iita sitten — jaa, se jää taloon, ja vaikk'ei jäisikään, niin
kyllähän sen kanssa aina selvän saa.
— Mutta toinen piika ja tuo Heikki juutas, joka on melkein koko
palkkansa sisässä pitänyt — menee sitä sentään toista sataa, ja kyllä
siitä tulisi ilmeinen häpeä, joll'ei palkollisilleen voisi maksaa. — En
ymmärrä, mistä se isävainaa otti rahaa, kun sillä sentään aina oli
oma tarve. — Mutta se nuukaili, ja vaivasi liiaksi itseään, ukko parka
— rääkkäsi suorastaan ruumistaan. Eikä hän elellyt herrasseuroissa
— ei ymmärtänytkään sitä.
Joppi kynsäsi päätään, nousi tuolilta ja kohenteli taas pesää. Ei
ottanut ajatus oikein seisattuakseen mihinkään kohti. Hän käveli

pitkin permantoa. Pikari ja tyhjä puteli olivat aamulla metsään
lähtiessä jääneet pöydälle. Ne hän pisti kaappiin.
Kaikki oli tyhjää nyt, ei kukkaro yksin.
Tuli alkoi jo hiiltyä, mutta ei Joppi viitsinyt ottaa valkeata
lamppuun, vaan heittäysi hiilustan hämärässä valossa pitkälleen
vuoteelle.
— Olisiko lainata rahaa ensi hätään, jatkoi hän miettimistään. —
Mutta mistä nyt tähän kiireeseen lainaa? Naapurin ukolla kyllä olisi,
mutta ennen menkööt talo ja tavarat, ennenkuin siltä juutalaiselta
pyytää. Etemmäksi ei viitsi lähteä, kun on väsynyt, ulkona sataa ja
on pimeä. Lähteköön susi!
— Kun olisi päivällä tullut mieleen, niin olisi pyytänyt Nuurperilta.
Ajatus palasi tämänpäiväiseen jahtiin, tahi siinä jahdin muassa
oikeastaan metsään. Se oli joskus ennenkin pysähtynyt samaan
kohti, ja varsinkin viime aikoina, kun Joppi Teuron puolella
takametsässään alkoi jahtailla, asustanut yhä useammin niissä
suurissa hongissa siellä, joista Nuurperikin tänään oli sanonut, että
tuossa niitä markkoja makaa.
— Makaa siellä rahan roskaa, ei muuta kuin ottaa vaan, kosk'ei
osannut jo ennen ottaa isä. Ja mikä estää ottamasta näin
tarvittaissa. Se täytyy tehdä heti kuin sopii vaan, se täytyy! Hullu
tässä päätänsä puutteessa pinnistelköön, päätteli hän.
Se ajatus otti nyt Jopin ajussa varman, pysyvän olinpaikan, eikä
jättänyt huomispäivän pienelle rahantarpeelle eikä rengin saataville
sinne sillä kerralla enään mitään tilaa.

Joppi sytytti valkean, huusi Iitaa laittamaan illallista, ja otti pyssyn
nurkasta, alkaen sitä siivota, vihellellen ja puhutellen koiraansa.
Samassa ilmestyi taloon vieras.
Joppi kuuli tulijan koputtelevan jalkojaan porstuassa ja kysyvän
piialta onko isäntä kotona.
Kamarin ovi aukeni ja mies astui sisään. Pitkä, viiksinaamainen
mies, sateesta märkänä, pitkissä, yli polven ulottuvissa
varsisaappaissa, leveä hattu päässä, pyssy kädessä, ja keltainen,
taitteille käännettävä mittapuikko palttoon rintataskusta puoleksi
näkyvissä.
Se oli Tukki-Oskari.
Hänen oikea nimensä oli Oskari Alander, vaikka häntä yleensä
kutsuttiin vaan Tukki-Oskariksi. Hän oli Korvenkylän Alatalon poikia
— samasta pitäjästä — vaan kun isä oli kuollut ja talo tullut
vanhimmalle veljelle, oli Oskari joutunut kotoaan pois. Isä oli
käyttänyt häntä vähän kansakoulussa, jossa hän oli oppinut jonkun
verran kirjoituksen alkeita, ja sen vuoksi oli Oskari tuuminut, että
työmieheksi hän ei rupea, kyllä pännän pitää elättää. Oli ottanut
nimekseen Alander ja oleillut jonkun ajan puotipoikana siellä täällä
maakauppiailla, vaan kun se ei oikein vedellyt, yhtynyt lopulta
tukkimiehiin, koska se ammatti oli hänestä niin vapaata, ja elätellyt
itseään keksillä. Vaan hänpä olikin pian kohonnut tavallista
tukkijunkkaria ylemmäksi kirjoitustaitonsa avulla. Kun osasi sen
verran piirtää numeroita, että voi paperille lukea tukit ja merkitä
niiden ko'on, pääsi hän vartesmanniksi Porin puulaakiin, osti
taskukirjan ja mittapuikon ja piti niitä aina rintataskussaan
näkyvissä, niinkuin arvonsa merkkeinä, ja kulki kuin herrat ainakin.

Keväät ja kesät hän johdatteli uittoa ja piti jokirannan kyläin
tytöille hyvää suuta, syksyt ja talvet mittaili tukkia metsissä, ja oli
lopulta saavuttanut niin suuren luottamuksen, että puulaaki jo uskoi
hänet metsäkauppoja hieromaan talonpoikain kanssa. Uskottiinpa
aina joku summa rahojakin hänelle, että hyvän kaupan syntyessä
voisi antaa myöjälle käsirahoja.
Ja kun rahaa piteli, osasi siitä aina säästää itselleenkin jonkun
markan lisätuloksi, vaikka palkkakin oli jo tavallisen hyvä.
Sentähden olikin Oskari aina hyvissä varoissa ja kulki siistissä
vaatteissa. Ryyppymies hän oli kohtalainen, mutta joi aina varovasti,
ei milloinkaan liikaa, paitsi juuri aniharvoin, kun sattui jonkun
parhaan ystävänsä kanssa turvalliseen paikkaan. Hän säästeli rahoja,
ja hänen hartain salainen toivonsa, perimmäinen päämääränsä oli,
saada kerran koetella metsäkauppoja omalla kukkarollaan.
Hän oli hiljan palannut Porista kotiseudulleen ja kierteli siellä
metsiä pyssyineen — noin vaan niinkuin muuna metsämiehenä —
mutta itse asiassa tarkastellen tukkipuita ja aprikoiden, mitä mistäkin
metsästä lupaisi, jos saisi omistajan kauppoihin ryhtymään. Jopillakin
hän tiesi olevan Teurossa hyvän metsän, aivan ojan varrella, jota
myöten uitto kävi keväällä mainiosti väljemmille vesille Vanajaan
päin. Sen metsän hän oli jo salavihkaa tarkimmilleen tarkastanut, ja
olipa eräänä päivänä ollut vähällä, ett'ei hän tapaturmassa Joppiin
yhtynyt tämän jahtimatkoilla, mutta oli saanut hänet sivuutetuksi.
Siitä olisi voinut syntyä epäluuloa vanhain ystävysten väliin, sillä hän
oli Jopin, niinkuin monen muunkin kanssa, vanha hyvä tuttu, ja
olivat he ennen monasti koetelleet kumpi nakin lyönnissä oli
etevämpi.
Nyt hän näytti tulleen taloon kuin puolittain sattumalta.

Sisäänastuttuaan asetti hän pyssynsä ovipieleen, laski märän
hattunsa tuolille pesän eteen ja tervehti Joppia.
— No mutta Oskarihan se on. Mistä päin sinä — tsooh, älä siinä
rähise — ja Jopin piti kesken puheen ruveta taltuttamaan koiraansa,
joka vieraan tullessa oli horkuksistaan herännyt ja alkanut haukkua.
— Ei se pure, elä pelkää —
— Jaa mistä minä tulen. Sen kyllä kuulet, kun saan tämän märän
nutun päältäni ja kuulen annatko vanhalle ystävälle yösijaa.
— Sehän tulee pyytämättäsikin, vaan mitä sinä tällaisessa
jumalanilmassa kuljet, pimeällä vielä?
— Se nyt käy niin, että joka tyynellä istuu, se tuulella soutaa. Olin
tuolla kirkolla ja päätin sieltä lähteä taas kerran katsomaan kotikylän
puoltakin, Korventaa, mutta kun on paljon tuttuja, niin tulee
viivähtäneeksi. Luulin saavani hevosen, vaan en saanutkaan, ja niin
läksin tästä oikoteitä käymään jalan. Olisin kyllä mennyt
yhtäpäätäkin pimeän vuoksi, vaan tuli tuo jumalaton sade, ja kun ei
erittäin kovaa kiirettä ole taas räin syyspuolella, ajattelin, että
poiketaanpa Hinkkalaan katsomaan miten siellä Joppi isännöi — ja
emännöi, sillä eihän sinulla vielä kuulu akkaa olevan.
— Ei ole, eikä ole tulostakaan tietoa, mutta on mulla hyviä piikoja
—
— Hyvä on sekin.
Iita, nuorempi piika, tuli samassa kamariin. — Että vaikka
emännäksi kelpaa, katsopa tätäkin, sanoi Joppi.

— Jaha! Kah Iitahan se on, vanha tuttu, sanoi Oskari paiskaten
kättä tytölle.
Iitan poskille lenti puna Oskarin tervehtäissä, vaan ei sitä miehistä
kumpikaan puolihämärässä ovensuussa eroittanut.
— Mistä sinä tämän tunnet? tiedusti Joppi.
— Olinhan talvella täällä, etkö sitä muista.
— Niinpä olitkin tosiaan. Kuule Iita, onhan siellä vielä teelehtiä?
Keitäppä meille sajua.
— Sitävarten juuri tulin. Kyllä minä keitän, sanoi piika ja poistui.
— Häpeä sanoani, jatkoi Joppi, mutta nyt on talo niin kuiva, ett'ei
ryyppyyn päästä. Eikä se usein ole sattunut, vaan nyt juuri
pahimmoikseen.
— Onpa oltu ilman ennenkin. — Mutta nythän minä vasta
huomaan, että sinusta on tullut jahtimies. Ja oletpa hankkinut aika
komean pyssyn, vallan takaaladattavan, ja koiran. Vai onko se omasi
tuo koira?
— Kyllä se taloon kuuluu, sanoi Joppi vähän mahtavana.
— Sinäpä poika olet, ja millä sinä niitä kaikkia ostelet, kun kuulin
kesällä oriinkin ostaneesi?
— Että millä minä ostan, kysyt.
— Niin, että mistä sinulla aina riittää? — vai oletko pelannut
pyssyn Helsingin herroilta, koska tuolla tullessani kerrottiin niitä
täällä pitkin syksyä käyneen.

Joppi säpsähti, niinkuin olisi äkkiä herännyt, ja kourasi
otsatukkaansa — aivan huomaamattaan. Päässä välähti ajatus, että
pelatahan hänen olisikin pitänyt niiltä pyssy ja koira silloin kun olivat
heillä yötä ja yhdessä hummattiin. Voi yhdeksänkolmatta! Miks'ei ne
olisi pelanneet, kun muutakin pelattiin. Olisi pannut hevosen vastaan
— ja voittanut, varmaan voittanut, sillä ei ne järinkään tarkoilta
näyttäneet. Ja sitten olisi niillä tavaroillaan ostanut muuta tarvetta,
tai pannut saatuaan rahat taskuunsa.
Mutta katuminen ei enään auttanut, eikä varsinkaan puhuminen
Oskarille niin ääretöntä tyhmyyttä, ettei olisi pelannut. Piti ruveta
äkkiä rauhalliseksi vaan.
— Voitin kuin voitinkin nämä helsinkiläisiltä, vastasi hän riuskasti,
temmaten pyssyn Oskarin kädestä ja heiluttaen sitä, niin että piippu
välähteli lampun valossa. — Ja Pekka oli vastassa, oriini.
— Ja niinkö paljon rohkenit? Puhutkohan vaan totta, sanoi Oskari,
joka oli huomannut Jopissa jotain epäillyttävää.
— Totta tai valetta, se on yhdentekevä pikkuseikoissa, vastasi
Joppi harmissaan. Ja luuletko sinä, ett'ei Jopilla olisi aina niin paljon
perää, että ostaakin jaksaa, mitä tarvitsee. Kalut pitää olla hyvät,
syödään mitä saadaan.
— No no, ei senvuoksi sentään että leipä puuttuisi, vaikka limppu
ostosessa on, sanoi Oskari leikillisesti. Mutta joko sinut on kastettu
oikeaksi jahtimieheksi? Joko olet jäniksen peijaisiin sattunut?
— Kyllä ne ovat tapahtuneet jo kumpikin. Tänäänkin ammuin
jäniksen,
Nuurperin kanssa, vastasi Joppi.

— Vai Nuurperin kanssa olit.
Niin, ei minua saaliin halusta yksin huvita metsän käynti, enkä ole
juuri suuria saanutkaan, kun en ole vielä oikein tottunut ampumaan
juoksevaa otusta, mutta hyvä seura, se se pääasia on, olen
huomannut. Kas sellaistakin päivää, kuin tässä hiljattain
helsinkiläisten herrain kanssa metsällä, ei ole minulle monasti
sattunut. Voi Anttila vainaa! Tiedäppä, että kotona pistettiin liiviin
ensin koko yö, ja kun aamulla metsään lähdettiin, piti meidän Jussin
kantaa kokonainen kannu konjakkia evääksi, kuuteen metsäpulloon
pantuna, ja siihen vielä viinit ja muut hyvät lisäksi. Ei kastetakaan
joka poikaa sellaisilla pidoilla jahtimieheksi.
— Ja jäniksiä tuli vahvasti, kysyi Oskari.
— Ei saatu, oli liian kuiva sää. Ei ne herrat siitä sentään pahoillaan
olleet, vaikkei saaneetkaan, lupasivat tulla uuden kerran. Eikä ne
olleet erittäin saaliinhaluisia nekään, kuin mitä vähän aamusta ensin.
Malttuivat kumminkin pian, kun olivat huomanneet ilman
mahdottomaksi. Minä kyllä olen sen jälestä käynyt yksikseni itseäni
harjoitellen, vaan olen huomannut, ett'ei se sitä ole kuin seurassa.
Kyllästynyt olisin melkein jo, mutta sattui sitten Nuurperi tulemaan,
ja toista se oli heti tänään, vaikk'ei lähimainkaan niinkuin se iso jahti,
hui hai!
— Mutta jos mentäisi huomenaamunakin, sanoi Oskari. Ei ole
minullakaan mitään hengen hätää matkallani, ja kun sattuu pyssykin
olemaan muassani.
— Sehän ihmeesti sopii, mutta peijakas — Joppi napsautti
sormiaan — kuinka mennään, kun ei ole evästä mitään. Toin kyllä
hiljan Helsingistä tavallisen satsin, mutta se tuli tässä kulutetuksi niin

ja näin, niinkuin ymmärrät, ja lopputilkasta saatiin tänään tuskin
matti puolelleen. Viitsitäänköhän lähteä.
— Koetetaan, sittehän tiedämme kuinka se ilman viinaa käy,
ehdotti
Oskari.
— Mutta sinä et tiedä kuinka se on mukavata, kun metsässä tapaa
sopivan paikan, sitten istuu, ja siinä iloisten jahtiveikkojen kanssa
ryyppää, sanoi innokkaasti Joppi.
— Voi veljeni, kyllä minä tiedän, mutta minkä puutteelle voi.
— Totisesti! Eipä sille näin suutapahkaa voi yhtään mitään.
Puutteesta puhuttaissa lensi Jopin mieleen taas se huominen
rahantarve, ja jos huomenna juuri voisikin asian kaartaa, niin
tarvitsee sitä pian kumminkin, pyhäinpäiväksi. — Mutta aivan
samalla kertaa iski mieleen jo metsän myyntikin, ja siinähän se oli
omassa kamarissa nyt paras metsänostaja kuin käsketty. Kyllä siitä
on nyt pidettävä kiinni — ja onhan se voinut vallan sitä varten
tullakin, vaikka sanoi muilla asioilla kulkevansa, eikä tahdo heti
ilmaista mieltänsä. Onpa varmaan tullut sitä varten, veitikka! Pitää
sitä viivyttää, koska on halu viipyä. Ehkä jo aamulla rupeaa asiata
alkamaan.
— No Joppi, mitä tuumit nyt? sanoi Oskari, kun tämä näytti jotain
mietiskelevän.
— Mietin, että mennään metsälle aamulla kuin mennäänkin, koska
sattui hyvä toveri, sanoi Joppi.
Ja se oli sillä päätetty.

Piika toi teetä ja alettiin sitä ryypiskellä.
— Pitääpä pistäytyä vähän ulkona, sanoi Oskari ja läksi kamarista.
Mutta hän kääntyikin tuvan puolelle, avasi oven, kurkisti tupaan ja
kysyi matalalla äänellä, että onko Jussi siellä?
Jussi oli jo ennättänyt vuoteelle. Ei hän vielä nukkunut, vaan kun
ei ollut mitään puhdetyötäkään, oli hän ruvennut vuoteelleen
ovipielessä, vastapäätä uunia.
Hän kavahti ylös, kuultuaan Oskarin äänen, ja meni ovelle, kun
huomasi ett'ei toinen aikonut tulla sisään.
Oskari veti hänet porstuaan ja puhui siellä hetken hiljaa hänen
kanssaan.
— Joll'ei Kulmalasta saa, niin ei sitte mistään. Siellä melkein aina
on, mutta tämä on tavaton matka, enkä lähempänä tiedä, sanoi
Jussi kuultuaan Oskarin asian.
— No ei sillä kiireempää ole kunhan aamuksi saat, kuiskasi toinen.
Kyllä minä vaivasi muistan, tuossa on rahaa.
Oskari meni takaisin kamariin, mutta Jussi puki kiireellä päällensä,
läksi tuvasta ulos ja hävisi pilkkosen pimeään syyssateeseen.
Kamarissa olijat eivät tahtoneet oikeaan mielialaansa päästä sinä
iltana. Pakina kulki jäykänpuoleisesti, vaikka sitä kyllä koetettiin
hierustella. Lyötiin korttiakin niinkuin tavallista oli, mutta leikin vuoksi
vaan, sillä Joppi ei nyt sanonut viitsivänsä rahapeliin ruveta.
He katselivat vuoroin toinen toistansa — salaapäin — ja kumpikin
koetti ikäänkuin haistella toisen tuumia.

Ja lopulta kumpikin varsin vainusi toisen aatoksen juoksun, vaikka
koetti pidätellä itseään tuomasta omaansa ensiksi esille, sillä heidän
molempain ajatus pyöri aivan samassa navassa. Oskari tiesi, että
Joppi herroiksi eläissään alkoi olla yhä kovemmassa rahan pulassa;
ja ett'ei hänellä nytkään ollut rahoja, sen hän huomasi varmaan.
Yhtä varmaan hän arvasi, että Joppi myö metsää heti, kun vaan
jokukin ostaja ilmestyy, ja olisi myönytkin jo, mennyt tarjolle, joll'ei
isävainajalta olisi jäänyt hiukan rahoja. Vaan eihän hänenkään,
Oskarin, sopinut suuta päätä ruveta kauppaa hieromaan, sillä sehän
olisi näyttänyt kovin hätäiseltä, ihankuin puulaakilla olisi kova metsän
tarve, ja hän nyt olisi vallan sitä varten taloon tullut. Pitää vaan
kysästä noin ohimennen, kun sattuu sopiva tilaisuus. Oskari istui
vakavana, heitellen huolettomasti lehtiä pöytään, mutta Joppi ei
näyttänyt illemmalla aivan levolliselta. Hän alkoi epäillä, että jos
Oskari ei ottaisikaan puheeksi metsäkauppoja, vaikka kyllä metsästä
ja metsänkäynnistä oli pakistu pitkin iltaa. Ja mitä hänen sitten
pitäisi tehdä? Tuntuisi kovin köyhältä, jos hän itse rupeaisi
tavaraansa tarjoilemaan.
Siihen suuntaan juoksi miesten ajatus ja alkoi lopulta niin painaa
kumpaakin, että keskustelu viimein joutui ihan pakkolain alaiseksi.
Ruvettiin lopulta yhdessä Jopin vuoteelle maata. Vielä valkean
sammutettua, kun ei heti unettanut, kertoi Oskari entisiä
jahtiseikkailujaan, hän kun oli paljon metsiä kulkenut.
— Oli sekin mukava tapaus — alkoi hän taas uutta juttua,
kääntyen lattianpuoleiselle kylelleen ja heittäen pimeässä pesän
suuta kohti paperossin tyngän, jota oli vielä makuullaan imeskellyt —
oli sekin mukava tapaus, joka minulle viime talvena sattui Riuttan
maisterin metsässä, kun meidän puulaaki siellä tukkia kaatoi. Istuin

tukkitelalla ja merkkailin puita kirjaani, niin alkoi pyy viheltää
takanani kuusikossa. Otin pyssyn, sillä se oli muassani, ja aloin
katsastaa kuusikkoon päin, mutta ei siellä mitään näkynyt ja vihellys
alkoi siirtyä etemmäksi. Elähän huoli, vielä sinä tulet sieltä
lähemmäksikin, ajattelin. Taitappa tuosta näreestä sileä oksa,
sanoin miehelle, joka oli muassani, ja tuo tänne. Mutta ollaan
hiljaa. Minä tekaisin kuusen oksasta piiskun ja aloin viheltää. Pyy
vastasi heti ja tuli lähemmäksi taas, ja me kyykistyimme miehen
kanssa telan ta'a, minulla pyssy viressä. Kumma pyy, kun ei lentoon
pyrähdä, sanoi mies, kun se yhä lähenteli ja aina vastasi. Kyllä sen
kohta näät, sanoin minä, se juoksee maassa. Mutta sieltä alkoikin
kuulua isompaa kapsetta, kuin pyyn pyrinätä, ja samassa tuli
kuusien välistä näkyviin maisterin partainen naama, silmät sirillään
tarkastellen pitkin maata, ja miehellä sormi liipasimessa. Me
rupesimme nauramaan ja minä huusin, että tässä on erehdytty
kumpikin, mutta maisteripa vasta suuttui! Piti sellaista elämää, että
jos huonompia miehiä olisi oltu, niin käpälämäki olisi otettu. Sakottaa
lupasi, kun hänen maallaan pyssyillä kuljetaan, ja häntä vielä
narrataan. Mutta onhan metsä puulaakin, väitin minä. Vaan maa
on minun, väitti maisteri. Silloin otin kovan pään minäkin ja sanoin,
että kun maisteri kerran saa puulaakilta viisikymmentä tuhatta —
— Saiko se roisto viisikymmentä tuhatta, keskeytti Joppi hätäisesti
vieruskumppaninsa puheen.
— Sai se, mutta siinä olikin metsää, veikkoseni.
— Ilmanko se on tullut yhä ylpeämmäksi, jos on ainakin tavallaan
ollut. Ei minua muu niin pistätä, kuin että se niillä rilloillaan aina
rällää, eikä jumalaa mainitse vastaantullessa, sanoi Joppi. — Olin
minä juhanina sitä niistää kievarissa, kun jouduimme sanan

käänteeseen, mutta maltoin mieleni sentään. — Vai niin paljon sai!
Kas kun en ole sitä ennen kuullut.
— Saa sitä metsästä rahaa, kun ottaa vaan, sanoi Oskari
huolettomasti, mutta aina metsänsä mukaan kukin. Kohtuuden
mukaan jokainen. — Vaan muutamat tahtoisivat olla järin susimaisia,
niinkuin tuo Jussilakin. Hän on tarjoillut sitä mäkeä tuolla puolella
Teuron, jonka tiedät —
— Kyllä tiedän. Paljonko vaatii?
— Neljää tuhatta kahdeksantuumaisiin saakka, pienestä mäen
nyppylästä. Vaan ei puulaakikaan rahojaan ilmaiseen jakele, ehei,
sanoi Oskari haukotellen.
Niin se käy, kun menee tarjoilemaan, ajatteli itseksensä Joppi.
Mutta minä kun en tarjoa, niin maksaa ne minulle neljä tuhatta tältä
puolen Teuron, mäestä ja korvesta, vaikka siellä on vähemmän isoja
puita, kuin Jussilalla vastapäätä. Minä sen tiedän. — Vissiin se
huomenna kysyy sitä, varsinkin kun mennään sinne päin metsälle. Ja
sitten tehdään kauppa, se antaa käsirahaa, pitäisi sen antaa, että
kauppa on pitävä — ja sitten tulee perästä koko tukko —
Oskari ei puhunut enään mitään. — Mahtaa nukkua jo, arveli
Joppi.
Ja nyt sai Jopin mielikuvitus siivet sivuunsa siinä mukavassa,
lepposassa lämpimässä peitteen alla, ja alkoi kulkea ennen
aavistamattomia aloja.
Aatos meni varmassa rahain toivossa niin pitkälle, että tuntui
melkein siltä, kuin olisi hänellä ollut niitä jo kourassa, ja herkkänä

liikkeiltään alkoi se löytää jo paikkojakin, mihin niitä asetella.
Nämä huoneet, ne oli korjattavat. Monasti oli pitänyt suorastaan
hävetä näitä mustuneita, sammallettuja seiniä, kun joku parempi
vieras sattui tulemaan. Maalata pitää koko talo, panna paperit seiniin
ja kattoon — miks'ei heillä niinkuin herroillakin. Mutta parempi
maalari pitää hakea, kuin se on ollut, joka pappilan salin kattoon on
kaikellaisia ruohonkorsia tuhrinut — hän oli katsellut niitä, kun oli isä
vainaata maahan toimittamassa. Sellainen on haettava, joka osaa
kattoon maalata kaksi pyssyä ristiin joka nurkkaan, ja seinälle koiran
kuvan — tai hevosen. — Sielläkin Helsingin ravintolassa oli
linnunkuvia seinällä — ja jäniksiä. — Se on niin mukavata, jos sattuu
jahtiherroja. — Ja uunit on lyötävä maahan, ei ne enään vie
savuaankaan — ja tehtävä uudet kaakelista. — Vaan ihan
ensimmäiseksi pitää ostaa rillat, kun on hyvä hevonen jo — ja niillä
ajaa että soi — ajaa kirkolle — ja sattuisipa silloin Riuttan maisteri
vastaan, niin löisi hevosta selkään ja laskisi komeana ohitse —
näyttäisi, että on meilläkin rillat. — Ja Joppi jo hykersi mielihyvästä
käsiään, ajaessaan niillä uusilla rilloilla.
Hänen tuli oikein kuuma ja hän työnsi peitettä alemmaksi.
Oskari rykäsi ja käänsi kylkeään.
"Aren't you asleep yet?" he asked. "I was already trying to sleep."
"I don't know what it is, but I can't get to sleep."
"What are you pondering, then?"
"Who knows. There are things to ponder in this world now and then,"
answered Joppi.
"So there are, but leave it be. There is pondering enough in it, once
you start! Have you even thought about which way we set out in the
morning?"
"We'll go to Teuro, there is good ground there," answered Joppi
decisively.
"That is really out of my way, but if you harness the horse it is all
the same. We can get back with it, after all," said Oskari.
"We'll harness the horse, they have time enough..."
The men sank into sleep, each in his own thoughts.
In the morning they rattled off to the forest along the rather poor
cart track that led from the village to the highway and the church.
There, midway between the villages, lay a stretch of forest about half
a Finnish mile (peninkulma) wide, through which the Teuro brook ran,
coming from the right out of the great Kilpisuo bog. The village road
crossed the brook a little before it joined the main highway, from
which one turned one way for the church and the other for the railway.
That forest land, by turns spruce swamp and hill, each village called
its back land, although it lay midway between them; but both villages
had their forest plots there, and these always fell behind the
cultivated fields, as seen from the village.
The plots ran in long strips, ending on either side at the brook,
which was the boundary between the villages. There, especially on the
church village's side, the loggers' axes had been ringing for many
winters already, and once the small timber had begun to be cut into
firewood and hauled to the railway, sizeable clearings had appeared in
many places, from which bare hillocks could be seen from afar. On the
Metsänkylä side of the brook there were fewer of them.
Toward its far end Joppi's plot ran right alongside the village road,
on its right-hand side, and ended at the bridgehead, where there was a
small newly settled cottage. It was no proper croft, but might one day
become one as the field grew. For now its occupant was still entered
in the books as a landless lodger and worked mostly for cash wages,
apart from his rent days, which he worked off each summer at Hinkkala.
And otherwise the spot was quite suitable for a croft's holding, for
the soil was good, clay-bottomed spruce swamp, but the occupant
complained that it was prone to frost, as was generally the case
across the whole district because of the great Kilpisuo bog.
Joppi and Oskari had decided to drive the horse as far as that cottage
and leave it there under shelter for the duration of the hunt. One
would have expected more hares at the edge of the forest land, just
behind the village fields and meadows, but since the decision had been
made unanimously, that was where they drove.
When the open fields of the village were behind them and they were
driving at a walk up the first wooded hill, Oskari bent down and, with
a grin on his lips, pulled a bottle out of the hay under the seat.
Joppi's eyes widened.
"What tricks are you up to? Where did you get that, you rascal? Surely
you didn't bring it with you last night?"
"I didn't, but I always have little spells of that kind," laughed
Oskari.
And Joppi joined in the laughter; they laughed together until the
still forest echoed, and took a stiff morning dram on the cart while
the horse strained up the hill and the wheels jolted into the holes in
the track.
As they began to approach the cottage, Joppi's gaze stayed almost
unbroken on the forest to the right.
"There, too, the wind has smashed another good log," he said in a
drawl, with peculiar emphasis, pointing to the side of the road.
"So it seems," answered Oskari indifferently. "Damage like that does
happen. Whose forest is it?"
"It is mine, all right," answered Joppi with a kind of relief.
"Yours? So your forest comes this far too," said Oskari, trying to
look thoroughly surprised.
"All the plots run as far as the brook," Joppi explained. "The men
have been out here for many days gathering up the windfalls, but there
is no time to collect them all. Many a better tree than that one is
rotting out there."
"What a fool you are, letting good trees go for nothing, and more and
more of them will rot as the forest ages. Sell the whole forest and
take the money; money neither ages nor rots..."
The matter had gradually, of its own accord, drawn so taut that it was
like a cocked gun: touch the trigger and it went off.
Joppi turned cheerfully toward Oskari and said in a decided voice:
