Frontiers in Data Science
Chapman & Hall/CRC
Big Data Series
PUBLISHED TITLES
SERIES EDITOR
Sanjay Ranka
AIMS AND SCOPE
This series aims to present new research and applications in Big Data, along with the
computational tools and techniques currently in development. The inclusion of concrete
examples and applications is highly encouraged. The scope of the series includes, but is
not limited to, titles in the areas of social networks, sensor networks, data-centric
computing, astronomy, genomics, medical data analytics, large-scale e-commerce, and
other relevant topics that may be proposed by potential contributors.
FRONTIERS IN DATA SCIENCE
Matthias Dehmer and Frank Emmert-Streib
BIG DATA OF COMPLEX NETWORKS
Matthias Dehmer, Frank Emmert-Streib, Stefan Pickl, and Andreas Holzinger
BIG DATA COMPUTING: A GUIDE FOR BUSINESS AND TECHNOLOGY
MANAGERS
Vivek Kale
BIG DATA: ALGORITHMS, ANALYTICS, AND APPLICATIONS
Kuan-Ching Li, Hai Jiang, Laurence T. Yang, and Alfredo Cuzzocrea
BIG DATA MANAGEMENT AND PROCESSING
Kuan-Ching Li, Hai Jiang, and Albert Y. Zomaya
BIG DATA ANALYTICS: TOOLS AND TECHNOLOGY FOR EFFECTIVE
PLANNING
Arun K. Somani and Ganesh Chandra Deka
BIG DATA IN COMPLEX AND SOCIAL NETWORKS
My T. Thai, Weili Wu, and Hui Xiong
HIGH PERFORMANCE COMPUTING FOR BIG DATA
Chao Wang
NETWORKING FOR BIG DATA
Shui Yu, Xiaodong Lin, Jelena Mišić, and Xuemin (Sherman) Shen
Frontiers in Data Science
Edited by
Matthias Dehmer
Frank Emmert-Streib
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2018 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed on acid-free paper
International Standard Book Number-13: 978-1-4987-9932-4 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any copyright material has
not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center,
Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit
organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the CCC,
a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
About the Editors vii
Contributors ix
1 Legal aspects of information science, data science,
and Big Data 1
Alessandro Mantelero and Giuseppe Vaciago
2 Legal and policy aspects of information science in
emerging automated environments 47
Stefan A. Kaiser
3 Privacy as secondary rule, or the intrinsic limits of legal
orders in the age of Big Data 69
Bart van der Sloot
4 Data ownership: Taking stock and mapping the issues 111
Florent Thouvenin, Rolf H. Weber, and Alfred Früh
5 Philosophical and methodological foundations of text
data analytics 147
Beth-Anne Schuelke-Leech and Betsy Barry
6 Mobile commerce and the consumer information
paradox: A review of practice, theory, and a research
agenda 171
Matthew S. Eastin and Nancy H. Brinson
7 The impact of Big Data on making evidence-based
decisions 191
Rodica Neamtu, Caitlin Kuhlman, Ramoza Ahsan, and
Elke Rundensteiner
8 Automated business analytics for artificial intelligence in
Big Data@X 4.0 era 223
Yi-Ting Chen and Edward W. Sun
9 The evolution of recommender systems: From the
beginning to the Big Data era 253
Beatrice Paoli, Monika Laner, Beat Tödtli, and Jouri Semenov
10 Preprocessing in Big Data: New challenges for
discretization and feature selection 285
Verónica Bolón-Canedo, Noelia Sánchez-Maroño, and
Amparo Alonso-Betanzos
11 Causation, probability, and all that: Data science as a
novel inductive paradigm 329
Wolfgang Pietsch
12 Big Data in healthcare in China: Applications, obstacles,
and suggestions 355
Zhong Wang and Xiaohua Wang
Index 371
About the Editors
Matthias Dehmer studied mathematics at the University of Siegen, Siegen,
Germany and earned his PhD in computer science from the Technical
University of Darmstadt, Darmstadt, Germany. Afterward, he was a research
fellow at Vienna Bio Center, Austria, Vienna University of Technology, and
University of Coimbra, Portugal. He obtained his habilitation in applied
discrete mathematics from the Vienna University of Technology. Currently, he
is a professor at UMIT—The Health and Life Sciences University, Austria. His
research interests are in data science, Big Data, complex networks, machine
learning, and information theory. He has published more than 220 publications in
applied mathematics, computer science, data science, and related disciplines.
Frank Emmert-Streib studied physics at the University of Siegen, Germany,
and earned his PhD in theoretical physics from the University of Bremen,
Bremen, Germany. He was a postdoctoral fellow in the United States before
becoming a faculty member at the Center for Cancer Research at the Queen’s
University Belfast, UK. Currently, he is a professor in the Department of
Signal Processing at Tampere University of Technology, Finland. His research
interests are in the field of computational biology, data science, and analytics,
in the development and application of methods from statistics and machine
learning for the analysis of Big Data from genomics, finance, and business.
Contributors
Ramoza Ahsan
Worcester Polytechnic Institute
Worcester, Massachusetts
Betsy Barry
Emory University
Atlanta, Georgia
Amparo Alonso-Betanzos
Universidade da Coruña
A Coruña, Spain
Nancy H. Brinson
University of Texas at Austin
Austin, Texas
Verónica Bolón-Canedo
Universidade da Coruña
A Coruña, Spain
Yi-Ting Chen
National Chiao Tung University
Hsinchu, Taiwan
Matthew S. Eastin
University of Texas at Austin
Austin, Texas
Alfred Früh
Universität Zürich
Zürich, Switzerland
Stefan A. Kaiser
Independent Researcher
Wassenberg, Germany
Caitlin Kuhlman
Worcester Polytechnic Institute
Worcester, Massachusetts
Monika Laner
Fernfachhochschule Schweiz
Brig, Switzerland
Alessandro Mantelero
Polytechnic University of Turin
Turin, Italy
Noelia Sánchez-Maroño
Universidade da Coruña
A Coruña, Spain
Rodica Neamtu
Worcester Polytechnic Institute
Worcester, Massachusetts
Beatrice Paoli
Fernfachhochschule Schweiz
Brig, Switzerland
Wolfgang Pietsch
Technische Universität München
Munich, Germany
Elke Rundensteiner
Worcester Polytechnic Institute
Worcester, Massachusetts
Beth-Anne Schuelke-Leech
University of Windsor
Windsor, Ontario, Canada
Jouri Semenov
Fernfachhochschule Schweiz
Brig, Switzerland
Edward W. Sun
KEDGE Business School
Talence, France
Florent Thouvenin
Universität Zürich
Zürich, Switzerland
Beat Tödtli
Fernfachhochschule Schweiz
Brig, Switzerland
Giuseppe Vaciago
University of Insubria
Varese, Italy
Bart van der Sloot
Tilburg University
Tilburg, the Netherlands
Zhong Wang
Beijing Academy of Social Sciences
Beijing, China
Xiaohua Wang
Chinese Academy of Sciences
Beijing, China
Rolf H. Weber
Universität Zürich
Zürich, Switzerland
Chapter 1
Legal aspects of information science,
data science, and Big Data∗
Alessandro Mantelero
Giuseppe Vaciago
Introduction: The legal challenges of the use of data ................... 2
Data collection and data processing: The fundamentals of data
protection regulations ................................................ 4
The European Union model: From the Data Protection Directive to
the General Data Protection Regulation ............................. 6
Use of data and risk-analysis .......................................... 10
Use of data for decision-making purposes: From individual to collective
dimension of data processing ............................................ 17
Data-centered approach and socio-ethical impacts ................... 21
Multiple-risk assessment and collective interests ..................... 23
The guidelines adopted by the Council of Europe on the protection
of individuals with regard to the processing of personal data in a
world of Big Data ..................................................... 25
Data prediction: Social control and social surveillance ................. 29
Use of data during the investigation: Reasonable doubt versus
reasonable suspicion .................................................. 30
Big Data and social surveillance: Public and private interplay in
social control .......................................................... 31
The EU reform on data protection ................................... 35
References ............................................................... 36
∗Alessandro Mantelero, Polytechnic University of Turin, is the author of sections
“Introduction: The legal challenges of the use of data” and “Use of data for decision-making
purposes: From individual to collective dimension of data processing.” Giuseppe Vaciago,
University of Insubria, is the author of section “Data prediction: Social control and social
surveillance.”
Introduction: The legal challenges of the use of data
There are many definitions of Big Data, which differ depending on the
specific discipline. Most of the definitions focus on the growing technological
ability to collect, process, and extract new and predictive knowledge from a
bulk of data characterized by a great volume, velocity, and variety.∗
However, in terms of protection of individual rights, the main issues do
not only concern the volume, velocity, and variety of processed data, but also
the analysis of data, using software to extract new and predictive knowledge
for decision-making purposes. Therefore, in this contribution, the definition
of Big Data encompasses both Big Data and Big Data analytics.†
The advent of Big Data has suggested a new paradigm in empirical social
studies, in which the traditional approach adopted in statistical studies is
complemented or replaced by Big Data analysis. This new paradigm is
characterized by the relevant role played by data visualization, which makes
it possible to analyze real-time data streams, follow their trajectory, and
predict future trends [3]. Moreover, large amounts of data make it possible
to use unsupervised machine-learning algorithms to discover hidden
correlations between variables that characterize large datasets.
This kind of approach, which is based on the emerging correlations among
data, leads social investigation to adopt a new strategy, in which there are
no preexisting research hypotheses to be verified through empirical statistical
studies. Big Data analytics suggest possible correlations, which constitute
per se the research hypothesis: data show the potential relations between facts
or behavior. Nevertheless, these relations are not grounded on causation and,
for this reason, should be further investigated using the traditional statistical
method.
Assuming that data trends suggest correlations and consequent research
hypotheses, at the moment of data collection only very general research
hypotheses are possible, as the potential data patterns are still unknown.
Therefore, the specific purpose of data processing can be identified only at
a later time, when correlations reveal the usefulness of some information to
detect specific aspects. Only at that time, the given purpose of the use of
information becomes evident, also with regard to further analyses conducted
with traditional statistical methods [4].
∗The term “Big Data” usually identifies extremely large datasets that may be analyzed
computationally to extract inferences about data patterns, trends, and correlations.
According to the International Telecommunication Union, Big Data are “a paradigm for
enabling the collection, storage, management, analysis, and visualization, potentially
under real-time constraints, of extensive datasets with heterogeneous characteristics” [1].
†This term is used to identify computational technologies that analyze large amounts of
data to uncover hidden patterns, trends, and correlations. According to the European Union
Agency for Network and Information Security, the term Big Data analytics “refers to the
whole data management lifecycle of collecting, organizing, and analysing data to discover
patterns, to infer situations or states, to predict and to understand behaviors” [2].
On the other hand, there are algorithms, such as supervised machine-
learning algorithms, that need a preliminary training phase. In this stage, a
supervisor uses training data sets to correct the errors of the machine, orienting
the algorithm toward correct associations. In this sense, supervised machine-
learning algorithms require a prior definition of the purpose of the use of
data, identifying the goal that the machine should reach through autonomous
processing of all available data.
In this case, although the purpose of data use is defined in the training
phase, the manner in which data are processed and the final outcome of data
mining remain largely unknown. In fact, these algorithms are black boxes and
their internal dynamics are partially unpredictable.∗
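A minimal sketch of the supervised setting just described: the purpose of the processing (here, flagging "high-risk" records) is fixed when the labeled training set is assembled, and the fitted model then applies that purpose to any data fed to it later. The dataset, labels, and the deliberately simple nearest-centroid learner are invented stand-ins for real machine-learning systems.

```python
def train_nearest_centroid(samples, labels):
    """'Training phase': average the feature vectors of each class,
    so the supervisor's labels orient the model toward a purpose."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: dist(centroids[y]))

# Supervisor-provided training data: [transaction amount, hour of day]
X_train = [[10, 9], [12, 11], [900, 3], [950, 2]]
y_train = ["normal", "normal", "high-risk", "high-risk"]

model = train_nearest_centroid(X_train, y_train)
print(predict(model, [880, 4]))  # classified according to the trained purpose
```

Even in this transparent toy, the data subject sees only the output label, not the internal geometry that produced it; in real black-box models that opacity is far more severe.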
Both data visualization and machine-learning applications pose relevant
questions in terms of Big Data processing, which will be addressed in the
following sections. How is it possible to define the specific purpose of data
processing at the moment of data collection, when the correlations suggested
by analytics are unknown at that time? If different sources of data are used
in machine training and running learning algorithms, how can data subjects
know the specific purpose of the use of their information in given machine-
learning applications?
These questions clearly show the tension that characterizes the application
of the traditional data protection principles in the Big Data context. But this
is not the only crucial aspect: the very notion of personal data is becoming
increasingly blurred. Running Big Data analytics over large datasets could make
it difficult to distinguish between personal data and anonymous data, as well
as between sensitive data and nonsensitive data.
Various studies have demonstrated how information stored in anonymized
datasets can be partially reidentified, in some cases without expensive
technical solutions [5–12]. This suggests going beyond the traditional
dichotomy between personal and anonymous data and representing this
distinction as a scale that moves from personally identified information to
aggregated data. Between these extremes, the level of anonymization is
proportional to the effort, in terms of time, resources, and costs, that is
required to reidentify the information.
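The linkage attacks documented in those studies can be sketched as a join on quasi-identifiers: attributes that are not names but, in combination, single a person out. All records, names, and attributes below are invented for illustration.

```python
# An "anonymized" dataset (names stripped) and a public register that
# share quasi-identifiers: ZIP code, birth year, and sex.
anonymized = [
    {"zip": "10115", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10117", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
]

public_register = [
    {"name": "A. Example", "zip": "10115", "birth_year": 1984, "sex": "F"},
    {"name": "B. Sample",  "zip": "10117", "birth_year": 1990, "sex": "M"},
    {"name": "C. Case",    "zip": "10117", "birth_year": 1971, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

def reidentify(anon_rows, register):
    """Link each anonymized row to every register entry sharing its
    quasi-identifiers; a unique match reveals the person's identity."""
    linked = []
    for row in anon_rows:
        key = tuple(row[k] for k in QUASI_IDENTIFIERS)
        matches = [p["name"] for p in register
                   if tuple(p[k] for k in QUASI_IDENTIFIERS) == key]
        if len(matches) == 1:  # unique match => reidentified
            linked.append((matches[0], row["diagnosis"]))
    return linked

print(reidentify(anonymized, public_register))
```

No expensive tooling is involved: the cost of reidentification here is a simple join, which is why the text treats anonymization as a sliding scale of effort rather than a binary property.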
Finally, with regard to sensitive data, Big Data analytics make it possible
to use nonsensitive data to infer sensitive information, such as information
concerning religious practices extracted from location data and mobility
patterns [13].
Against this background, the existing data protection regulations and the
ongoing proposals [14,15] remain largely focused on the traditional main
pillars of the so-called fourth generation of data protection laws [16]: the
notice and consent model (i.e., an informed, freely given, and specific
consent) [17–21],∗ the purpose limitation principle [24,25], and the
minimization principle.
∗See, e.g., Zhang M., “Google Photos Tags Two African-Americans As Gorillas
Through Facial Recognition Software,” Forbes, July 1, 2015. http://www.forbes.com/sites/
mzhang/2015/07/01/google-photos-tags-two-african-americans-as-gorillas-through-facial-
recognition-software/#36b529227b63 (accessed March 23, 2016).
For this reason, the following sections investigate the limits and criticisms
of the existing legal framework and the possible options to provide adequate
answers to the new challenges of Big Data processing. In this light, this chapter
is divided into three main sections.
The first section focuses on the traditional paradigm of data protection
and on the provisions, primarily in the new EU General Data Protection
Regulation (Regulation (EU) 2016/679, hereafter GDPR), that can be used
to safeguard individual rights in Big Data processing.
The second section goes beyond the existing legal framework and, in the
light of the path opened by the guidelines on Big Data adopted by the Coun-
cil of Europe, suggests a broader approach that encompasses the collective
dimension of data protection. This dimension often characterizes Big Data
applications and leads to assessing the ethical and social impacts of data uses,
which assume an important role in many Big Data contexts.
The last section deals with the use of Big Data to anticipate fraud detection
and to prevent crime. In this light, the new Directive (EU) 2016/680† is
briefly analyzed.
Data collection and data processing: The fundamentals of data
protection regulations
Before considering the different reasons that induce the law to protect
personal information, it should be noted that European legal systems do not
recognize the same broad notion of the right to privacy that exists in U.S.
jurisprudence.‡
At the same time, in the European countries, data protection
laws do not draw their origins from the European idea of privacy and its
related case law.
∗See Articles 6 and 7, Regulation (EU) 2016/679 of the European Parliament and of the
Council of April 27, 2016 on the protection of natural persons with regard to the processing
of personal data and on the free movement of such data, and repealing Directive 95/46/EC
(General Data Protection Regulation). Differently, in the United States, the traditional
approach based on various sectorial regulations has underestimated the role played by user’s
choice, adopting a market-oriented strategy. Nevertheless, the guidelines adopted by the
U.S. administrations in 2012 [14] seem to suggest a different approach, reinforcing self-
determination [8,22,23].
†Directive (EU) 2016/680 on the protection of natural persons with regard to the
processing of personal data by competent authorities for the purposes of the prevention,
investigation, detection or prosecution of criminal offences or the execution of criminal penal-
ties, and on the free movement of such data, and repealing Council Framework Decision
2008/977/JHA.
‡With regard to the notion of right to privacy (and in brief), in the United States the
right to privacy covers a broad area that goes from informational privacy to the right of
self-determination in private life decisions. On the other hand, in European countries, this
right mainly focuses on the first aspect and is related to media activities [26–31].
European data protection regulations, since their origins in the second
half of the last century, focused on information regarding individuals, without
distinguishing between public or private information [32]. Compared with the
right to privacy, the issues regarding the protection of personal data have been
more recently recognized by law, both in the United States and Europe [33].
This dates from the 1960s, whereas the primitive era of the right to privacy
was at the end of the nineteenth century, when the penny press assumed
a significant role in limiting the privacy of the people belonging to upper
classes [34].
In the light of the above, the analysis of the fundamentals of data
processing should start from the effects of the computer revolution that
happened in the late 1950s. The advent of computers and their social impact
led to the first regulations on data protection and posed the first pillars
of the architecture of the present legal framework.
The first generations of data protection regulations were characterized by a
national approach. They were adopted in different times by national legislators
and were different with regard to the extension of the safeguards provided and
the remedies offered.
The notion of data protection was originally based on the idea of control
over information, as confirmed by the literature of that period [35–37]. The
migration from dusty paper archives to computer memories was a Copernican
revolution which, for the first time in history, permitted the aggregation
of information about every citizen that was previously spread over different
archives [38].
The first data protection regulations were the answer to the rising concern
of citizens about social control, as the new big mainframe computers gave
governments [16,38–41] and large corporations the opportunity to collect and
manage large amounts of personal information [16,42]. In this sense, the legal
systems gave individuals the opportunity to have a sort of countercontrol over
the collected data [16,38,43].
The purpose of the regulations was not to spread and democratize power
over information but to increase the level of transparency about data
processing and safeguard the right of access to information. Citizens felt
they were monitored, and the law gave them the opportunity to know who
controlled their data, which kind of information was collected, and for which
purposes.
The mandatory notification of new databases, registration and licensing
procedures, and independent authorities [16,44] were the fundamental elements
of these new regulations. They were necessary to know who had control over
information and to monitor data processing. Another key component was the
right of access, which allows citizens to ask data owners about the way in
which information is used and, consequently, about the exercise of their power
over information. Finally, the entire picture was completed by the creation of
ad hoc public authorities to safeguard and enforce citizens' rights, exercise
control over data owners, and react against abuses.
In this model, there was no space for individual consent, due to the economic
context of that period. The collection of information was mainly carried out
by public entities for purposes related to public interests, was mandatory,
and left no room for negotiation about personal information. At the same
time, personal information did not have an economic value for private
companies: data about clients and suppliers were mainly used for operational
functions regarding the execution of company activities.
Another element that contributed to excluding the role of self-determination
was the lack of knowledge: the extreme difficulty for ordinary people to
understand the use and mode of operation of mainframes. The computer
mainframes were a sort of modern God, with sacral attendants, a selected
number of technicians who were able to use this new equipment. In this
scenario, it did not make sense to give citizens the chance to choose, as
they were unable to understand the way in which their data were processed.
In conclusion, during the 1970s and the first part of the 1980s of the
last century, legislators laid the foundations for data protection regulations in
many European countries and outside Europe, as a result of the technological
and social changes of that period. These first regulations defined the initial
core of data protection (i.e., transparency, rights to access, and data protection
authorities), which is still present in the existing legal framework.
The European Union model: From the Data Protection
Directive to the General Data Protection Regulation
The period from the mid-1980s to the 1990s was characterized not only by
the rise of a uniform approach to data protection regulation among the
members of the European Union, but also by a change in the regulatory
paradigm, due to the new technological, social, and economic scenarios.
Home computers entered the market in the late 1970s to become common
during the 1980s. This was the new era of distributed computers, in which a
lot of people bought a personal computer to collect and process information.
The big mainframe computers became small desktop personal computers, with a
relatively low cost. Consequently, computational capacity was no longer an
exclusive privilege of governments and big companies but became accessible
to many entities and consumers.
This period witnessed another transformation involving direct marketing,
which was no longer based on the concept of mail order and moved toward
computerized direct-marketing solutions.∗ The new forms of marketing were
based on customer profiling and required extensive data collection to apply
data-mining software. The main purpose of profiling was to suggest a suitable
commercial proposal to any consumer.
∗Although direct marketing has its roots in mail order services, which were based on
personalized letters (e.g., using the name and surname of addressees) and general group
profiling (e.g., using census information to group addressees in social and economic
classes), the use of computer equipment increased the level of manipulation of consumer
information and generated detailed consumer profiles [45,46].
This was an innovative application of data processing driven by new purposes.
Information was no longer collected to support supply chains, logistics,
and orders, but to sell the best product to each user. As a result, the data
subject became the focus of the process, and personal information acquired
an economic and business value, given its role in sales.
These changes in the technological and business frameworks created new
requests from society to legislators, as citizens wanted to have the chance to
negotiate their personal data and gain something in return.
Although the new generations of the European data protection laws placed
personal information within the context of fundamental rights,∗
the main goal
of these regulations was to pursue economic interests related to the free flow
of personal data. This is also affirmed by the Directive 95/46/EC,†
which
represents both the general framework and the synthesis of this second wave
of data protection laws.‡
However, the roots of data protection remained in the context of personality
rights. Therefore, the European approach is less market-oriented than that
of other legal systems. The directive also recognizes the fundamental role
of public authorities in protecting data subjects against unwilled or unfair
exploitation of their personal information for marketing purposes.
Both the theoretical model of fundamental rights, based on
self-determination, and the rising data-driven economy highlighted the
importance of user consent in consumer data processing. Consent does not
only represent an expression of choice with regard to the use of personality
rights by third parties but is also an instrument to negotiate the economic
value of personal information.
In this new data-driven economy, personal data cannot be exploited for
business purposes without any involvement of data subjects. It is necessary
that individuals become part of the negotiation, as data are no longer used
mainly by government agencies for public purposes but also by private
companies with monetary revenues [49,50].
∗See Council of Europe, Convention for the Protection of Individuals with regard
to Automatic Processing of Personal Data, opened for signature on January 28, 1981
and entered into force on October 1, 1985. http://conventions.coe.int/Treaty/Commun/
QueVoulezVous.asp?NT=108&CL=ENG (accessed February 27, 2014); OECD, Annex
to the Recommendation of the Council of 23rd September 1980: Guidelines on the
Protection of Privacy and Transborder Flows of Personal Data. http://www.oecd.org/
internet/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsof
personaldata.htm#preface (accessed February 27, 2014).
†Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995
on the protection of individuals with regard to the processing of personal data and on the
free movement of such data [1995] OJ L281/31.
‡The EU Directive 95/46/EC has a dual nature, as it was written on the basis of the
existing national data protection laws, in order to harmonize them, but at the same time it
also provided a new set of rules. See the recitals in the preamble to the Directive 95/46/EC
[47,48].
8 Frontiers in Data Science
Effective self-determination in data processing, both in terms of protec-
tion and economic exploitation of personality rights, cannot be obtained
without adequate and prior notice.∗
For this reason, the notice and consent
model†
added a new layer to the existing paradigm based on transparency and
access [17].
Finally, it is important to highlight that, during the 1980s and 1990s, data
analysis increased in quality, but its level of complexity was still limited. Con-
sequently, consumers were able to understand the general correlation between
data collection and related purposes of data processing (e.g., profiling users,
offering customized services or goods). At that time, informed consent and
self-determination were largely considered synonyms, but this has changed
in the Big Data era.
The advent of Big Data analytics has created a different economic and
technological scenario, with direct consequences on the adequacy of the legal
framework adopted to safeguard personal information. The new environment
is mainly digital and characterized by an increasing concentration of informa-
tion in the hands of a few entities, both public and private.
The role played by specific subjects in the generation of data flows is the
main reason for this concentration. Governments and big private companies
(e.g., large retailers, telecommunication companies) collect huge amounts of
data while performing their daily activities. This bulk of information repre-
sents a strategic and economically relevant asset, as the management of large
databases enables these entities to assume the role of gatekeepers with regard
to the information that can be extracted from the datasets. They can keep
information completely closed or limit access to the data, granting it only
to specific subjects or only for circumscribed parts of the entire collection.
Not only do governments and big private companies acquire this power, but
so do the intermediaries in information flows (e.g., search engines, Internet
providers, data brokers, and marketing companies), which do not generate
information but play a key role in circulating it.
There are also cases in which information is accessible to the public, in both
raw and processed form (e.g., open datasets, online user-generated content).
This only seemingly diminishes the concentration of power over information,
as access to information is not equivalent to knowledge [51].
A large amount of data creates knowledge only if the data holders have
adequate interpretation tools to select relevant information, reorganize it, and
place it in a systematic context, and if there are people with the skills
required to design the research and interpret the results generated by Big
Data analytics [3,15,52,53]. Without these skills, data produce only confusion
and, in the end, less knowledge, with information interpreted in an incomplete
or biased way. For these reasons, the availability
∗The notice describes how the data are processed and the detailed purposes of data
processing.
†See Articles 2(h), 7(a) and 10, Directive 95/46/EC.
Legal aspects of information science, data science, and Big Data 9
of data is not sufficient in the Big Data context [54,55]. It is also necessary to
have adequate human and computing resources to manage it.
In this scenario, control over information concerns not only limited-access
data but also open data [56,57], over which the information intermediaries
create added value by means of their instruments of analysis. Given that only
a few entities are able to invest heavily in equipment and research, the
dynamics described earlier enhance the concentration of power over
information, which increases with the new expansion of Big Data.
In many respects, this new environment resembles the origins of data
processing, when, in the mainframe era, technologies were held by a few
entities and data processing was too complex to be understood by data
subjects. Nevertheless, there are important differences that may affect the
possible evolution of this situation toward diffused and democratic access to
information.
The new data gatherers do not base their position only on expensive hardware
and software, which may become cheaper in the future, or on the growing
number of experts able to interpret the results of data analytics. The
fundamental element of this power is the large databases they hold. These
data silos, considered the goldmine of the twenty-first century, are not freely
accessible, as they represent the main outcome or a side effect of the
activities conducted by their owners, due to the role that these owners play
in creating, collecting, or managing information.
For this reason, in the Big Data context, it seems quite difficult to imagine
the same process of democratization that happened with regard to computer
equipment during the 1980s [58]. Access to large databases is not only
protected by legal rights but is also strictly related to the peculiar positions
held by data holders in their markets and to the presence of entry barriers.
Another aspect that characterizes this new form of concentration of control
over information is the nature of the purposes of data collection: data
processing is no longer focused on single users (profiling) but has increased
in scale, investigating the attitudes and behaviors of large groups and
communities, up to entire countries. The consequence of this large-scale
approach is the return of fears about social surveillance that characterized
the mainframe era.
Against this background, the GDPR does not change the main pillars
of the previous regulatory model. Therefore, personal data are still primar-
ily protected by individual rights; the notice and consent model remains an
important legal ground for data processing, and the principles of purpose lim-
itation and data minimization are reaffirmed.
Despite this traditional approach, which seems to be partially inade-
quate in the Big Data context, the GDPR shows a partial shift of the
regulatory focus from data subject’s self-determination to accountability of
the controller and persons involved in data processing. In this sense, ac-
countability represents the core of the new EU data protection framework
and an important element to tackle the potential negative impacts of the
use of data analytics [59].
More specifically, accountability is based on the data protection impact
assessment, the role played by data protection officers and, when required by
law, the prior assessment process conducted by data protection authorities. In
this sense, compared with the previous Data Protection Directive, the GDPR
undoubtedly moves toward a risk-based approach.
Nevertheless, this transition is still incomplete. Elements of the previous
model, focused on data subjects, coexist with the new approach, but, without
a complete redraft of the architecture defined in the 1990s, it seems difficult
to address the social and technological challenges of Big Data.
Use of data and risk-analysis
Regarding risk management in data processing, it is worth pointing out
that risk can be considered, in a broad sense, as any negative consequence
that can occur when personal data are processed, regardless of whether these
consequences produce damage or prejudice to individual rights and freedoms.
In this sense, data subjects that use social networks expose themselves to
the risk of being profiled [60], of having their information shared with third
parties, of being tracked for commercial purposes, and so on. None of these
consequences are against the law, as they are detailed in the terms and
conditions and privacy policies of service providers and accepted by users
on the basis of the notice and consent model.
In these cases, there seems to be no relevant risk to the safeguarding of data
subjects' rights, as individuals can assess the consequences of data
processing and have freely expressed their consent. Nevertheless, legal and
sociological studies have clearly demonstrated that users are usually unaware
of the consequences of providing their consent, as they do not read long and
technical notices or are not able to completely understand these descriptions
and imagine their practical consequences [61–65]. Moreover, in many cases,
power imbalance and social lock-in drastically reduce any effective freedom
of choice.
As a consequence of these constraints, users frequently accept some forms
of data processing without any prior risk/benefit analysis and are unaware of
the consequences. This shows the limits of the traditional notice and choice
paradigm [66,67], which are more evident in the context of Big Data analytics,
in which it is difficult to describe the “specific” purposes of data processing
[Article 6(1)(a) GDPR] at the moment of data collection, due to the transfor-
mative use of data made by data controllers [68].∗
∗In this light, it is also difficult to comply with the provisions of Article 4 of the GDPR,
which qualifies data subject’s consent as “freely given, specific and informed.” According to
the Article 29 Data Protection Working Party, “to be specific, consent must be intelligible: it
should refer clearly and precisely to the scope and the consequences of data processing” [17].
In this sense, with respect to this broad notion of risk concerning data
processing, the GDPR maintains the important roles played by data subjects'
self-determination and transparency, recognized by law over the last decades.
The European legislator seems to be unaware of the weaknesses of this
approach, in which the formal transparency of terms and conditions,
combined with users' behavior [61], provides data controllers, via the notice
and consent model, with an easy way to lawfully exploit personal data in an
extensive manner.
On the other hand, a narrower notion of risk can be adopted, which focuses
on “material or nonmaterial damages” that prejudice the “rights and freedom
of natural persons.” This notion has been adopted in the GDPR to define the
risk-based approach (Recital 75 GDPR). According to the regulation, when a
risk of prejudice exists and cannot be mitigated or excluded, data processing
becomes unlawful, despite the presence of any legitimate grounds, such as the
data subject’s consent.
Recital n. 75 of the GDPR provides a long list of cases in which data
processing is considered unlawful. Moreover, the recital does not limit these
cases to the security of data processing but also takes into account the risk
of discrimination and “any other significant economic or social disadvantage.”
This notion of risk impact, echoed in Article 35 of the GDPR, represents
an important step toward an impact assessment of data processing [69] that
is no longer primarily focused on data security (see Article 32 GDPR) and
evolves toward a more robust and broader Privacy, Ethical, and Social Impact
Assessment (PESIA).∗
Moreover, the attention to the economic and social
implications of data uses is particularly relevant in the Big Data context, in
which analytics are used in decision-making processes and may have negative
impacts that affect individuals in terms of discrimination rather than data
security.†
In line with the risk-based approach, the new provisions of the GDPR
reinforce the accountability of data controllers that, according to Article 24, are
liable when they do not “implement appropriate technical and organizational
measures” to tackle the risks mentioned in the regulation (see also
Article 83(4) GDPR). These measures should be implemented from the
earliest stage of data processing design, embedding them in the processing,
according to the data protection by design approach (Article 25 GDPR).
In the light of the above, alongside transparency, the right of access, and data
protection authorities, which are the founding pillars of data protection
regulation, together with the further element of the data subject's consent,
the new regulation
∗See sections “Multiple-risk assessment and collective interests” and “The guidelines
adopted by the Council of Europe on the protection of individuals with regard to the
processing of personal data in a world of Big Data.” Regarding the PESIA model, see also
the H2020 project “VIRT-EU: Values and ethics in Innovation for Responsible Technology
in Europe.” http://www.virteuproject.eu/ (accessed December 21, 2016).
†See section “Data-centered approach and socio-ethical impacts.”
sheds light on the accountability of data controllers. Although accountability
principles were already present in the first data protection regulations, in
which the duties of transparency and the role played by data protection
authorities increased data controllers' accountability, the Directive 95/46/EC
provided no general process of risk assessment with specific consequences in
terms of accountability.
Before the new regulation, there were only national provisions or best
practices regarding the privacy impact assessment [69], but no uniform risk-
based approach. This goal has now been reached in the GDPR by means of a
set of rules that concern the role played by risk analysis, the data protection
impact assessment, the prior consultation of data protection authorities, and
the data protection officer (Articles 35, 36, and 37 GDPR).
In more detail, the risk-based model defined by the GDPR is articulated in
three different levels of assessment. The first is required by Article 24 GDPR,
and implicitly by Article 35(1). This is a general assessment of “the risk of
varying likelihood and severity for rights and freedoms of natural persons,”
which defines the level of the potential negative impact of data processing.
When this first assessment shows that the processing “is likely to result in
a high risk to the rights and freedoms of natural persons” (Article 35 GDPR),
the controller should carry out a formal data protection impact assessment.
Moreover, there is a list of cases in which high risk is presumed (Article 35(3)
GDPR). This is an open list, due to the fact that data protection authori-
ties may add further cases (Article 35(4) GDPR), according to the margin
of maneuver recognized in several provisions by the regulation to national
authorities or legislators.
Nevertheless, the idea of a list of high-risk cases, as well as of cases excluded
from the impact assessment (Article 35(5) GDPR), raises doubts about the
feasibility of this categorization. In this sense, an ex ante general definition
of the presumed level of risk seems to conflict with the idea of risk-assessment,
which is necessarily context based.
Moreover, the cases of high risk are described using indefinite notions,
such as “large scale” data processing (Article 35(3)(b) and (c) GDPR). In this
regard, Recital n. 91 may be of help to clarify the meaning of this provision, as
it states that the impact assessment “should in particular apply to large-scale
processing operations which aim to process a considerable amount of personal
data at regional, national or supranational level and which could affect a large
number of data subjects.” Nevertheless, the recital does not explain when
an amount of data is deemed “considerable” and why, in the digital global
context, the amount of data should refer to territorial dimensions (regional,
national, or supranational).
Finally, in the absence of any scale, the general notion of high risk remains
quite indefinite. Recital n. 77 identifies a series of bodies and instruments that
can provide guidance as regards the “identification of the risk related to the
processing, their assessment in terms of origin, nature, likelihood and severity,”
but, at the moment, the framework remains uncertain.
These criticisms seem to have a limited impact on the field of Big Data
analytics, as the majority of applications fall within the cases listed in
Article 35(3) GDPR, in which high risk is presumed. Nevertheless, it is worth
pointing out that analytics can be used in contexts in which the evaluation of
personal aspects is not necessarily “systematic and extensive,” as they may
focus only on a specific subset of attributes or on a given cluster of persons.
Pursuant to Article 35(3), the use of Big Data analytics usually requires
a prior data protection impact assessment. This procedure is defined by
Article 35(7), in line with the traditional model of risk-assessment, which is
primarily a prior evaluation of the potential negative outcomes of a process,
product, or activity, and a consequent identification of the measures that
should be adopted to avoid or, at least, mitigate the identified risks.∗
This procedure can be divided into three different stages: analysis of
the process (Article 35(7)(a) GDPR), risk-assessment (Article 35(7)(b) and
(c) GDPR), and definition of the measures envisaged to address the risks
(Article 35(7)(d) GDPR). It is worth pointing out that the stage concerning
the risk-assessment includes two different kinds of evaluation: assessment of
the “necessity and proportionality” of data processing, and assessment of the
“risks to the rights and freedoms of data subjects.” These two evaluations are
correlated and consequential, as disproportionate or unnecessary data
processing cannot be put in place and, in that case, no further question arises
about the impact on individual rights and freedoms. On the other hand, when the
principles of necessity and proportionality are respected, further investigation
is needed to assess the specific balance of interests that the use of data implies.
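The staged logic described above, in which a failed necessity/proportionality check pre-empts any further balancing of risks, can be sketched as a simple decision procedure. This is an illustrative model only, not a compliance tool; the class, function, and field names are invented for this sketch.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ProcessingOperation:
    """Hypothetical model of a planned data-processing operation."""
    description: str
    necessary: bool       # necessity for the stated purpose (Art. 35(7)(b))
    proportionate: bool   # proportionality to that purpose (Art. 35(7)(b))
    risks: List[str]      # residual risks to rights and freedoms (Art. 35(7)(c))

def assess(op: ProcessingOperation) -> str:
    """Sketch of the two correlated evaluations in the impact assessment.

    A disproportionate or unnecessary operation cannot be put in place,
    so no further question about its impact on rights arises; only when
    both principles are respected does the assessment move on to the
    risks to data subjects' rights and freedoms.
    """
    # First evaluation: necessity and proportionality.
    if not (op.necessary and op.proportionate):
        return "rejected: processing cannot be put in place"
    # Second evaluation: risks to rights and freedoms, and the measures
    # envisaged to address them (Art. 35(7)(d)).
    if op.risks:
        return "measures required for: " + ", ".join(op.risks)
    return "proceed"

# A disproportionate operation is rejected before any risk balancing occurs.
print(assess(ProcessingOperation("extensive profiling", True, False,
                                 ["discrimination"])))
```

The short-circuit order of the two checks mirrors the point made in the text: the proportionality question is logically prior to, not weighed against, the question of impact on individual rights.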
According to the principles and values framed in the Charter of Fundamental
Rights of the European Union, this balance of interests is not a mere
risk/benefit analysis, but a comparison between interests that
are different and may have a different hierarchical order.†
In this sense, the
data protection impact assessment is not in line with the risk-based theories
[70] that suggest the adoption of a risk/benefit approach instead of a risk-
mitigation approach.‡
∗According to the traditional paradigm of risk-assessment, data controllers should be
able to demonstrate compliance with the Regulation on the basis of the assessment results
(Article 35(7)(d) GDPR) and should periodically review these results, due to the possibility
of a change in the nature and severity of the risks over time (Article 35(11) GDPR).
†See European Court of Justice, May 13, 2014, Case 131/12, Google Spain SL, Google
Inc. v Agencia Española de Protección de Datos (AEPD), Mario Costeja González.
http://curia.europa.eu/juris/document/document.jsf?text=&docid=152065&pageIndex=0
&doclang=EN&mode=lst&dir=&occ=first&part=1&cid=980962 (accessed June 16, 2016).
‡According to the risk/benefit approach, the assessment should be based on the com-
parison between the amount of benefits and the sum of all risks, without any distinction
regarding the nature of risks and benefits. In this sense, for instance, economic benefits may
prevail over individual rights. On the other hand, the risk mitigation approach assumes
that some interests (e.g., fundamental rights) prevail and cannot be weighed
against other interests of lower relevance. As a consequence, the risk mitigation approach
focuses on the potential prejudice for fundamental rights and suggests adequate measures
to reduce this risk or, where feasible, to exclude it.
When the data protection impact assessment “indicates that the processing
would result in a high risk in the absence of measures taken by the con-
troller to mitigate the risk,” data controllers must consult the supervisory
authority prior to the start of processing activities (Article 36(1) GDPR).
According to Recital n. 84 of the GDPR, the absence of measures to mitigate
the risk is evaluated taking into account the “available technology and costs
of implementation.”
It is worth pointing out that the reference to the costs and the available
technology, also present in the provisions concerning security risk (Recital
n. 83 and Article 32(1) GDPR) and data protection by design (Article 25(1)
GDPR), represents an important opportunity to put the principle of pro-
portionality into practice in the context of risk mitigation. Therefore, these
provisions reduce the risk of an excessive burden for data controllers due to
the implementation of the risk-assessment model.
This duty of prior consultation is also affirmed in Recital n. 94 of the GDPR.∗
According to Article 36(2) GDPR, when the supervisory authority is of
the opinion that the intended processing would infringe the regulation, the
authority “shall [. . . ] provide written advice to the controller and, where app-
licable to the processor, may use any of its powers referred to in Article 58.”
Given the powers conferred on supervisory authorities by Article 58, there
are two options: (1) the assessment is not satisfactory,
and the data controller has not adequately identified or mitigated the risk;
(2) the assessment has been conducted in a correct manner, but there are
no measures available to mitigate the risk. In the first case, the supervisory
authority orders the controller or processor “to bring processing operations
into compliance with the provisions of this Regulation, where appropriate, in
a specified manner” (Article 58(2)(d) GDPR), whereas, in the second case,
the authority imposes “a temporary or definitive limitation including a ban
on processing” (Article 58(2)(f) GDPR).
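The overall flow, from the general assessment of Article 24, through the formal impact assessment of Article 35, to prior consultation and the two possible outcomes under Article 58, can be summarized in a short sketch. The function name and boolean inputs are invented for illustration; this is a simplified reading of the provisions, not a legal decision procedure.

```python
def risk_pipeline(high_risk_likely: bool,
                  risk_mitigable: bool,
                  assessment_adequate: bool) -> str:
    """Illustrative flow across Articles 24, 35, 36, and 58 GDPR."""
    # Art. 24 / 35(1): general assessment of the likelihood and severity
    # of risks to the rights and freedoms of natural persons.
    if not high_risk_likely:
        return "general accountability measures (Art. 24)"
    # Art. 35: high risk likely -> formal data protection impact assessment;
    # if the identified measures mitigate the risk, processing may proceed.
    if risk_mitigable:
        return "proceed with the measures envisaged in the DPIA (Art. 35)"
    # Art. 36(1): residual high risk -> mandatory prior consultation.
    if not assessment_adequate:
        # Risk not adequately identified or mitigated by the controller.
        return "order to bring processing into compliance (Art. 58(2)(d))"
    # Assessment correct, but no measures available to mitigate the risk.
    return "temporary or definitive limitation or ban (Art. 58(2)(f))"

print(risk_pipeline(True, False, True))
```

The two terminal branches correspond to the two options just described: a compliance order where the controller's assessment fell short, and a limitation or ban where no mitigation is available.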
Finally, minor aspects of the risk-based approach concern the role played
by the data protection officer, whose main tasks are to advise the controller
or the processor on their obligations (including the data protection impact
assessment) and to monitor compliance with legal provisions concerning data
protection and with the privacy policies of the controller or processor
(Article 39(1) GDPR). In the performance of these tasks, the data
protection officer must “have due regard to the risk associated with processing
operations, taking into account the nature, scope, context, and purposes of
∗The model of prior consultation is built on the concept of prior checking, which was
already present in Article 20 of the Directive 95/46/EC.
processing” (Article 39(2) GDPR). Therefore, the risk-assessment represents
one of the main criteria that should drive the action of the data protection
officer.
The new provisions about risk-assessment represent an important evolu-
tion in the direction of a risk-based approach in data protection and, in this
sense, may offer an adequate solution to the potential negative outcomes of
the use of Big Data analytics. The main limit of these provisions lies in the
link to the purposes of data processing.∗
Although the assessment should necessarily be related to the use of data
for a specific purpose, a problem arises because, according to
Article 5(1)(b) GDPR, data processing purposes should be “specific, explicit,
and legitimate” and defined at the moment of data collection, which contrasts
with the transformative use of data made by private and public bodies by
means of Big Data analytics.
For these reasons, a better design of the impact assessment should not focus
on the initial purpose of data collection, but on each specific data use that
is put in place by the data controller after data collection. In this regard, it
should be noted that, at the moment, this result is achieved by data controllers
circumventing the provisions on purpose limitation. They collect personal data
on the basis of a broad series of different purposes and then, if they have already
adopted procedures of impact assessment, evaluate case-by-case the potential
impact on data protection, with regard to each different use of information
for a given purpose.
Against this background, a different perspective can be adopted, which
expressly accepts the idea that data are collected for multiple purposes, defined
only broadly at the beginning of data processing. This model focuses on the
different specific uses of collected information and the prior assessment of the
potential risks of each use.
This kind of approach, if adopted by the legislator, would be more efficient
and consistent with the transformative use of data made by companies in the
Big Data context, as well as with the level of self-determination of the data
subjects [66,71]. In this sense, a more extensive use of the legitimate interest
as legal grounds [24] may complete this model. Companies may enlist users
in data processing without any prior consent, provided they give notice of
the results of the assessment, which should be supervised by data protection
authorities (licensing model), and provide an opt-out option [66].
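The contrast between the purpose-limitation model and the use-based model sketched above can be made concrete in a few lines. In this hypothetical structure (all names are invented), data are collected under broadly defined purposes, and each specific later use must pass its own prior assessment before being put in place:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UseBasedRegister:
    """Hypothetical register for a use-based assessment model:
    broad collection purposes plus a prior assessment of each use."""
    broad_purposes: List[str]
    approved_uses: Dict[str, bool] = field(default_factory=dict)

    def assess_use(self, use: str, residual_risk: bool) -> bool:
        """Each specific use is assessed before deployment; a use with
        unmitigated residual risk is not approved."""
        approved = not residual_risk
        self.approved_uses[use] = approved
        return approved

    def may_process(self, use: str) -> bool:
        # Processing is lawful for a use only after a positive prior assessment.
        return self.approved_uses.get(use, False)

reg = UseBasedRegister(broad_purposes=["service improvement", "analytics"])
reg.assess_use("churn prediction", residual_risk=False)
print(reg.may_process("churn prediction"))  # True: assessed, no residual risk
print(reg.may_process("credit scoring"))    # False: no prior assessment yet
```

The point of the sketch is that the unit of assessment is the individual use, not the collection event: an unassessed use is simply not available, regardless of how broad the collection purposes were.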
It might be noted that the suggested approach undermines the chances for
users to negotiate their consent, but the strength of this objection is reduced
by the existing limits to self-determination described above. In the majority
∗See Article 35(1) GDPR (“Where a type of processing in particular using new technolo-
gies, and taking into account the nature, scope, context and purposes of the processing, is
likely to result in a high risk to the rights and freedoms of natural persons”) and 35(7)(b)
(“[The assessment shall contain at least] an assessment of the necessity and proportionality
of the processing operations in relation to the purposes”).
of the cases, the negotiation is reduced to a take-it-or-leave-it alternative.
A prior assessment conducted under the supervision of independent author-
ities, the use of legitimate interest as legal ground, and the adoption of an
opt-out model seem to offer more guarantees to users than an apparent, but
inconsistent, self-determination based on notice and consent and on the opt-in
model.
On the other hand, remaining within the existing legal framework defined by
Regulation 2016/679, a different option [71] may be to limit Big Data uses to
statistical purposes, which benefit from an explicitly permitted reuse of data
(Articles 5(1) and 89 GDPR). Nevertheless, in this case,
using analytics for decision-making purposes directly affecting a particular
individual would be outside the field of statistical purposes and also violate
the restrictions on automated individual decision making, including profiling.
In this sense, the GDPR “can be seen as a stepping stone, pointing toward
the need to evolve data protection beyond the old paradigm, yet not fully
committed to doing so” [71].
The model of data management defined by the new Regulation does not
completely address the new challenges of the use of Big Data analytics in data
processing [24,71]: the new provisions do not provide effective transparency
of data processing (obscure notices, impact assessments not publicly available),
but only a higher level of accountability.
Moreover, the risk-mitigation approach adopted by the Regulation seems
still to be far from the idea of a multiple and participative risk-assessment.
Although Recital n. 75 recognizes the risk of discrimination and “any other
significant economic or social disadvantage,” the provisions of the Regulation
do not offer an adequate framework for the assessment of this kind of negative
outcome.
With regard to the use of Big Data analytics in decision-making processes,
important questions arise about the ethical and social values that should be
taken into account, as well as the role that the different social stakeholders can
play in assessing the impact of data uses.∗
In conclusion, the European Union
seems hesitant to move away from the traditional model of
data protection, whereas other international bodies are trying to offer a more
courageous answer to the challenges of the data age.
In this sense, the new guidelines on Big Data of the Council of Europe
seem to be aware of the limits of the traditional principles governing data
protection and open to a broader risk-assessment, which takes into account
the social and ethical impacts of data uses and recognizes the benefits of a
participatory model based on the multistakeholder approach.†
∗See section “The guidelines adopted by the Council of Europe on the protection of
individuals with regard to the processing of personal data in a world of Big Data.”
†See section “Multiple-risk assessment and collective interests.”
Use of data for decision-making purposes: From individual
to collective dimension of data processing
The new scale of data processing of Big Data applications and the use of
analytics in decision-making processes pose new questions about data protec-
tion. As Big Data make it possible to collect and analyze large amounts of
information, data processing is no longer focused on individual users, and this
sheds light on the collective dimension of the use of data.
In the Big Data environment, general strategies are adopted on a large
scale and on the basis of representations of society generated by algorithms,
which predict future collective behavior [3,25,55,64]. These strategies are then
applied to specific individuals, given the fact that they are part of one or more
groups generated by analytics [3,56,72].
The use of analytics and the adoption of decisions based on group behavior
rather than on individuals are not limited to commercial and market contexts.
They also affect other important fields, such as security and social policies,
where a different balance of interests should be adopted, given the importance
of public interest issues.∗
One example of this is provided by predictive policing
solutions such as PredPol [73–77].
This categorical approach characterizing the use of analytics leads poli-
cymakers to adopt common solutions for individuals belonging to the same
cluster generated by analytics. These decisional processes do not consider in-
dividuals per se, but as a part of a group of people characterized by some
common qualitative factors.
In this sense, the use of personal information and Big Data analytics to
support decisions exceeds the boundaries of the individual dimension and
assumes a collective dimension [78], with potential harmful consequences for
some groups [79,80]. Prejudice can result not only from well-known
privacy-related risks (e.g., illegitimate use of personal information, data
security) but also from discriminatory and invasive forms of data processing
[15,81,82].
The dichotomy between individuals and groups is not new, and it has
already been analyzed with regard to the legal aspects of personal information.
Nonetheless, the right to privacy and the right to the protection of personal
data have been largely safeguarded as individual rights, despite the social
dimension of their rationale.
The focus on the model of individual rights is probably the main reason
for the few contributions by privacy scholars on the collective dimension of
privacy and data protection. Hitherto, only a few authors have investigated the
notion of group privacy. They have represented this form of privacy as the pri-
vacy of the facts and ideas expressed by the members of a group in the group
environment or in terms of protection of information about a group [37,83,84].
∗See also section “Data prediction: social control and social surveillance.”
On the other hand, collective data protection does not necessarily con-
cern facts or information referring to a specific person, as with individual
privacy and data protection. Nor does it concern clusters of individuals that
can be considered as groups in the sociological sense of the term. In addition,
collective rights are not necessarily a large-scale representation of individ-
ual rights and related issues [85]. Finally, collective data protection concerns
non-aggregative collective interests [86], which are not the mere sum of many
individual interests.∗
The importance of this collective dimension [78] depends on the fact that
the approach to classification by modern algorithms does not merely focus on
individuals, but on groups or clusters of people with common characteristics
(e.g., customer habits, lifestyle, online and offline behavior). Data gatherers are
mainly interested in studying groups’ behavior and predicting this behavior,
rather than in profiling single users. Data-driven decisions concern clusters
of individuals and only indirectly affect the members of these clusters. One
example of this is price discrimination based on age, habits, or wealth.
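The mechanics of this categorical treatment can be sketched in a few lines of hypothetical code. The segment names, thresholds, and surcharges below are illustrative assumptions, not taken from the cases discussed; the point is only that the decision attaches to the cluster, so every member receives the same treatment regardless of individual circumstances:

```python
def assign_cluster(profile):
    """Assign a customer to a hypothetical segment based on shared traits."""
    if profile["income"] < 25_000:
        return "financially-challenged"
    if profile["night_driving_hours"] > 10:
        return "late-night-driver"
    return "standard"

# The decision (e.g., an insurance surcharge) is defined per cluster,
# not per person: membership alone determines the outcome.
CLUSTER_SURCHARGE = {
    "financially-challenged": 0.15,
    "late-night-driver": 0.20,
    "standard": 0.0,
}

def quoted_premium(base_rate, profile):
    """Price an individual indirectly, via the group they fall into."""
    return base_rate * (1 + CLUSTER_SURCHARGE[assign_cluster(profile)])

# A night-shift worker and a late-night party-goer land in the same
# cluster and receive the same surcharge, despite differing situations.
nurse = {"income": 30_000, "night_driving_hours": 15}
partygoer = {"income": 80_000, "night_driving_hours": 15}
assert quoted_premium(100, nurse) == quoted_premium(100, partygoer) == 120.0
```

The sketch makes visible why such decisions only indirectly affect individuals: nothing in `quoted_premium` ever inspects the person beyond the attributes used for clustering.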
The most important concern in this context is the protection of groups
from potential harm due to invasive and discriminatory data processing. In
this sense, the collective dimension of data processing is mainly focused on
the use of information [66,70], rather than on secrecy [83,84] and data quality.
Regarding the risk of discrimination, this section does not focus on the
unfair practices characterized by intentional discriminatory purposes, which
are generally forbidden and sanctioned by law [87,88],†
but on the involuntary
forms of discrimination in cases in which Big Data analytics provide biased
representations of society [89,90].
For example, in 2013, a study examined the advertising provided by Google
AdSense and found statistically significant racial discrimination in adver-
tisement delivery [91,92]. Similarly, Kate Crawford has pointed out certain
algorithmic illusions [93,94] and described the case of the City of Boston and
its StreetBump smartphone app to passively detect potholes [95].‡
Another example is the Progressive case, in which an insurance company
obliged drivers to install a small monitoring device in their cars to receive the
∗Contra Vedder [81], who claims that the notion of collective privacy “reminds of col-
lective rights,” but subjects of collective rights are groups or communities. Conversely, the
groups generated by group profiling are not communities of individuals sharing similar
characteristics and structured or organized in some way. For this reason, Vedder uses the
different definition of “categorial privacy.”
†See Article 14 of the Convention for the Protection of Human Rights and Fundamental
Freedoms; Article 21 of the Charter of Fundamental Rights of the European Union; Article
19 of the Treaty on the Functioning of the European Union; Directive 2000/43/EC; Directive
2000/78/EC.
‡In this case, the application had a signal problem, due to the bias generated by the
low penetration of smartphones among lower income and older residents. While the Boston
administration took this bias into account and solved the problem, less-enlightened pub-
lic officials might underestimate such considerations and make potentially discriminatory
decisions.
company’s best rates. The system treated late-night driving as a negative factor
but did not take into account the potential bias against low-income
individuals, who are more likely to work night shifts, compared with late-night
party-goers, “forcing them [low-income individuals] to carry more of the cost
of intoxicated and other irresponsible driving that happens disproportionately
at night” [76].
These cases represent situations in which a biased representation of groups
and society results from flawed data processing∗
or a lack of accuracy in the
representation. This produces potentially discriminatory effects as a conse-
quence of the decisions taken on the basis of analytics.
On the other hand, the decision to put in place different treatment of dif-
ferent situations may represent an intentional and legitimate goal for policy
makers, in line with the rule of law. This is the case of law and enforce-
ment bodies and intelligence agencies, which adopt solutions to discriminate
between different individuals and identify targeted persons. Here, there is a
deliberate intention to treat given individuals differently, but this is not
unfair or illegal provided it remains within existing legal provisions. Nonetheless, as
in the previous case, potential flaws or a lack of accuracy may cause harm to
citizens.†
Discrimination, in terms of the different treatment of different situations,
also appears in commercial contexts to offer tailored services to consumers.
In this case, in which the interests are of a purely private nature, commercial
practices may lead to price discrimination [99,100] or the adoption of different
terms and conditions depending on the assignment of consumers to a specific
cluster [56,99,101,102].
Thus, consumers classified as “financially challenged” belong to a cluster
“[i]n the prime working years of their lives [. . . ] including many single par-
ents, struggl[ing] with some of the lowest incomes and little accumulation of
wealth.” This implies the following predictive viewpoint, based on Big Data
analytics and regarding all consumers in the cluster: “[n]ot particularly loyal to
any one financial institution, [and] they feel uncomfortable borrowing money
and believe they are better off having what they want today as they never
know what tomorrow will bring” [56]. It is not hard to imagine the potential
discriminatory consequences of these classifications with regard to individuals
and groups.
These forms of discrimination are not necessarily against the law, espe-
cially when they are not based on individual profiles and only indirectly affect
∗This is the case of the errors that affect the E-Verify system, which is used in the
United States to verify whether a new worker is legally eligible to work [76,96].
†For instance, criticisms have been raised with regard to the aforementioned predic-
tive software adopted in recent years by various police departments in the U.S. Criticisms
also concern the use of risk assessment procedures based on analytics coupled with a cat-
egorical approach (based on typology of crimes and offenders) in U.S. criminal sentencing
[97,98].
individuals as part of a category, without their direct identification.∗
For this
reason, existing legal provisions against individual discrimination might not be
effective in preventing the negative outcomes of these practices, if adopted on
a collective basis. Still, such cases clearly show the importance of the collective
dimension of the use of information about groups of individuals.
From a data protection perspective and in the European Union, such data
analysis focusing on clustered individuals may not represent a form of personal
data processing, as the use of categorical analytics methodologies does not
necessarily make it possible to identify a person, and group profiles can be
made using anonymized data.†
This reduces the chances of individuals taking
action against biased representations of themselves within a group or having
access to the data-processing mechanisms, as the anonymized information
used for group profiling cannot be linked to them [88,104–106]. However, it
has been observed that “once a profile is linked to an identifiable person—for
instance in the case of credit scoring—it may turn into a personal data, thus
reviving the applicability of data protection legislation” [72].
It should be noted that, as group profiling based on analytics is used to
take decisions affecting a multiplicity of individuals, the main target of data
processing is not the data subject, but the clusters of people created by Big
Data gatherers. In this light, the interests that assume relevance are primarily
supraindividual and collective [86].
In general terms, collective interests may be shared by an entire group
without conflicts between the views of its members (aggregative interests) or
with conflicts between the opinions of its members (non-aggregative interests)
[86,107]. If the group is characterized by non-aggregative interests, the collec-
tive nature of the interest is represented by the fundamental values of a given
society (e.g., environmental protection).
With regard to data protection, the notion of collective non-aggregative
interests seems to be the best way to describe the collective dimension of the
use of personal information. In this sense, although individuals may have dif-
ferent opinions about the balance between the conflicting interests,‡
there are
some collective priorities concerning privacy and data protection that are of
relevance to the general interest. Here, the rationale for collective data pro-
tection is mainly focused on the potential harm to groups caused by extensive
and invasive data processing.
∗Regarding the decisions that affect an individual as member of a specific cluster of peo-
ple, it should be noted that in many cases, these decisions are not based solely on automated
processing [82]. In this sense, credit scoring systems have reduced but not removed human
intervention in credit evaluation. At the same time, classifications often regard identified
or identifiable individuals [103].
†On the limits of anonymization in the big data context, see section “Introduction. The
legal challenges of the use of data.”
‡In this sense, an extensive group profiling for commercial purposes can be passively
accepted, considered with favor or perceived as invasive and potentially discriminatory.
The same divergence of opinions and interests exists with regard to government social
surveillance for crime prevention and national security, in which part of the population is
in favor of surveillance, due to concerns about crime and terrorism.
Data-centered approach and socio-ethical impacts
Privacy and data protection are context-dependent notions, which vary
from culture to culture and across historical periods [37,104,108,109]. In the
same way, the related collective dimensions are necessarily influenced by his-
torical and geographical variables and are the result of actions by policymak-
ers. For these reasons, it is impossible to define a common and fixed balance
between collective data protection and conflicting interests.
There are jurisdictions that give greater priority to national and security
interests, which in many cases prevail over individual and collective data pro-
tection; meanwhile, in some countries, extensive forms of social surveillance
are considered disproportionate and invasive. Therefore, any balancing test
must focus on a specific social context in a given historical moment [110]. As
has been pointed out in the literature [111], defining prescriptive ethical guide-
lines concerning the values that should govern the use of Big Data analytics
and the related balance of interests is problematic.
Given such variability, from a theoretical perspective, a common frame-
work for a balancing test can be found in the values recognized by international
charters of fundamental rights. These charters provide a baseline from which to
identify the values that can serve as ethical guidance and to define the
existing relationships between these values [111].
In addition, the context-dependent framework of values and the relation-
ship between conflicting interests and rights needs to be specified with regard
to the actual use of Big Data analytics. In Europe, for instance, commercial
interests related to credit score systems can generally be considered compatible
with the processing of personal information, provided that data are adequate,
relevant, and not excessive in relation to the purposes for which they are collected.∗
Even so, specific Big Data analytics solutions adopted by some companies for
credit scoring purposes may lead to a disproportionate scrutiny of consumers’
private life. The same reasoning can also be applied to smart mobility solu-
tions, which can potentially lead to extensive social surveillance. This means
that a prior case-by-case risk-assessment is necessary to mitigate the potential
impact of these solutions on data protection and individual freedoms.
This “in-context” balance of conflicting interests is based on an impact
assessment that, in the presence of complex data collection and processing
systems, should not be conducted by consumers or companies but must entail
an active involvement of various stakeholders. Against this background, an
important aspect of the protection of collective interests relating to personal
information is the analysis of the existing conflicting interests and the repre-
sentation of the issues regarding the individuals grouped in clusters by data
gatherers.
∗See Articles 18 and 20 of the Directive 2014/17/EU. See also Article 8 of the
Directive 2008/48/EC on credit agreements for consumers and repealing Council Direc-
tive 87/102/EEC.
Here, it is useful to briefly consider the fields in which the group dimen-
sion of data protection is already known in more traditional contexts that
are not characterized by extensive data collection and use of analytics. For
instance, labor law recognizes this collective dimension of rights and the dua-
lism between individuals and groups.∗
Under certain circumstances, trade
unions and employees’ representatives may concur in taking decisions that
affect the employees and have an impact on data protection in the workplace.
Collective agreements on these decisions are based on the recognition that
the power imbalance in the workplace means that, in some cases, the employee
is unaware of the implications of employer’s policies (e.g., employers’ work-
place surveillance practices). Moreover, in many cases, this imbalance makes
it difficult for employees to object to the illegitimate processing of their data.
Entities representing collective interests (e.g., trade unions) are less vul-
nerable to power imbalance and have a broader vision of the impact of the
employer’s policies and decisions. It should also be noted that the employer’s
unfair policies and forms of control are often oriented toward discriminatory
measures that affect individual workers, even though they are targeted at the
whole group.
This collective representation of common interests is also adopted in other
fields, such as consumer protection and environmental protection. These con-
texts are all characterized by a power imbalance affecting one of the par-
ties directly involved (employees, consumers, or citizens). Furthermore, in
many cases, the conflicting interests refer to contexts in which the use of
new technologies makes it hard for users to be aware of the potential negative
implications.
The same situation of imbalance often exists in the Big Data context,
where data subjects are not in a position to object to discriminatory uses
of personal information by data gatherers. Data subjects often do not know
the basic steps of data processing, and the complexity of the process means
that they are unable to negotiate their information and are not aware of
the potential collective prejudices that underlie its use.† This is why it is
This is why it is
important to recognize the role of entities representing collective interests,
as happens in the cases described earlier.
Employees are part of a specific group, defined by their relationship with
a single employer; therefore, they are aware of their common identity and
have mutual relationships. By contrast, in the Big Data context, the common
attributes of the group often only become evident in the hands of the data
gatherer.
Data subjects are not aware of the identity of the other members of the
group, have no relationship with them, and have a limited perception of their
collective issues [112,113]. Furthermore, these groups shaped by analytics have
a variable geometry, and individuals can shift from one group to another.
∗See for example, Italian Statute of the Workers’ Rights, Articles 4 and 8, Act 300, May
20, 1970.
†See section “Introduction. The legal challenges of the use of data.”
This does not undermine the idea of representing collective data protec-
tion interests. On the contrary, this atomistic dimension makes the need for
collective representation more urgent. However, it is hard to imagine represen-
tatives appointed by the members of these groups, as is instead the case in the
workplace.
In this sense, there are similarities with consumer law, where there are
collective interests (e.g., product security, fair commercial practices), but the
potential victims of harm have no relationship to one another. Thus, individ-
ual legal remedies must be combined with collective remedies.∗
Examples of
possible complementary solutions are provided by consumer law, where inde-
pendent authorities responsible for consumer protection, class action lawsuits,
and consumer associations play an important role.
In the field of Big Data analytics, the partially hidden nature of the pro-
cesses and their complexity probably make timely class actions more difficult
than in other fields. For instance, in the case of product liability, the damages
are often more evident, making it easier for injured people to react.
On the other hand, associations that protect collective interests can play an
active role in facilitating reaction to unfair practices and, moreover, they can
be involved in a multistakeholder risk-assessment of the specific use of Big
Data analytics.
The involvement of such bodies requires specific procedural criteria to de-
fine the entities that may act in the collective interest.†
This is more difficult in
the context of Big Data, in which the groups created by data gatherers do not
have a stable character. In this case, an assessment of the social and ethical
impact of analytics often provides the opportunity to discover how data pro-
cessing affects collective interests and thus identify the potential stakeholders.
Multiple-risk assessment and collective interests
How collective interests should be protected against discrimination and
social surveillance in the use of Big Data analytics is largely a matter for
the policymakers. Different legal systems and different balances between the
components of society suggest differing solutions. Identifying the indepen-
dent authority charged with protecting collective interests may therefore be
difficult.
Many countries have independent bodies responsible for supervising spe-
cific social surveillance activities, and other bodies focused on antidiscrimi-
nation actions [114]. In other countries, this responsibility is spread across
various authorities, which take different approaches, use different remedies,
and do not necessarily cooperate in solving cases with multiple impacts.
Meanwhile, a central element in the risk-assessment of Big Data analytics
is the analysis of data processing, which is the factor common to all these
∗The same approach has been adopted in the realm of antidiscrimination laws [114,115].
†See also Article 80 GDPR.
situations, regardless of the potential harm to collective interests. For this
reason, data protection authorities can play a key role in the risk-assessment
processes, even if they are not focused on the specific social implications (e.g.,
discrimination).
On the other hand, if we take a different approach that takes into consider-
ation the various negative effects generated by the use of Big Data (discrimina-
tion, unfair consumer practices, social control, etc.), we should involve multiple
entities and authorities. Nevertheless, the end result may be a fragmented and
potentially conflicting decision-making process that may underestimate the
use of data, which is the common core of all these situations [95].
Furthermore, data protection authorities are accustomed to addressing
collective issues and have already demonstrated that they do consider both
the individual and the wider collective dimension of data processing. Focusing
on data protection and fundamental rights, they are also well placed to balance
the conflicting interests around the use of data.
The adequacy of the solution is also empirically demonstrated by impor-
tant cases decided by data protection authorities concerning data-processing
projects with significant social and ethical impacts. These cases show that
decisions to assess the impact of innovative products, services, and business
solutions on data protection and society are normally taken not on the initiative
of the data subjects but primarily on that of data protection authorities, who
are aware of the potential risks of such innovations. Based on their balancing
tests, these authorities are in a position to suggest measures that companies
should adopt to reduce the risks discussed here and to place these aspects
within the more general framework of the rights of the individual, as a single
person and as a member of a democratic society.
The risk assessment represents the opportunity for group issues to be iden-
tified and addressed. Thus, bodies representing collective interests should not
only partially exercise traditional individual rights on behalf of data sub-
jects but also exercise other autonomous rights relating to the collective
dimension of data protection. These new rights mainly concern participa-
tion in the risk-assessment process, which should take a multistakeholder
approach.∗
Against this background, data protection authorities may involve in the
assessment process the various stakeholders that represent the collective inter-
ests affected by specific data-processing projects [111,116].†
This would lead
to the definition of a new model in which companies that intend to use Big
Data analytics would undergo an assessment prior to collecting and processing
data.
∗The extent of the rights conferred upon the different stakeholders in the protection of
collective privacy is largely a matter for policymakers to decide and would depend on the
nature and values of the different sociolegal contexts.
†A different assessment exclusively based on the adoption of security standards or corpo-
rate self-regulation would not have the same extent and independency. This does not mean
that, in this framework, forms of standardization or coregulation cannot be adopted.
The assessment would not only focus on data security and data protec-
tion but also consider the social and ethical impacts relating to the collective
dimension of data use in a given project.∗
This assessment should be con-
ducted by third parties and supervised by the data protection authorities.†
Once this multiple-impact assessment is approved by data protection author-
ities, the ensuing data processing would be considered secure in protecting
personal information and collective interests.
Although data protection authorities are already engaged to some degree
in addressing the collective dimension, the suggested solution would lead to
a broader and deeper assessment, which would become mandatory. This pro-
posal is therefore in line with the view that a licensing scheme might “prove
to be the most effective means of ensuring that data protection principles do
not remain ‘law-in-book’ with respect to profiling practices” [44,104].
The guidelines adopted by the Council of Europe on the
protection of individuals with regard to the processing of
personal data in a world of Big Data
Although the guidelines provided by the Council of Europe on the basis
of the Convention 108 on data protection do not have the same impact as
Regulation (EU) 2016/679 in terms of efficacy and direct application, they
represent an interesting set of rules that, in some respects, shows a new way
of addressing the issues concerning the use of Big Data analytics.
Before briefly examining the provisions of the “Guidelines on the Protection
of Individuals with Regard to the Processing of Personal Data in a World of Big
Data” (hereafter Guidelines) adopted by the Council of Europe,‡
the nature
and the peculiarity of these guidelines should be highlighted.
Within the framework of the Convention 108, the guidelines are practical
and operative instructions provided by the Council of Europe to member
states. They are primarily addressed to data controllers and data processors,
to facilitate the effective application of the principles of the Convention in
∗In the Big Data context, another important aspect is the transparency of the algorithms
used by companies [55,64,82,88,90]. See Articles 13 (2)(f), 14 (2)(f), and 15 (1)(h) GDPR,
which recognize data subject’s right to receive “meaningful information about the logic
involved.”
†The entire system will work only if the political and financial autonomy of data pro-
tection authorities from governments and corporations is guaranteed. Moreover, data
protection authorities would need new competence and resources in order to bear the bur-
den of the supervision and approval of these multiple-impact assessments. For these reasons,
a model based on mandatory fees—paid by companies when they submit their requests
for authorization to data protection authorities—would be preferable [66]. It should also
be noted that, in cases of large-scale and multinational data collection, forms of mutual
assistance and cooperation may facilitate the role played by data protection authorities in
addressing the problems related to the dimensions of data collection and data gatherers.
‡The guidelines are available https://rm.coe.int/CoERMPublicCommonSearchServices/
DisplayDCTMContent?documentId=09000016806ebe7a.
specific sectors.∗
Nevertheless, unlike the guidelines previously adopted by the
Council of Europe, which concerned specific contexts or issues, these guidelines
focus on the use of a given technology (Big Data) and are not sector specific.†
The awareness of the critical issues posed by the new forms of data process-
ing based on analytics characterizes the entire text of the Guidelines. There-
fore, the principles of the Convention 108 are interpreted to provide adequate
solutions, taking into account “the given social and technological context”
and “a lack of knowledge on the part of individuals” with regard to Big Data
applications.‡
In this light, the effective safeguard of the individual’s “right to control
his or her personal data and the processing of such data”§
is placed in the
context of Big Data uses, in which processes of collection and analysis of data
are characterized by complexity and obscurity [64].
For this reason, the Guidelines do not consider the notion of control as
merely circumscribed to individual control (such as in the notice and consent
model) but adopt a broader idea of control over the use of data, according
to which “individual control evolves in a more complex process of multiple-
impact assessment of the risks related to the use of data.”¶
This goes beyond the individual dimension of data protection and investigates
aspects that concern the relations among individuals and society at large.
In this light, potential prejudices are not restricted to the
well-known privacy-related risks (e.g., illegitimate use of personal information,
data security) but also include other prejudices that may concern the conflict
with ethical and social values [15,82], in line with the Privacy, Ethical, and
Social Impact Assessment model (PESIA) mentioned above.∗∗
Nevertheless, the assessment concerning the impact of the use of data on
ethical and social values is more complicated than the traditional data
protection assessment. Moreover, although individual rights concerning data
∗See Section II (Scope) of the Guidelines.
†These guidelines do not provide an authoritative definition of Big Data, as there are
many definitions of Big Data, which differ depending on the specific discipline. The Guide-
lines cover both Big Data and Big Data analytics.
‡See Section I (Introduction) of the guidelines. See also Section II (Scope) of the guide-
lines (“Given the nature of Big Data, the application of some of the traditional principles
of data processing [e.g., minimization principle, purpose specification, meaningful consent,
etc.] may be challenging in this technological scenario. These guidelines therefore suggest a
tailored application of the principles of the Convention 108, to make them more effective in
practice in the Big Data context”).
§See Section I (Introduction) of the guidelines. See also the Preamble of the Draft mod-
ernized Convention for the Protection of Individuals with Regard to the Processing of
Personal Data (“Considering that it is necessary to secure the human dignity and protec-
tion of the human rights and fundamental freedoms of every individual and [. . . ] personal
autonomy based on a person’s right to control of his or her personal data and the processing
of such [personal] data”).
¶See the previous footnote.
∗∗See section “Use of data and risk-analysis.”
processing are generally recognized by different national regulations and
international conventions, and data security and data management best
practices are widely diffused among data controllers, the values that should
inspire the use of data are more indefinite and context based, changing from one
community to another. This makes it more complicated to identify a benchmark
for these values that can be used in the ethical and social risk assessment.
This point is clearly addressed in the section “Introduction. The legal
challenges of the use of data” of the fourth part (Principles and guidelines)
of the Guidelines. First, the section urges both data controllers and data
processors to “adequately take into account the likely impact of the intended
Big Data processing and its broader ethical and social implications.” Second,
it recognizes the relative nature of the social and ethical values and, in this
sense, the Guidelines require that data uses should not be in conflict with the
“ethical values commonly accepted in the relevant community or communities
and should not prejudice societal interests, values and norms.”
Although the Guidelines recognize the difficulties in defining the values
that should be taken into account in the social and ethical assessment, they do
not refrain from defining some practical steps to identify these values. Thus,
they suggest, “the common guiding ethical values can be found in international
charters of human rights and fundamental freedoms, such as the European
Convention for the Protection of Human Rights.”
Given the context-dependent nature of social and ethical assessment and
the fact that international charters may only provide high-level guidance, the
Guidelines combine this general suggestion with a more tailored option that is
represented by an "ad hoc ethics committee."∗
These committees, which already
exist in practice, should identify the specific ethical values to be safeguarded
with respect to a given use of data, providing more detailed and context-based
guidance for risk assessment.
The Guidelines put the risk-assessment process in the broader context of
the precautionary approach, which should characterize any new application
of technology that may produce potential risks for individuals and society.†
In this light, the Guidelines require data controllers to adopt preventive poli-
cies to adequately address and mitigate the potential risks related to the use
of Big Data analytics.‡
∗See Guidelines, IV.1.3 (“If the assessment of the likely impact of an intended data
processing described in section IV.2 highlights a high impact of the use of Big Data on
ethical values, data controllers could establish an ad hoc ethics committee, or rely on existing
ones, to identify the specific ethical values to be safeguarded in the use of data”).
†See Guidelines, IV.2.1 (“Given the increasing complexity of data processing and the
transformative use of Big Data, the Parties should adopt a precautionary approach in
regulating data protection in this field”).
‡See Guidelines, IV.2.2. This is consistent with the provision of the Modernized Con-
vention, which focuses both on risk analysis and the design of data processing “in such a
manner as to prevent or minimise the risk of interference with [. . . ] rights and fundamental
freedoms.” See Article 8bis (2) of the Draft modernised Convention for the Protection of
Individuals with Regard to the Processing of Personal Data.
28 Frontiers in Data Science
According to the general theory on the risk-based approach, the assessment
process is divided into the following four different stages:∗
(1) identification of
the risks, (2) analysis of the potential impact of the identified risks, (3) iden-
tification of the solutions to exclude or mitigate the risks, and (4) continuous
or periodical monitoring of the effectiveness of the solutions provided.†
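Purely as an illustration, the four stages above can be sketched as a minimal record structure that a data controller might keep per assessment; all names, the severity scale, and the one-point mitigation effect below are hypothetical and not drawn from the Guidelines:

```python
from dataclasses import dataclass, field

@dataclass
class RiskAssessment:
    """Minimal record of one pass through the four-stage process."""
    risks: list = field(default_factory=list)        # stage 1: identified risks
    impacts: dict = field(default_factory=dict)      # stage 2: risk -> severity (1-5)
    mitigations: dict = field(default_factory=dict)  # stage 3: risk -> chosen measure
    review_log: list = field(default_factory=list)   # stage 4: periodic monitoring notes

    def residual_high_risks(self, threshold=3):
        # Toy model: a mitigation is assumed to lower severity by one point.
        return [r for r in self.risks
                if self.impacts.get(r, 0) - (1 if r in self.mitigations else 0) > threshold]

assessment = RiskAssessment(
    risks=["re-identification", "discriminatory profiling"],
    impacts={"re-identification": 4, "discriminatory profiling": 5},
    mitigations={"re-identification": "pseudonymisation"},
)
print(assessment.residual_high_risks())  # ['discriminatory profiling']
```

The point of the sketch is only that stage four feeds back into stage one: risks that remain above the threshold after mitigation must be re-examined and documented.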
This is the traditional scheme that characterizes risk-assessment. Here,
the most innovative aspect concerns the broader range of interests considered
in the assessment process, which goes beyond the traditional notion of data
protection. In this sense, the right to nondiscrimination and the social and
ethical impacts of data processing activities assume specific relevance.
Given the complexity of this assessment and the different aspects that
should be taken into account, it cannot be conducted only by experts in data
protection law but requires external auditors with specific and multidisci-
plinary skills. In this light, these guidelines require that the risk-assessment
“should be carried out by persons with adequate professional qualifications
and knowledge to evaluate the different impacts, including the legal, social,
ethical and technical dimensions.”‡
Moreover, the collective dimension of the potential impact of the use of data leads the Guidelines to encourage a multistakeholder approach that gives voice to the different groups of persons that may be affected by a given use of data.§
Due to the complexity of this assessment and the continuous evolution
of both the potential risks and the measures to tackle them, data protection
authorities may play a relevant role in supporting data controllers, providing
information about the state-of-the-art of data-processing security methods,
and providing detailed guidelines on the risk-assessment process.¶
From the data subject’s perspective, a better understanding of the pur-
poses of data processing can come from the analysis of the way in which data
uses impact on individuals and society. In this light, the disclosure of the
results of the different impacts mentioned above should become part of the
duties of transparency of data controllers, to increase individuals’ awareness
about their choices concerning personal information.∗∗
With regard to the level of disclosure that should characterize the pub-
licity of the impact assessment, the Guidelines, in line with the suggestions of legal scholars [66,111], clarify that the public availability of the result of
the assessment should be made “without prejudice to secrecy safeguarded by
law.” Therefore, in the presence of such secrecy, data controllers “shall provide
∗See Guidelines, IV.2.5.
†See Guidelines, IV.2.9. Moreover, data controllers shall document the assessment and
these solutions (Guidelines, IV.2.10).
‡See Guidelines, IV.2.6.
§See Guidelines, IV.2.7 (“With regard to the use of Big Data which may affect funda-
mental rights, the Parties should encourage the involvement of the different stakeholders
(e.g., individuals or groups potentially affected by the use of Big Data) in this assessment
process and in the design of data processing”).
¶See Guidelines, IV.2.8.
∗∗See Guidelines, IV.3.2 and 3.3.
any sensitive information in a separate annex to the risk-assessment report.”
In any event, although this annex is not public, it may be accessed by supervisory
authorities.∗
Minor provisions of these guidelines concern the by-design approach†
and
data subject’s consent. With regard to the latter and the notice and consent
model, the Guidelines highlight that the notice should include the outcome of the assessment process and "might also be provided by means
of an interface that simulates the effects of the use of data and its potential
impact on the data subject, in a learn-from-experience approach.”‡
Moreover,
consent cannot be considered freely given when “there is a clear imbalance of
power between the data subject and the Data Controllers or Data Processors,
which affects the data subject’s decisions with regard to the processing.”§
Finally, the Guidelines devote a section to the role of human intervention in Big Data–supported decisions,¶
reaffirming that the use of Big Data
“should preserve the autonomy of human intervention in the decision-making
process.” In this light, when decisions based on Big Data might affect in-
dividual rights significantly or produce legal effects, a human decision-maker
should, upon request of the data subject, “provide her or him with the reason-
ing underlying the processing, including the consequences for the data subject
of this reasoning.” In the same vein, the autonomy of decision makers should
be preserved and, on the basis of “reasonable arguments,” they should be al-
lowed the freedom not to rely on the result of the recommendations provided
using Big Data.
Data prediction: Social control and social surveillance
Big Data prediction promises remarkable opportunities to detect fraud and to prevent crime but, at the same time, its use could also
threaten fundamental legal rights such as privacy and due process [68].
Law enforcement agencies [73], secret services [117], doctors, lawyers,∗∗
accountants [55], and judges††
are using Big Data predictive analytics solutions
∗See Guidelines, IV.3.2.
†See Guidelines, IV.4.
‡See Guidelines, IV.5.1.
§See Guidelines, IV.5.3. In these cases, data controller “should demonstrate that this
imbalance does not exist or does not affect the consent given by the data subject.”
¶See Guidelines, IV.7.
∗∗See ROSS, the first artificially intelligent lawyer, at the following URL: http://www.rossintelligence.com; see IBM Watson, at the following URL: http://www-03.ibm.com/innovation/us/watson.
††A recent experiment demonstrates that artificial intelligence has been used to predict
decisions of the European Court of Human Rights (ECtHR) with 79% accuracy. Further information at: http://www.legalfutures.co.uk/latest-news/robot-judge-ai-predicts-outcome-european-court-cases.
as they are well aware of how these tools can be useful and/or profitable espe-
cially in a society increasingly preoccupied with the concepts of risk and public
protection [118]. However, new technologies enhance preemptive profiling of
individuals as the combination of predictive strategies and increased surveil-
lance allow for more targeted profiles.
Kerr and Earle identified three categories of Big Data prediction: conse-
quential, preferential, and preemptive prediction.
Consequential prediction is, in general terms, an attempt to anticipate the likely consequences of a person's action. Usually, this is the kind of prediction used by a lawyer to show a client a realistic scenario for her defense strategy.
Preferential prediction is mostly used by private players (iTunes Genius and Amazon's recommendation engine are two significant examples), and it uses anticipatory algorithms based on social media intelligence to predict what kind of service a user will find interesting.
Preemptive predictions assess the likely consequences of allowing or dis-
allowing a person to act in a certain way. In contrast to consequential or
preferential predictions, preemptive predictions do not usually adopt the
perspective of the actor. Preemptive predictions are mostly made from
the standpoint of the state, a corporation, or anyone who wishes to pre-
vent or forestall certain types of action. Preemptive predictions are not
concerned with an individual’s actions but with whether an individual
or group should be permitted to act in a certain way. Examples of this
technique include a no-fly list used to preclude possible terrorist activ-
ity on an airplane, or analytics software used to determine how much
supervision parolees should have based on predictions of future behavior
[118].
This latter form of prediction could considerably threaten the fundamental rights enshrined in any democratic constitution. Ferguson correctly questioned whether a computer program that predicts the probability of future crime locations could change Fourth Amendment protections in the targeted area. Furthermore, are data-driven hunches more reliable than personal hunches, traditionally deemed insufficient to justify reasonable suspicion?
Use of data during the investigation: Reasonable doubt versus
reasonable suspicion
The new reality, which has been briefly described in the previous section,
simultaneously undermines the protection that reasonable suspicion provides
against stops and potentially transforms reasonable suspicion into a means of
justifying those same stops.
Reasonable suspicion in the United States is a legal standard of proof that
is less than probable cause, the legal standard for arrests and warrants, but
more than an unparticularized suspicion; it must be based on specific and
articulable facts, “taken together with rational inferences from those facts,”
and the suspicion must be associated with the specific individual.
In Europe, Article 5 of the European Convention on Human Rights states that "everyone has the right to liberty and security of person. No one shall be deprived of his liberty save in the following cases and in accordance
with a procedure prescribed by law [. . . ].” This means that a reasonable
suspicion presupposes the existence of facts or information that would sat-
isfy an objective observer that the person concerned may have committed
an offence.∗
Therefore, a failure by the authorities to make a genuine inquiry into the basic facts of a case, to verify whether a complaint was well
founded, disclosed a violation of Article 5 §1 (c) of the European Convention on
Human Rights.†
To better understand the consequences of the principle of reasonable suspicion in the Big Data scenario, a practical example may be helpful. Suppose police investigating a series of robberies in a particular neighborhood have in their patrol cars facial-recognition software, connected to a database of arrest photos, that scans people on the street.
Suddenly, there is a match with a suspected person. The suspect's personal information scrolls across the patrol car's computer screen: prior robbery arrests and robbery convictions. The officer then searches additional sources
of third-party data, including the suspect’s GPS location information for the
last six hours, or license plate records that tie the suspect to pawn shop trades
close in time prior to robberies and—obviously—social media information. The
police now have particularized, individualized suspicion about a man who is
not doing anything overtly criminal.
Can this aggregation of individualized information be sufficient to justify
interfering with a person’s constitutional liberty?‡
This question, and more,
will be raised by the use of any predictive policing strategy.
Big Data and social surveillance: Public and private interplay
in social control
The interaction between public and private actors in social control can be divided into two categories, both of which are significant with regard to data protection. The first concerns the collection of private company data by government for surveillance and social control purposes, whereas the second is the use by judicial authorities of instruments and technologies provided by private companies for organizational and investigative purposes.
∗ECHR, Ilgar Mammadov v. Azerbaijan, §88; Erdagöz v. Turkey, §51; Fox, Campbell
and Hartley v. the United Kingdom, §32.
†ECHR, Stepuleac v. Moldova, §73; Elçi and Others v. Turkey, §674.
‡All these investigative instruments could be used on the basis of the following principle:
law enforcement officers may access many of these records without violating the Fourth
Amendment, under the theory that we can claim no reasonable expectation of privacy in
information we have knowingly revealed to third parties.
With regard to the first category and especially when the request is made
by governmental agencies, the issue of the possible violation of fundamental
rights becomes more delicate. The Echelon Interception System [119] and the
Total Information Awareness program [120] are concrete examples that are
not isolated incidents, but undoubtedly the National Security Agency (NSA) case has clearly shown how invasive surveillance can be in the era of global data flows and Big Data. To better understand the NSA case, it is important to have an overview of the considerable body of electronic surveillance legislation that, particularly in the wake of 9/11, has been approved in the United States and, to a certain extent, in a number of European countries.
The most important legislation is the Foreign Intelligence Surveillance Act
(FISA) of 1978,∗ which lays down the procedures for collecting foreign intelligence information through the electronic surveillance of communications for
homeland security purposes. Section 702 of FISA, as amended in 2008,
extended its scope beyond interception of communications to include any data
in public cloud computing as well. Furthermore, this section clearly indicates
that two different regimes of data processing and protection exist for U.S.
citizens and residents on the one hand, and non-U.S. citizens and residents on
the other. More specifically, the Fourth Amendment applies only to U.S. citizens, as there are no cognizable privacy rights for non-U.S. persons under FISA.
Under FISA and its 2008 amendment, U.S. authorities were able to access and process personal data of EU citizens on a
large scale via, among others, the NSA’s warrantless wiretapping of cable-
bound internet traffic (UPSTREAM) and direct access to the personal data
stored in the servers of U.S.-based private companies such as Microsoft,
Yahoo, Google, Apple, Facebook, or Skype (PRISM), through cross-database
search programs such as X-KEYSCORE. U.S. authorities also have the power
to compel disclosure of cryptographic keys, including the secure sockets
layer (SSL) keys used to secure data in transit by major search engines,
social networks, webmail portals, and Cloud services in general (BULLRUN
Program) [121].
Although FISA is the best-known and most widely applied legislative tool for conducting intelligence activities, there are other relevant pieces of legislation on
electronic surveillance. One needs only to consider the Communications Assis-
tance for Law Enforcement Act of 1994,†
which authorizes the law enforcement
and intelligence agencies to conduct electronic surveillance by requiring that
telecommunications carriers and manufacturers of telecommunications equip-
ment modify and design their equipment, facilities, and services to ensure that
they have built-in surveillance capabilities.
∗Foreign Intelligence Surveillance Act (50 U.S.C. §1801–1885C).
†See Communications Assistance for Law Enforcement Act (18 USC §2522).
Surveillance programs are not limited to the United States, however.
In Europe, the Communications Capabilities Development Program has
prompted a huge amount of controversy, given its intention to create a ubiq-
uitous mass surveillance scheme for the United Kingdom in relation to phone
calls, text messages and e-mails, and extending to logging communications
on social media. In June 2013, the so-called TEMPORA program showed that the UK intelligence agency Government Communications Headquarters had cooperated with the NSA in surveillance and spying activities [122]. These
revelations were followed in September 2013 by reports focusing on the activ-
ities of Sweden's National Defense Radio Establishment. Similar projects for the large-scale interception of telecommunications data have been developed
by both France’s General Directorate for External Security and Germany’s
Federal Intelligence Service.
Although EU and U.S. surveillance programs may seem similar, there is one important difference: in the European Union, under data protection law, individuals always retain control over their own personal data, whereas in the United States, individuals have more limited control once they have subscribed to the terms and conditions of a service.∗
Beyond government agencies' monitoring activities, the second category, the use by judicial authorities of private tools for investigative purposes, offers two interesting examples.
The first is the PredPol†
software initially used by the Los Angeles police
force and now by other police forces in the United States (Palm Beach, Mem-
phis, New York, Chicago, Minneapolis, and Dallas). Police Chief (ret.) William
J. Bratton and the Los Angeles Police Department (LAPD) are credited with envisioning PredPol; Charlie Beck, an LAPD officer since 1977 and its chief since 2009, wrote in 2009, "what can we learn from Wal-Mart and Amazon about fighting crime in a recession? Predictive policing leverages advanced analytics to
enable information-based approaches to law enforcement tactics, strategy, and
policy, enhancing public safety and changing outcomes. Advanced analytics
tools, techniques, and processes support meaningful exploitation of public-
safety data necessary to turn data into knowledge and guide information-based
prevention, thwarting, mitigation, and response.”
Predictive policing, in essence, cross-checks data, places, and techniques of
recent crimes with disparate sources, analyzing them and then using the re-
sults to anticipate, prevent, and respond more effectively to future crime. Even
if the software house that created PredPol declares that no profiling activities are carried out, it is essential to understand carefully the technology used to anonymize the personal data acquired from the law-enforcement database.
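The place-based aggregation underlying such tools can be illustrated in deliberately toy form: count recent incidents per grid cell and flag the busiest cells. PredPol's actual model is proprietary and far more elaborate; the cell size and coordinates below are hypothetical:

```python
from collections import Counter

CELL = 0.01  # grid cell size in degrees (hypothetical)

def cell_of(lat, lon):
    """Snap a coordinate to the nearest grid cell."""
    return (round(lat / CELL), round(lon / CELL))

def hotspots(incidents, top=2):
    """Return the `top` grid cells with the most recorded incidents."""
    counts = Counter(cell_of(lat, lon) for lat, lon in incidents)
    return [cell for cell, _ in counts.most_common(top)]

# Hypothetical (lat, lon) coordinates of recent robberies
recent = [(34.050, -118.250), (34.051, -118.249),
          (34.052, -118.251), (34.060, -118.300)]
print(hotspots(recent, top=1))  # the cell containing the first three incidents
```

Even such a trivial counter shows why the anonymization question matters: every flagged cell is derived directly from records of identifiable incidents.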
∗See United States v. Miller (425 U.S. 435 [1976]). In this case the United States Supreme
Court held that the “bank records of a customer’s accounts are the business records of the
banks and that the customer can assert neither ownership nor possession of those records.”
The same principle could be applied to an Internet Service Provider.
†See PredPol, predictive policing software, available at www.predpol.com/.
This type of software is bound to have a major impact in the United States
on the conception of the protection of rights under the Fourth Amendment,
and more specifically on concepts such as probable cause and reasonable sus-
picion that in the future may come to depend on an algorithm rather than human
choice [73].
The second example is the Geofeedia software.∗ This software maps a given location, such as a certain block within a city or even an entire metropolitan area, and searches the public Twitter and/or Facebook feeds to identify any geolocated posts within that specific area over the past days. This application can provide particularly useful data for the purpose of social control. One can imagine having useful elements (e.g., an IP address) to identify the individuals present in a given area during a serious car accident or a terrorist attack.
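The kind of geofenced query such a tool runs can be sketched as follows, assuming posts already carry latitude and longitude; the field names and coordinates are hypothetical, not Geofeedia's actual API:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def posts_within(posts, center, radius_km):
    """Filter geolocated posts to those inside a circular geofence."""
    clat, clon = center
    return [p for p in posts
            if haversine_km(p["lat"], p["lon"], clat, clon) <= radius_km]

posts = [
    {"user": "a", "lat": 40.7580, "lon": -73.9855},  # near Times Square
    {"user": "b", "lat": 40.6892, "lon": -74.0445},  # near the Statue of Liberty
]
print(posts_within(posts, center=(40.7580, -73.9855), radius_km=1.0))  # only user "a"
```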
From a strictly legal standpoint, these social control tools may be employed to gather information from citizens directly on the basis of the following principle of publicity: "Where someone does an act in public, the observance and recording
of that act will ordinarily not give rise to an expectation of privacy” [123].
In the European Union, although this type of data collection frequently takes place, it may conflict with European Court of Human Rights (ECHR) case law, which, in the Rotaru v. Romania case,†
ruled that “public
information can fall within the scope of private life where it is systemati-
cally collected and stored in files held by the authorities.” As O’Floinn [124]
observes, “Non-private information can become private information depend-
ing on its retention and use. The accumulation of information is likely to result
in the obtaining of private information about that person.”
In the United States, this subject has been addressed in the case People v.
Harris;‡
the New York County District Attorney’s Office sent a subpoena to
Twitter, Inc. seeking to obtain the Twitter records of a user suspected of having
participated in the Occupy Wall Street movement. Twitter refused to provide
the law enforcement officers with the information requested and sought to
quash the subpoena. The Criminal Court of New York upheld the application made by the New York County District Attorney's Office, rejecting
the arguments put forward by Twitter, stating that tweets are, by defini-
tion, public, and that a warrant is not required to compel Twitter to disclose
them. The District Attorney’s Office argued that the third party disclosure
doctrine put forward for the first time in United States v. Miller was
applicable.§
∗See https://geofeedia.com/. The ACLU of California has recently obtained records showing that Twitter, Facebook, and Instagram provided user data access to Geofeedia, a developer of a social media monitoring product marketed to law enforcement as a tool to monitor activists and protesters. More information at: https://www.aclunc.org/blog/facebook-instagram-and-twitter-provided-data-access-surveillance-product-marketed-target.
†See Rotaru v. Romania (App. No. 28341/95) (2000) 8 B.H.R.C. at [49].
‡See 2012 NY Slip Op 22175 [36 Misc 3d 868].
§See United States v. Miller (425 U.S. 435 [1976]).
The EU reform on data protection
In addition to the GDPR, the new directive on the protection of individuals
with regard to the processing of personal data by competent authorities (DPI)
establishes some protection against a possible violation of EU citizens’ privacy.
The goal of this directive is to ensure that “in a global society characterized
by rapid technological change where information exchange knows no borders,”
the fundamental right to data protection is consistently protected.∗
The founding principles of this directive, which are shared with the previ-
ous directives referred to, are twofold:
1. First, there is the need for fair, lawful, and adequate data processing
during criminal investigations or to prevent a crime, on the basis of
which all data must be collected for specified, explicit, and legitimate
purposes and must be erased or rectified without delay.†
2. Then there is the obligation to make a clear distinction between the
various categories of the possible data subjects in a criminal proceeding
(persons with regard to whom there are serious grounds for believing
that they have committed or are about to commit a criminal offence,
persons convicted, victims of criminal offense, and third parties to the
criminal offence).
For each of these categories, there must be a different, adequate level of attention to data protection, especially for persons who do not fall within any of the categories referred to previously.‡
These two principles are of considerable importance, although their application on a practical level will be neither easy nor immediate in certain member states. This is easily demonstrated by the difficulties encountered
when either drafting practical rules distinguishing between several categories
of potential data subjects within the papers on a court file, or attempting to
identify the principle on the basis of which a certain court document is to be
erased.
In addition to these two general principles, the provisions of the directive are interesting and confirm consolidated data protection principles. Suffice it to mention here the prohibition on measures based solely on automated processing of personal data that significantly affect or produce an adverse legal effect for the data subject,§
as well as the implementation of
∗See DPI, explanatory Memorandum, (SEC(2012) 72 final).
†Art. 4, DPI and Art. 4b, Directive 2016/680 of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data by competent authorities for the purposes of prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, and the free movement of such data, available at: http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016L0680&from=EN.
‡Art. 5, DPI.
§Art. 9a, DPI.
data protection by design and by default mechanisms to ensure the protection of the data subject's rights and the processing of only those personal data that are necessary.∗
Furthermore, the directive entails the obligation to designate a data
protection officer in all law-enforcement agencies to monitor the imple-
mentation and application of the policies on the protection of personal
data.†
These principles constitute a significant limitation on possible data mining of personal and sensitive data collected by law enforcement agencies. Although most of these provisions were also present in Recommendation No. R (87) of the Council of Europe and in Framework Decision 2008/977/JHA, promoting data protection by design and by default mechanisms and measures could encourage data anonymization and help to avoid the indiscriminate use of automated processing of personal data.
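As an illustration of the kind of anonymization check that by-design measures could encourage, a minimal k-anonymity test over hypothetical quasi-identifiers can be sketched as follows; the directive itself prescribes no particular technique:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Smallest equivalence-class size over the quasi-identifier columns.
    A dataset is k-anonymous if every combination occurs at least k times."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(combos.values())

# Hypothetical records: ZIP code and birth year act as quasi-identifiers.
records = [
    {"zip": "10010", "year": 1980, "offence": "theft"},
    {"zip": "10010", "year": 1980, "offence": "fraud"},
    {"zip": "10011", "year": 1975, "offence": "theft"},
]
print(k_anonymity(records, ["zip", "year"]))  # 1 -> the third record is unique
```

A result of 1 means that at least one individual is uniquely identifiable from the quasi-identifiers alone, which is precisely the indiscriminate re-identification risk the by-default principle aims to curb.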
References
[1] ITU. 2015. Recommendation Y.3600: Big data—Cloud comput-
ing based requirements and capabilities. http://www.itu.int/itu-t/recommendations/rec.aspx?rec=12584 (accessed July 23, 2016).
[2] ENISA. 2015. Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics. https://www.enisa.europa.eu/publications/big-data-protection/at_download/fullReport (accessed June 15, 2016).
[3] Bollier, D. 2010. The promise and perils of big data. Aspen Institute,
Communications and Society Program. http://www.aspeninstitute.org/sites/default/files/content/docs/pubs/The_Promise_and_Peril_of_Big_Data.pdf (accessed February 27, 2014).
[4] Paparrizos, J., White, R.W., and Horvitz, E. 2016. Screening for pan-
creatic adenocarcinoma using signals from web search logs: Feasibility
study and results. Journal of Oncology Practice 12(8): 737–744.
[5] Golle, P. 2006. Revisiting the uniqueness of simple demographics in the
US population. In Juels, A. (Ed.), Proceedings of the 5th ACM Workshop
on Privacy in Electronic Society. New York: ACM.
∗Art. 19, DPI.
†Art. 30, DPI.
[6] Narayanan, A., and Felten, E.W. 2014. No silver bullet: De-identification
still doesn't work. http://randomwalker.info/publications/no-silver-bullet-de-identification.pdf (accessed March 25, 2015).
[7] Narayanan, A., Huey, J., and Felten, E.W. 2015. A precautionary
approach to big data privacy. http://randomwalker.info/publications/precautionary.pdf (accessed April 4, 2015).
[8] Ohm, P. 2010. Broken promises of privacy: Responding to the surprising
failure of anonymization. UCLA Law Review 57(6): 1701–1777.
[9] Sweeney, L. 2000. Foundations of privacy protection from a computer
science perspective. Proceedings of the Joint Statistical Meeting, AAAS,
Indianapolis, IN. http://dataprivacylab.org/projects/disclosurecontrol/paper1.pdf (accessed January 24, 2015).
[10] Sweeney, L. 2000. Simple demographics often identify people uniquely.
Pittsburgh, PA: Carnegie Mellon University, Data Privacy Working
Paper 3. http://dataprivacylab.org/projects/identifiability/paper1.pdf (accessed January 24, 2015).
[11] Sweeney, L. 2015. Only you, your doctor, and many others
may know. Technology Science, September 29. http://techscience.org/a/2015092903 (accessed November 28, 2015).
[12] United States General Accounting Office. 2011. Record linkage and
privacy. Issues in creating New Federal Research and Statistical Informa-
tion. http://www.gao.gov/assets/210/201699.pdf (accessed December 14, 2013).
[13] Mantelero, A. 2015. Data protection, e-ticketing and intelligent systems
for public transport. International Data Privacy Law 5(4): 309–320.
[14] The White House. 2012. Consumer data privacy in a networked world: A framework for protecting privacy and promoting innovation in the global digital economy. http://www.whitehouse.gov/sites/default/files/privacy-final.pdf (accessed June 25, 2014).
[15] The White House, Executive Office of the President. 2014. Big data:
Seizing opportunities, preserving values. http://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf (accessed December 26, 2014).
[16] Mayer-Schönberger, V. 1997. Generational development of data protec-
tion in Europe? In Agre, P.E., and Rotenberg, M. (Eds.), Technology
and Privacy: The New Landscape. Cambridge, MA: MIT Press.
[17] Article 29 Data Protection Working Party. 2011. Opinion 15/2011
on the definition of consent. http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2011/wp187_en.pdf (accessed February 27, 2014).
[18] Article 29 Data Protection Working Party. 2014. Opinion 06/2014 on
the notion of legitimate interests of the data controller under Article 7
of Directive 95/46/EC. http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp217_en.pdf (accessed February 27, 2014).
[19] Brownsword, R. 2009. Consent in data protection law: Privacy, fair pro-
cessing and confidentiality. In Gutwirth, S., Poullet, Y., De Hert, P.,
de Terwangne, C., and Nouwt, S. (Eds.), Reinventing data protection?
Dordrecht, the Netherlands: Springer.
[20] European Commission, Directorate-General Justice, Freedom and Secu-
rity. 2010. Comparative study on different approaches to new privacy
challenges, in particular in the light of technological developments:
Working Paper No. 2: Data protection laws in the EU. The difficulties
in meeting challenges posed by global social and technical developments.
http://ec.europa.eu/justice/policies/privacy/docs/studies/new privacy
challenges/final report working paper 2 en.pdf (accessed July 5, 2014).
[21] Van Alsenoy, B., Kosta, E., and Dumortier, J. 2014. Privacy notices
versus informational self-determination: Minding the gap. International
Review of Law, Computers, & Technology 28(2): 185–203.
[22] Cranor, L.F. 2012. Necessary but not sufficient: Standardized mecha-
nisms for privacy and choice. Journal on Telecommunications and High
Technology Law 10: 273–307.
[23] Richards, N.M., and King, J.H. 2014. Big data ethics. Wake Forest Law
Review 49: 339–432.
[24] Moerel, L. 2014. Big Data Protection: How to Make the Draft EU Regula-
tion on Data Protection Future Proof. Tilburg, the Netherlands: Tilburg
University. http://www.debrauw.com/wp-content/uploads/NEWS%20-
%20PUBLICATIONS/Moerel oratie.pdf (accessed October 15, 2016).
[25] Rubinstein, I.S. 2013. Big data: The end of privacy or a new beginning?
International Data Privacy Law 3(2): 74–87.
[26] Henkin, L. 1974. Privacy and autonomy. Columbia Law Review 74(8):
1419–1433.
[27] Murphy, R.S. 1996. Property rights in personal information: An eco-
nomic defense of privacy. Georgetown Law Journal 84: 2381.
[28] Parent, W.A. 1983. A new definition of privacy for the law. Law &
Philosophy 2(3): 305–338.
Legal aspects of information science, data science, and Big Data 39
[29] Wacks, R. 1980. The poverty of “privacy.” Law Quarterly Review 96:
73–78.
[30] Wacks, R. 1980. The Protection of Privacy. London: Sweet & Maxwell.
[31] Zimmerman, D.L. 1983. Requiem for a heavyweight: A farewell to
Warren and Brandeis’s privacy tort. Cornell Law Review 68(3):
291–367.
[32] Costa, L., and Poullet, Y. 2012. Privacy and the regulation of 2012.
Computer Law & Security Review 28(3): 254–262.
[33] Secretary’s Advisory Committee on Automated Personal Data Systems.
1973. Records, computers and the rights of citizens. http://epic.org/
privacy/hew1973report/ (accessed February 27, 2014).
[34] Schudson, M. 1978. Discovering the News. A Social History of American
Newspaper. New York: Basic Books.
[35] Breckenridge, A.C. 1970. The Right to Privacy. Lincoln, NE: University
of Nebraska.
[36] Solove, D.J. 2008. Understanding Privacy. Cambridge, MA: Harvard
University Press.
[37] Westin, A.F. 1970. Privacy and Freedom. New York: Atheneum.
[38] Schwartz, P.M. 2013. The E.U.-US privacy collision: A turn to institu-
tions and procedures. Harvard Law Review 126: 1966–2009.
[39] Brenton, M. 1964. The Privacy Invaders. New York: Coward-McCann.
[40] Miller, A.R. 1971. The Assault on Privacy Computers, Data Banks,
Dossiers. Ann Arbor, MI: University of Michigan Press.
[41] Packard, V. 1964. The Naked Society. New York: David McKay.
[42] Bennett, C.J. 1992. Regulating Privacy: Data Protection and Public Pol-
icy in Europe and the United States. Ithaca, NY: Cornell University
Press.
[43] Agre, P.E., and Rotenberg, M. (Eds.). 1997. Technology and Privacy:
The New Landscape. Cambridge, MA: MIT Press.
[44] Bygrave, L.A. 2014. Data Privacy Law. An International Perspective.
Oxford, UK: Oxford University Press.
[45] Petrison, L.A., Blattberg R.C., and Wang, P. 1997. Database mar-
keting. Past, present, and future. Journal of Direct Marketing 11(4):
109–125.
Exploring the Variety of Random
Documents with Different Content
only two villas on the Terrace, and they pertained variously to a Paris
specialist in madness, and the controller-general of a great French
bank. Between the two villas lay a large and valuable plot of ground,
overgrown and tangled up with creepers, brambles, cabbage stalks,
rose bushes, and seeding onions, set in the midst of which was a
dilapidated one-room hut. The hut was the fly in the ointment of the
specialist in lunacy and the controller-general. They could do nothing
to remove this picturesque slum from their gates, for old veuve
Michel, who lived there and drank two bottles of cognac a day and
sang gay ribald songs by night, owned the land by right of some old
French statute, and no one could turn her out for as long as she
lived. Haidee and Bran considered veuve Michel a very charming
person indeed. She was fat and merry and gentle, called them her
nice little hens and gave them apples and pears (for she also owned
an orchard up the cliff) all through the winter when there was no
fruit to be got any nearer than Cherbourg. Naturally they liked and
appreciated the old woman. Haidee had a good mind to go in and
pay her a visit, but she decided it was better not, as old veuve would
just be sleeping off her morning bottle of cognac preparatory to
starting on the afternoon one; also Haidee remembered that she
was hungry, and had better hurry back and help get lunch. Still she
could not help stopping once or twice to examine for signs of little
pink tips the lower branches of the tamarisk-trees which grew on
one side of the Terrace--on the other side was the grey stone river
wall with the tide lapping blue against it.
Haidee loved tamarisks with a joy that she was sure was unholy
because they looked so wicked and painted somehow when they
were all dressed out in their pink feathers. She fancied that Jezebel
must have had a bunch of them stuck like an aigrette in her
beautifully coiffée hair, and the same pink tint on her cheeks when
she looked out of the window for the last time. Anyway why were
tamarisks the only trees to be found growing in the ruins of
Babylon? And why had she read somewhere, that in the days of
ancient Rome tamarisks were bound around the heads of criminals?
It was a nuisance to have to forsake these interesting meditations to
enter the little soap-scented shop of the village barber, but she
stayed no longer than to bid him come to the Villa at three o'clock to
cut off Madame's hair. Next she called at Lemonier's to command a
sack of coal, and noted that Lemonier had evidently been drunk
again, for Madame had a black eye. It was funny to think that such a
jolly big red man should be so cruel! Haidee meditated on this
subject on the return journey, also on the horrible price of coal--
sixty-five francs a ton and it disappeared like lightning. No one
seemed to know why "Carr-diff," as they called it, should be so dear.
Hortense, closely questioned on the subject by Val anxious for
information, said that it must be because the people in England
hated the French and were still angry that Normandy did not belong
to them.
"Well, haven't you got any coal mines in your own blessed
country?" asked Haidee.
"Certainement!" Hortense had replied indignantly. "We have
Newcas-sel!"
――――
The barber arrived at three o'clock, and Val sat trembling before
her dressing-table. She had arranged two mirrors so that she could
view the whole proceeding, but as soon as the barber commenced
she closed her eyes tight. Bran and Haidee stationed themselves at
either side of the table to see fair play.
The barber was frankly amazed at the decision of Madame to
cut off her feathery hair. Even at the last moment he asked--holding
it up in his hands and shaking it out in sprays:
"Does Madame realise what a change it will make in her
appearance? Would it not be better if Madame had it merely cut
short, leaving about two inches all round à la Jeanne d'Arc, so--?" He
stuck his little pudgy fingers out below her ears to show the
desirable length.
"No, no, no!" cried Val, without opening her eyes. "Does he
think I want to look like a pony with my mane hogged! Cut it off
close, it must grow long and thick as it used to do. Tell him, Haidee."
Haidee told him as much as it was good for him to know--no
mention of ponies.
"Bon!" said Monsieur le Barbier agreeably, but he looked
doubtful, thinking to himself that hair seldom grew much after the
age of thirty, and the lady looked fully that age. When one side was gone
Val opened her eyes and gave a deep cry. If it could have been
replaced then, she would have abandoned her idea and made the
best of what she had. As it was she closed her eyes again, but
during the rest of the operation great tears rolled down her face
upon her tightly clasped hands. And when all was over the children
were swept from the room and she locked herself in with her heart's
bitterness. Even Bran was not permitted to comfort her.
It is true that nothing makes a greater difference to the
appearance of a woman than to cut off her hair. The tale of every sin
she has committed and every sorrow she has suffered seems to be
written bare and unsheltered upon her face for all the world to read.
What subtle alleviation there is in a frame of hair round the face of a
sinner it is hard to say: but it is a problem whether Mary Magdalene,
with all her shining story of repentance would have appealed to the
love and chivalry of the world in quite the same way if she had been
handed down through the ages without her wondrous hair.
When Valentine Valdana looked in the glass at her pale, oval
face with no darkness above it to soften the fine lines of her
temples, faintly hollowed cheeks, and sombre eyes whose defect
appeared to have become suddenly accentuated, she longed in
shame and dismay for a mask. It seemed to her that she had
indecently exposed her sorrows to the world; that exile, misery, and
all the failures of her life were plainly written for even the most
unintelligent eye to read. A curious sense too of having done
something disloyal to others in revealing her unhappiness crept into
her mind for an instant, but she made haste to dismiss it, and would
not even specify the vague "others" to herself. None knew better
than she the power of a beloved hand to strike deepest, to hollow
out cheeks, sharpen temples, and put shadows into eyes: but she
would never have admitted it. Hers was no accusing heart. She
blamed nobody but herself for her failures--not even the Fate that
had bestowed on her that double nature of artist and lover which
rarely if ever makes for happiness. She only felt the despair of the
convict and almost wished herself one, so that she might hide in a
cell. At length she sought her gay scarf of asphodel-blue and
arranged it over her head like a nun's veil. It was thus that she
presented herself to the children in the kindly dusk. Supper already
stood upon the table. Haidee displayed unusual tact, but Bran was
full of curiosity.
"Are you always going to wear that wale tied on you?" he
inquired.
"Until my hair grows long again," said poor Val, biting her lip
painfully.
"Sleep in it too?" Val nodded, and Haidee made haste to help
Bran to pommes frites which he loved.
Next morning, Bran waking up and throwing out an arm for his
matutinal hug, encountered something strange to his touch:
something round, bumpy, and slightly scrubby, very different to the
soft nest he was used to dabble his hand in as soon as he woke. The
blue scarf had slipped down while Val slept and her shorn head lay
cruelly outlined upon the pillow. Bran knelt up and considered her in
consternation mingled with pity, then finding himself in the attitude
of prayer, mechanically crossed himself and murmured his morning
orison, his eyes still fixed on his mother's head:
"Jesus, Mary, and Joseph, I give you my heart, take it please,
and preserve it from sin."
"Jesus, Mary, and Joseph, I give you my soul and my life.
"Jesus, Mary, and Joseph, help me in my last agony.
"Jesus, Mary, and Joseph, grant that I live and die in thy holy
company. Amen."
Immediately afterwards humour, that Irish vice, overcame all
gentler feelings; like a certain famous Bishop of Down, Bran would
lose a friend for a joke. He woke Val with a cruel jest:
"Bon jour, Monsieur le Curé!"
The curé of Mascaret was a Breton as rugged as his country,
with haggard spiritual eyes and an upper lip you could build a fort
on, as the saying is; he intensified his uncomeliness by wearing his
hair so close-shaved that it was impossible to say where his tonsure
began or ended. To be told by her loving but candid son that she
resembled this good man was a cruel thrust to Val, and the memory
of it darkened life for many days to come. She wrapped herself in
gloom and the blue veil, and nothing more was heard of the fez cap
and cigarettes except that in good time the Stores forwarded them
and the French Customs taxed them. After once trying on the fez
and finding herself the image of a sallow and melancholy Turk, she
had cast it from her. Her one instinct was to hide her ugliness from
every one. Even at the sight of John the Baptist she would fly and
hide, and she never left the house except after dark, when for
exercise she would sometimes race Haidee up and down the digue,
or run along the beach at midnight, her scarf floating behind her in
the wind, and her head bare to give her "roots" a chance.
These proceedings gravely annoyed the Customs officers
distributed in the little straw-littered watch-huts that line the
Normandy coast. Instead of tucking themselves in their blankets for
a peaceful night, they were obliged to keep awake for fear the mad
American woman meant either to commit suicide or meet a boat full
of brandy and cigars from Jersey.
CHAPTER XIV
THE WAYS OF LITERATURE
"The voyage of even the best ship is a zigzag line of a hundred tacks."
From Jersey Val had made a bee-line for Paris which she knew well,
and where she had hopes of renewing her mental energy by the
sights and sounds of a great city and association with other brain
workers. Autumn removals were in full swing and there was no great
difficulty in finding house-room for herself and the children, though
she was unprepared to find how Paris rents had risen since the days
when she and her mother sojourned in the Latin Quarter. It was to
that part of Paris she naturally turned--the only possible part for
artists and writers to live, though the rich and empty-headed are
fond of calling it the "wrong side" of the river. A studio seemed the
most suitable form of residence, for she knew she would not be able
to work in a small room, and she hated the sordid construction of a
cheap flat. She was fortunate in finding a good atelier in a little
secluded rue on the confines of the Quarter--a big, high room, with
kitchen and small bedroom attached, looking out onto a little square
yard with clusters of shrubs, ivied walls, and a few old battered
statues that lent a picturesque air. Here she had settled down and
with resolute energy begun the series of "Wanderfoot" articles for
which Branker Preston had obtained a commission. It was an
arduous task. No matter how much material is stored in the mind it
is not easy to import the air and colour of far-off lands into a Paris
atelier. The art of putting things down had not yet been recaptured
either. Still, the stimulus of even the short journey from Jersey to
Paris had done something for her, and though to her critical eye the
articles she achieved seemed but pale echoes of her former work,
they at least paid the rent and kept things going in rue Campagne
Premiere. The continuation of Haidee's education became a problem
needing instant attention; for Val very soon realised that the Latin
Quarter with its liberal ideas of morality and its fascinating students
was no place for a young impressionable girl. Her own child she
would have allowed to stay, for she knew that anything with her
nature would come to no harm among these careless, attractive
people, to whom she felt herself blood-kin. But Haidee, the child of a
pretty flighty mother, was of different stock. Besides, there was a
responsibility to Westenra in the matter. There were no convents left
in Paris, or indeed, in France. All those lovely homes where girls
learned a sweet sedateness and many beautiful arts had been closed
by a ruthless government. No more in France may the gentle coifed
women impart composure and beauty of mind to English and
American girls and train the aristocratic children of France to a love
of Church and Country. What the loss is to the sum of the world's
harmony can never be computed, but American and English mothers
have a slight realisation of it.
It was in Belgium that Val at last found what was needed for
Haidee--a little community of French nuns who, refusing to unveil,
had been obliged to flee over the border, and there had founded a
convent to which many good Catholics in Paris sent their children. It
was well within Val's means too, for the living is cheap in Belgium,
and the fare in the convent was simple though good. Haidee hated
terribly to go, but Val was firm, though she held out the promise of
early liberation if Haidee would work well at French and try and pass
her brevet simple. This was no difficult task, for the girl had been
well grounded in French during their sojourn in Jersey. Remained the
problem of Bran--and little children are a problem in France to
parents of limited means. No one caters for them as in other
countries. No one even understands the art of teaching and amusing
them at the same time, nor even how to feed them. There are no
kindergartens and no milk puddings! Small wonder that French
babies are small and sallow and sad! Since the nuns were driven out
there are only the public Lycées where strong and weak, rough and
gentle, are jumbled together with results that no thinking woman
would welcome for her child. From their tenderest years French
children are crammed with lessons, pushed ahead to pass exams,
while the business of play so necessary for little children is almost
entirely suppressed.
Val very certainly had no intention of confiding her son to such
institutions. She was therefore obliged to hire a daily governess for
him, for though, at his age, he needed little teaching, he had to be
sent out of doors so that she might have silence and solitude
wherein to work. Even this was a costly business. In England a
nursery governess can be afforded by almost every one, but in
France it costs one hundred francs a month to have your child well
taken care of and taught his alphabet for a few hours a day.
Val did not grudge it, but what worried her was that Bran did
not thrive. Paris was no place for him. The Luxembourg Gardens
make a good play-ground for city-bred children, but Bran was Val's
own child in his need of air and space and horizon. His bloom faded
a little, and he began to look very fair and spiritual. Also his love of
the picture and statue galleries seemed to his mother something too
wistful and wonderful in a small boy, and brought tears to her pillow
in the silence of many a night. Then she took him to Belgium for
awhile and left him with Haidee and the good nuns. He was a shy
creature, though he hated any one to know it, and believed he hid
his secret well behind a set smile and little hardy incomprehensible
sayings. When the nuns clustered round him calling him their "little
Jesus," a favourite name in France for a pretty child, he disdained to
shelter behind Val's skirts, as instinct bade him, but nothing could be
got out of him except an enigmatic saying he always kept for
strangers:
"The cat says bow-wow-wow, and
The dog says meow, meow, meow."
All the while he smiled his little bright smile and his eyes roving
keenly noted every detail of the pale æsthetic faces. Even the tears
in the Reverend Mother's eyes did not escape him. Afterward he said
to Val:
"I like that one with the floating eyes. I think she wishes she
had a nice little boy like me. Her voice was littler than a pin's head
when she called me her petit Jesu. But why do they nearly all have
green teeth?"
When Val kissed him farewell it nearly broke her heart to see
the brave smile he maintained, though Haidee was sniffling and
snuffling at his elbow, partly with momentary grief but mostly with
indignation at being, as she rudely phrased it: "Shut up in a convent
with a lot of old pussycats."
Back in Paris the studio seemed desolate and empty. Bran had
become so much a part of his mother's being and life that without
him she was like a bird from whom a wing had been torn. A month
later Haidee wrote:
"I think Bran is fretting. Whenever I speak to him he puts that
little fixed grin on his mouth, but you should see his eyes."
Within an hour Val was in the Brussels express speeding for that
dear sight. On the journey back to Paris, happy now and healed of
her broken wing, she heard all the history of his lonely nights and
the "purply-red pain" that he got in his stomach when he thought of
her. Cuddled to her side he wept as he had never wept whilst
separated from her, and Val's tears ran down her face too while she
listened, registering a vow that she would never part with him again.
So once more he went out with a governess and came home to
his mother full of original criticisms of Moreau's pictures and the
statues of Rodin, until one morning nearly two years after their
arrival in Paris, and just when Haidee had arrived for the summer
holidays, Val rose up from her bed with the itch for travel in her feet,
and the longing quickly communicated to the children for the sight
of a clear horizon. They tore their possessions from the walls,
stuffed them into trunks, and shook the dust of Paris from their feet.
"Let's go to Italy and live on olives and spaghetti," was Haidee's
suggestion, but Bran knew the news of the world.
"We might get an earthquake!"
The size of the cheque from Branker Preston, however, was
what really decided the affair, limiting them to wandering happily
enough in Brittany. But the water and primitive methods of Breton
cooks made Val think nervously of typhoid, and after a time she
headed for Normandy. Normans are cleaner in their household ways
than Bretons, of whom they slightingly speak as "les porcs Bretons,"
declaring that they eat out of holes in the table and never wash the
holes. Besides, Normandy in winter is milder than Brittany. So,
travelling by highways and byways, they happened at last on
Mascaret.
It was the tag end of September when they arrived. All the
summer visitors were gone and the big silver beach deserted, but
summer itself still lingered. They got an entrancing glimpse of the
gentle green and gold beauty of the place before the chills of
autumn set in. Even then they had been able to bathe and go sailing
in the fishing boat of one of père Duval's sons, who was now in his
turn lighthouse-keeper of Mascaret. For ten sunny October days,
too, they had assisted with all the ardour of novitiates at père
Duval's cider making, becoming acquainted with the secrets of cidre
bouché, and the grades to be found in cidre ordinaire unto the third
and fourth watering. They even sampled the latter as drunk by the
fishermen and called for at the cafés by the name of le boisson avec
le brulot dedans: which signifies cider very liberally diluted with
French cognac. Then the winter closed in on Mascaret with wild
gales and high-flowing tides. On Christmas Eve snow came softly
down, so that the walk to midnight mass had been like acting in that
scene painted by a Dutch painter where the village folk are seen
winding their way through the snow, lanterns and hot-water bottles
in their hands, to the distant church with windows full of red light.
All the winter interests of the simple village had been sampled and
shared by Val and the children, and they had been happier there
than ever in France. The children loved the freedom of the place and
the bonhomie of the French folk so different to English people of
that class. The three went about in their red sweaters and lived a life
of absolute unconvention. It was a good place to write a
masterpiece in--if one were only a master--was Val's ironical
thought, and in spite of her self-directed irony, she did achieve
during the first months there a wonderful little curtain raiser, which
Branker Preston had no difficulty in disposing of to a London
manager. It dealt with Boers and Zulus, and had been well received,
but unfortunately the play it had preceded in the bill was a failure
and the two were withdrawn together before Val could greatly
benefit, but it had brought in five guineas a week for six weeks, and
this success had put her in heart for further work of the kind. She
had sickened of writing "Wanderfoot" articles from a chair. She could
by this time have written some very spirited ones on the subject of
France in general and Normandy in particular, but she had her
reasons for not wishing to attract attention to her whereabouts, as
such articles would surely have done. Preston advised her to write a
novel, but she knew she had neither the patience to spin a long
story through many chapters to its end, nor the gift of character
portrayal. What was hers was a sense for situation, colour, and
atmosphere, and it occurred to her that the best vehicle for a display
of these qualities was the theatre. Her first little venture had
attracted the attention of several managers, and one of them told
Preston that he was ready to consider a three-act play by her. It was
this play she was busy upon now. But it was sometimes hard to
transport the atmosphere of far-away tropical Natal into a little
wooden villa facing the English Channel, with a wild spring gale
tearing at the windows, and the rollers booming like cannon on the
Barleville beach--for the promise of summer had gone as swiftly as it
came, and the spring tides were flooding up the river flinging great
walls of spray over the digue and splashing three feet deep across
the Terrasse, right to the steps of the Hotel de la Mer, so that the
journey to the village had to be made by a path up the cliff.
Val found that the only way to ignore Normandy and the bleak
mists of La Manche was to sit over a chaufferette full of bright red
embers of charcoal, letting the heat steal up her skirts and
enveloping her whole person from the soles of her feet to her scalp
in a lovely glow. Immediately she would begin to write things full of
the tropical languor of Africa. In her brain palms waved, little pot-
bellied Kaffirs rolled in the hot dirt, sunshine blazed over a blue and
green land, the air was filled with the scent of mimosa, and great-
limbed Zulus danced in rhythmic lines with chant and stamp and
swing of assegai before Cetewayo, the great and cruel king.
Unfortunately, a chaufferette is not always an easy thing to
manage. Like everything French it has a temperament, and is liable
to moods when it will burn and moods when it won't. It is a wooden
or tin box, perforated at the top and open at one side to admit an
earthenware bowl full of the charcoal which is called charbon de
bois--actually calcined morsels of green wood. The baker makes this
charbon by sticking green wood branches into his hot oven after he
has finished baking his bread, but each baker makes a limited supply
only, and will not sell it except to people who buy his bread. Every
one uses a chaufferette in Normandy during the winter, and visitors
are given one to put their feet on as soon as they enter a house,
though sometimes when the host is rich enough to keep a perpetual
fire going, a supply of hot bricks is kept in the oven instead.
Val's chaufferette was of most uncertain temper. Hortense
always lit it in the morning, and left it by the writing-table. When Val
came to it all that had to be done was to gently insert an old spoon
under the little ash heap and lift it all round, when a red hot centre
of glowing embers would disclose itself. But sometimes an old nail or
piece of "Carr-diff" found its way by accident into the pot, then the
charbon would immediately sulk itself into oblivion, or sometimes for
no reason at all after being perfectly lighted it would just go out.
Ensued a struggle in which Val and Haidee invariably came off
second-best. They would take the pot out of its box and stand it on
a window-sill with the window drawn low to make a draught; put it
on the front door step and, kneeling down, blow on it until fine ash
sat thick upon their noses and their eyes were full of tears; build
paper bonfires on it; fan it wildly with newspapers. All to no avail!
Usually that was the end of work and inspiration for the day. Val
declared that she could not think with cold feet. But sometimes old
père Duval, compassionate for the mad, would send up his wooden
box, large enough for two men to warm their feet on, with a great
iron saucepan full of glowing charbon inside, and Val would sit
toasting over it and write things of a tropical languor extraordinary.
Haidee had passed her brevet simple, an exam about equal to
the English Oxford Junior and the American 6th standard, and was
now working for the brevet supérieur with a French woman who
had been a governess before she married a retired commercial
traveller and settled in Mascaret. The discovery of this good woman
was a stroke of luck for Val, though certainly Haidee did not consider
it so. However, her lessons only took up four hours a day. For the
rest she and Bran idled joyous and care-free through life, climbing
the cliff, fishing, digging for sand-eels, making long excursions
inland, or meeting the fishing boats in the evening when they came
in with the day's haul, and all the villagers would be at the port to
bargain for fish. Haidee usually haggled for and bought a raie (dog-
fish) for the next day's dinner, and Bran would run a stick through its
ribald-looking mouth, and carry the slithery monstrous thing home,
to be met by scowls from Hortense, who, stolid as she was, hated
the sight of a raie, and could not face the business of washing and
gutting it without cries of douleur and disgust.
"Ah! C'est craintive! C'est affreux!"
But meat was too dear for daily consumption, and raie the only
fish brought in by the boats throughout the winter months, so it had
to be eaten, and some one had to prepare it. And after all, wrestling
with raie was one of the jobs for which Hortense was paid three
francs a week. It was her business to come in the morning at seven
o'clock, make the fires, and deliver "little breakfast" at each bedside;
afterwards she swept and made the beds, then disappeared until
just before lunch, when she came to perform upon the raie and
execute one or two culinary feats that were beyond the scope of Val
or Haidee--such as cutting up onions, which neither of them could
accomplish without weeping aloud, or putting the chipped potatoes
into a pan full of boiling dripping, a business that when conducted
by Val made a rain of grease spots all over the kitchen and scalded
every one in sight. After washing the midday dishes, and chopping
up vegetables for the soup, Hortense would consider her function
over for the day, and leave Val and Haidee to grapple as best they
might with tea, supper, fires, and the chaufferette. The supper was
no very great difficulty, merely a matter of putting the cut vegetables
into a pot with a large lump of specially prepared and seasoned
dripping, and standing said pot on the stove until supper-time, when
its contents would be marvellously transformed into soupe à la
graisse, a savoury and nourishing broth eaten as an evening meal by
every peasant in Normandy. The fires were the greatest nuisance.
The stove in the kitchen either became a red-hot furnace and purred
like a man-eater, or else went out; and the stove with an open grate
in Val's room, which old man Duval had paid a month's rent for and
gone all the way to Cherbourg to fetch, had a way of going out also
before any one even noticed that it was low; then there would be
much scratching with a poker, searching for kindling wood, pouring
out of paraffin, sudden happy blazes that nearly took the roof off,
and black smuts everywhere. When all was over, and a beautiful fire
roaring after the united efforts of the family, Val would find that her
chaufferette had gone out! It was hard to even think masterpieces
among such distractions, to say nothing of writing them. Tea was
easily got. Haidee made the toast on the salad fork, Val buttered it
with dripping, Bran laid the table. Then all three sat with their feet
on the stove, drinking out of the big coffee bowls, eating every scrap
of the delicious smoky toast and licking their fingers afterwards. If
Val had written anything funny or dramatic that day she would
sometimes read it out to them, but for the most part her instinct was
to hide what she wrote. She said she felt as if she had lost
something afterwards, and if any one had been even looking at her
written sheets they never seemed quite the same to her again--
some virtue went out of her work the moment she shared it with any
one.
Usually, after tea she settled down for another struggle with her
ideas, and Bran and Haidee went for a prowl on the digue in the
hope of adventures. Bran, whose mind was as full of fairies as if he
had been born in the wilds of Ireland, was always in hope of
meeting a giant or a dwarf, but he had learned not to mention these
aspirations to Haidee. Anyway, there was always the village gossip to
listen to in the petit port, where the fishing boats anchored and
usually the excitement of watching the Quatre Frères come chup--
chup--chupping up the river to her moorings. She was a natty and
picturesque trawler, with a petrol engine that was the admiration of
the village installed in her bowels. Because of this engine she was
known as the Chalutier à pétrole, but at Villa Duval she was called
by Bran's translation of her name, The Cat's Frères. She never
caught anything but raie, and of this despised species far fewer than
any of the other boats, but she dashed in and out of the harbour
with great slam and needed five men to handle her. There was a
legend that the petrol engine frightened the fish away. It was known
that the four brothers who owned her were anxious to get rid of her.
Every one knew that she cost more than she brought in. But Haidee
and Bran shared a fugitive hope that Val's play would make them all
so rich that they would be able to acquire her as a pleasure boat.
Sometimes strange craft from Granville or a Brittany port would
come in for the night, and there was the St. Joseph, a great fishing
trawler from Lannion, carrying a master and seven hands, that put in
when weather was heavy. Her sails were patched with every colour
of the rainbow, her decks were filthy, and her years sat heavy upon
her--you could hear her creaking and groaning two miles from
shore: but to Haidee and Bran she stood for the true romance! She
always brought in tons of fish, not only the everlasting raie, but
deep-sea fish, and as soon as her arrival was heralded all the village
sabots came clipper-clopping down the terrace, shawls clutched
round bosoms, the wind flicking bright red spots in old cheeks, every
one anxious to pick and choose from the mass of coal-fish, red
gurnet, plaice, congers, and mullets that was hooked out of the hold
and flung quivering ashore. The big weather-beaten fishermen in
their sea-boots bandied jests with the carking old village wives and
the girls showered laughter. In the end, the villagers departed with
full baskets, and the seamen well content adjourned to the petit café
close by for a "cup of coffee with a burn in it" and a good meal.
CHAPTER XV
WAYS SACRED AND SECULAR
"A gentleman makes no noise: a lady is serene."--EMERSON.
In May, the gentle month of May, the weather cleared up again, and
green things commenced to sprout and bloom on the cliff above Villa
Duval. The country-side began to bloom and blossom as the rose.
From the high coast that lies facing the sea, Jersey could be
discerned on clear days etched as if in India ink upon the horizon
thirteen miles away. Clots of sea-samphire burst into flower, cleverly
justifying its name of creste marine by just keeping out of reach of
the high tides. The gorse showed dots of yellow amongst its prickles,
and little brilliant blue squills stuck up their perky faces and gave out
a sweet scent. All along the path to the lighthouse wild thyme came
out in springy masses, and the mad Americans often went up that
way for the special purpose of lying on it as on a soft, pink silk rug.
It seemed to cause them a peculiar kind of joy to put their faces
down in it, crying, "Oh! oh! oh!"
The garbage-hole across the road in front of Villa Duval which
the dustman had been trying for many summers to transform into a
building plot by filling it with empty tins and rubbish from the hotel,
and which had been an eyesore all the winter, now suddenly became
a place of beauty, for a lot of prickly, thistly-looking plants growing
among the jam tins burst into a blaze of red and yellow. It turned
out that they were poppies that had been keeping themselves secret
all through the winter, and the yellow bright gold of "Our Lady's
bedstraw." One day Haidee brought home some long, fragile trails of
cinquefoil, one of the first spring things, and Val, worn and haggard
under her blue veil, pinned it over her heart because she had read in
old Elizabethan days that cinquefoil was supposed to be a cure for
inflammations and fevers. She quoted to Haidee what an old
herbalist had once written of such cures:
"Let no man despise them because they are plain and easy: the
ways of God are all such."
Haidee flushed faintly and retired into awkward silence, shy like
most girls of her age at the mention of God. She was going to make
her communion the next day with the First Communion candidates,
but it was not her first, for that had been made once when she was
ill in New York. She was to be confirmed in June when the
archbishop of a neighbouring parish intended to visit Mascaret and
hold a confirmation service.
It being Saturday afternoon Hortense as well as Haidee was due
at the confessional for the recital of her weekly sins, therefore she
bustled over the washing-up, announcing her intention of making a
bon confession, as though the one she usually made was of an
inferior brand.
"What are you going to tell?" asked Haidee, drying plates. She
knew very well it was forbidden to talk about your confession, but
the subject was a curiously fascinating one. Hortense had a "cupful
of sins" for the curé's ear. She had been reading love stories in the
Petit Journal (a forbidden paper because it is "against the Church"),
telling the cards, and consulting her dream book; also she had
missed Vespers twice and several meetings of the "Children of Mary,"
of which body she was a member. She computed that her pénitence
would be as long as her arm.
"He will scold me well, I know," she said cheerfully, "for he saw
me talking with Léon Bourget yesterday."
"What! that awful fisherman with the hump?"
"Yes; but he is not a bad fellow, mademoiselle, only all the
fishermen here are wicked towards the curé because, as you know,
he would not bury the mother of Jean le Petit, and they had to go
and get the mayor to do it."
"Yes; but you must remember that she lived with old man le
Petit without being married to him, and that is forbidden by the
Church. She would not even repent on her death-bed and receive
the Blessed Sacrament. How could the curé bury her after that?"
Haidee knew all about the little scandal, for the storm it
occasioned had raged all the winter about the curé's head. The same
day he had refused to bury mère le Petit he was obliged to go to
Paris on Church business. On his return in the dusk of a December
evening he was met at the station by all the fishermen in the village
partially disguised in home-made masks, each carrying some
instrument or implement with which to make hideous sounds; pots,
pans, old trays, sheep-bells, and cow-horns had all been pressed
into service, and the din was truly fearsome. The curé preserving his
serenity was conducted to his presbytery by this scratch band, and
on every dark night thereafter it had serenaded him from the
shadows near his house. The blare sometimes continued until the
small hours of the morning, keeping not only the unfortunate curé,
but the whole village awake. The gendarmes from Barleville, the
nearest police-station, had made several midnight raids with the
stated intention of capturing the offenders, but their efforts were
attended by a lack of success so striking as to suggest a certain
amount of sympathy, not to say complicity, on the part of the law. At
any rate, the curé's music, or "Mujik de Churie," as it was popularly
pronounced, went on gaily, and there had been some kind of
unofficial announcement that it would continue until the curé cleared
out. Old père Duval opined, however, that the entertainment was
likely to cease with the arrival of the first summer visitors, for
however vindictive the fishermen were they knew which side their
bread was buttered on, and were politic enough not to want to drive
away trade by their thrilling "mujik."
Having finished drying plates Haidee retired up-stairs to prepare
her confession, telling Hortense to be sure and wait for her. She
proceeded to write her sins down on a piece of paper. In spite of her
good French she stammered so much from nervousness when
confessing that the curé had arranged this method with her. She
always gave him the piece of paper, which he took away to the
sacristy while she waited in the confessional. When he had read her
paper he came back, conferred penance and a little scolding, then
gave her absolution.
With the aid of a French Catechism, which had a formula for
confession in it, she proceeded to write out her sins, her method
being to dive into the book first for a question and then into her soul
for a sin that corresponded. Eventually the piece of paper contained
the following statement:
"Je ne me suis pas confessé depuis trois semaines; j'ai recu
l'absolution. Je m'accuse:
"De n'avoir pas fait ma priere du matin beaucoup de fois.
"De n'avoir pas fait ma prière du soir plusieurs fois.
"D'avoir manqué aux Vêpres 4 fois.
"D'avoir été distraite dans l'Église 2 fois.
"D'avoir été dissipée dans l'Église 2 fois.
"D'avoir désobéi à ma mère 2 fois.
"D'avoir manqué de respect envers elle 1 fois.
"De m'été disputée avec mon frère 2 fois.
"D'avoir fait des petits mensonges 4 fois.
"Je m'accuse de tous ces pèches et de ceux dont je ne me
souviens pas.
"Je demande pardon de Dieu et à vous, mon père, la pénitence
et l'absolution selon que vous m'en jugerez digne."
Whether this list of offences truly represented the burden of her
transgressions for the past three weeks it would be hard to say. It is
possible that Val could have made out a longer and more
comprehensive one for her, as she often threatened to do when
Haidee vexed her. Anyway, the latter folded up her piece of paper
with a complacency that either betokened a clear conscience or a
heart hardened in crime. She computed that her penance would be
to recite a decade of the rosary, and she knew that the curé would
then speak of the next Church feast, and of the wishes preferred by
the Sacred Heart and the Blessed Virgin, tell her to invoke the aid of
the Saints when she felt herself tempted to sin, to try always to give
a good example to her little brother, and to be very pious so that her
mother would be converted and become a Catholic. Both Val and
Haidee had long since given up explaining that they were not
mother and daughter. They found that it saved time and a lot of
questions just to let people think what they liked.
Putting on her hat Haidee now popped her head out of the
window and gave a hoot to Hortense, who was below in the yard
cleaning her boots on the garden seat. Just as they were about to
start Val came down-stairs and begged Haidee to go to the butcher's
shop on her way back, and bring home something for Sunday's
dinner.
"What kind of something?" asked Haidee belligerently, for the
butcher's shop had no allure for her. There ensued a discussion as to
which was the most economical meat to get. Hortense, waiting at
the bottom of the steps, piped in with the announcement that every
one ought to eat lamb on First Communion Sunday. Val and Haidee
looked at each other. Vaguely they knew that the price of lamb was
high. But suddenly it came into Val's mind how sick the children
must be of raie, and stewed veal, and that though funds were low
the play was nearly finished. They would have a nice English dinner
for once. Roast lamb and mint sauce! She gave Haidee her last louis
to change.
"Pick some mint from the cliff-side as you come back," she
enjoined. French peasants have no use for mint in their cooking.
Some English visitors had once planted a root of it in père Duval's
garden, but after they were gone he flung it out again on to the cliff-
side, where it had increased and multiplied until it was now a large
bed.
In the butcher's shop Haidee found a number of villagers
squabbling over beef-bones, and not a sign of lamb anywhere. The
truth was that every portion of the one lamb killed early in the week
had been sold, and though there were still one or two customers in
need of First Communion lamb, Mother Durand knew better than to
offer any of the freshly-killed beast that hung in the back shed.
Peasants are well aware that freshly-killed meat should not be cut
too early or it will be full of air, soft, flabby, and never tender. Mother
Durand, under her calm exterior, was furiously angry with her man
for having delayed the killing until now--after to-day there would be
no demand for anything but beef-bones and veal until the summer
visitors began to arrive. The young American mademoiselle asking
guilelessly for lamb was a godsend. Waiting until the last villager had
gone from the shop so that there would be no adverse comment on
what she meant to do, she turned ingratiatingly to Haidee.
"But certainly, mademoiselle ... there is none in the shop ... but
outside I have a lamb that is superbe ... just the thing for a première
communion ... it is not for every one I would cut that lamb, but for
such customers as you and your belle maman there is nothing I
would not do." She returned presently from the back shed. "There,
mademoiselle--a beautiful shoulder. Six francs."
Haidee was horrified at the price. Their dinner meat usually cost
about one franc twenty, and she knew that there was much to be
accomplished with Val's last twenty-franc piece.
"Could n't you give me a smaller one, Madame Durand? ... and
not so dear?"
"Ah, mademoiselle, you should have said to me before that you
wanted it small. It is cut now ... and what would I do with the pieces
from it? Do you think I could sell them? But no."
So Haidee took the shoulder, and returned home with it tucked
under her arm. On arrival as it happened old veuve Michel was in
the kitchen with Val, having just brought home some odds and ends
of family washing.
"What!" she cried, on seeing the lamb. "A shoulder of freshly-
killed lamb, full of air and bubbles ... cut off the poor nice lamb while
it had yet the hot life in it! Shame on the wretched woman Durand
... to take advantage thus of poor innocent Americans! ... Shame!
But then every one knows how she treated her poor daughter who
wanted to be a nun. Madame, the stones in the street are not more
wicked than that woman Amélie Durand!"
Val, much disturbed by these sayings, examined the shoulder of
mutton. Certainly it was very bubbly looking: warm too. She
remembered now hearing the cook in New York storm over a piece
of freshly-killed meat, declaring that it had been cut too soon and
was not fit to eat.
"How ought I to cook it to make the best of it?" she inquired in
dismay.
"Cook it!" cried Widow Michel, scarlet in the face from
indignation combined with the effects of her afternoon bottle of
cognac. "No good to cook it. Better to pluck a rock from the cliff-side
and cook it."
"How much was it, Haidee?"
"Six francs."
"Mon Dieu! What imposition! Take it back, Haidee dear, and tell
her that it is too dear and too fresh ... she must give us a pound of
steak instead. We are too poor to buy meat we can't eat, you know,
darling. Six francs! Did you pay for it?"
"Why, yes, of course I paid for it. You know I had the louis. Oh!
blow Val, I don't care much about taking it back."
"But, Haidee, what's the use of talking like that ... we can't eat
that bubbly lamb ... think of poor Brannie without dinner! I 'd go
myself if I had any hair.... Tell her it 's ridiculous to have given you
such meat. I remember now Hortense said that leg we had at
Christmas and could n't eat was too freshly-killed--it was soft and
tough at the same time, and all slithery when you tried to cut it.
Don't you remember--it made you sick to look at it?"
Yes, Haidee remembered well enough, but she did n't like taking
the shoulder back just the same. However, veuve Michel offered the
moral support of her company, and she returned to Mother Durand.
Half-an-hour later she was back at the Villa, the wretched shoulder
of lamb still in her hands.
"She won't take it back. She says it 's a rule of the shop never
to take back meat that has once gone out of it."

Frontiers In Data Science 1st Edition Matthias Dehmer Frank Emmertstreib

Chapman & Hall/CRC Big Data Series
SERIES EDITOR
Sanjay Ranka

AIMS AND SCOPE
This series aims to present new research and applications in Big Data, along with the computational tools and techniques currently in development. The inclusion of concrete examples and applications is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of social networks, sensor networks, data-centric computing, astronomy, genomics, medical data analytics, large-scale e-commerce, and other relevant topics that may be proposed by potential contributors.

PUBLISHED TITLES
FRONTIERS IN DATA SCIENCE
Matthias Dehmer and Frank Emmert-Streib
BIG DATA OF COMPLEX NETWORKS
Matthias Dehmer, Frank Emmert-Streib, Stefan Pickl, and Andreas Holzinger
BIG DATA COMPUTING: A GUIDE FOR BUSINESS AND TECHNOLOGY MANAGERS
Vivek Kale
BIG DATA: ALGORITHMS, ANALYTICS, AND APPLICATIONS
Kuan-Ching Li, Hai Jiang, Laurence T. Yang, and Alfredo Cuzzocrea
BIG DATA MANAGEMENT AND PROCESSING
Kuan-Ching Li, Hai Jiang, and Albert Y. Zomaya
BIG DATA ANALYTICS: TOOLS AND TECHNOLOGY FOR EFFECTIVE PLANNING
Arun K. Somani and Ganesh Chandra Deka
BIG DATA IN COMPLEX AND SOCIAL NETWORKS
My T. Thai, Weili Wu, and Hui Xiong
HIGH PERFORMANCE COMPUTING FOR BIG DATA
Chao Wang
NETWORKING FOR BIG DATA
Shui Yu, Xiaodong Lin, Jelena Mišić, and Xuemin (Sherman) Shen
Frontiers in Data Science
Edited by
Matthias Dehmer
Frank Emmert-Streib
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2018 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed on acid-free paper
International Standard Book Number-13: 978-1-4987-9932-4 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents

About the Editors vii
Contributors ix
1 Legal aspects of information science, data science, and Big Data 1
  Alessandro Mantelero and Giuseppe Vaciago
2 Legal and policy aspects of information science in emerging automated environments 47
  Stefan A. Kaiser
3 Privacy as secondary rule, or the intrinsic limits of legal orders in the age of Big Data 69
  Bart van der Sloot
4 Data ownership: Taking stock and mapping the issues 111
  Florent Thouvenin, Rolf H. Weber, and Alfred Früh
5 Philosophical and methodological foundations of text data analytics 147
  Beth-Anne Schuelke-Leech and Betsy Barry
6 Mobile commerce and the consumer information paradox: A review of practice, theory, and a research agenda 171
  Matthew S. Eastin and Nancy H. Brinson
7 The impact of Big Data on making evidence-based decisions 191
  Rodica Neamtu, Caitlin Kuhlman, Ramoza Ahsan, and Elke Rundensteiner
8 Automated business analytics for artificial intelligence in Big Data@X 4.0 era 223
  Yi-Ting Chen and Edward W. Sun
9 The evolution of recommender systems: From the beginning to the Big Data era 253
  Beatrice Paoli, Monika Laner, Beat Tödtli, and Jouri Semenov
10 Preprocessing in Big Data: New challenges for discretization and feature selection 285
  Verónica Bolón-Canedo, Noelia Sánchez-Maroño, and Amparo Alonso-Betanzos
11 Causation, probability, and all that: Data science as a novel inductive paradigm 329
  Wolfgang Pietsch
12 Big Data in healthcare in China: Applications, obstacles, and suggestions 355
  Zhong Wang and Xiaohua Wang
Index 371
About the Editors

Matthias Dehmer studied mathematics at the University of Siegen, Siegen, Germany and earned his PhD in computer science from the Technical University of Darmstadt, Darmstadt, Germany. Afterward, he was a research fellow at Vienna Bio Center, Austria, Vienna University of Technology, and University of Coimbra, Portugal. He obtained his habilitation in applied discrete mathematics from the Vienna University of Technology. Currently, he is a professor at UMIT—The Health and Life Sciences University, Austria. His research interests are in data science, Big Data, complex networks, machine learning, and information theory. He has published more than 220 publications in applied mathematics, computer science, data science, and related disciplines.

Frank Emmert-Streib studied physics at the University of Siegen, Germany, and earned his PhD in theoretical physics from the University of Bremen, Bremen, Germany. He was a postdoctoral fellow in the United States before becoming a faculty member at the Center for Cancer Research at the Queen's University Belfast, UK. Currently, he is a professor in the Department of Signal Processing at Tampere University of Technology, Finland. His research interests are in the field of computational biology, data science and analytics in the development and application of methods from statistics, and machine learning for the analysis of Big Data from genomics, finance, and business.
Contributors

Ramoza Ahsan, Worcester Polytechnic University, Worcester, Massachusetts
Betsy Barry, Emory University, Atlanta, Georgia
Amparo Alonso-Betanzos, Universidade da Coruña, A Coruña, Spain
Nancy H. Brinson, University of Texas at Austin, Austin, Texas
Verónica Bolón-Canedo, Universidade da Coruña, A Coruña, Spain
Yi-Ting Chen, National Chiao Tung University, Hsinchu, Taiwan
Matthew S. Eastin, University of Texas at Austin, Austin, Texas
Alfred Früh, Universität Zürich, Zürich, Switzerland
Stefan A. Kaiser, Independent Researcher, Wassenberg, Germany
Caitlin Kuhlman, Worcester Polytechnic University, Worcester, Massachusetts
Monika Laner, Fernfachhochschule Schweiz, Brig, Switzerland
Alessandro Mantelero, Polytechnic University of Turin, Turin, Italy
Noelia Sánchez-Maroño, Universidade da Coruña, A Coruña, Spain
Rodica Neamtu, Worcester Polytechnic University, Worcester, Massachusetts
Beatrice Paoli, Fernfachhochschule Schweiz, Brig, Switzerland
Wolfgang Pietsch, Technische Universität München, Munich, Germany
Elke Rundensteiner, Worcester Polytechnic University, Worcester, Massachusetts
Beth-Anne Schuelke-Leech, University of Windsor, Windsor, Ontario, Canada
Jouri Semenov, Fernfachhochschule Schweiz, Brig, Switzerland
Edward W. Sun, KEDGE Business School, Talence, France
Florent Thouvenin, Universität Zürich, Zürich, Switzerland
Beat Tödtli, Fernfachhochschule Schweiz, Brig, Switzerland
Giuseppe Vaciago, University of Insubria, Varese, Italy
Bart van der Sloot, Tilburg University, Tilburg, the Netherlands
Zhong Wang, Beijing Academy of Social Sciences, Beijing, China
Xiaohua Wang, Chinese Academy of Sciences, Beijing, China
Rolf H. Weber, Universität Zürich, Zürich, Switzerland
Chapter 1

Legal aspects of information science, data science, and Big Data∗

Alessandro Mantelero
Giuseppe Vaciago

Introduction: The legal challenges of the use of data
Data collection and data processing: The fundamentals of data protection regulations
The European Union model: From the Data Protection Directive to the General Data Protection Regulation
Use of data and risk-analysis
Use of data for decision-making purposes: From individual to collective dimension of data processing
Data-centered approach and socio-ethical impacts
Multiple-risk assessment and collective interests
The guidelines adopted by the Council of Europe on the protection of individuals with regard to the processing of personal data in a world of Big Data
Data prediction: Social control and social surveillance
Use of data during the investigation: Reasonable doubt versus reasonable suspicion
Big Data and social surveillance: Public and private interplay in social control
The EU reform on data protection
References

∗Alessandro Mantelero, Polytechnic University of Turin, is the author of sections "Introduction: The legal challenges of the use of data" and "Use of data for decision-making purposes: From individual to collective dimension of data processing." Giuseppe Vaciago, University of Insubria, is the author of section "Data prediction: Social control and social surveillance."
Frontiers in Data Science

Introduction: The legal challenges of the use of data

There are many definitions of Big Data, which differ depending on the specific discipline. Most of the definitions focus on the growing technological ability to collect, process, and extract new and predictive knowledge from a bulk of data characterized by a great volume, velocity, and variety.∗ However, in terms of protection of individual rights, the main issues do not only concern the volume, velocity, and variety of processed data, but also the analysis of data, using software to extract new and predictive knowledge for decision-making purposes. Therefore, in this contribution, the definition of Big Data encompasses both Big Data and Big Data analytics.†

The advent of Big Data has suggested a new paradigm in social empirical studies, in which the traditional approach adopted in statistical studies is complemented or replaced by Big Data analysis. This new paradigm is characterized by the relevant role played by data visualization, which makes it possible to analyze real-time data streams, follow their trajectory, and predict future trends [3]. Moreover, large amounts of data make it possible to use unsupervised machine-learning algorithms to discover hidden correlations between the variables that characterize large datasets.

This kind of approach, which is based on the emerging correlations among data, leads social investigation to adopt a new strategy, in which there are no preexisting research hypotheses to be verified through empirical statistical studies. Big Data analytics suggest possible correlations, which constitute per se the research hypothesis: data show the potential relations between facts or behavior. Nevertheless, these relations are not grounded on causation and, for this reason, should be further investigated using the traditional statistical method.
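The hypothesis-free, correlation-driven strategy just described can be illustrated with a minimal sketch. The data and variable names below are entirely synthetic and hypothetical: every pair of variables is scanned, and strong correlations are reported as candidate hypotheses to be verified later with traditional statistical methods, not as causal findings.

```python
# Sketch: surface candidate correlations without a prior research hypothesis.
# Pairs of variables with a high absolute Pearson correlation are flagged as
# hypotheses; the flag says nothing about causation.
import itertools
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def candidate_hypotheses(dataset, threshold=0.8):
    """Return variable pairs whose |correlation| exceeds the threshold."""
    pairs = []
    for a, b in itertools.combinations(sorted(dataset), 2):
        r = pearson(dataset[a], dataset[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 2)))
    return pairs

# Synthetic example: "store_visits" tracks "ad_clicks"; "rainfall" does not.
data = {
    "ad_clicks":    [10, 20, 30, 40, 50],
    "store_visits": [12, 19, 33, 41, 48],
    "rainfall":     [5, 3, 9, 1, 7],
}
print(candidate_hypotheses(data))  # only (ad_clicks, store_visits) is flagged
```

Note that the flagged pair is exactly what the text calls a research hypothesis suggested by the data: a relation to be investigated further, not an established fact.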
Assuming that data trends suggest correlations and consequent research hypotheses, at the moment of data collection only very general research hypotheses are possible, as the potential data patterns are still unknown. Therefore, the specific purpose of data processing can be identified only at a later time, when correlations reveal the usefulness of some information to detect specific aspects. Only at that time does the given purpose of the use of information become evident, also with regard to further analyses conducted with traditional statistical methods [4].

∗The term "Big Data" usually identifies extremely large datasets that may be analyzed computationally to extract inferences about data patterns, trends, and correlations. According to the International Telecommunication Union, Big Data are "a paradigm for enabling the collection, storage, management, analysis, and visualization, potentially under real-time constraints, of extensive datasets with heterogeneous characteristics" [1].
†This term is used to identify computational technologies that analyze large amounts of data to uncover hidden patterns, trends, and correlations. According to the European Union Agency for Network and Information Security, the term Big Data analytics "refers to the whole data management lifecycle of collecting, organizing, and analysing data to discover patterns, to infer situations or states, to predict and to understand behaviors" [2].
On the other hand, there are algorithms, such as supervised machine-learning algorithms, that need a preliminary training phase. In this stage, a supervisor uses data training sets to correct the errors of the machine, orienting the algorithm toward correct associations. In this sense, supervised machine-learning algorithms require a prior definition of the purpose of the use of data, identifying the goal that the machine should reach through autonomous processing of all available data. In this case, although the purpose of data use is defined in the training phase, the manner in which data are processed and the final outcome of data mining remain largely unknown. In fact, these algorithms are black boxes and their internal dynamics are partially unpredictable.∗

Both data visualization and machine-learning applications pose relevant questions in terms of Big Data processing, which will be addressed in the following sections. How is it possible to define the specific purpose of data processing at the moment of data collection, when the correlations suggested by analytics are unknown at that time? If different sources of data are used in machine training and running learning algorithms, how can data subjects know the specific purpose of the use of their information in given machine-learning applications?

These questions clearly show the tension that characterizes the application of the traditional data protection principles in the Big Data context. But this is not the only crucial aspect: the very notion of personal data is becoming more undefined. Running Big Data analytics over large datasets could make it difficult to distinguish between personal data and anonymous data, as well as between sensitive data and nonsensitive data.
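The supervised training phase described earlier in this section can be sketched with a deliberately simple example. The code below (plain Python on a synthetic training set; all names are illustrative) implements the classic perceptron rule: the purpose is fixed in advance by the supervisor's labeled examples, each wrong prediction triggers a correction toward the supervisor's label, and the meaning of the resulting weights is never specified explicitly.

```python
# Sketch of supervised training: labeled examples define the goal, and errors
# on the training set drive corrections (perceptron learning rule).

def predict(weights, bias, features):
    """Linear threshold unit: output 1 if the weighted sum is positive."""
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0

def train(examples, epochs=20, lr=0.1):
    """Repeatedly nudge weights toward the supervisor's labels."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error:  # supervisor's label disagrees: correct the model
                weights = [w + lr * error * x for w, x in zip(weights, features)]
                bias += lr * error
    return weights, bias

# The training set fixes the purpose (here: learn logical AND); how the
# learned weights end up encoding it is not specified in advance.
and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(and_examples)
print([predict(w, b, f) for f, _ in and_examples])
```

Even in this toy case the point made above holds: the goal is defined by the training data, but the internal state the algorithm converges to is opaque to the data subject, and with larger models this opacity grows into the "black box" problem the text describes.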
Various studies have demonstrated how information stored in anonymized datasets can be partially reidentified, in some cases without expensive technical solutions [5–12]. This suggests going beyond the traditional dichotomy between personal and anonymous data and representing this distinction as a scale that moves from personal identified information to aggregated data. Between these extremes, the level of anonymization is proportional to the effort, in terms of time, resources, and costs, which is required to reidentify information.

Finally, with regard to sensitive data, Big Data analytics make it possible to use nonsensitive data to infer sensitive information, such as information concerning religious practices extracted from location data and mobility patterns [13].

∗See, e.g., Zhang M., "Google Photos Tags Two African-Americans As Gorillas Through Facial Recognition Software," Forbes, July 1, 2015. http://www.forbes.com/sites/mzhang/2015/07/01/google-photos-tags-two-african-americans-as-gorillas-through-facial-recognition-software/#36b529227b63 (accessed March 23, 2016).

Against this background, the existing data protection regulations and the ongoing proposals [14,15] remain largely focused on the traditional main pillars of the so-called fourth generation of data protection laws [16]: the notice and consent model (i.e., an informed, freely given, and specific consent) [17–21],∗ the purpose limitation principle [24,25], and the minimization principle.

For this reason, the following sections investigate the limits and criticisms of the existing legal framework and the possible options to provide adequate answers to the new challenges of Big Data processing. In this light, this chapter is divided into three main sections. The first section focuses on the traditional paradigm of data protection and on the provisions, primarily in the new EU General Data Protection Regulation (Regulation (EU) 2016/679, hereafter GDPR), that can be used to safeguard individual rights in Big Data processing.

The second section goes beyond the existing legal framework and, in the light of the path opened by the guidelines on Big Data adopted by the Council of Europe, suggests a broader approach that encompasses the collective dimension of data protection. This dimension often characterizes Big Data applications and leads to assessing the ethical and social impacts of data uses, which assume an important role in many Big Data contexts.

The last section deals with the use of Big Data to anticipate fraud detection and to prevent crime. In this light, the new Directive (EU) 2016/680† is briefly analyzed.

Data collection and data processing: The fundamentals of data protection regulations

Before considering the different reasons that induce the law to protect personal information, it should be noted that European legal systems do not recognize the same broad notion of the right to privacy that exists in U.S. jurisprudence.‡ At the same time, in the European countries, data protection laws do not draw their origins from the European idea of privacy and its related case law.
∗See Articles 6 and 7, Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). Differently, in the United States, the traditional approach based on various sectorial regulations has underestimated the role played by user's choice, adopting a market-oriented strategy. Nevertheless, the guidelines adopted by the U.S. administrations in 2012 [14] seem to suggest a different approach, reinforcing self-determination [8,22,23].
†Directive (EU) 2016/680 on the protection of natural persons with regard to the processing of personal data by competent authorities for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, and on the free movement of such data, and repealing Council Framework Decision 2008/977/JHA.
‡With regard to the notion of right to privacy (and in brief), in the United States the right to privacy covers a broad area that goes from informational privacy to the right of self-determination in private life decisions. On the other hand, in European countries, this right mainly focuses on the first aspect and is related to media activities [26–31].
European data protection regulations, since their origins in the second half of the last century, focused on information regarding individuals, without distinguishing between public or private information [32]. Compared with the right to privacy, the issues regarding the protection of personal data have been more recently recognized by law, both in the United States and Europe [33]. This dates from the 1960s, whereas the primitive era of the right to privacy was at the end of the nineteenth century, when the penny press assumed a significant role in limiting the privacy of the people belonging to upper classes [34].

In the light of the above, the analysis of the fundamentals of data processing should start from the effects of the computer revolution that happened in the late 1950s. The advent of computers and its social impact led to the first regulations on data protection and posed the first pillars of the architecture of the present legal framework.

The first generations of data protection regulations were characterized by a national approach. They were adopted at different times by national legislators and differed with regard to the extension of the safeguards provided and the remedies offered. The notion of data protection was originally based on the idea of control over information, as confirmed by the literature of that period [35–37]. The migration from dusty paper archives to computer memories was a Copernican revolution which, for the first time in history, permitted the aggregation of information about every citizen that was previously spread over different archives [38].

The first data protection regulations were the answer to the rising concern of citizens about social control, as the new big mainframe computers gave governments [16,38–41] and large corporations the opportunity to collect and manage large amounts of personal information [16,42].
In this sense, the legal systems gave individuals the opportunity to have a sort of countercontrol over the collected data [16,38,43]. The purpose of the regulations was not to spread and democratize power over information but to increase the level of transparency about data processing and safeguard the right of access to information. Citizens felt they were monitored, and the law gave them the opportunity to know who controlled their data, which kind of information was collected, and for which purposes.

The mandatory notifications of new databases, registration, licensing procedures, and independent authorities [16,44] were the fundamental elements of these new regulations. They were necessary to know who had control over information and to monitor data processing. Another key component was the right of access, which allows citizens to ask data owners about the way in which information is used and, consequently, about the exercise of their power over information. Finally, the entire picture was completed by the creation of ad hoc public authorities to safeguard and enforce citizens' rights, exercise control over data owners, and react against abuses.
In this model, there was no space for individual consent, due to the economic context of that period. The collection of information was mainly made by public entities for purposes related to public interests, was mandatory, and left no room for autonomy in terms of negotiating personal information. At the same time, personal information did not have an economic value for private companies: data about clients and suppliers were mainly used for operational functions regarding the execution of company activities.

Another element that contributed to excluding the role of self-determination was the lack of knowledge: the extreme difficulty for ordinary people to understand the use and the mode of operation of mainframes. The computer mainframes were a sort of modern God, with sacral attendants, a selected number of technicians who were able to use this new equipment. In this scenario, it did not make sense to give citizens the chance to choose, as they were unable to understand the way in which their data were processed.

In conclusion, during the 1970s and the first part of the 1980s, legislators laid the foundations for data protection regulations in many European countries and outside Europe, as a result of the technological and social changes of that period. These first regulations defined the initial core of data protection (i.e., transparency, right of access, and data protection authorities), which is still present in the existing legal framework.

The European Union model: From the Data Protection Directive to the General Data Protection Regulation

The period from the mid-1980s to the 1990s was characterized not only by the rise of a uniform approach to data protection regulation among the members of the European Union, but also by a change in the regulatory paradigm, due to the new technological, social, and economic scenarios.
Home computers entered the market in the late 1970s and became common during the 1980s. This was the new era of distributed computing, in which many people bought a personal computer to collect and process information. The big mainframe computers became small desktop personal computers, with a relatively low cost. Consequently, computational capacity was no longer an exclusive privilege of governments and big companies but became accessible to many entities and consumers.

This period witnessed another transformation involving direct marketing, which was no longer based on the concept of mail order and moved toward computerized direct marketing solutions.∗ The new forms of marketing were based on customer profiling and required extensive data collection to apply data mining software. The main purpose of profiling was to suggest a suitable commercial proposal to any consumer.

∗Although direct marketing has its roots in mail order services, which were based on personalized letters (e.g., using the name and surname of addressees) and general group profiling (e.g., using census information to group addressees in social and economic classes), the use of computer equipment increased the level of manipulation of consumer information and generated detailed consumer profiles [45,46].

This was an innovative application of data processing driven by new purposes. Information was no longer collected to support supply chains, logistics, and orders, but to sell the best product to each user. As a result, the data subject became the focus of the process, and personal information acquired an economic and business value, given its role in sales.

These changes in the technological and business frameworks created new requests from society to legislators, as citizens wanted to have the chance to negotiate their personal data and gain something in return. Although the new generations of European data protection laws placed personal information within the context of fundamental rights,∗ the main goal of these regulations was to pursue economic interests related to the free flow of personal data. This is also affirmed by the Directive 95/46/EC,† which represents both the general framework and the synthesis of this second wave of data protection laws.‡

However, the roots of data protection remained in the context of personality rights. Therefore, the European approach is less market-oriented than in other legal systems. The directive also recognizes the fundamental role of public authorities in protecting data subjects against unwilled or unfair exploitation of their personal information for marketing purposes.

Both the theoretical model of fundamental rights, based on self-determination, and the rising data-driven economy highlighted the importance of user consent in consumer data processing. Consent does not only represent an expression of choice with regard to the use of personality rights by third parties but is also an instrument to negotiate the economic value of personal information.
In this new data-driven economy, personal data cannot be exploited for business purposes without any involvement of data subjects. It is necessary that individuals become part of the negotiation, as data are no longer used mainly by government agencies for public purposes but also by private companies with monetary revenues [49,50].

∗See Council of Europe, Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, opened for signature on January 28, 1981 and entered into force on October 1, 1985. http://conventions.coe.int/Treaty/Commun/QueVoulezVous.asp?NT=108&CL=ENG (accessed February 27, 2014); OECD, Annex to the Recommendation of the Council of 23rd September 1980: Guidelines on the Protection of Privacy and Transborder Flows of Personal Data. http://www.oecd.org/internet/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm#preface (accessed February 27, 2014).
†Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data [1995] OJ L281/31.
‡The EU Directive 95/46/EC has a dual nature, as it was written on the basis of the existing national data protection laws, in order to harmonize them, but at the same time it also provided a new set of rules. See the recitals in the preamble to the Directive 95/46/EC [47,48].
Effective self-determination in data processing, both in terms of protection and economic exploitation of personality rights, cannot be obtained without adequate and prior notice.∗ For this reason, the notice and consent model† added a new layer to the existing paradigm based on transparency and access [17].

Finally, it is important to highlight that, during the 1980s and 1990s, data analysis increased in quality, but its level of complexity was still limited. Consequently, consumers were able to understand the general correlation between data collection and the related purposes of data processing (e.g., profiling users, offering customized services or goods). At that time, informed consent and self-determination were largely considered synonyms, but this has changed now, in the Big Data era.

The advent of Big Data analytics has created a different economic and technological scenario, with direct consequences on the adequacy of the legal framework adopted to safeguard personal information. The new environment is mainly digital and characterized by an increasing concentration of information in the hands of a few entities, both public and private.

The role played by specific subjects in the generation of data flows is the main reason for this concentration. Governments and big private companies (e.g., large retailers, telecommunication companies) collect huge amounts of data while performing their daily activities. This bulk of information represents a strategic and economically relevant asset, as the management of large databases enables these entities to assume the role of gatekeepers with regard to the information that can be extracted from the datasets. They are able to keep information completely closed or to limit access to the data, perhaps to specific subjects only or with regard to circumscribed parts of the entire collection.

∗The notice describes how the data are processed and the detailed purposes of data processing.
†See Articles 2(h), 7(a) and 10, Directive 95/46/EC.
Not only governments and big private companies acquire this power, but also the intermediaries in information flows (e.g., search engines, Internet providers, data brokers, and marketing companies), which do not generate information but play a key role in circulating it.

There are also different cases in which information is accessible to the public, both in raw and processed form (e.g., open datasets, online user-generated contents). This only apparently diminishes the concentration of power over information, as access to information is not equivalent to knowledge [51]. A large amount of data creates knowledge if the data holders have adequate interpretation tools to select relevant information, to reorganize it, and to place the data in a systematic context, and if there are people with the required skills to define the design of the research and give an interpretation to the results generated by Big Data analytics [3,15,52,53]. Without these skills, data only produce confusion and less knowledge in the end, with information interpreted in an incomplete or biased way. For these reasons, the availability of data is not sufficient in the Big Data context [54,55]. It is also necessary to have adequate human and computing resources to manage it.

In this scenario, control over information does not only regard limited-access data, but can also concern open data [56,57], over which the information intermediaries create an added value by means of their instruments of analysis. Given that only a few entities are able to invest heavily in equipment and research, the dynamics described earlier enhance the concentration of power over information, which increases due to the new expansion of Big Data.

Under many aspects, this new environment resembles the origins of data processing, when, in the mainframe era, technologies were held by a few entities and data processing was too complex to be understood by data subjects. Nevertheless, there are important differences that may affect the possible evolution of this situation, in terms of a diffused and democratic access to information.

The new data gatherers do not base their position only on expensive hardware and software, which may become cheaper in the future, or on the growing number of experts able to give an interpretation to the results of data analytics. The fundamental element of this power is represented by the large databases they have. These data silos, which are considered the goldmine of the twenty-first century, are not freely accessible, as they represent the main outcome or a side effect of the activities conducted by their owners, due to the role that they play in creating, collecting, or managing information. For this reason, in the Big Data context, it seems quite difficult to imagine the same process of democratization that happened with regard to computer equipment during the 1980s [58].
The access to large databases is not only protected by legal rights, but is also strictly related to the peculiar positions held by data holders in their markets and to the presence of entry barriers.

Another aspect that characterizes this new form of concentration of control over information is the nature of the purposes of data collection: data processing is no longer focused on single users (profiling), but has increased in scale and tries to investigate attitudes and behaviors of large groups and communities, up to entire countries. The consequence of this large-scale approach is the return of the fears about social surveillance that characterized the mainframe era.

Against this background, the GDPR does not change the main pillars of the previous regulatory model. Therefore, personal data are still primarily protected by individual rights; the notice and consent model remains an important legal ground for data processing, and the principles of purpose limitation and data minimization are reaffirmed.

Despite this traditional approach, which seems to be partially inadequate in the Big Data context, the GDPR shows a partial shift of the regulatory focus from data subjects' self-determination to the accountability of the controller and of the persons involved in data processing. In this sense, accountability represents the core of the new EU data protection framework and an important element to tackle the potential negative impacts of the use of data analytics [59]. More specifically, accountability is based on the data protection impact assessment, the role played by data protection officers and, when required by law, the prior assessment process conducted by data protection authorities.

In this sense, compared with the previous Data Protection Directive, the GDPR undoubtedly moves toward a risk-based approach. Nevertheless, this transition is still incomplete. Elements of the previous model, focused on data subjects, coexist with the new approach, but without a complete redraft of the architecture defined in the 1990s, it seems difficult to address the social and technological challenges of Big Data.

Use of data and risk-analysis

Regarding risk management in data processing, it is worth pointing out that risk can be considered, in a broad sense, as any negative consequence that can occur when personal data are processed, regardless of whether these consequences produce damage or prejudice to individual rights and freedoms. In this sense, data subjects who use social networks expose themselves to the risk of being profiled [60], of having their information shared with third parties, of being tracked for commercial purposes, and so on. None of these consequences are against the law, as they are detailed in the terms and conditions and privacy policies of service providers and accepted by users, on the basis of the notice and consent model. In these cases, it seems that there is no relevant risk for the safeguard of data subjects' rights, as individuals can assess the consequences of data processing and have freely expressed their consent.
Nevertheless, legal and sociological studies have clearly demonstrated that users are usually unaware of the consequences of providing their consent, as they do not read long and technical notices or are unable to completely understand these descriptions and imagine their practical consequences [61–65]. Moreover, in many cases, power imbalance and social lock-in drastically reduce any effective freedom of choice. As a consequence of these constraints, users frequently accept some forms of data processing without any prior risk/benefit analysis and are unaware of the consequences. This shows the limits of the traditional notice and choice paradigm [66,67], which are more evident in the context of Big Data analytics, in which it is difficult to describe the "specific" purposes of data processing [Article 6(1)(a) GDPR] at the moment of data collection, due to the transformative use of data made by data controllers [68].∗ ∗In this light, it is also difficult to comply with the provisions of Article 4 of the GDPR, which qualifies the data subject's consent as "freely given, specific and informed." According to the Article 29 Data Protection Working Party, "to be specific, consent must be intelligible: it should refer clearly and precisely to the scope and the consequences of data processing" [17].
In this sense, with respect to the broad notion of risk concerning data processing, the GDPR maintains the important roles played by the self-determination of data subjects and by transparency, recognized by law in recent decades. The European legislator seems to be unaware of the weaknesses of this approach, in which the formal transparency of terms and conditions, combined with users' behavior [61], gives data controllers, through the notice and consent model, an easy way to lawfully exploit personal data in an extensive manner. On the other hand, a narrower notion of risk can be adopted, which focuses on "material or nonmaterial damages" that prejudice the "rights and freedom of natural persons." This notion has been adopted in the GDPR to define the risk-based approach (Recital 75 GDPR). According to the regulation, when a risk of prejudice exists and cannot be mitigated or excluded, data processing becomes unlawful, despite the presence of any legitimate ground, such as the data subject's consent. Recital n. 75 of the GDPR provides a long list of cases in which data processing is considered unlawful.
Moreover, this recital does not limit these hypotheses to the security of data processing but also takes into account the risk of discrimination and "any other significant economic or social disadvantage." This notion of risk impact, which is echoed in Article 35 of the GDPR, represents an important step in the direction of an impact assessment of data processing [69] that is no longer primarily focused on data security (see Article 32 GDPR) and evolves toward a more robust and broader Privacy, Ethical, and Social Impact Assessment (PESIA).∗ Moreover, the attention to the economic and social implications of data uses assumes relevance in the Big Data context, in which analytics are used in decision-making processes and may have negative impacts that affect individuals in terms of discrimination rather than in terms of data security.† In line with the risk-based approach, the new provisions of the GDPR reinforce the accountability of data controllers, which, according to Article 24, are liable when they do not "implement appropriate technical and organizational measures" to tackle the risks mentioned in the regulation (see also Article 83(4) GDPR). These measures should be implemented from the earliest stage of data processing design, embedding them in the processing, according to the data protection by design approach (Article 25 GDPR).
In the light of the above, regarding transparency, rights of access, and data protection authorities, which are the founding pillars of data protection regulation, and the further element of the data subject's consent, the new regulation ∗See sections "Multiple-risk assessment and collective interests" and "The guidelines adopted by the Council of Europe on the protection of individuals with regard to the processing of personal data in a world of Big Data." Regarding the PESIA model, see also the H2020 project "VIRT-EU: Values and ethics in Innovation for Responsible Technology in Europe." http://www.virteuproject.eu/ (accessed December 21, 2016). †See section "Data-centered approach and socio-ethical impacts."
sheds light on the accountability of data controllers. Although accountability principles were already present in the first data protection regulations, in which the duties of transparency and the role played by data protection authorities increased data controllers' accountability, in the Directive 95/46/EC there was no general process of risk-assessment with specific consequences in terms of accountability. Before the new regulation, there were only national provisions or best practices regarding the privacy impact assessment [69], but no uniform risk-based approach. This goal has now been reached in the GDPR by means of a set of rules that concern the role played by risk analysis, the data protection impact assessment, the prior consultation of data protection authorities, and the data protection officer (Articles 35, 36, and 37 GDPR). In more detail, the risk-based model defined by the GDPR is articulated in three different levels of assessment. The first is required by Article 24 GDPR, and implicitly by Article 35(1). This is a general assessment of "the risk of varying likelihood and severity for rights and freedoms of natural persons," which defines the level of the potential negative impact of data processing. When this first assessment shows that the processing "is likely to result in a high risk to the rights and freedoms of natural persons" (Article 35 GDPR), the controller should carry out a formal data protection impact assessment. Moreover, there is a list of cases in which high risk is presumed (Article 35(3) GDPR). This is an open list, as data protection authorities may add further cases (Article 35(4) GDPR), according to the margin of maneuver that several provisions of the regulation grant to national authorities or legislators.
Nevertheless, the idea of a list of high-risk cases, as well as of cases excluded from the impact assessment (Article 35(5) GDPR), raises doubts about the feasibility of this categorization. In this sense, an ex ante general definition of the presumed level of risk seems to be in conflict with the idea of risk-assessment, which is necessarily context based. Moreover, the cases of high risk are described using indefinite notions, such as "large scale" data processing (Article 35(3)(b) and (c) GDPR). In this regard, Recital n. 91 may help to clarify the meaning of this provision, as it states that the impact assessment "should in particular apply to large-scale processing operations which aim to process a considerable amount of personal data at regional, national or supranational level and which could affect a large number of data subjects." Nevertheless, the recital does not explain when an amount of data is deemed "considerable" and why, in the global digital context, the amount of data should refer to territorial dimensions (regional, national, or supranational). Finally, in the absence of any scale, the general notion of high risk remains quite indefinite. Recital n. 77 identifies a series of bodies and instruments that can provide guidance as regards the "identification of the risk related to the processing, their assessment in terms of origin, nature, likelihood and severity," but, at the moment, the framework remains uncertain.
These criticisms seem to have a limited impact on the field of Big Data analytics, as the majority of applications fall within the cases listed in Article 35(3) GDPR, in which high risk is presumed. Nevertheless, it is worth pointing out that analytics can be used in contexts in which the evaluation of personal aspects is not necessarily "systematic and extensive," as they may focus only on a specific subset of attributes or on a given cluster of persons. Pursuant to Article 35(3), the use of Big Data analytics usually requires a prior data protection impact assessment. This procedure is defined by Article 35(7), in line with the traditional model of risk-assessment, which is primarily a prior evaluation of the potential negative outcomes of a process, product, or activity, and a consequent identification of the measures that should be adopted to avoid or, at least, mitigate the identified risks.∗ This procedure can be divided into three different stages: analysis of the process (Article 35(7)(a) GDPR), risk-assessment (Article 35(7)(b) and (c) GDPR), and definition of the measures envisaged to address the risks (Article 35(7)(d) GDPR). It is worth pointing out that the risk-assessment stage includes two different kinds of evaluation: an assessment of the "necessity and proportionality" of data processing, and an assessment of the "risks to the rights and freedoms of data subjects." These two evaluations are correlated and consequential, as disproportionate or unnecessary data processing cannot be put in place, and in that case no further question arises about the impact on individual rights and freedoms. On the other hand, when the principles of necessity and proportionality are respected, further investigation is needed to assess the specific balance of interests that the use of data implies.
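The stages of Article 35(7) can be read as a simple sequential filter. The sketch below is only an illustrative model of the procedure described above, not a method prescribed by the Regulation; the function name, boolean inputs, and return strings are the author's shorthand.

```python
def dpia(processing_described: bool,
         necessary_and_proportionate: bool,
         residual_risk_high: bool) -> str:
    """Illustrative model of the DPIA stages under Article 35(7) GDPR."""
    # Stage 1 - Art. 35(7)(a): systematic description of the envisaged
    # processing operations and their purposes.
    if not processing_described:
        return "incomplete: the processing must first be described"
    # Stage 2a - Art. 35(7)(b): necessity and proportionality assessment.
    # Disproportionate or unnecessary processing cannot be put in place,
    # so no further question about rights and freedoms arises.
    if not necessary_and_proportionate:
        return "unlawful: processing is unnecessary or disproportionate"
    # Stage 2b - Art. 35(7)(c): risks to the rights and freedoms of data
    # subjects, then Stage 3 - Art. 35(7)(d): measures envisaged to
    # address those risks.
    if residual_risk_high:
        # High risk remains despite the measures -> prior consultation
        # of the supervisory authority (Article 36(1) GDPR).
        return "consult the supervisory authority before processing"
    return "processing may start with the envisaged measures"
```

For instance, `dpia(True, False, False)` captures the point made above: the necessity and proportionality check is logically prior to any balancing of rights and freedoms.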
According to the principles and values framed in the Charter of Fundamental Rights of the European Union, this balance of interests is not a mere risk/benefit analysis but a comparison between interests that are different and may have a different hierarchical order.† In this sense, the data protection impact assessment is not in line with the risk-based theories [70] that suggest the adoption of a risk/benefit approach instead of a risk-mitigation approach.‡ ∗According to the traditional paradigm of risk-assessment, data controllers should be able to demonstrate compliance with the Regulation on the basis of the assessment results (Article 35(7)(d) GDPR) and should periodically review these results, due to the possibility of a change in the nature and severity of the risks over time (Article 35(11) GDPR). †See European Court of Justice, May 13, 2014, Case 131/12, Google Spain SL, Google Inc. v Agencia Española de Protección de Datos (AEPD), Mario Costeja González. http://curia.europa.eu/juris/document/document.jsf?text=&docid=152065&pageIndex=0 &doclang=EN&mode=lst&dir=&occ=first&part=1&cid=980962 (accessed June 16, 2016). ‡According to the risk/benefit approach, the assessment should be based on the comparison between the amount of benefits and the sum of all risks, without any distinction regarding the nature of risks and benefits. In this sense, for instance, economic benefits may prevail over individual rights. On the other hand, the risk-mitigation approach assumes that some interests (e.g., fundamental rights) are prevailing and cannot be compared with other interests that have a lower relevance. As a consequence, the risk-mitigation approach focuses on the potential prejudice to fundamental rights and suggests adequate measures to reduce this risk or, where feasible, to exclude it.
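The difference between the two assessment logics contrasted in the note can be made concrete with a small sketch. This is a stylized model for illustration only: the numeric weights, function names, and the `fundamental_rights_at_risk` flag are hypothetical devices, not anything defined by the GDPR.

```python
def risk_benefit_approach(benefits: float, risks: float) -> bool:
    """Risk/benefit logic: compare aggregate benefits with aggregate
    risks, with no distinction as to their nature, so large economic
    gains may outweigh prejudice to individual rights."""
    return benefits > risks

def risk_mitigation_approach(fundamental_rights_at_risk: bool,
                             residual_risk_after_measures: float,
                             tolerance: float = 0.0) -> bool:
    """Risk-mitigation logic: fundamental rights are prevailing and not
    traded against other interests; processing is acceptable only if
    mitigating measures reduce the prejudice to (near) zero."""
    if not fundamental_rights_at_risk:
        return True
    return residual_risk_after_measures <= tolerance

# Under the aggregate logic a large benefit legitimizes the processing;
# under the mitigation logic the same residual prejudice blocks it:
print(risk_benefit_approach(benefits=100.0, risks=10.0))                  # True
print(risk_mitigation_approach(True, residual_risk_after_measures=10.0))  # False
```

The contrast mirrors the text: the mitigation logic never weighs fundamental rights against other interests; it asks only whether the prejudice to them has been reduced or excluded.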
When the data protection impact assessment "indicates that the processing would result in a high risk in the absence of measures taken by the controller to mitigate the risk," data controllers must consult the supervisory authority prior to the start of processing activities (Article 36(1) and Recital n. 94 GDPR).∗ According to Recital n. 84 of the GDPR, the absence of measures to mitigate the risk is evaluated taking into account the "available technology and costs of implementation." It is worth pointing out that the reference to costs and available technology, also present in the provisions concerning security risk (Recital n. 83 and Article 32(1) GDPR) and data protection by design (Article 25(1) GDPR), represents an important opportunity to put the principle of proportionality into practice in the context of risk mitigation. These provisions therefore reduce the risk of an excessive burden on data controllers resulting from the implementation of the risk-assessment model. According to Article 36(2) GDPR, when the supervisory authority is of the opinion that the intended processing would infringe the regulation, the authority "shall [. . . ] provide written advice to the controller and, where applicable to the processor, may use any of its powers referred to in Article 58." Given the powers conferred on supervisory authorities by Article 58, this means that there are two options: (1) the assessment is not satisfactory, and the data controller has not adequately identified or mitigated the risk; (2) the assessment has been conducted in a correct manner, but there are no measures available to mitigate the risk.
In the first case, the supervisory authority orders the controller or processor "to bring processing operations into compliance with the provisions of this Regulation, where appropriate, in a specified manner" (Article 58(2)(d) GDPR), whereas, in the second case, the authority imposes "a temporary or definitive limitation including a ban on processing" (Article 58(2)(f) GDPR). Finally, a minor aspect of the risk-based approach concerns the role played by the data protection officer, whose main tasks are to advise the controller or the processor on their obligations (including the data protection impact assessment) and to monitor compliance with the legal provisions concerning data protection and with the privacy policies of the controller or processor (Article 39(1) GDPR). In the performance of these tasks, the data protection officer must "have due regard to the risk associated with processing operations, taking into account the nature, scope, context, and purposes of ∗The model of prior consultation is built on the concept of prior checking, which was already present in Article 20 of the Directive 95/46/EC.
processing" (Article 39(2) GDPR). Therefore, the risk-assessment represents one of the main criteria that should drive the action of the data protection officer. The new provisions about risk-assessment represent an important evolution in the direction of a risk-based approach in data protection and, in this sense, may offer an adequate solution to the potential negative outcomes of the use of Big Data analytics. The main limit of these provisions lies in the link to the purposes of data processing.∗ Although the assessment should necessarily be related to the use of data for a specific purpose, a problem arises because, according to Article 5(1)(b) GDPR, data processing purposes should be "specific, explicit, and legitimate" and defined at the moment of data collection, which contrasts with the transformative use of data made by private and public bodies by means of Big Data analytics. For these reasons, a better design of the impact assessment should focus not on the initial purpose of data collection, but on each specific data use that is put in place by the data controller after data collection. In this regard, it should be noted that, at the moment, this result is achieved by data controllers by circumventing the provisions on purpose limitation. They collect personal data on the basis of a broad series of different purposes and then, if they have already adopted impact assessment procedures, evaluate case by case the potential impact on data protection with regard to each different use of information for a given purpose. Against this background, a different perspective can be adopted, which expressly accepts the idea that data are collected for multiple purposes, defined only broadly at the beginning of data processing. This model focuses on the different specific uses of collected information and on the prior assessment of the potential risks of each use.
This kind of approach, if adopted by the legislator, would be more efficient and consistent with the transformative use of data made by companies in the Big Data context, as well as with the level of self-determination of the data subjects [66,71]. In this sense, a more extensive use of legitimate interest as a legal ground [24] may complete this model. Companies may enlist users in data processing without any prior consent, provided they give notice of the results of the assessment, which should be supervised by data protection authorities (licensing model), and provide an opt-out option [66]. It might be objected that the suggested approach undermines users' chances to negotiate their consent, but the strength of this objection is reduced by the existing limits to self-determination described above. In the majority ∗See Article 35(1) GDPR ("Where a type of processing in particular using new technologies, and taking into account the nature, scope, context and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons") and 35(7)(b) ("[The assessment shall contain at least] an assessment of the necessity and proportionality of the processing operations in relation to the purposes").
of cases, the negotiation is reduced to a take-it-or-leave-it alternative. A prior assessment conducted under the supervision of independent authorities, the use of legitimate interest as a legal ground, and the adoption of an opt-out model seem to offer more guarantees to users than an apparent, but inconsistent, self-determination based on notice and consent and on the opt-in model. On the other hand, remaining within the existing legal framework defined by Regulation 2016/679, a different option [71] may be to limit Big Data uses to statistical purposes, which benefit from an explicitly permitted reuse of data (Articles 5(1) and 89 GDPR). Nevertheless, in this case, using analytics for decision-making purposes directly affecting a particular individual would fall outside the field of statistical purposes and would also violate the restrictions on automated individual decision making, including profiling. In this sense, the GDPR "can be seen as a stepping stone, pointing toward the need to evolve data protection beyond the old paradigm, yet not fully committed to doing so" [71]. The model of data management defined by the new Regulation does not completely address the new challenges of the use of Big Data analytics in data processing [24,71]: the new provisions do not provide effective transparency of data processing (obscure notices, impact assessments not publicly available), but only a higher level of accountability. Moreover, the risk-mitigation approach adopted by the Regulation still seems far from the idea of a multiple and participative risk-assessment. Although Recital n. 75 recognizes the risk of discrimination and "any other significant economic or social disadvantage," the provisions of the Regulation do not offer an adequate framework for the assessment of this kind of negative outcome.
With regard to the use of Big Data analytics in decision-making processes, important questions arise about the ethical and social values that should be taken into account, as well as about the role that the different social stakeholders can play in assessing the impact of data uses.∗ In conclusion, the European Union seems insecure in moving away from the traditional model of data protection, whereas other international bodies are trying to offer a more courageous answer to the challenges of the data age. In this sense, the new guidelines on Big Data of the Council of Europe seem to be aware of the limits of the traditional principles governing data protection and open to a broader risk-assessment, which takes into account the social and ethical impacts of data uses and recognizes the benefits of a participatory model based on the multistakeholder approach.† ∗See section "The guidelines adopted by the Council of Europe on the protection of individuals with regard to the processing of personal data in a world of Big Data." †See section "Multiple-risk assessment and collective interests."
Use of data for decision-making purposes: From individual to collective dimension of data processing

The new scale of data processing in Big Data applications and the use of analytics in decision-making processes pose new questions about data protection. As Big Data makes it possible to collect and analyze large amounts of information, data processing is no longer focused on individual users, and this sheds light on the collective dimension of the use of data. In the Big Data environment, general strategies are adopted on a large scale and on the basis of representations of society generated by algorithms, which predict future collective behavior [3,25,55,64]. These strategies are then applied to specific individuals on the basis that they are part of one or more groups generated by analytics [3,56,72]. The use of analytics and the adoption of decisions based on group behavior rather than on individuals are not limited to commercial and market contexts. They also affect other important fields, such as security and social policies, where a different balance of interests should be adopted, given the importance of public interest issues.∗ One example of this is provided by predictive policing solutions such as PredPol [73–77]. This categorical approach to the use of analytics leads policymakers to adopt common solutions for individuals belonging to the same cluster generated by analytics. These decisional processes do not consider individuals per se, but as part of a group of people characterized by some common qualitative factors. In this sense, the use of personal information and Big Data analytics to support decisions exceeds the boundaries of the individual dimension and assumes a collective dimension [78], with potentially harmful consequences for some groups [79,80].
In this sense, prejudice can result not only from well-known privacy-related risks (e.g., illegitimate use of personal information, data security) but also from discriminatory and invasive forms of data processing [15,81,82]. The dichotomy between individuals and groups is not new, and it has already been analyzed with regard to the legal aspects of personal information. Nonetheless, the right to privacy and the right to the protection of personal data have largely been safeguarded as individual rights, despite the social dimension of their rationale. The focus on the model of individual rights is probably the main reason for the few contributions by privacy scholars on the collective dimension of privacy and data protection. Hitherto, only a few authors have investigated the notion of group privacy. They have represented this form of privacy as the privacy of the facts and ideas expressed by the members of a group in the group environment, or in terms of the protection of information about a group [37,83,84]. ∗See also section "Data prediction: social control and social surveillance."
On the other hand, collective data protection does not necessarily concern facts or information referring to a specific person, as with individual privacy and data protection. Nor does it concern clusters of individuals that can be considered groups in the sociological sense of the term. In addition, collective rights are not necessarily a large-scale representation of individual rights and related issues [85]. Finally, collective data protection concerns non-aggregative collective interests [86], which are not the mere sum of many individual interests.∗ The importance of this collective dimension [78] depends on the fact that classification by modern algorithms does not merely focus on individuals, but on groups or clusters of people with common characteristics (e.g., customer habits, lifestyle, online and offline behavior). Data gatherers are mainly interested in studying and predicting groups' behavior, rather than in profiling single users. Data-driven decisions concern clusters of individuals and only indirectly affect the members of these clusters. One example of this is price discrimination based on age, habits, or wealth. The most important concern in this context is the protection of groups from potential harm due to invasive and discriminatory data processing. In this sense, the collective dimension of data processing is mainly focused on the use of information [66,70], rather than on secrecy [83,84] and data quality. Regarding the risk of discrimination, this section does not focus on unfair practices characterized by intentional discriminatory purposes, which are generally forbidden and sanctioned by law [87,88],† but on involuntary forms of discrimination in cases in which Big Data analytics provide biased representations of society [89,90].
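One mechanism behind such involuntary bias, namely unequal sampling across groups, can be reproduced in a toy simulation: two districts with the same number of potholes yield very different report counts when the penetration of the reporting channel (e.g., a smartphone app) differs. All figures below are invented purely for illustration.

```python
import random

random.seed(42)

def simulated_reports(true_potholes: int, app_penetration: float,
                      trials: int = 10_000) -> float:
    """Average number of potholes reported when each pothole is detected
    only if a passing resident carries the app (one Bernoulli draw per
    pothole, averaged over many simulated days)."""
    total = 0
    for _ in range(trials):
        total += sum(random.random() < app_penetration
                     for _ in range(true_potholes))
    return total / trials

# Identical road quality, different (hypothetical) app penetration rates:
affluent = simulated_reports(true_potholes=50, app_penetration=0.8)
low_income = simulated_reports(true_potholes=50, app_penetration=0.3)
print(round(affluent), round(low_income))
```

A decision maker who reads the raw counts as a map of road quality would systematically underserve the district with lower penetration, even though no one intended to discriminate.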
For example, in 2013, a study examined the advertising provided by Google AdSense and found statistically significant racial discrimination in advertisement delivery [91,92]. Similarly, Kate Crawford has pointed out certain algorithmic illusions [93,94] and described the case of the City of Boston and its StreetBump smartphone app to passively detect potholes [95].‡ Another example is the Progressive case, in which an insurance company obliged drivers to install a small monitoring device in their cars to receive the ∗Contra Vedder [81], who claims that the notion of collective privacy "reminds of collective rights," but the subjects of collective rights are groups or communities. Conversely, the groups generated by group profiling are not communities of individuals sharing similar characteristics and structured or organized in some way. For this reason, Vedder uses the different definition of "categorial privacy." †See Article 14 of the Convention for the Protection of Human Rights and Fundamental Freedoms; Article 21 of the Charter of Fundamental Rights of the European Union; Article 19 of the Treaty on the Functioning of the European Union; Directive 2000/43/EC; Directive 2000/78/EC. ‡In this case, the application had a signal problem, due to the bias generated by the low penetration of smartphones among lower-income and older residents. While the Boston administration took this bias into account and solved the problem, less-enlightened public officials might underestimate such considerations and make potentially discriminatory decisions.
company's best rates. The system considered driving late at night a negative factor but did not take into account the potential bias against low-income individuals, who are more likely to work night shifts, compared with late-night party-goers, "forcing them [low-income individuals] to carry more of the cost of intoxicated and other irresponsible driving that happens disproportionately at night" [76]. These cases represent situations in which a biased representation of groups and society results from flawed data processing∗ or a lack of accuracy in the representation. This produces potentially discriminatory effects as a consequence of the decisions taken on the basis of analytics. On the other hand, the decision to treat different situations differently may represent an intentional and legitimate goal for policymakers, in line with the rule of law. This is the case for law enforcement bodies and intelligence agencies, which adopt solutions to discriminate between different individuals and identify targeted persons. Here, there is a deliberate intention to treat given individuals differently, but this is not unfair or illegal provided it remains within existing legal provisions. Nonetheless, as in the previous case, potential flaws or a lack of accuracy may cause harm to citizens.† Discrimination, in terms of the different treatment of different situations, also appears in commercial contexts, in order to offer tailored services to consumers. In this case, in which the interests are of a purely private nature, commercial practices may lead to price discrimination [99,100] or to the adoption of different terms and conditions depending on the assignment of consumers to a specific cluster [56,99,101,102]. Thus, consumers classified as "financially challenged" belong to a cluster "[i]n the prime working years of their lives [. . .
] including many single parents, struggl[ing] with some of the lowest incomes and little accumulation of wealth." This implies the following predictive viewpoint, based on Big Data analytics and regarding all consumers in the cluster: "[n]ot particularly loyal to any one financial institution, [and] they feel uncomfortable borrowing money and believe they are better off having what they want today as they never know what tomorrow will bring" [56]. It is not hard to imagine the potential discriminatory consequences of these classifications with regard to individuals and groups. These forms of discrimination are not necessarily against the law, especially when they are not based on individual profiles and only indirectly affect ∗This is the case of the errors that affect the E-Verify system, which is used in the United States to verify whether a new worker is legally eligible to work in the United States [76,96]. †For instance, criticisms have been raised with regard to the aforementioned predictive software adopted in recent years by various police departments in the United States. Criticisms also concern the use of risk-assessment procedures based on analytics coupled with a categorical approach (based on typology of crimes and offenders) in U.S. criminal sentencing [97,98].
individuals as part of a category, without their direct identification.∗ For this reason, existing legal provisions against individual discrimination might not be effective in preventing the negative outcomes of these practices if they are adopted on a collective basis. Still, such cases clearly show the importance of the collective dimension of the use of information about groups of individuals. From a data protection perspective, in the European Union such data analysis focusing on clustered individuals may not represent a form of personal data processing, as the use of categorical analytics methodologies does not necessarily make it possible to identify a person, and group profiles can be built using anonymized data.† This reduces the chances of individuals taking action against biased representations of themselves within a group, or having access to the data-processing mechanisms, as the anonymized information used for group profiling cannot be linked to them [88,104–106]. However, it has been observed that "once a profile is linked to an identifiable person—for instance in the case of credit scoring—it may turn into a personal data, thus reviving the applicability of data protection legislation" [72]. It should be noted that, as group profiling based on analytics is used to make decisions affecting a multiplicity of individuals, the main target of data processing is not the data subject, but the clusters of people created by Big Data gatherers. In this light, the interests that assume relevance are primarily supraindividual and collective [86]. In general terms, collective interests may be shared by an entire group without conflicts between the views of its members (aggregative interests) or with conflicts between the opinions of its members (non-aggregative interests) [86,107].
If the group is characterized by non-aggregative interests, the collective nature of the interest is represented by the fundamental values of a given society (e.g., environmental protection).

With regard to data protection, the notion of collective non-aggregative interests seems to be the best way to describe the collective dimension of the use of personal information. In this sense, although individuals may have different opinions about the balance between the conflicting interests,‡ there are some collective priorities concerning privacy and data protection that are of relevance to the general interest. Here, the rationale for collective data protection is mainly focused on the potential harm to groups caused by extensive and invasive data processing.

∗Regarding the decisions that affect an individual as a member of a specific cluster of people, it should be noted that in many cases these decisions are not based solely on automated processing [82]. In this sense, credit scoring systems have reduced but not removed human intervention in credit evaluation. At the same time, classifications often regard identified or identifiable individuals [103].
†On the limits of anonymization in the Big Data context, see section "Introduction. The legal challenges of the use of data."
‡In this sense, extensive group profiling for commercial purposes can be passively accepted, viewed favorably, or perceived as invasive and potentially discriminatory. The same divergence of opinions and interests exists with regard to government social surveillance for crime prevention and national security, where part of the population is in favor of surveillance, due to concerns about crime and terrorism.
Data-centered approach and socio-ethical impacts

Privacy and data protection are context-dependent notions, which vary from culture to culture and across historical periods [37,104,108,109]. In the same way, the related collective dimensions are necessarily influenced by historical and geographical variables and are the result of actions by policymakers. For these reasons, it is impossible to define a common and fixed balance between collective data protection and conflicting interests.

There are jurisdictions that give greater priority to national and security interests, which in many cases prevail over individual and collective data protection; meanwhile, in some countries, extensive forms of social surveillance are considered disproportionate and invasive. Therefore, any balancing test must focus on a specific social context in a given historical moment [110]. As has been pointed out in the literature [111], defining prescriptive ethical guidelines concerning the values that should govern the use of Big Data analytics and the related balance of interests is problematic.

Given such variability, from a theoretical perspective, a common framework for a balancing test can be found in the values recognized by international charters of fundamental rights. These charters provide a baseline from which to identify the values that can serve to provide ethical guidance and to define the existing relationships between these values [111].

In addition, the context-dependent framework of values and the relationship between conflicting interests and rights needs to be specified with regard to the actual use of Big Data analytics.
In Europe, for instance, commercial interests related to credit scoring systems can generally be considered compatible with the processing of personal information, provided that the data are adequate, relevant, and not excessive in relation to the purposes for which they are collected.∗ Even so, specific Big Data analytics solutions adopted by some companies for credit scoring purposes may lead to a disproportionate scrutiny of consumers' private lives. The same reasoning can also be applied to smart mobility solutions, which can potentially lead to extensive social surveillance. This means that a prior case-by-case risk assessment is necessary to mitigate the potential impact of these solutions on data protection and individual freedoms.

This "in-context" balance of conflicting interests is based on an impact assessment that, in the presence of complex data collection and processing systems, should not be conducted by consumers or companies but must entail an active involvement of various stakeholders. Against this background, an important aspect of the protection of collective interests relating to personal information is the analysis of the existing conflicting interests and the representation of the issues regarding the individuals grouped in clusters by data gatherers.

∗See Articles 18 and 20 of the Directive 2014/17/EU. See also Article 8 of the Directive 2008/48/EC on credit agreements for consumers and repealing Council Directive 87/102/EEC.
Here, it is useful to briefly consider the fields in which the group dimension of data protection is already known in more traditional contexts that are not characterized by extensive data collection and use of analytics. For instance, labor law recognizes this collective dimension of rights and the dualism between individuals and groups.∗ Under certain circumstances, trade unions and employees' representatives may concur in taking decisions that affect the employees and have an impact on data protection in the workplace. Collective agreements on these decisions are based on the recognition that, given the power imbalance in the workplace, in some cases the employee is unaware of the implications of the employer's policies (e.g., workplace surveillance practices). Moreover, in many cases, this imbalance makes it difficult for employees to object to the illegitimate processing of their data.

Entities representing collective interests (e.g., trade unions) are less vulnerable to power imbalance and have a broader vision of the impact of the employer's policies and decisions. It should also be noted that the employer's unfair policies and forms of control are often oriented toward discriminatory measures that affect individual workers, even though they are targeted at the whole group.

This collective representation of common interests is also adopted in other fields, such as consumer protection and environmental protection. These contexts are all characterized by a power imbalance affecting one of the parties directly involved (employees, consumers, or citizens). Furthermore, in many cases, the conflicting interests refer to contexts in which the use of new technologies makes it hard for users to be aware of the potential negative implications.
The same situation of imbalance often exists in the Big Data context, where data subjects are not in a position to object to discriminatory uses of personal information by data gatherers. Data subjects often do not know the basic steps of data processing, and the complexity of the process means that they are unable to negotiate their information and are not aware of the potential collective prejudices that underlie its use.† This is why it is important to recognize the role of entities representing collective interests, as happens in the cases described earlier.

Employees are part of a specific group, defined by their relationship with a single employer; therefore, they are aware of their common identity and have mutual relationships. By contrast, in the Big Data context, the common attributes of the group often only become evident in the hands of the data gatherer. Data subjects are not aware of the identity of the other members of the group, have no relationship with them, and have a limited perception of their collective issues [112,113]. Furthermore, these groups shaped by analytics have a variable geometry, and individuals can shift from one group to another.

∗See, for example, the Italian Statute of the Workers' Rights, Articles 4 and 8, Act 300, May 20, 1970.
†See section "Introduction. The legal challenges of the use of data."
This does not undermine the idea of representing collective data protection interests. On the contrary, this atomistic dimension makes the need for collective representation more urgent. However, it is hard to imagine representatives appointed by the members of these groups, as is instead the case in the workplace.

In this sense, there are similarities with consumer law, where there are collective interests (e.g., product security, fair commercial practices), but the potential victims of harm have no relationship to one another. Thus, individual legal remedies must be combined with collective remedies.∗ Examples of possible complementary solutions are provided by consumer law, where independent authorities responsible for consumer protection, class action lawsuits, and consumer associations play an important role.

In the field of Big Data analytics, the partially hidden nature of the processes and their complexity probably make timely class actions more difficult than in other fields. For instance, in the case of product liability, the damages are often more evident, making it easier for the injured parties to react. On the other hand, associations that protect collective interests can play an active role in facilitating the reaction to unfair practices and, moreover, they can be involved in a multistakeholder risk assessment of the specific use of Big Data analytics.

The involvement of such bodies requires specific procedural criteria to define the entities that may act in the collective interest.† This is more difficult in the context of Big Data, in which the groups created by data gatherers do not have a stable character. In this case, an assessment of the social and ethical impact of analytics often provides the opportunity to discover how data processing affects collective interests and thus identify the potential stakeholders.
Multiple-risk assessment and collective interests

How collective interests should be protected against discrimination and social surveillance in the use of Big Data analytics is largely a matter for the policymakers. Different legal systems and different balances between the components of society suggest differing solutions. Identifying the independent authority charged with protecting collective interests may therefore be difficult.

Many countries have independent bodies responsible for supervising specific social surveillance activities, and other bodies focused on antidiscrimination actions [114]. In other countries, this responsibility is spread across various authorities, which take different approaches, use different remedies, and do not necessarily cooperate in solving cases with multiple impacts.

Meanwhile, a central element in the risk assessment of Big Data analytics is the analysis of data processing, which is the factor common to all these

∗The same approach has been adopted in the realm of antidiscrimination laws [114,115].
†See also Article 80 GDPR.
situations, regardless of the potential harm to collective interests. For this reason, data protection authorities can play a key role in the risk-assessment processes, even if they are not focused on the specific social implications (e.g., discrimination).

On the other hand, if we take a different approach that takes into consideration the various negative effects generated by the use of Big Data (discrimination, unfair consumer practices, social control, etc.), we should involve multiple entities and authorities. Nevertheless, the end result may be a fragmented and potentially conflicting decision-making process that may underestimate the use of data, which is the common core of all these situations [95].

Furthermore, data protection authorities are accustomed to addressing collective issues and have already demonstrated that they do consider both the individual and the wider collective dimension of data processing. Focusing on data protection and fundamental rights, they are also well placed to balance the conflicting interests around the use of data.

The adequacy of this solution is also empirically demonstrated by important cases decided by data protection authorities concerning data-processing projects with significant social and ethical impacts. These cases show that decisions to assess the impact of innovative products, services, and business solutions on data protection and society are not normally taken on the initiative of the data subjects, but primarily on that of the data protection authorities, who are aware of the potential risks of such innovations. Based on their balancing tests, these authorities are in a position to suggest measures that companies should adopt to reduce the risks discussed here and to place these aspects within the more general framework of the rights of the individual, as a single person and as a member of a democratic society.
The risk assessment represents the opportunity for group issues to be identified and addressed. Thus, bodies representing collective interests should not only partially exercise traditional individual rights on behalf of data subjects but also exercise other autonomous rights relating to the collective dimension of data protection. These new rights mainly concern participation in the risk-assessment process, which should take a multistakeholder approach.∗

Against this background, data protection authorities may involve in the assessment process the various stakeholders that represent the collective interests affected by specific data-processing projects [111,116].† This would lead to the definition of a new model in which companies that intend to use Big Data analytics would undergo an assessment prior to collecting and processing data.

∗The extent of the rights conferred upon the different stakeholders in the protection of collective privacy is largely a matter for policymakers to decide and would depend on the nature and values of the different sociolegal contexts.
†A different assessment exclusively based on the adoption of security standards or corporate self-regulation would not have the same extent and independence. This does not mean that, in this framework, forms of standardization or coregulation cannot be adopted.
The assessment would not only focus on data security and data protection but also consider the social and ethical impacts relating to the collective dimension of data use in a given project.∗ This assessment should be conducted by third parties and supervised by the data protection authorities.† Once this multiple-impact assessment is approved by the data protection authorities, the ensuing data processing would be considered secure in protecting personal information and collective interests.

Although data protection authorities are already engaged to some degree in addressing the collective dimension, the suggested solution would lead to a broader and deeper assessment, which would become mandatory. This proposal is therefore in line with the view that a licensing scheme might "prove to be the most effective means of ensuring that data protection principles do not remain 'law-in-book' with respect to profiling practices" [44,104].

The guidelines adopted by the Council of Europe on the protection of individuals with regard to the processing of personal data in a world of Big Data

Although the guidelines provided by the Council of Europe on the basis of Convention 108 on data protection do not have the same impact as Regulation (EU) 2016/679, in terms of efficacy and direct application, they represent an interesting set of rules that, in some respects, shows a new way of addressing the issues concerning the use of Big Data analytics. Before briefly examining the provisions of the "Guidelines on the Protection of Individuals with Regard to the Processing of Personal Data in a World of Big Data" (hereafter Guidelines) adopted by the Council of Europe,‡ the nature and peculiarities of these guidelines should be highlighted. Within the framework of Convention 108, the guidelines are practical and operative instructions provided by the Council of Europe to member states.
They are primarily addressed to data controllers and data processors, to facilitate the effective application of the principles of the Convention in

∗In the Big Data context, another important aspect is the transparency of the algorithms used by companies [55,64,82,88,90]. See Articles 13 (2)(f), 14 (2)(f), and 15 (1)(h) GDPR, which recognize the data subject's right to receive "meaningful information about the logic involved."
†The entire system will work only if the political and financial autonomy of data protection authorities from governments and corporations is guaranteed. Moreover, data protection authorities would need new competences and resources in order to bear the burden of the supervision and approval of these multiple-impact assessments. For these reasons, a model based on mandatory fees—paid by companies when they submit their requests for authorization to data protection authorities—would be preferable [66]. It should also be noted that, in cases of large-scale and multinational data collection, forms of mutual assistance and cooperation may facilitate the role played by data protection authorities in addressing the problems related to the dimensions of data collection and data gatherers.
‡The guidelines are available at https://rm.coe.int/CoERMPublicCommonSearchServices/DisplayDCTMContent?documentId=09000016806ebe7a.
specific sectors.∗ Nevertheless, unlike the guidelines previously adopted by the Council of Europe, which concerned specific contexts or issues, these guidelines focus on the use of a given technology (Big Data) and are not sector specific.†

The awareness of the critical issues posed by the new forms of data processing based on analytics characterizes the entire text of the Guidelines. Therefore, the principles of Convention 108 are interpreted to provide adequate solutions, taking into account "the given social and technological context" and "a lack of knowledge on the part of individuals" with regard to Big Data applications.‡ In this light, the effective safeguard of the individual's "right to control his or her personal data and the processing of such data"§ is placed in the context of Big Data uses, in which the processes of collection and analysis of data are characterized by complexity and obscurity [64].

For this reason, the Guidelines do not consider the notion of control as merely circumscribed to individual control (as in the notice and consent model) but adopt a broader idea of control over the use of data, according to which "individual control evolves in a more complex process of multiple-impact assessment of the risks related to the use of data."¶

This leads beyond the individual dimension of data protection to an investigation of aspects that concern the relations among individuals and society at large.
In this light, potential prejudices are not only restricted to the well-known privacy-related risks (e.g., illegitimate use of personal information, data security) but also include other prejudices that may concern conflicts with ethical and social values [15,82], in line with the Privacy, Ethical, and Social Impact Assessment (PESIA) model mentioned above.∗∗

Nevertheless, the assessment concerning the impact of the use of data on ethical and social values is more complicated than the traditional data protection assessment. Moreover, although individual rights concerning data

∗See Section II (Scope) of the Guidelines.
†These guidelines do not provide an authoritative definition of Big Data, as there are many definitions of Big Data, which differ depending on the specific discipline. The Guidelines cover both Big Data and Big Data analytics.
‡See Section I (Introduction) of the Guidelines. See also Section II (Scope) of the Guidelines ("Given the nature of Big Data, the application of some of the traditional principles of data processing [e.g., minimization principle, purpose specification, meaningful consent, etc.] may be challenging in this technological scenario. These guidelines therefore suggest a tailored application of the principles of the Convention 108, to make them more effective in practice in the Big Data context").
§See Section I (Introduction) of the Guidelines. See also the Preamble of the Draft modernized Convention for the Protection of Individuals with Regard to the Processing of Personal Data ("Considering that it is necessary to secure the human dignity and protection of the human rights and fundamental freedoms of every individual and [. . . ] personal autonomy based on a person's right to control of his or her personal data and the processing of such [personal] data").
¶See the previous footnote.
∗∗See section "Use of data and risk-analysis."
processing are generally recognized by different national regulations and international conventions, and data security and data management best practices are commonly diffused among data controllers, the values that should inspire the use of data are more indefinite and context based, changing from one community to another. This makes it more complicated to identify a benchmark for these values that can be used in the ethical and social risk assessment.

This point is clearly addressed in the section "Introduction. The legal challenges of the use of data" of the fourth part (Principles and guidelines) of the Guidelines. First, the section urges both data controllers and data processors to "adequately take into account the likely impact of the intended Big Data processing and its broader ethical and social implications." Second, it recognizes the relative nature of social and ethical values; in this sense, the Guidelines require that data uses should not be in conflict with the "ethical values commonly accepted in the relevant community or communities and should not prejudice societal interests, values and norms."

Although the Guidelines recognize the difficulties in defining the values that should be taken into account in the social and ethical assessment, they do not renounce defining some practical steps to identify these values.
Therefore, they suggest that "the common guiding ethical values can be found in international charters of human rights and fundamental freedoms, such as the European Convention for the Protection of Human Rights." Given the context-dependent nature of the social and ethical assessment and the fact that international charters may only provide high-level guidance, the Guidelines combine this general suggestion with a more tailored option, represented by the "ad hoc ethics committee."∗ These committees, which already exist in practice, should identify the specific ethical values to be safeguarded with respect to a given use of data, providing more detailed and context-based guidance for risk assessment.

The Guidelines place the risk-assessment process in the broader context of the precautionary approach, which should characterize any new application of technology that may produce potential risks for individuals and society.† In this light, the Guidelines require data controllers to adopt preventive policies to adequately address and mitigate the potential risks related to the use of Big Data analytics.‡

∗See Guidelines, IV.1.3 ("If the assessment of the likely impact of an intended data processing described in section IV.2 highlights a high impact of the use of Big Data on ethical values, data controllers could establish an ad hoc ethics committee, or rely on existing ones, to identify the specific ethical values to be safeguarded in the use of data").
†See Guidelines, IV.2.1 ("Given the increasing complexity of data processing and the transformative use of Big Data, the Parties should adopt a precautionary approach in regulating data protection in this field").
‡See Guidelines, IV.2.2. This is consistent with the provision of the Modernized Convention, which focuses both on risk analysis and the design of data processing "in such a manner as to prevent or minimise the risk of interference with [. . .
] rights and fundamental freedoms.” See Article 8bis (2) of the Draft modernised Convention for the Protection of Individuals with Regard to the Processing of Personal Data.
According to the general theory of the risk-based approach, the assessment process is divided into four stages:∗ (1) identification of the risks, (2) analysis of the potential impact of the identified risks, (3) identification of the solutions to exclude or mitigate the risks, and (4) continuous or periodic monitoring of the effectiveness of the solutions provided.† This is the traditional scheme that characterizes risk assessment. Here, the most innovative aspect concerns the broader range of interests considered in the assessment process, which goes beyond the traditional notion of data protection. In this sense, the right to nondiscrimination and the social and ethical impacts of data-processing activities assume specific relevance.

Given the complexity of this assessment and the different aspects that should be taken into account, it cannot be conducted only by experts in data protection law but requires external auditors with specific and multidisciplinary skills.
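The four-stage cycle described above can be sketched as a minimal workflow. The stage names follow the text; the risk catalogue, scores, and threshold below are invented placeholders, not values taken from the Guidelines:

```python
# Minimal sketch of the four-stage risk-assessment cycle:
# (1) identify risks, (2) analyse impact, (3) mitigate, (4) monitor.

RISK_SCORES = {"discrimination": 8, "re-identification": 6, "data breach": 4}

def identify_risks(processing):                      # stage 1
    return list(processing["candidate_risks"])

def analyse_impact(risks):                           # stage 2
    return {r: RISK_SCORES.get(r, 0) for r in risks}

def plan_mitigations(impacts, threshold=5):          # stage 3
    return {r: "mitigate" if score >= threshold else "accept"
            for r, score in impacts.items()}

def monitor(impacts, plan, threshold=5):             # stage 4: re-run periodically
    return all(plan[r] == "mitigate"
               for r, s in impacts.items() if s >= threshold)

processing = {"candidate_risks": ["discrimination", "data breach"]}
impacts = analyse_impact(identify_risks(processing))
plan = plan_mitigations(impacts)
print(plan)  # {'discrimination': 'mitigate', 'data breach': 'accept'}
```

In practice, stage 4 means re-running the whole cycle as risks and countermeasures evolve, which is why the text stresses continuous or periodic monitoring.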
In this light, the Guidelines require that the risk assessment "should be carried out by persons with adequate professional qualifications and knowledge to evaluate the different impacts, including the legal, social, ethical and technical dimensions."‡ Moreover, the collective dimension of the potential impact of the use of data encourages a multistakeholder approach that gives a voice to the different groups of persons that may be affected by a given use of data.§

Due to the complexity of this assessment and the continuous evolution of both the potential risks and the measures to tackle them, data protection authorities may play a relevant role in supporting data controllers, providing information about the state of the art of data-processing security methods, and providing detailed guidelines on the risk-assessment process.¶

From the data subject's perspective, a better understanding of the purposes of data processing can come from the analysis of the way in which data uses impact individuals and society. In this light, the disclosure of the results of the different impact assessments mentioned above should become part of the transparency duties of data controllers, to increase individuals' awareness of their choices concerning personal information.∗∗

With regard to the level of disclosure that should characterize the publicity of the impact assessment, the Guidelines, following the suggestion of legal scholars [66,111], clarify that the public availability of the results of the assessment should be made "without prejudice to secrecy safeguarded by law." Therefore, in the presence of such secrecy, data controllers "shall provide

∗See Guidelines, IV.2.5.
†See Guidelines, IV.2.9. Moreover, data controllers shall document the assessment and these solutions (Guidelines, IV.2.10).
‡See Guidelines, IV.2.6.
§See Guidelines, IV.2.7 ("With regard to the use of Big Data which may affect fundamental rights, the Parties should encourage the involvement of the different stakeholders (e.g., individuals or groups potentially affected by the use of Big Data) in this assessment process and in the design of data processing").
¶See Guidelines, IV.2.8.
∗∗See Guidelines, IV.3.2 and 3.3.
any sensitive information in a separate annex to the risk-assessment report." Although this annex is not public, it may be accessed by the supervisory authorities.∗

Minor provisions of these guidelines concern the by-design approach† and the data subject's consent. With regard to the latter and the notice and consent model, the Guidelines highlight that the notice should include the outcome of the assessment process and "might also be provided by means of an interface that simulates the effects of the use of data and its potential impact on the data subject, in a learn-from-experience approach."‡ Moreover, consent cannot be considered freely given when "there is a clear imbalance of power between the data subject and the Data Controllers or Data Processors, which affects the data subject's decisions with regard to the processing."§

Finally, the Guidelines devote a section to the role of human intervention in Big Data–supported decisions,¶ reaffirming that the use of Big Data "should preserve the autonomy of human intervention in the decision-making process." In this light, when decisions based on Big Data might significantly affect individual rights or produce legal effects, a human decision-maker should, upon request of the data subject, "provide her or him with the reasoning underlying the processing, including the consequences for the data subject of this reasoning." In the same vein, the autonomy of decision makers should be preserved and, on the basis of "reasonable arguments," they should be allowed the freedom not to rely on the recommendations provided using Big Data.

Data prediction: Social control and social surveillance

Big Data prediction promises remarkable opportunities to anticipate fraud and to prevent crime but, at the same time, its use could also threaten fundamental legal rights such as privacy and due process [68].
Law enforcement agencies [73], secret services [117], doctors, lawyers,∗∗ accountants [55], and judges†† are using Big Data predictive analytics solutions

∗See Guidelines, IV.3.2.
†See Guidelines, IV.4.
‡See Guidelines, IV.5.1.
§See Guidelines, IV.5.3. In these cases, the data controller "should demonstrate that this imbalance does not exist or does not affect the consent given by the data subject."
¶See Guidelines, IV.7.
∗∗See ROSS, the first artificially intelligent lawyer, at http://www.rossintelligence.com; see also IBM Watson, at http://www-03.ibm.com/innovation/us/watson.
††A recent experiment demonstrated that artificial intelligence can predict decisions of the European Court of Human Rights (ECtHR) with 79% accuracy. Further information at: http://www.legalfutures.co.uk/latest-news/robot-judge-ai-predicts-outcome-european-court-cases.
as they are well aware of how these tools can be useful and/or profitable, especially in a society increasingly preoccupied with the concepts of risk and public protection [118]. However, new technologies enhance the preemptive profiling of individuals, as the combination of predictive strategies and increased surveillance allows for more targeted profiles.

Kerr and Earle identified three categories of Big Data prediction: consequential, preferential, and preemptive prediction.

Consequential prediction is, in general terms, an attempt to anticipate the likely consequences of a person's action. Usually, this is the kind of prediction used by a lawyer to show a client a realistic scenario for her defense strategy.

Preferential prediction is mostly used by private players (iTunes Genius and Amazon's recommendation engine are two significant examples), and it uses anticipatory algorithms based on social media intelligence to predict what kind of service a user will find interesting.

Preemptive predictions assess the likely consequences of allowing or disallowing a person to act in a certain way. In contrast to consequential or preferential predictions, preemptive predictions do not usually adopt the perspective of the actor. They are mostly made from the standpoint of the state, a corporation, or anyone who wishes to prevent or forestall certain types of action. Preemptive predictions are not concerned with an individual's actions but with whether an individual or group should be permitted to act in a certain way. Examples of this technique include a no-fly list used to preclude possible terrorist activity on an airplane, or analytics software used to determine how much supervision parolees should have based on predictions of future behavior [118]. This latter form of prediction could considerably threaten the concept of fundamental rights in any democratic constitution.
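To make the preferential category concrete, a recommendation engine of this kind can be caricatured in a few lines. The data and the scoring rule below are invented; real engines such as those named above are far more sophisticated:

```python
# Toy sketch of "preferential prediction": recommend an item by counting
# co-occurrences in other users' listening histories.
from collections import Counter

histories = [
    {"song_a", "song_b"},
    {"song_a", "song_b", "song_c"},
    {"song_a", "song_b", "song_c"},
]

def recommend(user_items, histories):
    """Score items that co-occur with the user's items and pick the top one."""
    counts = Counter()
    for h in histories:
        if user_items & h:                  # history overlaps the user's taste
            counts.update(h - user_items)   # count items the user lacks
    return counts.most_common(1)[0][0] if counts else None

print(recommend({"song_a"}, histories))  # song_b
```

The same co-occurrence logic, pointed at arrest records or watchlists instead of songs, is what turns a preferential engine into the preemptive kind discussed next, which is why the category of data matters as much as the algorithm.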
Ferguson correctly questioned whether a computer program that predicts the probability of future crime locations could change Fourth Amendment protections in the targeted area. Furthermore, are data-driven hunches more reliable than the personal hunches traditionally deemed insufficient to justify reasonable suspicion?

Use of data during the investigation: Reasonable doubt versus reasonable suspicion

The new reality, briefly described in the previous section, simultaneously undermines the protection that reasonable suspicion provides against stops and potentially transforms reasonable suspicion into a means of justifying those same stops. Reasonable suspicion in the United States is a legal standard of proof that is less than probable cause, the legal standard for arrests and warrants, but more than an unparticularized suspicion; it must be based on specific and
Legal aspects of information science, data science, and Big Data 31

articulable facts, "taken together with rational inferences from those facts," and the suspicion must be associated with the specific individual.

In Europe, Article 5 of the European Convention on Human Rights states that "everyone has the right to liberty and security of person. No one shall be deprived of this liberty save in the following cases and in accordance with a procedure prescribed by law [. . . ]." This means that a reasonable suspicion presupposes the existence of facts or information that would satisfy an objective observer that the person concerned may have committed an offence.∗ Therefore, a failure by the authorities to make a genuine inquiry into the basic facts of a case, to verify whether a complaint was well founded, disclosed a violation of Article 5 §1 (c) of the European Convention on Human Rights.†

To better understand the consequences of the principle of reasonable suspicion in the Big Data scenario, a practical example may be helpful. Suppose police are investigating a series of robberies in a particular neighborhood and have in their patrol cars facial recognition software, connected to a database of arrest photos, which scans people on the street. Suddenly, there is a match with a suspected person. The suspect's personal information scrolls across the patrol car's computer screen: prior robbery arrests and robbery convictions. The officer then searches additional sources of third-party data, including the suspect's GPS location information for the last six hours, license plate records that tie the suspect to pawn shop trades close in time to prior robberies, and, obviously, social media information. The police now have particularized, individualized suspicion about a man who is not doing anything overtly criminal.
Can this aggregation of individualized information be sufficient to justify interfering with a person's constitutional liberty?‡ This question, and more, will be raised by the use of any predictive policing strategy.

Big Data and social surveillance: Public and private interplay in social control

The interaction between public and private actors in social control can be divided into two categories, both of which are significant with regard to data protection. The first concerns the collection of private company data by government for surveillance and social control purposes, whereas the second is the use by judicial authorities of instruments and technologies provided by private companies for organizational and investigative purposes.

∗ECHR, Ilgar Mammadov v. Azerbaijan, §88; Erdagöz v. Turkey, §51; Fox, Campbell and Hartley v. the United Kingdom, §32. †ECHR, Stepuleac v. Moldova, §73; Elçi and Others v. Turkey, §674. ‡All these investigative instruments could be used on the basis of the following principle: law enforcement officers may access many of these records without violating the Fourth Amendment, under the theory that we can claim no reasonable expectation of privacy in information we have knowingly revealed to third parties.
With regard to the first category, and especially when the request is made by governmental agencies, the issue of the possible violation of fundamental rights becomes more delicate. The Echelon Interception System [119] and the Total Information Awareness program [120] are concrete examples that are not isolated incidents, but the National Security Agency (NSA) case has undoubtedly shown how invasive surveillance can be in the era of global data flows and Big Data. To better understand the NSA case, it is important to have an overview of the considerable amount of electronic surveillance legislation that, particularly in the wake of 9/11, has been approved in the United States and, to a certain extent, in a number of European countries.

The most important legislation is the Foreign Intelligence Surveillance Act (FISA) of 1978,∗ which lays down the procedures for collecting foreign intelligence information through the electronic surveillance of communications for homeland security purposes. Section 702 of FISA, as amended in 2008, extended its scope beyond the interception of communications to include any data in public cloud computing as well. Furthermore, this section clearly indicates that two different regimes of data processing and protection exist for U.S. citizens and residents on the one hand, and non-U.S. citizens and residents on the other. More specifically, the Fourth Amendment is applicable only to U.S. citizens, as there is an absence of any cognizable privacy rights for non-U.S. persons under FISA. Under FISA and the 2008 amendment, U.S.
authorities were able to access and process personal data of EU citizens on a large scale via, among other programs, the NSA's warrantless wiretapping of cable-bound internet traffic (UPSTREAM) and direct access to personal data stored on the servers of U.S.-based private companies such as Microsoft, Yahoo, Google, Apple, Facebook, and Skype (PRISM), through cross-database search programs such as X-KEYSCORE. U.S. authorities also have the power to compel disclosure of cryptographic keys, including the secure sockets layer (SSL) keys used to secure data in transit by major search engines, social networks, webmail portals, and cloud services in general (BULLRUN program) [121].

Even if FISA is the most widely applied and best-known legislative tool for conducting intelligence activities, there are other relevant pieces of legislation on electronic surveillance. One need only consider the Communications Assistance for Law Enforcement Act of 1994,† which authorizes law enforcement and intelligence agencies to conduct electronic surveillance by requiring that telecommunications carriers and manufacturers of telecommunications equipment modify and design their equipment, facilities, and services to ensure that they have built-in surveillance capabilities.

∗Foreign Intelligence Surveillance Act (50 U.S.C. §1801–1885c). †See Communications Assistance for Law Enforcement Act (18 U.S.C. §2522).
Surveillance programs are, in truth, not confined to the United States. In Europe, the Communications Capabilities Development Program has prompted a huge amount of controversy, given its intention to create a ubiquitous mass surveillance scheme for the United Kingdom covering phone calls, text messages, and e-mails, and extending to logging communications on social media. In June 2013, the so-called TEMPORA program showed that the UK intelligence agency Government Communications Headquarters had cooperated with the NSA in surveillance and spying activities [122]. These revelations were followed in September 2013 by reports focusing on the activities of Sweden's National Defense Radio Establishment. Similar projects for the large-scale interception of telecommunications data have been developed by both France's General Directorate for External Security and Germany's Federal Intelligence Service.

Even if EU and U.S. surveillance programs seem similar, there is one important difference: in the European Union, under data protection law, individuals always retain control over their own personal data, whereas in the United States, individuals have more limited control once they have subscribed to the terms and conditions of a service.∗

Beyond government agencies' monitoring activities, the second category, regarding the use by judicial authorities of private tools for investigative purposes, offers two interesting examples.

The first is the PredPol† software, initially used by the Los Angeles police force and now by other police forces in the United States (Palm Beach, Memphis, New York, Chicago, Minneapolis, and Dallas). Police Chief (ret.) William J.
Bratton and the Los Angeles Police Department (LAPD) are credited with envisioning PredPol, whereas Charlie Beck, an LAPD officer since 1977 and its chief from 2009, wrote in 2009: "what can we learn from Wal-Mart and Amazon about fighting crime in a recession? Predictive policing leverages advanced analytics to enable information-based approaches to law enforcement tactics, strategy, and policy, enhancing public safety and changing outcomes. Advanced analytics tools, techniques, and processes support meaningful exploitation of public-safety data necessary to turn data into knowledge and guide information-based prevention, thwarting, mitigation, and response."

Predictive policing, in essence, cross-checks the data, places, and techniques of recent crimes against disparate sources, analyzes them, and then uses the results to anticipate, prevent, and respond more effectively to future crime. Even if the software house that created PredPol declares that no profiling activities are carried out, it is essential to understand carefully the technology used to anonymize the personal data acquired from the law-enforcement database.

∗See United States v. Miller (425 U.S. 435 [1976]). In this case the United States Supreme Court held that the "bank records of a customer's accounts are the business records of the banks and that the customer can assert neither ownership nor possession of those records." The same principle could be applied to an Internet Service Provider. †See PredPol, Predictive Policing Software, available at www.predpol.com/.
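The place-based approach described above can be illustrated with a deliberately minimal sketch: aggregate past incidents into map grid cells and rank the cells by incident count. This is only a toy illustration of the general idea, not PredPol's actual (undisclosed, proprietary) algorithm; the coordinates are hypothetical.

```python
from collections import Counter

def hotspot_scores(incidents):
    """Rank map grid cells by past incident counts.

    incidents: iterable of (latitude, longitude) pairs.
    Returns a list of (cell, count) pairs, most incidents first.
    """
    counts = Counter()
    for lat, lon in incidents:
        # Bucket coordinates into 0.01-degree cells (roughly 1 km squares).
        cell = (int(lat * 100), int(lon * 100))
        counts[cell] += 1
    return counts.most_common()

# Hypothetical data: two robberies on one block, one elsewhere.
incidents = [(48.8580, 2.2940), (48.8585, 2.2949), (48.8610, 2.3360)]
ranking = hotspot_scores(incidents)
# ranking[0] is the cell containing two incidents.
```

A real deployment would add temporal decay, near-repeat effects, and many more data sources; the legal concerns discussed in the text arise precisely because scores of this kind may come to feed reasonable-suspicion determinations.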
This type of software is bound to have a major impact in the United States on the conception of the protection of rights under the Fourth Amendment, and more specifically on concepts such as probable cause and reasonable suspicion, which in the future may come to depend on an algorithm rather than human judgment [73].

The second example is the Geofeedia software.∗ This software maps a given location, such as a certain block within a city or even an entire metropolitan area, and searches the entire public Twitter and/or Facebook feed to identify any geolocated posts from recent days within that specific area. This application can provide particularly useful data for the purpose of social control. One can imagine the possibility of obtaining useful elements (e.g., an IP address) to identify the subjects present in a given area during a serious car accident or a terrorist attack.

From a strictly legal standpoint, these social control tools may be employed to gather information directly from citizens on the basis of the following principle: "Where someone does an act in public, the observance and recording of that act will ordinarily not give rise to an expectation of privacy" [123]. In the European Union, although this type of data collection frequently takes place, it could conflict with European Court of Human Rights (ECHR) case law which, in the Rotaru v. Romania case,† ruled that "public information can fall within the scope of private life where it is systematically collected and stored in files held by the authorities." As O'Floinn [124] observes, "Non-private information can become private information depending on its retention and use. The accumulation of information is likely to result in the obtaining of private information about that person."

In the United States, this subject was addressed in the case People v. Harris;‡ the New York County District Attorney's Office sent a subpoena to Twitter, Inc.
seeking to obtain the Twitter records of a user suspected of having participated in the Occupy Wall Street movement. Twitter refused to provide the law enforcement officers with the information requested and sought to quash the subpoena. The Criminal Court of New York upheld the application made by the New York County District Attorney's Office, rejecting the arguments put forward by Twitter and stating that tweets are, by definition, public, and that a warrant is not required to compel Twitter to disclose them. The District Attorney's Office argued that the third-party disclosure doctrine, put forward for the first time in United States v. Miller, was applicable.§

∗See https://geofeedia.com/. The ACLU of California has recently obtained records showing that Twitter, Facebook, and Instagram provided user data access to Geofeedia, a developer of a social media monitoring product marketed to law enforcement as a tool to monitor activists and protesters. More information at: https://www.aclunc.org/blog/facebook-instagram-and-twitter-provided-data-access-surveillance-product-marketed-target. †See Rotaru v. Romania (App. No. 28341/95) (2000) 8 B.H.R.C. at [49]. ‡See 2012 NY Slip Op 22175 [36 Misc 3d 868]. §See United States v. Miller (425 U.S. 435 [1976]).
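The geofencing mechanism the chapter attributes to Geofeedia — mapping an area and pulling public geotagged posts inside it — reduces, at its core, to a bounding-box filter over post coordinates. The sketch below is a minimal illustration with hypothetical post data; it is not Geofeedia's actual product, nor any real social network's API.

```python
def geofenced(posts, south, west, north, east):
    """Return the posts whose coordinates fall inside the bounding box."""
    return [p for p in posts
            if south <= p["lat"] <= north and west <= p["lon"] <= east]

# Hypothetical public posts with self-reported coordinates.
posts = [
    {"user": "alice", "lat": 40.7128, "lon": -74.0060},  # inside the target block
    {"user": "bob",   "lat": 51.5074, "lon": -0.1278},   # far outside it
]

hits = geofenced(posts, south=40.70, west=-74.02, north=40.72, east=-74.00)
# hits contains only alice's post.
```

The Rotaru point is visible even in this toy: each post is individually public, but it is the systematic collection and retention of every post from an area that may turn the aggregate into private information.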
The EU reform on data protection

In addition to the GDPR, the new directive on the protection of individuals with regard to the processing of personal data by competent authorities (DPI) establishes some protections against possible violations of EU citizens' privacy. The goal of this directive is to ensure that, "in a global society characterized by rapid technological change where information exchange knows no borders," the fundamental right to data protection is consistently protected.∗

The founding principles of this directive, which are shared with the previous directives referred to, are twofold:

1. First, there is the need for fair, lawful, and adequate data processing during criminal investigations or to prevent a crime, on the basis of which all data must be collected for specified, explicit, and legitimate purposes and must be erased or rectified without delay.†

2. Second, there is the obligation to make a clear distinction between the various categories of possible data subjects in a criminal proceeding (persons with regard to whom there are serious grounds for believing that they have committed or are about to commit a criminal offence, persons convicted, victims of criminal offences, and third parties to the criminal offence).

For each of these categories, there must be a different, adequate level of attention to data protection, especially for persons who do not fall within any of the categories referred to previously.‡

These two principles are of considerable importance, although their application on a practical level will be neither easy nor immediate in certain member states.
This is easily demonstrated by the difficulties encountered when either drafting practical rules distinguishing between several categories of potential data subjects within the papers of a court file, or attempting to identify the principle on the basis of which a certain court document is to be erased.

In addition to these two general principles, the provisions of the directive are interesting and confirm consolidated data protection principles. Suffice it to mention here the prohibition on measures based solely on automated processing of personal data that significantly affect or produce an adverse legal effect for the data subject,§ as well as the implementation of

∗See DPI, Explanatory Memorandum (SEC(2012) 72 final). †Art. 4, DPI and Art. 4b, Directive 2016/680 of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data by competent authorities for the purposes of prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, and on the free movement of such data, available at: http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016L0680&from=EN. ‡Art. 5, DPI. §Art. 9a, DPI.
data protection by design and by default mechanisms to ensure the protection of the data subject's rights and the processing of only those personal data that are necessary.∗ Furthermore, the directive entails the obligation to designate a data protection officer in all law-enforcement agencies to monitor the implementation and application of the policies on the protection of personal data.†

These principles constitute a significant limitation on possible data mining of personal and sensitive data collected by law enforcement agencies. If it is true that most of these provisions were also present in Recommendation No. R (87) of the Council of Europe and in Framework Decision 2008/977/JHA, it is also true that propelling data protection by design and by default mechanisms and measures could encourage data anonymization and help avoid the indiscriminate use of automated processing of personal data.

References

[1] ITU. 2015. Recommendation Y.3600: Big data—Cloud computing based requirements and capabilities. http://www.itu.int/itu-t/recommendations/rec.aspx?rec=12584 (accessed July 23, 2016).

[2] ENISA. 2015. Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics. https://www.enisa.europa.eu/publications/big-data-protection/at_download/fullReport (accessed June 15, 2016).

[3] Bollier, D. 2010. The promise and perils of big data. Aspen Institute, Communications and Society Program. http://www.aspeninstitute.org/sites/default/files/content/docs/pubs/The_Promise_and_Peril_of_Big_Data.pdf (accessed February 27, 2014).

[4] Paparrizos, J., White, R.W., and Horvitz, E. 2016. Screening for pancreatic adenocarcinoma using signals from web search logs: Feasibility study and results. Journal of Oncology Practice 12(8): 737–744.

[5] Golle, P. 2006. Revisiting the uniqueness of simple demographics in the US population. In Juels, A.
(Ed.), Proceedings of the 5th ACM Workshop on Privacy in the Electronic Society. New York: ACM.

∗Art. 19, DPI. †Art. 30, DPI.
[6] Narayanan, A., and Felten, E.W. 2014. No silver bullet: De-identification still doesn't work. http://randomwalker.info/publications/no-silver-bullet-de-identification.pdf (accessed March 25, 2015).

[7] Narayanan, A., Huey, J., and Felten, E.W. 2015. A precautionary approach to big data privacy. http://randomwalker.info/publications/precautionary.pdf (accessed April 4, 2015).

[8] Ohm, P. 2010. Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA Law Review 57: 1701–1777.

[9] Sweeney, L. 2000. Foundations of privacy protection from a computer science perspective. Proceedings of the Joint Statistical Meeting, AAAS, Indianapolis, IN. http://dataprivacylab.org/projects/disclosurecontrol/paper1.pdf (accessed January 24, 2015).

[10] Sweeney, L. 2000. Simple demographics often identify people uniquely. Pittsburgh, PA: Carnegie Mellon University, Data Privacy Working Paper 3. http://dataprivacylab.org/projects/identifiability/paper1.pdf (accessed January 24, 2015).

[11] Sweeney, L. 2015. Only you, your doctor, and many others may know. Technology Science, September 29. http://techscience.org/a/2015092903 (accessed November 28, 2015).

[12] United States General Accounting Office. 2011. Record linkage and privacy: Issues in creating new federal research and statistical information. http://www.gao.gov/assets/210/201699.pdf (accessed December 14, 2013).

[13] Mantelero, A. 2015. Data protection, e-ticketing and intelligent systems for public transport. International Data Privacy Law 5(4): 309–320.

[14] The White House. 2012. Consumer data privacy in a networked world: A framework for protecting privacy and promoting innovation in the global digital economy. http://www.whitehouse.gov/sites/default/files/privacy-final.pdf (accessed June 25, 2014).
[15] The White House, Executive Office of the President. 2014. Big data: Seizing opportunities, preserving values. http://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf (accessed December 26, 2014).

[16] Mayer-Schönberger, V. 1997. Generational development of data protection in Europe? In Agre, P.E., and Rotenberg, M. (Eds.), Technology and Privacy: The New Landscape. Cambridge, MA: MIT Press.
[17] Article 29 Data Protection Working Party. 2011. Opinion 15/2011 on the definition of consent. http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2011/wp187_en.pdf (accessed February 27, 2014).

[18] Article 29 Data Protection Working Party. 2014. Opinion 06/2014 on the notion of legitimate interests of the data controller under Article 7 of Directive 95/46/EC. http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp217_en.pdf (accessed February 27, 2014).

[19] Brownsword, R. 2009. Consent in data protection law: Privacy, fair processing and confidentiality. In Gutwirth, S., Poullet, Y., De Hert, P., de Terwangne, C., and Nouwt, S. (Eds.), Reinventing Data Protection? Dordrecht, the Netherlands: Springer.

[20] European Commission, Directorate-General Justice, Freedom and Security. 2010. Comparative study on different approaches to new privacy challenges, in particular in the light of technological developments. Working Paper No. 2: Data protection laws in the EU: The difficulties in meeting challenges posed by global social and technical developments. http://ec.europa.eu/justice/policies/privacy/docs/studies/new_privacy_challenges/final_report_working_paper_2_en.pdf (accessed July 5, 2014).

[21] Van Alsenoy, B., Kosta, E., and Dumortier, J. 2014. Privacy notices versus informational self-determination: Minding the gap. International Review of Law, Computers & Technology 28(2): 185–203.

[22] Cranor, L.F. 2012. Necessary but not sufficient: Standardized mechanisms for privacy and choice. Journal on Telecommunications and High Technology Law 10: 273–307.

[23] Richards, N.M., and King, J.H. 2014. Big data ethics. Wake Forest Law Review 49: 393–432.

[24] Moerel, L. 2014. Big Data Protection: How to Make the Draft EU Regulation on Data Protection Future Proof. Tilburg, the Netherlands: Tilburg University.
http://www.debrauw.com/wp-content/uploads/NEWS%20-%20PUBLICATIONS/Moerel_oratie.pdf (accessed October 15, 2016).

[25] Rubinstein, I.S. 2013. Big data: The end of privacy or a new beginning? International Data Privacy Law 3(2): 74–87.

[26] Henkin, L. 1974. Privacy and autonomy. Columbia Law Review 74(8): 1419–1433.

[27] Murphy, R.S. 1996. Property rights in personal information: An economic defense of privacy. Georgetown Law Journal 84: 2381.

[28] Parent, W.A. 1983. A new definition of privacy for the law. Law & Philosophy 2(3): 305–338.
[29] Wacks, R. 1980. The poverty of "privacy." Law Quarterly Review 96: 73–78.

[30] Wacks, R. 1980. The Protection of Privacy. London: Sweet & Maxwell.

[31] Zimmerman, D.L. 1983. Requiem for a heavyweight: A farewell to Warren and Brandeis's privacy tort. Cornell Law Review 68(3): 291–367.

[32] Costa, L., and Poullet, Y. 2012. Privacy and the regulation of 2012. Computer Law & Security Review 28(3): 254–262.

[33] Secretary's Advisory Committee on Automated Personal Data Systems. 1973. Records, computers and the rights of citizens. http://epic.org/privacy/hew1973report/ (accessed February 27, 2014).

[34] Schudson, M. 1978. Discovering the News: A Social History of American Newspapers. New York: Basic Books.

[35] Breckenridge, A.C. 1970. The Right to Privacy. Lincoln, NE: University of Nebraska Press.

[36] Solove, D.J. 2008. Understanding Privacy. Cambridge, MA: Harvard University Press.

[37] Westin, A.F. 1970. Privacy and Freedom. New York: Atheneum.

[38] Schwartz, P.M. 2013. The EU-US privacy collision: A turn to institutions and procedures. Harvard Law Review 126: 1966–2009.

[39] Brenton, M. 1964. The Privacy Invaders. New York: Coward-McCann.

[40] Miller, A.R. 1971. The Assault on Privacy: Computers, Data Banks, Dossiers. Ann Arbor, MI: University of Michigan Press.

[41] Packard, V. 1964. The Naked Society. New York: David McKay.

[42] Bennett, C.J. 1992. Regulating Privacy: Data Protection and Public Policy in Europe and the United States. Ithaca, NY: Cornell University Press.

[43] Agre, P.E., and Rotenberg, M. (Eds.). 1997. Technology and Privacy: The New Landscape. Cambridge, MA: MIT Press.

[44] Bygrave, L.A. 2014. Data Privacy Law: An International Perspective. Oxford, UK: Oxford University Press.

[45] Petrison, L.A., Blattberg, R.C., and Wang, P. 1997. Database marketing: Past, present, and future. Journal of Direct Marketing 11(4): 109–125.
  • 55. Exploring the Variety of Random Documents with Different Content
  • 56. only two villas on the Terrace, and they pertained variously to a Paris specialist in madness, and the controller-general of a great French bank. Between the two villas lay a large and valuable plot of ground, overgrown and tangled up with creepers, brambles, cabbage stalks, rose bushes, and seeding onions, set in the midst of which was a dilapidated one-room hut. The hut was the fly in the ointment of the specialist in lunacy and the controller-general. They could do nothing to remove this picturesque slum from their gates, for old veuve Michel, who lived there and drank two bottles of cognac a day and sang gay ribald songs by night, owned the land by right of some old French statute, and no one could turn her out for as long as she lived. Haidee and Bran considered veuve Michel a very charming person indeed. She was fat and merry and gentle, called them her nice little hens and gave them apples and pears (for she also owned an orchard up the cliff) all through the winter when there was no fruit to be got any nearer than Cherbourg. Naturally they liked and appreciated the old woman. Haidee had a good mind to go in and pay her a visit, but she decided it was better not, as old veuve would just be sleeping off her morning bottle of cognac preparatory to starting on the afternoon one; also Haidee remembered that she was hungry, and had better hurry back and help get lunch. Still she could not help stopping once or twice to examine for signs of little pink tips the lower branches of the tamarisk-trees which grew on one side of the Terrace--on the other side was the grey stone river wall with the tide lapping blue against it. Haidee loved tamarisks with a joy that she was sure was unholy because they looked so wicked and painted somehow when they were all dressed out in their pink feathers. She fancied that Jezebel
  • 57. must have had a bunch of them stuck like an aigrette in her beautifully coiffée hair, and the same pink tint on her cheeks when she looked out of the window for the last time. Anyway why were tamarisks the only trees to be found growing in the ruins of Babylon? And why had she read somewhere, that in the days of ancient Rome tamarisks were bound around the heads of criminals? It was a nuisance to have to forsake these interesting meditations to enter the little soap-scented shop of the village barber, but she stayed no longer than to bid him come to the Villa at three o'clock to cut off Madame's hair. Next she called at Lemonier's to command a sack of coal, and noted that Lemonier had evidently been drunk again, for Madame had a black eye. It was funny to think that such a jolly big red man should be so cruel! Haidee meditated on this subject on the return journey, also on the horrible price of coal-- sixty-five francs a ton and it disappeared like lightning. No one seemed to know why "Carr-diff," as they called it, should be so dear. Hortense, closely questioned on the subject by Val anxious for information, said that it must be because the people in England hated the French and were still angry that Normandy did not belong to them. "Well, have n't you got any coal mines in your own blessed country?" asked Haidee. "Certainement!" Hortense had replied indignantly. "We have Newcas-sel!" ――――
  • 58. The barber arrived at three o'clock, and Val sat trembling before her dressing-table. She had arranged two mirrors so that she could view the whole proceeding, but as soon as the barber commenced she closed her eyes tight. Bran and Haidee stationed themselves at either side of the table to see fair play. The barber was frankly amazed at the decision of Madame to cut off her feathery hair. Even at the last moment he asked--holding it up in his hands and shaking it out in sprays: "Does Madame realise what a change it will make in her appearance? Would it not be better if Madame had it merely cut short, leaving about two inches all round à la Jeanne d'Arc, so--?" He stuck his little pudgy fingers out below her ears to show the desirable length. "No, no, no!" cried Val, without opening her eyes. "Does he think I want to look like a pony with my mane hogged! Cut it off close, it must grow long and thick as it used to do. Tell him, Haidee." Haidee told him as much as it was good for him to know--no mention of ponies. "Bon!" said Monsieur le Barbier agreeably, but he looked doubtful, thinking to himself that hair seldom grew much after the age of thirty, and the lady looked well that. When one side was gone Val opened her eyes and gave a deep cry. If it could have been replaced then, she would have abandoned her idea and made the best of what she had. As it was she closed her eyes again, but during the rest of the operation great tears rolled down her face upon her tightly clasped hands. And when all was over the children were swept from the room and she locked herself in with her heart's bitterness. Even Bran was not permitted to comfort her.
  • 59. It is true that nothing makes a greater difference to the appearance of a woman than to cut off her hair. The tale of every sin she has committed and every sorrow she has suffered seems to be written bare and unsheltered upon her face for all the world to read. What subtle alleviation there is in a frame of hair round the face of a sinner it is hard to say: but it is a problem whether Mary Magdalene, with all her shining story of repentance would have appealed to the love and chivalry of the world in quite the same way if she had been handed down through the ages without her wondrous hair. When Valentine Valdana looked in the glass at her pale, oval face with no darkness above it to soften the fine lines of her temples, faintly hollowed cheeks, and sombre eyes whose defect appeared to have become suddenly accentuated, she longed in shame and dismay for a mask. It seemed to her that she had indecently exposed her sorrows to the world; that exile, misery, and all the failures of her life were plainly written for even the most unintelligent eye to read. A curious sense too of having done something disloyal to others in revealing her unhappiness crept into her mind for an instant, but she made haste to dismiss it, and would not even specify the vague "others" to herself. None knew better than she the power of a beloved hand to strike deepest, to hollow out cheeks, sharpen temples, and put shadows into eyes: but she would never have admitted it. Hers was no accusing heart. She blamed nobody but herself for her failures--not even the Fate that had bestowed on her that double nature of artist and lover which rarely if ever makes for happiness. She only felt the despair of the convict and almost wished herself one, so that she might hide in a cell. At length she sought her gay scarf of asphodel-blue and
• 60. arranged it over her head like a nun's veil. It was thus that she presented herself to the children in the kindly dusk. Supper already stood upon the table. Haidee displayed unusual tact, but Bran was full of curiosity.

"Are you always going to wear that wale tied on you?" he inquired.

"Until my hair grows long again," said poor Val, biting her lip painfully.

"Sleep in it too?"

Val nodded, and Haidee made haste to help Bran to pommes frites which he loved.

Next morning, Bran waking up and throwing out an arm for his matutinal hug, encountered something strange to his touch: something round, bumpy, and slightly scrubby, very different to the soft nest he was used to dabble his hand in as soon as he woke. The blue scarf had slipped down while Val slept and her shorn head lay cruelly outlined upon the pillow. Bran knelt up and considered her in consternation mingled with pity, then finding himself in the attitude of prayer, mechanically crossed himself and murmured his morning orison, his eyes still fixed on his mother's head:

"Jesus, Mary, and Joseph, I give you my heart, take it please, and preserve it from sin.

"Jesus, Mary, and Joseph, I give you my soul and my life.

"Jesus, Mary, and Joseph, help me in my last agony.

"Jesus, Mary, and Joseph, grant that I live and die in thy holy company. Amen."

Immediately afterwards humour, that Irish vice, overcame all gentler feelings; like a certain famous Bishop of Down, Bran would lose a friend for a joke. He woke Val with a cruel jest:
• 61. "Bon jour, Monsieur le Curé!"

The curé of Mascaret was a Breton as rugged as his country, with haggard spiritual eyes and an upper lip you could build a fort on, as the saying is; he intensified his uncomeliness by wearing his hair so close-shaved that it was impossible to say where his tonsure began or ended. To be told by her loving but candid son that she resembled this good man was a cruel thrust to Val, and the memory of it darkened life for many days to come. She wrapped herself in gloom and the blue veil, and nothing more was heard of the fez cap and cigarettes except that in good time the Stores forwarded them and the French Customs taxed them. After once trying on the fez and finding herself the image of a sallow and melancholy Turk, she had cast it from her. Her one instinct was to hide her ugliness from every one. Even at the sight of John the Baptist she would fly and hide, and she never left the house except after dark, when for exercise she would sometimes race Haidee up and down the digue, or run along the beach at midnight, her scarf floating behind her in the wind, and her head bare to give her "roots" a chance. These proceedings gravely annoyed the Customs officers distributed in the little straw-littered watch-huts that line the Normandy coast. Instead of tucking themselves in their blankets for a peaceful night, they were obliged to keep awake for fear the mad American woman meant either to commit suicide or meet a boat full of brandy and cigars from Jersey.

CHAPTER XIV
• 62. THE WAYS OF LITERATURE

"The voyage of even the best ship is a zigzag line of a hundred tacks."

From Jersey Val had made a bee-line for Paris which she knew well, and where she had hopes of renewing her mental energy by the sights and sounds of a great city and association with other brain workers. Autumn removals were in full swing and there was no great difficulty in finding house-room for herself and the children, though she was unprepared to find how Paris rents had risen since the days when she and her mother sojourned in the Latin Quarter. It was to that part of Paris she naturally turned--the only possible part for artists and writers to live, though the rich and empty-headed are fond of calling it the "wrong side" of the river. A studio seemed the most suitable form of residence, for she knew she would not be able to work in a small room, and she hated the sordid construction of a cheap flat. She was fortunate in finding a good atelier in a little secluded rue on the confines of the Quarter--a big, high room, with kitchen and small bedroom attached, looking out onto a little square yard with clusters of shrubs, ivied walls, and a few old battered statues that lent a picturesque air. Here she had settled down and with resolute energy begun the series of "Wanderfoot" articles for which Branker Preston had obtained a commission. It was an arduous task. No matter how much material is stored in the mind it is not easy to import the air and colour of far-off lands into a Paris atelier. The art of putting things down had not yet been recaptured either. Still, the stimulus of even the short journey from Jersey to Paris had done something for her, and though to her critical eye the
• 63. articles she achieved seemed but pale echoes of her former work, they at least paid the rent and kept things going in rue Campagne Première. The continuation of Haidee's education became a problem needing instant attention; for Val very soon realised that the Latin Quarter with its liberal ideas of morality and its fascinating students was no place for a young impressionable girl. Her own child she would have allowed to stay, for she knew that anything with her nature would come to no harm among these careless, attractive people, to whom she felt herself blood-kin. But Haidee, the child of a pretty flighty mother, was of different stock. Besides, there was a responsibility to Westenra in the matter. There were no convents left in Paris, or indeed, in France. All those lovely homes where girls learned a sweet sedateness and many beautiful arts had been closed by a ruthless government. No more in France may the gentle coifed women impart composure and beauty of mind to English and American girls and train the aristocratic children of France to a love of Church and Country. What the loss is to the sum of the world's harmony can never be computed, but American and English mothers have a slight realisation of it. It was in Belgium that Val at last found what was needed for Haidee--a little community of French nuns who, refusing to unveil, had been obliged to flee over the border, and there had founded a convent to which many good Catholics in Paris sent their children. It was well within Val's means too, for the living is cheap in Belgium, and the fare in the convent was simple though good. Haidee hated terribly to go, but Val was firm, though she held out the promise of early liberation if Haidee would work well at French and try and pass her brevet simple. This was no difficult task, for the girl had been
  • 64. well grounded in French during their sojourn in Jersey. Remained the problem of Bran--and little children are a problem in France to parents of limited means. No one caters for them as in other countries. No one even understands the art of teaching and amusing them at the same time, nor even how to feed them. There are no kindergartens and no milk puddings! Small wonder that French babies are small and sallow and sad! Since the nuns were driven out there are only the public Lycées where strong and weak, rough and gentle, are jumbled together with results that no thinking woman would welcome for her child. From their tenderest years French children are crammed with lessons, pushed ahead to pass exams, while the business of play so necessary for little children is almost entirely suppressed. Val very certainly had no intention of confiding her son to such institutions. She was therefore obliged to hire a daily governess for him, for though, at his age, he needed little teaching, he had to be sent out of doors so that she might have silence and solitude wherein to work. Even this was a costly business. In England a nursery governess can be afforded by almost every one, but in France it costs one hundred francs a month to have your child well taken care of and taught his alphabet for a few hours a day. Val did not grudge it, but what worried her was that Bran did not thrive. Paris was no place for him. The Luxembourg Gardens make a good play-ground for city-bred children, but Bran was Val's own child in his need of air and space and horizon. His bloom faded a little, and he began to look very fair and spiritual. Also his love of the picture and statue galleries seemed to his mother something too wistful and wonderful in a small boy, and brought tears to her pillow
• 65. in the silence of many a night. Then she took him to Belgium for awhile and left him with Haidee and the good nuns. He was a shy creature, though he hated any one to know it, and believed he hid his secret well behind a set smile and little hardy incomprehensible sayings. When the nuns clustered round him calling him their "little Jesus," a favourite name in France for a pretty child, he disdained to shelter behind Val's skirts, as instinct bade him, but nothing could be got out of him except an enigmatic saying he always kept for strangers:

"The cat says bow-wow-wow, and
The dog says meow, meow, meow."

All the while he smiled his little bright smile and his eyes roving keenly noted every detail of the pale æsthetic faces. Even the tears in the Reverend Mother's eyes did not escape him. Afterward he said to Val:

"I like that one with the floating eyes. I think she wishes she had a nice little boy like me. Her voice was littler than a pin's head when she called me her petit Jesu. But why do they nearly all have green teeth?"

When Val kissed him farewell it nearly broke her heart to see the brave smile he maintained, though Haidee was sniffling and snuffling at his elbow, partly with momentary grief but mostly with indignation at being, as she rudely phrased it:

"Shut up in a convent with a lot of old pussycats."
• 66. Back in Paris the studio seemed desolate and empty. Bran had become so much a part of his mother's being and life that without him she was like a bird from whom a wing had been torn. A month later Haidee wrote:

"I think Bran is fretting. Whenever I speak to him he puts that little fixed grin on his mouth, but you should see his eyes."

Within an hour Val was in the Brussels express speeding for that dear sight. On the journey back to Paris, happy now and healed of her broken wing, she heard all the history of his lonely nights and the "purply-red pain" that he got in his stomach when he thought of her. Cuddled to her side he wept as he had never wept whilst separated from her, and Val's tears ran down her face too while she listened, registering a vow that she would never part with him again. So once more he went out with a governess and came home to his mother full of original criticisms of Moreau's pictures and the statues of Rodin, until one morning nearly two years after their arrival in Paris, and just when Haidee had arrived for the summer holidays, Val rose up from her bed with the itch for travel in her feet, and the longing quickly communicated to the children for the sight of a clear horizon. They tore their possessions from the walls, stuffed them into trunks, and shook the dust of Paris from their feet.

"Let's go to Italy and live on olives and spaghetti," was Haidee's suggestion, but Bran knew the news of the world. "We might get an earthquake!"

The size of the cheque from Branker Preston, however, was what really decided the affair, limiting them to wandering happily enough in Brittany. But the water and primitive methods of Breton cooks made Val think nervously of typhoid, and after a time she
• 67. headed for Normandy. Normans are cleaner in their household ways than Bretons, of whom they slightingly speak as "les porcs Bretons," declaring that they eat out of holes in the table and never wash the holes. Besides, Normandy in winter is milder than Brittany. So, travelling by highways and byways, they happened at last on Mascaret. It was the tag end of September when they arrived. All the summer visitors were gone and the big silver beach deserted, but summer itself still lingered. They got an entrancing glimpse of the gentle green and gold beauty of the place before the chills of autumn set in. Even then they had been able to bathe and go sailing in the fishing boat of one of père Duval's sons, who was now in his turn lighthouse-keeper of Mascaret. For ten sunny October days, too, they had assisted with all the ardour of novitiates at père Duval's cider making, becoming acquainted with the secrets of cidre bouché, and the grades to be found in cidre ordinaire unto the third and fourth watering. They even sampled the latter as drunk by the fishermen and called for at the cafés by the name of la boisson avec le brulot dedans: which signifies cider very liberally diluted with French cognac. Then the winter closed in on Mascaret with wild gales and high-flowing tides. On Christmas Eve snow came softly down, so that the walk to midnight mass had been like acting in that scene painted by a Dutch painter where the village folk are seen winding their way through the snow, lanterns and hot-water bottles in their hands, to the distant church with windows full of red light. All the winter interests of the simple village had been sampled and shared by Val and the children, and they had been happier there than ever in France. The children loved the freedom of the place and
  • 68. the bonhomie of the French folk so different to English people of that class. The three went about in their red sweaters and lived a life of absolute unconvention. It was a good place to write a masterpiece in--if one were only a master--was Val's ironical thought, and in spite of her self-directed irony, she did achieve during the first months there a wonderful little curtain raiser, which Branker Preston had no difficulty in disposing of to a London manager. It dealt with Boers and Zulus, and had been well received, but unfortunately the play it had preceded in the bill was a failure and the two were withdrawn together before Val could greatly benefit, but it had brought in five guineas a week for six weeks, and this success had put her in heart for further work of the kind. She had sickened of writing "Wanderfoot" articles from a chair. She could by this time have written some very spirited ones on the subject of France in general and Normandy in particular, but she had her reasons for not wishing to attract attention to her whereabouts, as such articles would surely have done. Preston advised her to write a novel, but she knew she had neither the patience to spin a long story through many chapters to its end, nor the gift of character portrayal. What was hers was a sense for situation, colour, and atmosphere, and it occurred to her that the best vehicle for a display of these qualities was the theatre. Her first little venture had attracted the attention of several managers, and one of them told Preston that he was ready to consider a three-act play by her. It was this play she was busy upon now. But it was sometimes hard to transport the atmosphere of far-away tropical Natal into a little wooden villa facing the English Channel, with a wild spring gale tearing at the windows, and the rollers booming like cannon on the
• 69. Barleville beach--for the promise of summer had gone as swiftly as it came, and the spring tides were flooding up the river flinging great walls of spray over the digue and splashing three feet deep across the Terrasse, right to the steps of the Hotel de la Mer, so that the journey to the village had to be made by a path up the cliff. Val found that the only way to ignore Normandy and the bleak mists of La Manche was to sit over a chaufferette full of bright red embers of charcoal, letting the heat steal up her skirts and enveloping her whole person from the soles of her feet to her scalp in a lovely glow. Immediately she would begin to write things full of the tropical languor of Africa. In her brain palms waved, little pot-bellied Kaffirs rolled in the hot dirt, sunshine blazed over a blue and green land, the air was filled with the scent of mimosa, and great-limbed Zulus danced in rhythmic lines with chant and stamp and swing of assegai before Cetewayo, the great and cruel king. Unfortunately, a chaufferette is not always an easy thing to manage. Like everything French it has a temperament, and is liable to moods when it will burn and moods when it won't. It is a wooden or tin box, perforated at the top and open at one side to admit an earthenware bowl full of the charcoal which is called charbon de bois--actually calcined morsels of green wood. The baker makes this charbon by sticking green wood branches into his hot oven after he has finished baking his bread, but each baker makes a limited supply only, and will not sell it except to people who buy his bread. Every one uses a chaufferette in Normandy during the winter, and visitors are given one to put their feet on as soon as they enter a house, though sometimes when the host is rich enough to keep a perpetual fire going, a supply of hot bricks is kept in the oven instead.
• 70. Val's chaufferette was of most uncertain temper. Hortense always lit it in the morning, and left it by the writing-table. When Val came to it all that had to be done was to gently insert an old spoon under the little ash heap and lift it all round, when a red hot centre of glowing embers would disclose itself. But sometimes an old nail or piece of "Carr-diff" found its way by accident into the pot, then the charbon would immediately sulk itself into oblivion, or sometimes for no reason at all after being perfectly lighted it would just go out. Ensued a struggle in which Val and Haidee invariably came off second-best. They would take the pot out of its box and stand it on a window-sill with the window drawn low to make a draught; put it on the front door step and, kneeling down, blow on it until fine ash sat thick upon their noses and their eyes were full of tears; build paper bonfires on it; fan it wildly with newspapers. All to no avail! Usually that was the end of work and inspiration for the day. Val declared that she could not think with cold feet. But sometimes old père Duval, compassionate for the mad, would send up his wooden box, large enough for two men to warm their feet on, with a great iron saucepan full of glowing charbon inside, and Val would sit toasting over it and write things of a tropical languor extraordinary. Haidee had passed her brevet simple, an exam about equal to the English Oxford Junior and the American 6th standard, and was now working for the brevet supérieure with a French woman who had been a governess before she married a retired commercial traveller and settled in Mascaret. The discovery of this good woman was a stroke of luck for Val, though certainly Haidee did not consider it so. However, her lessons only took up four hours a day. For the rest she and Bran idled joyous and care-free through life, climbing
• 71. the cliff, fishing, digging for sand-eels, making long excursions inland, or meeting the fishing boats in the evening when they came in with the day's haul, and all the villagers would be at the port to bargain for fish. Haidee usually haggled for and bought a raie (dog-fish) for the next day's dinner, and Bran would run a stick through its ribald-looking mouth, and carry the slithery monstrous thing home, to be met by scowls from Hortense, who, stolid as she was, hated the sight of a raie, and could not face the business of washing and gutting it without cries of douleur and disgust. "Ah! C'est craintive! C'est affreux!" But meat was too dear for daily consumption, and raie the only fish brought in by the boats throughout the winter months, so it had to be eaten, and some one had to prepare it. And after all, wrestling with raie was one of the jobs for which Hortense was paid three francs a week. It was her business to come in the morning at seven o'clock, make the fires, and deliver "little breakfast" at each bedside; afterwards she swept and made the beds, then disappeared until just before lunch, when she came to perform upon the raie and execute one or two culinary feats that were beyond the scope of Val or Haidee--such as cutting up onions, which neither of them could accomplish without weeping aloud, or putting the chipped potatoes into a pan full of boiling dripping, a business that when conducted by Val made a rain of grease spots all over the kitchen and scalded every one in sight. After washing the midday dishes, and chopping up vegetables for the soup, Hortense would consider her function over for the day, and leave Val and Haidee to grapple as best they might with tea, supper, fires, and the chaufferette. The supper was no very great difficulty, merely a matter of putting the cut vegetables
• 72. into a pot with a large lump of specially prepared and seasoned dripping, and standing said pot on the stove until supper-time, when its contents would be marvellously transformed into soupe à la graisse, a savoury and nourishing broth eaten as an evening meal by every peasant in Normandy. The fires were the greatest nuisance. The stove in the kitchen either became a red-hot furnace and purred like a man-eater, or else went out; and the stove with an open grate in Val's room, which old man Duval had paid a month's rent for and gone all the way to Cherbourg to fetch, had a way of going out also before any one even noticed that it was low; then there would be much scratching with a poker, searching for kindling wood, pouring out of paraffin, sudden happy blazes that nearly took the roof off, and black smuts everywhere. When all was over, and a beautiful fire roaring after the united efforts of the family, Val would find that her chaufferette had gone out! It was hard to even think masterpieces among such distractions, to say nothing of writing them. Tea was easily got. Haidee made the toast on the salad fork, Val buttered it with dripping, Bran laid the table. Then all three sat with their feet on the stove, drinking out of the big coffee bowls, eating every scrap of the delicious smoky toast and licking their fingers afterwards. If Val had written anything funny or dramatic that day she would sometimes read it out to them, but for the most part her instinct was to hide what she wrote. She said she felt as if she had lost something afterwards, and if any one had been even looking at her written sheets they never seemed quite the same to her again--some virtue went out of her work the moment she shared it with any one.
• 73. Usually, after tea she settled down for another struggle with her ideas, and Bran and Haidee went for a prowl on the digue in the hope of adventures. Bran, whose mind was as full of fairies as if he had been born in the wilds of Ireland, was always in hope of meeting a giant or a dwarf, but he had learned not to mention these aspirations to Haidee. Anyway, there was always the village gossip to listen to in the petit port, where the fishing boats anchored and usually the excitement of watching the Quatre Frères come chup--chup--chupping up the river to her moorings. She was a natty and picturesque trawler, with a petrol engine that was the admiration of the village installed in her bowels. Because of this engine she was known as the Chalutier à pétrole, but at Villa Duval she was called by Bran's translation of her name, The Cat's Frères. She never caught anything but raie, and of this despised species far fewer than any of the other boats, but she dashed in and out of the harbour with great slam and needed five men to handle her. There was a legend that the petrol engine frightened the fish away. It was known that the four brothers who owned her were anxious to get rid of her. Every one knew that she cost more than she brought in. But Haidee and Bran shared a fugitive hope that Val's play would make them all so rich that they would be able to acquire her as a pleasure boat. Sometimes strange craft from Granville or a Brittany port would come in for the night, and there was the St. Joseph, a great fishing trawler from Lannion, carrying a master and seven hands, that put in when weather was heavy. Her sails were patched with every colour of the rainbow, her decks were filthy, and her years sat heavy upon her--you could hear her creaking and groaning two miles from shore: but to Haidee and Bran she stood for the true romance! She
• 74. always brought in tons of fish, not only the everlasting raie, but deep-sea fish, and as soon as her arrival was heralded all the village sabots came clipper-clopping down the terrace, shawls clutched round bosoms, the wind flicking bright red spots in old cheeks, every one anxious to pick and choose from the mass of coal-fish, red gurnet, plaice, congers, and mullets that was hooked out of the hold and flung quivering ashore. The big weather-beaten fishermen in their sea-boots bandied jests with the carking old village wives and the girls showered laughter. In the end, the villagers departed with full baskets, and the seamen well content adjourned to the petit café close by for a "cup of coffee with a burn in it" and a good meal.

CHAPTER XV

WAYS SACRED AND SECULAR

"A gentleman makes no noise: a lady is serene."--EMERSON.

In May, the gentle month of May, the weather cleared up again, and green things commenced to sprout and bloom on the cliff above Villa Duval. The country-side began to bloom and blossom as the rose. From the high coast that lies facing the sea, Jersey could be discerned on clear days etched as if in India ink upon the horizon thirteen miles away. Clots of sea-samphire burst into flower, cleverly justifying its name of creste marine by just keeping out of reach of the high tides. The gorse showed dots of yellow amongst its prickles,
  • 75. and little brilliant blue squills stuck up their perky faces and gave out a sweet scent. All along the path to the lighthouse wild thyme came out in springy masses, and the mad Americans often went up that way for the special purpose of lying on it as on a soft, pink silk rug. It seemed to cause them a peculiar kind of joy to put their faces down in it, crying, "Oh! oh! oh!" The garbage-hole across the road in front of Villa Duval which the dustman had been trying for many summers to transform into a building plot by filling it with empty tins and rubbish from the hotel, and which had been an eyesore all the winter, now suddenly became a place of beauty, for a lot of prickly, thistly-looking plants growing among the jam tins burst into a blaze of red and yellow. It turned out that they were poppies that had been keeping themselves secret all through the winter, and the yellow bright gold of "Our Lady's bedstraw." One day Haidee brought home some long, fragile trails of cinquefoil, one of the first spring things, and Val, worn and haggard under her blue veil, pinned it over her heart because she had read in old Elizabethan days that cinquefoil was supposed to be a cure for inflammations and fevers. She quoted to Haidee what an old herbalist had once written of such cures: "Let no man despise them because they are plain and easy: the ways of God are all such." Haidee flushed faintly and retired into awkward silence, shy like most girls of her age at the mention of God. She was going to make her communion the next day with the First Communion candidates, but it was not her first, for that had been made once when she was ill in New York. She was to be confirmed in June when the
• 76. archbishop of a neighbouring parish intended to visit Mascaret and hold a confirmation service. It being Saturday afternoon Hortense as well as Haidee was due at the confessional for the recital of her weekly sins, therefore she bustled over the washing-up, announcing her intention of making a bonne confession, as though the one she usually made was of an inferior brand.

"What are you going to tell?" asked Haidee, drying plates. She knew very well it was forbidden to talk about your confession, but the subject was a curiously fascinating one.

Hortense had a "cupful of sins" for the curé's ear. She had been reading love stories in the Petit Journal (a forbidden paper because it is "against the Church"), telling the cards, and consulting her dream book; also she had missed Vespers twice and several meetings of the "Children of Mary," of which body she was a member. She computed that her pénitence would be as long as her arm.

"He will scold me well, I know," she said cheerfully, "for he saw me talking with Léon Bourget yesterday."

"What! that awful fisherman with the hump?"

"Yes; but he is not a bad fellow, mademoiselle, only all the fishermen here are wicked towards the curé because, as you know, he would not bury the mother of Jean le Petit, and they had to go and get the mayor to do it."

"Yes; but you must remember that she lived with old man le Petit without being married to him, and that is forbidden by the Church. She would not even repent on her death-bed and receive the Blessed Sacrament. How could the curé bury her after that?"
• 77. Haidee knew all about the little scandal, for the storm it occasioned had raged all the winter about the curé's head. The same day he had refused to bury mère le Petit he was obliged to go to Paris on Church business. On his return in the dusk of a December evening he was met at the station by all the fishermen in the village partially disguised in home-made masks, each carrying some instrument or implement with which to make hideous sounds; pots, pans, old trays, sheep-bells, and cow-horns had all been pressed into service, and the din was truly fearsome. The curé preserving his serenity was conducted to his presbytery by this scratch band, and on every dark night thereafter it had serenaded him from the shadows near his house. The blare sometimes continued until the small hours of the morning, keeping not only the unfortunate curé, but the whole village awake. The gendarmes from Barleville, the nearest police-station, had made several midnight raids with the stated intention of capturing the offenders, but their efforts were attended by a lack of success so striking as to suggest a certain amount of sympathy, not to say complicity, on the part of the law. At any rate, the curé's music, or "Mujik de Churie," as it was popularly pronounced, went on gaily, and there had been some kind of unofficial announcement that it would continue until the curé cleared out. Old père Duval opined, however, that the entertainment was likely to cease with the arrival of the first summer visitors, for however vindictive the fishermen were they knew which side their bread was buttered on, and were politic enough not to want to drive away trade by their thrilling "mujik." Having finished drying plates Haidee retired up-stairs to prepare her confession, telling Hortense to be sure and wait for her. She
• 78. proceeded to write her sins down on a piece of paper. In spite of her good French she stammered so much from nervousness when confessing that the curé had arranged this method with her. She always gave him the piece of paper, which he took away to the sacristy while she waited in the confessional. When he had read her paper he came back, conferred penance and a little scolding, then gave her absolution. With the aid of a French Catechism, which had a formula for confession in it, she proceeded to write out her sins, her method being to dive into the book first for a question and then into her soul for a sin that corresponded. Eventually the piece of paper contained the following statement:

"Je ne me suis pas confessée depuis trois semaines; j'ai reçu l'absolution. Je m'accuse:

"De n'avoir pas fait ma prière du matin beaucoup de fois.

"De n'avoir pas fait ma prière du soir plusieurs fois.

"D'avoir manqué aux Vêpres 4 fois.

"D'avoir été distraite dans l'Église 2 fois.

"D'avoir été dissipée dans l'Église 2 fois.

"D'avoir désobéi à ma mère 2 fois.

"D'avoir manqué de respect envers elle 1 fois.

"De m'être disputée avec mon frère 2 fois.

"D'avoir fait des petits mensonges 4 fois.

"Je m'accuse de tous ces péchés et de ceux dont je ne me souviens pas.

"Je demande pardon à Dieu et à vous, mon père, la pénitence et l'absolution selon que vous m'en jugerez digne."
Whether this list of offences truly represented the burden of her transgressions for the past three weeks it would be hard to say. It is possible that Val could have made out a longer and more comprehensive one for her, as she often threatened to do when Haidee vexed her. Anyway, the latter folded up her piece of paper with a complacency that either betokened a clear conscience or a heart hardened in crime. She computed that her penance would be to recite a decade of the rosary, and she knew that the curé would then speak of the next Church feast, and of the wishes preferred by the Sacred Heart and the Blessed Virgin, tell her to invoke the aid of the Saints when she felt herself tempted to sin, to try always to give a good example to her little brother, and to be very pious so that her mother would be converted and become a Catholic.

Both Val and Haidee had long since given up explaining that they were not mother and daughter. They found that it saved time and a lot of questions just to let people think what they liked.

Putting on her hat Haidee now popped her head out of the window and gave a hoot to Hortense, who was below in the yard cleaning her boots on the garden seat. Just as they were about to start Val came down-stairs and begged Haidee to go to the butcher's shop on her way back, and bring home something for Sunday's dinner.

"What kind of something?" asked Haidee belligerently, for the butcher's shop had no allure for her.

There ensued a discussion as to which was the most economical meat to get. Hortense, waiting at the bottom of the steps, piped in with the announcement that every one ought to eat lamb on First Communion Sunday. Val and Haidee looked at each other. Vaguely they knew that the price of lamb was
high. But suddenly it came into Val's mind how sick the children must be of raie, and stewed veal, and that though funds were low the play was nearly finished. They would have a nice English dinner for once. Roast lamb and mint sauce! She gave Haidee her last louis to change.

"Pick some mint from the cliff-side as you come back," she enjoined.

French peasants have no use for mint in their cooking. Some English visitors had once planted a root of it in père Duval's garden, but after they were gone he flung it out again on to the cliff-side, where it had increased and multiplied until it was now a large bed.

In the butcher's shop Haidee found a number of villagers squabbling over beef-bones, and not a sign of lamb anywhere. The truth was that every portion of the one lamb killed early in the week had been sold, and though there were still one or two customers in need of First Communion lamb, Mother Durand knew better than to offer any of the freshly-killed beast that hung in the back shed. Peasants are well aware that freshly-killed meat should not be cut too early or it will be full of air, soft, flabby, and never tender. Mother Durand, under her calm exterior, was furiously angry with her man for having delayed the killing until now--after to-day there would be no demand for anything but beef-bones and veal until the summer visitors began to arrive. The young American mademoiselle asking guilelessly for lamb was a godsend. Waiting until the last villager had gone from the shop so that there would be no adverse comment on what she meant to do, she turned ingratiatingly to Haidee.

"But certainly, mademoiselle ... there is none in the shop ... but outside I have a lamb that is superbe ... just the thing for a première
communion ... it is not for every one I would cut that lamb, but for such customers as you and your belle maman there is nothing I would not do."

She returned presently from the back shed. "There, mademoiselle--a beautiful shoulder. Six francs."

Haidee was horrified at the price. Their dinner meat usually cost about one franc twenty, and she knew that there was much to be accomplished with Val's last twenty-franc piece.

"Could n't you give me a smaller one, Madame Durand? ... and not so dear?"

"Ah, mademoiselle, you should have said to me before that you wanted it small. It is cut now ... and what would I do with the pieces from it? Do you think I could sell them? But no."

So Haidee took the shoulder, and returned home with it tucked under her arm. On arrival, as it happened, old veuve Michel was in the kitchen with Val, having just brought home some odds and ends of family washing.

"What!" she cried, on seeing the lamb. "A shoulder of freshly-killed lamb, full of air and bubbles ... cut off the poor nice lamb while it had yet the hot life in it! Shame on the wretched woman Durand ... to take advantage thus of poor innocent Americans! ... Shame! But then every one knows how she treated her poor daughter who wanted to be a nun. Madame, the stones in the street are not more wicked than that woman Amélie Durand!"

Val, much disturbed by these sayings, examined the shoulder of mutton. Certainly it was very bubbly looking: warm too. She remembered now hearing the cook in New York storm over a piece of freshly-killed meat, declaring that it had been cut too soon and was not fit to eat.
"How ought I to cook it to make the best of it?" she inquired in dismay.

"Cook it!" cried Widow Michel, scarlet in the face from indignation combined with the effects of her afternoon bottle of cognac. "No good to cook it. Better to pluck a rock from the cliff-side and cook it."

"How much was it, Haidee?"

"Six francs."

"Mon Dieu! What imposition! Take it back, Haidee dear, and tell her that it is too dear and too fresh ... she must give us a pound of steak instead. We are too poor to buy meat we can't eat, you know, darling. Six francs! Did you pay for it?"

"Why, yes, of course I paid for it. You know I had the louis. Oh! blow Val, I don't care much about taking it back."

"But, Haidee, what's the use of talking like that ... we can't eat that bubbly lamb ... think of poor Brannie without dinner! I 'd go myself if I had any hair.... Tell her it 's ridiculous to have given you such meat. I remember now Hortense said that leg we had at Christmas and could n't eat was too freshly-killed--it was soft and tough at the same time, and all slithery when you tried to cut it. Don't you remember--it made you sick to look at it?"

Yes, Haidee remembered well enough, but she did n't like taking the shoulder back just the same. However, veuve Michel offered the moral support of her company, and she returned to Mother Durand. Half-an-hour later she was back at the Villa, the wretched shoulder of lamb still in her hands.

"She won't take it back. She says it 's a rule of the shop never to take back meat that has once gone out of it."