Spatial Econometrics and Spatial Statistics
SPATIAL ANALYSIS
USING BIG DATA
Methods and Urban
Applications
Edited by
YOSHIKI YAMAGATA
Center for Global Environmental Research
National Institute for Environmental Studies
Tsukuba, Ibaraki, Japan
HAJIME SEYA
Departments of Civil Engineering
Graduate School of Engineering Faculty of Engineering
Kobe University, Kobe, Hyogo, Japan

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2020 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording, or any information
storage and retrieval system, without permission in writing from the publisher. Details on
how to seek permission, further information about the Publisher’s permissions policies
and our arrangements with organizations such as the Copyright Clearance Center and the
Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright
by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and
experience broaden our understanding, changes in research methods, professional
practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in
evaluating and using any information, methods, compounds, or experiments described
herein. In using such information or methods they should be mindful of their own safety
and the safety of others, including parties for whom they have a professional
responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or
editors, assume any liability for any injury and/or damage to persons or property as a
matter of products liability, negligence or otherwise, or from any use or operation of any
methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-813127-5
For information on all Academic Press publications visit our
website at https://www.elsevier.com/books-and-journals
Publisher: Candice Janco
Acquisition Editor: Scott J Bentley
Editorial Project Manager: Redding Morse
Production Project Manager: Debasish Ghosh
Cover Designer: Matthew Limbert
Typeset by TNQ Technologies

Contributors
Toshihiro Hirano
Kanto Gakuin University, Yokohama, Kanagawa, Japan
Daisuke Murakami
The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan
Hajime Seya
Departments of Civil Engineering, Kobe University, Kobe, Hyogo, Japan
Yoshiki Yamagata
Center for Global Environmental Research, National Institute for Environmental Studies,
Tsukuba, Ibaraki, Japan
Takahiro Yoshida

Preface
The world today is experiencing a technological revolution driven by
the internet of things (IoT), big data, and artificial intelligence (AI).
Consequently, interest in using spatial big data for practical purposes has
boomed in recent years. Hundreds of new applications are emerging across a
wide variety of areas, such as weather forecasting, car navigation, and
restaurant and hotel recommendation. New types of data are also becoming
available through IoT applications, and the scope of spatial analysis models
is being drastically enhanced by AI and machine-learning techniques.
In the urban context, these technologies are often associated with another
topic of international interest: smart cities. The implementation of a combination
of these technologies in urban environments could exceed our original
expectations. Managing these systems requires spatial statistical researchers
to take on the responsibility of data scientists: analyzing complex urban spatial
big data, addressing new societal challenges, and contending with real-world
issues like climate change by developing new applications.
The purpose of this book is to provide graduate-level students and urban
researchers alike with the basic theories and methods of spatial statistics
and spatial econometrics necessary for developing new urban analytic
applications utilizing spatial big data. The book therefore begins with the
mathematical formulation of spatial statistical and econometric methods,
complemented with new developments for analyzing spatial big data. In this
part, the fundamental theories are described in an illustrative manner so
that they are useful for applied researchers (Methods). Next, using
programming illustrations in R, we describe how to empirically implement
the methodologies presented in the preceding chapters (Implementations).
Finally, a variety of empirical application examples of the spatial big data
analytical methods are explained, with special emphasis placed on climate
change mitigation issues in the urban spatial planning context
(Applications).
We believe that by combining foundational methods of spatial statistics and
spatial econometrics with practical empirical applications, researchers and
practitioners may gain a better understanding of how spatial big data analysis
can be applied to facilitate evidence-based urban spatial planning. In the
empirical part of this book, the focus is on models exemplifying these
concepts, applied to climate change mitigation solutions in
cities in Japan. The research methods presented in this book are general
enough to be applicable to other types of issues in
other cities, as well as at other spatial scales such as the regional or national
level. We also believe that this book’s contents can
support researchers, practitioners, and students in conducting practical
spatial analyses using big data in their own studies.
Spatial statistics (econometrics) refers to statistical (econometric) analysis
conducted on data with position coordinates, that is, spatial data. Using
the spatial aspects of data in statistical analyses can
improve the reliability of our models and analysis. One of the key concepts
in spatial statistics and spatial econometrics is spatial dependency, or spatial
autocorrelation, as expressed in Tobler’s first law of geography: “Everything is
related to everything else, but near things are more related than distant
things.” Although historically the most popular academic areas for spatial
autocorrelation analysis were, and are, ecology and genetics, the scope of
application for spatial statistics has expanded substantially to include
geography and regional science, medicine and epidemiology (Waller and
Gotway, 2004), criminology (Townsley, 2009), image analysis (Curran and
Atkinson, 1998), remote sensing (Cressie, 2018), mining
(Journel and Huijbregts, 1978), soil science (Goovaerts, 1999), climate
science (Elsner et al., 2011), and hydrology (Ver Hoef et al., 2006),
among others, reinforcing the fact that knowledge is cumulative.
First, let us briefly trace the lineage of spatial autocorrelation analysis. The
first use of spatial autocorrelation analysis is often attributed to John Snow’s
cholera map in the mid-19th century. Snow created a disease map for the
Soho area of London, in which the distribution of cholera patients and the
distribution of water pumps were superimposed onto the city map. In doing
so he found that the cases were spatially concentrated, with cholera
patients clustered around the water pump on Broad Street. This was about 30 years
before the discovery of Vibrio cholerae by Robert Koch. Snow’s analysis,
which acquired useful information by mapping the spatial accumulation and
autocorrelation of cholera, is considered to be the first real case of spatial
data analysis. It would take about 100 years before Moran’s
I statistic (1948, 1950) and Geary’s C statistic (1954) made it possible to
quantitatively evaluate the presence or absence of spatial autocorrelation.
In the 1960s and 1970s, in the field of quantitative geography, spatial
autocorrelation was regarded as one of the most basic and important
problems, leading researchers to develop stronger analytical modeling methods
for spatial data (e.g., Curry, 1966; Cliff and Ord, 1973). Building on these
techniques, the scientific field of spatial econometrics, which examines the
relationships among data in discrete spaces (zones of cities, towns, etc.)
and draws on the tradition of econometric geography, emerged in regional science. However,
spatial econometrics has now become an independent field of
study, with many articles published in mainstream econometrics journals
(Anselin, 2010; Arbia, 2011).
Separately, another form of spatial statistics, geostatistics, arose in the natural
sciences. This academic field grew out of mining science and
treats spatial data as a continuous quantity in space (Matheron, 1963). In
geostatistics, the dependency between data is described as a direct function
of distance. Once a function is identified, the dependency between data at
any location can be expressed using it, which enables spatial prediction of
data at any point. This is a major feature of modeling in geostatistics.
Regardless of field, according to Cressie (1993), spatial data may be
categorized into:
(1) geostatistical data
(2) lattice data
(3) point patterns
The term spatial statistics is sometimes used to represent an academic
discipline dealing with geostatistical data or point patterns. However, spatial
statistics, as a comprehensive system of analysis, is better defined in relation
to all three data types noted by Cressie (1993). Among these, our book deals
only with geostatistical and lattice data. We
have little experience with point patterns; therefore, for more detail we refer
readers to comprehensive texts such as Diggle (2013). Further, several key topics fall
outside our scope, including spatial data acquisition, sampling, handling,
processing, and mining. Our focus is on spatial data analysis (or modeling).
Methods for processing geostatistical data were developed in
geostatistics, whereas lattice data analysis arose from spatial econometrics
(Anselin, 1988). Although there are many similarities between the modeling
techniques of geostatistics and spatial econometrics, Anselin (1986) observes
that “each approach tends to be self-contained, with little cross-reference
shown in published articles.” Their different development histories make it
relatively rare for either field to reference the other’s papers. Compounding
this, different names are used for the same model depending on the field,
which causes confusion and acts as a barrier to entry in both directions.
In fact, texts that cover methods of both spatial econometrics
and geostatistics are limited to a few, for example Haining (1990, 2003) and Chun and Griffith
(2013). Brunsdon and Comber (2018) cover
implementation in R for both fields but offer few theoretical descriptions. Thus, in
this book, drawing on our latest research, we explain the methods employed by
both fields as much as possible while supporting their implementation with
R code.
Moreover, to meet the need for better modeling techniques for spatial
big data, we go beyond reviewing recent efforts and introduce implementation
methods utilizing R. This book consists of three parts: Methods (Chapters
2–6), Implementations (Chapter 7), and Applications (Chapters 8–11). The
content of each chapter is briefly explained as follows.
Chapter 1: Defines spatial data and its two major features, “spatial
autocorrelation” and “spatial heterogeneity.” It provides a solid foundation
upon which all subsequent chapters are built.
Chapter 2: Provides the basic mathematical preparation necessary
for spatial statistical and econometric analysis. We describe the classical
regression model that is the basis of this book, followed by explanations of
two applied regression models, the generalized linear model and the additive model.
Readers who are familiar with regression models may skip this
chapter. In addition, since spatial statistical models in recent years are often
developed theoretically on the basis of Bayesian statistics, Bayesian statistics is also
explained here.
Chapter 3: Introduces measures (test statistics) of whether
spatial autocorrelation exists in data, called global indicators of spatial association,
followed by a review of measures of where spatial
autocorrelation occurs, called local indicators of spatial association (LISA).
The spatial weight matrix, which is an important tool in spatial
econometrics, is also explained.
Chapters 4 and 5: Explain the modeling techniques for geostatistical
data and lattice data, respectively. The latter chapter focuses especially on
spatial econometrics. Applications to spatial big data are also discussed.
Chapter 6: Introduces geographically weighted regression (GWR)
and eigenvector spatial filtering approaches that have been developed in
quantitative geography. Their recent advances, especially in terms of
computation, are also explained.
Chapter 7: Provides implementations with R. We chose R because of its
barrier-free nature (available for free and easy to learn), which is important
for students, as well as the existence of many excellent packages
that include specialized functions for analyzing spatial big data.
As an example of spatial data, we use the well-known housing price data for Lucas
County (Ohio, USA), available through the spData package for R. The data set contains
25,357 observations and is thus medium-sized.
Chapters 8–11 illustrate the application of spatial statistical and
econometric techniques to urban planning issues, focusing especially on climate
change mitigation. Each chapter uses original data, and in this sense
our application chapters do not offer current standard applications.
However, we believe such applications will become more and more
important in the future, for the reasons given above. The details of each chapter are
explained as follows:
Chapter 8: Illustrates spatial modeling that combines multiple spatial data sets.
This kind of topic is becoming important today as an increasing amount of
spatial data in different forms becomes available. We first introduce a geostatistical
approach to estimate temperatures at an intraurban scale by combining
weather monitoring data and geo-tagged tweets relating to heat. Then,
this approach is employed in an empirical study in Tokyo.
Chapter 9: Illustrates two GPS data analyses that quantify the quality of
the walking environment. The first study applies a quantitative geographic
approach (GWR model; see Chapter 6) to investigate local determinants
of a walking environment, focusing on the impact of pedestrian network
structure on the number of pedestrians. The second analysis applies
LISA (see Chapter 3) to quantify the heat-wave risk for pedestrians at an
intraurban scale.
Chapter 10: Applies a spatial econometric approach to a spatially explicit
downscaling of socioeconomic scenarios; the resulting socioeconomic
scenarios on 0.5-degree grids are useful for evaluating regional climate risks
in the future. We first apply the spatial econometric model to project city
population growth under several future scenarios. Then, the result is used in
the downscaling.
Chapter 11: Illustrates an estimation of quasi real-time energy
consumption in each building using Google’s Popular Times data, which record quasi
real-time human locations/activities collected from users of Google Maps on
smartphones. We apply the geostatistical compositional kriging model to the
Popular Times data for a ward in central Tokyo.
Although this book is as self-contained as possible, a basic
(undergraduate-level) knowledge of statistics and econometrics is useful. Therefore,
readers who are not familiar with statistics and econometrics are encouraged
to familiarize themselves with these topics before reading this book.
After reading our book, readers who are interested in spatial statistics and
spatial econometrics may also refer to more advanced textbooks such as
Cressie and Wikle (2011) for spatial statistics, and Kelejian and Piras
(2017) for spatial econometrics.
Yoshiki YAMAGATA and Hajime SEYA
May 1, 2019
References
Anselin, L., 1986. Some further notes on spatial models and regional science. Journal of
Regional Science 26 (4), 799–802.
Anselin, L., 1988. Spatial Econometrics: Methods and Models. Kluwer Academic
Publishers, Dordrecht.
Anselin, L., 2010. Thirty years of spatial econometrics. Papers in Regional Science 89 (1),
3–25.
Arbia, G., 2011. A lustrum of SEA: recent research trends following the creation of the
Spatial Econometrics Association (2007–2011). Spatial Economic Analysis 6 (4),
377–395.
Brunsdon, C., Comber, L., 2018. An Introduction to R for Spatial Analysis and Mapping,
second ed. SAGE Publications Ltd, London.
Chun, Y., Griffith, D.A., 2013. Spatial Statistics and Geostatistics: Theory and Applications
for Geographic Information Science and Technology. SAGE Publications Ltd,
Thousand Oaks.
Cliff, A.D., Ord, J.K., 1973. Spatial Autocorrelation. Pion, London.
Cressie, N.A.C., 1993. Statistics for Spatial Data, Revised Edition. Wiley, New York.
Cressie, N.A.C., Wikle, C.K., 2011. Statistics for Spatio-Temporal Data. Wiley, New York.
Cressie, N.A.C., 2018. Mission CO2ntrol: a statistical scientist’s role in remote sensing of
atmospheric carbon dioxide. Journal of the American Statistical Association 113 (521),
152–168.
Curran, P.J., Atkinson, P.M., 1998. Geostatistics and remote sensing. Progress in Physical
Geography 22 (1), 61–78.
Curry, L., 1966. A note on spatial association. The Professional Geographer 18 (2), 97–99.
Diggle, P.J., 2013. Statistical Analysis of Spatial and Spatio-Temporal Point Patterns, third
ed. CRC Press, Boca Raton, FL.
Elsner, J.B., Hodges, R.E., Jagger, T.H., 2011. Spatial grids for hurricane climate research.
Climate Dynamics 39 (1–2), 21–36.
Geary, R.C., 1954. The contiguity ratio and statistical mapping. The Incorporated
Statistician 5 (3), 115–145.
Goovaerts, P., 1999. Geostatistics in soil science: state-of-the-art and perspectives. Geoderma
89 (1–2), 1–45.
Haining, R., 1990. Spatial Data Analysis in the Social and Environmental Sciences.
Cambridge University Press, Cambridge.
Haining, R., 2003. Spatial Data Analysis: Theory and Practice. Cambridge University Press,
Cambridge.
Journel, A.G., Huijbregts, C.J., 1978. Mining Geostatistics. Academic Press, London.
Kelejian, H.H., Piras, G., 2017. Spatial Econometrics. Academic Press, Cambridge.
Matheron, G., 1963. Principles of geostatistics. Economic Geology 58 (8), 1246–1266.
Moran, P.A.P., 1948. The interpretation of statistical maps. Journal of the Royal Statistical
Society B 10 (2), 243–251.
Moran, P.A.P., 1950. A test for the serial dependence of residuals. Biometrika 37 (1–2),
178–181.
Townsley, M., 2009. Spatial autocorrelation and impacts on criminology. Geographical
Analysis 41 (4), 452–461.
Ver Hoef, J.M., Peterson, E., Theobald, D., 2006. Spatial statistical models that use flow and
stream distance. Environmental and Ecological Statistics 13 (4), 449–464.
Waller, L.A., Gotway, C.A., 2004. Applied Spatial Statistics for Public Health Data. Wiley,
New York.

CHAPTER ONE
Introduction
Yoshiki Yamagata, Hajime Seya
Contents
1.1 The definition of spatial data
1.2 Characteristics of spatial data: spatial autocorrelation and spatial heterogeneity
1.2.1 Spatial autocorrelation
1.2.2 Spatial heterogeneity
References
1.1 The definition of spatial data
Data relating to geospatial information is used in our everyday lives. In
this book, based on the Japanese Basic Act on the Advancement of Utilizing
Geospatial Information promulgated in 2007, we define the term geospatial
information as
(1) information that represents the position of a specific point or area in
geospace (including temporal information pertaining to said informa-
tion, hereinafter referred to as positional information); and/or
(2) any information associated with this information.
We refer to data relating to geospatial information as spatial data.
In addition, the terms geospatial information and geographic information are
taken to have the same meaning. Naturally, there are various other ways
of defining spatial data (e.g., Waller and Gotway, 2004, pp. 38–39).
Currently, the single most important book concerning spatial statistics is,
undoubtedly, Cressie (1993). This great work spans 900 pages and has served
as a “dictionary” in this field for many years. Its first chapter classifies spatial
data into geostatistical data, lattice data, and point patterns.¹ We begin this
section with an outline of these data.
Spatial Analysis Using Big Data. ISBN: 978-0-12-813127-5. https://doi.org/10.1016/B978-0-12-813127-5.00001-1. © 2020 Elsevier Inc. All rights reserved.
Let $\mathbb{R}$ be the set of real numbers, let $s \in \mathbb{R}^d$ be a spatial
position in Euclidean space of dimension $d$ (usually $d = 2$ or $3$),² and let $Y(s)$
be a random (possibly multivariate) quantity at position $s$. The spatial
process³ is defined as $\{Y(s) : s \in D\}$, where $D \subset \mathbb{R}^d$ is the domain. The case $d = 2$
corresponds to two-dimensional spatial coordinates (e.g., x and y planar
coordinates), and $d = 3$ adds the height dimension
(e.g., elevation).
As previously described, spatial data is data with location and attributes.
Here, the term data is frequently used to mean observed values.
In this book, the realization of the spatial process Y(s), namely the observed
value, is expressed as y(s).⁴ Cressie and Wikle (2011) develop a clear
discussion by separating a data model (DM) relating to y(s) from a process
model (PM) relating to Y(s). In this book, DM and PM are distinguished
when needed, and the three types of spatial data are defined as follows.
• Geostatistical data y(s) are the realized values obtained from the
geostatistical process Y(s), where s varies continuously within a fixed domain D.
For example, elevation corresponds to this data type.
• Lattice data y(s) are the realized values from the lattice process Y(s),
where s varies on a countable subdomain of a fixed domain D. For
example, socioeconomic data gathered from areas such as municipalities
and satellite remote sensing image data at the pixel level correspond to
this data type.
• Spatial point pattern data y(s) are the realized values from the spatial
point process, which is a spatial process relating to the positions of
randomly occurring events, where D itself is random. Event data, such as
crime incidents, correspond to this type.
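As a rough illustration of how the three data types differ in what is recorded, the following Python sketch represents each one with a small container. The class names, field names, and all values are hypothetical and chosen only for illustration; they are not taken from this book or from any library.

```python
from dataclasses import dataclass, field


@dataclass
class GeostatisticalSample:
    """Observed value y(s) at a site s that may lie anywhere in a fixed, continuous domain D."""
    coords: tuple  # (x, y) position of the sampling site
    value: float   # e.g., elevation measured at that site


@dataclass
class LatticeUnit:
    """Observed value y(s) for one unit of a countable set of zones or pixels."""
    unit_id: str                                    # e.g., a municipality code (made up here)
    value: float                                    # e.g., population of the unit
    neighbours: list = field(default_factory=list)  # ids of adjacent units


@dataclass
class PointEvent:
    """One event of a point pattern: the location itself is the random outcome."""
    coords: tuple  # (x, y) position where the event occurred


# Hypothetical toy records, one list per data type.
elevation_samples = [GeostatisticalSample((0.2, 1.7), 35.0),
                     GeostatisticalSample((3.1, 0.4), 12.5)]
municipalities = [LatticeUnit("A", 1200.0, ["B"]),
                  LatticeUnit("B", 800.0, ["A"])]
crime_events = [PointEvent((1.0, 1.0)), PointEvent((1.1, 0.9)), PointEvent((4.0, 2.5))]

# For a point pattern, the number of events is part of the data rather than fixed in advance.
print(len(elevation_samples), len(municipalities), len(crime_events))
```

In the first two cases the locations are given and only the attribute value is treated as random, whereas in the third case the locations themselves are what is observed.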
In addition, following Cressie and Wikle (2011, p. 18), we consider a
spatio-temporal process that adds a time axis, written as
$\{Y(s, t) : s \in D, t \in T\}$ when t moves continuously within $T \subset \mathbb{R}$, and as
$\{Y_t(s) : s \in D, t \in T\}$ when it moves discretely.
However, our book basically focuses on spatial processes and data, not
spatio-temporal ones. Now, let the observation points be $s_i$ ($i = 1, \ldots, N$).
In this book, the univariate random variable is expressed as either $Y(s_i)$ or
$Y_i$, and its realized value as $y(s_i)$ or $y_i$.

¹ Note that, among the standard texts, Banerjee et al. (2014) refer to geostatistical data and lattice data as
point-referenced and areal data, respectively, while Schabenberger and Gotway (2005) present lattice data as
lattice/regional data (the other classification names are identical).
² d > 3 is often used in the field of computer experiments.
³ Often also called a random field.
⁴ Arbia (2006) discusses the importance of distinguishing between Y(s) and y(s). However, there seems
to be little awareness of this point in applied research.
1.2 Characteristics of spatial data: spatial
autocorrelation and spatial heterogeneity
According to Anselin (1988), the characteristics of spatial data are spatial
autocorrelation and spatial heterogeneity. The term spatial dependence is also
often used to mean the former. Strictly speaking, autocorrelation and
dependence are not identical concepts. However, in practice, both are often
used to mean autocorrelation, and in this book we treat
them as interchangeable, as in Anselin and Bera (1998, p.
240). In addition, although the term spatial correlation is often used, it is
more appropriate to use autocorrelation rather than correlation for
correlation within the same variable that arises from spatial position
(Getis, 2008).
1.2.1 Spatial autocorrelation
Spatial autocorrelation, as shown in Fig. 1.2.1, is generally classified into
positive spatial autocorrelation, in which neighboring data show similar
values, and negative spatial autocorrelation, in which neighboring data
show notably different values. The former corresponds to Tobler’s (1970) First
Law of Geography: everything is related to everything else, but near things are
more related than distant things. The latter, shown as a checkerboard pattern
in Fig. 1.2.1, is not always easy to explain intuitively; however, when
considering, for example, the spatial distribution of forests or crops, negative
spatial autocorrelation may occur if appropriate thinning is not performed,
due to competition for necessary nutrients. See Griffith and Arbia (2010) for
details concerning negative spatial autocorrelation.

Figure 1.2.1 A representation of spatial autocorrelation. Panels: positive spatial
autocorrelation, random distribution, and negative spatial autocorrelation. Cells are
binary variables taking the values 1 and 0.
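The two non-random patterns in Fig. 1.2.1 can be checked numerically. The sketch below computes Moran's I (the statistic mentioned in the Preface) for two small binary grids. The 4 × 4 grids, the use of rook contiguity (cells sharing an edge are neighbors), and the equal binary weights are all assumptions made for this illustration, not choices stated in the text.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid, assuming rook contiguity with weights of 1."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0  # sum over neighbor pairs of w_ij * (x_i - mean) * (x_j - mean)
    s0 = 0       # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    s0 += 1

    denom = sum(z * z for row in dev for z in row)  # zero for a constant grid
    return (n * cross) / (s0 * denom)


# Hypothetical grids: similar values side by side vs. a checkerboard.
clustered = [[1, 1, 0, 0] for _ in range(4)]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

i_clustered = morans_i(clustered)
i_checkerboard = morans_i(checkerboard)
print(i_clustered, i_checkerboard)
```

Under these assumptions the clustered grid gives a positive value (about 0.67) and the checkerboard gives exactly -1, matching the positive and negative cases described above. A constant grid has zero variance, so the function would raise a division error for it.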
Mathematically, spatial autocorrelation is expressed by the following
moment condition (Anselin and Bera, 1998):
$$\mathrm{Cov}(y_i, y_j) = E(y_i y_j) - E(y_i) \cdot E(y_j) \neq 0, \quad \forall\, i \neq j. \tag{1.1}$$
Here, $y_i$ and $y_j$ denote data at points $s_i \in D$ and $s_j \in D$. Various phenomena
showing spatial autocorrelation exist around us. Let us take the example of
the official land prices (Land Market Price Publication) in Japan, published
by the Ministry of Land, Infrastructure, Transport and Tourism. One
method used by real estate surveyors is transaction comparison, where
land is evaluated by comparing the surrounding transaction prices. As a
result, spatial autocorrelation may occur in appraisal
values (Tsutsumi and Seya, 2009). As another example, in the field of spatial
epidemiology, the visualization and detection of disease clustering is of
particular interest. For example, while it is known that the risk of each
disease occurring depends on the region (because of aspects like region-specific
foods and culture), positive spatial autocorrelation is evident where this
kind of risk is spatially concentrated. Since illness is a type of event (count)
data, a specific method is required to test for spatial concentration (e.g.,
Waller and Gotway, 2004).
Here, we wish to briefly describe the difference between spatial
autocorrelation and temporal autocorrelation. Dependence in a
time series is modeled on the idea that the causal chain between a prior
phenomenon and the phenomenon of interest follows the direction of
time, and that the phenomenon at a given point in time exerts no
influence on those prior to that point in time. Conversely, spatial autocorrelation is
characterized by simultaneous occurrence in multiple directions with
accompanying feedback (Anselin, 2009). Although the details are described later,
this relationship complicates estimation and inference.
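The contrast between one-directional and mutual dependence can be made concrete with a small sketch. Here each dependence structure is written as a 0/1 matrix in which entry (i, j) is 1 when observation i depends directly on observation j. This matrix representation, the function names, and the choice of four observations arranged along a line are assumptions for illustration only.

```python
def temporal_dependence(n):
    """Each period depends only on the period immediately before it."""
    return [[1 if j == i - 1 else 0 for j in range(n)] for i in range(n)]


def spatial_dependence_on_line(n):
    """Each location depends on its neighbors on both sides."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def has_feedback(dep):
    """True if some pair of observations depends on each other in both directions."""
    n = len(dep)
    return any(dep[i][j] and dep[j][i]
               for i in range(n) for j in range(n) if i != j)


time_dep = temporal_dependence(4)
space_dep = spatial_dependence_on_line(4)

print(has_feedback(time_dep))   # False: influence runs forward in time only
print(has_feedback(space_dep))  # True: neighbors influence each other
```

The temporal matrix is strictly lower triangular, so the observations can be ordered and handled one after another. The spatial matrix is symmetric, so every value is determined jointly with its neighbors, which is the feedback that makes estimation harder.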
1.2.2 Spatial heterogeneity
Spatial heterogeneity refers to the uneven distribution of a trait, event, or
relationship across a region (Anselin, 2010). From a statistical point of
view, it means that the model structure (functional form and/or
regression coefficients) is not stable over space. One example is the segmented real estate market
(e.g., Islam and Asami, 2009).
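A toy calculation may help to show what an unstable regression coefficient looks like. The sketch below fits a simple least-squares slope separately in two submarkets and then in the pooled sample. The submarkets, the variable names, and every number are made up for illustration and are not data from this book.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares line of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical submarkets: floor area (x) and price (y) in arbitrary units.
area = [1.0, 2.0, 3.0, 4.0]
price_a = [2.0 * x for x in area]  # submarket A: price rises by 2 per unit of area
price_b = [5.0 * x for x in area]  # submarket B: price rises by 5 per unit of area

slope_a = ols_slope(area, price_a)
slope_b = ols_slope(area, price_b)
slope_pooled = ols_slope(area + area, price_a + price_b)

print(slope_a, slope_b, slope_pooled)
```

With these numbers the pooled slope (3.5) lies between the two submarket slopes and describes neither market correctly, which is the sense in which a single global coefficient can be misleading when the relationship varies over space.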
In a cross section, however, care needs to be taken since there are many
cases in which spatial heterogeneity is indistinguishable from spatial
autocorrelation. For example, when the residuals from a regression analysis
form positive spatial clusters, this can be interpreted as either spatial
heterogeneity (group-level variance heterogeneity) or spatial autocorrelation
(concentration of similar residuals) (Anselin, 2001). Therefore, in spatial
econometrics, it is common to impose structure on the problem through
the specification of a model, coupled with extensive specification testing
for potential departures from the null model (Anselin and Bera, 1998).
However, approaches nonparametrically specifying unknown patterns of
spatial heterogeneity have also been developed (Kelejian and Piras, 2017),
which will also be explained in this book.
References
Anselin, L., 1988. Spatial Econometrics: Methods and Models. Kluwer Academic Publishers,
Dordrecht.
Anselin, L., 2001. Spatial econometrics. In: Baltagi, B. (Ed.), A Companion to Theoretical
Econometrics. Blackwell, Oxford, pp. 310–330.
Anselin, L., 2009. Spatial regression. In: Fotheringham, S., Rogerson, P. (Eds.), The SAGE
Handbook of Spatial Analysis. Sage Publications Inc, Los Angeles, pp. 255–276.
Anselin, L., 2010. Thirty years of spatial econometrics. Papers in Regional Science 89 (1),
3–25.
Anselin, L., Bera, A.K., 1998. Spatial dependence in linear regression models with an
introduction to spatial econometrics. In: Ullah, A., Giles, D.E. (Eds.), Handbook of Applied
Economic Statistics. Marcel Dekker, New York, pp. 237–289.
Arbia, G., 2006. Spatial Econometrics: Statistical Foundations and Applications to Regional
Growth Convergence. Springer, New York.
Banerjee, S., Carlin, B.P., Gelfand, A.E., 2014. Hierarchical Modeling and Analysis for
Spatial Data, second ed. Chapman Hall/CRC, Boca Raton.
Cressie, N.A.C., 1993. Statistics for Spatial Data, Revised Edition. Wiley, New York.
Cressie, N.A.C., Wikle, C.K., 2011. Statistics for Spatio-Temporal Data. Wiley, New York.
Getis, A., 2008. A history of the concept of spatial autocorrelation: a geographer’s
perspective. Geographical Analysis 40 (3), 297–309.
Griffith, D.A., Arbia, G., 2010. Detecting negative spatial autocorrelation in georeferenced
random variables. International Journal of Geographical Information Science 24 (3),
417–437.
Islam, K.S., Asami, Y., 2009. Housing market segmentation: a review. Review of Urban
Regional Development Studies 21 (2–3), 93–109.
Kelejian, H.H., Prucha, I.R., 2007. HAC estimation in a spatial framework. Journal of
Econometrics 140 (1), 131–154.
Schabenberger, O., Gotway, C.A., 2005. Statistical Methods for Spatial Data Analysis.
Chapman Hall/CRC, Boca Raton.
Tobler, W., 1970. A computer movie simulating urban growth in the Detroit region.
Economic Geography 46 (2), 234–240.
Tsutsumi, M., Seya, H., 2009. Hedonic approaches based on spatial econometrics and spatial
statistics: application to evaluation of project benefits. Journal of Geographical Systems 11
(4), 357e380.
Waller, L.A., Gotway, C.A., 2004. Applied Spatial Statistics for Public Health Data. Wiley,
New York.
CHAPTER TWO
Mathematical preparation
Hajime Seya, Yoshiki Yamagata
Contents
2.1 Definitions of notations
2.2 The classical linear regression model
2.2.1 The classical linear regression model and violation of typical assumptions
2.2.2 Endogeneity
2.2.3 Spatial autocorrelation of error term and heteroskedastic variance
2.3 The generalized linear model
2.4 The additive model
2.5 The basics of Bayesian statistics
2.5.1 Bayes’ theorem
2.5.2 The Markov chain Monte Carlo method
2.5.3 Bayesian estimation of the classical linear regression model
References
2.1 Definitions of notations
In this book, scalars are shown in light italics, $a$, and vectors and matrices
are shown in bold, $\mathbf{a}$ (both lowercase and uppercase characters are
used). Moreover, when $\mathbf{a}$ is a column vector and $\mathbf{A}$ is a matrix, $a_i$ is
the $i$th component of $\mathbf{a}$, $\mathbf{a}_{-i}$ is the vector $\mathbf{a}$ with the $i$th component
removed, $\mathbf{A}_i$ is the $i$th row of $\mathbf{A}$, $A_{i,j}$ is the component in row $i$ and column $j$
of $\mathbf{A}$, and $\mathbf{A}_{-i}$ is the matrix $\mathbf{A}$ with the $i$th row removed. Furthermore,
$\mathbf{I}$ denotes an identity matrix, $\mathbf{O}$ a matrix of zeros, $\mathbf{1}$ a column vector
of ones, $\mathbf{0}$ a column vector of zeros, and $\mathbf{A}^{-1}$ and $\mathbf{A}'$ are the inverse
and the transpose of $\mathbf{A}$, respectively. Dimensions of matrices and vectors
are omitted where obvious from the context. Where necessary, however,
they are written as $\mathbf{A}_{[n]}$ for an $n$th-order square matrix and
$\mathbf{A}_{[n \times m]}$ for a matrix with $n$ rows and $m$ columns. In addition, as much as
Spatial Analysis using Big Data
ISBN: 978-0-12-813127-5

possible, mathematical notations will follow the standard form given by
Abadir and Magnus (2002).
2.2 The classical linear regression model
Before explaining spatial statistics and spatial econometrics, we
explain the basis of the classical linear regression (CLR) model. Also, the
generalized linear model (GLM), the additive model, and Bayesian statistics
are introduced here. Particularly, in recent years, many statistical estima-
tions, inferences, and predictions with regression models have been based
on the Bayesian statistical theory, and therefore it is desirable to understand
its basic principles. In fact, Banerjee et al. (2014) (for geostatistics or spatial
statistics) and LeSage and Pace (2009) (for spatial econometrics), two of the
standard texts, have developed theories that are reliant mostly upon
Bayesian statistics.
2.2.1 The classical linear regression model and violation of
typical assumptions
In the CLR model, the following relationship is established for all observed
values $y_i$ at position $s_i$ ($i = 1, \ldots, N$):
$$y_i = \beta_1 + \sum_{k=2}^{K} x_{k,i}\beta_k + \varepsilon_i, \tag{2.2.1}$$
where $y_i$ denotes a dependent variable, $x_{k,i}$ ($k = 2, \ldots, K$) denotes an
exogenous explanatory variable, $\beta_1$ is a constant, $\beta_k$ is the regression
coefficient corresponding to $x_{k,i}$, and $\varepsilon_i$ is the error term. Expressing Eq. (2.2.1) as a
matrix yields the following equation:
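$$
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
= \beta_1 \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
+ \begin{pmatrix} x_{2,1} & \cdots & x_{K,1} \\ \vdots & \ddots & \vdots \\ x_{2,N} & \cdots & x_{K,N} \end{pmatrix}
\begin{pmatrix} \beta_2 \\ \beta_3 \\ \vdots \\ \beta_K \end{pmatrix}
+ \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{pmatrix},
\tag{2.2.2}
$$
or
$$\mathbf{y} = \beta_1 \mathbf{1} + \mathbf{x}\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon},$$
where $\mathbf{y}$ is an $N \times 1$ dependent variable vector containing $y_i$, $\mathbf{1}$ is an $N \times 1$
vector of ones, $\mathbf{x}$ is an $N \times (K-1)$ exogenous explanatory variables
matrix consisting of $x_{k,i}$, $\boldsymbol{\beta}_2$ is a $(K-1) \times 1$ regression coefficient vector
consisting of $\beta_k$, and $\boldsymbol{\varepsilon}$ is an $N \times 1$ error term vector consisting of $\varepsilon_i$. If we
rearrange such that $\mathbf{X} \equiv [\mathbf{1}, \mathbf{x}]$ and $\boldsymbol{\beta} \equiv [\beta_1, \boldsymbol{\beta}_2']'$, we have
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}. \tag{2.2.3}$$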
In the CLR model, the following four assumptions are usually made:
(1) X is exogenous.
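(2) Given $\mathbf{X}$, the conditional expected value of $\mathbf{y}$ is $\mathbf{X}\boldsymbol{\beta}$, and the
conditional expected value of $\boldsymbol{\varepsilon}$ is $\mathbf{0}$. That is, $E[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \mathbf{0}$.
(3) Given $\mathbf{X}$, the error term $\boldsymbol{\varepsilon}$ satisfies:
$$
\operatorname{Var}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \sigma_\varepsilon^2 \mathbf{I}_{[N]} =
\begin{pmatrix}
\sigma_\varepsilon^2 & 0 & \cdots & 0 \\
0 & \sigma_\varepsilon^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_\varepsilon^2
\end{pmatrix}.
\tag{2.2.4}
$$
Eq. (2.2.4) implies independent and identically distributed error terms.
(4) The rank of $\mathbf{X}$ is $K$. That is, the inverse of $\mathbf{X}'\mathbf{X}$ exists.
In addition to these four assumptions, the following assumption is typically made:
(5) $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma_\varepsilon^2 \mathbf{I}_{[N]})$; that is, normally distributed error terms in the population.
This assumption implies
$$\mathbf{y} \sim N\left(\mathbf{X}\boldsymbol{\beta}, \sigma_\varepsilon^2 \mathbf{I}_{[N]}\right). \tag{2.2.5}$$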
With assumption (5), the ordinary least squares (OLS) estimator of b
follows the normal distribution, making it possible to perform a hypothesis
test on its significance when the number of observations in a sample is rather
small (when large, we can apply the central limit theorem).
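The OLS estimator $\hat{\boldsymbol{\beta}}_{ols}$ of the CLR model is obtained by minimizing
the sum of squares $\hat{\boldsymbol{\varepsilon}}_{ols}'\hat{\boldsymbol{\varepsilon}}_{ols}$ of the residual vector:
$$\hat{\boldsymbol{\varepsilon}}_{ols} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}_{ols}. \tag{2.2.6}$$
The first-order condition of optimization yields
$$\mathbf{X}'\mathbf{y} = \mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}}_{ols}. \tag{2.2.7}$$
With assumption (4), $\mathbf{X}'\mathbf{X}$ is invertible, and therefore
the OLS estimator of $\boldsymbol{\beta}$ can be obtained from
$$\hat{\boldsymbol{\beta}}_{ols} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}. \tag{2.2.8}$$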
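Also, the variance of $\hat{\boldsymbol{\beta}}_{ols}$ is given by:
$$\operatorname{Var}\left[\hat{\boldsymbol{\beta}}_{ols}\right] = \sigma_\varepsilon^2 (\mathbf{X}'\mathbf{X})^{-1}. \tag{2.2.9}$$
Since $\sigma_\varepsilon^2$ is usually unknown, we substitute $\sigma_\varepsilon^2$ with the estimator
$$\hat{\sigma}_{\varepsilon,ols}^2 = \hat{\boldsymbol{\varepsilon}}_{ols}'\hat{\boldsymbol{\varepsilon}}_{ols} / (N - K). \tag{2.2.10}$$
Note that the fitted value of $\mathbf{y}$ is given by $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}_{ols}$, and thus
substituting Eq. (2.2.8) into this expression gives $\hat{\mathbf{y}} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$. Here,
$\mathbf{P}_X = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ is a projection matrix that produces $\hat{\mathbf{y}}$ from $\mathbf{y}$, and
for this reason, it is termed a hat matrix. Similarly, $\mathbf{M}_X = \mathbf{I}_{[N]} - \mathbf{P}_X$ is an
operator that creates the residual from $\mathbf{y}$. These operators are also important
in the restricted maximum likelihood (REML) method described elsewhere.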
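The estimator, its variance, and the two projection operators can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming simulated data; the function name `ols_fit`, the sample size, and the true coefficients are illustrative choices, not taken from the book.

```python
import numpy as np

def ols_fit(X, y):
    """OLS coefficients, their estimated covariance, residuals and sigma^2."""
    N, K = X.shape
    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ y)      # Eq. (2.2.8)
    resid = y - X @ beta_hat                      # Eq. (2.2.6)
    sigma2_hat = resid @ resid / (N - K)          # Eq. (2.2.10)
    cov_beta = sigma2_hat * np.linalg.inv(XtX)    # Eq. (2.2.9), sigma^2 estimated
    return beta_hat, cov_beta, resid, sigma2_hat

# Simulated data (assumption: chosen only for illustration).
rng = np.random.default_rng(0)
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=N)

beta_hat, cov_beta, resid, sigma2_hat = ols_fit(X, y)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix P_X
M = np.eye(N) - P                      # residual-making matrix M_X
```

Here `P @ y` reproduces the fitted values and `M @ y` the residuals; the trace of `P` equals the number of columns of `X`.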
Where assumptions (1)-(4) hold, the OLS estimator becomes the best
linear unbiased estimator (BLUE), and this property is known as the
Gauss-Markov theorem. Unfortunately, however, it is rare in empirical
analyses that all these assumptions are satisfied, and in many cases, violations
of assumptions (1)-(3) occur in particular. Therefore, later we explain the
consequences of violating assumptions (1)-(3), and the countermeasures
to them. Note that with respect to assumption (4), it is possible to satisfy
this assumption by removing the explanatory variable(s) that cause perfect
multicollinearity.
2.2.2 Endogeneity
When deviating from assumptions (1) and (2) (i.e., when a correlation
between the explanatory variable and error term occurs), the OLS estimator
lacks both consistency and unbiasedness. Cases where x has a measurement
error, where an important variable is missing from the model (omitted
variable bias), and where x and y are jointly determined (simultaneity/
reverse causality) are regarded as examples of this kind of situation. As a
countermeasure to this problem, the instrumental variable (IV) method is
applied to obtain consistent estimators for the regression coefficients.
The IV(s) must be correlated with the endogenous explanatory variables
conditionally on the other covariates, but must not be correlated with
the error term conditionally on the other covariates. The latter condition
rules out any direct effect of the instruments on the dependent variable or
any effect running through omitted variables. This is called the exclusion
restriction. As its generalization, the two-stage least squares (2SLS) method
as well as the generalized method of moments (GMM) can also be used.

Since these methods are used for estimating parameters of spatial
econometric models, let us briefly describe their basics here.
We will now explicitly introduce endogenous variables into the CLR
model, which we can express as follows:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \dot{\mathbf{X}}\dot{\boldsymbol{\beta}} + \boldsymbol{\varepsilon}, \tag{2.2.11}$$
where we take $\mathbf{X}$ to be an $N \times K$ explanatory variable matrix consisting of
constant terms and exogenous variables, and $\dot{\mathbf{X}}$ an $N \times L$ explanatory
variable matrix consisting of endogenous variables. In addition, $\boldsymbol{\beta}$ denotes a
$K \times 1$ regression coefficient vector corresponding to exogenous variables,
and $\dot{\boldsymbol{\beta}}$ denotes an $L \times 1$ regression coefficient vector corresponding to
endogenous variables. Here, because $\dot{\mathbf{X}}$ is endogenous, it is
correlated with the error term ($\operatorname{Cov}[\varepsilon_i, \dot{x}_{l,i}] \neq 0$, $\forall\, l = 1, \ldots, L$). Hence we
may consider introducing IVs, say $\mathbf{Z}_{[N \times P]}$, which correlate with $\dot{\mathbf{X}}$ but
do not correlate with $\boldsymbol{\varepsilon}$. For the sake of identification, the number of columns $P$ of $\mathbf{Z}$
must be greater than or equal to the number of endogenous variables $L$.
Under such a scenario, the IV estimator can be obtained by applying the
2SLS method. That is, when the explanatory variable matrix is rearranged to
form $\mathbf{R} \equiv [\mathbf{X}, \dot{\mathbf{X}}]$, a two-stage estimation is performed: first, $\mathbf{R}$ is
projected onto the space spanned by $\mathbf{S} \equiv [\mathbf{X}, \mathbf{Z}]$, which is uncorrelated with the error
term, to obtain the fitted value $\hat{\mathbf{R}}$; next, $\mathbf{y}$ is regressed on
$\hat{\mathbf{R}}$, not $\mathbf{R}$. The simple idea is to use $\hat{\mathbf{R}}$, which is not correlated
with the error term. If valid IVs are used, the 2SLS estimator is consistent.
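The 2SLS estimator of the parameter $\ddot{\boldsymbol{\beta}} \equiv [\boldsymbol{\beta}', \dot{\boldsymbol{\beta}}']'$ is finally obtained from
$$\hat{\ddot{\boldsymbol{\beta}}}_{2sls} = \left(\hat{\mathbf{R}}'\hat{\mathbf{R}}\right)^{-1}\hat{\mathbf{R}}'\mathbf{y}, \tag{2.2.12}$$
$$\operatorname{Var}\left[\hat{\ddot{\boldsymbol{\beta}}}_{2sls}\right] = \sigma_\varepsilon^2 \left(\mathbf{R}'\hat{\mathbf{R}}\right)^{-1}, \tag{2.2.13}$$
where $\hat{\mathbf{R}} = \mathbf{S}(\mathbf{S}'\mathbf{S})^{-1}\mathbf{S}'\mathbf{R}$, and therefore
$\hat{\ddot{\boldsymbol{\beta}}}_{2sls} = \left[\mathbf{R}'\mathbf{S}(\mathbf{S}'\mathbf{S})^{-1}\mathbf{S}'\mathbf{R}\right]^{-1}\mathbf{R}'\mathbf{S}(\mathbf{S}'\mathbf{S})^{-1}\mathbf{S}'\mathbf{y}$.
Since $\sigma_\varepsilon^2$ is unknown, it can be substituted with the estimate
$$\hat{\sigma}_{\varepsilon,2sls}^2 = \hat{\boldsymbol{\varepsilon}}_{2sls}'\hat{\boldsymbol{\varepsilon}}_{2sls} / (N - K), \tag{2.2.14}$$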

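where
$$\hat{\boldsymbol{\varepsilon}}_{2sls} = \mathbf{y} - \mathbf{R}\hat{\ddot{\boldsymbol{\beta}}}_{2sls}. \tag{2.2.15}$$
Note the use of $\mathbf{R}$ instead of $\hat{\mathbf{R}}$. Since the properties of the
2SLS estimator hold only asymptotically, the sum of squared residuals can be
divided by either $(N - K)$ or $N$.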
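The two stages of Eqs. (2.2.12)-(2.2.15) can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming simulated data with one endogenous regressor and one instrument; the function name `two_stage_least_squares` and all numerical settings are hypothetical, and the residual sum of squares is divided by $N$, which the text allows.

```python
import numpy as np

def two_stage_least_squares(X, X_endog, Z, y):
    """2SLS for y = X b + X_endog b_dot + e with instruments Z."""
    R = np.column_stack([X, X_endog])              # all regressors
    S = np.column_stack([X, Z])                    # exogenous variables and IVs
    # Stage 1: project R onto the column space of S.
    R_hat = S @ np.linalg.solve(S.T @ S, S.T @ R)
    # Stage 2: regress y on R_hat, Eq. (2.2.12).
    coef = np.linalg.solve(R_hat.T @ R_hat, R_hat.T @ y)
    resid = y - R @ coef                           # Eq. (2.2.15): R, not R_hat
    sigma2 = resid @ resid / len(y)                # divisor N (asymptotic argument)
    cov = sigma2 * np.linalg.inv(R.T @ R_hat)      # Eq. (2.2.13)
    return coef, cov

# Simulated data (assumption): 'common' enters both x_endog and the error.
rng = np.random.default_rng(1)
N = 5000
z = rng.normal(size=N)
common = rng.normal(size=N)
x_endog = 0.8 * z + common + rng.normal(size=N)
eps = common + rng.normal(size=N)
X = np.ones((N, 1))
y = 1.0 + 2.0 * x_endog + eps

coef_2sls, cov_2sls = two_stage_least_squares(X, x_endog[:, None], z[:, None], y)

# OLS on the same regressors, for comparison.
R = np.column_stack([X, x_endog])
coef_ols = np.linalg.solve(R.T @ R, R.T @ y)
```

In this setup the OLS slope is pulled away from the true value of 2.0 by the correlation between the regressor and the error term, while the 2SLS slope stays close to it.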
The 2SLS method is also used as the parameter estimation method of the
spatial lag model (SLM), which is one of the representative models of the
spatial econometrics that will be introduced in Chapter 5. GMM, instead,
can be applied for the parameter estimation of the spatial error model
(SEM), which is also introduced in Chapter 5.
Now, our explanation turns to the GMM. First let us explain the method
of moments (MM), which can also be used for parameter estimation of
the CLR model. The MM is a method of estimating parameters by using
moment conditions that a model should satisfy. The moment condition
in the CLR model is the absence of correlation between the explanatory
variables and the error term; that is,
$$E[\mathbf{X}_i'\varepsilon_i] = \mathbf{0}_{[K \times 1]}, \quad \forall\, i = 1, \ldots, N. \tag{2.2.16}$$
Here, since $\mathbf{X}_i$ is a $1 \times K$ vector expressing the $i$th row of $\mathbf{X}$,
this expression is a set of $K$ conditions. In matrix form, we can write this
condition as
$$E[\mathbf{X}'\boldsymbol{\varepsilon}] = \mathbf{0}_{[K \times 1]}. \tag{2.2.17}$$
The corresponding sample moment condition can be obtained as
$$\frac{\mathbf{X}'\boldsymbol{\varepsilon}}{N} = \frac{\mathbf{X}'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})}{N} = \mathbf{0}_{[K \times 1]}. \tag{2.2.18}$$
Hence with assumption (4), the MM estimator is given as
$$\hat{\boldsymbol{\beta}}_{mm} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}, \tag{2.2.19}$$
and thus it is identical to the OLS estimator.
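Here, let us write the moment condition of the CLR model shown in
Eq. (2.2.16) in a more general way:
$$E[\mathbf{h}(y_i, \mathbf{X}_i, \boldsymbol{\beta})] = \mathbf{0}_{[R \times 1]}. \tag{2.2.20}$$
Here, $\mathbf{h}(y_i, \mathbf{X}_i, \boldsymbol{\beta})$ is an $R \times 1$ vector-valued function. Replacing the
expectation with its sample counterpart gives the sample moment:
$$\mathbf{h}_s(\mathbf{y}, \mathbf{X}, \boldsymbol{\beta}) = \frac{1}{N}\sum_{i=1}^{N} \mathbf{h}(y_i, \mathbf{X}_i, \boldsymbol{\beta}), \tag{2.2.21}$$
where $\mathbf{h}_s(\mathbf{y}, \mathbf{X}, \boldsymbol{\beta})$ is also an $R \times 1$ vector-valued function. If the number
$K$ of parameters in $\boldsymbol{\beta}$ matches the number of moment conditions $R$, as in the
case of the CLR model, it is possible to obtain those parameters using the
MM. However, where $R > K$, parameters that set Eq. (2.2.21) exactly
to zero generally do not exist. Therefore, we consider determining
parameters that minimize a quadratic form such as
$$\mathbf{h}_s(\mathbf{y}, \mathbf{X}, \boldsymbol{\beta})'\,\mathbf{V}\,\mathbf{h}_s(\mathbf{y}, \mathbf{X}, \boldsymbol{\beta}). \tag{2.2.22}$$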
The estimator obtained in this way is a GMM estimator. The GMM
estimator is known to have consistency and asymptotic normality under very
general conditions (Hayashi, 2000). Here, $\mathbf{V}$ is the weight matrix assigned to
the moment conditions, and Hansen (1982) showed that under some regularity
conditions, the minimum variance of the GMM estimator can be achieved by
using the following equation:
$$\mathbf{V} = \left[\frac{1}{N}\sum_{i=1}^{N} \mathbf{h}(y_i, \mathbf{X}_i, \boldsymbol{\beta})\,\mathbf{h}(y_i, \mathbf{X}_i, \boldsymbol{\beta})'\right]^{-1}. \tag{2.2.23}$$
However, in order to obtain an estimate of $\mathbf{V}$ in Eq. (2.2.23), it is
necessary to substitute an estimate for $\boldsymbol{\beta}$. Therefore, a two-step
estimation is performed: calculate $\hat{\boldsymbol{\beta}}^{(0)}_{gmm}$ using a suitable initial
weight matrix (e.g., the identity matrix), substitute it into Eq. (2.2.23) to obtain
$\hat{\mathbf{V}}$, and finally obtain the GMM estimator by minimizing Eq. (2.2.22).
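Here, we can easily show that the 2SLS estimator is a special case of the
GMM estimator. Let us return to Eq. (2.2.11). When we set the IVs to
$\mathbf{Z}_{[N \times P]}$, we can define $\mathbf{S} \equiv [\mathbf{X}, \mathbf{Z}]_{[N \times (K+P)]}$. By adding the moment
condition relating to $\mathbf{Z}$ to that of the CLR model, the following equation is
obtained:
$$E[\mathbf{S}'\boldsymbol{\varepsilon}] = \mathbf{0}_{[(K+P) \times 1]}. \tag{2.2.24}$$
Replacing the left side with its sample analogue, we obtain
$$\frac{\mathbf{S}'\left(\mathbf{y} - \mathbf{R}\ddot{\boldsymbol{\beta}}\right)}{N}, \tag{2.2.25}$$
where $\mathbf{R} \equiv [\mathbf{X}, \dot{\mathbf{X}}]$ and $\ddot{\boldsymbol{\beta}} \equiv [\boldsymbol{\beta}', \dot{\boldsymbol{\beta}}']'$. The GMM estimator for $\ddot{\boldsymbol{\beta}}$ is obtained by
minimizing the quadratic form
$$\mathbf{h}_s\left(\mathbf{y}, \mathbf{X}, \dot{\mathbf{X}}, \mathbf{Z}, \ddot{\boldsymbol{\beta}}\right)'\,\mathbf{V}\,\mathbf{h}_s\left(\mathbf{y}, \mathbf{X}, \dot{\mathbf{X}}, \mathbf{Z}, \ddot{\boldsymbol{\beta}}\right), \tag{2.2.26}$$
and its sample analogue yields:
$$\left(\frac{\mathbf{S}'\left(\mathbf{y} - \mathbf{R}\ddot{\boldsymbol{\beta}}\right)}{N}\right)'\mathbf{V}\left(\frac{\mathbf{S}'\left(\mathbf{y} - \mathbf{R}\ddot{\boldsymbol{\beta}}\right)}{N}\right), \tag{2.2.27}$$
where $\mathbf{V}$ is obtained by:
$$\mathbf{V} = \left[\frac{\sigma_\varepsilon^2\,\mathbf{S}'\mathbf{S}}{N}\right]^{-1}. \tag{2.2.28}$$
From the first-order condition of optimization, the GMM estimator of
$\ddot{\boldsymbol{\beta}}$ is given by:
$$\hat{\ddot{\boldsymbol{\beta}}}_{gmm} = \left[\mathbf{R}'\mathbf{S}(\mathbf{S}'\mathbf{S})^{-1}\mathbf{S}'\mathbf{R}\right]^{-1}\mathbf{R}'\mathbf{S}(\mathbf{S}'\mathbf{S})^{-1}\mathbf{S}'\mathbf{y}. \tag{2.2.29}$$
It is apparent that this equation is identical to the 2SLS estimator
(Eq. 2.2.12). In addition, the asymptotic distribution is also consistent
with that obtained from the 2SLS method. For further details, please refer
to Hayashi (2000).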
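The equivalence between Eq. (2.2.29) and the two-stage computation can be verified numerically. The sketch below, in Python with NumPy, assumes simulated data with two instruments for one endogenous regressor (an overidentified case); the function names and settings are illustrative, and $\sigma_\varepsilon^2$ in Eq. (2.2.28) is set to 1 because it only rescales the objective.

```python
import numpy as np

def gmm_objective(coef, R, S, y, V):
    """Quadratic form of Eq. (2.2.27) for the linear IV moment condition."""
    g = S.T @ (y - R @ coef) / len(y)      # sample moment, Eq. (2.2.25)
    return float(g @ V @ g)

def gmm_linear_iv(R, S, y):
    """Closed-form minimizer of the objective, Eq. (2.2.29)."""
    A = R.T @ S @ np.linalg.inv(S.T @ S) @ S.T
    return np.linalg.solve(A @ R, A @ y)

# Simulated data (assumption): two instruments, one endogenous regressor.
rng = np.random.default_rng(2)
N = 2000
Z = rng.normal(size=(N, 2))
common = rng.normal(size=N)                # source of endogeneity
x_endog = Z @ np.array([0.7, -0.4]) + common + rng.normal(size=N)
y = 0.5 + 1.5 * x_endog + common + rng.normal(size=N)

X = np.ones((N, 1))
R = np.column_stack([X, x_endog])
S = np.column_stack([X, Z])

coef_gmm = gmm_linear_iv(R, S, y)

# Two-stage computation for comparison, Eq. (2.2.12).
R_hat = S @ np.linalg.solve(S.T @ S, S.T @ R)
coef_2sls = np.linalg.solve(R_hat.T @ R_hat, R_hat.T @ y)

V = np.linalg.inv(S.T @ S / N)             # Eq. (2.2.28) with sigma^2 = 1
q_min = gmm_objective(coef_gmm, R, S, y, V)
```

Evaluating `gmm_objective` at any perturbed coefficient vector gives a value no smaller than `q_min`, and `coef_gmm` agrees with `coef_2sls` up to rounding.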
2.2.3 Spatial autocorrelation of error term and
heteroskedastic variance
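Next, we examine violations of assumption (3), where the error term does
not satisfy homoskedasticity and/or no-autocorrelation. In both cases the
OLS estimator is unbiased, but not efficient. Particularly in the case of spatial
data, there are many instances where no-autocorrelation does not hold because
of spatial autocorrelation stemming from unobserved factors. Now, let us expand
the CLR model in the following manner:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}, \tag{2.2.30}$$
$$\operatorname{Var}[\mathbf{u}] = E[\mathbf{u}\mathbf{u}'] = \boldsymbol{\Sigma}, \tag{2.2.31}$$
with
$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\operatorname{Var}[u_1] & \operatorname{Cov}[u_1, u_2] & \cdots & \operatorname{Cov}[u_1, u_N] \\
\operatorname{Cov}[u_2, u_1] & \operatorname{Var}[u_2] & \cdots & \operatorname{Cov}[u_2, u_N] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}[u_N, u_1] & \cdots & \cdots & \operatorname{Var}[u_N]
\end{pmatrix},
$$
where $\boldsymbol{\Sigma}$ is termed a variance-covariance matrix, which is a matrix
with variances in the diagonal terms and covariances in the off-diagonal terms.
If $\boldsymbol{\Sigma} = \sigma_\varepsilon^2\mathbf{I}$ does not hold, the OLS estimator is not BLUE, and the standard
error estimator is biased, which may result in erroneous inference.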

Fortunately, however, if the structure of $\boldsymbol{\Sigma}$ is known, the BLUE of
$\boldsymbol{\beta}$ can be obtained using the generalized least squares (GLS) method:
$$\hat{\boldsymbol{\beta}}_{gls} = \left(\mathbf{X}'\boldsymbol{\Sigma}^{-1}\mathbf{X}\right)^{-1}\mathbf{X}'\boldsymbol{\Sigma}^{-1}\mathbf{y}. \tag{2.2.32}$$
Of course, $\boldsymbol{\Sigma}$ is usually not known, and it is necessary to specify the
$N \times N$ elements of $\boldsymbol{\Sigma}$ under some assumption.
Simply speaking, in the modeling of geostatistical data (geostatistical
model), the variance-covariance matrix is directly constructed as a function
of distance. Meanwhile, in lattice data modeling (spatial econometric model),
it is indirectly constructed through structuring the dependency between the
data or error terms (e.g., autoregression type and moving average type). These
differences are explained in more detail in Chapters 4 and 5.
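As an illustration of Eq. (2.2.32), the sketch below computes the GLS estimator when $\boldsymbol{\Sigma}$ is treated as known. It is written in Python with NumPy under stated assumptions: the covariance is built as an exponential function of distance plus a small diagonal term, which is one possible geostatistical-style choice made here for illustration only, and `gls_fit` is a hypothetical helper name.

```python
import numpy as np

def gls_fit(X, y, Sigma):
    """GLS estimator of Eq. (2.2.32), using linear solves instead of Sigma^{-1}."""
    Si_X = np.linalg.solve(Sigma, X)
    Si_y = np.linalg.solve(Sigma, y)
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_y)

# Simulated spatial data (assumption): random points in the unit square.
rng = np.random.default_rng(3)
N = 150
coords = rng.uniform(size=(N, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = np.exp(-dist / 0.3) + 0.1 * np.eye(N)   # assumed covariance structure

X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([1.0, 2.0])
L = np.linalg.cholesky(Sigma)                   # Sigma = L L'
y = X @ beta_true + L @ rng.normal(size=N)      # errors with covariance Sigma

beta_gls = gls_fit(X, y, Sigma)

# Equivalent view: OLS after whitening both sides with L^{-1}.
X_w = np.linalg.solve(L, X)
y_w = np.linalg.solve(L, y)
beta_whitened = np.linalg.lstsq(X_w, y_w, rcond=None)[0]
```

The whitening step shows why GLS restores the CLR assumptions: the transformed errors $\mathbf{L}^{-1}\mathbf{u}$ have an identity covariance matrix, so OLS on the transformed data coincides with `beta_gls`.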
2.3 The generalized linear model
In the CLR model introduced in Eq. (2.2.1), if we assume that the
error term follows a normal distribution, it can be expressed as:$^{1}$
$$E[y_i] = \mu_i = \mathbf{X}_i\boldsymbol{\beta}, \quad y_i \sim N\left(\mu_i, \sigma_\varepsilon^2\right), \tag{2.3.1}$$
where $\mathbf{X}_i$ denotes the vector consisting of the $i$th row of $\mathbf{X}$. Needless to
say, it is not mandatory to assume a normal distribution. The GLM can handle
a wide class of distributions called the exponential family, where
we have
$$f(E[y_i]) = f(\mu_i) = \mathbf{X}_i\boldsymbol{\beta}. \tag{2.3.2}$$
Note that the relationship between $\mu_i$ and $\mathbf{X}_i\boldsymbol{\beta}$ is modeled by a nonlinear
function $f(\cdot)$. The function $f(\cdot)$ is called a link function, and this kind of
model is called a generalized linear model. The GLM has the following
characteristics:
[1] The dependent variable $y_i$ follows a distribution belonging to the
exponential family.
[2] $f(\mu_i)$ and $\mathbf{X}_i\boldsymbol{\beta}$ have a linear relationship.
The Poisson and binomial distributions, in addition to the normal
distribution, are well-known members of the exponential
family. Commonly used link functions differ, depending on the
$^{1}$ For the sake of simplicity, we do not distinguish between the stochastic variable $Y_i$ and its observation $y_i$.

distribution that we assume. For example, a logarithmic link function is generally
used for the Poisson distribution and its generalization, the negative binomial
distribution. These commonly used link functions are called canonical link
functions, and are convenient for practical use because the maximum
likelihood estimates of parameters can easily be calculated by applying the
iteratively reweighted least squares (IRLS) method.
In the following, as an example of the GLM, we explain the Poisson
regression model, which is often used in spatial statistics. Now, let $y_i$ be
the number of occurrences of an event. The counts $\{y_1, \ldots, y_N\}$ are assumed
to be mutually independent, and it is assumed that the frequency of
an event’s occurrence is very small. In such a situation, the probability
distribution of $y_i$ is well approximated by a Poisson distribution:
$$y_i \sim \text{Poisson}(\mu_i). \tag{2.3.3}$$
Famous examples of phenomena that can be described by a Poisson distribution include
the number of spelling mistakes when writing a page of text, and the
number of traffic fatalities in a year in a given region. The probability mass
function of the Poisson distribution is given by
$$p(y_i \mid \mu_i) = \frac{\mu_i^{y_i}\exp(-\mu_i)}{y_i!}, \tag{2.3.4}$$
where $y_i!$ denotes the factorial of $y_i$. In the Poisson distribution, $y_i$ takes
non-negative integer values with no upper bound, $y_i \in \{0, 1, 2, \ldots\}$. An important
property of the Poisson distribution is that expected value = variance = $\mu_i$, and since
the shape of the distribution is defined by one parameter, $\mu_i$, it has the advantage
of being extremely easy to use. In actual data, however, variance > expected value,
termed overdispersion, often occurs. In this case it is possible to use a
negative binomial distribution, which extends the Poisson distribution by
assuming that the variation of $\mu_i$ follows a gamma distribution. The
probability mass function of the negative binomial distribution is given by
$$
p(y_i \mid \mu_i) = \frac{\Gamma\left(y_i + \nu^{-1}\right)}{y_i!\,\Gamma\left(\nu^{-1}\right)}
\left(\frac{\nu^{-1}}{\nu^{-1} + \mu_i}\right)^{\nu^{-1}}
\left(\frac{\mu_i}{\nu^{-1} + \mu_i}\right)^{y_i},
\tag{2.3.5}
$$
and the expected value and variance are $\mu_i$ and $\mu_i + \nu\mu_i^2$, respectively. It is
therefore possible to adjust for overdispersion using the parameter $\nu$.
In the Poisson (or negative binomial) distribution, the logarithmic
function
$$\ln(\mu_i) = \mathbf{X}_i\boldsymbol{\beta} \tag{2.3.6}$$
is often used as a link function. It follows that the mean is given as
$$\mu_i = \exp(\mathbf{X}_i\boldsymbol{\beta}) = \exp\left(\beta_1 + x_{2,i}\beta_2 + \cdots + x_{K,i}\beta_K\right). \tag{2.3.7}$$
Here, we introduce an offset term to Eq. (2.3.7). Let $n_i$ be some total
count (e.g., population) of geographical unit $i$ ($i = 1, \ldots, N$), and let $y_i$ be
the number of occurrences of events. It is natural to think that the
larger $n_i$ becomes, the more events occur. However,
if we set the dependent variable as a dimensionless quantity (i.e., the ratio $y_i/n_i$), it is
no longer possible to distinguish 1/2 from 2/4 (i.e., information is lost),
and it becomes unclear what kind of distribution we should assume for
this ratio variable. As an alternative, therefore, we can assume
$$\mu_i = n_i\exp\left(\beta_1 + x_{2,i}\beta_2 + \cdots + x_{K,i}\beta_K\right), \tag{2.3.8}$$
where the expected value of $y_i$, $\mu_i$, is modeled such that it is proportional to
$n_i$. Taking the natural logarithm of both sides and writing in vector form, we obtain
$$\ln\mu_i = \ln n_i + \mathbf{X}_i\boldsymbol{\beta}, \tag{2.3.9}$$
where $\ln n_i$ is called an offset term; it is a known constant
included in the model and can be easily incorporated in parameter
estimation. The parameter $\boldsymbol{\beta}$ of the Poisson regression model can be estimated
using the maximum likelihood method via the IRLS method. For more
details about GLM, see, for example, Wood (2017).
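The IRLS iteration for the Poisson model with a log link and an offset, Eq. (2.3.9), can be sketched as follows. This is a minimal Python/NumPy illustration under stated assumptions: the data are simulated, `poisson_irls` is a hypothetical helper name, and the update is the standard Fisher scoring step for the log link rather than a reproduction of any particular software.

```python
import numpy as np

def poisson_irls(X, y, offset, n_iter=50, tol=1e-10):
    """Maximum likelihood for ln(mu) = offset + X beta via IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = offset + X @ beta
        mu = np.exp(eta)
        # Working response (offset removed) and weights for the log link.
        z = (eta - offset) + (y - mu) / mu
        w = mu
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated counts (assumption): n_i plays the role of a population size.
rng = np.random.default_rng(6)
N = 500
n = rng.integers(500, 5000, size=N)
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([-4.0, 0.5])
y = rng.poisson(n * np.exp(X @ beta_true))      # Eq. (2.3.8)

beta_hat = poisson_irls(X, y, offset=np.log(n))
mu_hat = n * np.exp(X @ beta_hat)
```

At convergence the score equations $\mathbf{X}'(\mathbf{y} - \hat{\boldsymbol{\mu}}) = \mathbf{0}$ hold, which is the maximum likelihood condition for the Poisson model with its canonical log link.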
2.4 The additive model
As we have discussed, the GLM is a generalized linear model, in the
sense that $E(y_i)$ and the linear component $\mathbf{X}_i\boldsymbol{\beta}$ are related through a nonlinear
function $f(\cdot)$. However, it is possible that $f(\mu_i)$ and the explanatory variables
have nonlinear relationships. Hence a natural extension is
$$f(\mu_i) = g(\mathbf{X}_i\boldsymbol{\beta}), \tag{2.4.1}$$
where $g(\cdot)$ is a nonlinear function. Fig. 2.4.1 shows a case in which Seya
et al. (2011) constructed a hedonic model in which rent data for Tokyo’s 23
wards in FY 2006 were regressed on several explanatory variables, and a partial
residual plot (an index for judging nonlinearity, obtained by plotting
$x_{k,i}\hat{\beta}_k + \hat{\varepsilon}_i$ against the explanatory variable $x_{k,i}$) was created for
one of the explanatory variables, the logarithm of occupied area. From this figure, it can
be seen that the relationship with rent changes significantly around 1.9
(60-80 m$^2$ occupied area). Where this kind of nonlinearity exists, the
CLR model cannot express the relationship between the explanatory variable
and the data well.
Now, among the nonlinear functions $g(\cdot)$, a model expressed as the
sum of several smoothing functions, such as
$$f(\mu_i) = \mathbf{X}_i\boldsymbol{\beta} + g_1(x_{1,i}) + g_2(x_{2,i}) + \cdots, \tag{2.4.2}$$
is comparatively easy to handle, and is termed the generalized additive
model (Hastie and Tibshirani, 1990; Wood, 2017). Since this model
contains both the parametric term $\mathbf{X}_i\boldsymbol{\beta}$ and the nonparametric term
$g_1(x_{1,i}) + g_2(x_{2,i}) + \cdots$, it can be understood to be a semiparametric regression
model (Ruppert et al., 2003). The nonparametric terms are often
specified using penalized spline functions.
Next, for the sake of simplicity, we will explain the additive model
(AM), assuming that $f(\cdot)$ is the identity function ($f(\mu_i) = \mu_i$) and that
there are two explanatory variables.
When data $(y_i, x_{1,i}, x_{2,i})$ are obtained, the AM can be formulated as
follows:
$$y_i = \beta_0 + x_{1,i}\beta_1 + x_{2,i}\beta_2 + g_1(x_{1,i}) + g_2(x_{2,i}) + \varepsilon_i, \tag{2.4.3}$$
Figure 2.4.1 Partial residual plot (y axis: $x_{k,i}\hat{\beta}_k + \hat{\varepsilon}_i$, labeled "Component+Residual(rent)"; x axis: $x_{k,i}$). The R function crPlots
(car package) was used for the plot. Dotted line: linear estimator; solid line: LOESS
estimator (smoothing).

where $g_1$ and $g_2$ are smoothing functions, and $\varepsilon_i$ is the
independent and identically distributed error term. When each smoothing function
is specified by a linear spline function, the following equation is obtained:
$$
y_i = \beta_0 + x_{1,i}\beta_1 + x_{2,i}\beta_2
+ \sum_{h=1}^{Q_1} b_{1,h}\left(x_{1,i} - \kappa_{1,h}\right)_+
+ \sum_{h=1}^{Q_2} b_{2,h}\left(x_{2,i} - \kappa_{2,h}\right)_+
+ \varepsilon_i,
\tag{2.4.4}
$$
where $(x - \kappa)_+$ is an operator that is 0 for $x < \kappa$ and $x - \kappa$ for $x \geq \kappa$, and $\kappa$
denotes a "knot." $b_{1,h}$ and $b_{2,h}$ are the corresponding parameters. For
example, let us suppose that the relationship between $x_1$ and $y$ changes greatly
in the vicinity of $x_1 = 0.6$. In this case, with $\kappa_1 = 0.6$, the relationship between
$x_1$ and $y$ is allowed to differ for values below and above 0.6
(see Ruppert et al., 2003, for a graphical explanation of this
point). Since multiple such points can usually be seen, multiple knots are
placed for each explanatory variable. In the
example of Eq. (2.4.4), $Q_1$ knots are allocated to the variable $x_1$, and $Q_2$ knots
are allocated to the variable $x_2$. If the numbers of knots $Q_1$ and $Q_2$ are too large,
overfitting may occur. Hence we can estimate parameters using the
penalized least squares method with restrictions on possible values of $b_{1,h}$ and
$b_{2,h}$. There are various methods for this restriction; for example,
$\max_h |b_{1,h}| < m$, $\sum_h |b_{1,h}| < m$, $\sum_h b_{1,h}^2 < m$ ($m$ is an appropriate constant). The
third restriction leads to a type of ridge estimator.
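Expressing Eq. (2.4.4) in a matrix form, we obtain:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b} + \boldsymbol{\varepsilon}, \tag{2.4.5}$$
where
$$
\mathbf{X} = \begin{bmatrix} 1 & x_{1,1} & x_{2,1} \\ \vdots & \vdots & \vdots \\ 1 & x_{1,N} & x_{2,N} \end{bmatrix},
\quad \boldsymbol{\beta} = [\beta_0, \beta_1, \beta_2]',
$$
$$
\mathbf{b} = \left[b_{1,1}, \ldots, b_{1,Q_1}, b_{2,1}, \ldots, b_{2,Q_2}\right]', \quad \mathbf{Z} = [\mathbf{Z}_1, \mathbf{Z}_2],
$$
$$
\mathbf{Z}_1 = \begin{bmatrix}
(x_{1,1} - \kappa_{1,1})_+ & \cdots & (x_{1,1} - \kappa_{1,Q_1})_+ \\
\vdots & \ddots & \vdots \\
(x_{1,N} - \kappa_{1,1})_+ & \cdots & (x_{1,N} - \kappa_{1,Q_1})_+
\end{bmatrix},
$$
$$
\mathbf{Z}_2 = \begin{bmatrix}
(x_{2,1} - \kappa_{2,1})_+ & \cdots & (x_{2,1} - \kappa_{2,Q_2})_+ \\
\vdots & \ddots & \vdots \\
(x_{2,N} - \kappa_{2,1})_+ & \cdots & (x_{2,N} - \kappa_{2,Q_2})_+
\end{bmatrix}.
$$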
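Define $\mathbf{R} \equiv [\mathbf{X}, \mathbf{Z}]$, $\ddot{\boldsymbol{\beta}} \equiv [\boldsymbol{\beta}', \mathbf{b}']'$, and a matrix $\mathbf{D}$ as
$$
\mathbf{D} = \begin{pmatrix}
\mathbf{O}_{[3 \times 3]} & \mathbf{O}_{[3 \times Q_1]} & \mathbf{O}_{[3 \times Q_2]} \\
\mathbf{O}_{[Q_1 \times 3]} & \tilde{\lambda}_1^2\,\mathbf{I}_{[Q_1 \times Q_1]} & \mathbf{O}_{[Q_1 \times Q_2]} \\
\mathbf{O}_{[Q_2 \times 3]} & \mathbf{O}_{[Q_2 \times Q_1]} & \tilde{\lambda}_2^2\,\mathbf{I}_{[Q_2 \times Q_2]}
\end{pmatrix},
\tag{2.4.6}
$$
where the leading zero blocks correspond to the three elements of $\boldsymbol{\beta}$, which are
not penalized, and $\tilde{\lambda}_1, \tilde{\lambda}_2 \geq 0$ denote the Lagrangian multipliers. The penalized least
squares estimate for $\ddot{\boldsymbol{\beta}}$ can be obtained by minimizing
$$\left\|\mathbf{y} - \mathbf{R}\ddot{\boldsymbol{\beta}}\right\|^2 + \ddot{\boldsymbol{\beta}}'\mathbf{D}\ddot{\boldsymbol{\beta}}, \tag{2.4.7}$$
where $\|\cdot\|$ denotes a vector norm. The optimization yields
$$\hat{\ddot{\boldsymbol{\beta}}}_{\tilde{\lambda}} = (\mathbf{R}'\mathbf{R} + \mathbf{D})^{-1}\mathbf{R}'\mathbf{y}. \tag{2.4.8}$$
When $\tilde{\lambda}_1$ (or $\tilde{\lambda}_2$) takes a value close to 0, overfitting tends to
occur. When it takes a relatively large value, on the other hand,
we cannot fully capture the nonlinearity between $x_1$ (or $x_2$) and $y$. Hence
the calibration of these parameters is very important.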
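The role of the penalty can be illustrated with a small penalized linear spline fit based on Eq. (2.4.8). The sketch below is in Python with NumPy and makes several simplifying assumptions: it uses a single explanatory variable (so the unpenalized block has two columns rather than three), simulated data from a sine curve, 20 equally spaced knots, and hypothetical function names.

```python
import numpy as np

def truncated_linear_basis(x, knots):
    """Columns (x - kappa_h)_+ for each knot kappa_h."""
    return np.maximum(x[:, None] - knots[None, :], 0.0)

def penalized_spline_fit(x, y, knots, lam):
    """Penalized least squares, Eq. (2.4.8), for one explanatory variable."""
    X = np.column_stack([np.ones_like(x), x])        # unpenalized part
    Z = truncated_linear_basis(x, knots)             # spline part
    R = np.column_stack([X, Z])
    penalty = np.r_[np.zeros(X.shape[1]), np.full(len(knots), lam ** 2)]
    D = np.diag(penalty)
    A = np.linalg.solve(R.T @ R + D, R.T)            # (R'R + D)^{-1} R'
    coef = A @ y
    fitted = R @ coef
    eff_df = np.trace(R @ A)                         # trace of the smoother matrix
    return coef, fitted, eff_df

# Simulated nonlinear relationship (assumption).
rng = np.random.default_rng(7)
N = 300
x = np.sort(rng.uniform(size=N))
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.3, size=N)
knots = np.linspace(0.05, 0.95, 20)

results = {lam: penalized_spline_fit(x, y, knots, lam) for lam in (0.01, 1.0, 100.0)}
eff_dfs = [results[lam][2] for lam in (0.01, 1.0, 100.0)]
rss = [float(np.sum((y - results[lam][1]) ** 2)) for lam in (0.01, 1.0, 100.0)]
```

As the penalty parameter grows, the trace of the smoother matrix shrinks toward 2 (a straight line) and the residual sum of squares increases, which mirrors the trade-off between overfitting and failing to capture the nonlinearity.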
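Here, let us consider replacing the fixed effects $b_{1,h}$ and $b_{2,h}$ with random
effects $u_{1,h} \sim \text{i.i.d. } N\left(0, \sigma_{1u}^2\right)$ and $u_{2,h} \sim \text{i.i.d. } N\left(0, \sigma_{2u}^2\right)$, respectively, and
formalizing the model as a mixed model. This smooths the linear
spline function and avoids overfitting (Ruppert et al., 2003, p. 109).
Moreover, a further advantage is that there are many statistical packages
for mixed models. Substituting $u_{1,h}$, $u_{2,h}$ for $b_{1,h}$, $b_{2,h}$ in Eq. (2.4.7) and
dividing by $\sigma_\varepsilon^2$, we obtain
$$
\frac{1}{\sigma_\varepsilon^2}\left\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta} - \mathbf{Z}\mathbf{u}\right\|^2
+ \frac{\tilde{\lambda}_1^2}{\sigma_\varepsilon^2}\left\|\mathbf{u}_1\right\|^2
+ \frac{\tilde{\lambda}_2^2}{\sigma_\varepsilon^2}\left\|\mathbf{u}_2\right\|^2,
\tag{2.4.9}
$$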

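where $\mathbf{u} = [\mathbf{u}_1', \mathbf{u}_2']'$, $\mathbf{u}_1 = \left[u_{1,1}, \ldots, u_{1,Q_1}\right]'$, and $\mathbf{u}_2 = \left[u_{2,1}, \ldots, u_{2,Q_2}\right]'$.
Here, if we set $\sigma_{1u}^2 = \sigma_\varepsilon^2 / \tilde{\lambda}_1^2$ and $\sigma_{2u}^2 = \sigma_\varepsilon^2 / \tilde{\lambda}_2^2$, Eq. (2.4.9) is none
other than the general criterion for obtaining the best linear unbiased
predictor (BLUP) of $\boldsymbol{\beta}$ and $\mathbf{u}$ (Ruppert et al., 2003, p. 100). In other words,
the penalized least squares method of Eq. (2.4.9) is equivalent to finding the
BLUP in the mixed model. Hence let us rewrite the model in the
standard form of the mixed model as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon} = \mathbf{R}\ddot{\boldsymbol{\beta}} + \boldsymbol{\varepsilon}, \tag{2.4.10}$$
$$
\operatorname{Cov}\begin{bmatrix} \mathbf{u} \\ \boldsymbol{\varepsilon} \end{bmatrix} =
\begin{bmatrix}
\sigma_{1u}^2\,\mathbf{I}_{[Q_1]} & \mathbf{O} & \mathbf{O} \\
\mathbf{O} & \sigma_{2u}^2\,\mathbf{I}_{[Q_2]} & \mathbf{O} \\
\mathbf{O} & \mathbf{O} & \sigma_\varepsilon^2\,\mathbf{I}_{[N]}
\end{bmatrix}.
\tag{2.4.11}
$$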
We can estimate $\ddot{\boldsymbol{\beta}}$ with, for instance, the REML method (Ruppert et al., 2003,
pp. 100-101). The estimator of $\ddot{\boldsymbol{\beta}}$ can be obtained as
$$\hat{\ddot{\boldsymbol{\beta}}}_{\tilde{\lambda}} = (\mathbf{R}'\mathbf{R} + \mathbf{D})^{-1}\mathbf{R}'\mathbf{y}. \tag{2.4.12}$$
Here, like the hat matrix in the linear regression model, the matrix
$\mathbf{R}(\mathbf{R}'\mathbf{R} + \mathbf{D})^{-1}\mathbf{R}'$ converts an observed value into a fitted value, and its
trace gives the degrees of freedom of the model. This value can be interpreted
as a measure of smoothness of the smoothing function, and it can also
be obtained for each explanatory variable (Ruppert et al., 2003,
pp. 175-176). In this way, the degree of nonlinearity can be quantified.
Moreover, whether nonlinearity ought to be considered can be judged
statistically using a likelihood ratio test, which can be implemented using
standard mixed model software.
The AM forms the foundation of the geo-additive model,
which is sometimes used for modeling large spatial data (see Chapter 4).
Note that the geo-additive model is an extension of the GLM framework using
the AM; please refer to Wood (2017) in regard to this.
2.5 The basics of Bayesian statistics
2.5.1 Bayes’ theorem
In standard statistics (frequentism), we try to estimate parameters by
assuming that they have a specific fixed value. In the Bayesian
statistical framework, however, parameters are regarded as having a (probability) distribution.
This distribution, called a prior distribution, needs to be specified by the analyst in
advance. In Bayesian estimation, we obtain the posterior distributions by
updating the prior distributions with observed data on the basis of Bayes’
theorem, and perform statistical inference (Bayesian inference) based on the
posterior distribution. Since the choice of prior distributions affects the results
of parameter estimation, there is a position that we should, as much as possible,
opt for noninformative prior distributions. Nowadays, owing to the
development of computational techniques including the Markov chain Monte
Carlo (MCMC) method and Hamiltonian Monte Carlo, it is possible to derive and
evaluate the posterior distributions through simulation even when
noninformative prior distributions are given.
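When taking the prior probability density function of parameter $\boldsymbol{\theta}$
($\boldsymbol{\theta} \in \Theta$) to be $p(\boldsymbol{\theta})$, and the likelihood function to be $p(\mathbf{y} \mid \boldsymbol{\theta})$, the posterior
probability density function of $\boldsymbol{\theta}$ is obtained as
$$
p(\boldsymbol{\theta} \mid \mathbf{y}) = \frac{p(\boldsymbol{\theta}, \mathbf{y})}{\int p(\boldsymbol{\theta}, \mathbf{y})\,d\boldsymbol{\theta}}
= \frac{p(\mathbf{y} \mid \boldsymbol{\theta})\,p(\boldsymbol{\theta})}{m(\mathbf{y})}
\propto p(\mathbf{y} \mid \boldsymbol{\theta})\,p(\boldsymbol{\theta}),
\tag{2.5.1}
$$
where $p(\boldsymbol{\theta} \mid \mathbf{y})$ incorporates all prior information and data in the form of the
prior distribution and likelihood function, respectively. The term $m(\mathbf{y}) = \int p(\boldsymbol{\theta}, \mathbf{y})\,d\boldsymbol{\theta}$
is a scaling constant, representing the marginal probability density
function of $\mathbf{y}$. Since this term is often difficult to evaluate analytically and
does not depend on $\boldsymbol{\theta}$, it is often omitted, as in the last expression of Eq. (2.5.1). In Bayesian
estimation, it is possible to perform flexible modeling by incorporating prior
beliefs and knowledge about parameters into the model through the prior
distribution in this way. In fact, there are also approaches to modeling spatial
autocorrelation through the prior distribution (see Congdon, 2010).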
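For a one-dimensional parameter, Eq. (2.5.1) can be evaluated directly on a grid, which makes the role of the scaling constant $m(\mathbf{y})$ concrete. The following Python/NumPy sketch assumes a normal likelihood with known variance and a normal prior for the mean, a setting chosen here because the exact posterior is available in closed form for comparison; all numerical values are illustrative.

```python
import numpy as np

# Simulated data (assumption): normal observations with known sigma.
rng = np.random.default_rng(4)
sigma = 1.0
y = rng.normal(loc=1.5, scale=sigma, size=20)

prior_mean, prior_sd = 0.0, 2.0
theta = np.linspace(-5.0, 5.0, 4001)          # grid over the parameter
dtheta = theta[1] - theta[0]

# log prior and log likelihood, both up to additive constants
log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
log_lik = -0.5 * (((y[:, None] - theta[None, :]) / sigma) ** 2).sum(axis=0)

# Unnormalized posterior p(y | theta) p(theta), rescaled for numerical stability.
log_post = log_prior + log_lik
unnorm = np.exp(log_post - log_post.max())

# Numerical counterpart of m(y) (up to the same constant), then normalization.
m_y = unnorm.sum() * dtheta
posterior = unnorm / m_y
post_mean_grid = (theta * posterior).sum() * dtheta

# Closed-form posterior for the normal-normal model, for comparison.
post_prec = 1.0 / prior_sd ** 2 + len(y) / sigma ** 2
post_mean_exact = (prior_mean / prior_sd ** 2 + y.sum() / sigma ** 2) / post_prec
post_sd_exact = post_prec ** -0.5
```

Dropping constants from the prior and likelihood does not affect the result, because the division by `m_y` restores a density that integrates to one; this is the sense in which the proportionality in Eq. (2.5.1) is sufficient.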
2.5.2 The Markov chain Monte Carlo method
After obtaining the posterior distribution for $\boldsymbol{\theta}$, statistical inferences may
be made about the parameters. To that end, when taking the parameter
of interest to be $\theta_j$, it is necessary to integrate out the nuisance
parameters $\boldsymbol{\theta}_{-j}$, in which we have no interest. In this way we obtain a
marginal probability density function, such that
$$p(\theta_j \mid \mathbf{y}) = \int p(\boldsymbol{\theta} \mid \mathbf{y})\,d\boldsymbol{\theta}_{-j}. \tag{2.5.2}$$
By using this marginal probability density, point estimation/interval
estimation for $\theta_j$ can be performed. Unlike in standard statistics, the
parameters have a distribution, but there are many cases where the
point estimate $\hat{\theta}_j$ is of interest. For instance, the mean, median, or mode of
the marginal posterior distribution is used as the point estimate. For
interval estimation, we can use the $100(1 - \alpha)\%$ Bayesian credible interval,
for example. When $\alpha = 0.05$, the credible interval indicates that parameter
$\theta_j$ is included in the interval $(l, m)$ with 95% probability. This concept
is more intuitive and easier to understand than confidence intervals in
frequentism. In the formula, we define
$$\Pr\left(l \leq \theta_j \leq m\right) = \int_l^m p(\theta_j \mid \mathbf{y})\,d\theta_j = 1 - \alpha, \quad 0 \leq \alpha \leq 1. \tag{2.5.3}$$
Parameter inference in the regression model (significance testing) is
performed by asking whether or not 0 is included in the credible interval.
For example, if the 95% credible interval is ($0.5 \leq \theta_j \leq 1.0$),
0 is not included within the credible interval, and we can judge the parameter to
be significantly positive. However, since the multiple integration of
Eq. (2.5.2) is difficult to perform where $\boldsymbol{\theta}$ has large dimensions, an
approximation is often necessary. Various Monte Carlo integration
methods have been proposed to date,$^{2}$ and in particular, the MCMC
method that Gelfand and Smith (1990) brought to the statistical sciences
has been an epoch-defining method that can efficiently evaluate
high-dimensional integrals, contributing greatly to the development and
practical implementation of Bayesian statistics.$^{3}$
The MCMC method, as the name suggests, is a Monte Carlo method that uses a Markov chain. Before MCMC, many methods were based on independent sampling from the distribution; the MCMC method instead applies Monte Carlo integration to sample sequences with serial correlation, which are generated by a Markov chain. A Markov chain has the property that, when iterated a sufficient number of times from an appropriate initial value, the distribution of its samples converges to an invariant distribution under regularity conditions. Therefore, by constructing a Markov chain whose invariant distribution is the posterior distribution, the samples of the chain can be used as samples from the posterior distribution. Representative algorithms include the Gibbs sampler and the Metropolis-Hastings (MH) algorithm. We explain these next.
² Rue and Martino's (2007) integrated nested Laplace approximation method for function approximation without simulation has also seen major development in recent years.
³ In fact, the first author, Professor Alan Gelfand, is a prominent spatial statistician.

The Gibbs sampler is an algorithm for the case where the full conditional distributions are available. When the probability density function of the posterior distribution is $p(\theta \mid y)$, the density functions of the conditional posterior distributions can be expressed as follows:
$$\begin{aligned}
&p(\theta_1 \mid y, \theta_2, \ldots, \theta_J) \\
&p(\theta_2 \mid y, \theta_1, \theta_3, \ldots, \theta_J) \\
&\qquad\vdots \\
&p(\theta_J \mid y, \theta_1, \ldots, \theta_{J-1})
\end{aligned} \tag{2.5.4}$$
When it is possible to generate random samples from each conditional
distribution, we can use the Gibbs sampler as follows:
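(0) Set the initial value: $\theta^{(0)} = \left(\theta_1^{(0)}, \theta_2^{(0)}, \ldots, \theta_J^{(0)}\right)'$
(1) for $t = 0$ to $T - 1$ do
(2) sample $\theta_1^{(t+1)} \sim p\left(\theta_1 \mid y, \theta_2^{(t)}, \ldots, \theta_J^{(t)}\right)$
(3) sample $\theta_2^{(t+1)} \sim p\left(\theta_2 \mid y, \theta_1^{(t+1)}, \theta_3^{(t)}, \ldots, \theta_J^{(t)}\right)$
(4) $\vdots$
(5) sample $\theta_J^{(t+1)} \sim p\left(\theta_J \mid y, \theta_1^{(t+1)}, \ldots, \theta_{J-1}^{(t+1)}\right)$
(6) $\theta^{(t+1)} \leftarrow \left(\theta_1^{(t+1)}, \theta_2^{(t+1)}, \ldots, \theta_J^{(t+1)}\right)'$
(7) end for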
Here, as $T \to \infty$, the empirical distribution of the sample series $\left(\theta_1^{(t)}, \theta_2^{(t)}, \ldots, \theta_J^{(t)}\right)'$, $t = 1, \ldots, T$, converges to the joint (posterior) distribution. By extracting the sample series $\left\{\theta_j^{(t)}\right\}$, $t = 1, \ldots, T$, for a given parameter $\theta_j$, we can calculate a point estimate and a credible interval. In practice, the mean and standard deviation are computed from the part of the chain that excludes the initial period that depends on the initial value, termed the burn-in period.
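The Gibbs steps above can be illustrated with a minimal sketch. The target here is an assumption chosen for illustration only: a standard bivariate normal "posterior" with correlation `rho`, whose full conditionals are known normals. The function name, the initial values, and the iteration counts are hypothetical and not taken from the text.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=6000, burn_in=1000, seed=0):
    """Gibbs sampler for an assumed standard bivariate normal target."""
    rng = np.random.default_rng(seed)
    theta1, theta2 = 5.0, -5.0           # deliberately poor initial value
    cond_sd = np.sqrt(1.0 - rho ** 2)    # sd of each full conditional
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # theta1 | theta2 ~ N(rho * theta2, 1 - rho^2)
        theta1 = rng.normal(rho * theta2, cond_sd)
        # theta2 | theta1 ~ N(rho * theta1, 1 - rho^2), using the new theta1
        theta2 = rng.normal(rho * theta1, cond_sd)
        draws[t] = theta1, theta2
    return draws[burn_in:]               # discard the burn-in period

draws = gibbs_bivariate_normal(rho=0.8)
post_mean = draws.mean(axis=0)                   # point estimates
post_sd = draws.std(axis=0, ddof=1)              # posterior standard deviations
ci_95 = np.percentile(draws[:, 0], [2.5, 97.5])  # 95% credible interval for theta_1
```

The credible interval is read directly from the quantiles of the retained draws, which is how Eq. (2.5.3) is usually approximated in practice.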
Where sampling from the conditional posterior distribution cannot be done simply, the MH algorithm is used. The MH algorithm samples from a proposal distribution $q\left(\theta^* \mid \theta^{(t-1)}\right)$, which is used as an approximation in lieu of a conditional posterior distribution from which random numbers are difficult to generate. When the proposal distribution is symmetric, as with a normal distribution, $q\left(\theta^* \mid \theta^{(t-1)}\right) = q\left(\theta^{(t-1)} \mid \theta^*\right)$ holds, and the simpler Metropolis algorithm can be used. The Metropolis algorithm can be described as follows:
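(0) Set the initial value: $\theta^{(0)} = \left(\theta_1^{(0)}, \theta_2^{(0)}, \ldots, \theta_J^{(0)}\right)'$
(1) for $t = 1$ to $T$ do
(2) sample $\theta^*$ from the proposal $q\left(\theta^* \mid \theta^{(t-1)}\right)$
(3) compute the ratio $\alpha = \dfrac{p(\theta^* \mid y)}{p\left(\theta^{(t-1)} \mid y\right)} = \exp\left[\ln p(\theta^* \mid y) - \ln p\left(\theta^{(t-1)} \mid y\right)\right]$
(4) if $\alpha \ge 1$, set $\theta^{(t)} = \theta^*$;
    else if $\alpha < 1$, set $\theta^{(t)} = \theta^*$ with probability $\alpha$, or $\theta^{(t)} = \theta^{(t-1)}$ with probability $(1 - \alpha)$
(5) end if
(6) end for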
Since we usually wish to sample many points where the posterior density is high, a move that increases the value of the posterior density is accepted with probability 1. However, if we rejected all moves that decrease the posterior density, the chain would be unable to leave the position with the highest density. Hence we also accept such a move, with probability $\alpha$. The overall acceptance ratio (the share of proposals that are accepted) should be about 0.2 to 0.4 (e.g., Brooks et al., 2011, p. 424); however, care is required, as this depends on the dimension of $\theta$.
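A minimal sketch of the Metropolis algorithm with a symmetric normal proposal follows. The target `log_post` is a hypothetical unnormalized log posterior (a normal with mean 1 and standard deviation 2) chosen only so that the result can be checked; the step size and iteration counts are likewise assumptions. Working on the log scale mirrors the exponential form of the ratio in step (3).

```python
import numpy as np

def log_post(theta):
    # Hypothetical unnormalized log posterior: N(1, 2^2), assumed for illustration.
    return -0.5 * ((theta - 1.0) / 2.0) ** 2

def metropolis(log_post, theta0, step_sd, n_iter=20000, seed=0):
    rng = np.random.default_rng(seed)
    theta = theta0
    chain = np.empty(n_iter)
    n_accept = 0
    for t in range(n_iter):
        proposal = rng.normal(theta, step_sd)            # symmetric proposal
        log_alpha = log_post(proposal) - log_post(theta)
        # Accept with probability min(1, alpha); moves with alpha >= 1 always pass.
        if np.log(rng.uniform()) < log_alpha:
            theta = proposal
            n_accept += 1
        chain[t] = theta                                 # a rejected move repeats theta
    return chain, n_accept / n_iter

chain, acc_rate = metropolis(log_post, theta0=0.0, step_sd=5.0)
kept = chain[2000:]   # drop the burn-in period
```

Comparing `log(u)` with `log_alpha` handles both branches of step (4) at once, because `log(u)` is never positive.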
The normal distribution $q\left(\theta^* \mid \theta^{(t-1)}\right) = N\left(\theta^* \mid \theta^{(t-1)}, E_0\right)$ is often used as the proposal distribution. However, because it is difficult to find a proposal distribution that closely approximates a multidimensional posterior distribution, it is more practical to assume a one-dimensional proposal distribution $q_j$ for each parameter, as in the Gibbs sampler. For this purpose, a random walk process can be used. With a random walk process, the proposal distribution $q\left(\theta_j^* \mid \theta_j^{(t-1)}\right)$ for a given parameter $\theta_j$ can be expressed as
$$q\left(\theta_j^* \mid \theta_j^{(t-1)}\right) = N\left(\theta_j^* \mid \theta_j^{(t-1)}, \xi^2\right), \tag{2.5.5}$$
where $\xi^2$ is an important parameter that determines the acceptance ratio; it is adjusted within the algorithm so that the acceptance ratio approaches an appropriate value. For instance, if we wish the acceptance ratio to be about 0.4, then during the burn-in period $\xi^2$ is multiplied by 0.9 if the acceptance ratio falls below 0.3, and by 1.1 if it exceeds 0.5 (Han and Lee, 2013).
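The tuning rule just described can be sketched as follows. The batch length of 100 iterations, the number of batches, the starting value of `xi2`, and the standard normal target are all assumptions made for this illustration; the text specifies only the thresholds (0.3, 0.5) and the multipliers (0.9, 1.1).

```python
import numpy as np

def tune_proposal_variance(xi2, acc_rate, lower=0.3, upper=0.5):
    """Shrink xi2 when too few proposals are accepted, enlarge it when too many are."""
    if acc_rate < lower:
        return xi2 * 0.9
    if acc_rate > upper:
        return xi2 * 1.1
    return xi2

def log_post(theta):
    # Hypothetical target: standard normal, assumed for illustration.
    return -0.5 * theta ** 2

rng = np.random.default_rng(0)
theta, xi2 = 0.0, 100.0            # start with a proposal variance that is far too large
for batch in range(100):           # burn-in, split into batches of 100 iterations
    n_accept = 0
    for _ in range(100):
        proposal = rng.normal(theta, np.sqrt(xi2))
        if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
            theta = proposal
            n_accept += 1
    xi2 = tune_proposal_variance(xi2, n_accept / 100)
xi2_final = xi2                    # held fixed after the burn-in period
```

Adaptation is confined to the burn-in period here, so that the retained part of the chain is generated with a fixed proposal.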
Where a symmetric proposal distribution is acceptable, this kind of random walk process can be used; however, there are many instances where a normal proposal is not appropriate, such as when the state space of the parameter is restricted to positive values (Banerjee et al., 2014). In this case, an asymmetric proposal distribution $q\left(\theta^* \mid \theta^{(t-1)}\right)$ is needed. When the proposal is not symmetric, we use the MH algorithm, which is a generalization of the Metropolis algorithm. The acceptance ratio in the MH algorithm is given by
$$\alpha = \frac{p(\theta^* \mid y)\, q\left(\theta^{(t-1)} \mid \theta^*\right)}{p\left(\theta^{(t-1)} \mid y\right) q\left(\theta^* \mid \theta^{(t-1)}\right)}. \tag{2.5.6}$$
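As a hedged illustration of Eq. (2.5.6), the sketch below samples a positive parameter with a log-normal random walk proposal, which is asymmetric. For this proposal the correction term $q(\theta^{(t-1)} \mid \theta^*) / q(\theta^* \mid \theta^{(t-1)})$ reduces to $\theta^* / \theta^{(t-1)}$. The gamma-shaped target (shape 3, rate 2) and all function names are assumptions for illustration, not part of the text.

```python
import numpy as np

def log_target(theta, shape=3.0, rate=2.0):
    # Hypothetical unnormalized log posterior on theta > 0 (gamma kernel).
    return (shape - 1.0) * np.log(theta) - rate * theta

def mh_lognormal(log_target, theta0, step_sd, n_iter=20000, seed=0):
    rng = np.random.default_rng(seed)
    theta = theta0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        # Log-normal random walk: the proposal is always positive.
        proposal = theta * np.exp(step_sd * rng.standard_normal())
        # log of Eq. (2.5.6): posterior ratio plus the proposal correction
        # log q(theta | proposal) - log q(proposal | theta) = log(proposal) - log(theta)
        log_alpha = (log_target(proposal) - log_target(theta)
                     + np.log(proposal) - np.log(theta))
        if np.log(rng.uniform()) < log_alpha:
            theta = proposal
        chain[t] = theta
    return chain

chain = mh_lognormal(log_target, theta0=1.0, step_sd=0.8)
kept = chain[2000:]   # drop the burn-in period
```

Omitting the correction term would make the chain target a different distribution, which is why the Metropolis ratio cannot be reused unchanged for asymmetric proposals.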
For details concerning the MCMC method, refer to Brooks et al. (2011).
2.5.3 Bayesian estimation of the classical linear
regression model
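In this section, we explain Bayesian estimation, taking the CLR model as an example:
$$y \sim N\left(X\beta, \sigma_\varepsilon^2 I\right). \tag{2.5.7}$$
The likelihood of this regression model is given by
$$p\left(y \mid \beta, \sigma_\varepsilon^2\right) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_\varepsilon^2}} \exp\left[-\frac{(y_i - X_i\beta)^2}{2\sigma_\varepsilon^2}\right] \propto \left(\sigma_\varepsilon^2\right)^{-N/2} \exp\left[-\frac{(y - X\beta)'(y - X\beta)}{2\sigma_\varepsilon^2}\right]. \tag{2.5.8}$$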
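Here, with $y - X\beta = \left(y - X\hat{\beta}\right) + \left(X\hat{\beta} - X\beta\right) = \hat{\varepsilon} + X\left(\hat{\beta} - \beta\right)$ and $X'\hat{\varepsilon} = 0$, we obtain
$$\begin{aligned}
(y - X\beta)'(y - X\beta) &= \left[\hat{\varepsilon} + X\left(\hat{\beta} - \beta\right)\right]'\left[\hat{\varepsilon} + X\left(\hat{\beta} - \beta\right)\right] \\
&= \hat{\varepsilon}'\hat{\varepsilon} + \left(\beta - \hat{\beta}\right)'X'X\left(\beta - \hat{\beta}\right) \\
&= (N - K)s^2 + \left(\beta - \hat{\beta}\right)'X'X\left(\beta - \hat{\beta}\right),
\end{aligned} \tag{2.5.9}$$
where $\hat{\beta} = (X'X)^{-1}X'y$ and $s^2 = (N - K)^{-1}\left(y - X\hat{\beta}\right)'\left(y - X\hat{\beta}\right)$. By substituting Eq. (2.5.9) into Eq. (2.5.8) we obtain
$$p\left(y \mid \beta, \sigma_\varepsilon^2\right) \propto \left(\sigma_\varepsilon^2\right)^{-N/2} \exp\left[-\frac{(N - K)s^2 + \left(\beta - \hat{\beta}\right)'X'X\left(\beta - \hat{\beta}\right)}{2\sigma_\varepsilon^2}\right]. \tag{2.5.10}$$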
The posterior distribution can be obtained by combining this likelihood
with the prior distribution.
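Convenient prior distributions, known as conjugate prior distributions, are often used. That is, if the posterior distribution belongs to the same family of probability distributions as the prior distribution, the prior is called conjugate. Now, for the prior distributions of $\beta$ and $\sigma_\varepsilon^2$, we assume
$$p\left(\beta, \sigma_\varepsilon^2\right) = p\left(\beta \mid \sigma_\varepsilon^2\right) p\left(\sigma_\varepsilon^2\right). \tag{2.5.11}$$
Then, we assume that the conditional prior distribution of $\beta$ given $\sigma_\varepsilon^2$ is the normal distribution $\beta \mid \sigma_\varepsilon^2 \sim N\left(\beta_0, \sigma_\varepsilon^2 E_0^{-1}\right)$, and that the prior distribution of $\sigma_\varepsilon^2$ is the inverse gamma distribution $\sigma_\varepsilon^2 \sim IG(a_0/2, b_0/2)$. Their density functions are given by
$$p\left(\beta \mid \sigma_\varepsilon^2\right) = \frac{1}{\left(2\pi\sigma_\varepsilon^2\right)^{K/2}} \left|E_0\right|^{1/2} \exp\left[-\frac{(\beta - \beta_0)'E_0(\beta - \beta_0)}{2\sigma_\varepsilon^2}\right], \tag{2.5.12}$$
$$p\left(\sigma_\varepsilon^2\right) = \frac{(b_0/2)^{a_0/2}}{\Gamma(a_0/2)} \left(\sigma_\varepsilon^2\right)^{-(a_0/2 + 1)} \exp\left(-\frac{b_0}{2\sigma_\varepsilon^2}\right), \tag{2.5.13}$$
respectively. According to Bayes' theorem, the posterior probability density function is $p\left(\beta, \sigma_\varepsilon^2 \mid y\right) \propto p\left(y \mid \beta, \sigma_\varepsilon^2\right) p\left(\beta \mid \sigma_\varepsilon^2\right) p\left(\sigma_\varepsilon^2\right)$, so through a simple calculation the conditional posterior distributions for each parameter are obtained as
$$\beta \mid \sigma_\varepsilon^2, y \sim N\left(\tilde{\beta}, \sigma_\varepsilon^2 \tilde{E}\right), \tag{2.5.14}$$
$$\sigma_\varepsilon^2 \mid y \sim IG\left(\tilde{a}/2, \tilde{b}/2\right), \tag{2.5.15}$$
where $\tilde{\beta} = (X'X + E_0)^{-1}\left(X'X\hat{\beta} + E_0\beta_0\right)$, $\tilde{E} = (X'X + E_0)^{-1}$, $\tilde{a} = a_0 + N$, and $\tilde{b} = b_0 + (N - K)s^2 + \left(\beta_0 - \hat{\beta}\right)'\left[(X'X)^{-1} + E_0^{-1}\right]^{-1}\left(\beta_0 - \hat{\beta}\right)$.

Therefore, it is sufficient to run the Gibbs sampler based on these conditional distributions. Note that the Bayesian estimator is a shrinkage estimator, in the sense that the OLS estimator is pulled toward the prior mean.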
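In these explanations, we assumed that $p\left(\beta, \sigma_\varepsilon^2\right) = p\left(\beta \mid \sigma_\varepsilon^2\right) p\left(\sigma_\varepsilon^2\right)$. Next, we assume an independent prior distribution,
$$p\left(\beta, \sigma_\varepsilon^2\right) = p(\beta)\, p\left(\sigma_\varepsilon^2\right), \tag{2.5.16}$$
and we set
$$\beta \sim N\left(\dot{\beta}, \dot{E}\right), \tag{2.5.17}$$
$$\sigma_\varepsilon^2 \sim IG\left(\dot{a}/2, \dot{b}/2\right). \tag{2.5.18}$$
Then, the conditional posterior distributions are given as
$$\beta \mid \sigma_\varepsilon^2, y \sim N\left(\ddot{\beta}, \ddot{E}\right), \tag{2.5.19}$$
$$\sigma_\varepsilon^2 \mid \beta, y \sim IG\left(\ddot{a}/2, \ddot{b}/2\right), \tag{2.5.20}$$
where $\ddot{\beta} = \left(\sigma_\varepsilon^{-2} X'X + \dot{E}^{-1}\right)^{-1}\left(\sigma_\varepsilon^{-2} X'y + \dot{E}^{-1}\dot{\beta}\right)$, $\ddot{E} = \left(\sigma_\varepsilon^{-2} X'X + \dot{E}^{-1}\right)^{-1}$, $\ddot{a} = \dot{a} + N$, and $\ddot{b} = \dot{b} + (y - X\beta)'(y - X\beta)$.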
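A minimal sketch of a Gibbs sampler built from the conditional posteriors in Eqs. (2.5.19) and (2.5.20) is given below. The simulated data, the prior hyperparameter values, and the function name `gibbs_clr` are assumptions for illustration. An inverse gamma draw with shape $\ddot{a}/2$ and scale $\ddot{b}/2$ is obtained as the reciprocal of a gamma draw with shape $\ddot{a}/2$ and rate $\ddot{b}/2$.

```python
import numpy as np

def gibbs_clr(y, X, beta_dot, E_dot, a_dot, b_dot,
              n_iter=3000, burn_in=500, seed=0):
    """Gibbs sampler for the CLR model under independent normal / inverse gamma priors."""
    rng = np.random.default_rng(seed)
    N, K = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    E_dot_inv = np.linalg.inv(E_dot)
    sigma2 = 1.0                                   # initial value
    beta_draws = np.empty((n_iter, K))
    sigma2_draws = np.empty(n_iter)
    for t in range(n_iter):
        # beta | sigma2, y ~ N(beta_ddot, E_ddot), Eq. (2.5.19)
        E_ddot = np.linalg.inv(XtX / sigma2 + E_dot_inv)
        beta_ddot = E_ddot @ (Xty / sigma2 + E_dot_inv @ beta_dot)
        chol = np.linalg.cholesky((E_ddot + E_ddot.T) / 2.0)
        beta = beta_ddot + chol @ rng.standard_normal(K)
        # sigma2 | beta, y ~ IG(a_ddot / 2, b_ddot / 2), Eq. (2.5.20)
        resid = y - X @ beta
        a_ddot = a_dot + N
        b_ddot = b_dot + resid @ resid
        sigma2 = 1.0 / rng.gamma(shape=a_ddot / 2.0, scale=2.0 / b_ddot)
        beta_draws[t] = beta
        sigma2_draws[t] = sigma2
    return beta_draws[burn_in:], sigma2_draws[burn_in:]

# Simulated data with assumed true values beta = (1, -2) and sigma = 0.5.
rng = np.random.default_rng(42)
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=N)

beta_draws, sigma2_draws = gibbs_clr(
    y, X, beta_dot=np.zeros(2), E_dot=100.0 * np.eye(2), a_dot=2.0, b_dot=2.0)
```

With a diffuse prior such as the one assumed here, the posterior means of `beta_draws` should lie close to the OLS estimates.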
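Lastly, with $p\left(\beta, \sigma_\varepsilon^2\right) = p(\beta)\, p\left(\sigma_\varepsilon^2\right)$, we set noninformative priors as
$$p(\beta) \propto 1, \tag{2.5.21}$$
$$p\left(\sigma_\varepsilon^2\right) \propto \frac{1}{\sigma_\varepsilon^2}. \tag{2.5.22}$$
With these priors, the conditional posterior distributions are given by
$$\beta \mid \sigma_\varepsilon^2, y \sim N\left(\hat{\beta}, \sigma_\varepsilon^2 (X'X)^{-1}\right), \tag{2.5.23}$$
$$\sigma_\varepsilon^2 \mid y \sim IG\left((N - K)/2, (N - K)s^2/2\right). \tag{2.5.24}$$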
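Because Eq. (2.5.24) does not condition on $\beta$, draws can be generated directly by first sampling $\sigma_\varepsilon^2$ and then $\beta$ from Eq. (2.5.23). The sketch below assumes simulated data and hypothetical variable names, and it is one possible way to use the two equations rather than a procedure prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 100, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([0.5, 1.5]) + rng.normal(scale=1.0, size=N)  # assumed true values

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                  # OLS estimate
resid = y - X @ beta_hat
s2 = resid @ resid / (N - K)

n_draws = 5000
# sigma2 | y ~ IG((N-K)/2, (N-K)s^2/2): reciprocal of a gamma draw with that shape and rate
sigma2 = 1.0 / rng.gamma(shape=(N - K) / 2.0,
                         scale=2.0 / ((N - K) * s2), size=n_draws)
# beta | sigma2, y ~ N(beta_hat, sigma2 * (X'X)^{-1})
chol = np.linalg.cholesky(XtX_inv)
z = rng.standard_normal((n_draws, K))
beta = beta_hat + np.sqrt(sigma2)[:, None] * (z @ chol.T)

ci_95 = np.percentile(beta, [2.5, 97.5], axis=0)  # 95% credible intervals per coefficient
```

Each row of `z @ chol.T` has covariance $(X'X)^{-1}$, so scaling by the square root of the matching `sigma2` draw gives the covariance required by Eq. (2.5.23).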
References
Abadir, K., Magnus, J., 2002. Notation in econometrics: a proposal for a standard. The Econometrics Journal 5, 76-90.
Banerjee, S., Carlin, B.P., Gelfand, A.E., 2014. Hierarchical Modeling and Analysis for Spatial Data, second ed. Chapman and Hall/CRC, Boca Raton.
Brooks, S., Gelman, A., Jones, G., Meng, X.L. (Eds.), 2011. Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC, Boca Raton.
Congdon, P., 2010. Applied Bayesian Hierarchical Methods. Chapman and Hall/CRC, London.
Gelfand, A.E., Smith, A.F.M., 1990. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85 (410), 398-409.
Han, X., Lee, L.F., 2013. Bayesian estimation and model selection for spatial Durbin error model with finite distributed lags. Regional Science and Urban Economics 43 (5), 816-837.
Hansen, L.P., 1982. Large sample properties of generalized methods of moments estimators. Econometrica 50 (4), 1029-1054.
Hastie, T., Tibshirani, R., 1990. Generalized Additive Models. Chapman and Hall/CRC, London.
Hayashi, F., 2000. Econometrics. Princeton University Press, Princeton.
LeSage, J.P., Pace, R.K., 2009. Introduction to Spatial Econometrics. Chapman and Hall/CRC, Boca Raton.
Rue, H., Martino, S., 2007. Approximate Bayesian inference for hierarchical Gaussian Markov random field models. Journal of Statistical Planning and Inference 137 (10), 3177-3192.
Ruppert, D., Wand, M.P., Carroll, R.J., 2003. Semiparametric Regression (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge University Press, Cambridge.
Seya, H., Tsutsumi, M., Yoshida, Y., Kawaguchi, Y., 2011. Empirical comparison of the various spatial prediction models: in spatial econometrics, spatial statistics, and semiparametric statistics. Procedia-Social and Behavioral Sciences 21, 120-129.
Wood, S.N., 2017. Generalized Additive Models: An Introduction With R, second ed. Chapman and Hall/CRC, Boca Raton.

CHAPTER THREE
Global and local indicators of spatial associations
Hajime Seya
Contents
3.1 Spatial weight matrix
3.1.1 Definition of the spatial weight matrix
3.1.2 Specification of the spatial weight matrix
3.1.3 Standardization of the spatial weight matrix
3.2 Testing for spatial autocorrelation
3.2.1 Testing for global spatial autocorrelation
3.2.2 Testing for local spatial autocorrelation
3.2.2.1 Local Moran statistic
3.2.2.2 Local Geary statistic
3.2.2.3 Gi and Gi* statistics
3.2.3 Examples
3.2.3.1 Japanese income data: an application of local Moran
3.2.3.2 Japanese population data: an application of local Geary
3.3 Testing for spatial heterogeneity
3.3.1 Testing for global spatial heterogeneity
3.3.2 Testing for local spatial heterogeneity: Hi statistic
References
3.1 Spatial weight matrix
In Chapter 2, we explained that bias may occur in standard errors of
regression coefficient estimates when spatial autocorrelation and/or spatial
heterogeneity exists in the residuals of the regression model. To confront
this problem, we can use spatial econometric models. A spatial weight
matrix, which we describe in the following subsections, is a central tool in
spatial econometrics.
Spatial Analysis Using Big Data
ISBN: 978-0-12-813127-5

3.1.1 Definition of the spatial weight matrix
The spatial weight matrix is a convenient and easy-to-understand tool for
addressing spatial autocorrelation among data. Here, the data may be
observed values or residuals obtained from a regression model. First, we
introduce the concept of spatial weight matrix with the following definition.
The $N \times N$ spatial weight matrix $W$ describes the relationship between the data observed at $i$ and $j$. Let $S_i$ denote the neighborhood (label) set consisting of the areas/points on which area/point $i$ depends. If there is a dependency between the data $y_i$ and $y_j$ observed at $i$ and $j$ (i.e., $j \in S_i$), the element of $W$ is given as $w_{ij} \neq 0$; if there is no dependency (i.e., $j \notin S_i$), then it is given as $w_{ij} = 0$.
Let us take an example used in Seya and Tsutsumi (2014). We consider a simple case of five areas {1, 2, 3, 4, 5}, as shown in Fig. 3.1.1. The coordinates of the central point of each area are given by $s_1, s_2, s_3, s_4, s_5$, respectively. Here, for instance, in the case of area 1, if it depends only on area 2, the neighborhood set of area 1, say $S_1$, becomes {2} (so that $w_{12} \neq 0$), and the remaining areas {3, 4, 5} are excluded from the neighborhood set ($w_{1j} = 0$ for $j = 3, 4, 5$). Because a neighborhood set can be considered for each of the areas {1, 2, 3, 4, 5}, we consider a matrix $W$ whose rows are indexed by $i$ and whose columns are indexed by $j$, with elements $w_{ij}$. To prevent an area from being treated as its own neighbor, the diagonal elements of the matrix are usually set to 0.
Unlike time series data, where unidirectional effects from old data to new data can be found, there is no clear order in cross-sectional spatial data. Hence we typically assume that the relationship is bidirectional, which can be modeled by setting both $w_{ij} \neq 0$ and $w_{ji} \neq 0$. Of course, we can also assume a unidirectional effect in an ad hoc manner, for instance, from a large economy to a small economy (Seya et al., 2012). Many ways of specifying $W$ can be considered, but those typically used in empirical research are as follows.¹
Figure 3.1.1 Virtual area.
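• Contiguity-based W: the weight is 1 if the zone boundaries are in contact (shared) and 0 if they are not (symmetric matrix). With rows indexing the affected area and columns the affecting area (areas 1 to 5 in both cases):
$$W = \begin{pmatrix}
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0
\end{pmatrix}. \tag{3.1.1}$$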
• k nearest neighbor (kNN)-based W: if point j is among the k nearest neighbors of point i, the weight is 1; otherwise, the weight is 0 (asymmetric matrix). For example, if $k = 2$, the following matrix is obtained.
$$W = \begin{pmatrix}
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0
\end{pmatrix} \tag{3.1.2}$$
Notably, the W obtained using kNN is not necessarily a symmetric matrix, because i is not necessarily among the k nearest neighbors of area j even if j is among the k nearest neighbors of area i. One more point to note is that, in the case of $k = 3$, {2, 4} is included in the neighborhood set $S_5$ of area 5, but it is not obvious which of {1, 3} should be included. Therefore, it is necessary to exogenously select one of them, or to include both (with weight 1/2 each, if needed).
¹ The typical spatial weight matrices listed here can be easily implemented using the R spdep package, GeoDa, etc.

• Inverse-distance-based W without a distance cutoff:
$$W = \begin{pmatrix}
0 & 1/2 & 1/2 & 1/2.83 & 1/4.12 \\
1/2 & 0 & 1/2.83 & 1/2 & 1/2.24 \\
1/2 & 1/2.83 & 0 & 1/2 & 1/4.12 \\
1/2.83 & 1/2 & 1/2 & 0 & 1/2.24 \\
1/4.12 & 1/2.24 & 1/4.12 & 1/2.24 & 0
\end{pmatrix} \approx \begin{pmatrix}
0 & 0.50 & 0.50 & 0.35 & 0.24 \\
0.50 & 0 & 0.35 & 0.50 & 0.45 \\
0.50 & 0.35 & 0 & 0.50 & 0.24 \\
0.35 & 0.50 & 0.50 & 0 & 0.45 \\
0.24 & 0.45 & 0.24 & 0.45 & 0
\end{pmatrix} \tag{3.1.3}$$
This W uses the inverse of the (typically Euclidean) distance and does not set a cutoff beyond which the weight becomes 0. It is a symmetric matrix (in the Euclidean case). Here, we set $w_{ij} = (1/d_{ij})^{\alpha}$ with $\alpha = 1$, but in empirical analysis $\alpha = 2$ is also used, by analogy with gravity models.
However, it has been suggested in recent years that, when a dense matrix with $w_{ij} \neq 0$ for almost all elements is used, the spatial process may be overly smoothed and the spatial correlation parameters may be systematically underestimated in the spatial econometric model. Hence care is needed when applying it (see Smith, 2009; Arbia et al., 2019).
• Inverse-distance-based W with a distance cutoff
This W uses the inverse of the (typically Euclidean) distance with a cutoff that sets the weight to 0 if the distance exceeds a certain value. It is a symmetric matrix (in the Euclidean case). For example, if we set the cutoff to 2.5 km (i.e., weights below 1/2.5 = 0.40 are set to 0), the following matrix is obtained.
$$W = \begin{pmatrix}
0 & 0.50 & 0.50 & 0 & 0 \\
0.50 & 0 & 0 & 0.50 & 0.45 \\
0.50 & 0 & 0 & 0.50 & 0 \\
0 & 0.50 & 0.50 & 0 & 0.45 \\
0 & 0.45 & 0 & 0.45 & 0
\end{pmatrix} \tag{3.1.4}$$
Instead of the Euclidean distance, LeSage and Polasek (2008) use a traffic network distance. With a traffic network distance, W need not be symmetric if one-way roads exist.
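The distance-based constructions listed above can be sketched in a few lines. The coordinates below are hypothetical: they are not given in the text and were chosen only because they reproduce the pairwise distances (2, 2.83, 4.12, 2.24) that appear in Eq. (3.1.3). The function names are likewise illustrative; in practice, packages such as R spdep provide these constructions.

```python
import numpy as np

# Hypothetical centroids s_1..s_5, chosen to match the distances in Eq. (3.1.3).
coords = np.array([[0, 2], [2, 2], [0, 0], [2, 0], [4, 1]], dtype=float)
diff = coords[:, None, :] - coords[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))        # Euclidean distance matrix

def knn_weights(D, k):
    """w_ij = 1 if j is among the k nearest neighbors of i (ties broken by index)."""
    n = D.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(D[i], kind="stable")
        neighbors = [j for j in order if j != i][:k]
        W[i, neighbors] = 1.0
    return W

def inverse_distance_weights(D, alpha=1.0, cutoff=None):
    """w_ij = (1 / d_ij)^alpha off the diagonal, set to 0 beyond an optional cutoff."""
    W = np.zeros_like(D)
    off_diag = ~np.eye(D.shape[0], dtype=bool)
    W[off_diag] = 1.0 / D[off_diag] ** alpha
    if cutoff is not None:
        W[D > cutoff] = 0.0
    return W

W_knn = knn_weights(D, k=2)                           # compare with Eq. (3.1.2)
W_inv = inverse_distance_weights(D)                   # compare with Eq. (3.1.3)
W_cut = inverse_distance_weights(D, cutoff=2.5)       # compare with Eq. (3.1.4)
```

Breaking ties by index is an arbitrary choice made here; as noted above, ties such as the one between areas 1 and 3 for area 5 when $k = 3$ must be resolved exogenously.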

3.1.2 Specification of the spatial weight matrix
The choice of weight matrix W affects both estimations and inferences (e.g.,
Florax and Folmer, 1992; Griffith, 1996; Stakhovych and Bijmolt, 2009).
However, currently, there are few guidelines for selecting the correct W
(Anselin, 2002). Following Stakhovych and Bijmolt (2009), we classified
the means of providing W into the following three types:
[A] Completely exogenous
[B] Determining from the data
[C] Estimating
[A] is a typical method provided by whether area boundaries are in
contact (contiguity-based W) as previously mentioned, as well as the inverse
distance, Delaunay triangulation, and so forth (LeSage, 1999).
For [B], Aldstadt and Getis (2006) proposed an algorithm termed AMOEBA (a multidirectional optimum ecotope-based algorithm),² in which the geometric form of spatial clusters is identified, similar to a region-growing algorithm in image segmentation. The distance need not be limited to the geographic one. We can consider other types of distance, including socioeconomic distance (e.g., difference in gross domestic product [GDP]), migration flow (Conway and Rork, 2004), and technological similarity (Lychagin et al., 2016), among others. These methods may also be categorized as [B].
However, we need to note that methods for parameter estimations and
inferences of spatial econometric models have been developed for exogenous
W. If W is endogenous, we need to rely on recently proposed parameter
estimation methods for endogenous W (e.g., Qu and Lee, 2015; Zhou
et al., 2016). Kostov (2010) noted that the reason why W based on geograph-
ical distance is often used is that the exogeneity is automatically satisfied.
There are very few studies classified as [C]. Fernández-Vázquez et al. (2009) tried to estimate W by generalized maximum entropy and generalized cross-entropy techniques. In a panel setting where the number of time series variables can grow faster than the number of time points, studies have employed the (graphical) LASSO to estimate W (Ahrens and Bhattacharjee, 2015; Moscone et al., 2017; Lam and Souza, 2019).
Because research on estimating the elements of W in [C] is still under development, it is common to select the most adequate W from prepared candidates. Examples include Kostov (2010), based on boosting, and LeSage and Pace (2009), who attempted model selection using posterior model probabilities based on a Bayesian approach.³ The latter Bayesian approach is useful for the selection of both X and W (LeSage and Pace, 2009). Kelejian (2008) and Kelejian and Piras (2011) extended the J-test, which is a representative method of nonnested model selection, to the spatial econometric model. Meanwhile, Mur and Angulo (2009) noted the usefulness of information criteria (comprehensibility and clarity of results, in the sense of selecting a unique best model). Similarly, Stakhovych and Bijmolt (2009) stated that using an information criterion (e.g., Akaike's information criterion [AIC]) increases the possibility of correct model specification. Seya et al. (2013) attempted to simultaneously select W and X to minimize the AIC by applying a technique termed transdimensional simulated annealing, which combines reversible jump Markov chain Monte Carlo and simulated annealing. Other attempts rely on model ensembles rather than model selection; examples are Bayesian model averaging (LeSage and Parent, 2007) and weighted average least squares (WALS) (Seya et al., 2014).⁴
² GIS software that implements AMOEBA is available at the website of the first author (http://www.acsu.buffalo.edu/wgeojared/tools.htm). It can also be implemented using the AMOEBA package of R.
3.1.3 Standardization of the spatial weight matrix
The spatial lag model (SLM), described in Chapter 6, is a representative spatial econometric model. For the maximum likelihood estimation of the parameters of this model, the term $(I - \rho W)^{-1}$ must be calculated during the estimation process (here $\rho$ is a parameter that indicates the degree of spatial autocorrelation). In many previous studies, $|\rho| < 1$ is assumed by analogy with time series models, but the singular points of $(I - \rho W)$ do not necessarily lie outside the range $(-1, 1)$. Therefore, some standardization is typically performed on W to guarantee the existence of the inverse matrix. The most widely used standardization is row-standardization (also termed row-normalization), in which the elements of each row of W sum to unity. With the contiguity-based W (Eq. 3.1.1), row-standardization gives
$$W = \begin{pmatrix}
0 & 1/2 & 1/2 & 0 & 0 \\
1/3 & 0 & 0 & 1/3 & 1/3 \\
1/2 & 0 & 0 & 1/2 & 0 \\
0 & 1/3 & 1/3 & 0 & 1/3 \\
0 & 1/2 & 0 & 1/2 & 0
\end{pmatrix} \tag{3.1.5}$$
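Row-standardization can be sketched as below, using the contiguity matrix implied by Eq. (3.1.5). The value `rho = 0.5` is an arbitrary assumption used only to show that the inverse exists. For a non-negative row-standardized matrix the largest eigenvalue in absolute value is 1, which is why $(I - \rho W)$ is invertible for any $|\rho| < 1$.

```python
import numpy as np

# Contiguity-based W for the five areas (binary, symmetric).
W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 1],
              [1, 0, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [0, 1, 0, 1, 0]], dtype=float)

# Row-standardization: divide each row by its sum so that every row sums to 1.
W_rs = W / W.sum(axis=1, keepdims=True)

# The spectral radius of a row-standardized non-negative matrix is 1.
spectral_radius = np.max(np.abs(np.linalg.eigvals(W_rs)))

rho = 0.5                                        # assumed value, for illustration only
A_inv = np.linalg.inv(np.eye(5) - rho * W_rs)    # (I - rho W)^{-1}
```

Note that `W_rs` is no longer symmetric even though the binary contiguity matrix is, because rows with different numbers of neighbors are scaled differently.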
³ The Spatial Econometrics Toolbox of Matlab is available at the website of the first author.
⁴ See Magnus and De Luca (2016) for details about the WALS.

terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.
1.E. Unless you have removed all references to Project Gutenberg:
1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.
1.E.2. If an individual Project Gutenberg™ electronic work is derived
from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.
1.E.3. If an individual Project Gutenberg™ electronic work is posted
with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.
1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.
1.E.5. Do not copy, display, perform, distribute or redistribute this
electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.
1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.
1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.
1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:
• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”
• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.
• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.
• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.
1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.
1.F.
1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.
1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for
the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.
1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.
1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.
1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.
1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,
the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.
Section 2. Information about the Mission of Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.
Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.
Section 3. Information about the Project Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.
The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact
Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.
The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.
While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.
International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.
Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.
Section 5. General Information About Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

