Geotechnologies and the Environment

Wenwu Tang
Shaowen Wang
Editors

High Performance Computing for Geospatial Applications
Geotechnologies and the Environment
Volume 23

Series Editors
Jay D. Gatrell, Department of Geology & Geography, Eastern Illinois University, Charleston, IL, USA
Ryan R. Jensen, Department of Geography, Brigham Young University, Provo, UT, USA
The Geotechnologies and the Environment series is intended to provide specialists in the geotechnologies, and academics who utilize these technologies, with an opportunity to share novel approaches, present interesting (sometimes counterintuitive) case studies, and, most importantly, to situate GIS, remote sensing, GPS, the internet, new technologies, and methodological advances in a real-world context. In doing so, the books in the series will be inherently applied and reflect the rich variety of research performed by geographers and allied professionals.

Beyond the applied nature of many of the papers and individual contributions, the series interrogates the dynamic relationship between nature and society. For this reason, many contributors focus on human-environment interactions. The series is not limited to an interpretation of the environment as nature per se. Rather, the series "places" people and social forces in context and thus explores the many socio-spatial environments humans construct for themselves as they settle the landscape. Consequently, contributions will use geotechnologies to examine both urban and rural landscapes.

More information about this series at http://www.springer.com/series/8088
Wenwu Tang • Shaowen Wang
Editors

High Performance Computing for Geospatial Applications
Preface
“Use the right tool for the job.”
While high-performance computing (HPC) has been recognized as the right tool for computationally intensive geospatial applications, there is often a gap between the rapid development of HPC approaches and their geospatial applications, which tend to lag behind. The objective of this edited book is to (help) fill this gap so that this important right tool can be used in an appropriate and timely manner.
This book includes fifteen chapters that examine the utility of HPC for novel geospatial applications. Chapter 1 serves as an introduction to the entire book, overarching all the other fourteen chapters, which are organized into four parts. Part I (Chaps. 2 and 3) focuses on theoretical and algorithmic aspects of HPC within the context of geospatial applications. Part II (Chaps. 4–9) concentrates on how HPC is applied to geospatial data processing, spatial analysis and modeling, and cartography and geovisualization. Part III (Chaps. 10–14) covers representative geospatial applications of HPC from multiple domains. Part IV (Chap. 15) is a prospective view of HPC for future geospatial applications.
This book serves as a collection of recent work, written as review and research papers, on how HPC is applied to solve a variety of geospatial problems. As advanced computing technologies keep evolving, HPC will continue to function as the right tool for the resolution of many complex geospatial problems that are often computationally demanding. The key is to understand both the computational and geospatial nature of these problems to best exploit the amazing power of HPC. The book is designed to serve this key purpose, which may help readers identify pertinent opportunities and challenges revolving around geospatial applications of HPC, that is, to use the right tool for the job at the right time.
Charlotte, NC, USA Wenwu Tang
Urbana, IL, USA Shaowen Wang
February 28, 2020
Acknowledgements
The editors want to take this opportunity to sincerely thank all the coauthors and reviewers of the chapters of this book for their considerable efforts and hard work. The editors also owe special thanks to Zachary Romano and Silembarasan Panneerselvam from Springer Nature for their strong support and timely help during the review and editing process of this book. Minrui Zheng and Tianyang Chen provided assistance with the formatting of the manuscripts of this book.
Nothing is more motivating than a baby's cry; nothing is more relaxing than baby giggles. Wenwu Tang wants to thank his wife, Shuyan Xia, and two sons, William Tang and Henry Tang, for their love and understanding. The preparation of this book was accompanied by the birth of baby Henry. While it is always challenging to balance interesting academic work and continuous family duties, enduring support from his family is the greatest source of strength for completing this book.
Shaowen Wang wants to acknowledge the support of the U.S. National Science Foundation under grant numbers 1443080 and 1743184. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Contents

1 Navigating High Performance Computing for Geospatial Applications
Wenwu Tang and Shaowen Wang

Part I Theoretical Aspects of High Performance Computing

2 High Performance Computing for Geospatial Applications: A Retrospective View
Marc P. Armstrong

3 Spatiotemporal Domain Decomposition for High Performance Computing: A Flexible Splits Heuristic to Minimize Redundancy
Alexander Hohl, Erik Saule, Eric Delmelle, and Wenwu Tang

Part II High Performance Computing for Geospatial Analytics

4 Geospatial Big Data Handling with High Performance Computing: Current Approaches and Future Directions
Zhenlong Li

5 Parallel Landscape Visibility Analysis: A Case Study in Archaeology
Minrui Zheng, Wenwu Tang, Akinwumi Ogundiran, Tianyang Chen, and Jianxin Yang

6 Quantum Computing for Solving Spatial Optimization Problems
Mengyu Guo and Shaowen Wang

7 Code Reusability and Transparency of Agent-Based Modeling: A Review from a Cyberinfrastructure Perspective
Wenwu Tang, Volker Grimm, Leigh Tesfatsion, Eric Shook, David Bennett, Li An, Zhaoya Gong, and Xinyue Ye

8 Integration of Web GIS with High-Performance Computing: A Container-Based Cloud Computing Approach
Zachery Slocum and Wenwu Tang

9 Cartographic Mapping Driven by High-Performance Computing: A Review
Wenwu Tang

Part III Domain Applications of High Performance Computing

10 High-Performance Computing for Earth System Modeling
Dali Wang and Fengming Yuan

11 High-Performance Pareto-Based Optimization Model for Spatial Land Use Allocation
Xiaoya Ma, Xiang Zhao, Ping Jiang, and Yuangang Liu

12 High-Performance Computing in Urban Modeling
Zhaoya Gong and Wenwu Tang

13 Building a GPU-Enabled Analytical Workflow for Maritime Pattern Discovery Using Automatic Identification System Data
Xuantong Wang, Jing Li, and Tong Zhang

14 Domain Application of High Performance Computing in Earth Science: An Example of Dust Storm Modeling and Visualization
Qunying Huang, Jing Li, and Tong Zhang

Part IV Future of High Performance Computing for Geospatial Applications

15 High Performance Computing for Geospatial Applications: A Prospective View
Marc P. Armstrong

Index
Contributors

Li An Department of Geography and PKU-SDSU Center for Complex Human-Environment Systems, San Diego State University, San Diego, CA, USA
Marc P. Armstrong Department of Geographical and Sustainability Sciences, The University of Iowa, Iowa City, IA, USA
David Bennett Department of Geographical and Sustainability Sciences, University of Iowa, Iowa City, IA, USA
Tianyang Chen Center for Applied Geographic Information Science, The University of North Carolina at Charlotte, Charlotte, NC, USA; Department of Geography and Earth Sciences, The University of North Carolina at Charlotte, Charlotte, NC, USA
Eric Delmelle Department of Geography and Earth Sciences, University of North Carolina at Charlotte, Charlotte, NC, USA
Zhaoya Gong School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, UK
Volker Grimm Department of Ecological Modelling, Helmholtz Center for Environmental Research-UFZ, Leipzig, Germany
Mengyu Guo University of California, Berkeley, CA, USA
Alexander Hohl Department of Geography, University of Utah, Salt Lake City, UT, USA
Qunying Huang Department of Geography, University of Wisconsin-Madison, Madison, WI, USA
Ping Jiang School of Resource and Environmental Science, Wuhan University, Wuhan, Hubei, China
Jing Li Department of Geography and the Environment, University of Denver, Denver, CO, USA
Yuangang Liu School of Geosciences, Yangtze University, Wuhan, Hubei, China
Zhenlong Li Geoinformation and Big Data Research Laboratory, Department of Geography, University of South Carolina, Columbia, SC, USA
Xiaoya Ma School of Geosciences, Yangtze University, Wuhan, Hubei, China; Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong, China
Akinwumi Ogundiran Department of Africana Studies, The University of North Carolina at Charlotte, Charlotte, NC, USA
Erik Saule Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
Eric Shook Department of Geography, Environment, and Society, University of Minnesota, Minneapolis, MN, USA
Zachery Slocum Center for Applied Geographic Information Science, Department of Geography and Earth Sciences, University of North Carolina at Charlotte, Charlotte, NC, USA
Wenwu Tang Center for Applied Geographic Information Science, Department of Geography and Earth Sciences, University of North Carolina at Charlotte, Charlotte, NC, USA
Leigh Tesfatsion Department of Economics, Iowa State University, Ames, IA, USA
Dali Wang Oak Ridge National Laboratory, Oak Ridge, TN, USA
Shaowen Wang Department of Geography and Geographic Information Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Xuantong Wang Department of Geography and the Environment, University of Denver, Denver, CO, USA
Jianxin Yang School of Public Administration, China University of Geosciences (Wuhan), Wuhan, China
Xinyue Ye Department of Informatics, New Jersey Institute of Technology, Newark, NJ, USA
Fengming Yuan Oak Ridge National Laboratory, Oak Ridge, TN, USA
Tong Zhang State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, Hubei, China
Xiang Zhao School of Resource and Environmental Science, Wuhan University, Wuhan, Hubei, China
Minrui Zheng Center for Applied Geographic Information Science, The University of North Carolina at Charlotte, Charlotte, NC, USA; Department of Geography and Earth Sciences, The University of North Carolina at Charlotte, Charlotte, NC, USA
1 Navigating High Performance Computing for Geospatial Applications

Wenwu Tang and Shaowen Wang
The importance of HPC in supporting domain-specific problem-solving has been widely recognized. Over the past decade in particular, HPC has been increasingly introduced into geospatial applications to resolve computational and big data challenges facing these applications, driven by the emergence of cyberGIS (Wang 2010).
Therefore, investigating how HPC has been applied to geospatially related studies is important and timely. This book was edited to serve this purpose and covers the following four parts: (1) theoretical and algorithmic aspects of HPC for geospatial applications; (2) applications of HPC in geospatial analytics, including data processing, spatial analysis, spatial modeling, and cartography and geovisualization; (3) domain applications of HPC within geospatial contexts; and (4) the future of HPC for geospatial applications. Below are the highlights of the chapters in these four parts.
Theories are fundamental in guiding us to fully and properly harness HPC power for geospatial applications. In Chap. 2, Armstrong provides a brief overview of the computational complexity associated with many types of geospatial analyses and describes the ways in which these challenges have been addressed in previous research. Different computer architectures are elucidated, as are current approaches that employ network-enabled distributed resources, including cloud and edge computing, in the context of cyberinfrastructure.
Domain decomposition is one of the important parallel strategies for HPC from the algorithmic perspective. Spatiotemporal domain decomposition divides a computational task into smaller ones by partitioning input datasets and related analyses along the spatiotemporal domain. In Chap. 3, Hohl et al. present a spatiotemporal domain decomposition method (ST-FLEX-D) that supports flexible domain splits to minimize data replication, refining a static partitioning approach. Two spatiotemporal datasets, dengue fever and pollen tweets, were used in their study. Their spatiotemporal domain decomposition strategies hold potential for tackling the computational challenges facing spatiotemporal analysis and modeling.
Geospatial big data plays an increasingly important role in addressing numerous scientific and societal challenges. Efficacious handling of geospatial big data through HPC is critical if we are to extract meaningful information that supports knowledge discovery and decision-making. In Chap. 4, Li summarizes key components of geospatial big data handling using HPC and then discusses existing HPC platforms that we can leverage for geospatial big data processing. Three future research directions are suggested for utilizing HPC for geospatial big data handling.
Viewshed analysis is a representative spatial analysis approach with a variety of geospatial applications. In Chap. 5, Zheng et al. propose a parallel computing approach to handle the computational and data challenges of viewshed analysis within the context of archaeological applications. This chapter focuses on bridging the gap in conducting computationally intensive viewshed analysis for geospatial applications and on investigating the capabilities of landscape pattern analysis in understanding the location choices of human settlements.
Spatial modeling has three pillars: spatial statistics, spatial optimization, and spatial simulation (Tang et al. 2018). HPC has been extensively applied to enable these spatial modeling approaches. For example, quantum computing has been receiving much attention for its potential power in resolving computational challenges facing complex problems. In Chap. 6, Guo and Wang present one of the quantum computing approaches, quantum annealing, for tackling combinatorial optimization problems. An example of applying a p-median model to a spatial supply chain optimization, representative of spatial optimization problems, is used in this chapter.
Agent-based modeling falls within the category of spatial simulation, a pillar of spatial modeling. In Chap. 7, Tang et al. present a discussion of the code reusability and transparency of agent-based models from a cyberinfrastructure perspective. Code reusability and transparency pose a significant challenge for spatial models in general and agent-based models in particular. Tang et al. investigate in detail the utility of cyberinfrastructure-based capabilities, such as cloud computing and high-performance parallel computing, in resolving the challenge of code reusability and transparency in agent-based modeling.
Stimulated by cyberinfrastructure technologies, Web GIS has become more and more popular for geospatial applications. However, the processing and visual presentation of increasingly available geospatial data using Web GIS are often computationally challenging. In Chap. 8, Slocum and Tang examine the convergence of Web GIS with HPC by implementing an integrative framework, GeoWebSwarm. Container technologies and their modular design are demonstrated to be effective for integrating HPC and Web GIS to resolve spatial big data challenges. GeoWebSwarm has been made available on GitHub.
While HPC has been amply applied to geospatial data processing, analysis, and modeling, its applications in cartography and geovisualization seem less developed. In Chap. 9, Tang conducts a detailed review of HPC applications in cartographic mapping. This review concentrates on four major cartographic mapping steps: map projection, cartographic generalization, mapping methods, and map rendering. Challenges associated with big data handling and spatiotemporal mapping are discussed within the context of cartography and geovisualization.
Earth system models have been developed for understanding global-level climate change and environmental issues. However, such models often require support from HPC to resolve related computational challenges. In Chap. 10, Wang and Yuan conduct a review of HPC applications for Earth system modeling, specifically covering the early phase of model development, coupled system simulation, I/O challenges, and in situ data analysis. The chapter then identifies fundamental challenges of Earth system modeling, ranging from computing heterogeneity and coupling mechanisms to accelerator-based model implementation, artificial intelligence-enhanced numerical simulation, and in situ data analysis driven by artificial intelligence.
Geospatial applications in the context of land use and land cover change have gained considerable attention over the years. Alternative spatial models have been developed to support the investigation of complex land dynamics. Pareto-based heuristic optimization models, belonging to the category of spatial optimization for land change modeling, are effective tools for modeling the land use decision-making of multiple stakeholders. However, Pareto-based optimization models for spatial land use allocation are extremely time-consuming, which is one of the major challenges in obtaining the Pareto frontier in spatial land use allocation problems. In Chap. 11, Ma et al. report the development of a high-performance Pareto-based optimization model based on shared- and distributed-memory computing platforms to better support multi-objective decision-making within the context of land use planning.
Leveraging HPC to enable large-scale computational urban models requires an in-depth understanding of spatially explicit computational and data intensity. In Chap. 12, Gong and Tang review the design and development of HPC-enabled urban models in a series of application categories. A case study of a general urban system model and its HPC implementation is described to show the utility of parallel urban models in resolving the challenge of computational intensity.
The advancement of sensor technologies leads to huge amounts of location-based data, often available at fine levels, such as the ship tracking data collected by the automatic identification system (AIS). Yet, the analysis of these location-aware data is often computationally demanding. In Chap. 13, Wang et al. describe a graphics processing unit (GPU)-based approach that supports automatic and accelerated analysis of AIS trajectory data for the discovery of interesting patterns. Their parallel algorithms were implemented at two levels: trajectory and point. A web-based portal was developed for the visualization of these trajectory-related data.
A range of HPC technologies is available for geospatial applications. However, the selection of appropriate HPC technologies for specific applications represents a challenge, as these applications often involve different data and modeling approaches. In Chap. 14, Huang et al. present a generalized HPC architecture that provides multiple computing options for Earth science applications. This generalized architecture is based on the integration of common HPC platforms and libraries that include data- and computing-based technologies. A dust storm example is used to illustrate the capability of the proposed architecture.
Understanding the past and present of geospatial applications of HPC provides better insight into their future. In Chap. 15, Armstrong highlights current advances in HPC and suggests several alternative approaches that are likely candidates for future exploitation of performance improvement for geospatial applications. A particular focus is placed on the use of heterogeneous processing, where problems are first decomposed and the resulting parts are then assigned to different, well-suited types of architectures. The chapter also includes a discussion of quantum computing, which holds considerable promise for becoming a new generation of HPC.
In summary, HPC has been playing a pivotal role in meeting the computational needs of geospatial applications and enhancing the quality of computational solutions. This trend will likely persist into the foreseeable future, particularly as transformative advances in studies of big data and artificial intelligence, together with the lowering of HPC's learning curve for geospatial applications, are continuously driven by the research and development of cyberGIS (Wang and Goodchild 2018).
References

Armstrong, M. P. (2000). Geography and computational science. Annals of the Association of American Geographers, 90(1), 146–156.
NSF. (2007). Cyberinfrastructure vision for 21st century discovery (Report of NSF Council). Retrieved from http://www.nsf.gov/od/oci/ci_v5.pdf
Tang, W., Feng, W., Deng, J., Jia, M., & Zuo, H. (2018). Parallel computing for geocomputational modeling. In GeoComputational analysis and modeling of regional systems (pp. 37–54). Cham: Springer.
Wang, S. (2010). A CyberGIS framework for the synthesis of cyberinfrastructure, GIS, and spatial analysis. Annals of the Association of American Geographers, 100(3), 535–557.
Wang, S., & Goodchild, M. F. (Eds.). (2018). CyberGIS for geospatial innovation and discovery. Dordrecht: Springer.
2 High Performance Computing for Geospatial Applications: A Retrospective View

Marc P. Armstrong
Such problems require considerable amounts of memory and processing time and, even now, remain intractable. Other spatial analysis methods also require substantial amounts of computation to generate solutions. This chapter briefly reviews the computational complexity of different kinds of geospatial analyses and traces the ways in which HPC has been used in the past and present eras. Several HPC approaches have been investigated, with developments shifting from an early focus on manufacturer-specific systems, which in most cases had idiosyncrasies (such as parallel language extensions) that limited portability. This limitation was recognized and addressed by using software environments designed to free programmers from this type of system dependence (e.g., the Message Passing Interface). This is also acknowledged, for example, by the work presented in Healey et al. (1998), with its focus on algorithms rather than implementations. Later approaches used flexible configurations of commodity components linked by high-performance networks. In the final section of this chapter, still-emerging approaches such as cloud and edge computing are briefly described.
2 Computational Complexity of Geospatial Analyses: A Brief Look
Computing solutions to geospatial problems, even relatively small ones, often requires substantial amounts of processing time. In its most basic form, time complexity considers how the number of executed instructions interacts with the number of data elements in an analysis. Best-case, average, and worst-case scenarios are sometimes described, with the worst case normally reported. In describing complexity, it is normative to use "Big O" notation, where O represents the order of, as in "the algorithm executes on the order of n², or O(n²)." In geospatial data operations it is common to encounter algorithms that have an order of at least O(n²) because such complexity obtains for nested loops, though observed complexity is slightly less since diagonal values are often not computed (the distance from a point to itself is 0). Nevertheless, the complexity remains O(n²), or quadratic, because the n² factor controls the limit. In practice, any complexity that is n² or worse becomes intractable for large problems. Table 2.1 provides some simple examples to demonstrate the explosive growth in computational requirements of different orders of time complexity. The remaining parts of this section sketch out some additional examples of complexity for different kinds of geospatial analyses.
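The nested-loop pattern behind this quadratic growth is easy to see in code. The minimal Python sketch below (an illustration with hypothetical coordinates, not code from the chapter) fills a distance matrix and counts the n² − n evaluations that remain after skipping the diagonal.

```python
import math

# Pairwise distances among n points: two nested loops give O(n^2) work.
pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]  # hypothetical coordinates
ops = 0
dist = [[0.0] * len(pts) for _ in range(len(pts))]
for i, (xi, yi) in enumerate(pts):
    for j, (xj, yj) in enumerate(pts):
        if i == j:
            continue  # diagonal skipped: distance from a point to itself is 0
        dist[i][j] = math.hypot(xi - xj, yi - yj)
        ops += 1

print(ops)  # n^2 - n evaluations; still O(n^2) as n grows
```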
• Spatial optimization problems impose extreme computational burdens. Consider the p-median problem, in which p facility locations are selected from n candidate demand sites such that distances between facilities and demand locations are minimized. In its most basic form, a brute-force search for a solution requires the evaluation of n!/[(n−p)!p!] alternatives (see Table 2.2 for some examples).
• As a result of this explosive combinatorial complexity, much effort has been expended to develop robust heuristics that reduce the computational complexity of search spaces. Densham and Armstrong (1994) describe the use of the Teitz and Bart vertex substitution heuristic for two case studies in India. This algorithm has a worst-case complexity of O(p·n²). In one of their problems, 130 facilities are located at 1664 candidate sites; in a larger problem, 2500 facilities are located among 30,000 candidates. The smaller problem required the evaluation of 199,420 substitutions per iteration, while the larger problem required 68,750,000 evaluations. Thus, a problem that was approximately 19 times larger required 344 times the number of substitutions to be evaluated during each iteration. In both cases, however, the heuristic search space was far smaller than the full universe of alternatives.
• Many geospatial methods are based on the concept of neighborhood and require the computation of distances among geographical features such as point samples and polygon centroids. For example, the Gi*(d) method (Ord and Getis 1995) requires pairwise distance computations to derive the distance-based weights used in the computation of results. Armstrong et al. (2005) report a worst-case time complexity of O(n³) in their implementation.
• Bayesian statistical methods that employ Markov chain Monte Carlo (MCMC) are computationally demanding in terms of memory and compute time. MCMC samples are often derived using a Gibbs sampler or Metropolis–Hastings approach that may yield autocorrelated samples, which in turn require larger sample sizes to make inferences. Yan et al. (2007) report, for example, that a Bayesian geostatistical model requires O(n²) memory for n observations, and the Cholesky decomposition of this matrix requires O(n³) computation.

Table 2.1 Number of operations required with variable orders of time complexity and problem sizes

n     log n    n log n   n²     n³     2ⁿ          n!
1     0        0         1      1      2           1
2     0.301    0.602     4      8      4           2
3     0.477    1.431     9      27     8           6
4     0.602    2.408     16     64     16          24
5     0.699    3.495     25     125    32          120
6     0.778    4.668     36     216    64          720
7     0.845    5.915     49     343    128         5040
8     0.903    7.224     64     512    256         40,320
9     0.954    8.586     81     729    512         362,880
10    1        10        100    1000   1024        3,628,800
20    1.301    26.021    400    8000   1,048,576   2.43E+18

Table 2.2 Brute-force solution size for four representative p-median problems

n candidates   p facilities   Possible solutions
10             3              120
20             5              15,504
50             10             10,272,278,170
100            15             253,338,471,349,988,640
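The solution counts in Table 2.2 can be reproduced directly. The one-loop Python sketch below (illustrative only, not from the chapter) evaluates n!/[(n−p)!p!] for the four representative problems.

```python
from math import comb  # comb(n, p) = n!/[(n-p)!p!]

# Brute-force p-median search-space sizes from Table 2.2.
for n, p in [(10, 3), (20, 5), (50, 10), (100, 15)]:
    print(f"n={n:3d}, p={p:2d}: {comb(n, p):,} possible solutions")
```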
These examples are only the proverbial tip of the iceberg. Computational burdens are exacerbated when many methods (e.g., interpolation) are used with big data, a condition that is now routinely encountered. Big data collections also introduce substantial latency penalties during input-output operations.
3 Performance Evaluation
Until recently, computational performance routinely and steadily increased as a consequence of Moore's Law and Dennard scaling. These improvements also applied to the individual components of parallel systems. But those technological artifacts are the result of engineering innovations and do not reflect conceptual advances due to the creation of algorithms that exploit the characteristics of problems. To address this assessment issue, performance is often measured using "speedup," which is simply a fractional measure of improvement (t₁/t₂), where the times are normally execution times achieved using either different processors or different numbers of processors. In the latter case,

Speedup = t₁/tₙ  (2.1)

where t₁ is the sequential (one-processor) time and tₙ is the time required with n processors.
Speedup is sometimes standardized by the number of processors used (n) and reported as efficiency, where

Efficiencyₙ = Speedupₙ/n  (2.2)
Perfect efficiencies are rarely observed, however, since normally there are parts of a program that must remain sequential. Amdahl's Law gives the theoretical maximum improvement that can be obtained using parallelism (Amdahl 1967):

Theoretical Speedup = 1/[(1 − p) + (p/n)]  (2.3)

where p is the proportion of the program that can be made parallel (1 − p is the serial part) and n is the number of processors.
As n grows, the right-hand term diminishes and speedup tends to 1/(1 − p). The consequence of Amdahl's Law is that the weakest link in the code will determine the maximum parallel effectiveness. The effect of Amdahl's Law can be observed in an example reported by Rokos and Armstrong (1996), in which serial input-output comprised a large fraction of total compute time, which had a deleterious effect on overall speedup.
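To make Eqs. (2.1)–(2.3) concrete, the following minimal sketch (an illustration, not code from the chapter) evaluates speedup, efficiency, and Amdahl's bound for a hypothetical program whose parallel fraction is p = 0.9.

```python
def speedup(t1: float, tn: float) -> float:
    """Eq. (2.1): ratio of sequential time to time on n processors."""
    return t1 / tn

def efficiency(s: float, n: int) -> float:
    """Eq. (2.2): speedup standardized by processor count."""
    return s / n

def amdahl(p: float, n: int) -> float:
    """Eq. (2.3): theoretical maximum speedup with parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

# Hypothetical program: 90% parallel; the bound tends to 1/(1 - p) = 10.
for n in (2, 8, 64, 1024):
    s = amdahl(0.9, n)
    print(f"n={n:4d}  speedup <= {s:5.2f}  efficiency <= {efficiency(s, n):.3f}")
```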
It should be noted that Amdahl had an axe to grind, as he was involved in the design of large mainframe uniprocessor computer systems for IBM (System/360) and later for Amdahl Computer, a lower-cost, plug-compatible competitor of IBM mainframes. In fact, Gustafson (1988, p. 533) reinterprets Amdahl's "law" and suggests that the computing community should overcome the "mental block" against massive parallelism imposed by a misuse of Amdahl's equation, asserting that speedup should be measured by scaling the problem to the number of processors.
4 Alternative Approaches to Implementing HPC Geospatial Applications
The earliest work on the application of HPC to geospatial analyses tended to focus first on the use of uniprocessor architectures with relatively limited parallelism. Subsequent work exploited pipelining and an increased number of processors executing in parallel. Both cases, however, attempted to employ vertical scaling, which relies on continued increases in the performance of system components, such as processor and network speeds. These vertically scaled parallel systems (Sect. 4.1) were expensive and thus scarce. Moreover, companies that produced such systems usually did not stay in business very long (see Kendall Square Research, Encore Computer, Alliant Computer Systems). A different way to think about gaining performance improvements is horizontal scaling, in which distributed resources are integrated into a configurable system linked using middleware (NIST 2015). This latter approach (Sect. 4.2) has been championed using a variety of concepts and terms, such as grid computing and cyberinfrastructure.
4.1 The Early Era: Vertical Scaling, Uni-Processors, and Early Architectural Innovations
The von Neumann architecture (see, for example, Stone and Cocke 1991) serves as a straightforward basis for modern computer architectures that have grown increasingly complex during the past several decades. In its most simple form, a computer system accepts input from a device, operates on that input, and moves the results of the operation to an output device. Inputs in the form of data and instructions are stored in memory, and operations are typically performed by a central processing unit (CPU) that can take many forms. Early computers used a single bus to access both data and instructions, and contention for this bus slowed program execution in what is known as the von Neumann "bottleneck." Though performance could be improved by increasing the clock speed of the processor, computer architects have also invented many other paths (e.g., the so-called Harvard architecture) that exploit multiple buses, a well-developed memory hierarchy (Fig. 2.1), and multiple cores and processors in an attempt to overcome processing roadblocks. In practice, most geospatial processing takes place using the limited number of cores that are available on the commodity chips used by desktop systems. Some current-generation CPUs, for example, feature six cores and 12 threads, thus enabling limited parallelism that may be exploited automatically by compilers and thus remain invisible to the average user.
When examining alternative architectures, it is instructive to use a generic taxonomy that places them into categories in a 2 × 2 grid, with the axes representing data streams and instruction streams (Flynn 1972). The top-left category (single instruction and single data streams; SISD) represents a simple von Neumann architecture. The MISD category (multiple instruction and single data streams; top right) has no modern commercial examples that can be used for illustration. The remaining two categories (the bottom row of Fig. 2.2) are parallel architectures that have had many commercial implementations and a large number of architectural offshoots.
The simple SISD model (von Neumann architecture) is sometimes referred to as a scalar architecture. This is illustrated in Fig. 2.3a, which shows that a single result is produced only after several low-level instructions have been executed.

The vector approach, as the name implies, operates on entire vectors of data. It requires the same number of steps to fill a "pipeline" but then produces a result with every clock cycle (Fig. 2.3b). An imperfect analogy would be that a scalar approach allows only one person on an escalator at a time, whereas a vector approach allows a person on each step, with the net effect of much higher throughput. Vector processing was a key performance enhancement of many early supercomputers, particularly those manufactured by Cray Research (e.g., the Cray X-MP), which had only limited shared-memory parallelism (August et al. 1989). The Cray-2 was a successor to the X-MP, and Griffith (1990) describes the advantages that accrue to geospatial applications with vectorized code. In particular, in the Fortran examples provided, the best improvements occur when short loops are processed; when nested do-loops are used, the inner loops are efficiently vectorized while the outer loops remain scalar.

Fig. 2.1 Memory hierarchy in an abstract representation of a uniprocessor. Level 1 cache is usually accessible in very few clock cycles, while access to other levels in the hierarchy typically requires an increased number of cycles
Fig. 2.2 Flynn's taxonomy represented in a 2 × 2 cross-classification (S single, M multiple, I instruction streams, D data streams)

Fig. 2.3 (a) (top) and (b) (bottom). The top table (a) shows a scalar approach that produces a single result after multiple clock cycles. The bottom table (b) shows that the same number of cycles (40) is required to compute the first result; after that, the full pipeline produces a result at the end of every cycle. (Adapted from Karplus 1989)
In a more conventional SIMD approach to parallel computing, systems are often organized as an array (such as 64 × 64 = 4 K) of relatively simple processors that are 4- or 8-connected. SIMD processing is extremely efficient for gridded data because, rather than cycling through a matrix element by element, all elements in an array (or a large portion of it) are processed in lockstep; a single operation (e.g., add two integers) is executed on all matrix elements simultaneously. This is often referred to as data parallelism and is particularly propitious for problems that are represented by regular geometrical tessellations, as encountered, for example, during cartographic modeling of raster data.
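As a sketch of this data-parallel style (an illustration only; the raster and values below are hypothetical), the NumPy fragment applies a single operation to every cell of a grid at once rather than cycling through the matrix element by element, much as a SIMD array or GPU accelerator would.

```python
import numpy as np

# A small synthetic elevation raster (rows x cols grid of cells).
elevation = np.random.default_rng(0).uniform(0.0, 500.0, size=(1024, 1024))

# Data parallelism: one operation applied to all cells "in lockstep,"
# instead of a nested element-by-element loop.
elevation_ft = elevation * 3.28084   # unit conversion on every cell at once
flooded = elevation < 10.0           # boolean mask over the whole grid

print(elevation_ft.shape, int(flooded.sum()), "cells below 10 m")
```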
In other cases, while significant improvements can be observed, processing efficiency may drop because of the intrinsically sequential nature of required computation steps. For example, Armstrong and Marciano (1995) reported substantial improvements over a then state-of-the-art workstation for a spatial statistics application using a SIMD MasPar system with 16 K processors, though efficiency values were more modest. In the current era, SIMD array-like processing is now performed using GPU (graphics processing unit) accelerators (e.g., Lim and Ma 2013; Tang and Feng 2017). In short, a graphics card plays the same role that an entire computer system played in the 1990s. Tang (2017, p. 3196) provides an overview of the use of GPU computing for geospatial problems. He correctly points out that data structures and algorithms must be transformed to accommodate SIMD architectures and that, if this is not properly performed, only modest improvements will obtain (see also Armstrong and Marciano 1997).
MIMD computers take several forms. In general, they consist of multiple processors connected by a local high-speed bus. One simple model is the shared memory approach, which hides much of the complexity of parallel processing from the user. This model typically does not scale well as a consequence of bus contention: all processors need to use the same bus to access the shared memory (see Fig. 2.4). This approach also requires that the integrity of shared data structures be maintained, since each process has equal memory modification rights. Armstrong et al. (1994) implemented a parallel version of G(d) (Ord and Getis 1995) using an Encore Multimax shared memory system with 14 processors. While performance improvements were observed, the results also show decreasing scalability, measured by speedup, particularly when small problems are processed, since the initialization and synchronization processes (performed sequentially) occupy a larger proportion of total compute time for those problems (Amdahl's Law strikes again). Shekhar et al. (1996) report better scalability in their application of data partitioning to support range query applications, but, again, their Silicon Graphics shared memory system had only 16 processors.

Fig. 2.4 A simplified view of a four-processor shared memory architecture. The connection between the bus and shared memory can become a choke-point that prevents scalability
A different approach to shared memory was taken by Kendall Square Research when implementing its KSR-1 system in the mid-1990s. Rather than employing a single monolithic shared memory, it used a hierarchical set of caches (ALLCACHE; Frank et al. 1993). As shown in Fig. 2.5, each processor can access its own memory as well as that of all other processing nodes to form a large virtual address space. Processors are linked by a high-speed bus divided into hierarchical zones, though the amount of time required to access different portions of shared memory from any given processor will vary if zone borders must be crossed. A processor can access its 256 K sub-cache in only 2 clock cycles and its own 32 MB local cache in 18 clock cycles; if memory access is required on another processor within a local zone, a penalty of 175 clock cycles is incurred (Armstrong and Marciano 1996). However, if a memory location outside this zone must be accessed, 600 cycles are required. This hierarchical approach exploits data locality and is a computer-architectural restatement of Tobler's First Law: all things are interconnected, but near things are interconnected faster than distant things.

Fig. 2.5 Non-uniform (hierarchical) memory structure of the KSR1 (after Armstrong and Marciano 1996)
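The cost of crossing zone borders can be made concrete with a back-of-the-envelope model. In the sketch below, the cycle counts come from the text, while the access-fraction mixes are hypothetical; the expected access cost is simply a weighted average over the hierarchy.

```python
# KSR-1 access costs in clock cycles, as reported in the text.
COST = {"subcache": 2, "local_cache": 18, "local_zone": 175, "remote_zone": 600}

def expected_access_cycles(fractions: dict) -> float:
    """Weighted-average access cost; fractions must sum to 1."""
    assert abs(sum(fractions.values()) - 1.0) < 1e-9
    return sum(fractions[level] * COST[level] for level in COST)

# Hypothetical workloads: good locality vs. poor locality.
good = {"subcache": 0.90, "local_cache": 0.08, "local_zone": 0.015, "remote_zone": 0.005}
poor = {"subcache": 0.50, "local_cache": 0.20, "local_zone": 0.15, "remote_zone": 0.15}

print(f"good locality: {expected_access_cycles(good):6.2f} cycles/access")
print(f"poor locality: {expected_access_cycles(poor):6.2f} cycles/access")
```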
4.2 Distributed Parallelism and Increased Horizontal Scalability
Horizontal scaling began to gain considerable traction in the early 1990s. In contrast to the vertically scaled systems typically provided by a single turn-key vendor, these new approaches addressed scalability issues using a shared-nothing approach that had scalability as a central design element. Stonebraker (1986), for example, described three conceptual architectures for multiprocessor systems (shared memory, shared disk, and shared nothing) and concluded that shared nothing was the superior approach for database applications.

Smarr and Catlett (1992, p. 45) pushed this concept further in conceptualizing a metacomputer as "a network of heterogeneous, computational resources linked by software in such a way that they can be used as easily as a personal computer." They suggest an evolutionary path in network connectivity from local area networks to wide area networks to a third stage: a transparent national network that relies on the development of standards that enable local nodes to interoperate in flexible configurations. These concepts continued to be funded by NSF and eventually evolved into what was termed grid computing (Foster and Kesselman 1999), with its central metaphor of the electric grid (with computer cycles substituting for electricity). Wang and Armstrong (2003), Armstrong et al. (2005), and Wang et al. (2008) illustrate the effectiveness of the grid computing paradigm for geospatial information analysis.
At around the same time, several other related concepts were being developed that bear some similarity to the grid approach. The Network of Workstations (NOW) project originated in the mid-1990s at UC Berkeley in an attempt to construct configurable collections of commodity workstations connected using what were then high-performance networks (Anderson et al. 1995). Armstrong and Marciano (1998) developed a NOW implementation (using the Message Passing Interface; Snir et al. 1996) to examine its feasibility for geospatial processing (inverse-distance-weighted interpolation). While substantial reductions in computing time were realized, the processor configuration achieved only moderate levels of efficiency when more than 20 processors were used due, in part, to communication latency penalties from the master-worker approach used to assign parallel tasks. At around the same time, Beowulf clusters were also developed with a similar architectural philosophy: commodity-class processors linked by Ethernet to construct a distributed parallel architecture.
4.3 Cyberinfrastructure and CyberGIS
Cyberinfrastructure is a related term that was promoted by the National Science Foundation beginning in the early 2000s (Atkins et al. 2003), and it continues into the present era with the establishment of NSF's Office of Advanced Cyberinfrastructure, which is part of the Computer and Information Science and Engineering Directorate. While numerous papers have described various aspects of cyberinfrastructure, Stewart et al. (2010, p. 37) define the concept in the following way:

Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.
The "all linked together" part of the definition moves the concept from the more localized view promulgated by NOW and Beowulf to a much more decentralized, even international, scope of operation more aligned with the concepts advanced as grid computing. This linking is performed through the use of middleware, software that acts as a kind of digital plumbing to enable disparate components to work together.
The most fully realized geospatial implementation of the cyberinfrastructure concept is the CyberGIS project at the University of Illinois (Wang 2010, 2013; also see http://cybergis.illinois.edu/). Wang (2019) characterizes CyberGIS as a "fundamentally new GIS modality based on holistic integration of high-performance and distributed computing, data-driven knowledge discovery, visualization and visual analytics, and collaborative problem-solving and decision-making capabilities." CyberGIS uses general-purpose middleware but goes beyond it to also implement geospatially tailored middleware designed to capture the spatial characteristics of problems in order to promote application-specific efficiencies in locating and using distributed resources (Wang 2010). It seems clear, based on current trends, that the concept of cyberinfrastructure will continue to be central to developments in HPC at least for the next decade (NSF 2018) and that CyberGIS will continue to undergo "parallel" developments.
4.4 Cloud Computing
Cloud computing is yet another general distributed model in which configurable computer services, such as compute cycles and storage, are provided over a network (Sugumaran and Armstrong 2017). It is, as such, a logical outgrowth of grid computing, and it is sometimes referred to as utility computing.

The US National Institute of Standards and Technology has provided an authoritative definition of cloud computing (Mell and Grance 2011, p. 2) with five essential characteristics that are paraphrased here:
1. On-demand self-service. Consumers must be able to access computing capabilities, such as compute time and network storage, automatically.
2. Broad network access. Capabilities must be available over the network and accessed through standard protocols that enable use by diverse platforms (e.g., tablets and desktop systems).
3. Resource pooling. Service providers' resources are pooled to serve multiple consumers, with different physical and virtual resources dynamically assigned according to consumer demand. The consumer has no control or knowledge about the location of the provided resources.
4. Rapid elasticity. Capabilities are elastically provisioned, commensurate with demand.
5. Measured service. Cloud systems control and optimize resource use by metering resources; usage is monitored, controlled, and reported, providing transparency to providers and consumers.
The flexibility of the cloud approach means that users and organizations are able to adapt to changes in demand. The approach also changes the economics of computing from a model that may require considerable capital investment in hardware, with associated support and upgrade (fixed) costs, to one in which operational expenses are shifted to inexpensive networked clients (variable costs). While there are private clouds, cloud computing services are often public and provided by large corporate enterprises (e.g., Amazon and Google) that offer attractive, tailorable hardware and software configurations. Cloud computing is reaching a mature stage, and because of its tailorable cost structures and configurability, the approach will continue to be broadly adopted in the foreseeable future. Cloud computing has been demonstrated to be effective in several geospatial problem domains (Hegeman et al. 2014; Yang et al. 2013).
4.5 Moving Closer to the Edge
Despite the substantial advantages provided by cloud computing, it does suffer from some limitations, particularly latency. Communication is, after all, limited by the speed of light, and in practice it is far slower than that limit (Satyanarayanan 2017). With the proliferation of electronic devices connected as part of the Internet of Things (estimated to be approximately 50,000,000,000 by 2020) that are generating zettabytes (10²¹ bytes) of data each year, bandwidth is now a major concern (Shi and Dustdar 2016). Trends in distributed data collection and processing are likely to persist, with one report¹ by the FCC 5G IoT Working Group suggesting that the amount of data created and processed outside a centralized data center or cloud is now around 10% and will likely increase to 75% by 2022.

Communication latency is particularly problematic for real-time systems, such as augmented reality and autonomous vehicle control. As a consequence, edge and fog computing have emerged as important concepts in which processing is decentralized, taking place between individual devices and the cloud, to obviate the need to move massive amounts of data and thereby increase overall computational performance. It turns out that geography matters: achieving a proper balance between centralized and distributed processing is key here. The movement of massive amounts of data has also become a source of concern for the companies now providing 5G wireless network service, which could be overwhelmed by data fluxes before the systems are even fully implemented.
In a fashion similar to that of cloud computing, NIST has promulgated a definition of fog computing that contains six fundamental elements, as reported by Iorga et al. (2018, pp. 3–4):
1. Contextual awareness and low latency. Fog computing offers low latency because nodes are often co-located with end devices, and analysis of and response to data generated by these devices are faster than from a centralized data center.
2. Geographical distribution. In contrast to centralized cloud resources, fog computing services and applications are geographically distributed.
3. Heterogeneity. Fog computing supports the collection and processing of different types of data acquired by multiple devices and network communication capabilities.
4. Interoperability and federation. Components must interoperate, and services are federated across domains.
5. Real-time interactions. Fog computing applications operate in real time rather than in batch mode.
6. Scalability and agility. Fog computing is adaptive and supports, for example, elastic computation, resource pooling, data-load changes, and network condition variations.
The battle between centralization and decentralization of computing is ongoing. Much like the episodic sagas of the mainframe vs. the personal computer, the cloud vs. fog approach requires that trade-offs be made in order to satisfy performance objectives and meet economic constraints. While cloud computing provides access to flexibly specified, metered, centralized resources, fog computing offloads burdens from the cloud to provide low-latency services to applications that require them. Cloud and fog computing, therefore, should be viewed as complementary.

¹ https://www.fcc.gov/bureaus/oet/tac/tacdocs/reports/2018/5G-Edge-Computing-Whitepaper-v6-Final.pdf
5 Summary and Conclusion
The use of HPC in geospatial applications has had a checkered history, with many implementations showing success only to see particular architectures or systems become obsolete. Nevertheless, lessons were learned that could be passed across generations of systems and researchers. For example, early conceptual work on geospatial domain decomposition (Armstrong and Densham 1992) informed further empirical investigations of performance that modelled processing time as a function of location (Cramer and Armstrong 1999). A decade later, this work was significantly extended to undergird the assignment of tasks in modern cyberinfrastructure-based approaches to distributed parallelism (Wang and Armstrong 2009).

It also seems clear that HPC is reaching a maturation point, with much attention now focused on the use of distributed resources that interoperate over a network (e.g., cyberinfrastructure, cloud and fog computing). Though much work remains, there is at least a clear developmental path forward (Wang 2013). There is a cloud on the horizon, however: Moore's Law (Moore 1965) is no longer in force (Hennessy and Patterson 2019). As a consequence, computer architects are searching for alternative methods, such as heterogeneous processing and quantum computing, to increase performance. It is likely, however, that many of these emerging technologies will continue to be accessed using cyberinfrastructure.
References
Amdahl, G. M. (1967). Validity of the single-processor approach to achieving large-scale com-
puting capabilities. In Proceedings of the American Federation of Information Processing
Societies Conference (pp. 483–485). Reston, VA: AFIPS.
Anderson, T. E., Culler, D. E., Patterson, D. A., the NOW Team. (1995). A case for NOW (net-
works of workstations). IEEE Micro, 15(1), 54–64.
Armstrong, M. P. (2000). Geography and computational science. Annals of the Association of
American Geographers, 90(1), 146–156.
Armstrong, M. P., Cowles, M., Wang, S. (2005). Using a computational grid for geographic
information analysis. The Professional Geographer, 57(3), 365–375.
Armstrong, M. P., Densham, P. J. (1992). Domain decomposition for parallel processing of
spatial problems. Computers, Environment and Urban Systems, 16(6), 497–513.
Armstrong, M. P., Marciano, R. J. (1995). Massively parallel processing of spatial statistics.
International Journal of Geographical Information Systems, 9(2), 169–189.
Armstrong, M. P., Marciano, R. J. (1996). Local interpolation using a distributed parallel super-
computer. International Journal of Geographical Information Systems, 10(6), 713–729.
Armstrong, M. P., Marciano, R. J. (1997). Massively parallel strategies for local spatial interpo-
lation. Computers & Geosciences, 23(8), 859–867.
Armstrong, M. P., Marciano, R. J. (1998). A network of workstations (NOW) approach to spa-
tial data analysis: The case of distributed parallel interpolation. In Proceedings of the Eighth
International Symposium on Spatial Data Handling (pp. 287–296). Burnaby, BC: International
Geographical Union.
Armstrong, M. P., Pavlik, C. E., Marciano, R. J. (1994). Parallel processing of spatial statistics.
Computers & Geosciences, 20(2), 91–104.
Atkins, D. E., Droegemeier, K. K., Feldman, S. I., Garcia-Molina, H., Klein, M. L., Messerschmitt,
D. G., et al. (2003). Revolutionizing science and engineering through Cyberinfrastructure:
Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure.
http://www.nsf.gov/od/oci/reports/toc.jsp
August, M. C., Brost, G. M., Hsiung, C. C., Schiffleger, A. J. (1989). Cray X-MP: The birth of
a supercomputer. IEEE Computer, 22(1), 45–52.
Cramer, B. E., Armstrong, M. P. (1999). An evaluation of domain decomposition strategies for
parallel spatial interpolation of surfaces. Geographical Analysis, 31(2), 148–168.
Densham, P. J., Armstrong, M. P. (1994). A heterogeneous processing approach to spatial
decision support systems. In T. C. Waugh & R. G. Healey (Eds.), Advances in GIS research
(Vol. 1, pp. 29–45). London: Taylor and Francis.
Flynn, M. J. (1972). Some computer organizations and their effectiveness. IEEE Transactions on
Computers, C-21(9), 948–960. https://doi.org/10.1109/TC.1972.5009071
Foster, I., Kesselman, C. (Eds.). (1999). The grid: blueprint for a new computing infrastructure.
San Francisco, CA: Morgan Kaufmann.
Frank, S., Burkhardt, H., Rothnie, J. (1993). The KSR 1: Bridging the gap between shared mem-
ory and MPPs. COMPCON Spring '93. Digest of Papers, pp. 285–294.
https://doi.ieeecomputersociety.org/10.1109/CMPCON.1993.289682
Franklin, W. R., Narayanaswami, C., Kankanhalli, M., Sun, D., Zhou, M.-C., Wu, P. Y. F. (1989).
Uniform grids: A technique for intersection detection on serial and parallel machines. In
Proceedings of the Ninth International Symposium on Computer-Assisted Cartography,
Baltimore, MD, 2–7 April (pp. 100–109). Bethesda, MD: American Congress on Surveying
and Mapping.
Griffith, D. A. (1990). Supercomputing and spatial statistics: A reconnaissance. The Professional
Geographer, 42(4), 481–492.
Gustafson, J. L. (1988). Reevaluating Amdahl’s law. Communications of the Association for
Computing Machinery, 31(5), 532–533.
Healey, R., Dowers, S., Gittings, B., Mineter, M. (Eds.). (1998). Parallel processing algorithms
for GIS. Bristol, PA: Taylor & Francis.
Hegeman, J. W., Sardeshmukh, V. B., Sugumaran, R., Armstrong, M. P. (2014). Distributed
LiDAR data processing in a high-memory cloud-computing environment. Annals of GIS,
20(4), 255–264.
Hennessy, J. L., Patterson, D. A. (2019). A new golden age for computer architecture.
Communications of the Association for Computing Machinery, 62(2), 48–60.
Iorga, M., Feldman, L., Barton, R., Martin, M. J., Goren, N., Mahmoudi, C. (2018). Fog comput-
ing conceptual model. Gaithersburg, MD: NIST. NIST Special Publication 500-325.
https://doi.org/10.6028/NIST.SP.500-325
Karplus, W. J. (1989). Vector processors and multiprocessors. In K. Hwang & D. DeGroot
(Eds.), Parallel processing for supercomputers and artificial intelligence. New York, NY:
McGraw Hill.
Lim, G. J., Ma, L. (2013). GPU-based parallel vertex substitution algorithm for the p-median
problem. Computers & Industrial Engineering, 64, 381–388.
Mell, P., Grance, T. (2011). The NIST definition of cloud computing. Gaithersburg, MD: NIST.
National Institute of Standards and Technology Special Publication 800-145.
https://doi.org/10.6028/NIST.SP.800-145
Moore, G. (1965). Cramming more components onto integrated circuits. Electronics, 38(8),
114–117.
Mower, J. E. (1992). Building a GIS for parallel computing environments. In Proceedings of
the Fifth International Symposium on Spatial Data Handling (pp. 219–229). Columbia, SC:
International Geographic Union.
NIST (National Institute of Standards and Technology). (2015). NIST big data interoperability
framework: Volume 1, Definitions. NIST Special Publication 1500-1. Gaithersburg, MD:
NIST. https://doi.org/10.6028/NIST.SP.1500-1
NSF (National Science Foundation). (2018). CI2030: Future advanced cyberinfrastructure. A
Report of the NSF Advisory Committee for Cyberinfrastructure. https://www.nsf.gov/cise/oac/
ci2030/ACCI_CI2030Report_Approved_Pub.pdf
Ord, J. K., Getis, A. (1995). Local spatial autocorrelation statistics: Distributional issues and an
application. Geographical Analysis, 27, 286–306.
Rokos, D., Armstrong, M. P. (1996). Using Linda to compute spatial autocorrelation in parallel.
Computers & Geosciences, 22(5), 425–432.
Sandu, J. S., Marble, D. F. (1988). An investigation into the utility of the Cray X-MP supercom-
puter for handling spatial data. In Proceedings of the Third International Symposium on Spatial
Data Handling IGU, Sydney, Australia, pp. 253–266.
Satyanarayanan, M. (2017). The emergence of edge computing. IEEE Computer, 50(1), 30–39.
Shekhar, S., Ravada, S., Kumar, V., Chubb, D., Turner, G. (1996). Parallelizing a GIS on a
shared address space architecture. IEEE Computer, 29(12), 42–48.
Shi, W., Dustdar, S. (2016). The promise of edge computing. IEEE Computer, 49(5), 78–81.
Smarr, L., Catlett, C. E. (1992). Metacomputing. Communications of the Association for
Computing Machinery, 35(6), 44–52.
Snir, M., Otto, S., Huss-Lederman, S., Walker, D., Dongarra, J. (1996). MPI: The complete
reference. Cambridge, MA: MIT Press.
Stewart, C., Simms, S., Plale, B., Link, M., Hancock, D., Fox, G. (2010). What is cyberinfra-
structure? In SIGUCCS '10 Proceedings of the 38th Annual ACM SIGUCCS Fall Conference:
Navigation and Discovery. Norfolk, VA, 24–27 Oct, pp. 37–44. https://dl.acm.org/citation.
cfm?doid=1878335.1878347
Stone, H. S., Cocke, J. (1991). Computer architecture in the 1990s. IEEE Computer, 24(9), 30–38.
Stonebraker, M. (1986). The case for shared nothing architecture. Database Engineering, 9, 1.
Sugumaran, R., Armstrong, M. P. (2017). Cloud computing. In The international encyclopedia
of geography: people, the earth, environment, and technology. New York, NY: John Wiley.
https://doi.org/10.1002/9781118786352.wbieg1017
Tang, W. (2017). GPU computing. In M. F. Goodchild & M. P. Armstrong (Eds.), International
encyclopedia of geography. Hoboken, NJ: Wiley. https://doi.org/10.1002/9781118786352.
wbieg0129
Tang, W., Feng, W. (2017). Parallel map projection of vector-based big spatial data using general-
purpose graphics processing units. Computers, Environment and Urban Systems, 61, 187–197.
Wang, S. (2010). A CyberGIS framework for the synthesis of cyberinfrastructure, GIS, and spatial
analysis. Annals of the Association of American Geographers, 100(3), 535–557.
Wang, S. (2013). CyberGIS: Blueprint for integrated and scalable geospatial software ecosystems.
International Journal of Geographical Information Science, 27(11), 2119–2121.
Wang, S. (2019). Cyberinfrastructure. In J. P. Wilson (Ed.), The geographic information science
technology body of knowledge. Ithaca, NY: UCGIS. https://doi.org/10.22224/gistbok/2019.2.4
Wang, S., Armstrong, M. P. (2003). A quadtree approach to domain decomposition for spatial
interpolation in grid computing environments. Parallel Computing, 29(10), 1481–1504.
Wang, S., Armstrong, M. P. (2009). A theoretical approach to the use of cyberinfrastructure
in geographical analysis. International Journal of Geographical Information Science, 23(2),
169–193. https://doi.org/10.1080/13658810801918509
Wang, S., Cowles, M. K., Armstrong, M. P. (2008). Grid computing of spatial statistics: Using
the TeraGrid for Gi∗(d) analysis. Concurrency and Computation: Practice and Experience,
20(14), 1697–1720. https://doi.org/10.1002/cpe.1294
Yan, J., Cowles, M. K., Wang, S., Armstrong, M. P. (2007). Parallelizing MCMC for Bayesian
spatiotemporal geostatistical models. Statistics and Computing, 17(4), 323–335.
Yang, C., Huang, Q., Li, Z., Xu, C., Liu, K. (2013). Spatial cloud computing: A practical
approach. Boca Raton, FL: CRC Press.
assumption: The spatiotemporal analysis in the second step of the divide-and-
conquer strategy uses known, fixed spatial and temporal search radii, as determining
split positions would be very difficult otherwise. Our results show that ST-FLEX-D
is successful in reducing data replication across a range of parameterizations but
comes at the expense of increased decomposition time. Our approach is portable to
other space-time analysis methods and holds the potential to enable scalable geo-
spatial applications.
Keywords Domain decomposition · Algorithms · Parallel computing · Space-time
1 Introduction
Cyberinfrastructure (CI) and high performance computing (HPC) enable solving
computational problems that were previously inconceivable or intractable
(Armstrong 2000) and have transformative impacts on many disciplines such as
Geography, Engineering, or Biology. The advent of CI and HPC has allowed for
significant scientific enterprise within the GIScience community: the study of health
and wellbeing through participatory data collection using mobile devices (Yin et al.
2017) or web-scraping (Desjardins et al. 2018), the analysis of human digital foot-
prints (Soliman et al. 2017), the use of social media data for a geospatial perspective
on health issues (Gao et al. 2018; Padmanabhan et al. 2014; Ye et al. 2016; Shi and
Wang 2015), hydrologic modelling (Survila et al. 2016; Ye et al. 2014), biomass and
carbon assessment (Tang et al. 2016, 2017; Stringer et al. 2015), and agent-based
modelling (Shook et al. 2013; Fachada et al. 2017; Tang 2008; Tang and Bennett
2010; Tang and Wang 2009; Tang et al. 2011).
The evolution of technologies such as sensor systems, automated geocoding
abilities, and social media platforms has enabled us to collect large quantities of
spatial and spatiotemporal data at an increasingly fast pace (Goodchild 2007).
Therefore, we are facing a stream of geographically referenced data of unprece-
dented volume, velocity, and variety, which necessitates advanced computational
capabilities for analysis and prediction (Zikopoulos and Eaton 2011). There are
three additional factors motivating the call for increased computational power. The
first is the soaring computational intensity of many methods for analyzing geo-
graphic phenomena, due to underlying algorithms that require large amounts of
spatial search. Computational intensity is defined as “the magnitude of computa-
tional requirements of a problem based on the evaluation of the characteristics of the
problem, its input and output, and computational complexity” (Wang 2008). The
second is the use of simulation approaches (i.e., Monte Carlo) for significance test-
ing, which further increases the computational cost of geospatial analysis (Tang
et al. 2015). The third is the inclusion of a true representation of time in geographic
models that further complicates the computation of geospatial applications (Kwan
and Neutens 2014). Together, these factors unequivocally raise the need for further
scientific investigation to cope with computational problems within the domain of
GIScience.
Deploying many computing resources in parallel allows for increased perfor-
mance and therefore, solving large scale computational problems using parallel
algorithms and strategies (Wilkinson and Allen 2004). As opposed to sequential
algorithms, parallel algorithms perform multiple operations concurrently (Blelloch
and Maggs 1996). They include message passing, a standard for sending messages
between tasks or processes to synchronize tasks and to perform operations on data
in transit (Huang et al. 2011); shared memory, which is accessed by multiple con-
current programs to enable communication (Armstrong and Marciano 1997);
MapReduce, which is a programming model and implementation that allows for
processing big data and features the specification of a map and a reduce function
(Dean and Ghemawat 2008); and lastly, hybrid parallel algorithms, which combine
the above processes (Biswas et al. 2003). Parallel strategies are techniques designed
to achieve the major aim of parallelism in a computation (Wilkinson and Allen
2004). They include load balancing, which aims at balancing the workload among
concurrent processors (Hussain et al. 2013); task scheduling, which optimizes the
time computing tasks start execution on a set of processors (Dutot et al. 2004), and
finally, spatial and spatiotemporal domain decomposition, which divide a dataset
into smaller subsets along its spatiotemporal domain (Ding and Densham 1996).
Spatial and spatiotemporal domain decomposition are widely used strategies for
parallel problem solving (Deveci et al. 2016; Berger and Bokhari 1987; Nicol 1994)
and are one out of three crucial steps for implementing the divide-and-conquer strat-
egy (Hohl et al. 2015). These steps are:
1. Partition the dataset by decomposing its spatiotemporal domain into smaller
subsets.
2. Distribute the subsets to multiple concurrent processors1 for computing a
spatiotemporal analysis algorithm (e.g., kernel density estimation, KDE).
3. Collect the results and reassemble to one coherent dataset.
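To make the three steps concrete, the following minimal Python sketch pairs a naive
index-based partition with a process pool; the partition scheme, the placeholder
analyze() kernel, and the worker count are illustrative assumptions, not the
implementation studied in this chapter.

# Minimal sketch of the three-step divide-and-conquer pattern (illustrative).
from multiprocessing import Pool

def partition(points, n_subsets):
    # Step 1: decompose the domain; here a naive even split by index.
    size = max(1, len(points) // n_subsets)
    return [points[i:i + size] for i in range(0, len(points), size)]

def analyze(subset):
    # Step 2: placeholder for a spatiotemporal analysis (e.g., KDE) run
    # independently on each subdomain.
    return [(x, y, t, 1.0) for (x, y, t) in subset]

def divide_and_conquer(points, n_workers=4):
    subsets = partition(points, n_workers)             # Step 1: partition
    with Pool(n_workers) as pool:
        partial = pool.map(analyze, subsets)           # Step 2: distribute
    return [row for part in partial for row in part]   # Step 3: collect

if __name__ == "__main__":
    pts = [(float(i), float(i), float(i % 7)) for i in range(1000)]
    print(len(divide_and_conquer(pts)))                # 1000 result rows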
This approach is scalable if the computational intensity is balanced among con-
current processors by accounting for the explicit characteristics of the data (Wang
and Armstrong 2003; Wang et al. 2008; Wang 2008). Here, we focus on recursive
domain decomposition such as quadtrees and octrees, as these are common
approaches that promote balanced workloads for processing data that are heteroge-
neously distributed in space and time (Turton 2003; Hohl et al. 2016; Ding and
Densham 1996).
Spatiotemporal domain decomposition approaches partition the domain of a
given dataset into a hierarchical set of rectangular (2D) or cuboid (3D, 2D + time)
subdomains (Samet 1984). Each subdomain contains a similar number of data
points, which facilitates workload balance across concurrent processors. However,
many spatiotemporal analysis and modelling approaches rely on neighborhood
information within a given distance of a location (“bandwidth”). This complicates
the domain decomposition procedure because, by partitioning and distributing the
dataset to different processors, exactly that neighborhood information is no longer
accessible near domain boundaries (Zheng et al. 2018). Approaches to prevent the
introduction of edge effects (e.g., in KDE) due to partitioning include overlapping
subdomains, where data that fall within the overlapping region are assigned to mul-
tiple neighboring subdomains, achieved through either (1) replication of
data points (Hohl et al. 2015) or (2) communication between processors (Guan and
Clarke 2010). Both approaches reduce the efficiency of spatiotemporal domain
decomposition and therefore limit the scalability and applicability of the divide-and-
conquer strategy.
1 We use in this chapter the term “processor” as a generic term for whatever performs the
computation, which, depending on context, could be a core, a thread, a reduce task, a
computing node, an MPI rank, a physical processor, etc.
In this study, we focus on the first step of the divide-and-conquer strategy and
present a methodology that allows for scalable space-time analysis by leveraging
the power of high-performance computing. We introduce a heuristic that optimizes
the spatiotemporal domain decomposition procedure by reducing the loss in effi-
ciency from overlapping subdomains, which greatly benefits the scalability of the
divide-and-conquer strategy. Our method accelerates spatiotemporal analysis while
maintaining its integrity by preventing the introduction of edge effects through
overlapping subdomains. This chapter is organized as follows: Section 2 lays out
our methodology and data, Sect. 3 presents the results, and Sect. 4 contains the
discussion and conclusions.
2 Data and Methods
2.1 Data
To illustrate the general applicability of our approach, we use two spatiotemporally
explicit datasets in this study. They differ in size, as well as spatial and temporal
extents, but have in common that they contain [x, y, t] tuples for each record. The
first dataset (“the dengue fever dataset,” Fig. 3.1, left) contains dengue fever cases
in the city of Cali, Colombia (Delmelle et al. 2013). We use a total of 11,056 geo-
coded cases, 9606 cases in 2010, and 1562 in 2011. The cumulative distribution of
the number of cases has a steep initial incline early on and is illustrated in Fig. 3.1,
right. We explain the difference in the number of cases by the fact that 2010 was
identified as an epidemic year (Varela et al. 2010). We use the home addresses of
patients geomasked to the nearest street intersection as the spatial coordinates of the
cases to maintain privacy (Kwan et al. 2004). The second dataset (“the pollen tweets
dataset,” Fig. 3.2, left) contains 551,627 geolocated tweets within the contiguous
USA from February 2016 to April 2016 (Saule et al. 2017). Tweets were selected by
keywords, such as “pollen” or “allergy” and if precise geographic location was not
available, we picked a random location within the approximated region provided by
Gnip (www.gnip.com). Therefore, the dataset contains spurious rectangular pat-
terns; however, they should not matter for the purpose of our study. The cumulative
temporal distribution (Fig. 3.2, right) shows a lower number of tweets at the begin-
ning of the study period and a steeper incline later on.
2.2 Methods
In this section, we introduce a method (ST-FLEX-D) that improves computational
scalability of spatiotemporal analysis, such as space-time kernel density estimation
(STKDE, Nakaya and Yano 2010). We seek to refine and complement an existing
methodology (ST-STATIC-D, Hohl et al. 2015), which has been successfully used
to enable and accelerate spatial and spatiotemporal analysis (Desjardins et al. 2018;
Hohl et al. 2016, 2018), and which involves the three steps of domain decomposi-
tion outlined previously. We attempt to address a crucial problem of ST-STATIC-D
by developing a heuristic that is characterized by flexible split locations for domain
partitioning with the goal of minimizing data replication due to domain overlap. We
assume known, fixed bandwidths of the spatiotemporal analysis algorithm at step 2
of the divide-and-conquer strategy.2
We employ the concept of the space-time cube,
which substitutes the third (vertical) dimension with time (2D + time, Hägerstrand
1970). All computational experiments are conducted using a workstation with a
64-bit operating system, an Intel® Core i3 CPU at a 3.60 GHz clock speed, and
16 GB of memory. The decomposition procedures, which we coded in the Python
programming language, are run sequentially.
2 Otherwise, we would need to determine the spatial and temporal bandwidths prior to
decomposition by utilizing a sequential procedure.
Fig. 3.1 Spatial (left) and cumulative temporal (right) distribution of geomasked dengue fever
cases in Cali, Colombia
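For orientation, a simplified sketch of evaluating such a space-time density at one
grid voxel follows; the product Epanechnikov kernels and the normalization used
here are common choices assumed for illustration, not a restatement of the STKDE
formulation of Nakaya and Yano (2010).

# Sketch: STKDE-style density at a single voxel (gx, gy, gt), assuming
# product Epanechnikov kernels; hs and ht are the fixed spatial and
# temporal bandwidths (search radii) assumed above.
from math import hypot, pi

def stkde_at(gx, gy, gt, points, hs, ht):
    if not points:
        return 0.0
    density = 0.0
    for (x, y, t) in points:
        ds = hypot(gx - x, gy - y) / hs    # normalized spatial distance
        dt = abs(gt - t) / ht              # normalized temporal distance
        if ds < 1.0 and dt < 1.0:          # only neighbors inside the cylinder
            density += (2.0 / pi) * (1.0 - ds * ds) * 0.75 * (1.0 - dt * dt)
    return density / (len(points) * hs * hs * ht)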
2.2.1 The Existing Method: ST-STATIC-D
ST-STATIC-D is an existing implementation of spatiotemporal domain decomposi-
tion for accelerating geospatial analysis using parallel computing (Hohl et al. 2015).
The procedure results in a set of subdomains of similar computational intensity,
thereby facilitating balanced workloads among concurrent processors.
Computational intensity of the spatiotemporal analysis algorithm at step 2 of the
divide-and-conquer strategy may depend on the following characteristics: (1) The
number of data points within the subdomain, (2) the number of grid points within a
regular grid in 3D space (x, y, t), at which local analysis is performed, e.g., STKDE
(Nakaya and Yano 2010), or local space-time Ripley’s K function (Hohl et al. 2017).
Recursion is a concept where “the solution to a problem depends on solutions to
smaller instances of the same problem” (Graham 1994). We use recursive spatio-
temporal domain decomposition, which explicitly handles heavily clustered distri-
butions of point data, which pose a threat to workload balance otherwise.
The ST-STATIC-D decomposition algorithm consists of the following steps (see
Fig. 3.3): (1) Find the minimum and maximum values for each dimension (x, y, t),
a.k.a. the spatiotemporal domain of the dataset. (2) Bisect each dimension (midway
split), which creates eight subdomains of equal size and cuboid shape. Assign each
data point to the subdomain it falls into and (3) decompose subdomains recursively
until (4) the exit condition is met: the crossing of either threshold T1 or T2, where
T1 is the threshold on the number of data points within a subdomain and T2 is the
subdomain volume threshold. Both metrics typically decrease as decomposition progresses. Therefore,
low thresholds result in fine-grained decompositions, which is advantageous as a
high number of small computing tasks (as opposed to a low number of large tasks)
facilitates workload balance among processors. However, choosing low thresholds
increases the depth of recursion, which could cause an exponential increase in
computational intensity. We use the load balancing procedure described in Hohl
et al. (2015), which equalizes the computational intensity CI (Wang 2008) among
multiple processors.
Fig. 3.2 Spatial (left) and cumulative temporal (right) distribution of tweets in the USA
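A compact Python sketch of this recursion is given below; the midway (octant)
splits and the two exit thresholds mirror steps (1) to (4), while the tuple-based
data structures and the volume form of T2 are illustrative assumptions.

# Sketch of ST-STATIC-D-style recursion: bisect x, y, and t at their
# midpoints (eight octants), assign points, and recurse until threshold
# T1 (max points per subdomain) or T2 (min subdomain volume) is crossed.
def decompose(points, bounds, T1, T2, leaves):
    (xmin, xmax), (ymin, ymax), (tmin, tmax) = bounds
    volume = (xmax - xmin) * (ymax - ymin) * (tmax - tmin)
    if len(points) <= T1 or volume <= T2:
        leaves.append((bounds, points))        # exit condition met: leaf node
        return leaves
    xm, ym, tm = (xmin + xmax) / 2, (ymin + ymax) / 2, (tmin + tmax) / 2
    octants = {}
    for (x, y, t) in points:                   # assign each point to its octant
        key = (x >= xm, y >= ym, t >= tm)
        octants.setdefault(key, []).append((x, y, t))
    for (hx, hy, ht), sub in octants.items():  # recurse into non-empty octants
        child = ((xm if hx else xmin, xmax if hx else xm),
                 (ym if hy else ymin, ymax if hy else ym),
                 (tm if ht else tmin, tmax if ht else tm))
        decompose(sub, child, T1, T2, leaves)
    return leaves

The leaves returned are the subdomains that step 2 of the divide-and-conquer
strategy distributes to concurrent processors.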
Partitioning the spatiotemporal domain of a dataset and distributing the subdo-
mains to different processors means crucial neighborhood information is no longer
available near subdomain boundaries. Unless properly dealt with, this loss of infor-
mation degrades the results of the spatiotemporal analysis in step 2 of the
divide-and-conquer strategy, e.g., creating spurious patterns in the resulting kernel density
estimates. To prevent this undesirable case, we create circular/cylindrical buffers of
distance equal to the spatial and temporal search radii around each data point
(Fig. 3.4). A data point is assigned to subdomain sd1 either if it is located within sd1
(i.e., point p1 in Fig. 3.4) or if its buffer intersects with the boundary plane of sd1
(i.e., p2). As boundaries separate neighboring domains, it is possible that buffers
intersect with up to eight subdomains (i.e., p3). Therefore, data points may be repli-
cated and assigned to the eight subdomains in the worst case, which causes data
redundancy. Further recursive decomposition could cause a data point to be part of
many subdomains. Such redundancy limits our ability to run computation at scale in
two ways: First, the redundancy grows with increasing dataset size. While the
redundancy may not be prohibitive for the datasets used in this study, it certainly is
for bigger datasets (i.e., billions of observations). Second, the redundancy grows
with increasing search radii. While it is up to the analyst to choose buffer distances
(e.g., spatial and temporal bandwidths for STKDE), it is our goal to decrease redun-
dancy at any choice of buffer distance.
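In code, the replication test reduces to a cylinder-versus-cuboid intersection; the
sketch below is exact for axis-aligned subdomains, with hs and ht standing in for
the spatial and temporal search radii (function and variable names are illustrative).

# A point is (also) assigned to a subdomain whenever its space-time buffer,
# a cylinder of spatial radius hs and temporal half-height ht, intersects
# the subdomain's cuboid.
def cylinder_intersects(point, bounds, hs, ht):
    (x, y, t) = point
    (xmin, xmax), (ymin, ymax), (tmin, tmax) = bounds
    if t + ht < tmin or t - ht > tmax:     # temporal extents do not overlap
        return False
    cx = min(max(x, xmin), xmax)           # closest point of the rectangle
    cy = min(max(y, ymin), ymax)           # to the circle center (x, y)
    return (x - cx) ** 2 + (y - cy) ** 2 <= hs ** 2

def assign_with_replication(points, subdomains, hs, ht):
    # A point may be replicated into every subdomain its buffer touches.
    return {i: [p for p in points if cylinder_intersects(p, b, hs, ht)]
            for i, b in enumerate(subdomains)}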
Fig. 3.3 Octree-based recursive spatiotemporal domain decomposition
2.2.2 The New Approach: ST-FLEX-D
Here, we introduce two alternatives to ST-STATIC-D: ST-FLEX-D-base and
ST-FLEX-D-uneven. These methods focus on minimizing the redundancy caused
by replication of points whose buffers intersect with subdomain boundaries. Their
design is based on the observation that ST-STATIC-D bisects domains at the mid-
point of each dimension (x, y, t). Therefore, ST-FLEX-D relaxes the midway-split
requirement and picks the optimal split out of a set of candidate splits.
ST-FLEX-D-base defines Nc = 5 candidate split positions by regular increments
along each axis (see Figs. 3.5, 3.6, and 3.7). It then chooses the best candidate split
for partitioning according to the following rules:
• Rule 1—The number of cuts. Pick the candidate split that produces the lowest
number of replicated points. A data point is replicated if the splitting plane inter-
sects (cuts) the cylinder centered on it. The spatial and temporal search distance
(bandwidth) of the spatiotemporal analysis algorithm are equal to the radius and
height of the cylinder (Fig. 3.5).
• Rule 2—Split evenness. In case of a tie (two candidate splits both produce the
lowest number of replicated points), pick the most even split. This split partitions
the set of points most evenly among candidate splits in consideration (Fig. 3.6).
Fig. 3.4 Buffer implementation for handling edge effects. Note that we chose to represent this 3D
problem (2D + time) in 2D for ease of visualization. Same concepts apply in 3D, where circles
become cylinders and lines become planes
• Rule 3—Centrality. If still tied, pick the most central split. This split is most
central among candidate splits in consideration (Fig. 3.7).
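Expressed as code, the three rules amount to ranking candidate splits along one
axis by the tuple (cuts, unevenness, distance from center); the evenly spaced
interior candidates and the names below are illustrative assumptions.

# Sketch of ST-FLEX-D-base split selection along one axis. 'coords' holds
# the point coordinates on that axis and 'r' the buffer radius (the spatial
# or temporal bandwidth); a buffer is cut when the split plane passes
# within r of the point.
def best_split(coords, lo, hi, r, n_candidates=5):
    step = (hi - lo) / (n_candidates + 1)
    candidates = [lo + step * (i + 1) for i in range(n_candidates)]
    mid = (lo + hi) / 2.0

    def rank(s):
        cuts = sum(1 for v in coords if abs(v - s) < r)   # Rule 1: fewest cuts
        left = sum(1 for v in coords if v < s)
        unevenness = abs(2 * left - len(coords))          # Rule 2: evenness
        centrality = abs(s - mid)                         # Rule 3: centrality
        return (cuts, unevenness, centrality)

    return min(candidates, key=rank)   # lexicographic tie-breaking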
Figure 3.8 illustrates the entire process using an example in 2D (again, same
concepts apply in 3D). We focus on the x-axis first: Two candidate splits (SX1 and
SX5) are tied in producing the minimum number of cuts. Therefore, we apply Rule
2 and pick SX5 because its split evenness (9/1) is higher than SX1 (0/10). Then, we
focus on the y-axis, where again two candidate splits (SY1 and SY5) both produce
the minimum number of cuts. We pick SY1 by applying Rule 2 (evenness of 1/9 over
evenness of 10/0).
ST-FLEX-D-base may face the issue of picking splits that do not advance the
decomposition procedure (“bad splits”). The issue arises by selecting the outer
splits (SX1, SX5, SY1, SY5) when points are distributed more centrally within the
domain. Bad splits cut zero cylinders (and therefore are chosen by ST-FLEX-D-
base), all points lie on the same side of the split, which does not further the goal of
having decompositions of the data points. ST-FLEX-D-uneven addresses the
problem by an uneven distribution of candidate split locations that do not cover
the entire axis but congregate around the midway split (Fig. 3.9). This regime
maintains flexible split locations while reducing the odds of choosing bad splits.
Rules 1–3 of ST-FLEX-D-base for picking the best candidate split still apply for
ST-FLEX-D-uneven.
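One way to realize such a placement, with the window width as an assumed tuning
parameter, is to space the candidates symmetrically within a narrow band around
the midway split:

# Sketch of ST-FLEX-D-uneven candidate placement: candidates congregate
# around the midway split rather than spanning the whole axis, which
# reduces the odds of picking "bad" outer splits. 'spread' is illustrative.
def uneven_candidates(lo, hi, n_candidates=5, spread=0.25):
    mid = (lo + hi) / 2.0
    half = (hi - lo) * spread / 2.0        # candidates lie within mid +/- half
    if n_candidates == 1:
        return [mid]
    step = 2.0 * half / (n_candidates - 1)
    return [mid - half + i * step for i in range(n_candidates)]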
Fig. 3.5 Rule 1 of ST-FLEX-D-base. In this example, we choose S4 because it minimizes the
number of circles cut by the bisection line
2.2.3 Performance Metrics and Sensitivity
For both datasets, we compare the performance of ST-STATIC-D with the imple-
mentations of ST-FLEX-D (ST-FLEX-D-base, ST-FLEX-D-uneven) using the fol-
lowing metrics:
1. execution time of decomposition,
2. total number of cut cylinders (total number of replicated data points),
3. average leaf node depth,
4. average leaf node size.
The execution time of decomposition is the total amount of time required for
decomposing the dataset, disregarding I/O. The total number of cut cylinders is
equal to the number of replicated data points that result from the decomposition due
to partitioning. It is a measure of the redundancy within the decomposition proce-
dure and our goal is to minimize it. The decomposition procedure is inherently
hierarchical, where a domain splits into multiple subdomains. Therefore, it is com-
mon to illustrate the procedure as a tree, where the domain of the input dataset is
represented by the root node, and the subdomains resulting from the first split are
children nodes linked to the root node (see Fig. 3.10 for illustration and example).
Fig. 3.6 Rule 2 of ST-FLEX-D-base. Here, the minimum number of cut circles ties between S4
and S5 (Rule 1). Hence, we pick S4, which bisects the set of points more evenly
Since the recursion does not go equally deep in all of its branches, we use the aver-
age leaf node depth to measure how many times on average the initial domain is
split to form a particular subdomain. The average leaf node size is the number of
data points that leaf nodes contain: this measures the granularity of the decomposi-
tion. The largest leaf node ultimately limits scalability of the calculation as it is the
largest chunk of undivided work.
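Both tree metrics are straightforward to compute once the decomposition is
represented as a tree; the sketch below, using an assumed (children, points) node
layout, reproduces the worked example of Fig. 3.10, where the average leaf node
depth is 1.57.

# Sketch: average leaf node depth and average leaf node size of a
# decomposition tree. A node is (children, points); children == [] at a leaf.
def leaf_stats(node, depth=0):
    children, points = node
    if not children:
        return [(depth, len(points))]
    stats = []
    for child in children:
        stats.extend(leaf_stats(child, depth + 1))
    return stats

def tree_averages(root):
    stats = leaf_stats(root)
    depths = [d for d, _ in stats]
    sizes = [s for _, s in stats]
    return sum(depths) / len(depths), sum(sizes) / len(sizes)

# Shape of Fig. 3.10: root with four children, one of which splits again.
leaf = ([], [0])                              # dummy leaf with one point
root = ([leaf, leaf, leaf, ([leaf] * 4, [])], [])
print(round(tree_averages(root)[0], 2))       # 1.57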
2.2.4 Decomposition Parameters
Given a set of spatiotemporal points, the following parameters determine the out-
comes (i.e., performance metrics) of our spatiotemporal domain decomposition
implementations: (1) the maximum number of points per subdomain (threshold T1,
see Sect. 2.2.1), (2) the buffer ratio (threshold T2, Sect. 2.2.1), (3) spatial and tem-
poral bandwidths, and (4) output grid resolution. We picked the spatial and tempo-
ral bandwidths, which are very important and well-discussed parameters in
spatiotemporal analysis (Brunsdon 1995), to illustrate the sensitivity of our
decomposition implementations to varying inputs. We decomposed both datasets
using the parameter configuration given in Table 3.1, where all values are held
constant except the spatial and temporal bandwidths (spatial: 200–2500 m in
steps of 100 m; temporal: 1–14 days in steps of 1 day). Hence, we have 336
different parameter configurations (treatments) for which we compute the metrics
introduced above for all implementations (ST-STATIC-D, ST-FLEX-D-base,
ST-FLEX-D-uneven) and datasets (dengue fever, pollen tweets) separately. We
report boxplots to illustrate the distribution of the performance metrics across the
varying parameter configurations.
Fig. 3.7 Rule 3 of ST-FLEX-D-base. Here, the minimum number of cut circles ties between
candidate splits S2, S3, and S4 (Rule 1). Split evenness ties between S3 and S4 (Rule 2). We pick
S3, which is more central than S4
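The 336 treatments are simply the Cartesian product of the two bandwidth ranges,
as the short sketch below verifies:

# 24 spatial bandwidths (200-2500 m in steps of 100 m) crossed with
# 14 temporal bandwidths (1-14 days) yield 336 treatments; the remaining
# Table 3.1 parameters are held constant.
from itertools import product

spatial = range(200, 2501, 100)    # metres, 24 values
temporal = range(1, 15)            # days, 14 values
treatments = list(product(spatial, temporal))
assert len(treatments) == 336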
The parameters in Table 3.1 have a substantial influence on the outcome of
decomposition. We expect an increase in the number of cut cylinders metric (and the
number of replicated points) with increasing bandwidths, as likely more buffers will
be intersected by the splitting planes. Therefore, we assessed the sensitivity of the
number of cut cylinders metric to varying bandwidth configurations. For both data-
sets, we report contour plots to show how the metric varies across the parameter
space. This gives us an indication of how the decomposition implementations
behave for varying parameter configurations, which might be an important factor
for choosing a method for a given decomposition task.
Fig. 3.8 Example of ST-FLEX-D-base
Fig. 3.9 Uneven candidate splits
Fig. 3.10 Domain decomposition. Spatial depiction (left), tree (right). Leaf nodes of the tree are
denoted by gray color. The average leaf node depth is (1 + 1 + 1 + 2 + 2 + 2 + 2)/7 = 1.57
3 Results
3.1 Distribution of Performance Metrics
Here, we report boxplots that illustrate the distribution of the performance metrics
across the 336 different parameter configurations that result from the values given
in Table 3.1. We draw a boxplot for each metric (execution time, number of cut
cylinders, average leaf node depth, average leaf node size), decomposition imple-
mentation (ST-STATIC-D, ST-FLEX-D-base, ST-FLEX-D-uneven), and dataset
(dengue fever, pollen tweets), resulting in a total of 24 plots.
3.1.1 Execution Time of Decomposition
Figure 3.11 shows decomposition execution times in seconds. ST-STATIC-D is the
fastest, followed by ST-FLEX-D-uneven and ST-FLEX-D-base. We attribute the
difference to the added cost of choosing the optimal split for the ST-FLEX-D imple-
mentations. In addition, ST-STATIC-D has the smallest variation across parameter
configurations, whereas execution times vary substantially more for ST-FLEX-D-
base and ST-FLEX-D-uneven. The pollen tweets dataset takes much longer to
decompose, but for comparing decomposition implementations, we largely observe
the same patterns for both datasets.
3.1.2 Total Number of Cut Cylinders (Total Number of Replicated Data Points)
Generally, the number of cut cylinders is very high (Fig. 3.12), ranging from around
55,000 to 5,500,000 for the dengue fever dataset and from 10,000,000 to 100,000,000
for the pollen tweets dataset. What seem to be unexpectedly high numbers, given
the initial number of data points, can be explained by the fact that cylinders may be
cut multiple times as the recursion runs multiple levels deep. ST-STATIC-D performs
surprisingly well for this metric (Fig. 3.12): its median is lower than those of
ST-FLEX-D-base and ST-FLEX-D-uneven, although it exhibits the largest range of
values for the dengue fever dataset, as well as outliers of extremely high values.
Such a distribution is not desired in spatiotemporal domain decomposition as the
data redundancy is less predictable. However, the situation is different for the pollen
tweets dataset, where the range is similar across all implementations, and the
median of ST-STATIC-D (61,591,096) is lower than that of ST-FLEX-D-base
(66,964,108), but higher than that of ST-FLEX-D-uneven (56,951,505). For both
datasets, the interquartile range is similar between ST-STATIC-D and
ST-FLEX-D-uneven, whereas ST-FLEX-D-base performs worse.
Table 3.1 Parameter values for ST-STATIC-D and ST-FLEX-D
Parameter | Name | Values
1 | Maximum number of points per subdomain | 50
2 | Buffer ratio | 0.01
3 | Grid resolution | 50 m, 1 day
4 | Spatial and temporal bandwidths | 200–2500 m in steps of 100 m; 1–14 days in steps of 1 day
Fig. 3.11 Execution times in seconds for dengue fever (left) and pollen tweets (right) datasets
Fig. 3.12 Number of cut cylinders for dengue fever (left) and pollen tweets (right) datasets
3 Spatiotemporal Domain Decomposition for High Performance Computing…
56. parapet, equally strong, had been thrown up for the defense of Port Hudson, surrounding
the town for a distance of three miles and more, each end terminating on the riverbank.
Four powerful forts were located at the salients, and the line throughout was defended by
thirty pieces of field artillery. Brigadier-General Beall, who commanded the post in 1862,
constructed these works. Major-General Frank Gardner succeeded him in command at the
close of the year.
THE WELL-DEFENDED WORKS
57. CONFEDERATE FORTIFICATIONS BEFORE PORT HUDSON
Gardner was behind these defenses with a garrison of about seven thousand when Banks
approached Port Hudson for the second time on May 24th. Gardner was under orders to
evacuate the place and join his force to that of Johnston at Jackson, Mississippi, but the
courier who brought the order arrived at the very hour when Banks began to bottle up
the Confederates. On the morning of May 25th Banks drove in the Confederate
skirmishers and outposts and, with an army of thirty thousand, invested the fortifications
from the eastward. At 10 a.m., after an artillery duel of more than four hours, the Federals
advanced to the assault of the works. Fighting in a dense forest of magnolias, amid thick
undergrowth and among ravines choked with felled timber, the progress of the troops was
too slow for a telling attack. The battle has been described as “a gigantic bushwhack.”
The Federals at the center reached the ditch in front of the Confederate works but were
driven off. At nightfall the attempt was abandoned. It had cost Banks nearly two thousand
men.
58. THE GUN THAT FOOLED THE FEDERALS
A “Quaker gun” that was mounted by the Confederates in the fortifications on the bluff at
the river-front before Port Hudson. This gun was hewn out of a pine log and mounted on
a carriage, and a black ring was painted around the end facing the river. Throughout the
siege it was mistaken by the Federals for a piece of real ordnance. To such devices as this
the beleaguered garrison was compelled constantly to resort in order to impress the
superior forces investing Port Hudson with the idea that the position they sought to
capture was formidably defended. The ruse was effective. Port Hudson was not again
attacked from the river after the passing of Farragut’s two ships.
59. COLLECTION OF FREDERICK H. MESERVE COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
WITHIN “THE CITADEL”
This bastion fort, near the left of the Confederate line of defenses at Port Hudson, was
the strongest of their works, and here Weitzel and Grover’s divisions of the Federals
followed up the attack (begun at daylight of June 14th) that Banks had ordered all along
the line in his second effort to capture the position. The only result was simply to advance
the Federal lines from fifty to two hundred yards nearer. In front of the “citadel” an
advance position was gained from which a mine was subsequently run to within a few
yards of the fort.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
THE FIRST INDIANA NAVY ARTILLERY AT BATON ROUGE
60. COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
PHOTOGRAPHS THAT FURNISHED VALUABLE SECRET SERVICE INFORMATION TO THE
CONFEDERATES
The clearest and most trustworthy evidence of an opponent’s strength is of course an
actual photograph. Such evidence, in spite of the early stage of the art and the difficulty
of “running in” chemical supplies on “orders to trade,” was supplied the Confederate
leaders in the Southwest by Lytle, the Baton Rouge photographer—really a member of the
Confederate secret service. Here are photographs of the First Indiana Heavy Artillery
(formerly the Twenty-first Indiana Infantry), showing its strength and position on the
arsenal grounds at Baton Rouge. As the Twenty-first Indiana, the regiment had been at
Baton Rouge during the first Federal occupation, and after the fall of Port Hudson it
returned there for garrison duty. Little did its officers suspect that the quiet man
photographing the batteries at drill was about to convey the “information” beyond their
lines to their opponents.
“MY EXECUTIVE OFFICER, MR. DEWEY”
THE FUTURE ADMIRAL AS CIVIL WAR LIEUTENANT
In the fight with the batteries at Port Hudson, March 14, 1863, Farragut, in the “Hartford”
lashed to the “Albatross,” got by, but the fine old consort of the “Hartford,” the
“Mississippi,” went down—her gunners fighting to the last. Farragut, in anguish, could see
61. her enveloped in flames lighting up the river. She had grounded under the very guns of a
battery, and not until actually driven off by the flames did her men leave her. When the
“Mississippi” grounded, the shock threw her lieutenant-commander into the river, and in
confusion he swam toward the shore; then, turning about, he swam back to his ship.
Captain Smith thus writes in his report: “I consider that I should be neglecting a most
important duty should I omit to mention the coolness of my executive officer, Mr. Dewey,
and the steady, fearless, and gallant manner in which the officers and men of the
‘Mississippi’ defended her, and the orderly and quiet manner in which she was abandoned
after being thirty-five minutes aground under the fire of the enemy’s batteries. There was
no confusion in embarking the crew, and the only noise was from the enemy’s cannon.”
Lieutenant-Commander George Dewey, here mentioned at the age of 26, was to
exemplify in Manila Bay on May 1, 1898, the lessons he was learning from Farragut.
62. Painted by C. D. Graves.
Copyright, 1901, by Perrien-Keydel Co., Detroit, Mich., U. S. A.
PICKETT’S CHARGE AT GETTYSBURG.
Larger Image
WHILE LINCOLN SPOKE AT GETTYSBURG, NOVEMBER 19, 1863
DURING THE FAMOUS ADDRESS IN DEDICATION OF THE CEMETERY
63. The most important American address is brief: “Fourscore and seven years ago our
fathers brought forth on this continent a new nation, conceived in liberty, and dedicated
to the proposition that all men are created equal. Now we are engaged in a great civil
war, testing whether that nation, or any nation so conceived and so dedicated, can long
endure. We are met on a great battlefield of that war. We have come to dedicate a
portion of that field as a final resting-place for those who here gave their lives that that
nation might live. It is altogether fitting and proper that we should do this. But in a larger
sense, we cannot dedicate, we cannot consecrate, we cannot hallow this ground. The
brave men, living and dead, who struggled here, have consecrated it far above our poor
power to add or detract. The world will little note, nor long remember, what we say here,
but it can never forget what they did here. It is for us, the living, rather, to be dedicated
here to the unfinished work which they who fought here have thus far so nobly advanced.
It is rather for us to be here dedicated to the great task remaining before us;—that from
these honored dead, we take increased devotion to that cause for which they gave the
last full measure of devotion;—that we here highly resolve that these dead shall not have
died in vain, that this nation, under God, shall have a new birth of freedom, and that
government of the people, by the people, for the people, shall not perish from the earth.”
65. T
THE BATTLE OF GETTYSBURG—THE HIGH-
WATER MARK OF THE CIVIL WAR
HE military operations of the American Civil War were carried on for the most part
south of the Mason and Dixon line; but the greatest and most famous of the battles
was fought on the soil of the old Keystone State, which had given birth to the Declaration
of Independence and to the Constitution of the United States.
Gettysburg is a quiet hamlet, nestling among the hills of Adams County, and in 1863
contained about fifteen hundred inhabitants. It had been founded in 1780 by James
Gettys, who probably never dreamed that his name thus given to the village would,
through apparently accidental circumstances, become famous in history for all time.
The hills immediately around Gettysburg are not rugged or precipitous; they are little
more than gentle swells of ground, and many of them were covered with timber when the
hosts of the North and the legions of the South fought out the destiny of the American
republic on those memorable July days in 1863.
Lee’s army was flushed with victory after Chancellorsville and was strengthened by the
memory of Fredericksburg. Southern hopes were high after Hooker’s defeat on the
Rappahannock, in May, 1863, and public opinion was unanimous in demanding an
invasion of Northern soil. On the other hand, the Army of the Potomac, under its several
leaders, had met with continual discouragement, and, with all its patriotism and valor, its
two years’ warfare showed but few bright pages to cheer the heart of the war-broken
soldier, and to inspire the hopes of the anxious public in the North.
Leaving General Stuart with ten thousand cavalry and a part of Hill’s corps to prevent
Hooker from pursuing, Lee crossed the Potomac early in June, 1863, concentrated his
army at Hagerstown, Maryland, and prepared for a campaign in Pennsylvania, with
Harrisburg as the objective. His army was organized in three corps, under the respective
commands of Longstreet, Ewell, and A. P. Hill. Lee had divided his army so as to approach
Harrisburg by different routes and to assess the towns along the way for large sums of
money. Late in June, he was startled by the intelligence that Stuart had failed to detain
Hooker, and that the Federals had crossed the Potomac and were in hot pursuit.
Lee was quick to see that his plans must be changed. He knew that to continue his march
he must keep his army together to watch his pursuing antagonist, and that such a course
in this hostile country would mean starvation, while the willing hands of the surrounding
populace would minister to the wants of his foe. Again, if he should scatter his forces that
they might secure the necessary supplies, the parts would be attacked singly and
destroyed. Lee saw, therefore, that he must abandon his invasion of the North or turn
upon his pursuing foe and disable him in order to continue his march. But that foe was a
giant of strength and courage, more than equal to his own; and the coming together of
66. two such forces in a mighty death-struggle meant that a great battle must be fought, a
greater battle than this Western world had hitherto known.
The Army of the Potomac had again changed leaders, and George Gordon Meade was
now its commander. Hooker, after a dispute with Halleck, resigned his leadership, and
Meade, the strongest of the corps commanders, was appointed in his place, succeeding
him on June 28th. The two great armies—Union and Confederate—were scattered over
portions of Maryland and southern Pennsylvania. Both were marching northward, along
almost parallel lines. The Confederates were gradually pressing toward the east, while the
Federals were marching along a line eastward of that followed by the Confederates. The
new commander of the Army of the Potomac was keeping his forces interposed between
the legions of Lee and the Federal capital, and watching for an opportunity to force the
Confederates to battle where the Federals would have the advantage of position. It was
plain that they must soon come together in a gigantic contest; but just where the shock
of battle would take place was yet unknown. Meade had ordered a general movement
toward Harrisburg, and General Buford was sent with four thousand cavalry to intercept
the Confederate advance guard.
On the night of June 30th Buford encamped on a low hill, a mile west of Gettysburg, and
here on the following morning the famous battle had its beginning.
On the morning of July 1st the two armies were still scattered, the extremes being forty
miles apart. But General Reynolds, with two corps of the Union army, was but a few miles
away, and was hastening to Gettysburg, while Longstreet and Hill were approaching from
the west. Buford opened the battle against Heth’s division of Hill’s corps. Reynolds soon
joined Buford, and three hours before noon the battle was in progress on Seminary Ridge.
Reynolds rode out to his fighting-lines on the ridge, and while placing his troops, a little
after ten o’clock in the morning, he received a sharpshooter’s bullet in the brain. The
gallant Federal leader fell dead. John F. Reynolds, who had been promoted for gallantry at
Buena Vista in the Mexican War, was one of the bravest and ablest generals of the Union
army. No casualty of the war brought more widespread mourning to the North than the
death of Reynolds.
But even this calamity could not stay the fury of the battle. By one o’clock both sides had
been greatly reënforced, and the battle-line extended north of the town from Seminary
Ridge to the bank of Rock Creek. Here for hours the roar of the battle was unceasing.
About the middle of the afternoon a breeze lifted the smoke that had enveloped the
whole battle-line in darkness, and revealed the fact that the Federals were being pressed
back toward Gettysburg. General Carl Schurz, who after Reynolds’ death directed the
extreme right near Rock Creek, leaving nearly half of his men dead or wounded on the
field, retreated toward Cemetery Hill, and in passing through the town the Confederates
pursued and captured a large number of the remainder. The left wing, now unable to hold
its position owing to the retreat of the right, was also forced back, and it, too, took refuge
on Cemetery Hill, which had been selected by General O. O. Howard; and the first day’s
fight was over. It was several hours before night, and had the Southerners known of the
disorganized condition of the Union troops, they might have pursued and captured a large
part of the army. Meade, who was still some miles from the field, hearing of the death of
Reynolds, had sent Hancock to take general command until he himself should arrive.
67. Hancock had ridden at full speed and arrived on the field between three and four o’clock
in the afternoon. His presence soon brought order out of chaos. His superb bearing, his
air of confidence, his promise of heavy reënforcements during the night, all tended to
inspire confidence and to renew hope in the ranks of the discouraged army. Had this day
ended the affair at Gettysburg, the usual story of the defeat of the Army of the Potomac
would have gone forth to the world. Only the advance portions of both armies had been
engaged; and yet the battle had been a formidable one. The Union loss was severe. A
great commander had fallen, and the rank and file had suffered the fearful loss of ten
thousand men.
Meade reached the scene late in the night, and chose to make this field, on which the
advance of both armies had accidentally met, the place of a general engagement. Lee
had come to the same decision, and both called on their outlying legions to make all
possible speed to Gettysburg. Before morning, nearly all the troops of both armies had
reached the field. The Union army rested with its center on Cemetery Ridge, with its right
thrown around to Culp’s Hill and its left extended southward toward the rocky peak called
Round Top. The Confederate army, with its center on Seminary Ridge, its wings extending
from beyond Rock Creek on the north to a point opposite Round Top on the south, lay in
a great semi-circle, half surrounding the Army of the Potomac. But Lee was at a
disadvantage. First, “Stonewall” Jackson was gone, and second, Stuart was absent with
his ten thousand cavalry. Furthermore, Meade was on the defensive, and had the
advantage of occupying the inner ring of the huge half circle. Thus lay the two mighty
hosts, awaiting the morning, and the carnage that the day was to bring. It seemed that
the fate of the Republic was here to be decided, and the people of the North and the
South watched with breathless eagerness for the decision about to be made at
Gettysburg.
The dawn of July 2d betokened a beautiful summer day in southern Pennsylvania. The
hours of the night had been spent by the two armies in marshaling of battalions and
maneuvering of corps and divisions, getting into position for the mighty combat of the
coming day. But, when morning dawned, both armies hesitated, as if unwilling to begin
the task of bloodshed. They remained inactive, except for a stray shot here and there,
until nearly four o’clock in the afternoon.
The fighting on this second day was chiefly confined to the two extremes, the centers
remaining comparatively inactive. Longstreet commanded the Confederate right, and
opposite him on the Union left was General Daniel E. Sickles. The Confederate left wing,
under Ewell, was opposite Slocum and the Union right stationed on Culp’s Hill.
The plan of General Meade had been to have the corps commanded by General Sickles
connect with that of Hancock and extend southward near the base of the Round Tops.
Sickles found this ground low and disadvantageous as a fighting-place. In his front he saw
the high ground along the ridge on the side of which the peach orchard was situated, and
advanced his men to this position, placing them along the Emmitsburg road, and back
toward the Trostle farm and the wheat-field, thus forming an angle at the peach orchard.
The left flank of Hancock’s line now rested far behind the right flank of Sickles’ forces.
The Third Corps was alone in its position in advance of the Federal line. The Confederate
troops later marched along Sickles’ front so that Longstreet’s corps overlapped the left
68. wing of the Union army. The Northerners grimly watched the bristling cannon and the
files of men that faced them across the valley, as they waited for the battle to commence.
The boom of cannon from Longstreet’s batteries announced the beginning of the second
day’s battle. Lee had ordered Longstreet to attack Sickles in full force. The fire was
quickly answered by the Union troops, and before long the fight extended from the peach
orchard through the wheatfield and along the whole line to the base of Little Round Top.
The musketry commenced with stray volleys here and there—then more and faster, until
there was one continuous roar, and no ear could distinguish one shot from another.
Longstreet swept forward in a magnificent line of battle, a mile and a half long. He
pressed back the Union infantry, and was seriously threatening the artillery.
At the extreme left, close to the Trostle house, Captain John Bigelow commanded the
Ninth Battery, Massachusetts Light Artillery. He was ordered to hold his position at all
hazards until reënforced. With double charges of grape and canister, again and again he
tore great gaps in the advancing line, but it re-formed and pressed onward until the men
in gray reached the muzzles of the Federal guns. Again Bigelow fired, but the heroic band
had at last to give way to the increased numbers of the attack, which finally resulted in a
hand-to-hand struggle with a Mississippi regiment. Bigelow was wounded, and twenty-
eight of his hundred and four men were left on the bloody field, while he lost sixty-five
out of eighty-eight horses, and four of six guns. Such was one of many deeds of heroism
enacted at Gettysburg.
But the most desperate struggle of the day was the fight for the possession of Little
Round Top. Just before the action began General Meade sent his chief engineer, General
G. K. Warren, to examine conditions on the Union left. The battle was raging in the peach
orchard when he came to Little Round Top. It was unoccupied at the time, and Warren
quickly saw the great importance of preventing its occupation by the Confederates, for
the hill was the key to the whole battle-ground west and south of Cemetery Ridge. Before
long, the engineer saw Hood’s division of Longstreet’s corps moving steadily toward the
hill, evidently determined to occupy it. Had Hood succeeded, the result would have been
most disastrous to the Union army, for the Confederates could then have subjected the
entire Union lines on the western edge of Cemetery Ridge to an enfilading fire. Warren
and a signal officer seized flags and waved them, to deceive the Confederates as to the
occupation of the height. Sykes’ corps, marching to the support of the left, soon came
along, and Warren, dashing down the side of the hill to meet it, caused the brigade under
Colonel Vincent and a part of that under General Weed to be detached, and these
occupied the coveted position. Hazlett’s battery was dragged by hand up the rugged slope
and planted on the summit.
Meantime Hood’s forces had come up the hill, and were striving at the very summit; and
now occurred one of the most desperate hand-to-hand conflicts of the war—in which men
forgot that they were human and tore at each other like wild beasts. The opposing forces,
not having time to reload, charged each other with bayonets—men assaulted each other
with clubbed muskets—the Blue and the Gray grappled in mortal combat and fell dead,
side by side. The privates in the front ranks fought their way onward until they fell; the
officers sprang forward, seized the muskets from the hands of the dying and the dead,
and continued the combat. The furious struggle raged for half an hour before Hood’s
forces gave way and were pressed down the hillside. But they rallied and advanced again
by way of a ravine on the left, and finally, after a most valiant charge, were driven back at
the point of the bayonet.
Little Round Top was saved to the Union army, but the cost was appalling. The hill was
covered with hundreds of the slain. Scores of the Confederate sharpshooters had taken
position among the crevices in the Devil’s Den, where they could overlook the position
on Little Round Top, and their unerring aim spread death among the Federal officers and
gunners. Colonel O’Rorke and General Vincent were dead. General Weed was dying;
and, as Hazlett was stooping to receive Weed’s last message, a sharpshooter’s bullet laid
him—dead—across the body of his chief.
During this attack, and for some hours thereafter, the battle continued in the valley below
on a grander scale and with demon-like fury. Here many thousands were engaged.
Sickles’ whole line was pressed back to the base of the hill from which it had advanced in
the morning. Sickles’ leg was shattered by a shell, necessitating amputation, while scores
of his brave officers, and thousands of his men, lay on the field of battle when the
struggle ceased at nightfall. This valley has been appropriately named the “Valley of
Death.”
Before the close of this main part of the second day’s battle, there was another clash of
arms, fierce but of short duration, at the other extreme of the line. Lee had ordered Ewell
to attack Cemetery Hill and Culp’s Hill on the north, held by Slocum, who had been
weakened by the sending of a large portion of the Twelfth Corps to the assistance of the
left wing. Ewell had three divisions, two of which were commanded by Generals Early and
Johnson. It was nearly sunset when he sent Early to attack Cemetery Hill. Early was
repulsed after an hour’s bloody and desperate hand-to-hand fight, in which muskets and
bayonets, rammers, clubs, and stones were used. Johnson’s attack on Culp’s Hill was
more successful. After a severe struggle of two or three hours General Greene, who alone
of the Twelfth Corps remained on the right, succeeded, after reënforcement, in driving
the right of Johnson’s division away from its entrenchments, but the left had no difficulty
in taking possession of the abandoned works of Geary and Ruger, now gone to Round Top
and Rock Creek to assist the left wing.
Thus closed the second day’s battle at Gettysburg. The harvest of death had been
frightful. The Union loss during the two days had exceeded twenty thousand men; the
Confederate loss was nearly equal. The Confederate army had gained an apparent
advantage in penetrating the Union breastworks on Culp’s Hill. But the Union lines, except
on Culp’s Hill, were unbroken. On the night of July 2d, Lee and his generals held a council
of war and decided to make a grand final assault on Meade’s center the following day.
Against this decision Longstreet protested in vain. His counsel was that Lee withdraw to
the mountains, compel Meade to follow, and then turn and attack him. But Lee was
encouraged by the arrival of Pickett’s division and of Stuart’s cavalry, and Longstreet’s
objections were overruled. Meade and his corps commanders had met and made a like
decision—that there should be a fight to the death at Gettysburg.
That night a brilliant July moon shed its luster upon the ghastly field on which thousands
of men lay, unable to rise. Many of them no longer needed help. Their last battle was
over, and their spirits had fled to the great Beyond. But there were great numbers, torn
and gashed with shot and shell, who were still alive and calling for water or for the kindly
touch of a helping hand. Nor did they call wholly in vain. Here and there in the moonlight
little rescuing parties were seeking out whom they might succor. They carried many to the
improvised hospitals, where the surgeons worked unceasingly and heroically, and many
lives were saved.
All through the night the Confederates were massing artillery along the crest of Seminary
Ridge. The sound horses were carefully fed and watered, while those killed or disabled
were replaced by others. The ammunition was replenished and the guns were placed in
favorable positions and made ready for their work of destruction.
On the other side, the Federals were diligently laboring in the moonlight, and ere the
coming of the day they had planted batteries on the brow of the hill above the town as
far as Little Round Top. The coming of the morning revealed the two parallel lines of
cannon, a mile apart, which told only too well the story of what the day would bring forth.
The people of Gettysburg, which lay almost between the armies, were awakened on that
fateful morning—July 3, 1863—by the roar of artillery from Culp’s Hill, around the bend
toward Rock Creek. This knoll in the woods had, as we have seen, been taken by
Johnson’s men the night before. When Geary and Ruger returned and found their
entrenchments occupied by the Confederates they determined to recapture them in the
morning, and began firing their guns at daybreak. Seven hours of fierce bombardment
and daring charges were required to regain them. Every rod of space was disputed at the
cost of many a brave man’s life. At eleven o’clock this portion of the Twelfth Corps was
again in its old position.
But the most desperate onset of the three days’ battle was yet to come—Pickett’s charge
on Cemetery Ridge—preceded by the heaviest cannonading ever heard on the American
continent.
With the exception of the contest at Culp’s Hill and a cavalry fight east of Rock Creek, the
forenoon of July 3d passed with only an occasional exchange of shots at irregular
intervals. At noon there was a lull, almost a deep silence, over the whole field. It was the
ominous calm that precedes the storm. At one o’clock signal guns were fired on Seminary
Ridge, and a few moments later there was a terrific outburst from one hundred and fifty
Confederate guns, and the whole crest of the ridge, for two miles, was a line of flame.
The scene was majestic beyond description. The scores of batteries were soon enveloped
in smoke, through which the flashes of burning powder were incessant.
The long line of Federal guns withheld their fire for some minutes, when they burst forth,
answering the thunder of those on the opposite hill. An eye-witness declares that the
whole sky seemed filled with screaming shells, whose sharp explosions, as they burst in
mid-air, with the hurtling of the fragments, formed a running accompaniment to the deep,
tremendous roar of the guns.
Many of the Confederate shots went wild, passing over the Union army and plowing up
the earth on the other side of Cemetery Ridge. But others were better aimed and burst
among the Federal batteries, in one of which twenty-seven out of thirty-six horses were
killed in ten minutes. The Confederate fire seemed to be concentrated upon one point
between Cemetery Ridge and Little Round Top, near a clump of scrub oaks. Here the
batteries were demolished and men and horses were slain by scores. The spot has been
called “Bloody Angle.”
The Federal fire proved equally accurate and the destruction on Seminary Ridge was
appalling. For nearly two hours the hills shook with the tremendous cannonading, when it
gradually slackened and ceased. The Union army now prepared for the more deadly
charge of infantry which it felt was sure to follow.
They had not long to wait. As the cannon smoke drifted away from between the lines
fifteen thousand of Longstreet’s corps emerged in grand columns from the wooded crest
of Seminary Ridge under the command of General Pickett on the right and General
Pettigrew on the left. Longstreet had planned the attack with a view to passing around
Round Top, and gaining it by flank and reverse attack, but Lee, when he came upon the
scene a few moments after the final orders had been given, directed the advance to be
made straight toward the Federal main position on Cemetery Ridge.
The charge was one of the most daring in warfare. The distance to the Federal lines was
a mile. For half the distance the troops marched gayly, with flying banners and glittering
bayonets. Then came the burst of Federal cannon, and the Confederate ranks were torn
with exploding shells. Pettigrew’s columns began to waver, but the lines re-formed and
marched on. When they came within musket-range, Hancock’s infantry opened a terrific
fire, but the valiant band only quickened its pace and returned the fire with volley after
volley. Pettigrew’s troops succumbed to the storm. For now the lines in blue were fast
converging. Federal troops from all parts of the line now rushed to the aid of those in
front of Pickett. The batteries which had been sending shell and solid shot changed their
ammunition, and double charges of grape and canister were hurled into the column as it
bravely pressed into the sea of flame. The Confederates came close to the Federal lines
and paused to close their ranks. Each moment the fury of the storm from the Federal
guns increased.
“Forward,” again rang the command along the line of the Confederate front, and the
Southerners dashed on. The first line of the Federals was driven back. A stone wall behind
them gave protection to the next Federal force. Pickett’s men rushed upon it. Riflemen
rose from behind and hurled a death-dealing volley into the Confederate ranks. A defiant
cheer answered the volley, and the Southerners placed their battle-flags on the ramparts.
General Armistead grasped the flag from the hand of a falling bearer, and leaped upon
the wall, waving it in triumph. Almost instantly he fell among the Federal troops, mortally
wounded. General Garnett, leading his brigade, fell dead close to the Federal line. General
Kemper sank, wounded, into the arms of one of his men.
Pickett had entered a death-trap. Troops from all directions rushed upon him. Clubbed
muskets and barrel-staves now became weapons of warfare. The Confederates began
surrendering in masses and Pickett ordered a retreat. Yet the energy of the indomitable
Confederates was not spent. Several supporting brigades moved forward, and only
succumbed when they encountered two regiments of Stannard’s Vermont brigade, and
the fire of fresh batteries.
As the remnant of the gallant division returned to the works on Seminary Ridge General
Lee rode out to meet them. His demeanor was calm. His features gave no evidence of his
disappointment. With hat in hand he greeted the men sympathetically. “It was all my
fault,” he said. “Now help me to save that which remains.”
The battle of Gettysburg was over. The cost in men was frightful. The losses of the two
armies reached fifty thousand, about half on either side. More than seven thousand men
had fallen dead on the field of battle.
The tide could rise no higher; from this point the ebb must begin. Not only here, but in
the West the Southern cause took a downward turn; for at this very hour of Pickett’s
charge, Grant and Pemberton, a thousand miles away, stood under an oak tree on the
heights above the Mississippi and arranged for the surrender of Vicksburg.
Lee could do nothing but lead his army back to Virginia. The Federals pursued but feebly.
The Union victory was not a very decisive one, but, supported as it was by the fall of
Vicksburg, the moral effect on the nation and on the world was great. The period of
uncertainty was ended. It required but little prophetic vision to foresee that the Republic
would survive the dreadful shock of arms.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
THE CRISIS BRINGS FORTH THE MAN
Major-General George Gordon Meade and Staff. Not men, but a man is what counts in
war, said Napoleon; and Lee had proved it true in many a bitter lesson administered to
the Army of the Potomac. At the end of June, 1863, for the third time in ten months, that
army had a new commander. Promptness and caution were equally imperative in that
hour. Meade’s fitness for the post was as yet undemonstrated; he had been advanced
from the command of the Fifth Corps three days before the army was to engage in its
greatest battle. Lee must be turned back from Harrisburg and Philadelphia and kept from
striking at Baltimore and Washington, and the somewhat scattered Army of the Potomac
must be concentrated. In the very first flush of his advancement, Meade exemplified the
qualities of sound generalship that placed his name high on the list of Federal
commanders.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
ROBERT E. LEE IN 1863
It was with the gravest misgivings that Lee began his invasion of the North in 1863. He
was too wise a general not to realize that a crushing defeat was possible. Yet, with
Vicksburg already doomed, the effort to win a decisive victory in the East was imperative
in its importance. Magnificent was the courage and fortitude of Lee’s maneuvering during
that long march which was to end in failure. Hitherto he had made every one of his
veterans count for two of their antagonists, but at Gettysburg the odds had fallen heavily
against him. Jackson, his resourceful ally, was no more. Longstreet advised strongly
against giving battle, but Lee unwaveringly made the tragic effort which sacrificed more
than a third of his splendid army.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
HANCOCK, “THE SUPERB”
Every man in this picture was wounded at Gettysburg. Seated, is Winfield Scott Hancock;
the boy-general, Francis C. Barlow (who was struck almost mortally), leans against the
tree. The other two are General John Gibbon and General David B. Birney. About four
o’clock on the afternoon of July 1st a foam-flecked charger dashed up Cemetery Hill
bearing General Hancock. He had galloped thirteen miles to take command. Apprised of
the loss of Reynolds, his main dependence, Meade knew that only a man of vigor and
judgment could save the situation. He chose wisely, for Hancock was one of the best all-
round soldiers that the Army of the Potomac had developed. It was he who re-formed the
shattered corps and chose the position to be held for the decisive struggle.
COPYRIGHT, 1911, PATRIOT PUB. CO.
MUTE PLEADERS IN THE CAUSE OF PEACE
There was little time that could be employed by either side in caring for those who fell
upon the fields of the almost uninterrupted fighting at Gettysburg. On the morning of the
4th, when Lee began to abandon his position on Seminary Ridge, opposite the Federal
right, both sides sent forth ambulance and burial details to remove the wounded and bury
the dead in the torrential rain then falling. Under cover of the hazy atmosphere, Lee was
getting his whole army in motion to retreat. Many an unfinished shallow grave, like the
one above, had to be left by the Confederates. In this lower picture some men of the
Twenty-fourth Michigan infantry are lying dead on the field of battle. This regiment—one
of the units of the Iron Brigade—left seven distinct rows of dead as it fell back from
battle-line to battle-line, on the first day. Three-fourths of its members were struck down.
MEN OF THE IRON BRIGADE
COPYRIGHT, 1911, PATRIOT PUB. CO.
THE FIRST DAY’S TOLL
The lives laid down by the blue-clad soldiers in the first day’s fighting made possible the
ultimate victory at Gettysburg. The stubborn resistance of Buford’s cavalry and of the First
and Eleventh Corps checked the Confederate advance for an entire day. The delay was
priceless; it enabled Meade to concentrate his army upon the heights to the south of
Gettysburg, a position which proved impregnable. To a Pennsylvanian, General John F.
Reynolds, falls the credit of the determined stand that was made that day. Commanding
the advance of the army, he promptly went to Buford’s support, bringing up his infantry
and artillery to hold back the Confederates.
McPHERSON’S WOODS
At the edge of these woods General Reynolds was killed by a Confederate sharpshooter in
the first vigorous contest of the day. The woods lay between the two roads upon which
the Confederates were advancing from the west, and General Doubleday (in command of
the First Corps) was ordered to take the position so that the columns of the foe could be
enfiladed by the infantry, while contending with the artillery posted on both roads. The
Iron Brigade under General Meredith was ordered to hold the ground at all hazards. As
they charged, the troops shouted: “If we can’t hold it, where will you find the men who
can?” On they swept, capturing General Archer and many of his Confederate brigade that
had entered the woods from the other side. As Archer passed to the rear, Doubleday, who
had been his classmate at West Point, greeted him with “Good morning! I’m glad to see
you!”
COPYRIGHT, 1911, PATRIOT PUB. CO.
FEDERAL DEAD AT GETTYSBURG, JULY 1, 1863
All the way from McPherson’s Woods back to Cemetery Hill lay the Federal soldiers, who
had contested every foot of that retreat until nightfall. The Confederates were massing so
rapidly from the west and north that there was scant time to bring off the wounded and
none for attention to the dead. There on the field lay the shoes so much needed by the
Confederates, and the grim task of gathering them began. The dead were stripped of
arms, ammunition, caps, and accoutrements as well—in fact, of everything that would be
of the slightest use in enabling Lee’s poorly equipped army to continue the internecine
strife. It was one of war’s awful expedients.
SEMINARY RIDGE, BEYOND GETTYSBURG
Along this road the Federals retreated toward Cemetery Hill in the late afternoon of July
1st. The success of McPherson’s Woods was but temporary, for the Confederates under
Hill were coming up in overpowering numbers, and now Ewell’s forces appeared from the
north. The First Corps, under Doubleday, “broken and defeated but not dismayed,” fell
back, pausing now and again to fire a volley at the pursuing Confederates. It finally joined
the Eleventh Corps, which had also been driven back to Cemetery Hill. Lee was on the
field in time to watch the retreat of the Federals, and advised Ewell to follow them up, but
Ewell (who had lost 3,000 men) decided upon discretion. Night fell with the beaten
Federals, reinforced by the Twelfth Corps and part of the Third, facing nearly the whole of
Lee’s army.
COPYRIGHT, 1911, PATRIOT PUB. CO.
IN THE DEVIL’S DEN
Upon this wide, steep hill, about five hundred yards due west of Little Round Top and one
hundred feet lower, was a chasm named by the country folk “the Devil’s Den.” When the
position fell into the hands of the Confederates at the end of the second day’s fighting, it
became the stronghold of their sharpshooters, and well did it fulfill its name. It was a
most dangerous post to occupy, since the Federal batteries on the Round Top were
constantly shelling it in an effort to dislodge the hardy riflemen, many of whom met the
fate of the one in the picture. Their deadly work continued, however, and many a gallant
officer of the Federals was picked off during the fighting on the afternoon of the second
day. General Vincent was one of the first victims; General Weed fell likewise; and as
Lieutenant Hazlett bent over him to catch his last words, a bullet through the head
prostrated that officer lifeless on the body of his chief.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
THE UNGUARDED LINK
Little Round Top, the key to the Federal left at Gettysburg, which the Federals all but lost
on the second day, was the scene of hand-to-hand fighting rarely equaled since long-range
weapons were invented. Twice the Confederates in fierce conflict fought their way near to
this summit, but were repulsed. Had they gained it, they could have planted artillery
which would have enfiladed the left of Meade’s line, and Gettysburg might have been
turned into an overwhelming defeat. Beginning at the right, the Federal line stretched in
the form of a fish-hook, with the barb resting on Culp’s Hill, the center at the bend in the
hook on Cemetery Hill, and the left (consisting of General Sickles’ Third Corps) forming
the shank to the southward as far as Round Top. On his own responsibility Sickles had
advanced a portion of his line, leaving Little Round Top unprotected. Upon this advanced
line of Sickles, at the Peach Orchard on the Emmitsburg road, the Confederates fell in an
effort to turn what they supposed to be Meade’s left flank. Only the promptness of
General Warren, who discovered the gap and remedied it in time, saved the key.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
THE HEIGHT OF THE BATTLE-TIDE
Near this gate to the local cemetery of Gettysburg there stood during the battle this sign:
“All persons found using firearms in these grounds will be prosecuted with the utmost
rigor of the law.” Many a soldier must have smiled grimly at these words, for this gateway
became the key of the Federal line, the very center of the cruelest use of firearms yet
seen on this continent. On the first day Reynolds saw the value of Cemetery Hill in case of
a retreat. Howard posted his reserves here, and Hancock greatly strengthened the
position. One hundred and fifty Confederate guns were turned against it that last
afternoon. In five minutes every man of the Federals had been forced to cover; for an
hour and a half the shells fell fast, dealing death and laying waste the summer verdure in
the little graveyard. Up to the very guns of the Federals on Cemetery Hill, Pickett led his
devoted troops. At night of the 3d it was one vast slaughter-field. On this eminence,
where thousands were buried, was dedicated the soldiers’ National Cemetery.
PICKETT—THE MARSHAL NEY OF GETTYSBURG
The Now-or-never Charge of Pickett’s Men. When the Confederate artillery opened at one
o’clock on the afternoon of July 3d, Meade and his staff were driven from their
headquarters on Cemetery Ridge. Nothing could live exposed on that hillside, swept by
cannon that were being worked as fast as human hands could work them. It was the
beginning of Lee’s last effort to wrest victory from the odds that were against him.
Longstreet, on the morning of the 3d, had earnestly advised against renewing the battle
against the Gettysburg heights. But Lee saw that in this moment the fate of the South
hung in the balance; that if the Army of Northern Virginia did not win, it would never
again become the aggressor. Pickett’s division, as yet not engaged, was the force Lee
designated for the assault; every man was a Virginian, forming a veritable Tenth Legion in
valor. Auxiliary divisions swelled the charging column to 15,000. In the middle of the
afternoon the Federal guns ceased firing. The time for the charge had come. Twice Pickett
asked of Longstreet if he should go forward. Longstreet merely bowed in answer. “Sir, I
shall lead my division forward,” said Pickett at last, and the heavy-hearted Longstreet
bowed his head. As the splendid column swept out of the woods and across the plain the
Federal guns reopened with redoubled fury. For a mile Pickett and his men kept on, facing
a deadly greeting of round shot, canister, and the bullets of Hancock’s resolute infantry. It
was magnificent—but every one of Pickett’s brigade commanders went down and their
men fell by scores and hundreds around them. A hundred led by Armistead, waving his
cap on his sword-point, actually broke through and captured a battery, Armistead falling
beside a gun. It was but for a moment. Longstreet had been right when he said: “There
never was a body of fifteen thousand men who could make that attack successfully.”
Before the converging Federals the thinned ranks of Confederates drifted wearily back
toward Seminary Ridge. Victory for the South was not to be.
COPYRIGHT, 1911, PATRIOT PUB. CO.
MEADE’S HEADQUARTERS ON CEMETERY RIDGE
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
WHERE PICKETT CHARGED
The prelude to Pickett’s magnificent charge was a sudden deluge of shells from 150 long-
range Confederate guns trained upon Cemetery Ridge. General Meade and his staff were
instantly driven from their headquarters (already illustrated) and within five minutes the
concentrated artillery fire had swept every unsheltered position on Cemetery Ridge clear
of men. In the woods, a mile and a half distant, Pickett and his men watched the effect of
the bombardment, expecting the order to “Go Forward” up the slope (shown in the
picture). The Federals had instantly opened with their eighty available guns, and for three
hours the most terrific artillery duel of the war was kept up. Then the Federal fire
slackened, as though the batteries were silenced. The Confederates’ artillery ammunition
also was now low. “For God’s sake, come on!” was the word to Pickett. And at
Longstreet’s reluctant nod the commander led his 14,000 Virginians across the plain in
their tragic charge up Cemetery Ridge.
GENERAL L. A. ARMISTEAD, C. S. A.
In that historic charge was Armistead, who achieved a momentary victory and met a
hero’s death. On across the Emmitsburg road came Pickett’s dauntless brigades, coolly
closing up the fearful chasms torn in their ranks by the canister. Up to the fence held by
Hays’ brigade dashed the first gray line, only to be swept into confusion by a cruel
enfilading fire. Then the brigades of Armistead and Garnett moved forward, driving Hays’
brigade back through the batteries on the crest. Despite the death-dealing bolts on all
sides, Pickett determined to capture the guns; and, at the order, Armistead, leaping the
fence and waving his cap on his sword-point, rushed forward, followed by about a
hundred of his men. Up to the very crest they fought the Federals back, and Armistead,
shouting, “Give them the cold steel, boys!” seized one of the guns. For a moment the
Confederate flag waved triumphantly over the Federal battery. For a brief interval the fight
raged fiercely at close quarters. Armistead was shot down beside the gun he had taken,
and his men were driven back. Pickett, as he looked around the top of the ridge he had
gained, could see his men fighting all about with clubbed muskets and even flagstaffs
against the troops that were rushing in upon them from all sides. Flesh and blood could
not hold the heights against such terrible odds, and with a heart full of anguish Pickett
ordered a retreat. The despairing Longstreet, watching from Seminary Ridge, saw through
the smoke the shattered remnants drift sullenly down the slope and knew that Pickett’s
glorious but costly charge was ended.
COPYRIGHT, 1911, REVIEW OF REVIEWS CO.
THE MAN WHO HELD THE CENTER
Headquarters of Brigadier-General Alexander S. Webb. It devolved upon the man pictured
here (booted and in full uniform, before his headquarters tent to the left of the picture) to
meet the shock of Pickett’s great charge. With four Pennsylvania regiments (the Sixty-
Ninth, Seventy-First, Seventy-Second, and One Hundred and Sixth) of Hancock’s Second
Corps, Webb was equal to the emergency. Stirred to great deeds by the example of a
patriotic ancestry, he felt that upon his holding his position depended the outcome of the
day. His front had been the focus of the Confederate artillery fire. Batteries to right and
left of his line were practically silenced. Young Lieutenant Cushing, mortally wounded,
fired the last serviceable gun and fell dead as Pickett’s men came on. Cowan’s First New
York Battery on the left of Cushing’s used canister on the assailants at less than ten yards.
Webb at the head of the Seventy-Second Pennsylvania fought back the on-rush, posting a
line of slightly wounded in his rear. Webb himself fell wounded but his command checked
the assault till Hall’s brilliant charge turned the tide at this point.