Ian J. Taylor
From P2P
to Web Services
and Grids
Peers in a Client/Server World
Ian J. Taylor, PhD
School of Computer Science, University of Cardiff, Cardiff, Wales
Series editor
Professor A.J. Sammes, BSc, MPhil, PhD, FBCS, CEng
CISM Group, Cranfield University, RMCS, Shrivenham, Swindon SN6 8LA, UK
British Library Cataloguing in Publication Data
Taylor, Ian J.
From P2P to Web Services and Grids. — (Computer communications and networks)
1. Client/server computing 2. Internet programming 3. Middleware
4. Peer-to-peer architecture (Computer networks) 5. Web services
6. Computational grids (Computer systems) I. Title
004.3′6
ISBN 1852338695
A catalog record for this book is available from the Library of Congress.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as
permitted under the Copyright, Designs and Patents Act 1988, this publication may only be repro-
duced, stored or transmitted, in any form or by any means, with the prior permission in writing of
the publishers, or in the case of reprographic reproduction in accordance with the terms of licences
issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms
should be sent to the publishers.
Computer Communications and Networks ISSN 1617-7975
ISBN 1-85233-869-5 Springer London Berlin Heidelberg
Springer is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag London Limited 2005
The use of registered names, trademarks etc. in this publication does not imply, even in the absence
of a specific statement, that such names are exempt from the relevant laws and regulations and
therefore free for general use.
The publisher makes no representation, express or implied, with regard to the accuracy of the infor-
mation contained in this book and cannot accept any legal responsibility or liability for any errors
or omissions that may be made.
Printed and bound in the United States of America
34/3830–543210 Printed on acid-free paper SPIN 10975107
To my dad, George, for always helping me with the international
bureaucracies of this world and to him and his accomplice, Gill,
for saving me from strange places at strange times. . . and to both
for their continuous support. I am forever thankful.
Preface
Current users typically interact with the Internet through a Web browser
and a client/server connection to a Web server. However, as we move
forward to allow true machine-to-machine communication, we need
solutions that employ decentralized techniques to add redundancy, fault
tolerance and scalability to distributed systems. Distributed systems take
many forms and appear in many areas. They range from truly decentralized
systems, like Gnutella and Jxta, through centrally indexed brokered
systems, like Web services and Jini, to centrally coordinated systems,
like SETI@Home.
From P2P to Web Services and Grids: Peers in a client/server world pro-
vides a comprehensive overview of the emerging trends in peer-to-peer (P2P),
distributed objects, Web services and Grid computing technologies, which
have redefined the way we think about distributed computing and the Inter-
net. This book has two main themes: applications and middleware. Within
the context of applications, examples of the many diverse architectures are
provided including: decentralized systems like Gnutella and Freenet; brokered
ones like Napster; and centralized applications like SETI and conventional
Web servers. For middleware, the book covers Jxta, as a programming in-
frastructure for P2P computing, along with Web services, Grid computing
paradigms, e.g., Globus and OGSA, and distributed-object architectures, e.g.,
Jini. Each technology is described in detail, including source code where
appropriate, and its capabilities are analysed in the context of the degree
of centralization or decentralization it employs.
To maintain coherency, each system is discussed in terms of the generalized
taxonomy, which is outlined in the first chapter. This taxonomy serves as a
placeholder for the systems presented in the book and gives an overview of the
organizational differences between the various approaches. Most of the sys-
tems are discussed at a high level, particularly addressing the organization and
topologies of the distributed resources. However, some (e.g., Jxta, Jini, Web
services and, to some extent, Gnutella) are discussed in much more detail,
giving practical programming tutorials for their use. Security is paramount
throughout and is introduced in a dedicated chapter outlining the many
approaches to security within distributed systems.
Why did I decide to write this book?
I initially wrote the book for my lecture course in the School of Computer
Science at Cardiff University on Distributed Systems. I wanted to give the stu-
dents a broad overview of distributed-computing techniques that have evolved
over the past decade. The text therefore outlines the key applications and
middleware used to construct distributed applications today. I wrote each
lecture as a book chapter, and because these notes were extremely well
received by the students, I decided to extend them into a book for their use
and for others. So:
Who should read this book?
This book, I believe, has a wide-ranging scope. It was initially written for
BSc students, with an extensive computing background, and MSc students,
who have little or no prior computing experience, i.e., some students had
never written a line of code in their lives! Therefore, this book should
appeal to people with various computer programming abilities but also to the
casual reader who is simply interested in the recent advances in the distributed
systems world.
Readers will learn about the various distributed systems that are available
today. For a designer of new applications, this will provide a good reference.
For students, this text would accompany any course on distributed computing
to give a broader context of the subject area. For a casual reader, interested in
P2P and Grid computing, the book will give a broad overview of the field and
specifics about how such systems operate in practice without delving into the
low-level details. For example, to both casual and programming-level readers,
all chapters will be of interest, except some parts of the Gnutella chapter
and some sections of the deployment chapters, which are more tuned to the
lower-level mechanisms and therefore targeted more to programmers.
Organization
Chapter 1: Introduction: This chapter introduces distributed systems,
paying particular attention to the role of middleware.
A taxonomy is constructed for distributed systems ranging on a scale from
centralized to decentralized depending on how resources or services are
organized, discovered and how they communicate with each other. This
will serve as an underlying theme for the understanding of the various
applications and middleware discussed in this book.
Chapter 2: Peer-2-Peer Systems: This chapter gives a brief history of
client/server and peer-to-peer computing. The current P2P definition is
stated and specifics of the P2P environment that distinguish it from
client/server are provided: e.g., transient nodes, multi-hop, NAT, firewalls
etc. Several examples of P2P technologies are given, along with applica-
tion scenarios for their use and categorizations of their behaviour within
the taxonomy described in the first chapter.
Chapter 3: Web Services: This chapter introduces the concept of machine-
to-machine communication and how this fits in with the existing Web
technologies and future scopes. This leads on to a high-level overview of
Web services, which illustrates the core concepts without getting bogged
down in the deployment details.
Chapter 4: Grid Computing: This chapter introduces the idea of a com-
putational Grid environment, which is typically composed of a number
of heterogeneous resources that may be owned and managed by different
administrators. The concept of a “virtual organization” is discussed along
with its security model, which employs a single sign-on mechanism. The
Globus toolkit, the reference implementation that can be used to program
computational Grids, is then outlined giving some typical scenarios.
Chapter 5: Jini: This chapter gives an overview of Jini, which provides an
example of a distributed-object based technology. Background is given on
the development of Jini and on the network plug-and-play manner in
which Jini accesses distributed objects. The discovery of look-up servers
and the searching for and use of Jini services are described in detail, and
advanced Jini issues, such as leasing and events, are discussed.
Chapter 6: Gnutella: This chapter combines a conceptual overview of
Gnutella and the details of the actual Gnutella protocol specification.
Many empirical studies are then outlined that illustrate the behaviour of
the Gnutella network in practice and show the many issues which need to
be overcome in order for this decentralized structure to succeed. Finally,
the advantages and disadvantages of this approach are discussed.
Chapter 7: Scalability: In this chapter, we look at scalability issues by
analysing the manner in which peers are organized within popular P2P
networks. First, social networks are introduced and compared against their
P2P counterparts. We then explore the use of decentralized P2P networks
within the context of file sharing. It is shown why, in practice, neither
extreme (i.e., completely centralized or decentralized architectures) gives
effective results and therefore why most current P2P applications use a
hybrid of the two approaches.
Chapter 8: Security: This chapter covers the basic elements of security
in a distributed system. It covers the various ways that a third party can
gain access to data and the design issues involved in building a distributed
security system. It then gives a basic overview of cryptography and de-
scribes the various ways in which secure channels can be set up, using
public-key pairs or by using symmetric keys, e.g., shared secret keys or
session keys. Finally, secure mobile code is discussed in the context
of sandboxing.
Chapter 9: Freenet: This chapter gives a concise description of the Freenet
distributed information storage system, which is a real-world example of
how the various technologies discussed so far can be integrated and used
within a single system. For example: Freenet is designed to work within a
P2P environment; it addresses scalability through the use of an adaptive
routing algorithm that creates a centralized/decentralized network topology
dynamically; and it addresses a number of privacy issues by using a
combination of hash functions and public/private key encryption.
Chapter 10: Jxta: This chapter introduces Jxta, which provides a set of open,
generalized P2P protocols to allow any connected device (cell phone to
PDA, PC to server) on the network to communicate and collaborate. An
overview of the motivation behind Jxta is given followed by a description
of its key concepts. Finally, a detailed overview of the six Jxta protocols
is given.
Chapter 11: Distributed Object Deployment Using Jini: This chapter
describes how one would use Jini in practice. This is illustrated through
several simple RMI and Jini applications that show how the individual
parts and protocols fit together, give practical context for the Jini
chapter and show how the deployment differs from other systems discussed
in this book.
Chapter 12: P2P Deployment Using Jxta: This chapter uses several
Jxta programming examples to illustrate some issues of programming and
operating within a P2P environment. A number of key practical issues,
such as out-of-date advertisements and peer configuration, which have to
be dealt with in any P2P application are discussed and illustrated by
outlining the potential solutions employed by Jxta.
Chapter 13: Web Services Deployment: This chapter describes the
Web services deployment technologies, typically used for representing and
invoking Web services. Specifically, three core technologies are discussed in
detail: SOAP for wrapping XML messages within an envelope, WSDL for
representing the Web services interface description, and UDDI for storing
indexes of the locations of Web services.
Chapter 14: OGSA: This chapter discusses the Open Grid Service Ar-
chitecture (OGSA), which extends Web services into the Grid computing
arena by using WSDL to achieve self-descriptive, discoverable services
that can be referenced during their lifetime, i.e., maintain state. OGSI is
discussed, which provides an implementation of the OGSA ideas. This is
followed by OGSI’s successor, WSRF, which translates the OGSI definitions
into representations that are compatible with other emerging Web
service standards.
Disclaimer
Within this book, I draw on a number of examples from file-sharing programs,
such as Napster, Gnutella (e.g., Limewire), FastTrack and KaZaA, to name a
few. The reason for this is to illustrate the different approaches to the
organization of distributed systems in a computational scientific context.
In using these examples, I am under no circumstances endorsing or supporting
any of these file-sharing applications in their current legal battles
concerning copyright issues.
My focus here is on the use of this infrastructure in many other scientific
situations where there is no question of their legality. We can learn a lot from
such applications when designing future Grids and P2P systems, both from
a computational science aspect and from a social aspect, in the sense of how
users behave as computing peers within such a system, i.e., do they share or
not? These studies give us insight into how we may approach the scalability
issues in future distributed systems.
English Spelling
I struggled with the appropriate spelling of some words which, in British
English, should (arguably) be spelt with an ‘s’ but which, in almost all
related literature within this subject area, are spelt with a ‘z’, e.g.,
organize, centralize, etc. After much dialogue with colleagues and Springer,
we decided on a compromise; that is, I shall use an amalgamation of American
English and British English known as mid-Atlantic English. Therefore, for
the set of such words, I will use the ‘z’ form. These include derivatives
of: authorize, centralize, decentralize, generalize, maximize, minimize,
organize, quantize, serialize, specialize, standardize, utilize, virtualize
and visualize. Otherwise, I will use the British English spelling, e.g.,
advertise, characterise, conceptualise, customise, realise, recognise,
stabilise, etc. Interestingly, however, even the Oxford Concise English
Dictionary lists many of these words in their ‘z’ form.
Acknowledgements
I would like to thank a number of people who provided sanity checks and
proof-reading for a number of chapters in this book. In particular, I’d like
to thank Shalil Majithia, Andrew Harrison, Omer Rana and Jonathon Giddy.
Also, many thanks to the numerous members of the GridLab, Triana and NRL
groups for their encouragement and enlightening discussions during the writ-
ing of this book. So, to name a few, thanks to Alex Hardisty, Andre Merzky,
Andrei Hutanu, Brian Adamson, Bernard Schutz, Joe Macker, Ed Seidel,
Gabrielle Allen, Ian Kelley, Jason Novotny, Roger Philp, Wangy, Matthew
Shields, Michael Russell, Oliver Wehrens, Felix Hupfeld, Rick Jones, Shel-
don Gardner, Thilo Kielmann, Jarek Nabrzyski, Sathya, Tom Goodale, David
Walker, Kelly Davis, Hartmut Kaiser, Dave Angulo, Alex Gray and Krzysztof
Kurowski.
Most of this book was written in Sicily and therefore, I’d like to thank
everyone I met there who made me feel so welcome and for those necessary
breaks in B&Js in Ragusa Ibla and il Bagatto in Siracusa.... Finally, thanks
to Matt for keeping his cool during some pretty daunting deadlines towards
the end of the writing of this book.
Cardiff, UK. April 2004
Ian Taylor
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Introduction to Distributed Systems . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Some Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Centralized and Decentralized Systems . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Resource Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.2 Resource Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 Resource Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Examples of Distributed Applications . . . . . . . . . . . . . . . . . . . . . . 10
1.4.1 A Web Server: Centralized. . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4.2 SETI@Home: Centralized . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4.3 Napster: Brokered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.4 Gnutella: Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Examples of Middleware. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.1 J2EE and JMS: Centralized . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.2 Jini: Brokered. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.5.3 Web Services: Brokered . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.4 Jxta: Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Part I Distributed Environments
2 Peer-2-Peer Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1 What is Peer to Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.1 Historical Peer to Peer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.2 Binding of Peers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.3 Modern Definition of Peer to Peer . . . . . . . . . . . . . . . . . . . 25
2.1.4 Social Impacts of P2P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.5 True Peer to Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.1.6 Why Peer-to-Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 The P2P Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
XIV Contents
2.2.1 Hubs, Switches, Bridges, Access Points and Routers . . . 31
2.2.2 NAT Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.3 Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.4 P2P Overlay Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3 P2P Example Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1 MP3 File Sharing with Napster . . . . . . . . . . . . . . . . . . . . . 37
2.3.2 Distributed Computing Using SETI@Home . . . . . . . . . . . 38
2.3.3 Instant Messaging with ICQ . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.4 File Sharing with Gnutella. . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3 Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.1.1 Looking Forward: What Do We Need? . . . . . . . . . . . . . . . 44
3.1.2 Representing Data and Semantics . . . . . . . . . . . . . . . . . . . 47
3.2 Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.1 A Minimal Web Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2.2 Web Services Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2.3 Web Services Development . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3 Service-Oriented Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.1 A Web Service SOA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.4 Common Web Service Misconceptions . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Web Services and Distributed Objects . . . . . . . . . . . . . . . 55
3.4.2 Web Services and Web Servers . . . . . . . . . . . . . . . . . . . . . . 55
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4 Grid Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.1 The Grid Dream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 Social Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 History of the Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.1 The First Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.2 The Second Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.3 The Third Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 The Grid Computing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.1 Virtual Organizations and the Sharing of Resources . . . . 64
4.5 To Be or Not to Be a Grid: These Are the Criteria... . . . . . . . . . 67
4.5.1 Centralized Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.2 Standard, Open, General-Purpose Protocols . . . . . . . . . . 68
4.5.3 Quality Of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6 Types of Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.7 The Globus Toolkit 2.x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.7.1 Globus Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.7.2 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.7.3 Information Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.7.4 Data Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Contents XV
4.7.5 Resource Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.8 Comments and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Part II Middleware, Applications and Supporting Technologies
5 Jini . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.1 Jini . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.1 Setting the Scene . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Jini’s Transport Backbone: RMI and Serialization . . . . . . . . . . . 84
5.2.1 RMI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.2.2 Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3 Jini Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.1 Jini in Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.4 Registering and Using Jini Services . . . . . . . . . . . . . . . . . . . . . . . . 93
5.4.1 Discovery: Finding Lookup Services . . . . . . . . . . . . . . . . . . 93
5.4.2 Join: Registering a Service (Jini Service) . . . . . . . . . . . . . 94
5.4.3 Lookup: Finding and Using Services (Jini Client) . . . . . . 96
5.5 Jini: Tying Things Together . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 Organization of Jini Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.6.1 Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6 Gnutella . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.1 History of Gnutella . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 What Is Gnutella? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3 A Gnutella Scenario: Connecting and Operating Within a
Gnutella Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.3.1 Discovering Peers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.3.2 Gnutella in Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.3.3 Searching Within Gnutella . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4 Gnutella 0.4 Protocol Description. . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.4.1 Gnutella Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.4.2 Gnutella Descriptor Header . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.4.3 Gnutella Payload: Ping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4.4 Gnutella Payload: Pong . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4.5 Gnutella Payload: Query . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.4.6 Gnutella Payload: QueryHit . . . . . . . . . . . . . . . . . . . . . . . . 111
6.4.7 Gnutella Payload: Push . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.5 File Downloads. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.6 Gnutella Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.7 More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
XVI Contents
7 Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.1 Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.2 P2P Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.2.1 Performance in P2P Networks . . . . . . . . . . . . . . . . . . . . . . 119
7.3 Peer Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.3.1 Centralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.3.2 Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.3.3 Hierarchical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.3.4 Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.4 Hybrid Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.4.1 Centralized/Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.4.2 Centralized/Centralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.4.3 Centralized/Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.5 The Convergence of Napster and Gnutella . . . . . . . . . . . . . . . . . . 127
7.6 A Southern Side-Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.7 Gnutella Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.7.1 Gnutella Free Riding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.7.2 Equal Peers?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.7.3 Power-Law Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.2 Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
8.2.1 Focus of Data Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
8.2.2 Layering of Security Mechanisms . . . . . . . . . . . . . . . . . . . . 136
8.2.3 Simplicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.3 Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.3.1 Basics of Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.3.2 Types of Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.3.3 Symmetric Cryptosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.3.4 Asymmetric Cryptosystem. . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.3.5 Hash Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8.4 Signing Messages with a Digital Signature . . . . . . . . . . . . . . . . . . 143
8.5 Secure Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.5.1 Secure Channels Using Symmetric Keys . . . . . . . . . . . . . . 145
8.5.2 Secure Channels Using Public/Private Keys . . . . . . . . . . 145
8.6 Secure Mobile Code: Creating a Sandbox . . . . . . . . . . . . . . . . . . . 147
8.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9 Freenet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.2 Freenet Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.2.1 Populating the Freenet Network . . . . . . . . . . . . . . . . . . . . . 152
9.2.2 Self-Organizing Adaptive Behaviour in Freenet . . . . . . . . 153
9.2.3 Requesting Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
9.2.4 Similarities with Other Peer Organization Techniques . . 155
9.3 Freenet Keys. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
9.3.1 Keyword-Signed Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.3.2 Signed Subspace Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
9.3.3 Content Hash Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
9.3.4 Clustering Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
9.4 Joining the Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
10 Jxta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
10.1 Background: Why Was Project Jxta Started? . . . . . . . . . . . . . . . 163
10.1.1 Interoperability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
10.1.2 Platform independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
10.1.3 Ubiquity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
10.2 Jxta Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
10.2.1 The Jxta Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
10.2.2 Jxta Peers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
10.2.3 Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
10.2.4 Advertisements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
10.2.5 Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
10.2.6 Modules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
10.3 Jxta Network Overlay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
10.3.1 Peer Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
10.3.2 Rendezvous Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
10.3.3 Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
10.3.4 Relay Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
10.4 The Jxta Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
10.4.1 The Peer Discovery Protocol . . . . . . . . . . . . . . . . . . . . . . . . 174
10.4.2 The Peer Resolver Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 175
10.4.3 The Peer Information Protocol . . . . . . . . . . . . . . . . . . . . . . 176
10.4.4 The Pipe Binding Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 176
10.4.5 The Endpoint Routing Protocol . . . . . . . . . . . . . . . . . . . . . 176
10.4.6 The Rendezvous Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . 176
10.5 A Jxta Scenario: Fitting Things Together . . . . . . . . . . . . . . . . . . . 176
10.6 Jxta Environment Considerations. . . . . . . . . . . . . . . . . . . . . . . . . . 177
10.6.1 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
10.6.2 NAT and Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.7 Comment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Part III Middleware Deployment
11 Distributed Object Deployment Using Jini . . . . . . . . . . . . . . . . 185
11.1 RMI Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
11.2 An RMI Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
11.2.1 The Java Proxy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
11.2.2 The Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
11.2.3 The Client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
11.2.4 Setting up the Environment . . . . . . . . . . . . . . . . . . . . . . . . 190
11.3 A Jini Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
11.3.1 The Remote Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.3.2 The Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11.3.3 The Client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
11.4 Running Jini Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.4.1 HTTP Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.4.2 RMID Daemon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.4.3 The Jini Lookup Service . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
11.4.4 Running the Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
11.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
12 P2P Deployment Using Jxta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
12.1 Jxta Programming: Three Examples Illustrated . . . . . . . . . . . . . 199
12.1.1 Starting the Jxta Platform . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.1.2 Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
12.1.3 Creating Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
12.2 Running Jxta Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
12.3 P2P Environment: The Jxta Approach . . . . . . . . . . . . . . . . . . . . . 209
12.3.1 Peer Configuration Using Jxta . . . . . . . . . . . . . . . . . . . . . . 209
12.3.2 Peer Configuration Management Within Jxta . . . . . . . . . 211
12.3.3 Running The Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
12.3.4 Jxta and P2P Advert Availability . . . . . . . . . . . . . . . . . . . 214
12.3.5 Expiration of Adverts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
12.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
13 Web Services Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
13.1 SOAP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
13.1.1 Just Like Sending a Letter. . . . . . . . . . . . . . . . . . . . . . . . . . 218
13.1.2 Web Services Architecture with SOAP . . . . . . . . . . . . . . . 219
13.1.3 The Anatomy of a SOAP Message . . . . . . . . . . . . . . . . . . . 221
13.2 WSDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
13.2.1 Service Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
13.2.2 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
13.2.3 Anatomy of a WSDL Document . . . . . . . . . . . . . . . . . . . . . 225
13.3 UDDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
13.4 Using Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
13.4.1 Axis Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
13.4.2 A Simple Web Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
13.4.3 Deploying a Web Service Using Axis . . . . . . . . . . . . . . . . . 232
13.4.4 Web Service Invocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
13.4.5 Cleaning Up and Un-Deploying . . . . . . . . . . . . . . . . . . . . . 235
13.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Part IV From Web Services to Future Grids
14 OGSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
14.1 OGSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
14.1.1 Grid Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
14.1.2 Virtual Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
14.1.3 OGSA Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
14.2 OGSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
14.2.1 Globus Toolkit, Version 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 249
14.3 WSRF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
14.3.1 Problems with OGSI. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
14.3.2 Grid Services or Resources?. . . . . . . . . . . . . . . . . . . . . . . . . 251
14.3.3 OGSI Functionality in WSRF . . . . . . . . . . . . . . . . . . . . . . . 251
14.3.4 Globus Toolkit, Version 4 . . . . . . . . . . . . . . . . . . . . . . . . . . 252
14.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
A Want to Find Out More? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
A.1 Grid Computing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
A.2 P2P Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
A.3 Distributed Object Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
A.4 Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
B RSA Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
1
Introduction
Recently, there has been an explosion of applications using peer-to-peer (P2P)
and Grid-computing technology. On the one hand, P2P has become ingrained
in current grass-roots Internet culture through applications like Gnutella [6]
and SETI@Home [3]. It has appeared in several popular magazines, including
Red Herring and Wired, and is frequently quoted as having been crowned by
Fortune as one of the four technologies that will shape the Internet’s future.
The popularity of P2P has spread to academic and industrial circles, propelled
by the media and by widespread debate both in and out of the courtroom.
However, such enormous hype and controversy have led to mistrust of the
technology as a serious distributed-systems platform for future computing,
although in reality there is significant substance behind it, as we shall see.
In parallel, there has been an overwhelming interest in Grid computing,
which is attempting to build the infrastructure to enable on-demand comput-
ing in a similar fashion to the way we access other utilities now, e.g., electricity.
Further, the introduction of the Open Grid Services Architecture (OGSA) [21]
has aligned this vision with the technological machine-to-machine capabilities
of Web services (see Chapter 3). This convergence has gained significant
input from both commercial and non-commercial organizations ([27] and [28])
and has a firm grounding in standardized Web technologies, which could
perhaps even lead to the kind of ubiquitous uptake necessary for such an
infrastructure to be globally deployed.
Although the underlying philosophies of Grid computing and P2P are
different, they both are attempting to solve the same problem, that is, to
create a virtual overlay [23] over the existing Internet to enable collaboration
and sharing of resources [24]. However, in implementation, the approaches
differ greatly. Whilst Grid computing connects virtual organizations [32] that
can cooperate in a collaborative fashion, P2P connects individual users using
highly transient devices and computers living at the edges of the Internet [46]
(i.e., behind NAT, firewalls etc.).
The name “Peers in a Client/Server World” describes the transitionary
evolution from the widespread client/server based Internet, dominant over
the past decade, back to the roots of the Internet where every peer had equal
status. Inevitably, both history and practicality will influence the next gen-
eration Internet as we attempt to migrate from the technical maturity and
robustness of the current Internet to its future vision. Therefore, as we move
forward, we must build upon the current infrastructure to address key issues
of widespread availability and deployment.
In this book, we address the key technologies that will help to shape the
next-generation Internet. P2P and distributed-object based technologies,
together with the promised pervasive deployment of Grid computing combined
with Web services, will be needed to address the fundamental issues of
creating a scalable, ubiquitous next-generation computing infrastructure.
Specifically, a comprehensive overview of current distributed-systems
technologies is given, covering P2P environments (Chapters 2, 6, 7, 9, 10 and
12), security techniques (Chapter 8), distributed-object systems (Chapters 5
and 11), Grid computing (Chapter 4) and both stateless (Chapters 3 and 13)
and stateful Web services (Chapter 14).
1.1 Introduction to Distributed Systems
A distributed system can be defined as follows:
“A distributed system is a collection of independent computers that appears
to its users as a single coherent system” [1]
There are two aspects to this: hardware and software. The hardware ma-
chines must be autonomous and the software must be organized in such a way
as to make the users think that they are dealing with a single system. Expand-
ing on these fundamentals, distributed systems typically have the following
characteristics; they should:
• be capable of dealing with heterogeneous devices, i.e., various vendors,
software stacks and operating systems should be able to interoperate
• be easy to expand and scale
• be permanently available (even though parts of it may not be)
• hide communication from the users.
In order for a distributed system to support a collection of heterogeneous
computers and networks while offering a single system view, the software stack
is often divided into two layers. At the higher layer, there are applications
(and users) and at the lower layer there is middleware, which interacts with
the underlying networks and computer systems to give applications and users
the transparency they need (see Fig. 1.1).
Fig. 1.1. The role of middleware in a distributed system; it hides the underlying
infrastructure away from the application and user level. [Figure labels:
Distributed Applications; Middleware Services; Machine A, Machine B and a third
machine, each with its own OS (e.g., Windows XP, Mac OS, Linux); Network.]
Middleware abstracts the underlying mechanisms and protocols from the
application developer and provides a collection of high-level capabilities that
make it far easier for programmers to develop and deploy their applications.
For example, within the middleware layer, there may be simple abstract
communication calls that do not specify which underlying mechanisms they
actually use, e.g., TCP/IP, UDP, Bluetooth etc. Such concrete deployment
bindings are often decided at run time, either through configuration files or
dynamically, and therefore depend on the particular deployment environment.
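The idea of an abstract communication call whose concrete binding is chosen at run time can be sketched as follows. This is only a minimal illustration: the names Transport, TcpTransport, UdpTransport and MiddlewareSketch, and the "transport" configuration key, are assumptions made for this example, and the two bindings merely return a description of what they would send rather than using a real network.

```java
import java.util.Properties;

/** Abstract communication call: the application never names the mechanism. */
interface Transport {
    String send(String destination, String message);
}

/** Hypothetical binding standing in for a TCP/IP implementation (no real I/O). */
class TcpTransport implements Transport {
    public String send(String destination, String message) {
        return "TCP -> " + destination + ": " + message;
    }
}

/** Hypothetical binding standing in for a UDP implementation (no real I/O). */
class UdpTransport implements Transport {
    public String send(String destination, String message) {
        return "UDP -> " + destination + ": " + message;
    }
}

class MiddlewareSketch {
    /** Chooses the concrete binding at run time from a configuration value. */
    static Transport bind(Properties config) {
        String name = config.getProperty("transport", "tcp");
        switch (name) {
            case "tcp": return new TcpTransport();
            case "udp": return new UdpTransport();
            default: throw new IllegalArgumentException("Unknown transport: " + name);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("transport", "udp"); // would normally be read from a file
        Transport transport = bind(config);
        // The application code below is identical whichever binding was chosen.
        System.out.println(transport.send("machineB", "hello"));
    }
}
```

The application only ever sees the Transport interface, so changing the deployment environment means changing the configuration value, not the application code.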
Middleware therefore provides the virtual overlay across the distributed
resources to enable transparent deployment across the underlying infrastruc-
tures. In this book, we will take a look at a number of different approaches in
designing the middleware abstraction layer by identifying the kinds of capa-
bilities that are exposed by the various types.
1.2 Some Terminology
Often, a number of terms are used to define a device or capability on a dis-
tributed network, e.g., node, resource, peer, agent, service, server etc. In this
section, common definitions are given which are used consistently throughout
this book. The definitions presented here do represent a compromise, however,
because distributed entities are often not identified in the same way in all
systems. Therefore, wherever appropriate, the terminology provided here is
given within the context of the system being described. The terms are defined
as follows:
• Resource: any hardware or software entity being represented or shared
on a distributed network. For example, a resource could be any of the fol-
lowing: a computer; a file storage system; a file; a communication channel;
a service, i.e., algorithm/function call; and so on
• Node: a generic term used to represent any device on a distributed net-
work. A node that performs one (or more) capabilities is often exposed as
a service
• Client: a consumer of information, e.g., a Web browser
• Server: a provider of information, e.g., a Web server or a peer offering
a file-sharing service
• Service: “a network-enabled entity that provides some capability” [21];
e.g., a Web server provides a remote HTTP file-retrieval service. A single
device can expose several capabilities as individual services
• Peer: a device that acts as both a consumer and a provider of
information.
Fig. 1.2. An overview of the terms used to describe distributed resources.
[Figure labels: Peer; Client; Server; Node; Computer; Device; Service; Resource.]
Figure 1.2 organizes these terms by showing the relationships between
them. Here, we can see that any device is an entity on the network. Devices
can also be referred to in many different ways, e.g., as a node, computer,
PDA, peer etc. Each device can run any number of clients, servers, services
or peers. A peer is a special kind of node, which acts as both a client and
a server.
There is often confusion about the term resource. The easiest way to think
of a resource is as any capability that is shared on a distributed network.
Shared resources can be exposed in a number of ways and can represent a
number of physical or virtual entities. For example, you can share: files (so
a file is a resource), CPU cycles, storage capabilities (i.e., a file system), a
service, e.g., a Web server or Web service, and so on. Therefore, everything in
Fig. 1.2 is a resource except a client, which does not share anything.
A service is a software entity that can be used to represent resources, and
therefore capabilities, on a network. There are numerous examples, e.g., Web
servers, Web services, Jini services, Jxta peers providing a service, and so
forth. In simple terms, services can be thought of as the network
counterparts of local function calls. Services receive a request (just like the
arguments to a function call) and (optionally) return a response (as do local
function calls). To illustrate this analogy, consider the functionality of a
standard HTTP Web server: it receives a request for an HTTP file and returns
the contents of that file, if found. If this were implemented as a local function
call in Java, it would look something like this:
String getWebPage(String httpfile)
This simple function call takes a file-name argument (including its
directory, e.g., /mydir/myfilename.html) and returns the contents of that local
file within a Java String object. This is basically what a Web server does.
However, within the Web server scenario, the user would provide an HTTP address
(e.g., http://www.google.com/index.html) and this would be converted into a
remote request to the specified Web server (e.g., http://www.google.com) for
the requested file (index.html). The entire process would also involve the use
of DNS (the Domain Name System) to locate the server. The client (e.g., the
Web browser) performs the same operation as our simple local reader but renders
the returned information in a specific way for the user, i.e., by interpreting
the HTML.
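To make the analogy concrete, the sketch below gives one possible body for the local getWebPage call and shows how an HTTP address splits into the server to contact and the file to request. The class name WebPageSketch and the helper splitAddress are illustrative assumptions; the example reads a local file only and makes no network request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class WebPageSketch {
    /** Local version: returns the contents of a local file as a String. */
    static String getWebPage(String httpfile) {
        try {
            return new String(Files.readAllBytes(Paths.get(httpfile)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Corresponds to the "if found" condition in the Web server case.
            throw new UncheckedIOException(e);
        }
    }

    /** Splits an HTTP address into the server to contact and the file to request. */
    static String[] splitAddress(String address) {
        URI uri = URI.create(address);
        return new String[] { uri.getHost(), uri.getPath() };
    }

    public static void main(String[] args) {
        String[] parts = splitAddress("http://www.google.com/index.html");
        System.out.println("server = " + parts[0] + ", file = " + parts[1]);
    }
}
```

In the remote case, the file part plays the role of the function argument and the server part identifies where the "function" lives.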
1.3 Centralized and Decentralized Systems
In this section, the middleware and systems outlined in this book are
classified in a taxonomy according to a scale ranging from centralized to
decentralized. The distributed architectures are divided into categories that
define an axis on the comparison space. On one side of this spectrum, we have
centralized systems, e.g., typical client/server based systems, and on the other
side, we have decentralized systems, often classified as P2P. In the centre is a
mix of the two extremes in the form of hybrid systems, e.g., brokered, where
a system may broker the functionality or communication request to another
service. This taxonomy sets the scene for the specifics of each system, which
will be outlined in the chapters to follow, and serves as a simple look-up table
for determining a system’s high-level behaviour.
The boundaries are not clean-cut however and there are a number of fac-
tors that can determine the centralized nature of a system. Even systems
that are considered fully decentralized can, in practice, employ some degrees
of centralization, albeit often in a self-organizing fashion [2]. Typically,
decentralized systems adopt immense redundancy, both in the discovery of
information and in the content itself, by dynamically replicating information
across many other peers on the network.
Broadly speaking, there are three main areas that determine whether a
system is centralized or decentralized:
1. Resource Discovery
2. Resource Availability
3. Resource Communication
One important consideration to bear in mind as we talk about the degree
of centralization of systems is that of scalability. When we say a resource is
centralized, we do not mean to imply that there is only one server serving the
information; rather, we mean that there is a fixed number of servers (possibly
one) providing the information, and that this number does not scale
proportionately with the size of the network. Obviously, there are many levels
of granularity here, hence the adoption of a sliding scale illustrating the
various levels on a resource-organization continuum.
1.3.1 Resource Discovery
Within any distributed system, there needs to be a mechanism for discovering
the resources. This process is referred to as discovery and a service which
supplies this information is called a discovery service (e.g., DNS, Jini Lookup,
Jxta Rendezvous, JNDI, UDDI etc.). There are a number of mechanisms for
discovering distributed resources, which are often highly dependent on the
type of application or middleware. For example, resource discovery can be
organized centrally, e.g., DNS, or decentrally, e.g., Gnutella.
Discovery is typically a two-stage process. First, the discovery service needs
to be located; then the relevant information is retrieved. The mechanism of
how the information is retrieved can be highly decentralized (as in the lower
layers of DNS), even though access to the discovery service is centralized.
Here, we are concerned with the discovery mechanism as a whole. Therefore,
a system that has centralized access to a decentralized search is classified by
its lowest common denominator, i.e., the centralized access. The two
examples given below illustrate this.
As our first example, let’s consider DNS, which is used to discover an
Internet resource. DNS works in much the same way as a telephone book.
You give DNS an Internet site name (e.g., www.cs.cf.ac.uk) and the DNS
server returns to you the IP address (e.g., 131.251.49.190) for locating this
site. In the same way as you keep a list of name/number pairs on your mobile
phone, DNS keeps a list of name/IP number pairs.
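The name-to-address look-up described above can be tried from Java with the standard java.net.InetAddress class, which hands the name to whatever resolver the platform is configured to use. The class name DnsLookupSketch and the method lookup are assumptions made for this sketch, and the address returned for any given name depends on the network at the time it is run.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class DnsLookupSketch {
    /** Returns the textual IP address for a name, or null if it cannot be resolved. */
    static String lookup(String siteName) {
        try {
            // Delegates to the resolver configured on this machine.
            return InetAddress.getByName(siteName).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pass a site name on the command line; defaults to the local machine.
        String name = args.length > 0 ? args[0] : "localhost";
        System.out.println(name + " -> " + lookup(name));
    }
}
```

Note that the caller never says which DNS servers to ask; that small, fixed set of servers is part of the machine's configuration.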
DNS is not centralized in structure, but access to the discovery service
certainly is, because there are generally only a couple of specified hosts that
act as DNS servers. Typically, users specify a small number of DNS servers
(e.g., one or two), which is a narrow entry point relative to the number of
services reached through it. If these servers go down then access to DNS
information is disabled. However, behind this small gateway of hosts, the
storage of DNS information is massively hierarchical, employing an efficient
decentralized look-up mechanism that is spread amongst many hosts.
Another illustration here is the Web site Google. Google is certainly a
centralized Web server in the sense that there is only one Google machine (at a
specific time) that binds to the address http://www.google.com. When we ask
DNS to provide the Google address, it returns the IP address 168.127.47.8,
which allows you to contact the main Google server directly. However, Google
is a Web search engine that is used by millions of people daily and conse-
quently it stores a massive number of entries (around 1.6 billion). To access
this information, it relies on a database that uses a parallel cluster of 10,000
Linux machines to provide the service (at the time of writing). Therefore, the
access and storage of this information, from a user’s perspective, is central-
ized but from a search or computational perspective, it is certainly distributed
across many machines.
1.3.2 Resource Availability
Another important factor is the availability of resources. Again, Web servers
fall into the centralized category here because there is only one IP address
that hosts a particular site. If that machine goes down then the Web site is
unavailable. Of course, machines could be made fault tolerant by replicating
the Web site and employing some internal switching mechanisms, but the
site’s availability would still depend on that single IP address.
Other systems, however, use a more decentralized approach by offering
many duplicate services that can perform the same functionality. Resource
availability is tied in closely to resource discovery. There are many examples
here but to illustrate various availability levels, let’s briefly consider the
sharing of files on the Internet through the use of three approaches, which are
illustrated in Fig. 1.3:
1. MP3.com
2. Napster
3. Gnutella.
Fig. 1.3. A comparison of service availability from centralized, brokered and
decentralized systems. [Figure panels: MP3.com Scenario (User, Mp3.com);
Napster Scenario (User, Napster.com); Gnutella Scenario.]
MP3.com contains a number of MP3 files that are stored locally at (or
behind) the Web site. If the Web site or the hard disk(s) containing the
database goes down, then users have no access to the content.
Napster, on the other hand, stores the MP3 files on the actual users’
machines and napster.com is used as a massive index (or meeting place) for
connecting users. Users connect to Napster to search for the files they desire
and thereafter connect to users directly to download the file. Therefore, each
MP3 file is distributed across a number of servers making it more reliable
against failure.
However, as the search is centralized, it is dependent on the availability
of the main Web site; i.e., if the Web site goes down then access to the MP3
files would also be lost. Interestingly, the difference between MP3.com and
Napster is smaller than you may think: one centralizes the files, whilst the
other centralizes the addresses of the files. Either is susceptible to failure if
the Web site goes down. The difference in Napster’s case is that, if the Web
site goes down then current users can still finish downloading the current files
they have discovered since the communication is decentralized from the main
search engine. Therefore, if a user has already located the file and initiated
the download process, then the availability of the Web site does not matter
and they can quite happily carry on using the service (but not search for more
files).
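The Napster arrangement can be sketched in a few lines: the index stores only who has which file, while the files themselves stay on the users' machines. The classes FilePeer, CentralIndex and BrokeredSketch are hypothetical names for this in-memory illustration and do not reflect Napster's actual protocol.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A user's machine: it stores the files itself and serves them directly. */
class FilePeer {
    private final Map<String, String> files = new HashMap<>();
    void share(String fileName, String contents) { files.put(fileName, contents); }
    String download(String fileName) { return files.get(fileName); }
}

/** The central index: it records only who has which file, never the files. */
class CentralIndex {
    private final Map<String, List<FilePeer>> peersByFile = new HashMap<>();
    private boolean online = true;

    void register(String fileName, FilePeer peer) {
        peersByFile.computeIfAbsent(fileName, k -> new ArrayList<>()).add(peer);
    }

    void setOnline(boolean online) { this.online = online; }

    List<FilePeer> search(String fileName) {
        if (!online) {
            throw new IllegalStateException("index unavailable: no new searches");
        }
        return peersByFile.getOrDefault(fileName, Collections.emptyList());
    }
}

class BrokeredSketch {
    public static void main(String[] args) {
        FilePeer alice = new FilePeer();
        alice.share("song.mp3", "...bytes...");
        CentralIndex index = new CentralIndex();
        index.register("song.mp3", alice);

        FilePeer found = index.search("song.mp3").get(0); // discovery: centralized
        index.setOnline(false);                            // the Web site goes down
        System.out.println(found.download("song.mp3"));    // transfer: still works
    }
}
```

Once the peer has been discovered, the download no longer involves the index, which is why an in-progress transfer survives the index going down while new searches do not.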
Thirdly, let’s consider Gnutella. Gnutella has neither a centralized
search facility nor a central storage facility for the files. Each user in the
network runs a servent (a client and a server), which allows him/her to act as
both a provider and consumer of information (as in Napster) and, furthermore,
to act as a search facility. Servents search for files by contacting the other
servents they are connected to, and these servents in turn contact the servents
they are connected to, and so on. Therefore, if some of the servents are
unavailable, users can almost certainly still reach the file they require
(assuming it is available at all).
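This neighbour-to-neighbour search can be sketched as a breadth-first flood over an in-memory graph of servents. The sketch is a simplification under stated assumptions: the Servent class, its fields and the hop limit ttl are names chosen for this example, and no real Gnutella messages are exchanged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Servent {
    final String name;
    final Set<String> files = new HashSet<>();
    final List<Servent> neighbours = new ArrayList<>();
    boolean available = true;

    Servent(String name) { this.name = name; }

    static void connect(Servent a, Servent b) {
        a.neighbours.add(b);
        b.neighbours.add(a);
    }

    /** Names of other servents holding the file, reachable within ttl hops. */
    List<String> search(String fileName, int ttl) {
        List<String> hits = new ArrayList<>();
        Set<Servent> seen = new HashSet<>();
        List<Servent> frontier = new ArrayList<>();
        seen.add(this);
        frontier.add(this);
        for (int hop = 0; hop <= ttl && !frontier.isEmpty(); hop++) {
            List<Servent> next = new ArrayList<>();
            for (Servent s : frontier) {
                if (s != this && s.files.contains(fileName)) {
                    hits.add(s.name);
                }
                // Pass the query on to every available neighbour not yet asked.
                for (Servent n : s.neighbours) {
                    if (n.available && seen.add(n)) {
                        next.add(n);
                    }
                }
            }
            frontier = next;
        }
        return hits;
    }

    public static void main(String[] args) {
        Servent a = new Servent("A"), b = new Servent("B");
        Servent c = new Servent("C"), d = new Servent("D");
        connect(a, b); connect(a, c); connect(b, d); connect(c, d);
        d.files.add("song.mp3");
        b.available = false; // one route is lost, but D is still reachable via C
        System.out.println(a.search("song.mp3", 3));
    }
}
```

Because there are two routes from A to D, losing servent B does not prevent the file from being found; only losing every route would.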
Here, therefore, it is important to insert redundancy in both the discovery
and availability of the resources for a system to be truly robust against single-
point failure. Often, when there are a number of duplicated resources available
but the discovery of such resources is centralized, we call this a brokered
system; i.e., the discovery service brokers the request to another service. Some
examples of brokered systems include Napster, Jini, ICQ and CORBA.
1.3.3 Resource Communication
The last factor is that of resource communication. There are two methods of
communication between resources of a distributed system:
1. Brokered Communication: where the communication is always passed
through a central server and therefore a resource does not have to reference
the other resource directly
2. Point-to-Point (or Peer-to-Peer) Communication: this involves a
direct connection (although this connection may be multi-hop) between
the sender and the receiver. In this case, the sender is aware of the re-
ceiver’s location.
Both forms of communication have implications for the centralized
nature of the systems. In the first case for brokered communication, there is
always a central server which passes the information between one resource and
another (i.e., centralized). Further, it is almost certainly the case that such sys-
tems are centralized from the resource discovery and availability standpoints
also, since this level of communication implies fundamental central organiza-
tion. Some examples here are J2EE, JMS chat and many publish/subscribe
systems.
Second, many systems use point-to-point connections, e.g., Napster and
Gnutella, but so do Web servers. This category is therefore split across the
scale, and what matters is how centralized the communication is with respect
to the types of connections.
For example, in the Web server example, communication always originates
from the user. There exists a many-to-one relationship between users and
the Web server and therefore this is considered centralized communication.
This is illustrated in Fig. 1.4, where an obvious centralized communication
pattern is seen for the Web server case.
(Figure 1.4 labels. Equal peers: communication is supposed to be even; i.e.,
each provider is also a server of information and each node has an equal
number of connections. Web server: a many-to-one relationship between users
and the Web server, which can therefore be considered centralized
communication.)
Fig. 1.4. The centralization of communication: a truly decentralized system would
have even connections across hosts, rather than a many-to-one type of connectivity.
However, in more decentralized systems, such as Napster and Gnutella,
communication is more evenly distributed across the resources; i.e., each
provider of information is also a server of information, and therefore the
connectivity leans more towards one-to-one than many-to-one. This equal
distribution across the resources (known as equal peers) decentralizes
communication across the entire system. However, in practice this is almost
never the case because of the behavioural patterns exhibited by users of such
networks; e.g., some users do not share files and others share many (see
Section 7.7).
1.4 Examples of Distributed Applications
In this section, the criteria defining the taxonomy are applied to several well-
known examples of existing distributed applications and middleware. The ex-
amples given here serve as a point of reference for each chapter that describes
the particular application or middleware in more detail.
1.4.1 A Web Server: Centralized
A good example of a centralized system is a Web server. Clients (i.e., users) use
their Web browser to navigate Web pages on one or more Web sites. Each Web
site is static to the particular domain with which it is associated. A Web
server is therefore centralized in every sense: it has centralized discovery
(through DNS), it is either available or not, and all communication is
centralized to the particular Web server being contacted. Communication is
point to point but there is a many-to-one relationship between the users of
this service and the server itself.
Fig. 1.5. Taxonomy for a Web server (axes: resource availability, resource
discovery and resource communication; scale: centralized to decentralized).
The circles in Fig. 1.5 show the position where a Web server lies on the cen-
tralized/decentralized scale for the three categories listed: resource discovery,
resource availability and resource communication. The scale at the right-hand
side of this graph indicates the broad granularity of our measurements (finer
levels would not really change the outcome much anyway) but somewhere
around the mid-point would denote the brokered case.
With brokering, typically one service brokers the request to another. DNS
does not fall into this category since it has no intrinsic functionality or se-
mantics itself. Web forwarding is a kind of brokering in this sense but this
is a one-to-one forwarding. Typically, brokering involves making a decision
about where to broker the request and therefore typically, there are many ser-
vices offering the same functionality from which to choose. Communication
can also be brokered by the server acting as a coordinator between the sender
and receiver.
1.4.2 SETI@Home: Centralized
Fig. 1.6. Taxonomy for SETI@Home.
SETI@Home (Search for Extraterrestrial Intelligence) [3] is a project that
analyses data from a radio telescope to search for signs of extraterrestrial life.
Each user who takes part in this project downloads a data set and executes
some signal-processing tasks. The actual program is implemented as a screen
saver and therefore only operates when the computer is idle. The SETI@Home
project has used over a million years of CPU time at the time of writing.
Here, the entire system is run from the SETI@Home Web site. Users down-
load the code and also the data when they are available to process. Therefore,
the discovery is centralized (DNS) and the communication is centralized to the
Web site. Resource availability is also centralized because without the avail-
ability of the Web site, the many SETI nodes cannot do anything since they
need this server to download the next chunk of data. This taxonomy also ap-
plies to BOINC [38], which is the new open source release of the SETI@Home
infrastructure. SETI is discussed in more detail in Chapter 2.
Fig. 1.7. Taxonomy for Napster.
1.4.3 Napster: Brokered
A good example of a brokered system is Napster [4]. Napster stores informa-
tion about the location of peers and music files in a centralized way but then
lets the peers communicate directly when they transfer files.
Here, therefore, the discovery and availability are centralized through the
Napster Web site but the communication between the peers is decentralized.
The availability of the resources (i.e., files) is nevertheless less
centralized to a degree, because users can still download a file even if the
Napster server goes down. Users cannot, however, search for new resources
when the Web site is unavailable and are therefore limited in this respect.
Napster is described in more detail in Chapter 2.
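The brokered behaviour described above can be sketched as follows. This is an
illustrative model only, not Napster's actual protocol: `IndexServer`, `Peer`
and their methods are hypothetical names, and the "transfer" is simulated by
copying a file name between in-memory sets.

```python
class IndexServer:
    """Central index: maps a file name to the peers that share it."""
    def __init__(self):
        self.online = True
        self._index = {}

    def publish(self, peer, filename):
        self._index.setdefault(filename, []).append(peer)

    def search(self, filename):
        if not self.online:
            raise ConnectionError("index server unavailable")
        return list(self._index.get(filename, []))


class Peer:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)

    def download_from(self, other, filename):
        """Direct peer-to-peer transfer; the index server is not involved."""
        if filename not in other.files:
            raise FileNotFoundError(filename)
        self.files.add(filename)


server = IndexServer()
provider = Peer("provider", ["song.mp3"])
consumer = Peer("consumer")
server.publish(provider, "song.mp3")

sources = server.search("song.mp3")             # centralized discovery
server.online = False                           # the index goes down
consumer.download_from(sources[0], "song.mp3")  # the transfer still works
```

Once the server is offline, a further call to `server.search` fails, which
mirrors the limitation noted in the text: existing transfers continue but new
searches do not.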
1.4.4 Gnutella: Decentralized
A popular example of a decentralized system is Gnutella [6] where discovery,
availability and communication are completely decentralized over the network.
Gnutella is discussed in detail in Chapter 6.
Fig. 1.8. Taxonomy for Gnutella.
In theory Gnutella is completely decentralized, but is this really true in
practice? Decentralized networks are inherently self-organizing and so it is
not only possible but indeed very likely that strong servers of information
(the so-called super-peers in Gnutella) could easily turn a decentralized
network into a semi-centralized one when peers contain an uneven amount of
content. Whether this comes about through behavioural patterns or by
artificially creating a centralized-decentralized structure, the resulting
network is no longer completely decentralized. This is discussed in detail in
Chapter 7.
It is no coincidence, for example, that this evolution of hybrid decentral-
ized and centralized systems echoes the evolution of other types of systems
such as Usenet [62]. The history of Usenet shows us that peer-to-peer (de-
centralization) and client/server (centralization) are not mutually exclusive.
Usenet was originally peer-to-peer. Sites connected via a modem and agreed
to exchange information (news and mail) with each other (UUCP). However,
over time, it became obvious that certain sites had better servers than others
and these sites went on to form the Usenet backbone. Today, the volume of
Usenet is enormous and servers on the backbone can elect how much infor-
mation they want to serve and they get added to the Usenet network in a
decentralized fashion. Even the addition of new newsgroups is not centralized
as users have to vote for a newsgroup before it gets initiated.
1.5 Examples of Middleware
1.5.1 J2EE and JMS: Centralized
Fig. 1.9. Taxonomy for JMS.
The Java 2 Platform, Enterprise Edition (J2EE) [13] is an example of a
centrally controlled system. Here, one Web site is the manager of all
interaction between clients. Clients in the Java Message Service (JMS) do not
know the whereabouts of other clients because this knowledge is stored within
the central manager on the J2EE server. The entire system is based around a
Web site and therefore the discovery is central.
JMS is used as a publish/subscribe mechanism within the J2EE environ-
ment (amongst other things) and is quite typical of other messaging systems,
e.g., ICQ where messages are brokered through a central server in order to
get to their destination. Therefore, the communication is brokered through
the Web site. Further, there is only one copy of the Web site (typically these
are quite complicated to set up) and therefore the availability is centralized
also.
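The publish/subscribe pattern mentioned above can be sketched in a few lines.
Note that this is not the JMS API: `MessageBroker`, `subscribe` and `publish`
are hypothetical names for an in-memory illustration of messages being
brokered through one central component.

```python
from collections import defaultdict


class MessageBroker:
    """Central server: publishers and subscribers never reference each other."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return the delivery count."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


received = []
broker = MessageBroker()
broker.subscribe("chat", received.append)
broker.subscribe("chat", lambda m: received.append(m.upper()))
delivered = broker.publish("chat", "hello")
```

Because all state lives in the single broker object, losing it removes
discovery, communication and availability at once, which is why the text
places this design at the centralized end of the scale.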
Fig. 1.10. Taxonomy for Jini.
1.5.2 Jini: Brokered
Jini [78] allows Java objects to become network-enabled services that can be
distributed in a network ‘plug and play’ manner. In a running Jini system,
there are three main players. There is a service, such as a printer, a super-
computer running a software service etc. There is a client which would like
to make use of this service. Third, there is a lookup service (service locator)
which acts as a broker/trader/locator between services and clients. Jini is
discussed in detail in Chapters 5 and 11.
Jini is another example of a brokered system. Jini clients find out about
services by using the lookup server. The lookup server brokers the request
to a matching service and thereafter the communication takes place directly
between the client and services. Therefore, the availability is centralized in
the sense that it is dependent on the Jini lookup service but on the other
hand, once a client discovers a service it wishes to use, the client and service
can carry on communicating without the availability of the lookup service.
Therefore, as in previous brokered systems, the availability is better than a
strict centralized system.
Fig. 1.11. Taxonomy for Web services.
1.5.3 Web Services: Brokered
At the core of the Web services model is the notion of a service, which can
be described, discovered and invoked using standard XML technologies such
as SOAP, WSDL and UDDI. Conventionally, Web services are described by a
WSDL document, advertised and discovered using a UDDI server and invoked
with a message conforming to the SOAP specification.
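To give a feel for what "a message conforming to the SOAP specification" looks
like, the sketch below builds a SOAP 1.1 request envelope with Python's
standard `xml.etree.ElementTree` module. The envelope namespace is the SOAP
1.1 one; the service namespace `urn:example:quotes`, the operation `GetQuote`
and its `symbol` parameter are hypothetical and stand in for whatever a
service's WSDL document would actually describe.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:quotes"                      # hypothetical service namespace


def build_request(operation, params):
    """Return a SOAP 1.1 request envelope as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("GetQuote", {"symbol": "ACME"})
```

In a real deployment this string would be sent to the endpoint advertised for
the service, typically over HTTP; that step is omitted here.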
Web services therefore use the same brokered model as other systems, such
as Napster, Jini or CORBA, and therefore have a similar taxonomy to those
systems. However, Web services differentiate themselves by being based
completely on open standards that have gained enormous support from thousands
of companies and have been adopted by several communities, including the
Global Grid Forum (GGF). Web services are discussed in detail in Chapters 3,
13 and 14.
1.5.4 Jxta: Decentralized
Project Jxta [15] defines a set of protocols that can be used to construct
peer-to-peer systems using any of the centralized, brokered and decentralized
approaches but its main aim is to facilitate the creation of decentralized
systems. Jxta’s goal is to develop basic building blocks and services to enable
P2P applications for interested groups of peers. Jxta will be discussed both
conceptually and from a programmer’s perspective in Chapters 10 and 12,
respectively.
Fig. 1.12. Taxonomy for JXTA.
Jxta can support any level of centralization/decentralization but its main
focus (and hence power) is to facilitate the development of decentralized appli-
cations. Therefore, in this context, Jxta peers can be located in a decentralized
fashion; they have much redundancy in their availability and their communi-
cation is point to point and therefore no central control authority is needed
for their operation.
1.6 Conclusion
In this chapter, the critical components of any distributed system were
outlined, concentrating particularly on the role of middleware. Distributed-
systems terminology was introduced, along with the notion of a service, which
will be used frequently within this book. We then discussed a taxonomy for
distributed systems based on a scale ranging from centralized to
decentralized, which factored in resource discovery, resource availability and
resource communication. Several well-known distributed applications and
middleware were classified using this taxonomy, which will serve as a point of
reference and give context to the distributed systems described in the rest of
this book.
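The taxonomy summarized here can be written down as a small data structure.
The sketch below records the three factors for the four applications of
Section 1.4 as the text states them. The `Taxonomy` class and the rule in
`overall` (any mixture of values counts as "brokered") are simplifying
assumptions made for illustration, not part of the taxonomy itself.

```python
from dataclasses import dataclass

CENTRALIZED, DECENTRALIZED = "centralized", "decentralized"


@dataclass(frozen=True)
class Taxonomy:
    discovery: str
    availability: str
    communication: str

    def overall(self):
        axes = {self.discovery, self.availability, self.communication}
        if axes == {CENTRALIZED}:
            return "centralized"
        if axes == {DECENTRALIZED}:
            return "decentralized"
        return "brokered"  # mixed values: an assumption made for this sketch


SYSTEMS = {
    "Web server": Taxonomy(CENTRALIZED, CENTRALIZED, CENTRALIZED),
    "SETI@Home": Taxonomy(CENTRALIZED, CENTRALIZED, CENTRALIZED),
    "Napster": Taxonomy(CENTRALIZED, CENTRALIZED, DECENTRALIZED),
    "Gnutella": Taxonomy(DECENTRALIZED, DECENTRALIZED, DECENTRALIZED),
}

labels = {name: system.overall() for name, system in SYSTEMS.items()}
```

A three-valued scale is coarse, which matches the broad granularity of the
figures; finer levels could be modelled by replacing the strings with numbers.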
Part I
Distributed Environments
In this book, there are four main themes: distributed environments,
middleware and applications, middleware deployment and future trends. We begin
by setting the scene and introducing three diverse, yet somewhat
complementary, technologies that have evolved over the past several years:
peer to peer, Web services and Grid computing. Each of these technological
areas addresses specific issues within the distributed-systems spectrum and,
as we look ahead, it is highly likely that each will play an important role in
contributing to our future distributed-systems infrastructure.
2
Peer-2-Peer Systems
At the time of writing, there are one and a half billion devices worldwide (e.g.,
PCs, phones, PDAs, etc.), a figure which is rising rapidly. Surveys have stated
that the number of Internet users surpassed 530 million in 2001 and predictions
indicate that this will double to 1.12 billion by year-end 2005 [175].
The computer hardware industry has also been characterised by expo-
nential production volumes. Gordon Moore, the co-founder of Intel, in his
famous observation in 1965 [140] (made just four years after the first planar
integrated circuit was discovered), predicted that the number of transistors
on integrated circuits would double every few years. Indeed this prediction,
thereafter called Moore’s law, remains true up until today and Intel predicts
that this will remain true at least until the end of this decade [141].
Such acceleration in development has been made possible by the mas-
sive investment by companies who deal with comparatively short product life
cycles. Each user now in this massive network has the CPU capability of
more than 100 times that of an early 1990s supercomputer and surprisingly,
GartnerGroup research reveals that over 95% of today’s PC power is wasted.
The potential of such a distributed computing resource has been in some ways
demonstrated by the SETI@Home project [3], having used over a million years
of CPU time at the time of writing.
In this chapter, peer-to-peer computing, a possible paradigm for making
use of such devices, is discussed. An historical perspective is given, followed
by a definition, taxonomy and justification for P2P computing. A background
into the P2P environment is given followed by examples of several P2P appli-
cations that operate within such an environment.
2.1 What is Peer to Peer?
This section gives a brief background and history of the term “peer to peer”
and describes its definition in the current context. Examples of P2P
technologies are given followed by categorizations of their behaviour within
the taxonomy described in the first chapter.
2.1.1 Historical Peer to Peer
Peer to peer was originally used to describe the communication of two peers
and is analogous to a telephone conversation. A phone conversation involves
two people (peers) of equal status communicating over a point-to-point
connection. Simply put, this is what P2P is: a point-to-point connection
between two equal participants.
The Internet started as a peer-to-peer system. The goal of the original
ARPANET was to share computing resources around the USA. Its challenge
was to connect a set of distributed resources, using different network
connectivity, within one common network architecture. The first hosts on the
ARPANET were several US research sites, e.g., the University of California,
Los Angeles, the University of California, Santa Barbara, SRI and the
University of Utah. These were already independent computing sites with equal
status and the ARPANET connected them as such, not in a master/slave or
client/server relationship but rather as equal computing peers.
From the late 1960s until 1994, the Internet had one model of connectivity.
Machines were assumed to be always switched on, always connected, and
assigned permanent IP addresses. The original DNS system was designed for
this environment, where a change in IP address was assumed to be abnormal
and rare, and could take days to propagate through the system.
However, with the invention of Mosaic, another model began to emerge
in the form of users connecting to the Internet from dial-up modems. This
created a second class of connectivity because PCs would enter and leave
the network frequently and unpredictably. Further, because ISPs began to
run out of IP addresses, they began to assign IP addresses dynamically for
each session, giving each PC a different, possibly masked, IP address. This
transient nature and instability prevented PCs from being assigned permanent
DNS entries, and therefore prevented most PC users from hosting any data
or network-facing applications locally.
For a few years, treating PCs as clients worked well. Over time though,
as hardware and software improved, the unused resources that existed behind
this veil of second-class connectivity started to look like something worth get-
ting at. Given the vast array of available processors mentioned earlier, the
software community is starting to take P2P applications very seriously. Most
importantly, P2P research is concerned with addressing some of the main
difficulties of current distributed computing: scalability, reliability and
interoperability.
2.1.2 Binding of Peers
Within today’s Internet, we rely on fixed IP addresses. When a user types
an address into his/her Web browser (such as http://www.google.com/), the
Web server address is translated into the IP address (e.g., 168.127.47.8) by a
domain name server (DNS). The Internet protocol (IP) then makes a routing
decision based on the IP address. If DNS is unavailable then typing
http://168.127.47.8/ into a browser would be equivalent since the Web page
is permanently bound to the IP address.
Fig. 2.1. The process whereby an Internet address is converted into the IP
address for locating a Web page on the Internet (http://www.google.com/ is
passed to DNS, which returns 168.127.47.8).
This is known as static or early binding. Figure 2.1 illustrates this pro-
cess graphically. Early bindings form a simple architecture very similar to an
address book on a mobile phone; e.g., the person’s name is statically bound
to his/her telephone number. This works in practice because typically people
have long-term (early) bindings with their phone numbers and Web sites have
long-term bindings with their IP addresses.
However, if a Web site changed its IP address several times a day then
this type of binding would become impractical. Within P2P networks this
is the norm: devices often do not have a fixed address because they are hidden
behind Network Address Translation (NAT) systems, and therefore need a
late binding between their network identifier and their current address.
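The difference between early and late binding can be illustrated with a small
lookup table. This is a sketch under stated assumptions: `PeerDirectory`, the
identifier `peer-42` and the addresses are all hypothetical, and a single
in-memory directory is used purely for brevity (a real P2P network would
resolve identifiers in a distributed way).

```python
class PeerDirectory:
    """Maps a stable peer ID to whatever address the peer has right now."""
    def __init__(self):
        self._current = {}

    def announce(self, peer_id, address):
        self._current[peer_id] = address  # overwritten on every reconnect

    def resolve(self, peer_id):
        return self._current.get(peer_id)  # None if the peer is unknown


directory = PeerDirectory()
directory.announce("peer-42", ("10.0.0.5", 6346))
early = directory.resolve("peer-42")  # resolved once and kept (early binding)

directory.announce("peer-42", ("10.0.3.9", 6346))  # the peer reconnects elsewhere
late = directory.resolve("peer-42")   # resolved at the time of use (late binding)
```

After the reconnect, `early` still holds the stale address while `late` holds
the current one, which is why transient peers need their identifier resolved
each time it is used.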
2.1.3 Modern Definition of Peer to Peer
With the emergence of new technologies in the late 1990s a new definition for
peer to peer has begun to emerge, as follows:
P2P is a class of applications that takes advantage of resources e.g.
storage, cycles, content, human presence, available at the edges of the
Internet (Shirky [46]).
Computers/devices “at the edges of the Internet” are those operating within
transient and often hostile environments. Devices within this environment: can
come and go frequently; can be hidden behind a firewall or operate outside of
DNS, e.g., by NAT (see next section); and often have to deal with differing
transport protocols, devices and operating systems (see Fig. 2.2 below). Often
the number of computers in a P2P network is enormous consisting of millions
of interconnecting peers.
This modern definition describes the P2P environment of devices and
resources, whereas previous definitions focused on the servent methodology
and decentralized nature of systems like Gnutella [6]. For example, in
Gnutella, there are two key differences compared to client/server-based
systems:
• A peer can act as both a client and a server (in Gnutella these are called
servents, i.e., server and client).
• The network is completely decentralized and has no central point of con-
trol. Peers in a Gnutella network are typically connected to three or four
other nodes and to search the network a query is broadcast throughout
the network.
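The broadcast search described in the second point can be sketched as a
breadth-first flood over the servent graph. The function name `flood_search`,
the sample topology and the hop limit `ttl` are assumptions made for this
illustration; the text above does not specify how far a query travels.

```python
from collections import deque


def flood_search(neighbours, files, start, wanted, ttl=4):
    """Breadth-first flood: return the set of servents that hold `wanted`.

    neighbours: dict servent -> list of connected servents
    files:      dict servent -> set of shared file names
    ttl:        maximum number of hops a query may travel (assumed parameter)
    """
    hits = set()
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if wanted in files.get(node, set()):
            hits.add(node)
        if hops_left == 0:
            continue
        for peer in neighbours.get(node, []):
            if peer not in seen:  # do not forward a query twice
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return hits


neighbours = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"d": {"song.mp3"}, "e": {"song.mp3"}}
found = flood_search(neighbours, files, "a", "song.mp3")
```

Because servent `d` is reachable through both `b` and `c`, removing either
one of them from the graph still lets the query from `a` reach the file.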
Certainly, within P2P systems, peers exist as defined in Gnutella. How-
ever, P2P networks do not have to be completely decentralized. This is
evident in modern Gnutella implementations [51], which employ a central-
ized/decentralized approach in order to be able to scale the network and in-
crease efficiency of search. Such networks are implemented using super-peers
that cache file locations so that peers only have to search a small fraction of
the network in order to satisfy their search requests.
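A super-peer arrangement of this kind might look as follows. The sketch is
hypothetical throughout: `SuperPeer`, `register` and `search` are invented
names, and the one-hop forwarding between super-peers is a simplification
rather than the behaviour of any particular Gnutella implementation.

```python
class SuperPeer:
    """Caches which of its leaf peers holds which file."""
    def __init__(self, name):
        self.name = name
        self.index = {}       # file name -> set of leaf peer names
        self.neighbours = []  # other super-peers

    def register(self, leaf_name, filenames):
        for filename in filenames:
            self.index.setdefault(filename, set()).add(leaf_name)

    def search(self, filename):
        """Return (holders, number of super-peers contacted)."""
        holders = set(self.index.get(filename, set()))
        contacted = 1
        for other in self.neighbours:
            holders |= other.index.get(filename, set())
            contacted += 1
        return holders, contacted


sp1, sp2 = SuperPeer("sp1"), SuperPeer("sp2")
sp1.neighbours.append(sp2)
sp2.neighbours.append(sp1)
sp1.register("leaf-a", ["song.mp3"])
sp2.register("leaf-b", ["song.mp3", "talk.ogg"])
sp2.register("leaf-c", ["notes.txt"])

holders, contacted = sp1.search("song.mp3")
```

Here three leaf peers are covered by contacting only two super-peers; the
leaves themselves are never queried, which is the efficiency gain the text
describes.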
Therefore, Shirky’s definition here is more appropriate to describe a new
class of applications that are designed to work within this highly transient
environment (see also section 2.2), something previously unattainable.
Systems like Gnutella are now often referred to as True P2P (see Section
2.1.5) because of their pure decentralized approach, where everyone
participates equally in the network. However, this ideal can never really be
realised by a P2P system because not all peers are equal within actual P2P
networks, as several empirical studies have shown [69], [37] and [67]. See the
next two chapters for a detailed overview of the evolving network topologies
employed by recent decentralized file-sharing networks.
Other authors have noted the same. From [24], the authors state that
“they prefer this definition to the alternative ‘decentralized, self-organizing
distributed systems, in which all or most communication is symmetric,’ be-
cause it encompasses large-scale deployed (albeit centralized) P2P systems
(such as Napster and SETI@Home) where much experience has been gained”.
2.1 What is Peer to Peer? 27
/LQX[
7

3

,
3
%OXHWRRWK
+
7
7
3
7

3
,
3
73,3
;3
1$7 )LUHZDOO
331HWZRUN
+HWHURJHQHRXVVHWRI1HWZRUNHG'HYLFHVHJGLIIHUHQW
2SHUDWLQJVVWHPVSURJUDPPLQJODQJXDJHVDQGQHWZRUNV
$SSOLFDWLRQ
HJILOHVKDULQJ38VKDULQJ
Fig. 2.2. A P2P environment: devices are connected behind NATs and firewalls;
they run on different platforms, potentially using different programming languages,
e.g. Jxta [15].
Examples of recent P2P technologies include:
• File sharing/storage programs, e.g., Gnutella [6], Napster [4], Limewire
[51], KaZaA [52], Freenet [58] and Popular Power [53], some of which have
taken the spotlight by providing a way of sharing any type of digital file
(typically, users share audio and video files)
• CPU resource-sharing systems, e.g., SETI@Home [3], United Devices [54],
Entropia [55] and XtremWeb [191]
• Instant messaging, e.g., ICQ [56] and Jabber [5]
• Conferencing applications, e.g., NetMeeting [57], for white-boarding and
voice over IP.
What makes these similar is that they are all leveraging previously unused
resources by tolerating and even working with the variable connectivity that
many devices connected to these networks exhibit.
2.1.4 Social Impacts of P2P
The legal connotations and social impacts of P2P are still unfolding. No doubt,
it has opened the eyes and imaginations of people from numerous disciplines to
the massive sharing of resources across the Internet. Even within the context
of the sharing of copyrighted material, there are compelling arguments for and
against the use of such technologies. A number of articles and books have been
written on the subject, some supporting the concept of P2P and others giving
legal context for it. For example, on the Open Democracy Web site, there are
a number of articles that give a social context for P2P, both from a cultural
perspective and a legal one. In this section, a very brief summary of some of
the points raised is given.
Vaidhyanathan [178], in his five-part article on the new information ecosys-
tem, paints a picturesque account of a deep cultural change that is taking
place through the introduction of P2P technologies. He argues that “what we
call P2P communicative networks actually reflect and amplify - revise and
extend - an old ideology or cultural habit. Electronic peer-to-peer systems
like Gnutella merely simulates other, more familiar forms of unmediated, un-
censorable, irresponsible, troublesome speech; for example, anti-royal gossip
before the French revolution, trading cassette tapes among youth subcultures
as punk or rap, or the illicit Islamist cassette tapes through the streets and
bazaars of Cairo.”
He argues against the current clampdown strategy that is being employed
by companies and governments. Such a strategy involves radically redesigning
the communication technologies so that information can be monitored more
closely. These restrictions would destroy the openness of the current
Internet and could bring about a new type of Internet which, he says, would
“not be open and customisable. Content - and thus culture - would not be
adaptable and malleable. And what small measures of privacy these networks
now afford would evaporate”.
Rainsford [179] uses the term “information feudalism”, which was taken
from an analogy given by Peter Drahos [181]. Drahos suggests that
The current push for control over intellectual property rights has bred
a situation analogous to the feudal agricultural system in the medieval
period. In effect, songwriters and scientists work for corporate feudal
lords, licensing their own inventions in exchange for a living and the
right to ‘till the lands’ of the information society.
Rainsford quotes a number of authors who believe that the struggle that
we are experiencing has deep underlying roots in cultural transformations,
which will inevitably bring about a change in the decaying business models
of today. Rainsford also notes that “the links asserted between p2p systems
and terrorism, or the funding of terrorism” are “a concept which is laughably
ironic as p2p by its very nature is a non-profit system”.
Rimmer [180] gives a legal case for the argument and argues that “if claims
by peer-to-peer distributors that they are supporting free speech and con-
tributing to knowledge want to find a sympathetic ear in the courtroom, then
they have to mean it”. He discusses the current use of P2P networks and argues
that they have not lived up to their revolutionary promise, being used mostly
for circulating copyrighted media around the world. He lists several cases
brought against companies, some of which resulted in findings of infringement
and some of which did not.
Rimmer states that P2P networks are “vulnerable to legal actions for
copyright infringements because they have facilitated the dissemination of
copyright media for profit and gain.” He concludes that “the courts would
be happy to foster such technology if it promoted the freedom of speech, the
mixing of cultures, and the progress of science”.
For further reading, see the articles listed or the Open Democracy Web site
[177], which hosts a series of articles in response to these comments. Similar
articles appear on other Web sites, such as OpenP2P [65].
2.1.5 True Peer to Peer?
Within P2P, there are three categories of systems (as outlined in Chapter 1):
• Centralized systems: where every peer connects to a server which co-
ordinates and manages communication. Some examples here include the
CPU sharing applications, e.g., SETI@Home
• Brokered systems: where peers connect to a server in order to discover
other peers, but then manage the communication themselves (e.g., Nap-
ster). This is also called Brokered P2P.
• Decentralized systems: where peers run independently without the
need for centralized services. Here, the discovery is decentralized and the
communication takes place between the peers. Peers do not need a known
centralized service for them to operate, e.g., Gnutella, Freenet
Most Internet services are distributed using the traditional client/server
(centralized) architecture. In this architecture, clients connect to a server using
a specific communications protocol (e.g., TCP) to obtain access to a specific
resource. Most of the processing involved in delivering a service usually occurs
on the server, leaving the client relatively unburdened. Most popular Internet
applications, including the World Wide Web, FTP, telnet, and email, use this
service-delivery model. Unfortunately, this architecture has a major drawback:
as the number of clients (and therefore the load and bandwidth demand)
increases, the server becomes a bottleneck and may eventually be unable to
handle any additional clients.
The advantage of the client/server model is that it requires less
computational power on the client side. However, this advantage has been
eroded by ever-increasing CPU power: most desktop PCs are now ludicrously
overpowered to operate as simple clients, e.g., for browsing and email.
P2P, on the other hand, has the capability of serving resources with high
availability at a much lower cost, while maximizing the use of resources from
every peer connected to the P2P network. Whereas client/server solutions rely
on costly bandwidth, equipment and location to maintain a robust solution,
P2P can offer a similar level of robustness by spreading network and resource
demands across the network. Note though that some middleware architectures
used to program such systems are capable of operating in one or more
of these modes.
Further, the more decentralized the system, the better the fault tolerance,
since the services are spread across more resources. Therefore, at the far side
of the scale, you have true P2P systems, which employ a completely decen-
tralized structure, both in look-up and in communication. Hong [62] gives a
useful description for communication within P2P systems. He defines P2P
systems as being a class of distributed systems that are biased to more of
a decentralized approach, where there is no global notion of centralization.
He argues that such systems are primarily concerned with smaller distributed
levels of centralization with respect to communication.
When designing a P2P system, therefore, there is a trade-off: the network
must be decentralized enough to be tolerant of failures yet centralized
enough to scale to a large number of participants. These issues are discussed
in detail in Chapter 7.
2.1.6 Why Peer-to-Peer?
So why is P2P important? What’s new?
Although the term P2P, in many people’s minds, is linked with distributing
copyrighted material illegally, it has in fact much more to offer. P2P
file-sharing applications have addressed a number of important issues when
dealing with large-scale connectivity of transient devices. There are a number
of practical real-world applications for such a technology, both on the Internet
[54] [3] and on wireless networks, e.g., for mobile sensor applications [176],
and in many different kinds of scientific and social experiments.
P2P could provide more useful and robust solutions than current technologies
in many different situations. For example, current search engine solutions
centralize the knowledge and their resources. This is an inherent limitation.
Google, for example, relies on a central database that is updated daily by
scouring the Internet for new information. Simply due to the massive size of
this database (more than 1.6 billion entries) not every entry gets updated
every day, and as a result, information can often be out of date. Further,
from a cost perspective, such solutions are unlikely to scale to the future
Internet.
For example, even though Google, at the time of writing, runs a cluster of
10,000 machines to provide its service, it only searches a subset of available
Web pages (about 1.3 x 10^8) to create its database. Furthermore, the world
produces two exabytes (2 x 10^18 bytes) each year but only publishes about 300
terabytes (3 x 10^12 bytes), i.e., for every megabyte of information produced,
one byte gets published. Therefore, finding useful information in real-time is
becoming increasingly difficult.
A similar service could be implemented using P2P technology. One pos-
sibility is that every person runs a personal Web server on a desktop com-
puter that has the capability to process requests for information about the
documents it manages. A user’s server could receive a query, check the local
documents and respond with a list of matching documents. Each server would
be responsible for indexing its own documents and would therefore be capable
of providing more specialized, accurate and up-to-date information.
This decentralization of indexing is much more manageable than the task
facing Google. Corporations could also make available specialized information
that current search engines cannot reach. Further, if a user’s server
disconnected from the network, then its search service would also become
unavailable, and therefore users searching would not receive results for
unavailable resources, as they do at present. This outlines an extreme
P2P solution, but in practice some combination of techniques could prove very
effective.
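The personal-server idea above can be illustrated with a minimal sketch. It is not taken from the text: the class name SearchPeer, the search helper, and the simple substring match are all assumptions made for illustration. Each peer indexes only its own documents, and a peer that is offline contributes no results.

```python
from dataclasses import dataclass, field


@dataclass
class SearchPeer:
    """Hypothetical peer that indexes and searches only its own documents."""
    name: str
    documents: dict[str, str] = field(default_factory=dict)  # title -> text
    online: bool = True

    def query(self, term: str) -> list[str]:
        """Return the titles of local documents whose text contains the term."""
        if not self.online:
            return []  # a disconnected peer returns nothing, so no stale hits
        term = term.lower()
        return sorted(
            title for title, text in self.documents.items()
            if term in text.lower()
        )


def search(peers: list[SearchPeer], term: str) -> dict[str, list[str]]:
    """Send the query to every peer and keep the non-empty answers."""
    results: dict[str, list[str]] = {}
    for peer in peers:
        hits = peer.query(term)
        if hits:
            results[peer.name] = hits
    return results
```

A real system would not contact every peer directly; how queries are routed between peers is a separate design question that this sketch leaves out.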
2.2 The P2P Environment
This section covers the technology that makes the P2P environment so difficult
to work within. This environment was illustrated in Fig. 2.2: peers are
extremely transient (they are continually disappearing and reappearing),
connections are often multi-hop (i.e., packets travel via several intermediaries
before they reach their destination), and peers reside in hostile environments
(i.e., they live behind NAT routing systems and firewalls).
This section gives background on some of the technologies behind
P2P networks, which helps set a more realistic P2P scene. The first subsection
makes a brief excursion into switching technology for networks. The second
describes a particular subset of these devices, namely NAT systems.
Lastly, firewalls are discussed.
2.2.1 Hubs, Switches, Bridges, Access Points and Routers
This section gives a brief overview of the various devices used to partition a
network, which gives the context for the following two sections on the NAT
systems and firewalls often encountered within a P2P network. Briefly, the
critical distinction between these devices is the level or layer at which they
operate within the International Organization for Standardization’s Open
Systems Interconnection (ISO/OSI) model, which defines seven network layers [98].
• Hubs: A hub is a repeater that works at the physical (lowest) layer of OSI.
A hub takes data that comes into a port and sends it to the other ports
in the hub. It doesn’t perform any filtering or redirection of data. You
can think of a hub as a kind of Internet chat room. Everyone who joins
a particular chat is seen by everyone else. If there are too many people
trying to chat, things get bogged down.
• Switches and Bridges: These are pretty similar. Both operate at the
Data Link layer (just above Physical) and both can filter data so that
only the appropriate segment or host receives a transmission. Both filter
packets based on the physical address (i.e., the Media Access Control (MAC)
address) of the sender/receiver, although newer switches, referred to as IP
switches, sometimes include the capabilities of a router and can forward
data based on IP address (operating at the network layer). In general,
bridges are used to extend the distance capabilities of the network while
minimizing overall traffic, and switches are used primarily for their filtering
capabilities to create multiple, smaller virtual local area networks (VLANs)
out of one large LAN for easier management and administration.
• Routers: These work at the Network layer of OSI (above Data Link) and
operate on the IP address. Like switches and bridges, they filter by only
forwarding packets destined for remote networks thus minimizing traffic,
but are significantly more complex than any other networking device; thus
they require much more maintenance and administration. The home net-
worker typically uses a DSL or cable modem router that joins the home’s
LAN to the wide area network (WAN) of the Internet. By maintaining
configuration information in a “routing table” routers also have the abil-
ity to filter traffic, either incoming or outgoing, based on the IP addresses
of senders and receivers. Most routers allow the home networker to update
the routing table from a Web browser interface. DSL and cable modem
routers typically combine the functions of a router with those of a switch
in a single unit.
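The difference between repeating and filtering in the list above can be sketched in a few lines. This is a simplified model, not a description of any particular product: the class names, the forward method, and the learn-then-flood behaviour of the switch are assumptions chosen to illustrate filtering by MAC address.

```python
class Hub:
    """Physical-layer repeater: every frame is sent to every other port."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        # No filtering at all: repeat on all ports except the incoming one.
        return [p for p in range(self.num_ports) if p != in_port]


class Switch:
    """Data-link-layer device: remembers which MAC address is on which port."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self.mac_table: dict[str, int] = {}

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        self.mac_table[src_mac] = in_port  # learn where the sender lives
        if dst_mac in self.mac_table:
            out_port = self.mac_table[dst_mac]
            # Filter: deliver to the one known port (or nowhere if it is local).
            return [] if out_port == in_port else [out_port]
        # Unknown destination: fall back to hub-like flooding.
        return [p for p in range(self.num_ports) if p != in_port]
```

Once the switch has seen traffic from both hosts, frames between them reach a single port, whereas the hub keeps repeating them everywhere, which is the chat-room effect described for hubs.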
2.2.2 NAT Systems
For a computer to communicate with other computers and Web servers on
the Internet, it must have an IP address. An IP address is a unique 32-bit
number that identifies the location of your computer on a network. There
are, in theory, 2^32 (4,294,967,296) unique addresses but the actual number
available is much smaller (somewhere between 3.2 and 3.3 billion). This is
due to the way that the addresses are separated into classes and also because
some are set aside for multicasting, testing or other special uses.
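The size of the 32-bit address space, and the way a few well-known blocks are set aside, can be checked with Python's standard ipaddress module. The list of blocks below is a partial, illustrative selection and is an assumption of this sketch; it does not attempt to reproduce the estimate of usable addresses quoted above.

```python
import ipaddress

# Size of the 32-bit address space quoted in the text.
TOTAL_ADDRESSES = 2 ** 32

# Partial, illustrative list of special-purpose IPv4 blocks (not exhaustive).
SPECIAL_BLOCKS = {
    "private 10/8": "10.0.0.0/8",
    "private 172.16/12": "172.16.0.0/12",
    "private 192.168/16": "192.168.0.0/16",
    "loopback": "127.0.0.0/8",
    "multicast": "224.0.0.0/4",
}


def block_size(cidr: str) -> int:
    """Number of addresses contained in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


def special_total() -> int:
    """Addresses covered by the illustrative special-purpose blocks."""
    return sum(block_size(cidr) for cidr in SPECIAL_BLOCKS.values())


def is_private(address: str) -> bool:
    """True for addresses that cannot be routed on the public Internet."""
    return ipaddress.ip_address(address).is_private
```

Addresses for which is_private returns True are the kind used inside a stub domain and must be translated before they can reach the public network.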
With the explosion of the Internet and the increase in home networks
and business networks, the number of available IP addresses is simply not
enough. An obvious solution is to redesign the address format to allow for
more possible addresses. This is being developed and is called IPv6, but it
may take several years to deploy because it requires modification of the entire
infrastructure of the Internet.
[Figure 2.3 diagram labels: Private Network (Local Area Network, stub domain); NAT Router; Public Network (Internet); Outgoing; Incoming]
Fig. 2.3. A NAT System divides a local network from the public network and offers
local-to-public mapping of addresses. This allows the number of machines on the
Internet to increase past the physical limit. A NAT system converts local addresses
within the stub domain into one Internet address.
A network address translation system (see Fig. 2.3) allows a single device,
such as a router, to act as an agent between the Internet (public network) and
a local (private) network. This means that only a single, unique IP address
is required to represent an entire group of computers. The internal network
is usually a LAN, commonly referred to as the stub domain. A stub domain
is a LAN that uses IP addresses internally. Any internal computers that use
unregistered IP addresses must use NAT to communicate with the rest of the
world.
There are two types of NAT translation, static and dynamic, which are
illustrated in Fig. 2.4. Static NAT involves mapping an unregistered IP
address to a registered IP address on a one-to-one basis. This is particularly
useful when a device needs to be accessible from outside the network: in static
NAT, the computer with the IP address of 192.168.0.0 will always translate
to 131.251.45.110 (see upper part of Fig. 2.4).
Dynamic NAT, on the other hand, maps unregistered IP addresses, taken from a
group of local, dynamically allocatable IP addresses, to a registered IP
address, i.e., the stub domain computers will be allocated an address from a
specified range of addresses, e.g., 192.168.0.0 to 192.168.0.50 in Fig. 2.4,
and the NAT system will translate these to 131.251.45.110 for the outside
world. In this circumstance, it is easy to see why NAT systems are problematic,
since you could have potentially hundreds of stub domain computers
masquerading as one external IP address.
[Figure 2.4 diagram labels: NAT System Types; Static (One to One); Dynamic (One to Many); Private Network, Local Area Network, NAT Router, Public Network, Internet; Outgoing, Incoming]
Fig. 2.4. A NAT system can allocate addresses dynamically or translate fixed
stub domain addresses to outside ones.
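The two translation modes can be sketched as lookup tables. This is a minimal illustration under stated assumptions: the class names are hypothetical, and the per-connection port bookkeeping in the many-to-one case (often called port address translation) is an assumption about how one external address can be shared, since the text does not describe the mechanism.

```python
from typing import Optional


class StaticNat:
    """One-to-one mapping between private and public addresses."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self.to_public = dict(mapping)
        self.to_private = {pub: priv for priv, pub in mapping.items()}

    def outgoing(self, private_ip: str) -> str:
        return self.to_public[private_ip]

    def incoming(self, public_ip: str) -> str:
        return self.to_private[public_ip]


class ManyToOneNat:
    """Many private hosts share one public address, told apart by port."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port  # hypothetical start of the port pool
        self.by_private: dict[tuple[str, int], int] = {}
        self.by_public_port: dict[int, tuple[str, int]] = {}

    def outgoing(self, private_ip: str, private_port: int) -> tuple[str, int]:
        key = (private_ip, private_port)
        if key not in self.by_private:
            # First packet of this connection: allocate a public port.
            self.by_private[key] = self.next_port
            self.by_public_port[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.by_private[key]

    def incoming(self, public_port: int) -> Optional[tuple[str, int]]:
        # No entry means nobody inside asked for this traffic: drop it.
        return self.by_public_port.get(public_port)
```

The last method shows why such systems are awkward for P2P: a peer behind the NAT can only be reached on a mapping it created itself by sending traffic out first.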
2.2.3 Firewalls
A firewall is a system designed to prevent unauthorized access to or from a
private network. All messages entering or leaving the computer system pass
through the firewall, which examines each message and blocks those that do
not meet the specified security criteria. Specifically, firewalls are implemented
by blocking certain ports, thereby disabling certain types of services that
operate on those ports.
Some firewalls permit only email traffic, thereby protecting the network
against any attacks other than attacks against the email service. Other fire-
walls provide less strict protections, and block services that are known to be
problematic. Generally, firewalls are configured to protect against unauthen-
ticated interactive logins from the outside world. This, more than anything,
helps prevent unauthorized users from logging into machines on your network.
More elaborate firewalls block traffic from the outside to the inside, but
permit users on the inside to communicate freely with the outside. Figure 2.5
illustrates a scenario where both telnet and audio conferencing are blocked
from the outside world but Web browsing and SSH connections are acceptable.
Internal users, however, can freely open external connections using any of
these services, although in this example they would not be able to hear the
other participants in the audio conference because incoming audio is blocked.
Fig. 2.5. A firewall blocks traffic to and from specified ports; here only SSH
and Web browsing are allowed by external computers.
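The scenario of Fig. 2.5 can be expressed as a small port filter. This is a sketch, not a description of a real firewall product: the Firewall class and its permits method are hypothetical, the SSH, telnet and HTTP port numbers are the standard well-known ones, and the audio conferencing port is a placeholder chosen for illustration.

```python
from dataclasses import dataclass

SSH, TELNET, HTTP = 22, 23, 80  # well-known TCP ports
AUDIO_CONF = 5004               # hypothetical port for audio conferencing


@dataclass(frozen=True)
class Firewall:
    """Port-based filter: outbound is open, inbound is allow-listed."""
    inbound_allowed: frozenset[int]

    def permits(self, direction: str, port: int) -> bool:
        if direction == "outbound":
            return True  # internal users may open any external connection
        if direction == "inbound":
            return port in self.inbound_allowed
        raise ValueError(f"unknown direction: {direction!r}")


# The configuration from the figure: only SSH and Web traffic may come in.
FIG_2_5 = Firewall(inbound_allowed=frozenset({SSH, HTTP}))
```

With this configuration an internal user can start an audio conference, but the inbound audio on the placeholder port is rejected, which matches the one-way behaviour described above.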
A firewall can therefore protect you against most types of network
attack. Firewalls are also important since they can provide a single
choke point where security and audit can be imposed, i.e., they can provide
an important logging and auditing function and provide summaries to the
administrator about what kinds and amounts of traffic passed through them and
how many attempts there were to break in. Within P2P applications, it is
often necessary to traverse such firewalls, for example, by rerouting the data
over the HTTP port.
2.2.4 P2P Overlay Networks
P2P implementations frequently involve the creation of overlay networks [23]
with a structure that is completely independent of that of the underlying
network of connected devices. The purpose of overlay networks is to
abstract the complicated connectivity of a P2P network into a higher-level
programmatic view of the peers that make up the network. This is illustrated
in Fig. 2.6, which shows the programmer’s view of the network (see top cloud
of peers) that simplifies and abstracts the network structure and underlying
transport mechanisms (see bottom part) into a collection of cooperating peers.
[Figure 2.6 diagram labels: Virtual Overlay of Peers; Virtual P2P Mapping; Physical Network with Bluetooth, TCP/IP, HTTP, NAT and Firewall elements]
Fig. 2.6. An illustration of the notion of an overlay network. Modern P2P infras-
tructures typically overlay a virtual view of the nodes on the network to abstract
the underlying mechanisms that actually connect these devices; this example was
taken from Jxta [15].
There are several different types of overlay networks. For example, within
Jxta, a virtual network overlay sits on top of the physical devices and is
organized into transient or persistent relationships, which Jxta calls peer
groups. Peers in Jxta are not required to have direct point-to-point network
connections, and such connections are represented through the use of virtual
pipes. Virtual pipes simply define the endpoints of the connection and leave it
to the underlying mechanisms to implement the appropriate behaviour for that
environment, e.g., for TCP, a fixed point-to-point connection is created for
the pipe, but for UDP pipes this is not required and therefore the pipe
remains connectionless. Other network overlays include the use of distributed
hash tables, e.g., Chord [45] or Pastry [44].
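The idea behind a distributed hash table overlay can be sketched as a hash ring on which each key belongs to the next peer clockwise. This is a deliberately simplified illustration in the spirit of Chord, not its actual protocol: the HashRing class, the 32-bit ring size, and the centralized sorted list are assumptions made for brevity, and real systems locate the owner with distributed routing rather than a global view.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # hypothetical ring size, kept small for readability


def ring_id(name: str) -> int:
    """Hash a peer name or key onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class HashRing:
    """Assigns each key to the first peer at or after its ring position."""

    def __init__(self, peers: list[str]) -> None:
        if not peers:
            raise ValueError("at least one peer is required")
        self.points = sorted((ring_id(p), p) for p in peers)

    def owner(self, key: str) -> str:
        ids = [point for point, _ in self.points]
        # First peer clockwise from the key; wrap around past the last peer.
        index = bisect_left(ids, ring_id(key)) % len(self.points)
        return self.points[index][1]
```

The overlay position of a peer depends only on the hash of its name, so the structure is independent of the physical network, and when one peer leaves only the keys it owned move to another peer.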

Peers In A Clientserver World A Modern Perspective On Peer To Peer And Grid Computing Ian J Taylor

  • 5. Ian J. Taylor From P2P to Web Services and Grids Peers in a Client/Server World
  • 6. Ian J. Taylor, PhD
School of Computer Science, University of Cardiff, Cardiff, Wales

Series editor
Professor A.J. Sammes, BSc, MPhil, PhD, FBCS, CEng
CISM Group, Cranfield University, RMCS, Shrivenham, Swindon SN6 8LA, UK

British Library Cataloguing in Publication Data
Taylor, Ian J.
From P2P to Web Services and Grids. - (Computer communications and networks)
1. Client/server computing 2. Internet programming 3. Middleware 4. Peer-to-peer architecture (Computer networks) 5. Web services 6. Computational grids (Computer systems)
I. Title
004.3′6
ISBN 1852338695

A catalog record for this book is available from the Library of Congress.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

Computer Communications and Networks ISSN 1617-7975
ISBN 1-85233-869-5 Springer London Berlin Heidelberg
Springer is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag London Limited 2005

The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Printed and bound in the United States of America
34/3830–543210 Printed on acid-free paper SPIN 10975107
  • 7. To my dad, George, for always helping me with the international bureaucracies of this world and to him and his accomplice, Gill, for saving me from strange places at strange times. . . and to both for their continuous support. I am forever thankful.
  • 8. Preface

Current users typically interact with the Internet through the use of a Web browser and a client/server based connection to a Web server. However, as we move forward to allow true machine-to-machine communication, we are in need of more scalable solutions which employ the use of decentralized techniques to add redundancy, fault tolerance and scalability to distributed systems. Distributed systems take many forms, appear in many areas and range from truly decentralized systems, like Gnutella and Jxta, through centrally indexed brokered systems, like Web services and Jini, to centrally coordinated systems like SETI@Home.

From P2P to Web Services and Grids: Peers in a client/server world provides a comprehensive overview of the emerging trends in peer-to-peer (P2P), distributed objects, Web services and Grid computing technologies, which have redefined the way we think about distributed computing and the Internet. This book has two main themes: applications and middleware. Within the context of applications, examples of the many diverse architectures are provided including: decentralized systems like Gnutella and Freenet; brokered ones like Napster; and centralized applications like SETI and conventional Web servers. For middleware, the book covers Jxta, as a programming infrastructure for P2P computing, along with Web services, Grid computing paradigms, e.g., Globus and OGSA, and distributed-object architectures, e.g., Jini. Each technology is described in detail, including source code where appropriate, and their capabilities are analysed in the context of the degree of centralization or decentralization they employ.

To maintain coherency, each system is discussed in terms of the generalized taxonomy, which is outlined in the first chapter. This taxonomy serves as a placeholder for the systems presented in the book and gives an overview of the organizational differences between the various approaches. Most of the systems are discussed at a high level, particularly addressing the organization and topologies of the distributed resources. However, some (e.g., Jxta, Jini, Web services and, to some extent, Gnutella) are discussed in much more detail, giving practical programming tutorials for their use. Security is paramount
  • 9. throughout and introduced with a dedicated chapter outlining the many approaches to security within distributed systems.

Why did I decide to write this book?

I initially wrote the book for my lecture course in the School of Computer Science at Cardiff University on Distributed Systems. I wanted to give the students a broad overview of distributed-computing techniques that have evolved over the past decade. The text therefore outlines the key applications and middleware used to construct distributed applications today. I wrote each lecture as a book chapter and these notes have been extremely well received by the students and therefore I decided to extend this into a book for their use and for others ... so:

Who should read this book?

This book, I believe, has a wide-ranging scope. It was initially written for BSc students, with an extensive computing background, and MSc students, who have little or no prior computing experience, i.e., some students had never written a line of code in their lives! Therefore, this book should appeal to people with various computer programming abilities but also to the casual reader who is simply interested in the recent advances in the distributed systems world.

Readers will learn about the various distributed systems that are available today. For a designer of new applications, this will provide a good reference. For students, this text would accompany any course on distributed computing to give a broader context of the subject area. For a casual reader, interested in P2P and Grid computing, the book will give a broad overview of the field and specifics about how such systems operate in practice without delving into the low-level details. For example, to both casual and programming-level readers, all chapters will be of interest, except some parts of the Gnutella chapter and some sections of the deployment chapters, which are more tuned to the lower-level mechanisms and therefore targeted more to programmers.

Organization

Chapter 1: Introduction: In this chapter, an introduction is given into distributed systems, paying particular attention to the role of middleware. A taxonomy is constructed for distributed systems ranging on a scale from centralized to decentralized depending on how resources or services are organized, discovered and how they communicate with each other. This will serve as an underlying theme for the understanding of the various applications and middleware discussed in this book.

Chapter 2: Peer-2-Peer Systems: This chapter gives a brief history of client/server and peer-to-peer computing. The current P2P definition is stated and specifics of the P2P environment that distinguish it from
  • 10. client/server are provided: e.g., transient nodes, multi-hop, NAT, firewalls etc. Several examples of P2P technologies are given, along with application scenarios for their use and categorizations of their behaviour within the taxonomy described in the first chapter.

Chapter 3: Web Services: This chapter introduces the concept of machine-to-machine communication and how this fits in with the existing Web technologies and future scopes. This leads onto a high-level overview of Web services, which illustrates the core concepts without getting bogged down with the deployment details.

Chapter 4: Grid Computing: This chapter introduces the idea of a computational Grid environment, which is typically composed of a number of heterogeneous resources that may be owned and managed by different administrators. The concept of a “virtual organization” is discussed along with its security model, which employs a single sign-on mechanism. The Globus toolkit, the reference implementation that can be used to program computational Grids, is then outlined giving some typical scenarios.

Chapter 5: Jini: This chapter gives an overview of Jini, which provides an example of a distributed-object based technology. A background is given into the development of Jini and into the network plug-and-play manner in which Jini accesses distributed objects. The discovery of look-up servers, searching and using Jini services is described in detail and advanced Jini issues, such as leasing and events are discussed.

Chapter 6: Gnutella: This chapter combines a conceptual overview of Gnutella and the details of the actual Gnutella protocol specification. Many empirical studies are then outlined that illustrate the behaviour of the Gnutella network in practice and show the many issues which need to be overcome in order for this decentralized structure to succeed. Finally, the advantages and disadvantages of this approach are discussed.

Chapter 7: Scalability: In this chapter, we look at scalability issues by analysing the manner in which peers are organized within popular P2P networks. First, social networks are introduced and compared against their P2P counterparts. We then explore the use of decentralized P2P networks within the context of file sharing. It is shown why in practice, neither extreme (i.e., completely centralized or decentralized architectures) gives effective results and therefore why most current P2P applications use a hybrid of the two approaches.

Chapter 8: Security: This chapter covers the basic elements of security in a distributed system. It covers the various ways that a third party can gain access to data and the design issues involved in building a distributed security system. It then gives a basic overview of cryptography and describes the various ways in which secure channels can be set up, using public-key pairs or by using symmetric keys, e.g., shared secret keys or session keys. Finally, secure mobile code is discussed within the concept of sandboxing.
  • 11. Chapter 9: Freenet: This chapter gives a concise description of the Freenet distributed information storage system, which is a real-world example of how the various technologies, so far discussed, can be integrated and used within a single system. For example: Freenet is designed to work within a P2P environment; it addresses scalability through the use of an adaptive routing algorithm that creates a centralized/decentralized network topology dynamically; and it addresses a number of privacy issues by using a combination of hash functions and public/private key encryption.

Chapter 10: Jxta: This chapter introduces Jxta that provides a set of open, generalized, P2P protocols to allow any connected device (cell phone to PDA, PC to server) on the network to communicate and collaborate. An overview of the motivation behind Jxta is given followed by a description of its key concepts. Finally, a detailed overview of the six Jxta protocols is given.

Chapter 11: Distributed Object Deployment Using Jini: This chapter describes how one would use Jini in practice. This is illustrated through several simple RMI and Jini applications that describe how the individual parts and protocols fit together and give a good context for the Jini chapter and how the deployment differs from other systems discussed in this book.

Chapter 12: P2P Deployment Using Jxta: This chapter uses several Jxta programming examples to illustrate some issues of programming and operating within a P2P environment. A number of key practical issues, such as out-of-date advertisements and peer configuration, which have to be dealt with in any P2P application are discussed and illustrated by outlining the potential solutions employed by Jxta.

Chapter 13: Web Services Deployment: This chapter describes the Web services deployment technologies, typically used for representing and invoking Web services. Specifically, three core technologies are discussed in detail: SOAP for wrapping XML messages within an envelope, WSDL for representing the Web services interface description, and UDDI for storing indexes of the locations of Web services.

Chapter 14: OGSA: This chapter discusses the Open Grid Service Architecture (OGSA), which extends Web services into the Grid computing arena by using WSDL to achieve self-descriptive, discoverable services that can be referenced during their lifetime, i.e., maintain state. OGSI is discussed, which provides an implementation of the OGSA ideas. This is followed by OGSI’s successor, WSRF, which translates the OGSI definitions into representations that are compatible with other emerging Web service standards.

Disclaimer

Within this book, I draw in a number of examples from file-sharing programs, such as Napster, Gnutella (e.g., Limewire), FastTrack and KaZaA to name a
  • 12. few. The reason for this is to illustrate the different approaches in the organization of distributed systems in a computational scientific context. Under no circumstances, using this text, am I endorsing or supporting any or all of these file-sharing applications in their current legal battles concerning copyright issues. My focus here is on the use of this infrastructure in many other scientific situations where there is no question of their legality. We can learn a lot from such applications when designing future Grids and P2P systems, both from a computational science aspect and from a social aspect, in the sense of how users behave as computing peers within such a system, i.e., do they share or not? These studies give us insight about how we may approach the scalability issues in future distributed systems.

English Spelling

I struggled with the appropriate spelling of some words, which in British English, should (arguably) be spelt with an ‘s’ but in almost all related literature within this subject area, they are spelt with a ‘z’, e.g., organize, centralize, etc. After much dialogue with colleagues and Springer, we decided on a compromise; that is, I shall use an amalgamation of American English and British English known as mid-Atlantic English.... Therefore, for the set of such words, I will use the ‘z’ form. These include derivatives of: authorize, centralize, decentralize, generalize, maximize, minimize, organize, quantize, serialize, specialize, standardize, utilize, virtualize and visualize. Otherwise, I will use the British English spelling e.g. advertise, characterise, conceptualise, customise, realise, recognise, stabilise etc. Interestingly, however, even the Oxford Concise English Dictionary lists many of these words in their ‘z’ form....

Acknowledgements

I would like to thank a number of people who provided sanity checks and proof-reading for a number of chapters in this book. In particular, I’d like to thank Shalil Majithia, Andrew Harrison, Omer Rana and Jonathon Giddy. Also, many thanks to the numerous members of the GridLab, Triana and NRL groups for their encouragement and enlightening discussions during the writing of this book. So, to name a few, thanks to Alex Hardisty, Andre Merzky, Andrei Hutanu, Brian Adamson, Bernard Schutz, Joe Macker, Ed Seidel, Gabrielle Allen, Ian Kelley, Jason Novotny, Roger Philp, Wangy, Matthew Shields, Michael Russell, Oliver Wehrens, Felix Hupfeld, Rick Jones, Sheldon Gardner, Thilo Kielmann, Jarek Nabrzyski, Sathya, Tom Goodale, David Walker, Kelly Davis, Hartmut Kaiser, Dave Angulo, Alex Gray and Krzysztof Kurowski.

Most of this book was written in Sicily and therefore, I’d like to thank everyone I met there who made me feel so welcome and for those necessary breaks in B&Js in Ragusa Ibla and il Bagatto in Siracusa.... Finally, thanks
  • 13. to Matt for keeping his cool during some pretty daunting deadlines towards the end of the writing of this book.

Ian Taylor
Cardiff, UK.
April 2004
  • 14. Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Introduction to Distributed Systems . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Some Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Centralized and Decentralized Systems . . . . . . . . . . . . . . . . . . . . . 5 1.3.1 Resource Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3.2 Resource Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.3 Resource Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1.4 Examples of Distributed Applications . . . . . . . . . . . . . . . . . . . . . . 10 1.4.1 A Web Server: Centralized. . . . . . . . . . . . . . . . . . . . . . . . . . 10 1.4.2 SETI@Home: Centralized . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.4.3 Napster: Brokered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1.4.4 Gnutella: Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1.5 Examples of Middleware. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 1.5.1 J2EE and JMS: Centralized . . . . . . . . . . . . . . . . . . . . . . . . 15 1.5.2 Jini: Brokered. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 1.5.3 Web Services: Brokered . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 1.5.4 Jxta: Decentralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 1.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Part I Distributed Environments 2 Peer-2-Peer Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.1 What is Peer to Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.1.1 Historical Peer to Peer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
24 2.1.2 Binding of Peers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.1.3 Modern Definition of Peer to Peer . . . . . . . . . . . . . . . . . . . 25 2.1.4 Social Impacts of P2P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.1.5 True Peer to Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.1.6 Why Peer-to-Peer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.2 The P2P Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
  • 15. XIV Contents 2.2.1 Hubs, Switches, Bridges, Access Points and Routers . . . 31 2.2.2 NAT Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.2.3 Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.2.4 P2P Overlay Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 2.3 P2P Example Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.3.1 MP3 File Sharing with Napster . . . . . . . . . . . . . . . . . . . . . 37 2.3.2 Distributed Computing Using SETI@Home . . . . . . . . . . . 38 2.3.3 Instant Messaging with ICQ . . . . . . . . . . . . . . . . . . . . . . . . 39 2.3.4 File Sharing with Gnutella. . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3 Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.1.1 Looking Forward: What Do We Need? . . . . . . . . . . . . . . . 44 3.1.2 Representing Data and Semantics . . . . . . . . . . . . . . . . . . . 47 3.2 Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3.2.1 A Minimal Web Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.2.2 Web Services Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.2.3 Web Services Development . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.3 Service-Oriented Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.3.1 A Web Service SOA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.4 Common Web Service Misconceptions . . . . . . . . . . . . . . . . . . . . . . 55 3.4.1 Web Services and Distributed Objects . . . . . . . . . . . . . . . 55 3.4.2 Web Services and Web Servers . 
. . . . . . . . . . . . . . . . . . . . . 55 3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4 Grid Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.1 The Grid Dream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.2 Social Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 4.3 History of the Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 4.3.1 The First Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 4.3.2 The Second Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.3.3 The Third Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 4.4 The Grid Computing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 63 4.4.1 Virtual Organizations and the Sharing of Resources . . . . 64 4.5 To Be or Not to Be a Grid: These Are the Criteria... . . . . . . . . . 67 4.5.1 Centralized Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.5.2 Standard, Open, General-Purpose Protocols . . . . . . . . . . 68 4.5.3 Quality Of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 4.6 Types of Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 4.7 The Globus Toolkit 2.x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.7.1 Globus Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.7.2 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.7.3 Information Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.7.4 Data Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
  • 16. Contents
        4.7.5 Resource Management
      4.8 Comments and Conclusion
    Part II Middleware, Applications and Supporting Technologies
    5 Jini
      5.1 Jini
        5.1.1 Setting the Scene
      5.2 Jini’s Transport Backbone: RMI and Serialization
        5.2.1 RMI
        5.2.2 Serialization
      5.3 Jini Architecture
        5.3.1 Jini in Operation
      5.4 Registering and Using Jini Services
        5.4.1 Discovery: Finding Lookup Services
        5.4.2 Join: Registering a Service (Jini Service)
        5.4.3 Lookup: Finding and Using Services (Jini Client)
      5.5 Jini: Tying Things Together
      5.6 Organization of Jini Services
        5.6.1 Events
      5.7 Conclusion
    6 Gnutella
      6.1 History of Gnutella
      6.2 What Is Gnutella?
      6.3 A Gnutella Scenario: Connecting and Operating Within a Gnutella Network
        6.3.1 Discovering Peers
        6.3.2 Gnutella in Operation
        6.3.3 Searching Within Gnutella
      6.4 Gnutella 0.4 Protocol Description
        6.4.1 Gnutella Descriptors
        6.4.2 Gnutella Descriptor Header
        6.4.3 Gnutella Payload: Ping
        6.4.4 Gnutella Payload: Pong
        6.4.5 Gnutella Payload: Query
        6.4.6 Gnutella Payload: QueryHit
        6.4.7 Gnutella Payload: Push
      6.5 File Downloads
      6.6 Gnutella Implementations
      6.7 More Information
      6.8 Conclusion
  • 17. Contents
    7 Scalability
      7.1 Social Networks
      7.2 P2P Networks
        7.2.1 Performance in P2P Networks
      7.3 Peer Topologies
        7.3.1 Centralized
        7.3.2 Ring
        7.3.3 Hierarchical
        7.3.4 Decentralized
      7.4 Hybrid Topologies
        7.4.1 Centralized/Ring
        7.4.2 Centralized/Centralized
        7.4.3 Centralized/Decentralized
      7.5 The Convergence of Napster and Gnutella
      7.6 A Southern Side-Step
      7.7 Gnutella Analysis
        7.7.1 Gnutella Free Riding
        7.7.2 Equal Peers?
        7.7.3 Power-Law Networks
      7.8 Further Reading
      7.9 Conclusion
    8 Security
      8.1 Introduction
      8.2 Design Issues
        8.2.1 Focus of Data Control
        8.2.2 Layering of Security Mechanisms
        8.2.3 Simplicity
      8.3 Cryptography
        8.3.1 Basics of Cryptography
        8.3.2 Types of Encryption
        8.3.3 Symmetric Cryptosystem
        8.3.4 Asymmetric Cryptosystem
        8.3.5 Hash Functions
      8.4 Signing Messages with a Digital Signature
      8.5 Secure Channels
        8.5.1 Secure Channels Using Symmetric Keys
        8.5.2 Secure Channels Using Public/Private Keys
      8.6 Secure Mobile Code: Creating a Sandbox
      8.7 Conclusion
  • 18. Contents
    9 Freenet
      9.1 Introduction
      9.2 Freenet Routing
        9.2.1 Populating the Freenet Network
        9.2.2 Self-Organizing Adaptive Behaviour in Freenet
        9.2.3 Requesting Files
        9.2.4 Similarities with Other Peer Organization Techniques
      9.3 Freenet Keys
        9.3.1 Keyword-Signed Keys
        9.3.2 Signed Subspace Keys
        9.3.3 Content Hash Keys
        9.3.4 Clustering Keys
      9.4 Joining the Network
      9.5 Conclusion
    10 Jxta
      10.1 Background: Why Was Project Jxta Started?
        10.1.1 Interoperability
        10.1.2 Platform independence
        10.1.3 Ubiquity
      10.2 Jxta Overview
        10.2.1 The Jxta Architecture
        10.2.2 Jxta Peers
        10.2.3 Identifiers
        10.2.4 Advertisements
        10.2.5 Messages
        10.2.6 Modules
      10.3 Jxta Network Overlay
        10.3.1 Peer Groups
        10.3.2 Rendezvous Nodes
        10.3.3 Pipes
        10.3.4 Relay Nodes
      10.4 The Jxta Protocols
        10.4.1 The Peer Discovery Protocol
        10.4.2 The Peer Resolver Protocol
        10.4.3 The Peer Information Protocol
        10.4.4 The Pipe Binding Protocol
        10.4.5 The Endpoint Routing Protocol
        10.4.6 The Rendezvous Protocol
      10.5 A Jxta Scenario: Fitting Things Together
      10.6 Jxta Environment Considerations
        10.6.1 Security
        10.6.2 NAT and Firewalls
      10.7 Comment
      10.8 Conclusion
  • 19. Contents
    Part III Middleware Deployment
    11 Distributed Object Deployment Using Jini
      11.1 RMI Security
      11.2 An RMI Application
        11.2.1 The Java Proxy
        11.2.2 The Server
        11.2.3 The Client
        11.2.4 Setting up the Environment
      11.3 A Jini Application
        11.3.1 The Remote Interface
        11.3.2 The Server
        11.3.3 The Client
      11.4 Running Jini Applications
        11.4.1 HTTP Server
        11.4.2 RMID Daemon
        11.4.3 The Jini Lookup Service
        11.4.4 Running the Service
      11.5 Conclusion
    12 P2P Deployment Using Jxta
      12.1 Jxta Programming: Three Examples Illustrated
        12.1.1 Starting the Jxta Platform
        12.1.2 Discovery
        12.1.3 Creating Pipes
      12.2 Running Jxta Applications
      12.3 P2P Environment: The Jxta Approach
        12.3.1 Peer Configuration Using Jxta
        12.3.2 Peer Configuration Management Within Jxta
        12.3.3 Running The Examples
        12.3.4 Jxta and P2P Advert Availability
        12.3.5 Expiration of Adverts
      12.4 Conclusion
    13 Web Services Deployment
      13.1 SOAP
        13.1.1 Just Like Sending a Letter
        13.1.2 Web Services Architecture with SOAP
        13.1.3 The Anatomy of a SOAP Message
      13.2 WSDL
        13.2.1 Service Description
        13.2.2 Implementation Details
        13.2.3 Anatomy of a WSDL Document
      13.3 UDDI
  • 20. Contents
      13.4 Using Web Services
        13.4.1 Axis Installation
        13.4.2 A Simple Web Service
        13.4.3 Deploying a Web Service Using Axis
        13.4.4 Web Service Invocation
        13.4.5 Cleaning Up and Un-Deploying
      13.5 Conclusion
    Part IV From Web Services to Future Grids
    14 OGSA
      14.1 OGSA
        14.1.1 Grid Services
        14.1.2 Virtual Services
        14.1.3 OGSA Architecture
      14.2 OGSI
        14.2.1 Globus Toolkit, Version 3
      14.3 WSRF
        14.3.1 Problems with OGSI
        14.3.2 Grid Services or Resources?
        14.3.3 OGSI Functionality in WSRF
        14.3.4 Globus Toolkit, Version 4
      14.4 Conclusion
    A Want to Find Out More?
      A.1 Grid Computing
      A.2 P2P Computing
      A.3 Distributed Object Computing
      A.4 Web Services
    B RSA Algorithm
    References
    Index
  • 21. 1 Introduction
Recently, there has been an explosion of applications using peer-to-peer (P2P) and Grid-computing technology. On the one hand, P2P has become ingrained in current grass-roots Internet culture through applications like Gnutella [6] and SETI@Home [3]. It has appeared in several popular magazines, including the Red Herring and Wired, and is frequently quoted as having been crowned by Fortune as one of the four technologies that will shape the Internet’s future. The popularity of P2P has spread through to academic and industrial circles, propelled by media and widespread debate both in the courtroom and out. However, such enormous hype and controversy have led to mistrust of the technology as a serious distributed-systems platform for future computing; in reality, there is significant substance, as we shall see.
In parallel, there has been an overwhelming interest in Grid computing, which is attempting to build the infrastructure to enable on-demand computing in a similar fashion to the way we access other utilities now, e.g., electricity. Further, the introduction of the Open Grid Services Architecture (OGSA) [21] has aligned this vision with the technological machine-to-machine capabilities of Web services (see Chapter 3). This convergence has gained significant input from both commercial and non-commercial organizations ([27] and [28]) and has a firm grounding in standardized Web technologies, which could perhaps even lead to the kind of ubiquitous uptake necessary for such an infrastructure to be globally deployed.
Although the underlying philosophies of Grid computing and P2P are different, both are attempting to solve the same problem, that is, to create a virtual overlay [23] over the existing Internet to enable collaboration and sharing of resources [24]. However, in implementation, the approaches differ greatly.
Whilst Grid computing connects virtual organizations [32] that can cooperate in a collaborative fashion, P2P connects individual users using highly transient devices and computers living at the edges of the Internet [46] (i.e., behind NAT, firewalls, etc.).
The name “Peers in a Client/Server World” describes the transitional evolution from the widespread client/server-based Internet, dominant over
  • 22. the past decade, back to the roots of the Internet where every peer had equal status. Inevitably, both history and practicality will influence the next-generation Internet as we attempt to migrate from the technical maturity and robustness of the current Internet to its future vision. Therefore, as we move forward, we must build upon the current infrastructure to address key issues of widespread availability and deployment.
This book addresses the key influential technologies that will help to shape the next-generation Internet. P2P and distributed-object-based technologies, through to the promised pervasive deployment of Grid computing combined with Web services, will be needed in order to address the fundamental issues of creating a scalable, ubiquitous next-generation computing infrastructure. Specifically, a comprehensive overview of current distributed-systems technologies is given, covering P2P environments (Chapters 2, 6, 7, 9, 10 and 12), security techniques (Chapter 8), distributed-object systems (Chapters 5 and 11), Grid computing (Chapter 4) and both stateless (Chapters 3 and 13) and stateful Web services (Chapter 14).
1.1 Introduction to Distributed Systems
A distributed system can be defined as follows:
“A distributed system is a collection of independent computers that appears to its users as a single coherent system” [1]
There are two aspects to this: hardware and software. The hardware machines must be autonomous and the software must be organized in such a way as to make the users think that they are dealing with a single system. Expanding on these fundamentals, distributed systems typically have the following characteristics; they should:
    • be capable of dealing with heterogeneous devices, i.e., various vendors, software stacks and operating systems should be able to interoperate
    • be easy to expand and scale
    • be permanently available (even though parts of the system may not be)
    • hide communication from the users.
In order for a distributed system to support a collection of heterogeneous computers and networks while offering a single system view, the software stack is often divided into two layers. At the higher layer, there are applications (and users) and at the lower layer there is middleware, which interacts with the underlying networks and computer systems to give applications and users the transparency they need (see Fig. 1.1).
Middleware abstracts the underlying mechanisms and protocols from the application developer and provides a collection of high-level capabilities to
  • 23. make things far easier for programmers to develop and deploy their applications. For example, within the middleware layer, there may be simple abstract communication calls that do not specify which underlying mechanisms they actually use, e.g., TCP/IP, UDP, Bluetooth etc. Such concrete deployment bindings are often decided at run time, through configuration files or dynamically, and so depend on the particular deployment environment.
[Figure labels: Machine A, Machine B, Machine C; Distributed Applications; Middleware Services; OS (e.g., Windows XP); OS (e.g., MacOS); OS (e.g., Linux); Network]
Fig. 1.1. The role of middleware in a distributed system; it hides the underlying infrastructure away from the application and user level.
Middleware therefore provides the virtual overlay across the distributed resources to enable transparent deployment across the underlying infrastructures. In this book, we will take a look at a number of different approaches to designing the middleware abstraction layer by identifying the kinds of capabilities that are exposed by the various types.
1.2 Some Terminology
Often, a number of terms are used to define a device or capability on a distributed network, e.g., node, resource, peer, agent, service, server etc. In this section, common definitions are given which are used consistently throughout this book. The definitions presented here do represent a compromise, however, because often certain distributed entities are not identified in all systems in
  • 24. the same way. Therefore, wherever appropriate, the terminology provided here is given within the context of the system being described. The terms are defined as follows:
    • Resource: any hardware or software entity being represented or shared on a distributed network. For example, a resource could be any of the following: a computer; a file storage system; a file; a communication channel; a service, i.e., algorithm/function call; and so on
    • Node: a generic term used to represent any device on a distributed network. A node that performs one (or more) capabilities is often exposed as a service
    • Client: a consumer of information, e.g., a Web browser
    • Server: a provider of information, e.g., a Web server or a peer offering a file-sharing service
    • Service: “a network-enabled entity that provides some capability” [21]; e.g., a Web server provides a remote HTTP file-retrieval service. A single device can expose several capabilities as individual services
    • Peer: a device that acts as both a consumer and provider of information.
[Figure labels: Peer, Client, Server, Node, Computer, Device, Service, Resource]
Fig. 1.2. An overview of the terms used to describe distributed resources.
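The client, server and peer definitions above can be sketched in Java as follows. This is only an illustration of the terminology, not code from any of the systems covered later: the names `Client`, `Server`, `Peer`, `share`, `serve` and `request` are all hypothetical, and resources are reduced to named strings held in memory.

```java
import java.util.HashMap;
import java.util.Map;

/** A consumer of information (hypothetical interface). */
interface Client {
    String request(Server provider, String resourceName);
}

/** A provider of information (hypothetical interface). */
interface Server {
    String serve(String resourceName);
}

/** A peer acts as both a consumer and a provider of information. */
class Peer implements Client, Server {
    // The resources this peer shares, keyed by name.
    private final Map<String, String> sharedResources = new HashMap<>();

    void share(String name, String content) {
        sharedResources.put(name, content);
    }

    @Override
    public String serve(String resourceName) {
        // Returns null when this peer does not share the named resource.
        return sharedResources.get(resourceName);
    }

    @Override
    public String request(Server provider, String resourceName) {
        return provider.serve(resourceName);
    }
}

class TerminologyDemo {
    public static void main(String[] args) {
        Peer alice = new Peer();
        Peer bob = new Peer();
        alice.share("song.mp3", "alice's bytes");
        bob.share("notes.txt", "bob's notes");

        // Each peer consumes from the other and provides to the other.
        System.out.println(bob.request(alice, "song.mp3"));
        System.out.println(alice.request(bob, "notes.txt"));
    }
}
```

A pure client would implement only `Client` and share nothing, which is why the text treats it as the one entity that is not a resource.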
  • 25. Figure 1.2 organizes these terms by showing the relationships between them. Here, we can see that any device is an entity on the network. Devices can also be referred to in many different ways, e.g., a node, computer, PDA, peer etc. Each device can run any number of clients, servers, services or peers. A peer is a special kind of node, which acts as both a client and a server.
There is often confusion about the term resource. The easiest way to think of a resource is as any capability that is shared on a distributed network. Resources can be exposed in a number of ways and can also be used to represent a number of physical or virtual entities. For example, you can share: files (so a file is a resource), CPU cycles, storage capabilities (i.e., a file system), a service, e.g., a Web server or Web service, and so on. Therefore, everything in Fig. 1.2 is a resource except a client, which does not share.
A service is a software entity that can be used to represent resources, and therefore capabilities, on a network. There are numerous examples, e.g., Web servers, Web services, Jini services, Jxta peers providing a service, and so on. In simple terms, services can be thought of as the network counterparts of local function calls. Services receive a request (just like the arguments to a function call) and (optionally) return a response (as do local function calls). To illustrate this analogy, consider the functionality of a standard HTTP Web server: it receives a request for a file over HTTP and returns the contents of that file, if found. If this were implemented as a local function call in Java, it would look something like this:
    String getWebPage(String httpfile)
This simple function call takes a file-name argument (including its directory, e.g., /mydir/myfilename.html) and returns the contents of that local file within a Java String object. This is basically what a Web server does.
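One possible body for this local call is sketched below. Only the `getWebPage` signature comes from the text; the class name `LocalPageReader`, the `demo` helper, the UTF-8 encoding and the choice to rethrow I/O failures as unchecked exceptions (so the signature needs no `throws` clause) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name; only the getWebPage signature comes from the text.
class LocalPageReader {

    // Reads the named local file and returns its contents as a String.
    static String getWebPage(String httpfile) {
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(httpfile));
            return new String(bytes, StandardCharsets.UTF_8); // assumed encoding
        } catch (IOException e) {
            // Assumption: failures surface as unchecked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small temporary page, reads it back, and cleans up.
    static String demo() {
        try {
            Path page = Files.createTempFile("myfilename", ".html");
            try {
                Files.write(page, "<html>hello</html>".getBytes(StandardCharsets.UTF_8));
                return getWebPage(page.toString());
            } finally {
                Files.delete(page);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The request is the file-name argument and the response is the returned String, which is the same request/response shape a service exposes over the network.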
However, within the Web server scenario, the user would provide an HTTP address (e.g., http://www.google.com/index.html) and this would be converted into a remote request to the specified Web server (e.g., http://www.google.com) for the requested file (index.html). The entire process would involve the use of the DNS (Domain Name System), but the client (e.g., the Web browser) performs the same operation as our simple local reader and then renders the information in a specific way for the user, i.e., using HTML.
1.3 Centralized and Decentralized Systems
In this section, the middleware and systems outlined in this book are classified in a taxonomy according to a scale ranging between centralized and decentralized. The distributed architectures are divided into categories that define an axis on the comparison space. On one side of this spectrum, we have centralized systems, e.g., typical client/server-based systems, and on the other side, we have decentralized systems, often classified as P2P. In the centre is a
  • 26. mix of the two extremes in the form of hybrid systems, e.g., brokered, where a system may broker the functionality or communication request to another service. This taxonomy sets the scene for the specifics of each system, which will be outlined in the chapters to follow, and serves as a simple look-up table for determining a system’s high-level behaviour.
The boundaries are not clean-cut, however, and there are a number of factors that can determine the centralized nature of a system. Even systems that are considered fully decentralized can, in practice, employ some degree of centralization, albeit often in a self-organizing fashion [2]. Typically, decentralized systems adopt immense redundancy, both in the discovery of information and in content, by dynamically replicating information across many other peers on the network.
Broadly speaking, there are three main areas that determine whether a system is centralized or decentralized:
1. Resource Discovery
2. Resource Availability
3. Resource Communication
One important consideration to bear in mind as we talk about the degree of centralization of systems is that of scalability. When we say a resource is centralized, we do not mean to imply that there is only one server serving the information; rather, we mean that there is a fixed number of servers (possibly one) providing the information, which does not scale proportionately with the size of the network. Obviously, there are many levels of granularity here, hence the adoption of a sliding scale illustrating the various levels on a resource-organization continuum.
1.3.1 Resource Discovery
Within any distributed system, there needs to be a mechanism for discovering the resources. This process is referred to as discovery and a service which supplies this information is called a discovery service (e.g., DNS, Jini Lookup, Jxta Rendezvous, JNDI, UDDI etc.).
There are a number of mechanisms for discovering distributed resources, which are often highly dependent on the type of application or middleware. For example, resource discovery can be organized centrally, e.g., DNS, or in a decentralized fashion, e.g., Gnutella.
Discovery is typically a two-stage process. First, the discovery service needs to be located; then the relevant information is retrieved. The mechanism by which the information is retrieved can be highly decentralized (as in the lower layers of DNS), even though access to the discovery service is centralized. Here, we are concerned with the discovery mechanism as a whole. Therefore, a system that has centralized access to a decentralized search is classified by its lowest common denominator, i.e., the centralized access. There are two examples given below that illustrate this.
  • 27. As our first example, let’s consider DNS, which is used to discover an Internet resource. DNS works in much the same way as a telephone book. You give a DNS server an Internet site name (e.g., www.cs.cf.ac.uk) and the server returns the IP address (e.g., 131.251.49.190) for locating this site. In the same way as you keep a list of name/number pairs on your mobile phone, DNS keeps a list of name/IP number pairs.
DNS is not centralized in structure, but access to the discovery service certainly is, because there are generally only a couple of specified hosts that act as DNS servers for a given user. Typically, users specify a small number of DNS servers (e.g., one or two), which is a narrow point of access relative to the number of services reachable through it. If these servers go down then access to DNS information is disabled. However, behind this small gateway of hosts, the storage of DNS information is massively hierarchical, employing an efficient decentralized look-up mechanism that is spread amongst many hosts.
Another illustration here is the Web site Google. Google is certainly a centralized Web server in the sense that there is only one Google machine (at a specific time) that binds to the address http://www.google.com. When we ask DNS to provide the Google address, it returns the IP address 168.127.47.8, which allows you to contact the main Google server directly. However, Google is a Web search engine that is used by millions of people daily and consequently it stores a massive number of entries (around 1.6 billion). To access this information, it relies on a database that uses a parallel cluster of 10,000 Linux machines to provide the service (at the time of writing). Therefore, the access and storage of this information, from a user’s perspective, is centralized, but from a search or computational perspective, it is certainly distributed across many machines.
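The two-stage discovery process and its centralized point of access can be sketched with an in-memory stand-in for DNS. Nothing here performs a real DNS query: `NameServer`, `NameClient` and their methods are hypothetical names, and the only data taken from the text is the example name/IP pair. The point of the sketch is that a client configured with a small, fixed list of servers loses access when those servers are down, however the records are stored behind them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for one DNS server: a table of name/IP pairs. */
class NameServer {
    private final Map<String, String> table = new HashMap<>();
    private boolean up = true;

    void add(String name, String ip) { table.put(name, ip); }
    void setUp(boolean up) { this.up = up; }
    boolean isUp() { return up; }

    Optional<String> lookup(String name) {
        return Optional.ofNullable(table.get(name));
    }
}

/** A client configured with a small, fixed list of name servers. */
class NameClient {
    private final List<NameServer> configured;

    NameClient(List<NameServer> configured) { this.configured = configured; }

    Optional<String> resolve(String name) {
        for (NameServer server : configured) {
            // Stage 1: locate a reachable discovery service.
            if (!server.isUp()) continue;
            // Stage 2: retrieve the relevant information from it.
            Optional<String> ip = server.lookup(name);
            if (ip.isPresent()) return ip;
        }
        return Optional.empty();
    }
}

class DiscoveryDemo {
    public static void main(String[] args) {
        NameServer primary = new NameServer();
        NameServer secondary = new NameServer();
        primary.add("www.cs.cf.ac.uk", "131.251.49.190");
        secondary.add("www.cs.cf.ac.uk", "131.251.49.190");
        NameClient client = new NameClient(Arrays.asList(primary, secondary));

        System.out.println(client.resolve("www.cs.cf.ac.uk").orElse("unresolved"));
        primary.setUp(false);   // one configured server fails: still resolvable
        System.out.println(client.resolve("www.cs.cf.ac.uk").orElse("unresolved"));
        secondary.setUp(false); // all configured servers fail: access is lost
        System.out.println(client.resolve("www.cs.cf.ac.uk").orElse("unresolved"));
    }
}
```

In real Java code the look-up itself would normally go through `java.net.InetAddress.getByName`, which delegates to whichever resolvers the host is configured to use; the sketch avoids it only so that it runs without a network.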
1.3.2 Resource Availability
Another important factor is the availability of resources. Again, Web servers fall into the centralized category here because there is only one IP address that hosts a particular site. If that machine goes down then the Web site is unavailable. Of course, machines could be made fault tolerant by replicating the Web site and employing some internal switching mechanisms, but the availability of the IP address remains the same. Other systems, however, use a more decentralized approach by offering many duplicate services that can perform the same functionality.
Resource availability is closely tied to resource discovery. There are many examples here but, to illustrate various availability levels, let’s briefly consider the sharing of files on the Internet through the use of three approaches, which are illustrated in Fig. 1.3:
1. MP3.com
2. Napster
3. Gnutella.
Fig. 1.3. A comparison of service availability from centralized, brokered and decentralized systems. (Panels: MP3.com scenario, Napster scenario, Gnutella scenario.)
MP3.com contains a number of MP3 files that are stored locally at (or behind) the Web site. If the Web site or the hard disk(s) containing the database goes down, then users have no access to the content. Napster, on the other hand, stores the MP3 files on the actual users’ machines and napster.com is used as a massive index (or meeting place) for connecting users. Users connect to Napster to search for the files they desire and thereafter connect to users directly to download the file. Therefore, each MP3 file is distributed across a number of servers, making it more reliable against failure. However, as the search is centralized, it is dependent on the availability of the main Web site; i.e., if the Web site goes down then access to the MP3 files would also be lost.
Interestingly, the difference between MP3.com and Napster is smaller than you may think: one centralizes the files, whilst the other centralizes the addresses of the files. Either is susceptible to failure if the Web site goes down. The difference in Napster’s case is that, if the Web site goes down, current users can still finish downloading the files they have already discovered, since the communication is decentralized from the main search engine. Therefore, if a user has already located the file and initiated the download process, then the availability of the Web site does not matter and they can quite happily carry on using the service (but not search for more files).
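The Napster scenario above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Napster’s actual protocol: the classes `Peer` and `CentralIndex` and their methods are hypothetical, and the central index stores only the locations of files while the peers hold the content.

```python
# Brokered file sharing: a central index of locations, direct peer downloads.
# Class and method names are hypothetical; this is not the Napster protocol.

class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)   # title -> content held on this user's machine

    def download(self, title):
        """Serve a file directly to another peer (no central server involved)."""
        return self.files[title]


class CentralIndex:
    """Stand-in for the central Web site: it stores addresses, not files."""

    def __init__(self):
        self.locations = {}   # title -> list of peers holding it
        self.up = True

    def register(self, peer):
        for title in peer.files:
            self.locations.setdefault(title, []).append(peer)

    def search(self, title):
        if not self.up:
            raise ConnectionError("index unavailable")
        return list(self.locations.get(title, []))
```

A user who has already called `search` holds direct references to the peers, so `download` keeps working after `index.up` is set to `False`, whereas any new `search` fails. This mirrors the point that a user can finish a download but cannot search for more files.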
Thirdly, let’s consider Gnutella. Gnutella has neither a centralized search facility nor a central storage facility for the files. Each user in the network runs a servent (a client and a server), which allows him/her to act as both a provider and a consumer of information (as in Napster) and, furthermore, as a search facility. Servents search for files by contacting the other servents they are connected to, and these servents contact the servents they are connected to, and so on. Therefore, if any of the servents are unavailable, users can almost certainly still reach the file they require (assuming it is available at all).
Here, therefore, it is important to insert redundancy in both the discovery and availability of the resources for a system to be truly robust against single-point failure. Often, when there are a number of duplicated resources available but the discovery of such resources is centralized, we call this a brokered system; i.e., the discovery service brokers the request to another service. Some examples of brokered systems include Napster, Jini, ICQ and CORBA.
1.3.3 Resource Communication
The last factor is that of resource communication. There are two methods of communication between resources of a distributed system:
1. Brokered Communication: where the communication is always passed through a central server and therefore a resource does not have to reference the other resource directly
2. Point-to-Point (or Peer-to-Peer) Communication: this involves a direct connection (although this connection may be multi-hop) between the sender and the receiver. In this case, the sender is aware of the receiver’s location.
Both forms of communication have their implications on the centralized nature of the systems. In the first case for brokered communication, there is always a central server which passes the information between one resource and another (i.e., centralized).
Further, it is almost certainly the case that such systems are centralized from the resource discovery and availability standpoints also, since this level of communication implies fundamental central organization. Some examples here are J2EE, JMS chat and many publish/subscribe systems.
Second, there are many systems that use point-to-point connections, e.g., Napster and Gnutella, but so do Web servers! Therefore, this category is split horizontally across the scale and the significance here is in the centralization of the communication with respect to the types of connections. For example, in the Web server example, communication always originates from the user. There exists a many-to-one relationship between users and the Web server and therefore this is considered centralized communication. This is illustrated in Fig. 1.4, where an obvious centralized communication pattern is seen for the Web server case.
Fig. 1.4. The centralization of communication: a truly decentralized system would have even connections across hosts, rather than a many-to-one type of connectivity. (Panel annotations. Equal peers: communication is supposed to be even; i.e., each provider is also a server of information and each node has an equal number of connections. Web server: there is a many-to-one relationship between users and the Web server and therefore this can be considered centralized communication.)
However, in more decentralized systems, such as Napster and Gnutella, communication is more evenly distributed across the resources; i.e., each provider of information is also a server of information, and therefore the connectivity leans more towards a one-to-one connectivity rather than many-to-one. This equal distribution across the resources (known as equal peers) decentralizes communication across the entire system. However, in practice this is almost never the case because of the behavioural patterns exhibited by users of such networks; e.g., some users do not share files and others share many (see Section 7.7).
1.4 Examples of Distributed Applications
In this section, the criteria defining the taxonomy are applied to several well-known examples of existing distributed applications and middleware. The examples given here serve as a point of reference for each chapter that describes the particular application or middleware in more detail.
1.4.1 A Web Server: Centralized
A good example of a centralized system is a Web server. Clients (i.e., users) use their Web browser to navigate Web pages on one or more Web sites. Each Web
site is static to the particular domain with which it is associated. A Web server therefore is centralized in every sense. It has centralized discovery (through DNS), it is either available or not, and all communication is centralized to the particular Web server being contacted. Communication is point to point but there is a many-to-one relationship between the users of this service and the server itself.
Fig. 1.5. Taxonomy for a Web server. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
The circles in Fig. 1.5 show the position where a Web server lies on the centralized/decentralized scale for the three categories listed: resource discovery, resource availability and resource communication. The scale at the right-hand side of this graph indicates the broad granularity of our measurements (finer levels would not really change the outcome much anyway) but somewhere around the mid-point would denote the brokered case.
With brokering, typically one service brokers the request to another. DNS does not fall into this category since it has no intrinsic functionality or semantics itself. Web forwarding is a kind of brokering in this sense but this is a one-to-one forwarding. Typically, brokering involves making a decision about where to broker the request, and therefore there are typically many services offering the same functionality from which to choose. Communication can also be brokered by the server acting as a coordinator between the sender and receiver.
1.4.2 SETI@Home: Centralized
Fig. 1.6. Taxonomy for SETI@Home. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
SETI@Home (Search for Extraterrestrial Intelligence) [3] is a project that analyses data from a radio telescope to search for signs of extraterrestrial life. Each user who takes part in this project downloads a data set and executes some signal-processing tasks. The actual program is implemented as a screen saver and therefore only operates when the computer is idle. The SETI@Home project has used over a million years of CPU time at the time of writing.
Here, the entire system is run from the SETI@Home Web site. Users download the code and also the data when they are available to process. Therefore, the discovery is centralized (DNS) and the communication is centralized to the Web site. Resource availability is also centralized because without the availability of the Web site, the many SETI nodes cannot do anything since they need this server to download the next chunk of data. This taxonomy also applies to BOINC [38], which is the new open source release of the SETI@Home infrastructure. SETI is discussed in more detail in Chapter 2.
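The dependence of every volunteer node on the central site can be illustrated with a short sketch. It is a hedged toy model and not the SETI@Home or BOINC client: `WorkServer` and `volunteer` are hypothetical names, and the `analyse` function passed in stands for whatever signal processing a real client performs.

```python
from collections import deque

# Centralized work distribution: volunteers fetch chunks from one server.
# WorkServer and volunteer are hypothetical names, not a real client API.

class WorkServer:
    def __init__(self, chunks):
        self.pending = deque(enumerate(chunks))   # (chunk id, data) pairs
        self.results = {}                         # chunk id -> result
        self.up = True

    def next_chunk(self):
        if not self.up:
            raise ConnectionError("server unavailable")
        return self.pending.popleft() if self.pending else None

    def submit(self, chunk_id, result):
        if not self.up:
            raise ConnectionError("server unavailable")
        self.results[chunk_id] = result


def volunteer(server, analyse):
    """Process chunks until none are left; return how many were processed."""
    done = 0
    while True:
        job = server.next_chunk()
        if job is None:
            return done
        chunk_id, data = job
        server.submit(chunk_id, analyse(data))
        done += 1
```

All the computation happens on the volunteers, yet a volunteer cannot obtain or return a single chunk once `server.up` is `False`, which is why the text classifies availability as centralized.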
1.4.3 Napster: Brokered
Fig. 1.7. Taxonomy for Napster. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
A good example of a brokered system is Napster [4]. Napster stores information about the location of peers and music files in a centralized way but then lets the peers communicate directly when they transfer files. Here, therefore, the discovery and availability are centralized through the Napster Web site but the communication between the peers is decentralized. That said, the availability of the resources (i.e., files) is somewhat less centralized because users can still download a file even if the Napster server goes down. They cannot, however, search for new resources when the Web site is unavailable, so the service is limited in this respect. Napster is described in more detail in Chapter 2.
1.4.4 Gnutella: Decentralized
A popular example of a decentralized system is Gnutella [6], where discovery, availability and communication are completely decentralized over the network. Gnutella is discussed in detail in Chapter 6.
In theory Gnutella is completely decentralized but in practice is this really true? Decentralized networks are inherently self-organizing and so it is not only possible but indeed very likely that strong servers of information (the
so-called super-peers in Gnutella) could easily turn a decentralized network into a semi-centralized one when peers contain an uneven amount of content. Whether this is achieved by behavioural patterns or by artificially creating a centralized-decentralized structure, the resulting network is no longer completely decentralized. This is discussed in detail in Chapter 7.
Fig. 1.8. Taxonomy for Gnutella. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
It is no coincidence, for example, that this evolution of hybrid decentralized and centralized systems echoes the evolution of other types of systems such as Usenet [62]. The history of Usenet shows us that peer-to-peer (decentralization) and client/server (centralization) are not mutually exclusive. Usenet was originally peer-to-peer. Sites connected via a modem and agreed to exchange information (news and mail) with each other (UUCP). However, over time, it became obvious that certain sites had better servers than others and these sites went on to form the Usenet backbone. Today, the volume of Usenet is enormous and servers on the backbone can elect how much information they want to serve and they get added to the Usenet network in a decentralized fashion. Even the addition of new newsgroups is not centralized as users have to vote for a newsgroup before it gets initiated.
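To make the decentralized search used by Gnutella more tangible, the servent-to-servent querying described in this chapter can be sketched as a hop-limited flood over a graph of neighbours. This is a simplification under stated assumptions rather than the Gnutella protocol itself: the function names, the `ttl` hop limit as modelled here, and the five-node example network are all hypothetical.

```python
from collections import deque

# Hop-limited flooding over a neighbour graph (a simplified, hypothetical model).

def flood_search(neighbours, files, start, wanted, ttl):
    """Return the servents holding `wanted` within `ttl` hops of `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    hits = set()
    while frontier:
        node, hops = frontier.popleft()
        if wanted in files.get(node, ()):
            hits.add(node)
        if hops == ttl:
            continue   # the query is not forwarded any further
        for peer in neighbours.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                frontier.append((peer, hops + 1))
    return hits


def without(neighbours, dead):
    """Copy of the network with one servent removed (e.g. it went offline)."""
    return {n: [p for p in ps if p != dead]
            for n, ps in neighbours.items() if n != dead}


# Example network: a-b, a-c, b-d, c-d, d-e; "song" is held by c and e.
NEIGHBOURS = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
              "d": ["b", "c", "e"], "e": ["d"]}
FILES = {"c": {"song"}, "e": {"song"}}
```

In this example, removing servent `b` with `without(NEIGHBOURS, "b")` does not stop `a` from reaching both copies of the file, because the query can still travel through `c` and `d`. There is no single host whose failure disables the search.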
1.5 Examples of Middleware
1.5.1 J2EE and JMS: Centralized
Fig. 1.9. Taxonomy for JMS. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
The Java 2 Platform, Enterprise Edition (J2EE) [13] is an example of a centrally controlled system. Here, one Web site is the manager of all interaction between clients. Clients in the Java Message Service (JMS) do not know the whereabouts of other clients because this knowledge is stored within the central manager on the J2EE server. The entire system is based around a Web site and therefore the discovery is central.
JMS is used as a publish/subscribe mechanism within the J2EE environment (amongst other things) and is quite typical of other messaging systems, e.g., ICQ, where messages are brokered through a central server in order to get to their destination. Therefore, the communication is brokered through the Web site. Further, there is only one copy of the Web site (typically these are quite complicated to set up) and therefore the availability is centralized also.
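Brokered communication of this kind can be illustrated with a minimal publish/subscribe sketch. It deliberately does not use or imitate the JMS API: the `Broker` class, its methods and the topic name are hypothetical, and the only point is that senders and receivers know the broker but not each other.

```python
# Brokered publish/subscribe: every message passes through one central object.
# Broker is a hypothetical illustration and not the JMS API.

class Broker:
    def __init__(self):
        self.subscribers = {}   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return how many received it."""
        receivers = self.subscribers.get(topic, [])
        for callback in receivers:
            callback(message)
        return len(receivers)
```

A receiver could register with `broker.subscribe("chat", inbox.append)` and a sender would call `broker.publish("chat", "hello")`. Neither holds a reference to the other, so if the broker is unavailable no communication is possible at all.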
Fig. 1.10. Taxonomy for Jini. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
1.5.2 Jini: Brokered
Jini [78] allows Java objects to become network-enabled services that can be distributed in a network ‘plug and play’ manner. In a running Jini system, there are three main players. There is a service, such as a printer, a supercomputer running a software service etc. There is a client which would like to make use of this service. Third, there is a lookup service (service locator) which acts as a broker/trader/locator between services and clients. Jini is discussed in detail in Chapters 5 and 11.
Jini is another example of a brokered system. Jini clients find out about services by using the lookup server. The lookup server brokers the request to a matching service and thereafter the communication takes place directly between the client and services. Therefore, the availability is centralized in the sense that it is dependent on the Jini lookup service but, on the other hand, once a client discovers a service it wishes to use, the client and service can carry on communicating without the availability of the lookup service. Therefore, as in previous brokered systems, the availability is better than a strict centralized system.
Fig. 1.11. Taxonomy for Web services. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
1.5.3 Web Services: Brokered
At the core of the Web services model is the notion of a service, which can be described, discovered and invoked using standard XML technologies such as SOAP, WSDL and UDDI. Conventionally, Web services are described by a WSDL document, advertised and discovered using a UDDI server and invoked with a message conforming to the SOAP specification. Web services therefore use the same brokered model as other systems, such as Napster, Jini or CORBA, and therefore have a similar taxonomy to those systems.
However, Web services differentiate themselves by being based completely on open standards that have gained enormous support from thousands of companies and have been adopted by several communities, including the GGF. Web services are discussed in detail in Chapters 3, 13 and 14.
1.5.4 Jxta: Decentralized
Project Jxta [15] defines a set of protocols that can be used to construct peer-to-peer systems using any of the centralized, brokered and decentralized approaches but its main aim is to facilitate the creation of decentralized systems. Jxta’s goal is to develop basic building blocks and services to enable P2P applications for interested groups of peers. Jxta will be discussed, both
conceptually and from a programmer’s perspective in Chapters 10 and 12, respectively.
Fig. 1.12. Taxonomy for JXTA. (Axes: resource discovery, resource availability and resource communication, each placed on a scale from centralized to decentralized.)
Jxta can support any level of centralization/decentralization but its main focus (and hence power) is to facilitate the development of decentralized applications. Therefore, in this context, Jxta peers can be located in a decentralized fashion; they have much redundancy in their availability and their communication is point to point and therefore no central control authority is needed for their operation.
1.6 Conclusion
In this chapter, the critical components of any distributed system were outlined, concentrating particularly on the role of middleware. Distributed-systems terminology was introduced, along with the notion of a service, which will be used frequently within this book. We then discussed a taxonomy for distributed systems based on a scale ranging from centralized to decentralized, which factored in: resource discovery, resource availability and resource communication. Several well-known distributed applications and middleware have been classified using this taxonomy, which will serve as a placeholder and give context to the distributed systems described in the rest of this book.
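As a compact summary, the taxonomy can be written down as a small data structure with one value per axis. This is only one possible encoding: the names `Level`, `Taxonomy` and `SYSTEMS` are hypothetical, the positions are a reading of the prose in this chapter rather than values taken from the figures, and the rule that any mixed profile counts as brokered is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import IntEnum

# One possible encoding of the three-axis taxonomy (names are hypothetical).

class Level(IntEnum):
    CENTRALIZED = 0
    BROKERED = 1        # roughly the mid-point of the scale
    DECENTRALIZED = 2


@dataclass(frozen=True)
class Taxonomy:
    discovery: Level
    availability: Level
    communication: Level

    def overall(self):
        """Assumption: uniform axes give that level; any mix counts as brokered."""
        axes = {self.discovery, self.availability, self.communication}
        return axes.pop() if len(axes) == 1 else Level.BROKERED


C, B, D = Level.CENTRALIZED, Level.BROKERED, Level.DECENTRALIZED

# Positions below are a reading of the chapter's prose, not of its figures.
SYSTEMS = {
    "Web server": Taxonomy(C, C, C),
    "SETI@Home": Taxonomy(C, C, C),
    "Napster": Taxonomy(C, B, D),   # central index, direct transfers
    "Gnutella": Taxonomy(D, D, D),
    "Jxta": Taxonomy(D, D, D),
}
```

Under these assumptions `SYSTEMS["Napster"].overall()` evaluates to `Level.BROKERED`, matching the section heading, while the Web server and Gnutella entries sit at the two ends of the scale.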
In this book, there are four main themes: distributed environments, middleware and applications, middleware deployment and future trends. We begin by setting the scene and introducing three diverse, yet somewhat complementary technologies that have evolved over the past several years. These are peer to peer, Web services and Grid computing. Each of these technological areas addresses specific issues within the distributed system spectrum and, as we look ahead, it is highly likely that each will play an important role in contributing to our future distributed-systems infrastructure.
2 Peer-2-Peer Systems
At the time of writing, there are one and a half billion devices worldwide (e.g., PCs, phones, PDAs, etc.), a figure which is rising rapidly. Surveys have stated that Internet users surpassed 530 million in 2001 and predictions indicate that this will double to 1.12 billion by year-end 2005 [175].
The computer hardware industry has also been characterised by exponential production volumes. Gordon Moore, the co-founder of Intel, in his famous observation in 1965 [140] (made just four years after the first planar integrated circuit was discovered), predicted that the number of transistors on integrated circuits would double every few years. Indeed this prediction, thereafter called Moore’s law, remains true up until today and Intel predicts that this will remain true at least until the end of this decade [141].
Such acceleration in development has been made possible by the massive investment by companies who deal with comparatively short product life cycles. Each user now in this massive network has the CPU capability of more than 100 times that of an early 1990s supercomputer and, surprisingly, GartnerGroup research reveals that over 95% of today’s PC power is wasted. The potential of such a distributed computing resource has been in some ways demonstrated by the SETI@Home project [3], having used over a million years of CPU time at the time of writing.
In this chapter, peer-to-peer computing, a possible paradigm for making use of such devices, is discussed. An historical perspective is given, followed by a definition, taxonomy and justification for P2P computing. A background into the P2P environment is given, followed by examples of several P2P applications that operate within such an environment.
2.1 What is Peer to Peer?
This section gives a brief background and history of the term “peer to peer” and describes its definition in the current context. Examples of P2P technologies are given, followed by categorizations of their behaviour within the taxonomy described in the first chapter.
2.1.1 Historical Peer to Peer
Peer to peer was originally used to describe the communication of two peers and is analogous to a telephone conversation. A phone conversation involves two people (peers) of equal status, communicating over a point-to-point connection. Simply, this is what P2P is: a point-to-point connection between two equal participants.
The Internet started as a peer-to-peer system. The goal of the original ARPANET was to share computing resources around the USA. Its challenge was to connect a set of distributed resources, using different network connectivity, within one common network architecture. The first hosts on the ARPANET were several US universities and research sites, e.g., the University of California, Los Angeles, the University of California, Santa Barbara, SRI and the University of Utah. These were already independent computing sites with equal status and the ARPANET connected them as such, not in a master/slave or client/server relationship but rather as equal computing peers.
From the late 1960s until 1994, the Internet had one model of connectivity. Machines were assumed to be always switched on, always connected, and assigned permanent IP addresses. The original DNS system was designed for this environment, where a change in IP address was assumed to be abnormal and rare, and could take days to propagate through the system. However, with the invention of Mosaic, another model began to emerge in the form of users connecting to the Internet from dial-up modems. This created a second class of connectivity because PCs would enter and leave the network frequently and unpredictably. Further, because ISPs began to run out of IP addresses, they began to assign IP addresses dynamically for each session, giving each PC a different, possibly masked, IP address.
This transient nature and instability prevented PCs from being assigned permanent DNS entries, and therefore prevented most PC users from hosting any data or network-facing applications locally.
For a few years, treating PCs as clients worked well. Over time though, as hardware and software improved, the unused resources that existed behind this veil of second-class connectivity started to look like something worth getting at. Given the vast array of available processors mentioned earlier, the software community is starting to take P2P applications very seriously. Most importantly, P2P research is concerned with addressing some of the main difficulties of current distributed computing: scalability, reliability and interoperability.
2.1.2 Binding of Peers
Within today’s Internet, we rely on fixed IP addresses. When a user types an address into his/her Web browser (such as http://www.google.com/), the
Web server address is translated into the IP address (e.g., 168.127.47.8) by a domain name server (DNS). The Internet protocol (IP) then makes a routing decision based on the IP address. If DNS is unavailable then typing http://168.127.47.8/ into a browser would be equivalent since the Web page is permanently bound to the IP address.
This is known as static or early binding. Figure 2.1 illustrates this process graphically. Early bindings form a simple architecture very similar to an address book on a mobile phone; e.g., the person’s name is statically bound to his/her telephone number. This works in practice because typically people have long-term (early) bindings with their phone numbers and Web sites have long-term bindings with their IP addresses.
Fig. 2.1. The process whereby an Internet address is converted into the IP address for locating a Web page on the Internet. (Figure labels: http://www.google.com/, DNS, 168.127.47.8.)
However, if a Web site changed its IP address several times a day then this type of binding starts to become impractical. Within P2P networks this is the norm. Often devices do not have a fixed address as they are hidden behind Network Address Translation (NAT) systems and therefore need a late binding of their addresses with their network identifier.
2.1.3 Modern Definition of Peer to Peer
With the emergence of new technologies in the late 1990s a new definition for peer to peer has begun to emerge, as follows:
P2P is a class of applications that takes advantage of resources (e.g., storage, cycles, content, human presence) available at the edges of the Internet (Shirky [46]).
Computers/devices “at the edges of the Internet” are those operating within transient and often hostile environments. Devices within this environment: can come and go frequently; can be hidden behind a firewall or operate outside of DNS, e.g., by NAT (see next section); and often have to deal with differing transport protocols, devices and operating systems (see Fig. 2.2 below). Often the number of computers in a P2P network is enormous, consisting of millions of interconnecting peers.
This modern definition defines the P2P environment of devices and resources, whereas previous definitions focused on the servent methodology and decentralized nature of systems like Gnutella [6]. For example, in Gnutella, there are two key differences compared to client/server based systems:
• A peer can act as both a client and a server (these are called servents, i.e., server and client, in Gnutella).
• The network is completely decentralized and has no central point of control. Peers in a Gnutella network are typically connected to three or four other nodes and, to search the network, a query is broadcast throughout the network.
Certainly, within P2P systems, peers exist as defined in Gnutella. However, P2P networks do not have to be completely decentralized. This is evident in modern Gnutella implementations [51], which employ a centralized/decentralized approach in order to be able to scale the network and increase efficiency of search. Such networks are implemented using super-peers that cache file locations so that peers only have to search a small fraction of the network in order to satisfy their search requests.
Therefore, Shirky’s definition here is more appropriate to describe a new class of applications that are designed to work within this highly transient environment (see also Section 2.2), something previously unattainable.
Systems like Gnutella are now often referred to as True P2P (see Section 2.1.5) because of their pure decentralized approach, where everyone participates equally in the network. However, this ideal can never really be realised by a P2P system, simply because not all peers are equal within actual P2P networks, as several empirical studies have shown [69], [37] and [67]. See the next two chapters for a detailed overview of the evolving network topologies employed by recent decentralized file-sharing networks.
Other authors have noted the same. The authors of [24] state that “they prefer this definition to the alternative ‘decentralized, self-organizing distributed systems, in which all or most communication is symmetric,’ because it encompasses large-scale deployed (albeit centralized) P2P systems (such as Napster and SETI@Home) where much experience has been gained”.
Fig. 2.2. A P2P environment: devices are connected behind NATs and firewalls; they run on different platforms, potentially using different programming languages, e.g. Jxta [15]. (Figure labels: Linux, XP, TCP/IP, Bluetooth, HTTP, NAT, firewall, P2P network; a heterogeneous set of networked devices, e.g., different operating systems, programming languages and networks; application, e.g., file sharing, CPU sharing.)
Examples of recent P2P technologies include:
• File sharing/storage programs, e.g., Gnutella [6], Napster [4], Limewire [51], KaZaA [52], Freenet [58] and Popular Power [53], some of which have taken the spotlight by providing a way of sharing any type of digital file, of which users typically provide audio and video files
• CPU resource-sharing systems, e.g., SETI@Home [3], United Devices [54], Entropia [55] and XtremWeb [191]
• Instant messaging (e.g., ICQ [56] and Jabber [5])
• Conferencing applications, e.g., NetMeeting [57] for white-boarding and voice over IP.
What makes these similar is that they are all leveraging previously unused resources by tolerating and even working with the variable connectivity that many devices connected to these networks exhibit.
2.1.4 Social Impacts of P2P
The legal and social implications of P2P are still unfolding. No doubt, it has opened the eyes and imaginations of people from numerous disciplines to
the massive sharing of resources across the Internet. Even within the context of the sharing of copyrighted material, there are compelling arguments for and against the use of such technologies. There are a number of articles and books written on the subject that support the concept of P2P and those that give legal context for it. For example, on the Open Democracy Web site, there are a number of articles that give a social context for P2P, both from a cultural perspective and a legal one. In this section, a very brief summary of some of the points raised is given.
Vaidhyanathan [178], in his five-part article on the new information ecosystem, paints a picturesque account of a deep cultural change that is taking place through the introduction of P2P technologies. He argues that “what we call P2P communicative networks actually reflect and amplify - revise and extend - an old ideology or cultural habit. Electronic peer-to-peer systems like Gnutella merely simulate other, more familiar forms of unmediated, uncensorable, irresponsible, troublesome speech; for example, anti-royal gossip before the French revolution, trading cassette tapes among youth subcultures as punk or rap, or the illicit Islamist cassette tapes through the streets and bazaars of Cairo.”
He argues against the current clampdown strategy that is being employed by companies and governments. Such a strategy involves radically redesigning the communication technologies so that information can be monitored more closely. These restrictions would destroy the openness of the current Internet and could bring about a new type of Internet which, he says, would “not be open and customisable. Content - and thus culture - would not be adaptable and malleable. And what small measures of privacy these networks now afford would evaporate”.
Rainsford [179] uses the term “information feudalism”, which was taken from an analogy given by Peter Drahos [181].
Drahos suggests that:
“The current push for control over intellectual property rights has bred a situation analogous to the feudal agricultural system in the medieval period. In effect, songwriters and scientists work for corporate feudal lords, licensing their own inventions in exchange for a living and the right to ‘till the lands’ of the information society.”
Rainsford quotes a number of authors who believe that the struggle that we are experiencing has deep underlying roots in cultural transformations, which will inevitably bring about a change in the decaying business models of today. Rainsford also notes that “the links asserted between p2p systems and terrorism, or the funding of terrorism” are “a concept which is laughably ironic as p2p by its very nature is a non-profit system”.
Rimmer [180] gives a legal case for the argument and argues that “if claims by peer-to-peer distributors that they are supporting free speech and contributing to knowledge want to find a sympathetic ear in the courtroom, then they have to mean it”. He discusses the current use of P2P and argues that they have not lived up to their revolutionary promise, being used mostly for
  • 47. 2.1 What is Peer to Peer? 29 circulating copyrighted media around the world. He lists several cases that have been brought against companies, some of which resulted in findings of infringement and some of which did not. Rimmer states that P2P networks are “vulnerable to legal actions for copyright infringements because they have facilitated the dissemination of copyright media for profit and gain.” He concludes that “the courts would be happy to foster such technology if it promoted the freedom of speech, the mixing of cultures, and the progress of science”.

For further reading, see the articles listed or the Open Democracy Web site [177], which hosts a series of articles in response to these comments. Similar articles appear on other Web sites, such as OpenP2P [65].

2.1.5 True Peer to Peer?

Within P2P, there are three categories of systems (as outlined in Chapter 1):

    • Centralized systems: every peer connects to a server, which coordinates and manages communication. Examples include CPU-sharing applications, e.g., SETI@Home.
    • Brokered systems: peers connect to a server in order to discover other peers, but then manage the communication themselves (e.g., Napster). This is also called Brokered P2P.
    • Decentralized systems: peers run independently without the need for centralized services. Here, the discovery is decentralized and the communication takes place between the peers. Peers do not need a known centralized service in order to operate, e.g., Gnutella, Freenet.

Most Internet services are distributed using the traditional client/server (centralized) architecture. In this architecture, clients connect to a server using a specific communications protocol (e.g., TCP) to obtain access to a specific resource. Most of the processing involved in delivering a service usually occurs on the server, leaving the client relatively unburdened. 
Most popular Internet applications, including the World Wide Web, FTP, telnet, and email, use this service-delivery model. Unfortunately, this architecture has a major drawback: as the number of clients increases (and with it the load and the bandwidth required), the server becomes a bottleneck and may eventually be unable to handle any additional clients.

The advantage of the client/server model is that it requires less computational power on the client side. However, this advantage has been eroded by ever-increasing CPU power: most desktop PCs are now ludicrously overpowered for operating as simple clients, e.g., for browsing and email. P2P, on the other hand, has the capability of serving resources with high availability at a much lower cost, while maximizing the use of resources from every peer connected to the P2P network. Whereas client/server solutions rely
  • 48. on costly bandwidth, equipment, and location to maintain a robust solution, P2P can offer a similar level of robustness by spreading network and resource demands across the network.

Note, though, that some middleware architectures used to program such systems are capable of operating in one or more of these modes. Further, the more decentralized the system, the better the fault tolerance, since the services are spread across more resources. Therefore, at the far end of the scale, you have true P2P systems, which employ a completely decentralized structure, both in look-up and in communication.

Hong [62] gives a useful description of communication within P2P systems. He defines P2P systems as a class of distributed systems that are biased towards a decentralized approach, where there is no global notion of centralization. He argues that such systems are primarily concerned with smaller, distributed levels of centralization with respect to communication.

When designing a P2P system, therefore, there is a trade-off: the network must be decentralized enough to be tolerant of failures, yet centralized enough to scale to a large number of participants. These issues are discussed in detail in Chapter 7.

2.1.6 Why Peer-to-Peer?

So why is P2P important? What’s new?

Although the term P2P, in many people’s minds, is linked with distributing copyrighted material illegally, it has in fact much more to offer. P2P file-sharing applications have addressed a number of important issues in dealing with large-scale connectivity of transient devices. There are a number of practical real-world applications for such a technology, both on the Internet [54] [3] and on wireless networks, e.g., for mobile sensor applications [176], and in many different kinds of scientific and social experiments. 
P2P could provide more useful and robust solutions than current technologies in many different situations. For example, current search engine solutions centralize their knowledge and their resources. This is an inherent limitation. Google, for example, relies on a central database that is updated daily by scouring the Internet for new information. Simply because of the massive size of this database (more than 1.6 billion entries), not every entry gets updated every day and, as a result, information can often be out of date. Further, it is unlikely, from a cost perspective, that such solutions will scale to the future Internet. For example, even though Google, at the time of writing, runs a cluster of 10,000 machines to provide its service, it only searches a subset of available Web pages (about 1.3 x 10^8) to create its database. Furthermore, the world produces two exabytes (2 x 10^18 bytes) of information each year but only publishes about 300 terabytes (3 x 10^12 bytes), i.e., for every megabyte of information produced,
  • 49. 2.2 The P2P Environment 31 roughly one byte gets published. Therefore, finding useful information in real time is becoming increasingly difficult.

A similar service could be implemented using P2P technology. One possibility is that every person runs a personal Web server on a desktop computer that can process requests for information about the documents it manages. A user’s server could receive a query, check the local documents and respond with a list of matching documents. Each server would be responsible for indexing its own documents and would therefore be capable of providing more specialized, accurate and up-to-date information.

This decentralization of indexing is much more manageable than the task facing Google. Corporations could also make available specialized information that current search engines cannot reach. Further, if a user’s server disconnected from the network, its search service would become unavailable too, so users would not receive results for unavailable resources as they do at present. This describes an extreme P2P solution, but in practice some combination of the two techniques could prove very effective.

2.2 The P2P Environment

This section covers the technology that makes the P2P environment so difficult to work within. This environment was illustrated in Fig. 2.2: peers are extremely transient (they are continually disappearing and reappearing), connections are often multi-hop (i.e., packets travel via several intermediaries before they reach their destination), and peers reside in hostile environments (i.e., they live behind NAT routing systems and firewalls). This section gives some background on the technologies behind P2P networks, which helps set a more realistic P2P scene. The first section makes a brief excursion into switching technology for networks. The second section describes a particular subset of these that contains NAT systems. 
Lastly, firewalls are discussed.

2.2.1 Hubs, Switches, Bridges, Access Points and Routers

This section gives a brief overview of the various devices used to partition a network, which gives the context for the following two sections on NAT and firewalls, both of which are often employed within a P2P network. Briefly, the critical distinction between these devices is the level or layer at which they operate within the International Organization for Standardization’s Open Systems Interconnection (ISO/OSI) model, which defines seven network layers [98].

    • Hubs: A hub is a repeater that works at the physical (lowest) layer of OSI. A hub takes data that comes into a port and sends it to the other ports
  • 50. in the hub. It doesn’t perform any filtering or redirection of data. You can think of a hub as a kind of Internet chat room. Everyone who joins a particular chat is seen by everyone else. If there are too many people trying to chat, things get bogged down.
    • Switches and Bridges: These are pretty similar. Both operate at the Data Link layer (just above Physical) and both can filter data so that only the appropriate segment or host receives a transmission. Both filter packets based on the physical address (i.e., the Media Access Control (MAC) address) of the sender/receiver, although newer switches, referred to as IP switches, sometimes include the capabilities of a router and can forward data based on IP address (operating at the Network layer). In general, bridges are used to extend the distance capabilities of the network while minimizing overall traffic, and switches are used primarily for their filtering capabilities, to create multiple, smaller virtual local area networks (VLANs) out of one large LAN for easier management and administration.
    • Routers: These work at the Network layer of OSI (above Data Link) and operate on the IP address. Like switches and bridges, they filter, forwarding only those packets destined for remote networks and thus minimizing traffic, but they are significantly more complex than any other networking device and so require much more maintenance and administration. The home networker typically uses a DSL or cable modem router that joins the home’s LAN to the wide area network (WAN) of the Internet. By maintaining configuration information in a “routing table”, routers also have the ability to filter traffic, either incoming or outgoing, based on the IP addresses of senders and receivers. Most routers allow the home networker to update the routing table from a Web browser interface. DSL and cable modem routers typically combine the functions of a router with those of a switch in a single unit. 
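The difference between a hub and a switch described above can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions: the class names `Hub` and `LearningSwitch`, the `forward` method and the short MAC strings are hypothetical, ports are plain integers, and real switches also age out table entries and handle broadcast addresses, which is omitted here.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Physical-layer repeater: a frame is repeated on every other port."""
    num_ports: int

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        # No filtering at all: addresses are ignored.
        return [p for p in range(self.num_ports) if p != in_port]


@dataclass
class LearningSwitch:
    """Data-link-layer device: remembers which MAC address is behind which port."""
    num_ports: int
    mac_table: dict[str, int] = field(default_factory=dict)

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        # Learn (or refresh) the sender's location.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: behave like a hub and flood.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: filter it.
            return []
        return [out_port]


if __name__ == "__main__":
    hub = Hub(num_ports=4)
    switch = LearningSwitch(num_ports=4)
    print(hub.forward(0, "aa", "bb"))     # hub: all other ports
    print(switch.forward(0, "aa", "bb"))  # switch, "bb" unknown: flood
    print(switch.forward(1, "bb", "aa"))  # switch, "aa" learned: one port
    print(switch.forward(0, "aa", "bb"))  # switch, "bb" learned: one port
```

The hub always returns every other port, which is the "chat room" behaviour; the switch floods only until it has seen traffic from the destination, after which the frame goes to a single port.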
2.2.2 NAT Systems

For a computer to communicate with other computers and Web servers on the Internet, it must have an IP address. An IP (version 4) address is a unique 32-bit number that identifies the location of your computer on a network. There are, in theory, 2^32 (4,294,967,296) unique addresses, but the actual number available is much smaller (somewhere between 3.2 and 3.3 billion). This is due to the way that the addresses are separated into classes, and also because some are set aside for multicasting, testing or other special uses.

With the explosion of the Internet and the increase in home networks and business networks, the number of available IP addresses is simply not enough. An obvious solution is to redesign the address format to allow for more possible addresses. This is being developed and is called IPv6, but it may take several years to deploy because it requires modification of the entire infrastructure of the Internet.
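The arithmetic behind the 32-bit address space can be checked with a short Python sketch. The helper name `to_int` is hypothetical; the `ipaddress` module is part of the Python standard library. The two sample addresses are the ones used in this section's NAT examples, and the 128-bit IPv6 address length is a fact added here for comparison, not something stated in the text above.

```python
import ipaddress

TOTAL_IPV4 = 2 ** 32    # theoretical number of 32-bit addresses
TOTAL_IPV6 = 2 ** 128   # IPv6 widens the address to 128 bits


def to_int(dotted: str) -> int:
    """Pack a dotted-quad string such as '192.168.0.0' into its 32-bit value."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


if __name__ == "__main__":
    print(f"{TOTAL_IPV4:,}")  # 4,294,967,296

    public = "131.251.45.110"   # registered address from the NAT example
    private = "192.168.0.0"     # unregistered (stub domain) address

    # The hand-rolled packing agrees with the standard library.
    assert to_int(public) == int(ipaddress.IPv4Address(public))

    # Addresses set aside for private use are flagged by the library.
    print(ipaddress.IPv4Address(private).is_private)  # True
    print(ipaddress.IPv4Address(public).is_private)   # False
```

Reserved ranges such as the private block shown here are one reason the number of usable public addresses is smaller than 2^32.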
  • 51. Fig. 2.3. A NAT system divides a local network from the public network and offers local-to-public mapping of addresses. This allows the number of machines on the Internet to increase past the physical limit. A NAT system converts local addresses within the stub domain into one Internet address. (Diagram: a private local area network, the stub domain, connected through a NAT router to the public network, the Internet, with incoming and outgoing traffic in both directions.)

A network address translation (NAT) system (see Fig. 2.3) allows a single device, such as a router, to act as an agent between the Internet (public network) and a local (private) network. This means that only a single, unique IP address is required to represent an entire group of computers. The internal network is usually a LAN, commonly referred to as the stub domain. A stub domain is a LAN that uses IP addresses internally. Any internal computers that use unregistered IP addresses must use NAT to communicate with the rest of the world.

There are two types of NAT translation, static and dynamic, which are illustrated in Fig. 2.4. Static NAT involves mapping an unregistered IP address to a registered IP address on a one-to-one basis. This is particularly useful when a device needs to be accessible from outside the network; in static NAT, the computer with the IP address 192.168.0.0 will always translate to 131.251.45.110 (see upper part of Fig. 2.4).

Dynamic NAT, on the other hand, maps an unregistered IP address to a registered IP address from a group of local dynamically allocatable IP addresses, i.e., the stub domain computers will be allocated an address from a specified range of addresses, e.g., 192.168.0.0 to 192.168.0.50, in Figure 2.4 and
  • 52. will translate these to 131.251.45.110 for the outside world. In this circumstance, it is easy to see why NAT systems are problematic, since you could potentially have hundreds of stub domain computers masquerading as one external IP address.

Fig. 2.4. NAT system types: a NAT system can allocate addresses dynamically (one-to-many) or translate from fixed stub domain addresses to outside ones (static, one-to-one).

2.2.3 Firewalls

A firewall is a system designed to prevent unauthorized access to or from a private network. All messages entering or leaving the computer system pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria. Specifically, firewalls are implemented by blocking certain ports, thereby disabling certain types of services that operate on those ports.

Some firewalls permit only email traffic, thereby protecting the network against any attacks other than attacks against the email service. Other firewalls provide less strict protection and block only services that are known to be problematic. Generally, firewalls are configured to protect against unauthenticated interactive logins from the outside world. This, more than anything, helps prevent unauthorized users from logging into machines on your network. More elaborate firewalls block traffic from the outside to the inside, but permit users on the inside to communicate freely with the outside. Figure 2.5
  • 53. illustrates a scenario where both telnet and audio conferencing are blocked from the outside world but Web browsing and SSH connections are acceptable. Internal users can freely open external connections using any of these services but, in this example, they would not be able to hear the other participants in an audio conference because incoming audio is blocked.

A firewall can therefore protect you against many types of network attack. Firewalls are also important because they provide a single choke point where security and audit can be imposed, i.e., they can provide an important logging and auditing function and give the administrator summaries of the kinds and amount of traffic that passed through and of how many attempts there were to break in. Within P2P applications, it is often necessary to traverse such firewalls, for example, by rerouting the data over the HTTP port.

Fig. 2.5. A firewall blocks traffic to and from specified ports; here only SSH and Web browsing are allowed from external computers. (Diagram: Telnet, Audio Conferencing, SSH and Web Browser shown on the internal and external sides, with Telnet and Audio Conferencing marked as blocked.)

2.2.4 P2P Overlay Networks

P2P implementations frequently involve the creation of overlay networks [23] with a structure that is completely independent of that of the underlying network of connected devices. The purpose of overlay networks is to abstract the complicated connectivity of a P2P network into a higher-level programmatic view of the peers that make up the network. This is illustrated
  • 54. in Fig. 2.6, which shows the programmer’s view of the network (see top cloud of peers) that simplifies and abstracts the network structure and underlying transport mechanisms (see bottom part) into a collection of cooperating peers.

Fig. 2.6. An illustration of the notion of an overlay network. Modern P2P infrastructures typically overlay a virtual view of the nodes on the network to abstract the underlying mechanisms that actually connect these devices; this example was taken from Jxta [15]. (Diagram: a virtual overlay of peers mapped onto a physical network that includes Bluetooth, TCP/IP and HTTP transports, NAT systems and firewalls.)

There are several different types of overlay networks. For example, within Jxta, a virtual network overlay sits on top of the physical devices and is organized into transient or persistent relationships, which Jxta calls peer groups. Peers in Jxta are not required to have direct point-to-point network connections; instead, connections are represented through the use of virtual pipes. Virtual pipes simply define the endpoints of the connection and leave it to the underlying mechanisms to implement the appropriate behaviour for that environment, e.g., for TCP, a fixed point-to-point connection is created for the pipe, but for UDP pipes this is not required and therefore the pipe remains connectionless. Other network overlays include the use of distributed hash tables, e.g., Chord [45] or Pastry [44].
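To give a feel for how a distributed hash table overlay can assign keys to peers independently of the physical network, the following Python sketch places peers and keys on the same identifier ring and gives each key to the first peer at or after its position. This is a deliberately simplified illustration, not the actual Chord or Pastry algorithm: it keeps the whole ring in one process and has no routing tables, joins or failure handling. The names `ring_id`, `HashRing` and the peer labels are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(name: str) -> int:
    """Map a peer name or a key onto a 160-bit identifier ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy overlay: each key belongs to the first peer clockwise from it."""

    def __init__(self, peers):
        self._ring = sorted((ring_id(p), p) for p in peers)
        self._ids = [pid for pid, _ in self._ring]

    def lookup(self, key: str) -> str:
        # First peer whose identifier is >= the key's identifier,
        # wrapping around to the start of the ring if there is none.
        idx = bisect_left(self._ids, ring_id(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["peer-a", "peer-b", "peer-c"])
    for key in ("song.mp3", "paper.pdf", "notes.txt"):
        print(key, "->", ring.lookup(key))
```

One property worth noting in this scheme is that a peer leaving the overlay only affects the keys it was responsible for; keys owned by the remaining peers keep their owner, which suits the transient peers described in this section.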