Overview on P2P Principles Kalman Graffi DFG Research Group QuaP2P Technische Universität Darmstadt
Overview Motivation Peer-to-Peer principle Overlay networks  Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord Content Addressable Network
Motivation
- A huge number of nodes participate in the network
  - They have resources to share
  - They have demands for resources they do not have
- Main question: which of the nodes hold the resources I want?
- Peer-to-Peer (P2P) as a solution
  - P2P offers mechanisms to find / look up what I want
  - Therefore: build an additional overlay network
  - After finding the node that provides the desired service: communicate directly from peer to peer
Peer-to-Peer Properties
- Heterogeneous peers
  - Reliable? Permanent? Connectivity of individual peers cannot be assumed
- Peers offer and consume services and resources
  - Services are exchanged between any participating peers
- Peers (= end systems) form an overlay network
- Peers have significant autonomy
- Self-organizing system
[Figure: overlay connections and service delivery between peers]
Overlay Networks
- Overlay network: a network built ON TOP of one or more existing networks (e.g. an IP network)
  - Adds an additional layer of abstraction / indirection / virtualization
  - Provides sophisticated services (search, look-up)
- Both the P2P overlay and the IP network
  - have their own addressing scheme
  - provide routing functionality
  - are based on the end-to-end principle
[Figure: peers forming an overlay network on top of TCP/IP underlay networks. Picture adapted from Traversat et al., Project JXTA virtual network]
Overlay Networks: Properties
- Advantages:
  - The new layer speeds up search/lookup of requested information
  - Allows for bootstrapping
  - Makes use of the existing environment by adding a new layer
  - Not all nodes have to participate in maintaining it
    - But free riding is still a problem
- Disadvantages:
  - Overhead: additional layer in the networking stack
  - Complexity
    - Layering does not eliminate complexity, it only manages it
    - Misleading behavior, unintended interaction between layers
  - Redundancy: features may be available at various layers
- Two types of P2P overlay networks: unstructured and structured
Structured and Unstructured P2P Systems
- Unstructured P2P networks
  - Objects have no special identifier
  - Location of the desired object is not known a priori
  - Each peer is responsible only for its own objects
  - Search: find all (or some) objects in the P2P network that match given criteria
- Structured P2P networks
  - Peers and objects have identifiers; strict topology
  - Objects are stored on peers according to their ID: responsibleFor(ObjID) = PeerID
  - Distributed indexing points to the object location
  - Lookup / addressing: retrieve the object identified by a given identifier
Unstructured P2P: Centralized
Classification of P2P systems (from R. Schollmeier and J. Eberspächer, TU München). The figure groups the classes into a 1st, 2nd and 3rd generation.
- Centralized P2P (unstructured)
  1. Central entity is necessary to provide the service
  2. Central entity is some kind of index database
  3. Search costs: O(1)
  4. Costs for state: O(n)
  5. For: searches
- Pure P2P (unstructured)
  1. Any terminal entity can be removed without loss of functionality
  2. No central entities, fully distributed
  3. Search costs: O(n)
  4. Costs for state: O(1)
  5. For: searches
- Hybrid P2P (unstructured)
  1. Any terminal entity can be removed without loss of functionality
  2. Dynamic central entities for faster search
  3. Search costs: variable
  4. Costs for state: variable
  5. For: searches
- DHT-based (structured)
  1. Any terminal entity can be removed without loss of functionality
  2. No central entities, fully distributed
  3. “Fixed” connections in the overlay network
  4. Search costs: O(log n)
  5. Costs for state: O(log n)
  6. For: lookup
Unstructured Centralized P2P Systems
Principle: a central server stores information about locations
1. Publish: contents stay at the own peer; node A (provider) tells server S that it stores item D
2. Query: requester B asks server S for the location of D, and server S tells B that node A stores item D
3. P2P communication: node B requests item D from node A and gets the contents
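The publish / query / transfer steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `IndexServer` and `Peer` and their methods are hypothetical, and network messages are replaced by plain method calls.

```python
from collections import defaultdict


class IndexServer:
    """Central server S: knows where items are, stores no content itself."""

    def __init__(self):
        self._locations = defaultdict(set)  # item name -> set of peer addresses

    def publish(self, peer, item):
        # Step 1: a provider tells the server that it stores an item.
        self._locations[item].add(peer)

    def query(self, item):
        # Step 2: a requester asks for the location; a single index access.
        return sorted(self._locations.get(item, ()))


class Peer:
    def __init__(self, address):
        self.address = address
        self.store = {}  # the content itself stays at the peer

    def share(self, server, item, content):
        self.store[item] = content
        server.publish(self.address, item)

    def fetch(self, item):
        # Step 3: direct peer-to-peer transfer, the server is not involved.
        return self.store[item]


server = IndexServer()
peers = {name: Peer(name) for name in ("A", "B")}
peers["A"].share(server, "D", b"contents of D")

providers = server.query("D")             # B asks S where D is
content = peers[providers[0]].fetch("D")  # B fetches D from A
print(providers, content)
```

The sketch also shows where the weaknesses come from: every publish and every query touches the one `IndexServer` object, so its state grows with the number of shared items and the service stops if it fails.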
Unstructured Centralized P2P Systems
- Advantages
  - Search complexity of O(1): “just ask the server”
  - Complex and fuzzy queries are possible
  - Simple, fast, and finds all objects
- Problems
  - No robustness: the server is a single point of failure (SPOF)
  - No self-organization
  - No intrinsic scalability: O(N) network and system load
  - Non-linearly increasing maintenance cost, in particular for achieving high availability and scalability
- But overall: best principle for small and simple applications
Unstructured P2P: Pure / Distributed
[Figure repeated: classification of P2P systems, from R. Schollmeier and J. Eberspächer, TU München]
Unstructured Distributed P2P Systems: Fully Distributed Approach
- Central systems are vulnerable, do not scale, and have unbalanced costs
- Unstructured P2P systems follow the opposite approach
  - No global information available about the location of an item
  - Information is stored only at the respective node providing it
- Retrieval of data
  - No routing information for content
  - Necessity to ask as many systems as possible
- Approaches:
  - Flooding: high traffic load on the network, does not scale
  - Highest effort for searching
    - Quick search through large areas
    - Many messages needed for unique identification
Unstructured Distributed P2P Systems
- Characteristics
  - All peers are equal (in their roles)
  - Search mechanism is provided by cooperation of all peers
  - Local view on the network
- Tasks to solve:
  - Connecting to the network
    - No central index server, so joining strategies are needed
    - To join, a peer has to know at least one peer in the network
    - Local view on the network => advertisements needed
  - Search
    - Different search strategies are available, with different benefits and drawbacks
  - Service delivery
    - Establish a connection to the other node
    - Peer-to-peer communication
[Figure: overlay network and service delivery]
Unstructured Distributed P2P Example
Principle: no information about the objects is spread
1. Publish: contents stay at the own peer. Nodes A and B (providers) store the object and tell no one
2. Search: node C, searching for the object, floods the network. Nodes A and B send a reply
3. P2P communication: node C requests the object from a subset of the repliers and gets the contents
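The flooding search in step 2 can be sketched as a breadth-first traversal of the overlay. The sketch below makes several assumptions that are not in the slides: the overlay graph, the node names `P1`, `P2` and `E`, the function name `flood_search`, and the use of a hop limit (TTL) to bound the flood.

```python
from collections import deque


def flood_search(neighbors, holders, start, ttl):
    """Breadth-first flooding: return (repliers, messages_sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    repliers, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if node in holders and node != start:
            repliers.append(node)   # a provider answers the query
        if remaining == 0:
            continue                # hop limit reached, do not forward
        for nxt in neighbors[node]:
            messages += 1           # every forwarded copy costs one message
            if nxt not in seen:     # duplicates are dropped by the receiver
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return repliers, messages


# Hypothetical overlay: C is the requester, A, B and E hold the object.
overlay = {
    "C": ["P1", "P2"],
    "P1": ["C", "A", "P2"],
    "P2": ["C", "P1", "B"],
    "A": ["P1"],
    "B": ["P2", "E"],
    "E": ["B"],
}
holders = {"A", "B", "E"}

for ttl in (1, 2, 3):
    print(ttl, flood_search(overlay, holders, "C", ttl))
```

With a hop limit of 2 the query reaches A and B but not E, which illustrates that an existing object can be out of reach of a query; raising the limit finds E at the price of more messages.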
Properties of Distributed P2P Systems
- Benefits:
  - Robustness: every peer is dispensable; switching off a peer has no effect on the network
  - Balanced costs: each peer contributes the same
  - Self-organization
- Drawbacks:
  - Slow and expensive search (flood the network)
  - Finding all objects that match the search criteria is not guaranteed; an object may be out of reach of the search query
Unstructured Hybrid P2P Systems
[Figure repeated: classification of P2P systems, from R. Schollmeier and J. Eberspächer, TU München]
Unstructured Hybrid P2P Systems
- Combine the best of both worlds:
  - Robustness by distributed indexing
  - Fast searches by server queries
- How it works:
  - Supernodes are mini servers / super peers
  - Normal peers have overlay connections only to supernodes and use them as servers for queries
  - Supernodes: queries are flooded in the supernodes' subnetwork
- Advantages:
  - More robust than centralized solutions
  - Faster searches than in pure P2P systems
- Disadvantages:
  - Algorithms are needed to choose reliable supernodes
Unstructured Hybrid P2P Systems Example: Gnutella 0.6 from R.Schollmeier and J.Eberspächer, TU München
Structured DHT-based P2P Systems
[Figure repeated: classification of P2P systems, from R. Schollmeier and J. Eberspächer, TU München]
Distributed Hash Table: Steps of Operation
- Beginning: mapping nodes and data onto the same address space
  - Peers and objects are addressed using flat IDs
  - Nodes are responsible for data in certain parts of the address space: responsibleFor(ObjectID) = PeerID
  - Association of data to nodes may change (churn)
- Later: storing / looking up data in the DHT
  - Retrieving data = routing to the responsible node
  - Responsible node not necessarily known in advance
  - Deterministic statement about availability of data
  - All nodes maintain routing information to other nodes
- Limitations
  - Maintenance of routing information required
  - Load balancing problematic
  - No fuzzy search supported
Structured Overlay Networks: Example
Principle: the location of the objects is found via routing
1. Publish link at the responsible peer: node A (provider) advertises the object at the responsible peer B. The advertisement is routed to B
2. “Routing” to / lookup of the desired object: node C, looking for the object, sends a query to the network. The query is routed to the responsible node
3. P2P communication: node B replies to C by sending the contact information of A. C gets the link to the object
Chord: Network Topology
- Uses SHA-1 to map an IP address / object name to a 160-bit ID
- Basic ring topology mod 2^n
  - Successor / predecessor
  - Circular key space
  - Link to ring successor
- Enhanced topology
  - The k-th finger of peer n is a shortcut pointing to the peer responsible for ObjectID (n + 2^k)
  - O(log(N)) fingers lead to lookup operations of O(log(N))
  - Fingers point to peers with ObjectIDs increasing exponentially. Here: 709 + 2^k = …, 965, 1221, 1733, 2757
[Figure: Chord ring, peer ID and responsible range: 611 (3486-4095, 0-611), 659 (612-659), 709 (660-709), 1008 (710-1008), 1622 (1009-1622), 2011 (1623-2011), 2207 (2012-2207), 2682 (2208-2682), 2906 (2683-2906), 3485 (2907-3485)]
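The finger targets listed for peer 709 can be recomputed with a short sketch. It uses the ten peer IDs of the example ring and assumes a 12-bit ID space (0 to 4095), which is what the ranges in the figure imply; the function names are illustrative, not part of Chord's specification.

```python
from bisect import bisect_left

M = 12  # 2**12 = 4096 IDs as in the example ring (real Chord uses 160-bit IDs)
RING = [611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485]


def successor(ring, key):
    """First peer clockwise from key; wraps around to the smallest ID."""
    i = bisect_left(ring, key % 2**M)
    return ring[i % len(ring)]


def finger_table(ring, n):
    """k-th finger of peer n: the peer responsible for (n + 2**k) mod 2**M."""
    return [(k, (n + 2**k) % 2**M, successor(ring, n + 2**k)) for k in range(M)]


for k, target, peer in finger_table(RING, 709):
    print(f"k={k:2d}  target={target:4d}  finger -> {peer}")
```

The last four targets are 965, 1221, 1733 and 2757, matching the slide. Many small fingers collapse onto the same peer (1008 here), so the number of distinct fingers is much smaller than the number of table entries.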
Chord: Addressing Content
- A query contains the hash value of the queried content
- With each step the distance to the destination is halved
- Use fingers to locate the destination faster; without fingers there are no shortcuts and the query walks the circle
- Example: node 1008 queries item 3000
  1. 1008 forwards to 2207, the peer responsible for 1008 + 1024
  2. 2207 forwards to 2906, the peer responsible for 2207 + 512
  3. 2906 forwards to 3485, the peer responsible for 3000: responsible peer found
[Figure: Chord ring with peers 611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485 and their responsible ranges]
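The example lookup can be traced with a small simulation. This is a simplified sketch, not the full Chord protocol: it assumes every peer has a correct, global view for building its fingers, uses the 12-bit example ring, and the helper names are made up for illustration.

```python
from bisect import bisect_left

M = 12
RING = [611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485]


def successor(key):
    """Peer responsible for key (first peer clockwise, with wrap-around)."""
    i = bisect_left(RING, key % 2**M)
    return RING[i % len(RING)]


def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    return (a < x <= b) if a < b else (x > a or x <= b)


def fingers(n):
    return [successor(n + 2**k) for k in range(M)]


def lookup(start, key, use_fingers=True):
    """Return the peers visited until the responsible peer is reached."""
    path, node = [start], start
    while True:
        nxt = successor(node + 1)          # direct ring successor of node
        if in_interval(key, node, nxt):    # successor is responsible: done
            path.append(nxt)
            return path
        if use_fingers:
            # take the farthest finger that still lies before the key
            for f in reversed(fingers(node)):
                if f != key % 2**M and in_interval(f, node, key):
                    nxt = f
                    break
        path.append(nxt)
        node = nxt


print(lookup(1008, 3000))                     # with fingers
print(lookup(1008, 3000, use_fingers=False))  # walking the circle
```

With fingers the query visits 1008, 2207, 2906 and 3485, the same three hops as in the slide; without fingers it passes every peer between 1008 and 3485.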
Chord: Join Procedure
- Request to join the Chord ring
  - New node (e.g. 1289) contacts a member of the ring (e.g. 2906)
  - Contacted node routes the query to the responsible node (1622)
  - Responsible node (1622) contacts the new node
- Then:
  1. Set/contact successor
  2a. Set new predecessor
  2b. Redistribute indexing information (e.g. 1009-1289)
  3. Update successor of predecessor
  4. Build fingers: the fingers of peer n point to the peers responsible for ObjectID n + 2^k; thus log(N) fingers are built
[Figure: ring after the join. The new peer 1289 is responsible for 1009-1289, peer 1622 for 1290-1622; all other ranges are unchanged]
Content Addressable Network (CAN)
- A d-dimensional hash table in a cyclic coordinate space
  - d hash functions, one per coordinate
  - PeerID(p) = (h1(p), h2(p), ..., hd(p))
  - ObjID(obj) = (h1(obj), ..., hd(obj))
- CAN nodes
  - Each node is responsible for a distinct rectangular zone of the space
  - Each node stores the data that hash into its zone
  - Together the peers cover the entire space
- Routing:
  - A peer knows the IP addresses and zone ranges of its neighbors
  - Peers can communicate only with their neighbors
- Properties
  - Routing table size O(d)
  - Guarantees that a file is found in at most d * n^(1/d) steps, where n is the total number of peers
[Figure: 2-dimensional CAN on an 8 x 8 coordinate grid (0-7 on each axis) with nodes n1-n5 and files f1-f4]
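The ID construction and the step bound can be made concrete with a small sketch. The slides do not say how the d hash functions are chosen; salting SHA-1 with the coordinate index is an assumption made here, as are the function names and the 8-cells-per-axis grid taken from the figure.

```python
import hashlib


def h(i, name, size):
    """i-th coordinate hash: SHA-1 salted with the coordinate index (assumption)."""
    digest = hashlib.sha1(f"{i}:{name}".encode()).digest()
    return int.from_bytes(digest, "big") % size


def can_point(name, d=2, size=8):
    """ID(x) = (h1(x), ..., hd(x)) in a d-dimensional space with `size` cells per axis."""
    return tuple(h(i, name, size) for i in range(1, d + 1))


def max_steps(n, d):
    """Step bound quoted on the slide: d * n**(1/d)."""
    return d * n ** (1 / d)


print(can_point("f1"), can_point("n1"))
for d in (2, 3, 4):
    print(d, round(max_steps(4096, d), 1))
```

Increasing d shortens the bound for a fixed number of peers, while the routing table grows only linearly with d.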
CAN: New Peer Join
- A new node has to acquire a zone to be responsible for
- Steps:
  - The node randomly chooses a point P in the space
  - The zone that includes P is split into two halves
- Example: new node n6 requests to join
  - n6 contacts a node (e.g. n5) and selects point P
  - n5 routes the join query to n1
  - n1 splits its zone
  - n6 is responsible for the new zone (at point P)
[Figure: 2-dimensional CAN with nodes n1-n5, files f1-f4, and the new node n6 at point P. Figure modified from another presentation]
Summary Motivation Peer-to-Peer principle Overlay networks  Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord Content Addressable Network
Questions? Thank you for your attention.
After this slide: slides in reserve
Requirements for Overlay Networks
- Fault tolerance: resilience of the connectivity when failures are encountered through the arbitrary departure of peers
- Heterogeneity: considering variations in physical capabilities and peer behavior (e.g. file fishing)
- Fairness: evenly distributing the workload across nodes
- Security: ability of a system to manage, protect and distribute sensitive information
- Privacy: degree to which a system or component allows for (or supports) anonymous transactions
Requirements for Overlay Networks: Trade-offs
- Time vs. space: e.g. local information vs. complete replication of indices
- Security vs. privacy: e.g. fully logged operations vs. totally untraceable
- Efficiency vs. completeness: e.g. exact key-based matching vs. partial matching (use of wildcards)
- Scope vs. network load with TTL (time to live): e.g. TTL-based requests vs. exhaustive search
- Efficiency vs. autonomy: e.g. hierarchical vs. pure P2P overlays
- Reliability vs. low maintenance overhead: e.g. deterministic vs. probabilistic operations
Centralized P2P Networks
- Central index server, maintaining the index:
  - What: object name, file name, criteria (ID3) …
  - Where: (IP address, port)
  - Search engine, combining both kinds of information
  - Global view on the network
- Normal peer, maintaining the objects:
  - Each peer maintains only its own objects
  - Decentralized storage (content at the edges)
  - File transfer between clients (decentralized)
- Issues:
  - Unbalanced costs: the central server is a bottleneck
  - Security: the server is a single point of failure
[Figure: central server and overlay network]
Search in Centralized P2P Networks
- Search
  1. Query server: peers contact the central server asking for objects that fulfill some criteria
  2. Server answers: the central server answers with a list of addresses of peers that hold objects matching these criteria
  3., 4. Service delivery (P2P): the peer contacts the peers holding the desired objects; the object is transferred / the service is provided
Step 1: Addressing in Distributed Hash Tables
- Mapping of content/nodes into a linear space
  - Usually: 0, …, 2^m - 1 >> number of objects to be stored
  - Mapping of data and nodes into an address space (e.g. 0 to 2^m - 1) with a hash function
  - E.g. Hash(String) mod 2^m: H(“my data”) → 2313
- Association of parts of the address space with DHT nodes
- Often, the address space is viewed as a circle
[Figure: address space from 0 to 2^m - 1 divided into the ranges 611-709, 1008-1621, 1622-2010, 2011-2206, 2207-2905, 2906-3484 and 3485-610 (wrapping around); H(Node X) = 2906, H(Node Y) = 3485, data item “D” with H(“D”) = 3107]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
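A minimal sketch of the mapping Hash(String) mod 2^m, using SHA-1 from Python's standard library. The slides do not state which hash function produced the example value 2313, so this sketch is not expected to reproduce it; the function name `dht_id` and the choice m = 12 are assumptions.

```python
import hashlib

M = 12  # address space 0 .. 2**M - 1 (assumed size for illustration)


def dht_id(name, m=M):
    """Map a node address or object name into the address space 0 .. 2**m - 1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % 2**m


# Nodes and data share one address space; the node address below is hypothetical.
for name in ("my data", "D", "134.2.11.68:4711"):
    print(f"{name!r} -> {dht_id(name)}")
```

Because the same function is applied to node addresses and object names, both kinds of IDs can be compared directly when deciding which node is responsible for an object.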
Step 2: Association of Address Space with Nodes
- Arrangement of the range of values
  - Each node is responsible for part of the value range
  - Often with redundancy (overlapping of parts)
  - Continuous adaptation
- Real (underlay) and logical (overlay) topology are (mostly) uncorrelated
- Example: node 3485 is responsible for data items in the range 2907 to 3485 (in case of a Chord DHT)
[Figure: logical view of the Distributed Hash Table (ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485) and its mapping onto the real topology]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
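The Chord-style rule "a node is responsible for the range ending at its own ID" can be written as a lookup in a sorted list. The sketch uses the eight node IDs from the figure and assumes an address space of 4096 IDs; `responsible_for` mirrors the responsibleFor(ObjectID) = PeerID notation but is otherwise an illustrative name.

```python
from bisect import bisect_left

SPACE = 4096  # assumed size of the address space
nodes = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]


def responsible_for(obj_id, nodes):
    """Chord-style rule: the first node ID >= obj_id, wrapping around the circle."""
    ordered = sorted(nodes)
    i = bisect_left(ordered, obj_id % SPACE)
    return ordered[i % len(ordered)]


print(responsible_for(3107, nodes))  # item D from the slides
print(responsible_for(4000, nodes))  # wraps around to the smallest node ID

# Continuous adaptation: if node 3485 leaves, its range falls to the next node.
remaining = [n for n in nodes if n != 3485]
print(responsible_for(3107, remaining))
```

The last call shows why the association of data with nodes changes under churn: after node 3485 leaves, item 3107 belongs to node 611.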
Step 3: Locating a Data Item
- Locating the data: content-based routing
- Goal: small and scalable effort
  - O(1) with a centralized hash table
  - But: management of a centralized hash table is too costly (server)
- Minimum overhead with distributed hash tables
  - O(log N): DHT hops to locate an object
  - O(log N): number of keys and routing information per node (N = number of nodes)
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Step 4: Routing to a Data Item
- Routing to a key/value pair
  - Start the lookup at an arbitrary node of the DHT
  - Routing to the requested data item (key)
- Key/value pair: (3107, (ip, port))
  - Key = H(“my data”) = 3107
  - Value = pointer to the location of the data
- Node 3485 manages keys 2907-3485
[Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; the lookup starts at an arbitrary initial node]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Step 5: Data Retrieval - Usage of the Located Resource
- Accessing the content
  - The key/value pair is delivered to the requester: node 3485 sends (3107, (ip, port)) to the requester
  - The requester analyzes the key/value tuple (and downloads the data from the actual location in case of indirect storage)
  - In case of indirect storage: after the actual location is known, the data is requested with Get_Data(ip, port)
[Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; H(“my data”) = 3107]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
(Step 6) Where is the Data Located? Association of Data with IDs
- Direct storage
  [Figure: item D with H_SHA-1(“D”) = 3107 on the ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; D originates from 134.2.11.68]
- Indirect storage
  [Figure: same ring; the entry for H_SHA-1(“D”) = 3107 reads “Item D: 134.2.11.68”, and D itself remains at 134.2.11.68]
Distributed Hash Table: Insert and Delete a Node
- Join of a new node
  1. Calculation of the node ID
  2. New node contacts the DHT via an arbitrary node
  3. Assignment of a particular hash range
  4. Copying of the key/value pairs of the hash range (usually with redundancy)
  5. Binding into the routing environment
[Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; joining node 134.2.11.68 with ID 3485]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Node Failure and Node Departure
- Failure of a node
  - Use of redundant key/value pairs (if a node fails)
  - Use of redundant / alternative routing paths
  - A key/value pair is usually still retrievable if at least one copy remains
- Departure of a node
  - Partitioning of the hash range to neighbor nodes
  - Copying of key/value pairs to the corresponding nodes
  - Unbinding from the routing environment
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Summary of DHTs: Properties
- Hash buckets distributed over nodes
- Nodes form an overlay network
  - Route messages in the overlay to find the responsible node
  - The routing scheme in the overlay network is the difference between different DHTs
- DHT behavior and usage:
  - A node knows an “object” name and wants to find it
    - Unique and known object names assumed
  - The node routes a message in the overlay to the responsible node
  - The responsible node replies with the “object”
    - Semantics of “object” are application defined
Chord: Join Procedure (1)
Request to join the Chord ring (new peer 1289):
1. Contact a member of the ring
2. Route the query in the ring
3. Provide the new peer's successor
[Figure: Chord ring before the join with peers 611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485; peer 1622 is responsible for 1009-1622]
CAN: New Peer Join
- A new node has to acquire a zone to be responsible for
- Steps:
  - The node randomly chooses a point P in the space
  - The zone that includes P is split into two halves
- Example: node n6 requests to join
  - n6 contacts n4 and selects point P
  - n4 routes the join query to n1
  - n1 splits its zone
  - n6 is responsible for the new zone (at point P)
[Figure: 2-dimensional CAN on an 8 x 8 grid with nodes n1-n5, files f1-f4, and the new node n6 at point P]
CAN: Routing
- Two CAN nodes are neighbors if their zones overlap along d-1 dimensions and abut (directly adjoin) along one dimension
- I.e.
  - A node knows the IP addresses of its neighbors
  - A node knows the coordinates of the neighboring zones
  - Nodes can communicate only with their neighbors
- Properties
  - Routing table size O(d)
  - Guarantees that a file is found in at most d * n^(1/d) steps, where n is the total number of nodes
- Lookup example: node n5 is looking for file f3
[Figure: 2-dimensional CAN on an 8 x 8 grid with nodes n1-n5 and files f1-f6]
Kademlia
- Widely deployed DHT (used in e.g. “eMule”)
- Features
  - DHT-based overlay network using the XOR metric
    - E.g. D(110101, 010001) = 4 + 32 = 36
    - Simple operation
    - Symmetrical (A->B == B->A)
  - Uses lookup messages to maintain the overlay network (cannot be applied to the asymmetrical Chord)
  - Multiple routing possibilities
    - Parallel asynchronous queries
    - Overcome faulty nodes
- Properties
  - Scalable: logarithmic complexity in node degree (O(log(N))) and lookup steps (O(log(N)))
  - Efficient: low maintenance cost
  - Robust
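The XOR metric is a one-line operation on integer IDs. The sketch below reproduces the example from the slide and checks the symmetry property; the helper names `distance` and `closest` and the contact list are illustrative assumptions.

```python
def distance(a, b):
    """Kademlia's XOR metric: interpret the bitwise XOR of two IDs as an integer."""
    return a ^ b


def closest(contacts, target, count=2):
    """The `count` known contacts nearest to the target under the XOR metric."""
    return sorted(contacts, key=lambda c: distance(c, target))[:count]


x, y = 0b110101, 0b010001
print(distance(x, y))                    # 0b100100 = 32 + 4 = 36
print(distance(x, y) == distance(y, x))  # symmetric

contacts = [0b000111, 0b010000, 0b110100, 0b011001]  # hypothetical contact list
print([format(c, "06b") for c in closest(contacts, y)])
```

Because the metric is symmetric, a node that receives a lookup from some peer learns about a contact that is exactly as close to it as it is to that peer, which is why ordinary lookup traffic can be used to maintain the overlay.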
Kademlia: System Description
- Nodes are leaves in a binary tree
  - The position is determined by the shortest unique prefix of the node ID
- For every node (e.g. node “001”)
  - Divide the binary tree into a series of successively lower sub-trees that do not contain that node
  - At least one contact node is required in each sub-tree
  - Large sub-trees provide many neighbor alternatives
Kademlia: Lookup Operation
- Every hop forwards the query to a smaller sub-tree around the target
- Example: node “001” routes to “101”
  1. Node “001” needs a contact in the 1-(*) sub-tree (e.g. “110”)
  2. Node “110” needs a contact in the 10-(*) sub-tree (e.g. “100”)
  3. Node “100” forwards the query to the destination “101”
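The example path can be checked numerically: at every hop the prefix shared with the target grows and the XOR distance shrinks. This is a small sketch for 3-bit IDs as used in the example; the helper name `shared_prefix_len` is an assumption.

```python
BITS = 3  # the example uses 3-bit node IDs


def shared_prefix_len(a, b, bits=BITS):
    """Number of leading bits that two IDs have in common."""
    return bits - (a ^ b).bit_length()


target = 0b101
path = [0b001, 0b110, 0b100, 0b101]  # hops from the example

for node in path:
    print(format(node, "03b"),
          "shared prefix:", shared_prefix_len(node, target),
          "XOR distance:", node ^ target)
```

Each hop moves the query into a sub-tree that shares one more leading bit with the target, so the number of hops is bounded by the ID length and is logarithmic in the number of nodes when IDs are evenly spread.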
Communication Networks II: Peer-To-Peer Networking
Note: many images were taken and adapted from contributions to the book “P2P Systems and Applications”, ed. Steinmetz, Wehrle
[Figure: DHT ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 mapped onto Internet hosts (e.g. peer-to-peer.info, planet-lab.org, berkeley.edu); lookup of H(“my data”) = 3107]
Structured Overlay Networks
- Principle
  - Peers and objects have identifiers
  - Objects are stored on peers according to their ID
  - Distributed indexing points to the object location
  - Lookup / addressing: retrieve the object identified by a given identifier
- Steps
  1. Publish contents at the responsible peer
  2. “Routing” to / lookup of the desired object
  3. P2P communication: get the contents
Unstructured centralized P2P networks
Simple strategy: a central server stores information about locations
1. Node A (provider) tells server S that it stores item D: “A stores D”
2. Node B (requester) asks server S for the location of D: “Where is D?”
3. Server S tells B that node A stores item D: “A stores D”
4. Node B requests item D from node A; transmission: D → node B
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Unstructured distributed P2P networks
Fully decentralized approach
- No information about the location of data at intermediate systems
- Necessity for a broad search
1. Node B (requester) asks neighboring nodes for item D: “B searches D”
2. Nodes forward the request to further nodes (breadth-first search / flooding)
3. Node A (provider of item D) answers “I store D” and sends D to the requesting node B; transmission: D → node B
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle

More Related Content

PDF
Peer to peer network schemes and finding algorithms
PPT
Lecture - Network Technologies: Peer-to-Peer Networks
PPTX
P2P Lookup Protocols
PPTX
Bit torrent a revolution in p2p
PPTX
Citcism on Peer to peer networking
PDF
Peer to peer Networks
PPT
Peerto Peer Networks
PPT
P2p Peer To Peer Introduction
Peer to peer network schemes and finding algorithms
Lecture - Network Technologies: Peer-to-Peer Networks
P2P Lookup Protocols
Bit torrent a revolution in p2p
Citcism on Peer to peer networking
Peer to peer Networks
Peerto Peer Networks
P2p Peer To Peer Introduction

What's hot (20)

PPTX
Peer To Peer Networking
PDF
Peer-to-Peer Systems
PPTX
Peer To Peer Protocols
PPT
P2P Seminar
PPTX
Peer Sim & P2P
PPTX
Final peersimp pt
PDF
International Journal of Computational Engineering Research(IJCER)
PDF
Introduction To Computer Networks
PPTX
Computer Network basic
PPTX
Network Topology
PDF
Computer networks : intro to networking, pros and cons, uses, network edges :...
PPTX
Overlay network
PPTX
Open Systems Interconnection
PDF
Computer networking 1
PPT
Computer network basics
PDF
OSI Model
PDF
Network Topology
Peer To Peer Networking
Peer-to-Peer Systems
Peer To Peer Protocols
P2P Seminar
Peer Sim & P2P
Final peersimp pt
International Journal of Computational Engineering Research(IJCER)
Introduction To Computer Networks
Computer Network basic
Network Topology
Computer networks : intro to networking, pros and cons, uses, network edges :...
Overlay network
Open Systems Interconnection
Computer networking 1
Computer network basics
OSI Model
Network Topology
Ad

Viewers also liked (12)

KEY
P2P Supernodes
PPT
Introduction to Peer-to-Peer Networks
PDF
Webtuesday Zurich
PPTX
Content addressable network(can)
PDF
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
PDF
CS4344 Lecture 6: Interest Management in P2P Architecture
PDF
Effective Web Application Development with Apache Sling
PPTX
Overlay networks ppt
PPTX
Peer to peer system
PPT
Synchronization in distributed systems
PPTX
Ppt of routing protocols
P2P Supernodes
Introduction to Peer-to-Peer Networks
Webtuesday Zurich
Content addressable network(can)
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
CS4344 Lecture 6: Interest Management in P2P Architecture
Effective Web Application Development with Apache Sling
Overlay networks ppt
Peer to peer system
Synchronization in distributed systems
Ppt of routing protocols
Ad

Similar to QuaP2P P2P Tutorial 2006 (20)

PPTX
Peer to peer data management
PPT
Peer to Peer services and File systems
PPT
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
PDF
Textual based retrieval system with bloom in unstructured Peer-to-Peer networks
PDF
The International Journal of Engineering and Science (IJES)
PDF
Flexible Bloom for Searching Textual Content Based Retrieval System in an Uns...
PDF
Flexible bloom for searching textual content
PDF
Flexible bloom for searching textual content
PDF
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
PPT
Agents and P2P Networks
PPTX
02 - Topologies of Distributed Systems
PPT
Advance Computer Networking bachelor of science in computer engineering
PDF
Analysis of threats and security issues evaluation in mobile P2P networks
PPTX
Peer to peer Paradigms
PDF
App for peer-to-peer file transfer
PPTX
Module 1 Internet of Things Introduction
PDF
This chapter introduces about the Architectures of Distributed Systems
PDF
This chapter gives the Architectures Introduction about distributed systems
PPTX
Network Measurement and Monitori - Assigment 1, Group3, "Classification"
PPTX
Peer To Peer.pptx
Peer to peer data management
Peer to Peer services and File systems
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
Textual based retrieval system with bloom in unstructured Peer-to-Peer networks
The International Journal of Engineering and Science (IJES)
Flexible Bloom for Searching Textual Content Based Retrieval System in an Uns...
Flexible bloom for searching textual content
Flexible bloom for searching textual content
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
Agents and P2P Networks
02 - Topologies of Distributed Systems
Advance Computer Networking bachelor of science in computer engineering
Analysis of threats and security issues evaluation in mobile P2P networks
Peer to peer Paradigms
App for peer-to-peer file transfer
Module 1 Internet of Things Introduction
This chapter introduces about the Architectures of Distributed Systems
This chapter gives the Architectures Introduction about distributed systems
Network Measurement and Monitori - Assigment 1, Group3, "Classification"
Peer To Peer.pptx


QuaP2P P2P Tutorial 2006

  • 1. Overview on P2P Principles Kalman Graffi DFG Research Group QuaP2P Technische Universität Darmstadt
  • 2. Overview Motivation Peer-to-Peer principle Overlay networks Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord Content Addressable Network
  • 3. Overview Motivation Peer-to-Peer principle Overlay networks Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord
  • 4. Motivation A huge number of nodes participating in the network Having resources to share With demands for resources they do not have Main question: Which of the nodes keep the resources I want? Peer-to-Peer (P2P) as solution: P2P offers mechanisms to find / look up what I want Therefore: Build an additional overlay network After finding the node providing the desired service: Communicate directly from peer to peer
  • 5. Peer-to-Peer Properties Heterogeneous peers Reliable? Permanent? Connectivity of individual peers cannot be assumed Peers offer and consume services and resources Services are exchanged between any participating peers Peers (=end-systems) form an overlay network Peers have significant autonomy Self-organizing system Overlay Connection Service Delivery
  • 6. Overlay Networks. Picture adapted from Traversat et al., Project JXTA virtual network. Overlay network: a network built ON TOP of one or more existing networks (e.g. an IP network); adds an additional layer of abstraction / indirection / virtualization; provides sophisticated services (search, look-up). Both the P2P overlay and the IP network have their own addressing scheme, provide routing functionality, and are based on the end-to-end principle. Figure labels: Peers, Overlay Network, Underlay Networks (TCP/IP)
  • 7. Overlay Networks: Properties. Advantages: the new layer speeds up search/lookup of requested information; allows for bootstrapping; makes use of the existing environment by adding a new layer; not all nodes have to participate in maintaining the overlay (but free riding is still a problem). Disadvantages: Overhead: additional layer in the networking stack. Complexity: layering does not eliminate complexity, it only manages it; misleading behavior and unintended interaction between layers. Redundancy: features may be available at several layers. Two types of P2P overlay networks: Unstructured and Structured
  • 8. Structured and Unstructured P2P Systems. Unstructured P2P networks: objects have no special identifier; the location of a desired object is not known a priori; each peer is responsible only for its own objects. Search: find all (or some) objects in the P2P network that fit given criteria. Structured P2P networks: peers and objects have identifiers, strict topology; objects are stored on peers according to their ID: responsibleFor(ObjID) = PeerID; distributed indexing points to the object location. Lookup / Addressing: retrieve the object that is identified by a given identifier.
  • 9. Unstructured P2P: Centralized (from R. Schollmeier and J. Eberspächer, TU München). Figure headings: 1st Generation, 2nd Generation, 3rd Generation; Unstructured P2P (Centralized P2P, Pure P2P, Hybrid P2P) and Structured P2P (DHT-Based).
      - Centralized P2P: 1. Central entity is necessary to provide the service 2. Central entity is some kind of index database 3. Search costs: O(1) 4. Costs for state: O(n) 5. For: Searches
      - Pure P2P: 1. Any terminal entity can be removed without loss of functionality 2. No central entities, fully distributed 3. Search costs: O(n) 4. Costs for state: O(1) 5. For: Searches
      - Hybrid P2P: 1. Any terminal entity can be removed without loss of functionality 2. Dynamic central entities for faster search 3. Search costs: variable 4. Costs for state: variable 5. For: Searches
      - DHT-Based: 1. Any terminal entity can be removed without loss of functionality 2. No central entities, fully distributed 3. "Fixed" connections in the overlay network 4. Search costs: O(log n) 5. Costs for state: O(log n) 6. For: Lookup
  • 10. Unstructured Centralized P2P Systems. Principle: a central server stores information about locations. 1. Publish contents at own peer, tell server: node A (provider) tells server S that it stores item D. 2. Send query for desired object: requester B asks server S for the location of D; server S tells B that node A stores item D. 3. P2P communication, get contents: node B requests item D from node A.
  • 11. Unstructured Centralized P2P Systems. Advantages: search complexity of O(1) ("just ask the server"); complex and fuzzy queries are possible; simple, fast, and finds all objects. Problems: no robustness: the server is a single point of failure (SPOF); no self-organization; no intrinsic scalability: O(N) network and system load; non-linearly increasing maintenance cost, in particular for achieving high availability and scalability. But overall: best principle for small and simple applications
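The publish / query / fetch steps of the centralized approach can be sketched as a small in-memory index. This is only an illustrative sketch: the class `IndexServer` and its method names are hypothetical and do not come from the slides or from any real system.

```python
from collections import defaultdict


class IndexServer:
    """Hypothetical central index: maps an object name to the peers storing it."""

    def __init__(self):
        self._index = defaultdict(set)  # object name -> set of peer addresses

    def publish(self, peer, name):
        # Step 1: a provider tells the server that it stores the object.
        self._index[name].add(peer)

    def query(self, name):
        # Step 2: one request to the server answers the query ("just ask the server").
        return sorted(self._index.get(name, ()))


server = IndexServer()
server.publish("A", "D")      # node A stores item D
print(server.query("D"))      # node B then fetches D directly from a listed peer
```

The server keeps state for every published object, which is the "costs for state: O(n)" side of the O(1) search cost.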
  • 12. Unstructured P2P: Pure / Distributed (from R. Schollmeier and J. Eberspächer, TU München). Repeats the comparison of Centralized, Pure, Hybrid and DHT-Based P2P shown on slide 9
  • 13. Unstructured Distributed P2P Systems. Fully distributed approach: central systems are vulnerable, do not scale, and have unbalanced costs; unstructured P2P systems follow the opposite approach. No global information is available about the location of an item; information is stored only at the respective node providing it. Retrieval of data: no routing information for content; necessity to ask as many systems as possible. Approaches: Flooding: high traffic load on the network, does not scale. Highest effort for searching: quick search through large areas, but many messages needed for unique identification
  • 14. Unstructured Distributed P2P Systems Characteristics All peers are equal (in their roles) Search mechanism is provided by cooperation of all peers Local view on the network Overlay Network Service delivery Tasks to solve: Connecting to the network No central index server. Joining strategies needed To join: have to know at least 1 peer in network Local view on network => advertisements needed Search different search strategies available providing different benefits & drawbacks Service delivery Establish connection to other node Peer to peer communication
  • 15. Unstructured Distributed P2P Example. Principle: no information about the objects is spread. 1. Publish contents at own peer: nodes A and B (providers) store the object and tell no one. 2. Search desired object: node C, searching for the object, floods the network; nodes A and B send a reply. 3. P2P communication, get contents: node C requests the object from a subset of the repliers.
  • 16. Properties of Distributed P2P Systems Benefits: Robustness: Every peer is dispensable Switch off peer => no effect for network Balanced costs: each peer contributes the same Self organization Drawbacks: Slow and expensive search (flood the network) Finding all objects fitting to search criteria is not guaranteed Object out of reach for search query
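Flooding and its drawbacks (many messages, objects out of reach) can be illustrated with a breadth-first search over a toy overlay. This is a minimal sketch under assumptions: the overlay graph, the node names X, Y, Z and the hop limit `ttl` are invented for illustration (a TTL is only mentioned on the reserve slides about trade-offs).

```python
from collections import deque


def flood_search(neighbors, holders, start, ttl):
    """Flood a query from `start`; return (providers reached, messages sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = set(), 0
    while frontier:
        node, hops_left = frontier.popleft()
        if node in holders and node != start:
            hits.add(node)               # this provider would send a reply
        if hops_left == 0:
            continue                     # query expires here
        for nb in neighbors[node]:
            messages += 1                # every forwarded copy costs a message
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, hops_left - 1))
    return hits, messages


# Toy overlay: C - X - A and C - Y - Z - B; A and B store the object.
overlay = {"C": ["X", "Y"], "X": ["C", "A"], "A": ["X"],
           "Y": ["C", "Z"], "Z": ["Y", "B"], "B": ["Z"]}
print(flood_search(overlay, {"A", "B"}, "C", ttl=2))
print(flood_search(overlay, {"A", "B"}, "C", ttl=3))
```

With the smaller hop limit, provider B is out of reach of the query; raising the limit finds it at the price of more messages.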
  • 17. Unstructured Hybrid P2P Systems (from R. Schollmeier and J. Eberspächer, TU München). Repeats the comparison of Centralized, Pure, Hybrid and DHT-Based P2P shown on slide 9
  • 18. Unstructured Hybrid P2P Systems Combine best of both worlds: Robustness by distributed indexing Fast searches by server queries How it works: Supernodes are mini servers / super peers Normal peers: have only overlay connections to supernodes Use supernodes as servers for queries Supernodes: queries are flooded in supernodes subnetwork Advantages: More robust than centralized solutions Faster searches than in pure P2P systems Disadvantages: Need algorithms to choose reliable supernodes
  • 19. Unstructured Hybrid P2P Systems Example: Gnutella 0.6 from R.Schollmeier and J.Eberspächer, TU München
  • 20. Structured DHT-based P2P Systems (from R. Schollmeier and J. Eberspächer, TU München). Repeats the comparison of Centralized, Pure, Hybrid and DHT-Based P2P shown on slide 9
  • 21. Distributed Hash Table: Steps of Operation Beginning: Mapping nodes and data onto same address space Peers and objects are addressed using flat IDs. Nodes are responsible for data in certain parts of the address space: responsibleFor(ObjectID) = PeerID Association of data to nodes may change (churn) Later: Storing / Looking up data in the DHT Retrieving data = routing to the responsible node Responsible node not necessarily known in advance Deterministic statement about availability of data All nodes maintain routing information to other nodes Limitations Maintenance of routing information required Load balancing problematic No fuzzy search supported
  • 22. Structured Overlay Networks: Example. Principle: the location of the objects is found via routing. 1. Publish link at responsible peer: node A (provider) advertises the object at the responsible peer B; the advertisement is routed to B. 2. "Routing" to / lookup of desired object: node C, looking for the object, sends a query to the network; the query is routed to the responsible node. 3. P2P communication, get link to object: node B replies to C by sending the contact information of A.
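The rule responsibleFor(ObjectID) = PeerID and the publish / lookup steps can be sketched as follows. The sketch rests on assumptions: a 12-bit ID space, SHA-1 reduced modulo 2^12, and the Chord-style rule that the first peer ID at or after the object ID is responsible. The class `ToyDHT` is hypothetical, and its dictionaries stand in for real peers and for overlay routing.

```python
import hashlib
from bisect import bisect_left

M = 12  # assumed number of ID bits, giving IDs 0..4095


def obj_id(name):
    """Map a name into the ID space: SHA-1(name) mod 2^M."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2 ** M


class ToyDHT:
    def __init__(self, peer_ids):
        self.peers = sorted(peer_ids)
        self.store = {p: {} for p in self.peers}  # per-peer key/value index

    def responsible_for(self, oid):
        # First peer ID >= object ID, wrapping around at the end of the ring.
        i = bisect_left(self.peers, oid)
        return self.peers[i % len(self.peers)]

    def publish(self, name, provider):
        # Indirect storage: the responsible peer keeps only a link to the provider.
        oid = obj_id(name)
        self.store[self.responsible_for(oid)][oid] = provider

    def lookup(self, name):
        oid = obj_id(name)
        return self.store[self.responsible_for(oid)].get(oid)


dht = ToyDHT([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])
dht.publish("my data", ("134.2.11.68", 4000))
print(dht.lookup("my data"))
```

Because a key always maps to exactly one responsible peer, a lookup can state deterministically whether the data is available, but it only works for exact names (no fuzzy search).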
  • 23. Chord: Network Topology. Uses SHA-1 to map an IP address / object name to a 160-bit ID. Basic ring topology: circular key space (mod 2^n), successor / predecessor, link to the ring successor. Enhanced topology: the k-th finger of peer n is a shortcut pointing to the peer responsible for object ID (n + 2^k); O(log(N)) fingers lead to a lookup operation of O(log(N)). Fingers point to peers with ObjectIDs increasing exponentially. Here: 709 + 2^k = …, 965, 1221, 1733, 2757. Example ring in the figure (peer ID: responsible ID range): 611: 3486-4095 and 0-611; 659: 612-659; 709: 660-709; 1008: 710-1008; 1622: 1009-1622; 2011: 1623-2011; 2207: 2012-2207; 2682: 2208-2682; 2906: 2683-2906; 3485: 2907-3485
  • 24. Chord: Addressing Content. A query contains the hash value of the queried content. Use fingers to locate the destination faster: on each step the distance to the destination is at least halved. Without fingers: no shortcuts, walk the circle. Example: node 1008 queries item 3000. Step 1: forward to the peer responsible for 1008 + 1024 (peer 2207). Step 2: forward to the peer responsible for 2207 + 512 (peer 2906). Step 3: the peer responsible for 3000 (peer 3485) is found. The figure shows the same ring as slide 23.
  • 25. Chord: Join Procedure. Request to join the Chord ring: the new node (e.g. 1289) contacts a member of the ring (e.g. 2906); the contacted node routes the query to the responsible node (1622); the responsible node (1622) contacts the new node. Then: 1. Set/contact successor. 2a. Set new predecessor. 2b. Redistribute indexing information (e.g. 1009-1289). 3. Update successor of predecessor. 4. Build fingers: the fingers of peer n point to the peers responsible for ObjectID n + 2^k; thus, log(N) fingers are built. Ring after the join: the new peer 1289 is responsible for 1009-1289 and peer 1622 for 1290-1622; all other ranges are unchanged.
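The finger rule and the lookup example from the Chord slides can be reproduced with a few lines. This is a sketch under assumptions: the ring is the ten-peer example from the slides with a 12-bit ID space, and `responsible` uses a global sorted list purely to build the finger tables, which a real Chord peer would learn through the join procedure instead. The routing loop follows the usual "farthest finger that still precedes the key" rule; the function names are hypothetical.

```python
from bisect import bisect_left

M = 12                      # ID bits; the example ring uses IDs 0..4095
SPACE = 2 ** M
NODES = [611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485]  # sorted


def responsible(key):
    """Peer responsible for key: first peer ID >= key, wrapping around."""
    i = bisect_left(NODES, key % SPACE)
    return NODES[i % len(NODES)]


def fingers(n):
    """k-th finger of peer n = peer responsible for n + 2^k, k = 0..M-1."""
    return [responsible(n + 2 ** k) for k in range(M)]


def between(x, a, b, inclusive_end=False):
    """True if x lies clockwise in (a, b), or in (a, b] if inclusive_end."""
    dx, db = (x - a) % SPACE, (b - a) % SPACE
    return 0 < dx < db or (inclusive_end and dx == db and dx != 0)


def lookup(start, key):
    """Return the list of peers visited while routing from start to key."""
    node, path = start, [start]
    while True:
        succ = fingers(node)[0]
        if between(key, node, succ, inclusive_end=True):
            return path + [succ]          # successor is responsible for key
        # Farthest finger that still precedes the key; else step to successor.
        node = next((f for f in reversed(fingers(node))
                     if between(f, node, key)), succ)
        path.append(node)


print(fingers(709)[8:])     # fingers for 709 + 256, + 512, + 1024, + 2048
print(lookup(1008, 3000))   # peers visited for "node 1008 queries item 3000"
```

Replacing the finger choice by `succ` alone would walk the circle peer by peer, which is the "without fingers" case on the slide.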
  • 26. Summary Motivation Peer-to-Peer principle Overlay networks Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord
  • 27. Questions? Thank you for your attention. Any questions? ?
  • 28. Content Addressable Network (CAN). A d-dimensional hash table in a cyclic coordinate space: d hash functions, one per coordinate; PeerID(p) = (h1(p), h2(p), ..., hd(p)); ObjID(obj) = (h1(obj), ..., hd(obj)). CAN nodes: each is responsible for a distinct rectangular zone of the space and stores the data that hash into its zone; together the peers cover the entire space. Routing: a peer knows the IP addresses and zone ranges of its neighbors; peers can communicate only with their neighbors. Properties: routing table size O(d); guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of peers. Figure: 2-dimensional CAN on a coordinate grid (0-7 on each axis) with peers n1-n5 and files f1-f4
  • 29. CAN: New Peer Join. The new node has to acquire a zone to be responsible for. Steps: the node randomly chooses a point P in the space; the zone that includes P is split into two halves. Example: new node n6 requests to join; it contacts a node (e.g. n5) and selects point P; n5 routes the join query to n1; n1 splits its zone; n6 is responsible for the new zone (at point P). Figure (modified from another presentation): 2-dimensional CAN on a coordinate grid (0-7 on each axis) with peers n1-n5, files f1-f4, point P and the new peer n6
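The zone split on join and the step bound can be sketched in two dimensions. Everything here is an illustrative assumption: the `Zone` class, splitting along the longest side, and letting the joining peer take the half that contains P. The slides only state that the zone containing P is split into two halves and that the new node becomes responsible for the new zone at P.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    lo: tuple    # lower corner (inclusive)
    hi: tuple    # upper corner (exclusive)
    owner: str

    def contains(self, p):
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))


def owner_of(zones, p):
    """Peer whose zone covers point p."""
    return next(z.owner for z in zones if z.contains(p))


def join(zones, new_owner, p):
    """Split the zone containing p in half; the new peer takes the half with p."""
    old = next(z for z in zones if z.contains(p))
    axis = max(range(len(old.lo)), key=lambda a: old.hi[a] - old.lo[a])
    mid = (old.lo[axis] + old.hi[axis]) / 2
    lower = Zone(old.lo, old.hi[:axis] + (mid,) + old.hi[axis + 1:], old.owner)
    upper = Zone(old.lo[:axis] + (mid,) + old.lo[axis + 1:], old.hi, old.owner)
    (upper if upper.contains(p) else lower).owner = new_owner
    zones.remove(old)
    zones += [lower, upper]


def max_steps(d, n):
    """Upper bound on routing steps quoted on the slide: d * n^(1/d)."""
    return d * n ** (1 / d)


zones = [Zone((0, 0), (8, 8), "n1")]   # n1 initially covers the whole space
join(zones, "n2", (6, 2))
join(zones, "n3", (1, 6))
print([(z.owner, z.lo, z.hi) for z in zones])
print(max_steps(2, 16))
```

An object would be placed the same way as P: hash it to a point with one hash function per coordinate and store it at the owner of the zone covering that point.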
  • 30. Summary Motivation Peer-to-Peer principle Overlay networks Unstructured P2P networks Centralized P2P network Distributed P2P network Structured DHT-based P2P networks Chord Content Addressable Network
  • 34. After this slide: slides in reserve
  • 36. Requirements for Overlay Networks. Fault-tolerance: resilience of the connectivity when failures are encountered through the arbitrary departure of peers. Heterogeneity: considering variations in physical capabilities and peer behavior (e.g. file fishing). Fairness: evenly distributing workload across nodes. Security: ability of a system to manage, protect and distribute sensitive information. Privacy: degree to which a system or component allows for (or supports) anonymous transactions
  • 37. Requirements for Overlay Networks: Trade-offs Time – Space e.g. local information vs. complete replication of indices Security – Privacy e.g. fully logged operations vs. totally untraceable Efficiency – Completeness e.g. exact key-based matching vs. partial matching (use of wildcards) Scope – Network load with TTL (time to live) e.g. TTL based requests vs. exhaustive search Efficiency – Autonomy e.g. hierarchical vs. pure P2P overlays Reliability – Low maintenance overhead e.g. deterministic vs. probabilistic operations
  • 38. Centralized P2P Networks Central index server, maintaining index: What: object name, file name, criteria (ID3) … Where (IP address, Port) Search engine, combining both information Global view on the network Normal peer, maintaining the objects: Each peer maintains only its own objects decentralized storage (content at the edges) file transfer between clients (decentralized) Issues: Unbalanced costs: central server is bottleneck Security: server is single point of failure Central Server Overlay Network
  • 39. Search in Centralized P2P Networks Search Peers contact central server asking for objects which fulfill some criteria Central server answers with list of addresses of peers that contain objects with these criteria Peer contacts peers containing desired objects Transferring object / providing service P2P Central Server 3,4: Service Delivery P2P 1: Query server 2: Server answers
  • 40. Step 1: Addressing in Distributed Hash Tables. Mapping of content/nodes into a linear space, usually 0, …, 2^m - 1 (>> number of objects to be stored). Mapping of data and nodes into an address space (e.g. 0 to 2^m - 1) with a hash function, e.g. Hash(String) mod 2^m: H("my data") → 2313. Association of parts of the address space with DHT nodes. Often, the address space is viewed as a circle. Figure: H(Node X) = 2906, H(Node Y) = 3485, data item "D" with H("D") = 3107; address ranges shown: 611-709, 1008-1621, 1622-2010, 2011-2206, 2207-2905, 2906-3484, 3485-610. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 41. Step 2: Association of Address Space with Nodes Arrangement of the range of values Each node is responsible for part of the value range Often with redundancy (overlapping of parts) Continuous adaptation Real (underlay) and logical (overlay) topology are (mostly) uncorrelated The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. By Steinmetz, Wehrle Node 3485 is responsible for data items in range 2907 to 3485 (in case of a Chord-DHT) Logical view of the Distributed Hash Table Mapping on the real topology 2207 2906 3485 2011 1622 1008 709 611
  • 42. Step 3: Locating a Data Item Locating the data content-based routing Goal: Small and scalable effort O(1) with centralized hash table But: Management of a centralized hash table too costly (server) Minimum overhead with distributed hash tables O(log N): DHT hops to locate object O(log N): number of keys and routing information per node (N = # nodes) The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. By Steinmetz, Wehrle
  • 43. Step 4: Routing to a Data Item. Routing to a key/value pair: start the lookup at an arbitrary node of the DHT; route to the requested data item (key). Key = H("my data") = 3107; value = pointer to the location of the data: (3107, (ip, port)). Node 3485 manages keys 2907-3485. Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 and an arbitrary initial node. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 44. Step 5: Data Retrieval: Usage of Located Resource. Accessing the content: the key/value pair is delivered to the requester; node 3485 sends (3107, (ip, port)) to the requester. The requester analyzes the key/value tuple and, in case of indirect storage, requests the data from its actual location once that is known: Get_Data(ip, port). Example key: H("my data") = 3107. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 45. (Step 6) Where is the Data Located? Association of data with IDs. Direct storage: the data item D itself is stored in the DHT at the node responsible for H_SHA-1("D") = 3107. Indirect storage: the responsible node stores only a reference (item D: 134.2.11.68), and D itself stays at the providing host 134.2.11.68. Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485
  • 46. Distributed Hash Table: Insert and Delete a Node. Join of a new node: calculation of the node ID; the new node contacts the DHT via an arbitrary node; assignment of a particular hash range; copying of the key/value pairs of the hash range (usually with redundancy); binding into the routing environment. Figure: ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; the joining host 134.2.11.68 obtains ID 3485. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 47. Node Failure and Node Departure. Failure of a node: use of redundant key/value pairs (if a node fails); use of redundant / alternative routing paths; a key/value pair is usually still retrievable if at least one copy remains. Departure of a node: partitioning of the hash range to neighbor nodes; copying of key/value pairs to the corresponding nodes; unbinding from the routing environment. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 48. Summary of DHTs: Properties Hash buckets distributed over nodes Nodes form an overlay network Route messages in overlay to find responsible node Routing scheme in the overlay network is the difference between different DHTs DHT behavior and usage: Node knows “object” name and wants to find it Unique and known object names assumed Node routes a message in overlay to the responsible node Responsible node replies with “object” Semantics of “object” are application defined 3.6
  • 49. Chord: Join Procedure (1) Request to join the Chord ring New Peer 1289 1. Contact a member of the ring 2. Route the query in the ring 3. Provide new peer’s successor 2207 2012-2207 2906 2683-2906 3485 2907-3485 2011 1623-2011 1622 1009-1622 1008 710-1008 709 660-709 659 612-659 2682 2208-2682 611 3486-… 0-611
  • 50. CAN: New Peer Join. The new node has to acquire a zone to be responsible for. Steps: the node randomly chooses a point P in the space; the zone that includes P is split into two halves. Example: node n6 requests to join; it contacts n4 and selects point P; n4 routes the join query to n1; n1 splits its zone; n6 is responsible for the new zone (at point P). Figure: 2-dimensional CAN on a coordinate grid (0-7 on each axis) with peers n1-n5, files f1-f4, point P and the new peer n6
  • 51. CAN: Routing. Two CAN nodes are neighbors if their zones overlap along d-1 dimensions and abut (i.e. directly adjoin) along one dimension. A node knows the IP addresses of its neighbors and the coordinates of the neighboring zones; nodes can communicate only with their neighbors. Properties: routing table size O(d); guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes. Lookup example: node n5 is looking for file f3. Figure: 2-dimensional CAN on a coordinate grid (0-7 on each axis) with peers n1-n5 and files f1-f6
  • 53. Kademlia. Widely deployed DHT (used e.g. in "eMule"). Features: DHT-based overlay network using the XOR metric, e.g. D(110101, 010001) = 4 + 32 = 36; simple operation; symmetrical (A->B == B->A), so lookup messages can be used to maintain the overlay network (this cannot be applied to the asymmetrical Chord); multiple routing possibilities: parallel asynchronous queries, overcome faulty nodes. Properties: scalable: logarithmic complexity in node degree (O(log(N))) and lookup steps (O(log(N))); efficient: low maintenance cost; robust
  • 54. Kademlia: System Description. Nodes are leaves in a binary tree; the position is determined by the shortest unique prefix of the node ID. For every node (e.g. node "001"): divide the binary tree into a series of successively lower sub-trees that do not contain that node; at least one contact node is required in each sub-tree; large sub-trees provide many neighbor alternatives
  • 55. Kademlia: Lookup Operation. Every hop forwards the query to a smaller sub-tree around the target. Example: node "001" routes to "101". 1. Node "001" needs a contact in the 1-(*) sub-tree (e.g. "110"). 2. Node "110" needs a contact in the 10-(*) sub-tree (e.g. "100"). 3. Node "100" forwards the query to the destination "101"
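The XOR metric and the three-hop example can be checked with a short sketch. The contact lists below are assumptions chosen to match the slide's example (one contact per sub-tree that does not contain the node itself), and `route` is a simplified sequential lookup, not Kademlia's parallel asynchronous query procedure.

```python
def xor_distance(a, b):
    """Kademlia distance between two IDs: bitwise XOR read as an integer."""
    return a ^ b


def route(contacts, start, target):
    """Greedily forward to the known contact closest to the target (XOR)."""
    node, path = start, [start]
    while node != target:
        nxt = min(contacts[node], key=lambda c: xor_distance(c, target))
        if xor_distance(nxt, target) >= xor_distance(node, target):
            break                      # no closer contact known
        node = nxt
        path.append(node)
    return path


# Assumed contact lists for 3-bit IDs, one contact per foreign sub-tree.
contacts = {
    0b001: [0b000, 0b011, 0b110],
    0b110: [0b111, 0b100, 0b001],
    0b100: [0b101, 0b110, 0b001],
    0b101: [0b100, 0b111, 0b001],
}

print(xor_distance(0b110101, 0b010001))                 # 4 + 32
print([format(n, "03b") for n in route(contacts, 0b001, 0b101)])
```

Each hop fixes at least one more leading bit of the target ID, which is the "smaller sub-tree around the target" described on the slide.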
  • 56. Communication Networks II: Peer-To-Peer Networking. Note: many images were taken and adapted from contributions to the book "P2P Systems and Applications", ed. Steinmetz, Wehrle. Figure: DHT ring with nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485, the example key H("my data") = 3107, and example hosts (7.31.10.25, 12.5.7.31, 95.7.6.10, 86.8.10.18, 61.51.166.150, peer-to-peer.info, planet-lab.org, berkeley.edu)
  • 57. Structured Overlay Networks. Principle: peers and objects have identifiers; objects are stored on peers according to their ID; distributed indexing points to the object location. Lookup / Addressing: retrieve the object that is identified by a given identifier. 1. Publish contents at responsible peer. 2. "Routing" to / lookup of desired object. 3. P2P communication, get contents.
  • 60. Unstructured centralized P2P networks. Simple strategy: a central server stores information about locations. 1. Node A (provider) tells server S that it stores item D ("A stores D"). 2. Node B (requester) asks server S for the location of D ("Where is D?"). 3. Server S tells B that node A stores item D ("A stores D"). 4. Node B requests item D from node A; transmission: D → node B. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle
  • 61. Unstructured distributed P2P networks. Fully decentralized approach: no information about the location of data at intermediate systems; necessity for a broad search. 1. Node B (requester) asks neighboring nodes for item D ("B searches D"). 2. Nodes forward the request to further nodes (breadth-first search / flooding). 3. Node A (provider of item D, "I store D") sends D to the requesting node B; transmission: D → node B. The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle

Editor's Notes

  • #24: Not a standard protocol, but the best known one in teaching and the easiest to explain. It hardly ever occurs in the wild, though. The most frequently cited.
  • #37: Many measurement studies (e.g. Tran-Gia, Saroiu, etc.) have observed that the rate at which peers join and leave P2P systems is very high. This raises additional concerns, especially in cases where peers are assigned particular responsibilities and the connectivity of the overlay is not high enough to ensure that no partitioning of the system will take place.
Overlay network design should not ignore the heterogeneity in node capabilities and behavior. Schemes that require homogeneous components either reduce the system capabilities to those achievable by the weakest components, or must expect faulty/inefficient operation from the least capable nodes. Moreover, the observed variation in node behavior (e.g. up-time patterns) should be taken into account in the design of the overlay to increase the efficiency of the system.
Load balance is the extent to which the load is evenly spread across nodes. The accounted load consists of the effort required for the basic overlay operations, e.g. maintenance, routing, indexing, caching, etc. Designing an overlay that avoids communication hot spots can increase the performance and the fault-tolerance of the overall system. On the other hand, in a heterogeneous environment not all nodes are capable of offering the same amount of resources. A fair solution should provide incentives and a weighted balance between the resources contributed and the overlay services consumed.
Security is the ability of a system to manage, protect and distribute sensitive information. In the context of P2P overlays, security issues are mainly raised by the presence of malicious peers, which do not forward received search requests or forward them in the wrong direction. Additionally, selfish peers can behave in a way that has similar results.
Anonymity is the degree to which a system or component allows for (or supports) anonymous transactions. This is a special requirement for certain applications that require anti-censorship features or increased privacy for the participating users. Of course, misuse of those systems is always an issue (e.g. they can be used to share illegal or inappropriate content).
(Introduce next slide) Meeting the complete set of the aforementioned requirements is not a trivial task. A number of trade-offs appear while designing P2P search services.
  • #38: A trade-off that is common to both distributed and local search (although different techniques apply in each case) is the one between the time required to find the requested information and the space required to store the information. In distributed systems, communication among peers is the most costly operation with respect to time, so we mainly consider the number of messages exchanged and not local search operations on data structures. An example of a technique that favors time over space is the complete replication of the indexing information on every peer: no communication is required (at least for searching). An example of a technique that favors space over time is to assign every peer indexing responsibilities only for its local content: every peer has to contact every other peer to find all the information it searches for (Gnutella approach).
Another interesting trade-off appears between the requests for security and privacy. Many security techniques require the logging of detailed information on the interactions among peers, which makes it easy to trace a peer's operations. The opposite effect appears in systems that provide anonymity for users' actions.
More specifically for search operations, as already mentioned, two main operations have been proposed: looking up information that can be mapped to a simple key, and searching for a set of results based on a more complex description. While the first approach can be very efficient in terms of workload and latency, the latter offers richer functionality, but covering all matching items comes at a high communication cost.
To address the high communication cost, TTL-based solutions have been proposed to limit the network load. In that case, the overlay is not searched exhaustively, and some matching content may not be reported to the requestor. More advanced techniques can provide alternative trade-offs among these factors.
P2P systems are supposed to be composed of autonomous entities. For large-scale systems, however, this comes at a very high cost, either for search operations or for the maintenance of the overlay. For this reason, hierarchical alternatives have been proposed that introduce the concept of "super-peers". Super-peers are peers with higher responsibilities that serve normal peers in certain aspects of the inter-peer collaboration. This introduces dependencies among the peers and can cause larger problems through (accidental or deliberate) misbehavior of the super-peers.
A different aspect of P2P algorithms is their degree of reliability, which determines the additional operational overhead. This overhead can be required, for example, for the maintenance of the overlay structure that provides that degree of reliability. Deterministic maintenance algorithms require a very specific structure for the overlay. Alternatively, probabilistic approaches do not provide the same level of reliability, but they can provide a certain level on average at a much lower cost, since they tolerate more overlay changes.
(Introduce next slide) Until now we have described the environment in which the P2P search algorithms should operate, the general requirements and the trade-offs introduced. As a next step we will investigate the design dimensions of the most crucial component of a P2P system, the overlay.
  • #52: (This slide contains animation)