Relay discovery and selection for large-scale P2P streaming

View Article: PubMed Central - PubMed

ABSTRACT

In peer-to-peer networks, application relays are commonly used to provide various networking services. Service performance often improves significantly when a relay is selected appropriately based on its network location. In this paper, we study the location-aware relay discovery and selection problem for large-scale P2P streaming networks. In these large-scale, dynamic overlays, discovering a sufficiently large relay candidate set and then selecting one relay with good performance incurs significant communication and computation cost. Network location can be measured directly or indirectly, with tradeoffs among timeliness, overhead and accuracy. Based on a measurement study and the associated error analysis, we demonstrate that indirect measurements, such as King and Internet Coordinate Systems (ICS), achieve only a coarse estimate of peers’ network location, and that methods based purely on indirect measurements cannot lead to good relay selection. Using three publicly available RTT data sets, we also demonstrate that the commonly used “best-out-of-K” selection methodology suffers significant error amplification. We propose a two-phase approach to achieve efficient relay discovery and accurate relay selection: indirect measurements narrow the search down to a small number of high-quality relay candidates, and the final relay selection is refined by direct probing. This two-phase approach can be implemented efficiently using a Distributed Hash Table (DHT). When the DHT is constructed, the node keys carry location information and are generated scalably from indirect measurements, such as ICS coordinates. Relay discovery is then achieved efficiently through DHT-based search. We evaluate various aspects of this DHT-based approach, including the DHT indexing procedure, key generation under peer churn, and message costs.
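The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of Euclidean distance between coordinates as the indirect RTT estimate, the `probe` callback standing in for a direct measurement, and the `shortlist_size` parameter are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Coord = Tuple[float, ...]


def estimated_rtt(a: Coord, b: Coord) -> float:
    """Indirect estimate (assumption): Euclidean distance between coordinates."""
    return math.dist(a, b)


def two_phase_select(
    src: Hashable,
    dst: Hashable,
    coords: Dict[Hashable, Coord],
    candidates: Sequence[Hashable],
    probe: Callable[[Hashable, Hashable], float],
    shortlist_size: int = 3,
) -> Hashable:
    """Pick a relay for the session src -> dst in two phases.

    Phase 1 ranks all candidates by the coordinate-based estimate of the
    relayed path and keeps only a short list. Phase 2 probes the short
    list directly and returns the candidate with the lowest measured
    relayed RTT. `probe(x, y)` is a hypothetical direct measurement.
    """
    if not candidates:
        raise ValueError("no relay candidates to choose from")

    def estimated_path(r: Hashable) -> float:
        return estimated_rtt(coords[src], coords[r]) + estimated_rtt(coords[r], coords[dst])

    # Phase 1: cheap, coarse filtering using indirect measurements only.
    shortlist: List[Hashable] = sorted(candidates, key=estimated_path)[:shortlist_size]

    # Phase 2: accurate but costly direct probing, limited to the short list.
    return min(shortlist, key=lambda r: probe(src, r) + probe(r, dst))
```

The short list bounds the probing cost: only `shortlist_size` candidates are measured directly, regardless of how many candidates were discovered. A coarse estimate can still exclude a good relay in phase 1, which is why the short list should not be too small.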




Fig 1 (pone.0175360.g001): A pair of streaming peers connected via relays.

We consider a large-scale P2P network for streaming applications with N peers. We assume that peers and relays are also overlay nodes of this P2P network. In such a large-scale overlay, there are many thousands, perhaps millions, of relay candidates, denoted as M. To serve as a relay candidate, a peer must have a public Internet address and sufficient bandwidth capacity. We expect the relay candidates to be a significant portion of the overlay population so as to achieve load balancing and sufficient capacity for many concurrent streaming sessions. A peer usually does not know the complete relay candidate set. We denote by K the number of relay candidates that a peer has discovered. As shown in Fig 1, peer p1 initiates a streaming session with a remote peer p2 using relays. Peers p1 and p2 must mutually agree on an intermediate peer to serve as a relay for NAT traversal or other purposes. We call these prospective peers candidate relays, i.e., Ri (i = 1, …, K). The process of identifying these candidate relays is relay discovery. Relay selection is the process of choosing one or more relays from the set of candidate relays to meet the service requirements of P2P applications. Although multiple relays may be used along a path, we study only the single-relay case for simplicity in this paper. In practice, a peer may engage in many sessions with different peers. Maintaining a large set of candidate relays is a prerequisite for selecting a high-quality relay for a session, because session endpoints may be arbitrarily distributed in an Internet-scale overlay.
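To make the single-relay selection step concrete, here is a minimal Python sketch of choosing one relay among the K candidates a peer has discovered. The text above does not fix a selection metric, so the sketch assumes the relayed round-trip time, i.e. the RTT from p1 to the relay plus the RTT from the relay to p2. The names `relayed_rtt`, `best_out_of_k`, and the `rtt` callback are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Optional

RttFn = Callable[[Hashable, Hashable], float]


def relayed_rtt(rtt: RttFn, p1: Hashable, relay: Hashable, p2: Hashable) -> float:
    """RTT of the path p1 -> relay -> p2 (assumed metric: sum of both legs)."""
    return rtt(p1, relay) + rtt(relay, p2)


def best_out_of_k(
    p1: Hashable,
    p2: Hashable,
    discovered: Iterable[Hashable],
    rtt: RttFn,
) -> Optional[Hashable]:
    """Return the discovered candidate relay with the lowest relayed RTT.

    `discovered` holds the K candidates R_1..R_K known to the peer, which
    is usually only a subset of all relay candidates in the overlay.
    Returns None when no candidate has been discovered.
    """
    return min(
        discovered,
        key=lambda r: relayed_rtt(rtt, p1, r, p2),
        default=None,
    )
```

The quality of the result depends both on how good the K discovered candidates are and on how accurate `rtt` is: if `rtt` returns estimates rather than measurements, the candidate that looks best may not be the best in practice.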

