love li fang fang forever

P2P通信原理-英文版

• _posts

迁移源地址为:《P2P通信原理-英文版》

参考地址: http://vincentchun.iteye.com/blog/375525

Internet Draft B. Ford Document: draft-ford-midcom-p2p-01.txt M.I.T. Expires: April 27, 2004 P. Srisuresh Caymas Systems D. Kegel kegel.com October 2003

          Peer-to-Peer (P2P) communication across middleboxes

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Distribution of this document is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This memo documents the methods used by the current peer-to-peer (P2P) applications to communicate in the presence of middleboxes such as firewalls and network address translators (NAT). In addition, the memo suggests guidelines to application designers and middlebox implementers on the measures they could take to enable immediate, wide deployment of P2P applications with or without requiring the use of special proxy, relay or midcom protocols.

Ford, Srisuresh & Kegel [Page 1]

Internet-Draft P2P applications across middleboxes October 2003

Table of Contents

  1. Introduction ………………………………………….
  2. Terminology …………………………………………..
  3. Techniques for P2P communication over middleboxes ………… 3.1. Relaying ……………………………………….. 3.2. Connection reversal ……………………………… 3.3. UDP Hole Punching ……………………………….. 3.3.1. Peers behind different NATs ……………… 3.3.2. Peers behind the same NAT ……………….. 3.3.3. Peers separated by multiple NATs …………… 3.3.4. Consistent port bindings ………………….. 3.4. UDP Port number prediction ……………………….. 3.5. Simultaneous TCP open …………………………….
  4. Application design guidelines ………………………….. 4.1. What works with P2P middleboxes ……………………. 4.2. Applications behind the same NAT …………………… 4.3. Peer discovery …………………………………… 4.4. TCP P2P applications ……………………………… 4.5. Use of midcom protocol …………………………….
  5. NAT design guidelines …………………………………. 5.1. Deprecate the use of symmetric NATs ………………… 5.2. Add incremental Cone-NAT support to symmetric NAT devices 5.3. Maintaining consistent port bindings for UDP ports ….. 5.3.1. Preserving Port Numbers …………………… 5.4. Maintaining consistent port bindings for TCP ports ….. 5.5. Large timeout for P2P applications ………………….
  6. Security considerations ………………………………..

  7. Introduction

    Present-day Internet has seen ubiquitous deployment of “middleboxes” such as network address translators(NAT), driven primarily by the ongoing depletion of the IPv4 address space. The asymmetric addressing and connectivity regimes established by these middleboxes, however, have created unique problems for peer-to-peer (P2P) applications and protocols, such as teleconferencing and multiplayer on-line gaming. These issues are likely to persist even into the IPv6 world, where NAT is often used as an IPv4 compatibility mechanism [NAT-PT], and firewalls will still be commonplace even after NAT is no longer required.

    Currently deployed middleboxes are designed primarily around the client/server paradigm, in which relatively anonymous client machines actively initiate connections to well-connected servers having stable IP addresses and DNS names. Most middleboxes implement an asymmetric

Ford, Srisuresh & Kegel [Page 2]

Internet-Draft P2P applications across middleboxes October 2003

communication model in which hosts on the private internal network can initiate outgoing connections to hosts on the public network, but external hosts cannot initiate connections to internal hosts except as specifically configured by the middlebox’s administrator. In the common case of NAPT, a client on the internal network does not have a unique IP address on the public Internet, but instead must share a single public IP address, managed by the NAPT, with other hosts on the same private network. The anonymity and inaccessibility of the internal hosts behind a middlebox is not a problem for client software such as web browsers, which only need to initiate outgoing connections. This inaccessibility is sometimes seen as a privacy benefit.

In the peer-to-peer paradigm, however, Internet hosts that would normally be considered “clients” need to establish communication sessions directly with each other. The initiator and the responder might lie behind different middleboxes with neither endpoint having any permanent IP address or other form of public network presence. A common on-line gaming architecture, for example, is for the participating application hosts to contact a well-known server for initialization and administration purposes. Subsequent to this, the hosts establish direct connections with each other for fast and efficient propagation of updates during game play. Similarly, a file sharing application might contact a well-known server for resource discovery or searching, but establish direct connections with peer hosts for data transfer. Middleboxes create problems for peer-to-peer connections because hosts behind a middlebox normally have no permanently usable public ports on the Internet to which incoming TCP or UDP connections from other peers can be directed. RFC 3235 [NAT-APPL] briefly addresses this issue, but does not offer any general solutions.

In this document we address the P2P/middlebox problem in two ways. First, we summarize known methods by which P2P applications can work around the presence of middleboxes. Second, we provide a set of application design guidelines based on these practices to make P2P applications operate more robustly over currently-deployed middleboxes. Further, we provide design guidelines for future middleboxes to allow them to support P2P applications more effectively. Our focus is to enable immediate and wide deployment of P2P applications requiring to traverse middleboxes.

  1. Terminology

In this section we first summarize some middlebox terms. We focus here on the two kinds of middleboxes that commonly cause problems for P2P applications.

Ford, Srisuresh & Kegel [Page 3]

Internet-Draft P2P applications across middleboxes October 2003

Firewall A firewall restricts communication between a private internal network and the public Internet, typically by dropping packets that are deemed unauthorized. A firewall examines but does not modify the IP address and TCP/UDP port information in packets crossing the boundary.

Network Address Translator (NAT) A network address translator not only examines but also modifies the header information in packets flowing across the boundary, allowing many hosts behind the NAT to share the use of a smaller number of public IP addresses (often one).

Network address translators in turn have two main varieties:

Basic NAT A Basic NAT maps an internal host’s private IP address to a public IP address without changing the TCP/UDP port numbers in packets crossing the boundary. Basic NAT is generally only useful when the NAT has a pool of public IP addresses from which to make address bindings on behalf of internal hosts.

Network Address/Port Translator (NAPT) By far the most common, a Network Address/Port Translator examines and modifies both the IP address and the TCP/UDP port number fields of packets crossing the boundary, allowing multiple internal hosts to share a single public IP address simultaneously.

Refer to [NAT-TRAD] and [NAT-TERM] for more general information on NAT taxonomy and terminology. Additional terms that further classify NAPT are defined in more recent work [STUN]. When an internal host opens an outgoing TCP or UDP session through a network address/port translator, the NAPT assigns the session a public IP address and port number so that subsequent response packets from the external endpoint can be received by the NAPT, translated, and forwarded to the internal host. The effect is that the NAPT establishes a port binding between (private IP address, private port number) and (public IP address, public port number). The port binding defines the address translation the NAPT will perform for the duration of the session. An issue of relevance to P2P applications is how the NAT behaves when an internal host initiates multiple simultaneous sessions from a single (private IP, private port) pair to multiple distinct endpoints on the external network.

Cone NAT After establishing a port binding between a (private IP, private port) tuple and a (public IP, public port) tuple, a cone NAT will re-use this port binding for subsequent sessions the

Ford, Srisuresh & Kegel [Page 4]

Internet-Draft P2P applications across middleboxes October 2003

  application may initiate from the same private IP address and
  port number, for as long as at least one session using the port
  binding remains active.

  For example, suppose Client A in the diagram below initiates two
  simultaneous outgoing sessions through a cone NAT, from the same
  internal endpoint (10.0.0.1:1234) to two different
  external servers, S1 and S2.  The cone NAT assigns just one public
  endpoint tuple, 155.99.25.11:62000, to both of these sessions,
  ensuring that the "identity" of the client's port is maintained
  across address translation. Since Basic NATs and firewalls do
  not modify port numbers as packets flow across
  the middlebox, these types of middleboxes can be viewed as a
  degenerate form of Cone NAT.



       Server S1                                     Server S2
    18.181.0.31:1235                              138.76.29.7:1235
           |                                             |
           |                                             |
           +----------------------+----------------------+
                                  |
      ^  Session 1 (A-S1)  ^      |      ^  Session 2 (A-S2)  ^
      |  18.181.0.31:1235  |      |      |  138.76.29.7:1235  |
      v 155.99.25.11:62000 v      |      v 155.99.25.11:62000 v
                                  |
                               Cone NAT
                             155.99.25.11
                                  |
      ^  Session 1 (A-S1)  ^      |      ^  Session 2 (A-S2)  ^
      |  18.181.0.31:1235  |      |      |  138.76.29.7:1235  |
      v   10.0.0.1:1234    v      |      v   10.0.0.1:1234    v
                                  |
                               Client A
                            10.0.0.1:1234

Ford, Srisuresh & Kegel [Page 5]

Internet-Draft P2P applications across middleboxes October 2003

Symmetric NAT A symmetric NAT, in contrast, does not maintain a consistent port binding between (private IP, private port) and (public IP, public port) across all sessions. Instead, it assigns a new public port to each new session. For example, suppose Client A initiates two outgoing sessions from the same port as above, one with S1 and one with S2. A symmetric NAT might allocate the public endpoint 155.99.25.11:62000 to session 1, and then allocate a different public endpoint 155.99.25.11:62001, when the application initiates session 2. The NAT is able to differentiate between the two sessions for translation purposes because the external endpoints involved in the sessions (those of S1 and S2) differ, even as the endpoint identity of the client application is lost across the address translation boundary.

       Server S1                                     Server S2
    18.181.0.31:1235                              138.76.29.7:1235
           |                                             |
           |                                             |
           +----------------------+----------------------+
                                  |
      ^  Session 1 (A-S1)  ^      |      ^  Session 2 (A-S2)  ^
      |  18.181.0.31:1235  |      |      |  138.76.29.7:1235  |
      v 155.99.25.11:62000 v      |      v 155.99.25.11:62001 v
                                  |
                             Symmetric NAT
                             155.99.25.11
                                  |
      ^  Session 1 (A-S1)  ^      |      ^  Session 2 (A-S2)  ^
      |  18.181.0.31:1235  |      |      |  138.76.29.7:1235  |
      v   10.0.0.1:1234    v      |      v   10.0.0.1:1234    v
                                  |
                               Client A
                            10.0.0.1:1234

The issue of cone versus symmetric NAT behavior applies equally to TCP and UDP traffic.

Cone NAT is further classified according to how liberally the NAT accepts incoming traffic directed to an already-established (public IP, public port) pair. This classification generally applies only to UDP traffic, since NATs and firewalls reject incoming TCP connection attempts unconditionally unless specifically configured to do otherwise.

Full Cone NAT

Ford, Srisuresh & Kegel [Page 6]

Internet-Draft P2P applications across middleboxes October 2003

  After establishing a public/private port binding for a new
  outgoing session, a full cone NAT will subsequently accept
  incoming traffic to the corresponding public port from ANY
  external endpoint on the public network.  Full cone NAT is
  also sometimes called "promiscuous" NAT.

Restricted Cone NAT A restricted cone NAT only forwards an incoming packet directed to a public port if its external (source) IP address matches the address of a node to which the internal host has previously sent one or more outgoing packets. A restricted cone NAT effectively refines the firewall principle of rejecting unsolicited incoming traffic, by restricting incoming traffic to a set of “known” external IP addresses.

Port-Restricted Cone NAT A port-restricted cone NAT, in turn, only forwards an incoming packet if its external IP address AND port number match those of an external endpoint to which the internal host has previously sent outgoing packets. A port-restricted cone NAT provides internal nodes the same level of protection against unsolicited incoming traffic that a symmetric NAT does, while maintaining a private port’s identity across translation.

Finally, in this document we define new terms for classifying the P2P-relevant behavior of middleboxes:

P2P-Application P2P-application as used in this document is an application in which each P2P participant registers with a public registration server, and subsequently uses either its private endpoint, or public endpoint, or both, to establish peering sessions.

P2P-Middlebox A P2P-Middlebox is middlebox that permits the traversal of P2P applications.

P2P-firewall A P2P-firewall is a P2P-Middlebox that provides firewall functionality but performs no address translation.

P2P-NAT A P2P-NAT is a P2P-Middlebox that provides NAT functionality, and may also provide firewall functionality. At minimum, a P2P-Middlebox must implement Cone NAT behavior for UDP traffic, allowing applications to establish robust P2P connectivity using the UDP hole punching technique.

Ford, Srisuresh & Kegel [Page 7]

Internet-Draft P2P applications across middleboxes October 2003

Loopback translation When a host in the private domain of a NAT device attempts to connect with another host behind the same NAT device using the public address of the host, the NAT device performs the equivalent of a “Twice-nat” translation on the packet as follows. The originating host’s private endpoint is translated into its assigned public endpoint, and the target host’s public endpoint is translated into its private endpoint, before the packet is forwarded to the target host. We refer the above translation performed by a NAT device as “Loopback translation”.

  1. Techniques for P2P Communication over middleboxes

    This section reviews in detail the currently known techniques for implementing peer-to-peer communication over existing middleboxes, from the perspective of the application or protocol designer.

3.1. Relaying

The most reliable, but least efficient, method of implementing peer- to-peer communication in the presence of a middlebox is to make the peer-to-peer communication look to the network like client/server communication through relaying. For example, suppose two client hosts, A and B, have each initiated TCP or UDP connections with a well-known server S having a permanent IP address. The clients reside on separate private networks, however, and their respective middleboxes prevent either client from directly initiating a connection to the other.

                            Server S
                               |
                               |
        +----------------------+----------------------+
        |                                             |
      NAT A                                         NAT B
        |                                             |
        |                                             |
     Client A                                      Client B

Instead of attempting a direct connection, the two clients can simply use the server S to relay messages between them. For example, to send a message to client B, client A simply sends the message to server S along its already-established client/server connection, and server S then sends the message on to client B using its existing client/server connection with B.

This method has the advantage that it will always work as long as

Ford, Srisuresh & Kegel [Page 8]

Internet-Draft P2P applications across middleboxes October 2003

both clients have connectivity to the server. Its obvious disadvantages are that it consumes the server’s processing power and network bandwidth unnecessarily, and communication latency between the two clients is likely to be increased even if the server is well- connected. The TURN protocol [TURN] defines a method of implementing relaying in a relatively secure fashion.

Ford, Srisuresh & Kegel [Page 9]

Internet-Draft P2P applications across middleboxes October 2003

3.2. Connection reversal

The second technique works if only one of the clients is behind a middlebox. For example, suppose client A is behind a NAT but client B has a globally routable IP address, as in the following diagram:

                            Server S
                        18.181.0.31:1235
                               |
                               |
        +----------------------+----------------------+
        |                                             |
      NAT A                                           |
155.99.25.11:62000                                    |
        |                                             |
        |                                             |
     Client A                                      Client B
  10.0.0.1:1234                               138.76.29.7:1234

Client A has private IP address 10.0.0.1, and the application is using TCP port 1234. This client has established a connection with server S at public IP address 18.181.0.31 and port 1235. NAT A has assigned TCP port 62000, at its own public IP address 155.99.25.11, to serve as the temporary public endpoint address for A’s session with S: therefore, server S believes that client A is at IP address 155.99.25.11 using port 62000. Client B, however, has its own permanent IP address, 138.76.29.7, and the peer-to-peer application on B is accepting TCP connections at port 1234.

Now suppose client B would like to initiate a peer-to-peer communication session with client A. B might first attempt to contact client A either at the address client A believes itself to have, namely 10.0.0.1:1234, or at the address of A as observed by server S, namely 155.99.25.11:62000. In either case, however, the connection will fail. In the first case, traffic directed to IP address 10.0.0.1 will simply be dropped by the network because 10.0.0.1 is not a publicly routable IP address. In the second case, the TCP SYN request from B will arrive at NAT A directed to port 62000, but NAT A will reject the connection request because only outgoing connections are allowed.

After attempting and failing to establish a direct connection to A, client B can use server S to relay a request to client A to initiate a “reversed” connection to client B. Client A, upon receiving this relayed request through S, opens a TCP connection to client B at B’s public IP address and port number. NAT A allows the connection to proceed because it is originating inside the firewall, and client B can receive the connection because it is not behind a middlebox.

Ford, Srisuresh & Kegel [Page 10]

Internet-Draft P2P applications across middleboxes October 2003

A variety of current peer-to-peer systems implement this technique. Its main limitation, of course, is that it only works as long as only one of the communicating peers is behind a NAT: in the increasingly common case where both peers are behind NATs, the method fails. Because connection reversal is not a general solution to the problem, it is NOT recommended as a primary strategy. Applications may choose to attempt connection reversal, but should be able to fall back automatically on another mechanism such as relaying if neither a “forward” nor a “reverse” connection can be established.

3.3. UDP hole punching

The third technique, and the one of primary interest in this document, is widely known as “UDP Hole Punching.” UDP hole punching relies on the properties of common firewalls and cone NATs to allow appropriately designed peer-to-peer applications to “punch holes” through the middlebox and establish direct connectivity with each other, even when both communicating hosts may lie behind middleboxes. This technique was mentioned briefly in section 5.1 of RFC 3027 [NAT- PROT], and has been informally described elsewhere on the Internet [KEGEL] and used in some recent protocols [TEREDO, ICE]. As the name implies, unfortunately, this technique works reliably only with UDP.

We will consider two specific scenarios, and how applications can be designed to handle both of them gracefully. In the first situation, representing the common case, two clients desiring direct peer-to- peer communication reside behind two different NATs. In the second, the two clients actually reside behind the same NAT, but do not necessarily know that they do.

3.3.1. Peers behind different NATs

Suppose clients A and B both have private IP addresses and lie behind different network address translators. The peer-to-peer application running on clients A and B and on server S each use UDP port 1234. A and B have each initiated UDP communication sessions with server S, causing NAT A to assign its own public UDP port 62000 for A’s session with S, and causing NAT B to assign its port 31000 to B’s session with S, respectively.

                            Server S
                        18.181.0.31:1234
                               |
                               |
        +----------------------+----------------------+
        |                                             |
      NAT A                                         NAT B

Ford, Srisuresh & Kegel [Page 11]

Internet-Draft P2P applications across middleboxes October 2003

155.99.25.11:62000                            138.76.29.7:31000
        |                                             |
        |                                             |
     Client A                                      Client B
  10.0.0.1:1234                                 10.1.1.3:1234

Now suppose that client A wants to establish a UDP communication session directly with client B. If A simply starts sending UDP messages to B’s public address, 138.76.29.7:31000, then NAT B will typically discard these incoming messages (unless it is a full cone NAT), because the source address and port number does not match those of S, with which the original outgoing session was established. Similarly, if B simply starts sending UDP messages to A’s public address, then NAT A will typically discard these messages.

Suppose A starts sending UDP messages to B’s public address, however, and simultaneously relays a request through server S to B, asking B to start sending UDP messages to A’s public address. A’s outgoing messages directed to B’s public address (138.76.29.7:31000) cause NAT A to open up a new communication session between A’s private address and B’s public address. At the same time, B’s messages to A’s public address (155.99.25.11:62000) cause NAT B to open up a new communication session between B’s private address and A’s public address. Once the new UDP sessions have been opened up in each direction, client A and B can communicate with each other directly without further burden on the “introduction” server S.

The UDP hole punching technique has several useful properties. Once a direct peer-to-peer UDP connection has been established between two clients behind middleboxes, either party on that connection can in turn take over the role of “introducer” and help the other party establish peer-to-peer connections with additional peers, minimizing the load on the initial introduction server S. The application does not need to attempt to detect explicitly what kind of middlebox it is behind, if any [STUN], since the procedure above will establish peer- to-peer communication channels equally well if either or both clients do not happen to be behind a middlebox. The hole punching technique even works automatically with multiple NATs, where one or both clients are removed from the public Internet via two or more levels of address translation.

3.3.2. Peers behind the same NAT

Now consider the scenario in which the two clients (probably unknowingly) happen to reside behind the same NAT, and are therefore located in the same private IP address space. Client A has established a UDP session with server S, to which the common NAT has assigned public port number 62000. Client B has similarly

Ford, Srisuresh & Kegel [Page 12]

Internet-Draft P2P applications across middleboxes October 2003

established a session with S, to which the NAT has assigned public port number 62001.

                            Server S
                        18.181.0.31:1234
                               |
                               |
                              NAT
                     A-S 155.99.25.11:62000
                     B-S 155.99.25.11:62001
                               |
        +----------------------+----------------------+
        |                                             |
     Client A                                      Client B
  10.0.0.1:1234                                 10.1.1.3:1234

Suppose that A and B use the UDP hole punching technique as outlined above to establish a communication channel using server S as an introducer. Then A and B will learn each other’s public IP addresses and port numbers as observed by server S, and start sending each other messages at those public addresses. The two clients will be able to communicate with each other this way as long as the NAT allows hosts on the internal network to open translated UDP sessions with other internal hosts and not just with external hosts. We refer to this situation as “loopback translation,” because packets arriving at the NAT from the private network are translated and then “looped back” to the private network rather than being passed through to the public network. For example, when A sends a UDP packet to B’s public address, the packet initially has a source IP address and port number of 10.0.0.1:124 and a destination of 155.99.25.11:62001. The NAT receives this packet, translates it to have a source of 155.99.25.11:62000 (A’s public address) and a destination of 10.1.1.3:1234, and then forwards it on to B. Even if loopback translation is supported by the NAT, this translation and forwarding step is obviously unnecessary in this situation, and is likely to add latency to the dialog between A and B as well as burdening the NAT.

The solution to this problem is straightforward, however. When A and B initially exchange address information through server S, they should include their own IP addresses and port numbers as “observed” by themselves, as well as their addresses as observed by S. The clients then simultaneously start sending packets to each other at each of the alternative addresses they know about, and use the first address that leads to successful communication. If the two clients are behind the same NAT, then the packets directed to their private addresses are likely to arrive first, resulting in a direct communication channel not involving the NAT. If the two clients are behind different NATs, then the packets directed to their private

Ford, Srisuresh & Kegel [Page 13]

Internet-Draft P2P applications across middleboxes October 2003

addresses will fail to reach each other at all, but the clients will hopefully establish connectivity using their respective public addresses. It is important that these packets be authenticated in some way, however, since in the case of different NATs it is entirely possible for A’s messages directed at B’s private address to reach some other, unrelated node on A’s private network, or vice versa.

3.3.3. Peers separated by multiple NATs

In some topologies involving multiple NAT devices, it is not possible for two clients to establish an “optimal” P2P route between them without specific knowledge of the topology. Consider for example the following situation.

                            Server S
                        18.181.0.31:1234
                               |
                               |
                             NAT X
                     A-S 155.99.25.11:62000
                     B-S 155.99.25.11:62001
                               |
                               |
        +----------------------+----------------------+
        |                                             |
      NAT A                                         NAT B
192.168.1.1:30000                             192.168.1.2:31000
        |                                             |
        |                                             |
     Client A                                      Client B
  10.0.0.1:1234                                 10.1.1.3:1234

Suppose NAT X is a large industrial NAT deployed by an internet service provider (ISP) to multiplex many customers onto a few public IP addresses, and NATs A and B are small consumer NAT gateways deployed independently by two of the ISP’s customers to multiplex their private home networks onto their respective ISP-provided IP addresses. Only server S and NAT X have globally routable IP addresses; the “public” IP addresses used by NAT A and NAT B are actually private to the ISP’s addressing realm, while client A’s and B’s addresses in turn are private to the addressing realms of NAT A and B, respectively. Each client initiates an outgoing connection to server S as before, causing NATs A and B each to create a single public/private translation, and causing NAT X to establish a public/private translation for each session.

Now suppose clients A and B attempt to establish a direct peer-to-

Ford, Srisuresh & Kegel [Page 14]

Internet-Draft P2P applications across middleboxes October 2003