Network congestion

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of the latter two effects is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.[1]

Network protocols that use aggressive retransmissions to compensate for packet loss due to congestion can increase congestion, even after the initial load has been reduced to a level that would not normally have induced network congestion. Such networks exhibit two stable states under the same level of load. The stable state with low throughput is known as congestive collapse.

Networks use congestion control and congestion avoidance techniques to try to avoid congestion collapse. These include: exponential backoff in protocols such as 802.11 CSMA/CA and the original Ethernet, window reduction in TCP, and fair queueing in devices such as routers. Another method is to implement priority schemes, transmitting some packets with higher priority than others. Priority schemes help to alleviate the effects of congestion for services such as 802.1p.

A third avoidance method is the explicit allocation of network resources to specific flows. One example of this is the use of Contention-Free Transmission Opportunities (CFTXOPs) in the ITU-T G.hn standard, which provides high-speed (up to 1 Gbit/s) local area networking over varying wires (power lines, phone lines and coaxial cables).

RFC 2914 addresses the subject of congestion control in detail.

Network capacity

Network resources are limited, including router processing time and link throughput.

For example:

Congestive collapse

Congestive collapse (or congestion collapse) is the condition in which congestion prevents or limits useful communication. Congestion collapse generally occurs at "choke points" in the network, where incoming traffic exceeds outgoing bandwidth. Connection points between a local area network and a wide area network are common choke points.

When a network is in this condition, it settles into a stable state where traffic demand is high but little useful throughput is available, packet delay and loss and quality of service is extremely poor.

Congestion collapse was identified as a possible problem by 1984, for example in RFC 896. It was first observed on the early Internet in October 1986, when the NSFnet phase-I backbone dropped three orders of magnitude from its capacity of 32 kbit/s to 40 bit/s, which continued until end nodes started implementing Van Jacobson's congestion control between 1987 and 1988.

When more packets were sent than could be handled by intermediate routers, the intermediate routers discarded many packets, expecting the end points of the network to retransmit the information. However, early TCP implementations had poor retransmission behavior. When this packet loss occurred, the endpoints sent extra packets that repeated the information lost, doubling the incoming rate.

Congestion control

Congestion control controls traffic entry into a telecommunications network, to avoid congestive collapse by avoiding oversubscription by reducing the rate of packets. It should not be confused with flow control, which prevents the sender from overwhelming the receiver.

Theory of congestion control

The theory on congestion control was pioneered by Frank Kelly, who applied microeconomic theory and convex optimization theory to describe how individuals controlling their own rates can interact to achieve an "optimal" network-wide rate allocation.

Examples of "optimal" rate allocation are max-min fair allocation and Kelly's suggestion of proportionally fair allocation, although many others are possible.

Let x_i be the rate of flow i, C_l be the capacity of link l, and r_{li} be 1 if flow i uses link l and 0 otherwise. Let x, c and R be the corresponding vectors and matrix. Let U(x) be an increasing, strictly concave function, called the utility, which measures how much benefit a user obtains by transmitting at rate x. The optimal rate allocation then satisfies

\max\limits_x \sum_i U(x_i)
such that Rx \le c

The Lagrange dual of this problem decouples, so that each flow sets its own rate, based only on a "price" signalled by the network. Each link capacity imposes a constraint, which gives rise to a Lagrange multiplier, p_l. The sum of these multipliers, y_i=\sum_l p_l r_{li}, is the price to which the flow responds.

Congestion control then becomes a distributed optimisation algorithm. Many current congestion control algorithms can be modelled in this framework, with p_l being either the loss probability or the queueing delay at link l.

A major weakness is that it assigns the same price to all flows, while sliding window flow control causes "burstiness" that causes different flows to observe different loss or delay at a given link.

Classification of congestion control algorithms

Among the ways to classify congestion control algorithms are:

Mitigation

A few mechanisms have been invented to prevent network congestion or to deal with a network collapse:

The correct endpoint behavior is usually to repeat dropped information, but progressively slow the repetition rate. Provided all endpoints do this, the congestion lifts and the network resumes normal behavior. Other strategies such as slow-start ensure that new connections don't overwhelm the router before congestion detection initiates.

The most common router congestion avoidance mechanisms are fair queuing and other scheduling algorithms, and random early detection, or RED, where packets are randomly dropped, proactively triggering the endpoints to slow transmission before congestion collapse occurs. Fair queuing is most useful in routers at chokepoints with a small number of connections passing through them. Larger routers must rely on RED.

Some end-to-end protocols behave better under congested conditions. TCP is perhaps the best behaved. The first TCP implementations to handle congestion were developed in 1984, but Van Jacobson's inclusion of an open source solution in the Berkeley Standard Distribution UNIX ("BSD") in 1988 first provided good behavior.

UDP does not control congestion. Protocols built atop UDP must handle congestion independently. Protocols that transmit at a fixed rate, independent of congestion, can be problematic. Real-time streaming protocols, including many Voice over IP protocols, have this property. Thus, special measures, such as quality-of-service routing, must be taken to keep packets from being dropped.

In general, congestion in pure datagram networks must be kept at the periphery of the network, where the above mechanisms can handle it. Congestion in the Internet backbone is problematic. Cheap fiber-optic lines have reduced costs in the Internet backbone allowing it to be provisioned with enough bandwidth to keep congestion at the periphery.

Practical network congestion avoidance

Connection-oriented protocols, such as the widely used TCP protocol, generally watch for packet errors, losses, or delays (see Quality of Service) to adjust the transmit speed. Various network congestion avoidance processes, support different trade-offs.[2]

TCP/IP congestion avoidance

The TCP congestion avoidance algorithm is the primary basis for congestion control in the Internet.[3][4][5][6][7]

Problems occur when concurrent TCP flows experience port queue buffer tail-drops, defeating TCP's automatic congestion avoidance. All flows that experience port queue buffer tail-drop begin a TCP retrain at the same moment – this is called TCP global synchronization.

Active queue management

Active queue management (AQM) is the reorder or drop of network packets inside a transmit buffer that is associated with a network interface controller (NIC). This task is performed by the network scheduler.

Random early detection

One solution is to use random early detection (RED) on the network equipment's port queue buffer.[8][9] On network equipment ports with more than one queue buffer, weighted random early detection (WRED) could be used if available.

RED indirectly signals to sender and receiver by deleting some packets, e.g. when the average queue buffer lengths are more than a threshold (e.g. 50%) and deletes linearly or cubically more packets,[10] up to e.g. 100%. The average queue buffer lengths are computed over 1 second intervals.

Robust random early detection (RRED)

The robust random early detection (RRED) algorithm was proposed to improve the TCP throughput against denial-of-service (DoS) attacks, particularly low-rate denial-of-service (LDoS) attacks. Experiments confirmed that RED-like algorithms were vulnerable under Low-rate Denial-of-Service (LDoS) attacks due to the oscillating TCP queue size caused by the attacks.[11]

Flowbased-RED/WRED

Some network equipment is equipped with ports that can follow and measure each flow (flowbased-RED/WRED) and are thereby able to signal a too big bandwidth flow according to some quality of service policy. A policy could then divide the bandwidth among all flows by some criteria.

Explicit Congestion Notification

Another approach is to use IP Explicit Congestion Notification (ECN).[12] ECN is used only when two hosts signal that they want to use it. With this method, a protocol bit is used to signal explicit congestion. This is better than the indirect packet delete congestion notification performed by the RED/WRED algorithms, but it requires explicit support by both hosts.[13] ECN coauthor Sally Floyd published detailed information on ECN, including the version required for Cisco IOS.[8]

When a router receives a packet marked as ECN-capable and anticipates (using RED) congestion, it sets the ECN flag, notifying the sender of congestion. The sender should respond by decreasing its transmission bandwidth, e.g., by decreasing the TCP window size (sending rate) or by other means.

Cisco AQM: Dynamic buffer limiting

Cisco Systems (Engine IV and V) has the capability to classify flows as aggressive (bad) or adaptive (good). It ensures that no flows fill the port queues. DBL can utilize IP ECN instead of packet-delete-signalling.[14][15]

TCP window shaping

Congestion avoidance can be achieved efficiently by reducing traffic. When an application requests a large file, graphic or web page, it usually advertises a "window" of between 32K and 64K. This results in the server sending a full window of data (assuming the file is larger than the window). When many applications simultaneously request downloads, this data creates a congestion point at an upstream provider by flooding the queue. By using a device to reduce the window advertisement, the remote servers send less data, thus reducing the congestion. This technique can reduce network congestion by a factor of 40.

Backward ECN

Backward ECN (BECN) is another proposed congestion mechanism. It uses ICMP source quench messages as an IP signalling mechanism to implement a basic ECN mechanism for IP networks, keeping congestion notifications at the IP level and requiring no negotiation between network endpoints. Effective congestion notifications can be propagated to transport layer protocols, such as TCP and UDP, for the appropriate adjustments.[16]

Side effects of congestive collapse avoidance

Radio links

The protocols that avoid congestive collapse are often based on the idea that data loss is caused by congestion. This is true in nearly all cases; errors during transmission are rare. However, this causes WiFi, 3G or other networks with a radio layer to have poor throughput in some cases since wireless networks are susceptible to data loss due to interference. The TCP connections running over a radio based physical layer see the data loss and tend to erroneously believe that congestion is occurring.

Short-lived connections

The slow-start protocol performs badly for short connections. Older web browsers created many short-lived connections and opened and closed the connection for each file. This kept most connections in the slow start mode, which slowed response times.

To avoid this problem, modern browsers either open multiple connections simultaneously or reuse one connection for all files requested from a particular server. Initial performance can be poor, and many connections never get out of the slow-start regime, significantly increasing latency.

See also

References

  1. (Al-Bahadili, 2012, p. 282) Al-Bahadili, H. (2012). Simulation in computer network design and modeling: Use and analysis. Hershey, PA: IGI Global.
  2. TCP Tunnels: Avoiding Congestion Collapse (2000)
  3. Van Jacobson, Michael J. Karels. Congestion Avoidance and Control (1988). Proceedings of the Sigcomm '88 Symposium, vol.18(4): pp.314329. Stanford, CA. August, 1988. This paper originated many of the congestion avoidance algorithms used in TCP/IP.
  4. RFC 2001 - TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms
  5. RFC 2581 - TCP Congestion Control
  6. RFC 3390 - TCP Increasing TCP's Initial Window
  7. TCP Congestion Avoidance Explained via a Sequence Diagram
  8. 1 2 Sally Floyd: RED (Random Early Detection) Queue Management
  9. Sally Floyd, Van Jacobson. Random Early Detection Gateways for Congestion Avoidance (1993). IEEE/ACM Transactions on Networking, vol.1(4): pp.397413. Invented Random Early Detection (RED) gateways.
  10. An Analytical RED Function Design Guaranteeing Stable System Behavior Quote: "...The advantage of this function lies not only in avoiding heavy oscillations but also in avoiding link under-utilization at low loads. The applicability of the derived function is independent of the load range, no parameters are to be adjusted. Compared to the original linear drop function applicability is extended by far...Our example with realistic system parameters gives an approximation function of the cubic of the queue size..."
  11. Zhang, Changwang; Yin, Jianping; Cai, Zhiping; Chen, Weifeng (2010). "RRED: Robust RED Algorithm to Counter Low-rate Denial-of-Service Attacks" (PDF). IEEE Communications Letter (IEEE) 14: 489–491.
  12. RFC 3168 - The Addition of Explicit Congestion Notification (ECN) to IP
  13. Comparative study of RED, ECN and TCP Rate Control (1999)
  14. Active Queue Management
  15. Enabling Dynamic Buffer Limiting
  16. A proposal for Backward ECN for the Internet Protocol

Books

External links

This article is issued from Wikipedia - version of the Thursday, February 25, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.