Monday, September 8, 2008

Increase/Decrease Algorithms

EDIT: Minor note - I hadn't read the other paper yet when I wrote this summary.

This paper starts with some simple assumptions: we wish to build a distributed congestion avoidance system in which each client's control takes two inputs: the amount of data it transmitted at the last time step, and a single bit indicating whether the bottleneck was underutilized or congested during that step. Furthermore, it assumes that the new amount of data to be transmitted will be a linear function of the amount at the previous time step. It then tries to achieve the following goals: efficiency (hovering around the point of congestion), fairness, convergence, and distribution.
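Written out, the linear control under consideration looks like this (notation mine): client i's load x_i(t) is adjusted using only the shared binary feedback bit y(t) from the previous step.

```latex
x_i(t+1) =
  \begin{cases}
    a_I + b_I \, x_i(t) & \text{if } y(t) = 0 \text{ (underutilized)} \\
    a_D + b_D \, x_i(t) & \text{if } y(t) = 1 \text{ (congested)}
  \end{cases}
```

The analysis then amounts to asking which choices of the coefficients a_I, b_I, a_D, b_D satisfy the goals.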

The last goal follows directly from the setup of the scenario: each client makes its decision using only its own previous load and the shared feedback bit, with no central coordination. The other three goals can be described as follows:
  • efficiency - If the system is overutilized at one time step, the total utilization will go down for the next step. Similarly, if it is underutilized, the total utilization will go up for the next step.
  • fairness - Eventually, all clients will have approximately equal utilization of the bottleneck (quantified just below this list).
  • convergence - The system should eventually stabilize. Preferably, it should stabilize quickly, and the oscillations around the optimal point should be small.
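To make "approximately equal" concrete, fairness can be measured with the standard fairness index (which, if I recall correctly, is the metric the paper itself uses):

```latex
F(x) = \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n \sum_{i=1}^{n} x_i^2}
```

F is 1 when all n clients have equal allocations and falls toward 1/n as a single client monopolizes the bottleneck.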
By writing out a set of equations describing the utilization, and then manipulating them to guarantee the above criteria, the paper concludes that the decrease must be multiplicative, and that the increase must have an additive component (and may have a multiplicative one as well). The paper also notes that a pure additive increase converges to fairness fastest, and therefore concludes that additive increase with multiplicative decrease (AIMD) is the best linear formula that satisfies the four conditions.
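To see this in action, here is a minimal toy simulation (mine, not the paper's) of two AIMD clients sharing a bottleneck; the parameter values are arbitrary:

```python
# Toy AIMD simulation: two clients share a bottleneck of capacity 1.0.
# ALPHA and BETA are illustrative values, not taken from the paper.
CAPACITY = 1.0
ALPHA = 0.01   # additive increase per time step
BETA = 0.5     # multiplicative decrease factor

def step(loads):
    """One time step: every client sees the same binary feedback."""
    congested = sum(loads) > CAPACITY
    if congested:
        return [x * BETA for x in loads]   # multiplicative decrease
    return [x + ALPHA for x in loads]      # additive increase

loads = [0.9, 0.1]          # start far from fairness on purpose
for _ in range(200):
    loads = step(loads)
print(loads)  # roughly equal shares: the gap between clients has shrunk toward zero
```

Note that the additive increase leaves the gap between the two clients unchanged, while each multiplicative decrease halves it; the decrease step is what actually drives convergence to fairness.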

This explains why the TCP window grows and shrinks the way it is currently implemented. Most implementations of TCP have a congestion window that grows additively (once past slow start) and shrinks multiplicatively. The indicator of congestion, in this case, is the absence of an expected ACK: a timeout or a run of duplicate ACKs signals that a packet was lost.
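As a rough sketch of how this maps onto the window (hypothetical class and method names; real stacks also implement slow start, fast retransmit, and fast recovery):

```python
# Simplified TCP-style congestion avoidance (AIMD on the window size).
# Ignores slow start, timeout vs. duplicate-ACK distinctions, and byte counts.
class AIMDWindow:
    def __init__(self, initial=1.0):
        self.cwnd = initial        # congestion window, in segments

    def on_ack(self):
        # Additive increase: about +1 segment per round trip,
        # spread out as +1/cwnd per ACK received.
        self.cwnd += 1.0 / self.cwnd

    def on_loss(self):
        # Multiplicative decrease: halve the window, floor at 1 segment.
        self.cwnd = max(1.0, self.cwnd / 2.0)

w = AIMDWindow(initial=4.0)
for _ in range(8):
    w.on_ack()     # window creeps upward
w.on_loss()        # window halves
print(w.cwnd)
```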

This paper makes the flawed assumption that all endpoints are willing to comply with the proposed algorithm. Since the algorithm is supposed to guarantee fairness, this is a terrible assumption. An adversarial client can ignore the multiplicative decrease and simply transmit at an unusually high rate. This will cause the adversary to have an unfairly large share of the bottleneck, while the well-behaved clients are told to continue reducing their transmission rates.
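Tweaking the toy simulation above makes this concrete: let one client simply skip the decrease step.

```python
# Same toy setup as before, but client 0 ignores the decrease signal.
CAPACITY, ALPHA, BETA = 1.0, 0.01, 0.5  # illustrative values, as above

def greedy_step(loads):
    congested = sum(loads) > CAPACITY
    if congested:
        # Client 0 cheats and keeps its rate; everyone else backs off.
        return [loads[0]] + [x * BETA for x in loads[1:]]
    return [x + ALPHA for x in loads]   # everyone increases additively

loads = [0.5, 0.5]
for _ in range(200):
    loads = greedy_step(loads)
print(loads)  # client 0 ends up with nearly the whole bottleneck
```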

One important question that the paper doesn't address is the transience of the system. Churn is inherent in the way the internet works: clients come and go frequently, and it is not clear whether the system is still guaranteed to converge to the optimal efficiency while clients continue to arrive and leave.

This ties directly into the question of how quickly the system converges. If a single client is underutilizing its connection, does it take a long time or a short time for that client to approach a fair utilization? What about short-lived connections? What kind of initial conditions should a client set if it has just joined the network? And somewhat importantly, does this system inherently favor new clients, old clients, or neither? If it favors one over the other, is it possible for an adversary to get better performance than its peers by dropping and establishing a connection many times over? Or by keeping a connection alive long after it is no longer useful in the hopes that it will be useful again in the future?
