Symmetry: on averages and variation


Dr. Kenjiro Cho is a renowned Internet researcher in Japan. His famous paper, The Impact and Implications of the Growth in Residential User-to-User Traffic, was the first to document how a large population actually uses high-speed broadband.

In my words: at any point in time 5% of the population is a heavy user, but it's never the same 5%. People use the capacity casually, when they need it, and expect an immediate response. That is just how we use our electricity, water and sewer networks. They are not built for averages.

Think of the sewer pipes from your home: what a mess it would be if they were designed for the average flow and you tried to flush your toilet.

At the IETF meeting in Japan last November, in the session on "Internet Bandwidth Growth: Dealing with Reality", Dr. Kenjiro Cho talked about his latest findings (Transcript session IETF76 Dealing with bandwidth growth Cho.pdf).


He used some statistical terminology in his talk that needs some explanation, lest people draw the wrong conclusion. Dr. Cho was kind enough to clarify the data for me, which I would like to share. [Update] The interpretation is mine.

The first data point is the gradual shift in average residential data consumption in Japan.

In the period 2005-2009 the average upload grew by 29%, while the average download grew by 117%. The symmetry ratio (up:down) changed from 1:1 in 2005 to 1:1.7 in 2009. The main reason is the increase in streamed video.
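The step from the two growth figures to the 2009 ratio can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the 2005 upload and download averages are both normalised to 1.0 (the 1:1 ratio), and the variable names are mine, not Dr. Cho's.

```python
# Assumption: 2005 averages are normalised to 1.0 each, matching the 1:1 ratio.
up_2005 = 1.0
down_2005 = 1.0

# Growth over 2005-2009 as quoted in the text.
up_growth = 0.29    # +29 %
down_growth = 1.17  # +117 %

up_2009 = up_2005 * (1 + up_growth)
down_2009 = down_2005 * (1 + down_growth)

# Download volume per unit of upload volume in 2009.
ratio_2009 = down_2009 / up_2009
print(f"up:down in 2009 = 1:{ratio_2009:.1f}")
```

Under that assumption the two growth rates on their own reproduce the roughly 1:1.7 ratio.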

But averages can be misleading (as in the case of your sewer pipe, which you flush only a couple of times per day). An indicator of the variation is the difference between the mean (average) of the measured values and the mode of the measured distribution. The mode is the value that occurs most often in the table of measurements. For instance, the row (400, 1, 1, 1, 300, 1, 2, 1, 350) has a mean of about 117 and a mode of 1.

The bigger the difference between mode and mean, the more variation there is in the measured values.
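The example row of measurements can be run through Python's standard `statistics` module to see the gap between mean and mode. It is a small illustration using the made-up row from this post, not Dr. Cho's data.

```python
from statistics import mean, mode

# The illustrative row from the text: mostly tiny values, a few very large ones.
samples = [400, 1, 1, 1, 300, 1, 2, 1, 350]

sample_mean = mean(samples)  # sum is 1057 over 9 values, about 117.4
sample_mode = mode(samples)  # most frequent value

# Mean-to-mode ratio, the variation indicator used in this post.
mean_to_mode = sample_mean / sample_mode

print(f"mean = {sample_mean:.0f}, mode = {sample_mode}, mean:mode = {mean_to_mode:.0f}:1")
```

A handful of large values pulls the mean far away from what the typical measurement looks like, which is exactly what the ratio is meant to expose.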

The observed ratios of mean to mode for the upload were 123:1 in 2005 and 93:1 in 2009, indicating very high variability in upload usage patterns: occasionally a few people send very large amounts of data.

For the download the ratio was 14:1 in 2005 and 8.5:1 in 2009. That is less variation than in the upload, and it is consistent with the observation that a lot of streamed video is being viewed.

Dr. Cho's observations show that bandwidth usage in ultra-high-speed broadband networks is consistent with a casual usage pattern with very high variation, much like the way we use other utilities.

They also show that averages can be very misleading.
