
## 7.3.2 Fittings using the FW algorithm

We use the FW algorithm to fit the request sizes from two representative days of the World Cup 98 site, day 57 and day 80, into hyperexponential distributions. Figures 7.3 and 7.4 show the CDFs of the actual data, their fittings into hybrid distributions (lognormal and power-tail), and the fittings of the hybrid distributions into hyperexponential ones. We observe that the resulting hyperexponential distributions closely match the behavior of the original data.

Table 7.1 lists the parameters of the lognormal and the power-tailed portions of the distribution for each day. For both days, the bulk of the data lies in the lognormal portion, while only a very small part (albeit with very large file sizes) lies in the power-tailed portion. Table 7.1 also shows the parameters that the FW algorithm suggests for fitting these distributions into hyperexponentials. In both cases, a total of seven exponential phases, four for the lognormal portion and three for the power-tailed portion, is sufficient to achieve an excellent approximation of the original data.

Table 7.1: Workload parameters for day 57 and day 80
| | Lognormal() | Power() | Weight for Lognormal |
|---|---|---|---|
| Data from Day 57 | | | 0.99935 |
| Data from Day 80 | | | 0.999977 |

Parameters of the H fitting:

| Day 57: phase rate | Day 57: phase probability | Day 80: phase rate | Day 80: phase probability |
|---|---|---|---|
| 0.000000008469532 | 0.000000000438327 | 0.000000013708911 | 0.000000000190479 |
| 0.000000106031775 | 0.000000002331229 | 0.000000215667700 | 0.000000001569731 |
| 0.000011317265198 | 0.000649997230444 | 0.000043293323498 | 0.000569998239790 |
| 0.000001510047810 | 0.000014700266903 | 0.000005312880260 | 0.000744062014933 |
| 0.000018914686309 | 0.014450795383216 | 0.000029646684825 | 0.039321723492673 |
| 0.000190007539767 | 0.443701366678877 | 0.000150999863822 | 0.367916087712180 |
| 0.001235615667834 | 0.541183137671004 | 0.000870285230628 | 0.591448126780214 |
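To make the fitted parameters concrete, the following minimal sketch evaluates the mean, the squared coefficient of variation, and the CDF of a hyperexponential distribution from the day 57 values in Table 7.1. It assumes that each pair of numbers in the table is an exponential phase rate (in 1/bytes) followed by the probability of selecting that phase. The function names are hypothetical and are not taken from the FW algorithm itself. The resulting mean can be compared against the 3629-byte mean quoted for day 57.

```python
import math

# Day 57 values copied from Table 7.1.
# ASSUMPTION: each pair is (phase rate in 1/bytes, phase probability).
DAY57_PHASES = [
    (0.000000008469532, 0.000000000438327),
    (0.000000106031775, 0.000000002331229),
    (0.000011317265198, 0.000649997230444),
    (0.000001510047810, 0.000014700266903),
    (0.000018914686309, 0.014450795383216),
    (0.000190007539767, 0.443701366678877),
    (0.001235615667834, 0.541183137671004),
]


def h_mean(phases):
    """Mean of a hyperexponential: sum of p_i / rate_i."""
    return sum(p / rate for rate, p in phases)


def h_scv(phases):
    """Squared coefficient of variation: E[X^2] / E[X]^2 - 1."""
    second_moment = sum(2.0 * p / rate ** 2 for rate, p in phases)
    return second_moment / h_mean(phases) ** 2 - 1.0


def h_cdf(phases, x):
    """CDF of a hyperexponential: 1 - sum of p_i * exp(-rate_i * x)."""
    return 1.0 - sum(p * math.exp(-rate * x) for rate, p in phases)


if __name__ == "__main__":
    print("sum of probabilities:", sum(p for _, p in DAY57_PHASES))
    print("mean request size (bytes):", h_mean(DAY57_PHASES))
    print("squared coefficient of variation:", h_scv(DAY57_PHASES))
    for size in (1e3, 1e4, 1e5, 1e6):
        print(f"F({size:.0f}) = {h_cdf(DAY57_PHASES, size):.6f}")
```

Under the same assumption, the day 80 column can be checked by swapping in its seven (rate, probability) pairs.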

Because our objective is to analyze the performance of load balancing policies under a variable service process, we also considered synthetically generated data sets that exhibit different variabilities. We base our synthetic service process generation on the properties of day 57. Since the power-tailed portion of each of the selected days is very small, we turn our attention to the lognormal portion, which also exhibits heavy-tailed behavior. By simultaneously changing both parameters of a lognormal distribution, we change both the scale and the shape of the distribution so as to vary its variance while keeping its mean constant (equal to the mean of day 57, i.e., 3629 bytes). This way, we can examine the sensitivity of our load balancing policies to the variability of the service process.
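The constant-mean adjustment can be sketched as follows. The symbols for the two lognormal parameters did not survive in this copy, so the sketch assumes the common parameterization by the mean `mu` and standard deviation `sigma` of the underlying normal distribution, for which the mean is exp(mu + sigma^2/2) and the squared coefficient of variation is exp(sigma^2) - 1. The function names and the variability levels in the loop are hypothetical and are not the values used in the experiments.

```python
import math


def lognormal_params(mean, scv):
    """Return (mu, sigma) giving the requested mean and squared CV.

    ASSUMPTION: mu and sigma are the mean and standard deviation of the
    underlying normal distribution.
    """
    sigma_sq = math.log(1.0 + scv)          # from scv = exp(sigma^2) - 1
    mu = math.log(mean) - sigma_sq / 2.0    # from mean = exp(mu + sigma^2 / 2)
    return mu, math.sqrt(sigma_sq)


def lognormal_moments(mu, sigma):
    """Return (mean, variance) of a lognormal with parameters mu, sigma."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, variance


MEAN_DAY57 = 3629.0  # bytes, mean of day 57 as stated in the text

if __name__ == "__main__":
    # Hypothetical variability levels, chosen only for illustration.
    for scv in (1.0, 5.0, 25.0):
        mu, sigma = lognormal_params(MEAN_DAY57, scv)
        mean, variance = lognormal_moments(mu, sigma)
        print(f"scv={scv}: mu={mu:.4f}, sigma={sigma:.4f}, "
              f"mean={mean:.1f}, variance={variance:.3e}")
```

Both parameters move together: a larger `sigma` increases the variance, and `mu` is lowered by sigma^2/2 to hold the mean fixed.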

Figure 7.5 illustrates how the CDF changes as we change the variability in the data set. The fitting technique resulted in three, four, or five stages for the lognormal fitting, so a total of six, seven, or eight stages were used to fit the mixture of the lognormal and power-tailed distributions. We evaluate the sensitivity of the policy as a function of variability in the service process in Section 7.4.

Alma Riska 2003-01-13