We use the FW algorithm to fit the request sizes from two representative days of the World Cup 98 site, day 57 and day 80, to hyperexponential distributions. Figures 7.3 and 7.4 show the CDFs of the actual data, their fit to a hybrid distribution (lognormal and power-tail), and the fit of that hybrid distribution to a hyperexponential one. We observe that the resulting hyperexponential distribution closely matches the behavior of the original data.
Table 7.1 lists the parameters of the lognormal and power-tailed portions of the distribution for each day. For both days, the bulk of the data lies in the lognormal portion, while only a very small part (albeit with very large file sizes) lies in the power-tailed portion. Table 7.1 also shows the parameters that the FW algorithm suggests for fitting these distributions to hyperexponentials. In both cases, a total of seven exponential phases, four for the lognormal portion and three for the power-tailed portion, suffice to achieve an excellent approximation of the original data.
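To make the form of the fitted distribution concrete, the following minimal sketch evaluates the CDF and mean of a seven-phase hyperexponential, i.e., a probabilistic mixture of exponentials. The function names and the phase probabilities and rates are hypothetical placeholders chosen for illustration; they are not the values of Table 7.1 and are not produced by the FW algorithm.

```python
import math

def hyperexp_cdf(x, probs, rates):
    """CDF of a hyperexponential: F(x) = 1 - sum_i p_i * exp(-rate_i * x)."""
    if x < 0:
        return 0.0
    return 1.0 - sum(p * math.exp(-r * x) for p, r in zip(probs, rates))

def hyperexp_mean(probs, rates):
    """Mean of a hyperexponential: sum_i p_i / rate_i."""
    return sum(p / r for p, r in zip(probs, rates))

# Hypothetical 7-phase parameters (illustration only, NOT Table 7.1).
# The first four phases stand in for the lognormal portion, the last
# three, with small probabilities and very small rates, for the tail.
probs = [0.30, 0.30, 0.20, 0.15, 0.03, 0.015, 0.005]
rates = [1e-2, 1e-3, 5e-4, 1e-4, 1e-5, 1e-6, 1e-7]

if __name__ == "__main__":
    for x in (1e2, 1e3, 1e4, 1e6):
        print(f"F({x:.0e}) = {hyperexp_cdf(x, probs, rates):.6f}")
    print(f"mean = {hyperexp_mean(probs, rates):.1f} bytes")
```

In this sketch the low-probability, low-rate phases contribute little to the body of the CDF but dominate the mean, which mirrors the observation that a very small part of the data carries very large file sizes.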
Because our objective is to analyze the performance of load balancing policies
under a variable service process, we also consider synthetically generated
data sets that exhibit different degrees of variability. We base our synthetic
service process generation on the properties of day 57.
Since the power-tailed portion of each of the selected days
is very small, we turn our attention to the lognormal portion, which
also exhibits heavy-tailed behavior.
By simultaneously changing both parameters of the
lognormal distribution, we change both its scale and its shape,
so as to vary the variance of the distribution while keeping its
mean constant (equal to the mean of day 57, i.e., 3629
bytes).
This way, we can examine the sensitivity of our load balancing policies to
the variability of the service process.
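As an illustration of how the variance can be varied while the mean stays fixed, the sketch below computes lognormal parameters for a target mean and a target squared coefficient of variation. It assumes the standard (mu, sigma) parameterization, in which the mean is exp(mu + sigma^2/2) and the squared coefficient of variation is exp(sigma^2) - 1; this may differ from the parameterization used in Table 7.1. The function names and the variability levels are hypothetical and are not the values used in our experiments.

```python
import math
import random

def lognormal_params(mean, cv2):
    """Return (mu, sigma) of a lognormal with the given mean and
    squared coefficient of variation cv2 = variance / mean**2."""
    sigma2 = math.log(1.0 + cv2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

def lognormal_mean_var(mu, sigma):
    """Mean and variance of a lognormal with parameters (mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    var = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, var

TARGET_MEAN = 3629.0           # bytes, the mean of day 57
CV2_LEVELS = (1.0, 4.0, 16.0)  # hypothetical variability levels

if __name__ == "__main__":
    rng = random.Random(57)    # fixed seed for repeatability
    for cv2 in CV2_LEVELS:
        mu, sigma = lognormal_params(TARGET_MEAN, cv2)
        mean, var = lognormal_mean_var(mu, sigma)
        # Draw a few synthetic request sizes from this distribution.
        sizes = [rng.lognormvariate(mu, sigma) for _ in range(5)]
        print(f"cv2={cv2}: mu={mu:.4f} sigma={sigma:.4f} "
              f"mean={mean:.1f} var={var:.3e}")
        print("  sample sizes:", [round(s) for s in sizes])
```

Increasing the target variability raises sigma and lowers mu at the same time, which is why both parameters must change together to hold the mean at 3629 bytes.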