Characteristics of WWW Client-based Traces

Date
1995-07-18
Authors
Cunha, Carlos R.
Bestavros, Azer
Crovella, Mark E.
Citation
Cunha, Carlos; Bestavros, Azer; Crovella, Mark. "Characteristics of WWW Client-based Traces", Technical Report BUCS-1995-010, Computer Science Department, Boston University, April 1, 1995. [Available from: http://hdl.handle.net/2144/1571]
Abstract
The explosion of WWW traffic necessitates an accurate picture of WWW use, and in particular requires a good understanding of client requests for WWW documents. To address this need, we have collected traces of actual executions of NCSA Mosaic, reflecting over half a million user requests for WWW documents. In this paper we describe the methods we used to collect our traces, and the formats of the collected data. Next, we present a descriptive statistical summary of the traces we collected, which identifies a number of trends and reference patterns in WWW use. In particular, we show that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, the distribution of user requests for documents, and the number of references to documents as a function of their overall rank in popularity (Zipf's law). Finally, we show how the power-law distributions derived from our traces can be used to guide system designers interested in caching WWW documents.