Traffic Characteristics and Communication Patterns in Blogosphere
MetadataShow full item record
We present a thorough characterization of the access patterns in blogspace -- a fast-growing constituent of the content available through the Internet -- which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and administrative requests spanning a 28-day period is done from three different blogosphere perspectives. The server view characterizes the aggregate access patterns of all users to all blogs; the user view characterizes how individual users interact with blogosphere objects (blogs); the object view characterizes how individual blogs are accessed. Our findings support two important conclusions. First, we show that the nature of interactions between users and objects is fundamentally different in blogspace than that observed in traditional web content. Access to objects in blogspace could be conceived as part of an interaction between an author and its readership. As we show in our work, such interactions range from one-to-many "broadcast-type" and many-to-one "registration-type" communication between an author and its readers, to multi-way, iterative "parlor-type" dialogues among members of an interest group. This more-interactive nature of the blogosphere leads to interesting traffic and communication patterns, which are different from those observed in traditional web content. Second, we identify and characterize novel features of the blogosphere workload, and we investigate the similarities and differences between typical web server workloads and blogosphere server workloads. Given the increasing share of blogspace traffic, understanding such differences is important for capacity planning and traffic engineering purposes, for example.