Sampling Biases in IP Topology Measurements

Lakhina, Anukool; Byers, John W.; Crovella, Mark; Xie, Peng

Files

2002-021-topology-sampling-bias.pdf (238.7 KB)

Date

2002-07-15

URI

https://hdl.handle.net/2144/1667

Abstract

Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute studies have led some authors to conclude that the router graph of the Internet is a scale-free graph, or more generally a power-law random graph. In such a graph, the degree distribution of nodes has a power-law tail. In this paper we argue that the evidence to date for this conclusion is at best insufficient. We show that graphs appearing to have power-law degree distributions can arise surprisingly easily when sampling graphs whose true degree distribution is not at all like a power law. For example, given a classical sparse Erdős-Rényi random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can easily appear to have a degree distribution remarkably like a power law. We explore why this effect arises, and show that in such a setting edges are sampled in a highly biased manner. This insight allows us to distinguish measurements taken from Erdős-Rényi graphs from those taken from power-law random graphs. When we apply this distinction to a number of well-known datasets, we find that the evidence for sampling bias in these datasets is strong.
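The sampling experiment described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the graph size, the mean degree, and the numbers of sources and destinations are all assumptions chosen for illustration, and breadth-first search stands in for traceroute by returning one shortest path per source-destination pair.

```python
import random
from collections import Counter, deque


def erdos_renyi(n, p, rng):
    """Return adjacency sets for a G(n, p) random graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_parents(adj, source):
    """Breadth-first search; parent[v] precedes v on one shortest path from source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def sample_paths(adj, sources, destinations):
    """Union of one shortest path per (source, destination) pair, as a set of edges."""
    edges = set()
    for s in sources:
        parent = bfs_parents(adj, s)
        for d in destinations:
            if d not in parent:
                continue  # destination not reachable from this source
            v = d
            while parent[v] is not None:
                edges.add(frozenset((v, parent[v])))
                v = parent[v]
    return edges


def degree_counts(edges):
    """Degree of each node in the sampled subgraph."""
    deg = Counter()
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


def run(n=2000, mean_degree=15, n_sources=3, n_destinations=500, seed=1):
    """Build a sparse random graph (illustrative parameters) and sample it."""
    rng = random.Random(seed)
    adj = erdos_renyi(n, mean_degree / (n - 1), rng)
    nodes = rng.sample(range(n), n_sources + n_destinations)
    sources, destinations = nodes[:n_sources], nodes[n_sources:]
    edges = sample_paths(adj, sources, destinations)
    return adj, edges, degree_counts(edges)


if __name__ == "__main__":
    adj, edges, observed = run()
    # Histogram of observed degrees: how many sampled nodes have each degree.
    histogram = Counter(observed.values())
    for degree in sorted(histogram):
        print(degree, histogram[degree])
```

Plotting the printed histogram on log-log axes and comparing it with the true degrees (`len(adj[v])` for each node) gives a rough view of how the sampled distribution departs from the underlying one. Whether the result resembles a power law depends on the parameters chosen, so the sketch should be read as a starting point for experimentation rather than a reproduction of the paper's results.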

Collections

CAS: Computer Science: Technical Reports
