OpenBU

Indexing Distances in Large Graphs and Applications in Search Tasks

OpenBU

Show simple item record

dc.contributor.author Potamias, Michalis en_US
dc.date.accessioned 2011-10-20T04:51:08Z
dc.date.available 2011-10-20T04:51:08Z
dc.date.issued 2009 en_US
dc.identifier.uri http://hdl.handle.net/2144/1721
dc.description.abstract This thesis elaborates on the problem of preprocessing a large graph so that single-pair shortest-path queries can be answered quickly at runtime. Computing shortest paths is a well studied problem, but exact algorithms do not scale well to real-world huge graphs in applications that require very short response time. The focus is on approximate methods for distance estimation, in particular in landmarks-based distance indexing. This approach involves choosing some nodes as landmarks and computing (offline), for each node in the graph its embedding, i.e., the vector of its distances from all the landmarks. At runtime, when the distance between a pair of nodes is queried, it can be quickly estimated by combining the embeddings of the two nodes. Choosing optimal landmarks is shown to be hard and thus heuristic solutions are employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the techniques presented in this thesis is tested experimentally using five different real world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach which considers selecting landmarks at random. Finally, they are applied in two important problems arising naturally in large-scale graphs, namely social search and community detection. en_US
dc.language.iso en_US en_US
dc.publisher Boston University Computer Science Department en_US
dc.relation.ispartofseries BUCS Technical Reports;BUCS-TR-2008-029 en_US
dc.title Indexing Distances in Large Graphs and Applications in Search Tasks en_US
dc.type Technical Report en_US
etd.degree.name Master of Arts
etd.degree.level masters
etd.degree.discipline Computer Science
etd.degree.grantor Boston University


Files in this item

This item appears in the following Collection(s)

Show simple item record

Search OpenBU


Browse

Deposit Materials

Statistics