This document discusses distributed graph mining using MapReduce. It describes how partitioning graph data across multiple machines makes it feasible to process very large graphs. Two partitioning techniques are outlined: MRGP, which assigns partitions sequentially, and DGP, which balances partitions based on density. The document also discusses how the local support count used within each partition is adjusted relative to the global support when graphs are spread across many machines. An experimental environment using Hadoop, with both synthetic and real-world graph datasets, is also mentioned.
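The difference between the two partitioning ideas can be illustrated with a minimal Python sketch. It is not the document's implementation. Several things here are assumptions made for illustration: each graph is reduced to a `(num_nodes, num_edges)` tuple, density is taken as the usual undirected ratio 2|E| / (|V|(|V|-1)), the density-balanced variant simply sorts by density and deals graphs round-robin, and `local_min_count` with its `tolerance` parameter is a hypothetical way of scaling a relative global threshold down to one partition.

```python
from math import ceil

def density(num_nodes, num_edges):
    """Undirected graph density 2|E| / (|V|(|V|-1)); 0.0 for tiny graphs."""
    if num_nodes < 2:
        return 0.0
    return 2.0 * num_edges / (num_nodes * (num_nodes - 1))

def sequential_partition(graphs, k):
    """Sequential-style split: contiguous chunks in input order.

    May return fewer than k chunks when len(graphs) is small.
    """
    if not graphs:
        return []
    size = ceil(len(graphs) / k)
    return [graphs[i:i + size] for i in range(0, len(graphs), size)]

def density_balanced_partition(graphs, k):
    """Density-aware split (assumed scheme): sort by density, then deal
    graphs round-robin so each partition gets a similar density mix."""
    ordered = sorted(graphs, key=lambda g: density(g[0], g[1]))
    parts = [[] for _ in range(k)]
    for i, g in enumerate(ordered):
        parts[i % k].append(g)
    return parts

def local_min_count(global_threshold, partition_size, tolerance=0.0):
    """Hypothetical local support count: a relative global threshold
    (0..1) scaled to the partition size, optionally loosened by a
    tolerance rate so locally rare patterns are not dropped too early."""
    return ceil((1.0 - tolerance) * global_threshold * partition_size)

# Four graphs given as (num_nodes, num_edges); the first two are dense.
graphs = [(4, 6), (4, 5), (4, 3), (4, 3)]
seq_parts = sequential_partition(graphs, 2)
bal_parts = density_balanced_partition(graphs, 2)
```

In this toy input the sequential split puts both dense graphs in the same chunk, while the density-aware split gives each partition one sparse and one dense graph. Mining dense graphs tends to be more expensive, so a split of the second kind is one plausible way to even out per-machine work.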