This document summarizes research on clustering blogs and discovering blog communities. It outlines why clustering matters for the large and growing blogosphere, then discusses network-based, content-based, and hybrid clustering approaches. The evaluation shows that hybrid clustering, which uses both network information and the accompanying content, yields more coherent blog clusters and more distinct communities than network-based information alone. The document concludes that future work should consider temporal dynamics in blog clustering.