This document describes a three-step approach to generating synthetic social network data that respects user privacy:
1. Topology generation uses the R-MAT method to create a graph with a power-law degree distribution and community structure. Communities are identified using the Louvain method. The central nodes of each community are selected as seed nodes.
2. Data attributes such as age, gender, and interests are defined based on real-world statistics. The attribute values and their target proportions are specified in a table.
3. Data is populated outward from the seed nodes using propagation rules: nodes closer to a seed are more likely to receive attribute values similar to that seed's. Challenges include preventing seeds from having disproportionate influence and preserving diversity while still meeting the target proportions.
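
The three steps above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the document's actual implementation: the quadrant probabilities (0.57, 0.19, 0.19, 0.05), the `decay` parameter, the age brackets, and their proportions are all hypothetical placeholders. To stay within the standard library, the sketch also skips Louvain community detection and simply takes the highest-degree nodes as seeds, which is a stand-in for "central nodes in each community".

```python
import random
from collections import Counter, defaultdict, deque


def rmat_edges(scale, n_edges, probs=(0.57, 0.19, 0.19, 0.05), seed=0):
    """Sample distinct undirected edges over 2**scale nodes, R-MAT style."""
    rng = random.Random(seed)
    a, b, c, _ = probs
    edges = set()
    while len(edges) < n_edges:
        u = v = 0
        # Pick one adjacency-matrix quadrant per bit, most significant first.
        for _ in range(scale):
            r = rng.random()
            u <<= 1
            v <<= 1
            if r < a:
                pass        # top-left: neither bit set
            elif r < a + b:
                v |= 1      # top-right
            elif r < a + b + c:
                u |= 1      # bottom-left
            else:
                u |= 1      # bottom-right: both bits set
                v |= 1
        if u != v:          # drop self-loops
            edges.add((min(u, v), max(u, v)))
    return edges


def assign_attributes(edges, seeds, proportions, decay=0.6, seed=0):
    """Give each node its nearest seed's value with probability decay**hops,
    otherwise draw a value from the target proportions."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in sorted(edges):
        adj[u].add(v)
        adj[v].add(u)

    # Multi-source BFS: record the nearest seed and hop distance per node.
    origin = {s: (s, 0) for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        src, dist = origin[node]
        for nbr in sorted(adj[node]):
            if nbr not in origin:
                origin[nbr] = (src, dist + 1)
                queue.append(nbr)

    values = list(proportions)
    weights = list(proportions.values())
    labels = {}
    for node in sorted(adj):
        src, dist = origin.get(node, (None, None))
        if src is not None and rng.random() < decay ** dist:
            labels[node] = seeds[src]   # inherit from the nearest seed
        else:
            labels[node] = rng.choices(values, weights=weights)[0]
    return labels


# Step 1: topology (128 nodes at most, 400 edges).
edges = rmat_edges(scale=7, n_edges=400)

# Seed selection stand-in: top-degree nodes instead of Louvain community centers.
degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
top_nodes = sorted(degree, key=lambda n: (-degree[n], n))[:3]

# Step 2: hypothetical attribute table (value -> target proportion).
proportions = {"18-24": 0.3, "25-34": 0.4, "35+": 0.3}
seeds = dict(zip(top_nodes, proportions))

# Step 3: propagate attribute values outward from the seeds.
labels = assign_attributes(edges, seeds, proportions)
print(Counter(labels.values()))
```

Comparing the printed counts against `proportions` makes the challenges in step 3 concrete: a high-degree seed can pull a large share of the graph toward its own value, so the realized proportions may drift from the targets. Lowering `decay` weakens seed influence at the cost of weaker local similarity.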