This document examines the trade-off between data privacy and utility using a heuristic called the comparative classification error gauge (x-CEG). It describes a methodology for generating synthetic data sets with acceptable levels of both privacy and utility: privacy algorithms are applied to the data, and the results are passed through machine learning classifiers, whose classification error serves as the gauge of utility. The findings indicate that a well-chosen classification error threshold can yield a suitable trade-off, although finding the optimal balance remains difficult.
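The idea of gauging utility by classification error can be illustrated with a minimal sketch. It is not the document's implementation: the one-dimensional toy data, additive Gaussian noise as the stand-in privacy algorithm, the nearest-class-mean classifier, and every function name (`make_data`, `add_noise`, `gauge`, and so on) are assumptions chosen for illustration. The sketch privatizes the data at several noise levels, measures the classification error on each privatized set, and reports the largest noise level whose error stays within a chosen threshold.

```python
import random


def make_data(n, seed=0):
    """Toy data (assumption): two Gaussian classes in one dimension."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        value = rng.gauss(4.0 * label, 1.0)  # class means at 0 and 4
        data.append((value, label))
    return data


def add_noise(data, scale, seed=1):
    """Stand-in privacy algorithm (assumption): additive Gaussian noise."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, scale), y) for x, y in data]


def fit_centroids(train):
    """Stand-in classifier (assumption): remember the mean of each class."""
    sums, counts = {}, {}
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classification_error(centroids, test):
    """Fraction of test points whose nearest class mean has the wrong label."""
    wrong = 0
    for x, y in test:
        predicted = min(centroids, key=lambda c: abs(x - centroids[c]))
        if predicted != y:
            wrong += 1
    return wrong / len(test)


def gauge(data, scales, threshold):
    """Measure the error at each noise scale and pick the largest scale
    whose error is within the threshold (None if no scale qualifies)."""
    results = []
    for scale in scales:
        private = add_noise(data, scale)
        half = len(private) // 2
        train, test = private[:half], private[half:]
        error = classification_error(fit_centroids(train), test)
        results.append((scale, error))
    acceptable = [scale for scale, error in results if error <= threshold]
    best = max(acceptable) if acceptable else None
    return results, best


if __name__ == "__main__":
    results, best = gauge(make_data(400), [0.0, 0.5, 1.0, 2.0, 4.0], 0.2)
    for scale, error in results:
        print(f"noise scale {scale:.1f}: classification error {error:.3f}")
    print("largest noise scale within threshold:", best)
```

In this sketch, more noise stands for more privacy, and the threshold caps how much utility loss is tolerated. Picking the largest qualifying noise level is one simple selection rule among several possible ones, which reflects why the optimal balance is hard to pin down.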