This paper addresses the challenge of converting probabilistic data into deterministic formats suitable for legacy systems, aiming to optimize the quality of outcomes from queries on this data. A novel query-aware determinization framework is proposed, which outperforms traditional methods through efficient algorithms that minimize expected query costs. The study emphasizes the importance of maintaining high data quality and explores future directions, including techniques for handling complex probabilistic models.
Related topics: