This document discusses the challenges and solutions in performing analytic queries over large geospatial time-series datasets using distributed storage frameworks. It highlights the development of algorithms for exploratory and predictive analytics that autonomously learn relationships in data, ensuring efficient query evaluations while minimizing disk accesses. The approach focuses on managing metadata effectively, avoiding query hotspots, and improving overall query turnaround times, validated by empirical results.
Related topics: