This document discusses big data summarization, its challenges, and potential solutions. It presents a summarization framework with four main stages: 1) data clustering, which groups similar documents; 2) data generalization, which abstracts data to a higher conceptual level; 3) semantic term identification, which identifies metadata for a more efficient data representation; and 4) evaluation of the resulting summaries. The key challenges addressed are initializing the clustering methods, selecting the attributes that control generalization, and ensuring semantic associations in the representations. The proposed solutions are a detailed assessment of clustering initialization methods and statistical approaches to clustering, generalization, and term identification.
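
To make the four stages concrete, here is a minimal sketch in Python. It is not the document's method: the toy corpus `DOCS`, the concept hierarchy `HIERARCHY`, the seed-based Jaccard clustering, the frequency-based term selection, and the coverage score are all assumptions chosen only to illustrate how the stages could fit together.

```python
from collections import Counter

# Hypothetical toy corpus; the source does not specify any dataset.
DOCS = [
    "apple banana fruit market",
    "banana orange fruit juice",
    "car truck engine road",
    "truck bus engine fuel",
]

# Hypothetical concept hierarchy used for generalization (an assumption).
HIERARCHY = {
    "apple": "fruit", "banana": "fruit", "orange": "fruit",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}


def tokenize(doc):
    return doc.lower().split()


def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, seed_indices):
    """Stage 1: assign each document to its most similar seed document.

    The seeds stand in for the clustering initialization; ties go to the
    seed listed first.
    """
    tokens = [tokenize(d) for d in docs]
    clusters = {s: [] for s in seed_indices}
    for i, toks in enumerate(tokens):
        best = max(seed_indices, key=lambda s: jaccard(toks, tokens[s]))
        clusters[best].append(i)
    return clusters


def generalize(tokens):
    """Stage 2: replace each term with its parent concept when one is known."""
    return [HIERARCHY.get(t, t) for t in tokens]


def identify_terms(docs, members, top_n=2):
    """Stage 3: pick the most frequent generalized terms as cluster metadata."""
    counts = Counter()
    for i in members:
        # Count each term once per document (document frequency).
        counts.update(set(generalize(tokenize(docs[i]))))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    return sorted(counts, key=lambda t: (-counts[t], t))[:top_n]


def coverage(docs, members, terms):
    """Stage 4: fraction of cluster documents containing any summary term."""
    term_set = set(terms)
    hits = sum(
        1 for i in members if term_set & set(generalize(tokenize(docs[i])))
    )
    return hits / len(members) if members else 0.0


def summarize(docs, seed_indices):
    """Run all four stages; returns {seed: (members, terms, coverage)}."""
    result = {}
    for seed, members in cluster(docs, seed_indices).items():
        terms = identify_terms(docs, members)
        result[seed] = (members, terms, coverage(docs, members, terms))
    return result


if __name__ == "__main__":
    for seed, (members, terms, score) in summarize(DOCS, [0, 2]).items():
        print(seed, members, terms, score)
```

With seeds `[0, 2]` the sketch separates the fruit documents from the vehicle documents. With seeds `[0, 1]` both seeds are fruit documents, so the vehicle documents fall into the first cluster by tie-breaking, which is a small illustration of why the initialization of the clustering stage is listed among the key challenges.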