The document discusses strategies for working with large biological datasets as sequencing costs decrease and data volumes increase exponentially. It summarizes three key uses for abundant sequencing data: hypothesis falsification, model comparison, and hypothesis generation. The author's lab aims to develop open tools for moving quickly from raw data to hypotheses, and to identify the challenges that prevent collaborators from doing their science. In summarizing a discussion on soil microbial communities, it notes the immense diversity of these communities and the challenges of culture-dependent approaches, which make single-cell sequencing and metagenomics necessary.