The document discusses biodiversity informatics and the large body of untapped biodiversity data, referred to as "dark data", held in sources such as literature, museum specimens, and field notes. It notes that more than a billion natural history specimens have been collected over a span of 250 years, with records written in many languages and published without common standards. Extracting and integrating this dark data with techniques such as machine learning and metadata extraction could support applications such as ecological niche modeling and taxonomic analysis. The document also describes education programs that train biological information specialists to curate and manage biodiversity data.