This document discusses machine learning approaches to automatically structuring natural heritage literature. It compares supervised learning, which trains models on labeled examples, with unsupervised learning, which derives structure without labeled examples. It also describes a prototype application that converts free text to XML in batch or online mode to support tasks such as specimen identification.
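To make the free-text-to-XML conversion concrete, here is a minimal sketch in Python. It is not the prototype's method: a hand-written lookup table stands in for the learned model, and the names `TAGS`, `description_to_xml`, and `convert_batch`, as well as the XML element names, are hypothetical choices made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a clause's first word to an XML tag.
# In a learning-based system, a trained model would replace this table.
TAGS = {
    "leaf": "leaf", "leaves": "leaf",
    "flower": "flower", "flowers": "flower",
    "stem": "stem", "stems": "stem",
}


def description_to_xml(text: str) -> str:
    """Wrap each semicolon-separated clause of a description in an XML element."""
    root = ET.Element("description")
    for clause in text.split(";"):
        clause = clause.strip(" .")  # drop surrounding spaces and periods
        if not clause:
            continue
        first_word = clause.split()[0].lower()
        tag = TAGS.get(first_word, "other")  # unknown structures fall back to <other>
        ET.SubElement(root, tag).text = clause
    return ET.tostring(root, encoding="unicode")


def convert_batch(descriptions):
    """Batch mode: convert a list of descriptions in one call."""
    return [description_to_xml(d) for d in descriptions]


if __name__ == "__main__":
    # Online mode would call description_to_xml on one description at a time.
    print(description_to_xml("Leaves ovate; flowers white."))
```

Under these assumptions, batch mode is a loop over the same single-description function that online mode would call per request, so only the tagging step would change if a supervised or unsupervised model were substituted for the lookup table.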