Active learning aims to improve machine learning models using less training data by strategically selecting the most informative data points to be labeled. It is important because manually labeling data can be time-consuming and expensive. The core problem is how to actively select the most informative training points to query labels for. Different active learning methods, such as using neural networks, Bayesian models, and support vector machines, aim to query points that the current model is most uncertain about. Combining active learning with expectation maximization using a large pool of unlabeled data can improve text classification when only a small amount of labeled training data is available.
Related topics: