Semi-supervised learning aims to build accurate predictors using both labeled and unlabeled data. There are three main paradigms: transductive learning, in which the unlabeled examples are exactly the test examples; active learning, in which the learner chooses which unlabeled examples to have labeled; and multi-view learning, which exploits unlabeled data described by multiple distinct feature sets (views). A popular multi-view method is co-training, which trains two classifiers simultaneously, one per feature view, and has each classifier label unlabeled examples for the other. Co-training assumes that the views are conditionally independent given the class and that each view alone is sufficient for accurate prediction. It has been applied to tasks such as web page and text classification.
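The co-training loop described above can be sketched as follows. This is a minimal illustration, not the canonical Blum & Mitchell setup: the two views, the synthetic data, the Gaussian naive Bayes base learners, and the "add the single most confident pseudo-label per classifier per round" policy are all assumptions made for the example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Synthetic data: two classes; each view is 2-D and, by construction,
# conditionally independent of the other given the label.
n = 200
y = np.tile([0, 1], n // 2)
view1 = rng.normal(loc=y[:, None] * 2.0, scale=1.0, size=(n, 2))
view2 = rng.normal(loc=y[:, None] * 2.0, scale=1.0, size=(n, 2))

# Start with only 10 labeled examples; -1 marks "unlabeled".
labeled = np.zeros(n, dtype=bool)
labeled[:10] = True
y_train = np.full(n, -1)
y_train[:10] = y[:10]

clf1, clf2 = GaussianNB(), GaussianNB()
for _ in range(5):  # a few co-training rounds
    clf1.fit(view1[labeled], y_train[labeled])
    clf2.fit(view2[labeled], y_train[labeled])
    pool = np.where(~labeled)[0]
    # Each classifier pseudo-labels its single most confident unlabeled
    # example; that example joins the shared labeled set, so each
    # classifier's confident predictions teach the other view's classifier.
    for clf, view in ((clf1, view1), (clf2, view2)):
        if pool.size == 0:
            break
        proba = clf.predict_proba(view[pool])
        best_idx = np.argmax(proba.max(axis=1))
        best = pool[best_idx]
        y_train[best] = clf.predict(view[best:best + 1])[0]
        labeled[best] = True
        pool = pool[pool != best]

# Accuracy of one view's classifier against the true labels.
acc = (clf1.predict(view1) == y).mean()
print(f"labeled set grew to {labeled.sum()}, view-1 accuracy {acc:.2f}")
```

In practice, co-training variants differ in how many pseudo-labeled examples are added per round and whether a confidence threshold is required before an example is transferred; the single-best-example policy above is only the simplest choice.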