This document summarizes a project report that evaluates machine learning techniques for protein function prediction, referred to as the Proteome Analyst problem. The techniques explored include Naive Bayes (comparing generative and discriminative learning), Tree-Augmented Naive Bayes (TAN), neural networks, and other classifiers from the WEKA data mining system. Although Naive Bayes has previously been applied successfully, the authors aim to find an approach with better classification accuracy and faster execution time. Their empirical analysis found that Support Vector Machines achieved better accuracy than Naive Bayes with comparable training speed.