1) The document discusses sparse kernel machines, with a focus on support vector machines (SVMs). It covers how SVM training is posed as a constrained optimization problem and solved using Lagrange multipliers and the Karush-Kuhn-Tucker (KKT) conditions.
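As a sketch of this step (using a common notation that is assumed here, not taken from the document: labels $t_n \in \{-1, +1\}$, hyperplane parameters $\mathbf{w}$ and $b$, and multipliers $a_n \ge 0$), the hard-margin Lagrangian and the associated KKT conditions read:

$$
L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{n=1}^{N} a_n \left\{ t_n \left( \mathbf{w}^\top \mathbf{x}_n + b \right) - 1 \right\}
$$

$$
a_n \ge 0, \qquad t_n\, y(\mathbf{x}_n) - 1 \ge 0, \qquad a_n \left\{ t_n\, y(\mathbf{x}_n) - 1 \right\} = 0,
$$

where $y(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$ is the decision function. The last (complementary slackness) condition is what drives the sparsity discussed in point 3.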
2) SVMs find the separating hyperplane that maximizes the margin, i.e., the distance from the decision boundary to the closest data point of either class. In practice this is done by solving the dual optimization problem.
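One common statement of this dual problem (a sketch under the same assumed notation as above, with a kernel function $k(\mathbf{x}_n, \mathbf{x}_m)$) is:

$$
\max_{\mathbf{a}} \; \tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m\, t_n t_m\, k(\mathbf{x}_n, \mathbf{x}_m)
\quad \text{subject to} \quad a_n \ge 0, \;\; \sum_{n=1}^{N} a_n t_n = 0.
$$

Solving for $\mathbf{a}$ rather than $\mathbf{w}$ lets the data enter only through the kernel, and yields the sparse solution described next.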
3) Support vectors are the data points that lie exactly on the margin boundaries of the classifier. By the complementary slackness KKT condition, every other point has a zero Lagrange multiplier ($a_n = 0$) and drops out of the solution, so only the support vectors are needed for making predictions on new data with the trained SVM.
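A minimal sketch of this sparsity (assuming scikit-learn and a hypothetical toy dataset, neither of which is from the document): the decision function is rebuilt from the fitted model's support vectors alone, $f(\mathbf{x}) = \sum_{n \in S} a_n t_n\, k(\mathbf{x}, \mathbf{x}_n) + b$, and matches the model's own prediction.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy 2-D data: two linearly separable classes.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# Large C approximates the hard-margin SVM.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# Reconstruct f(x) from the support vectors only:
# dual_coef_ stores a_n * t_n for each support vector.
x_new = np.array([[2.0, 2.0]])
K = x_new @ clf.support_vectors_.T            # linear kernel k(x, x_n)
f_manual = K @ clf.dual_coef_.ravel() + clf.intercept_

# The manual sum over support vectors agrees with the model's output.
print(np.allclose(f_manual, clf.decision_function(x_new)))  # True
```

Note that the non-support-vector training points never appear in the reconstruction; this is the practical payoff of the sparsity induced by the KKT conditions.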