This document discusses calculating the derivative of an SVM loss function. It begins by defining the SVM loss function, which aims to make the score of the correct class exceed the scores of the incorrect classes by at least a fixed margin. It then shows how to compute the per-class scores for a single data point as dot products between the data point and each class's weight vector. The loss is written out for a sample where class 2 is the correct class. Differentiating this loss with respect to each weight vector gives the gradient used in optimization; the derivative takes one form for the correct class's weights and another for the incorrect classes' weights. This single-example approach generalizes directly to multiple examples by summing or averaging over them.
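The steps summarized above can be sketched in code. The following is a minimal NumPy illustration (not the document's own implementation): for weight rows w_j and a data point x, the scores are s_j = w_j · x, the multiclass hinge loss for correct class y is L = Σ_{j≠y} max(0, s_j − s_y + Δ), and differentiating gives dL/dw_j = x for each incorrect class j whose margin is positive, while dL/dw_y = −(number of positive margins) · x for the correct class. The function name, argument shapes, and margin Δ = 1 are assumptions for the sketch.

```python
import numpy as np

def svm_loss_and_grad(W, x, y, delta=1.0):
    """Multiclass SVM (hinge) loss and its gradient for one example.

    W: (C, D) weight matrix, one row of weights per class (assumed layout)
    x: (D,) single data point
    y: index of the correct class
    """
    scores = W @ x                           # s_j = w_j . x for each class j
    margins = scores - scores[y] + delta     # s_j - s_y + delta
    margins[y] = 0.0                         # correct class contributes no loss
    loss = np.sum(np.maximum(0.0, margins))

    # Gradient, split by case as in the text:
    #   incorrect class j with positive margin: dL/dw_j = x
    #   correct class y: dL/dw_y = -(count of positive margins) * x
    indicator = (margins > 0).astype(float)
    dW = np.outer(indicator, x)
    dW[y] = -np.sum(indicator) * x
    return loss, dW
```

Generalizing to multiple examples amounts to calling this per example and summing (or averaging) the losses and gradients; a quick numerical gradient check against finite differences is a standard way to verify the analytic derivatives.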