Entropy is a measure of unpredictability or impurity in a data set, and decision trees use it to determine the best way to split the data at each node. High entropy means low purity (an even mix of classes), while low entropy means high purity (mostly one class). Information gain is the reduction in entropy achieved by splitting on an attribute; the attribute with the highest information gain is chosen as the split. For example, in a data set on restaurant patrons, splitting on the "patrons" attribute yields a higher information gain than splitting on "type of food", so "patrons" would be chosen as the root node.
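
To make the calculation concrete, the sketch below computes Shannon entropy, H(S) = -Σ p_i log2(p_i), and the resulting information gain on a small hypothetical version of the restaurant data. The attribute names ("patrons", "type", "wait"), the helper functions, and the example rows are assumptions for illustration, loosely modeled on the well-known restaurant example, not the original data set.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Reduction in entropy of `target` after splitting `examples` on `attribute`.

    `examples` is a list of dicts; `attribute` and `target` are dict keys.
    """
    total = len(examples)
    base = entropy([ex[target] for ex in examples])
    remainder = 0.0
    # Group the examples by the attribute's value and weight each
    # group's entropy by the fraction of examples it contains.
    for value in set(ex[attribute] for ex in examples):
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return base - remainder

# Hypothetical rows: how busy the restaurant was ("patrons"), its cuisine
# ("type"), and whether the diner decided to wait for a table ("wait").
data = [
    {"patrons": "some", "type": "french",  "wait": "yes"},
    {"patrons": "full", "type": "thai",    "wait": "no"},
    {"patrons": "some", "type": "burger",  "wait": "yes"},
    {"patrons": "full", "type": "thai",    "wait": "yes"},
    {"patrons": "full", "type": "french",  "wait": "no"},
    {"patrons": "some", "type": "italian", "wait": "yes"},
    {"patrons": "none", "type": "burger",  "wait": "no"},
    {"patrons": "some", "type": "thai",    "wait": "yes"},
    {"patrons": "full", "type": "burger",  "wait": "no"},
    {"patrons": "full", "type": "italian", "wait": "no"},
    {"patrons": "none", "type": "thai",    "wait": "no"},
    {"patrons": "full", "type": "burger",  "wait": "yes"},
]

print(information_gain(data, "patrons", "wait"))  # ~0.54 bits
print(information_gain(data, "type", "wait"))     # 0.0 bits
```

With these rows, splitting on "patrons" yields a gain of about 0.54 bits while splitting on "type" yields 0 bits, so "patrons" would be picked as the root node.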