This document describes a method for face matching in videos using low-level facial geometry. The goal is to detect human faces in video frames and retrieve similar face images from a large database. Frames are first extracted from the video, and the Viola-Jones detector locates the faces in each frame. From each detected face, the method extracts color maps, edge maps, and 68 facial landmarks. Local binary pattern (LBP) descriptors computed over grids laid on the face serve as local features. These descriptors are quantized into codewords using attribute-enhanced sparse coding, which incorporates human attributes to improve retrieval. The codewords are then used to build an attribute-embedded inverted index for efficient image ranking and retrieval. Experiments on public datasets show that the method can effectively retrieve similar face images from videos.
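
To make two of the steps above concrete, here is a minimal sketch in plain Python of the basic 8-neighbour local binary pattern and of a plain inverted index over codewords. It is an illustration under stated assumptions, not the method's actual implementation: the function names are hypothetical, a grid cell is assumed to be a small grayscale patch given as a list of rows, and codewords are assumed to arrive already quantized as integer IDs. Attribute-enhanced sparse coding and the attribute embedding in the index are deliberately left out, so ranking here is a simple count of shared codewords.

```python
from collections import defaultdict

# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
# The starting point and direction are a convention chosen for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(patch, r, c):
    """Return the 8-bit LBP code of the pixel at (r, c).

    A bit is set when the corresponding neighbour is >= the centre pixel.
    """
    center = patch[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if patch[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(patch):
    """Return a 256-bin histogram of LBP codes over the interior pixels
    of one grid cell (border pixels lack a full neighbourhood)."""
    hist = [0] * 256
    for r in range(1, len(patch) - 1):
        for c in range(1, len(patch[0]) - 1):
            hist[lbp_code(patch, r, c)] += 1
    return hist


def build_inverted_index(codewords_by_image):
    """Map each codeword ID to the set of image IDs that contain it.

    codewords_by_image: dict of image ID -> iterable of integer codewords
    (assumed to come from an earlier quantization step not shown here).
    """
    index = defaultdict(set)
    for image_id, codewords in codewords_by_image.items():
        for word in codewords:
            index[word].add(image_id)
    return index


def rank(index, query_codewords):
    """Rank database images by the number of codewords shared with the query.

    Ties are broken by image ID so the ordering is deterministic.
    """
    scores = defaultdict(int)
    for word in set(query_codewords):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))
```

In this sketch a query only touches the posting lists of its own codewords, which is what makes an inverted index efficient for large databases: images that share no codeword with the query are never visited.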