The paper investigates how vocabulary length affects binary image classification with a Bag-of-Visual-Words (BoVW) approach built on a two-level spatial pyramid representation. It addresses two detection tasks, persons and cars, using images from the Pascal dataset, and shows that the best vocabulary length is task-dependent and that the choice matters for preserving spatial information. The results indicate that a shorter vocabulary is generally more effective at level-0 of the spatial pyramid, which serves as contextual support for the classification at level-1.
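To make the representation concrete, the following is a minimal sketch of a two-level spatial pyramid histogram that uses a separate vocabulary for each level. It rests on assumptions that the summary does not confirm: level-0 is taken to be the whole image and level-1 a 2x2 grid, visual words are assigned by nearest codeword in Euclidean distance, and the final vector is L1-normalised. The names `assign_words`, `pyramid_histogram`, `codebook_l0`, and `codebook_l1` are hypothetical and chosen only for illustration.

```python
import numpy as np


def assign_words(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def pyramid_histogram(points, descriptors, image_size, codebook_l0, codebook_l1):
    """Build a two-level pyramid feature with one vocabulary per level.

    points      : (N, 2) array of (x, y) keypoint positions
    descriptors : (N, D) array of local descriptors
    image_size  : (width, height) of the image
    codebook_l0 : (K0, D) vocabulary for level-0 (assumed: whole image)
    codebook_l1 : (K1, D) vocabulary for level-1 (assumed: 2x2 grid)
    """
    width, height = image_size
    k0, k1 = len(codebook_l0), len(codebook_l1)

    # Level-0: one histogram over the whole image.
    words0 = assign_words(descriptors, codebook_l0)
    hist0 = np.bincount(words0, minlength=k0).astype(float)

    # Level-1: one histogram per cell of a 2x2 grid.
    words1 = assign_words(descriptors, codebook_l1)
    col = np.minimum((points[:, 0] * 2 / width).astype(int), 1)
    row = np.minimum((points[:, 1] * 2 / height).astype(int), 1)
    cell = row * 2 + col
    hist1 = np.zeros((4, k1))
    np.add.at(hist1, (cell, words1), 1.0)  # accumulate repeated (cell, word) pairs

    # Concatenate both levels and L1-normalise (assumed normalisation).
    feature = np.concatenate([hist0, hist1.ravel()])
    total = feature.sum()
    return feature / total if total > 0 else feature
```

The resulting vector has length K0 + 4 * K1, so shrinking the level-0 vocabulary while keeping a larger one at level-1 changes how much of the feature is devoted to global context versus spatially localised evidence. Such a vector could then be passed to any binary classifier; which classifier the paper uses is not stated here.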