This document describes a proposed candidate set key document retrieval system. The system would accept user queries in English and return the relevant documents from a collection. It would apply natural language processing techniques (tokenization, stop word removal, and stemming or lemmatization) to index the documents and to match them against user queries. The proposed architecture has three components: indexing, query processing, and document retrieval. Indexing organizes the documents, extracts tokens, removes stop words, and applies stemming or lemmatization, producing an inverted index that maps each term to the documents containing it, so that a query can be answered without scanning the whole collection.
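
The indexing and matching steps described above can be illustrated with a minimal Python sketch. It is not the proposed system's implementation: the stop word list, the `crude_stem` suffix stripper (a naive stand-in for a real stemmer or lemmatizer), the sample documents, and all function names are assumptions made for illustration. Queries are matched with simple AND semantics, which is also an assumption.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stop word list, not the one the system would use.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are", "for", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(token):
    """Naive suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def build_inverted_index(documents):
    """Map each normalized term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in normalize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every normalized query term."""
    terms = normalize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample collection, keyed by document id.
documents = {
    1: "Indexing the documents in a collection",
    2: "Users search the indexed collection",
    3: "Stemming reduces words to a root form",
}
index = build_inverted_index(documents)

print(sorted(search(index, "indexed collections")))  # [1, 2]
print(sorted(search(index, "stemming words")))       # [3]
```

Because documents and queries pass through the same `normalize` function, "indexed collections" matches documents that contain "indexing" and "collection". A production system would likely swap `crude_stem` for an established stemmer or lemmatizer and rank the results rather than return an unordered set.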