The document presents a hybrid approach that combines lexical and semantic methods to improve document retrieval. It discusses the limitations of purely lexical approaches, such as vocabulary mismatch, and proposes a weakly supervised learning method that integrates lexical models (such as BM25) with semantic models (based on BERT) to improve retrieval effectiveness. Experimental results show the benefits of the hybrid model over traditional methods.
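To make the idea of combining lexical and semantic evidence concrete, here is a minimal sketch in Python. It is not the method described in the document: the summary does not say how the two models are combined, and the weakly supervised training step is not shown. The sketch swaps in a simple linear interpolation of normalized scores instead. The function names, the weight `alpha`, the toy documents, and the hand-set `semantic` scores (standing in for the output of a BERT-based model) are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score tokenized docs against a tokenized query with a common BM25 variant."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # no lexical match, no contribution
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1] so the two models are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(lexical, semantic, alpha=0.5):
    """Linear interpolation (an assumed combination rule, not the document's)."""
    lex, sem = min_max(lexical), min_max(semantic)
    return [alpha * l + (1 - alpha) * s for l, s in zip(lex, sem)]


# Toy collection (hypothetical). The second document is relevant to the
# query but shares no terms with it: a vocabulary mismatch.
docs = [
    "car wash prices".split(),
    "automobile maintenance guide".split(),
    "banana bread recipe".split(),
]
query = "car repair".split()

lexical = bm25_scores(query, docs)
# Hand-set placeholder scores standing in for a BERT-based model's output.
semantic = [0.40, 0.90, 0.05]
combined = hybrid_scores(lexical, semantic, alpha=0.5)
```

In this toy setup, BM25 alone gives the second and third documents the same score of zero, because neither contains a query term. The interpolated score separates them, ranking the semantically related document above the unrelated one. How much weight each model should get is exactly the kind of decision that a learned combination would make instead of a fixed `alpha`.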