This document presents a framework for detecting near-duplicate tweets on Twitter. The framework analyzes pairs of tweets along three dimensions: (1) syntactic characteristics such as word overlap, (2) semantic similarity, and (3) contextual information. Machine learning is then applied to these features to learn patterns that identify near-duplicate tweets. The framework is integrated into Twinder, a Twitter search engine, where it is used to diversify search results and improve search quality. Extensive experiments evaluate the detection strategies and analyze which features affect detection performance. The results show that semantic features can boost near-duplicate detection performance.
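To make the pairwise analysis concrete, the following is a minimal sketch of how syntactic features might be computed for a pair of tweets. It is not the framework's actual implementation: the function names (`tokens`, `jaccard`, `pair_features`), the tokenization rule, and the choice of Jaccard overlap are all assumptions made for illustration. Semantic and contextual features are omitted because they would need external resources such as entity extraction.

```python
from __future__ import annotations

import re

# Hypothetical tokenizer: keeps words, #hashtags, and @mentions as tokens.
TOKEN_RE = re.compile(r"[#@]?\w+")


def tokens(tweet: str) -> set[str]:
    """Return the set of lowercased tokens in a tweet."""
    return {t.lower() for t in TOKEN_RE.findall(tweet)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pair_features(t1: str, t2: str) -> dict[str, float]:
    """Illustrative syntactic features for one tweet pair (assumed, not from the source)."""
    w1, w2 = tokens(t1), tokens(t2)
    h1 = {t for t in w1 if t.startswith("#")}
    h2 = {t for t in w2 if t.startswith("#")}
    return {
        "word_overlap": jaccard(w1, w2),
        "hashtag_overlap": jaccard(h1, h2),
        "length_diff": float(abs(len(t1) - len(t2))),
    }
```

In a setup like the one described above, feature vectors of this kind, extended with semantic and contextual features, would be passed to a trained classifier that decides whether the pair is a near-duplicate.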