The document outlines a two-step approach for creating and evaluating domain-specific web corpora for language technology applications, focusing primarily on medical terms. In the first step, term seeds are extracted from real-world use cases and scenarios; in the second, corpora are bootstrapped from those seeds and their quality is assessed using established metrics. The findings indicate that automatically extracted term seeds can yield corpora whose domain-specificity is comparable to that of corpora built from hand-picked seeds, which underscores the utility of quantitative evaluation in corpus quality assessment.