This document discusses large-scale integration of biological data and text. It describes combining data from many databases on proteins, interactions, complexes and pathways, using parsers and mapping files to reconcile their differing formats and identifiers. It also covers text mining by co-mention, in which entities that appear together within the same document, paragraph or sentence are counted as associated, both to make the information more comprehensive and to improve quality scores. The goal is to combine all available evidence from these sources into one comprehensive resource, accessible through the string-db.org website and the Cytoscape app.
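
The co-mention idea can be illustrated with a minimal Python sketch. It is not the method used by the resource itself: the function names, the weights per level, the blank-line paragraph splitting, and the plain string matching (standing in for real entity recognition backed by mapping files) are all assumptions made for illustration. The sketch only shows how a pair of names could earn more credit the more tightly they co-occur.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical weights (not from the source): a tighter context counts for more.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}


def mentioned(text, names):
    """Return the names that occur in text as whole words (case-insensitive)."""
    return {
        name
        for name in names
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE)
    }


def comention_scores(documents, names):
    """Sum weighted co-mention counts per name pair over all documents."""
    scores = defaultdict(float)
    for doc in documents:
        # Assumption: paragraphs are separated by blank lines.
        paragraphs = [p for p in doc.split("\n\n") if p.strip()]
        # Naive sentence split on ., ! or ? followed by whitespace.
        sentences = [
            s
            for p in paragraphs
            for s in re.split(r"(?<=[.!?])\s+", p)
            if s.strip()
        ]
        levels = (
            ("document", [doc]),
            ("paragraph", paragraphs),
            ("sentence", sentences),
        )
        for level, units in levels:
            for unit in units:
                # Sort so each pair has one canonical key.
                for pair in combinations(sorted(mentioned(unit, names)), 2):
                    scores[pair] += WEIGHTS[level]
    return dict(scores)


docs = ["CDC28 binds CLN2. SWI4 is a regulator.\n\nCLB2 is expressed late."]
names = ["CDC28", "CLN2", "SWI4", "CLB2"]
scores = comention_scores(docs, names)
print(scores[("CDC28", "CLN2")])  # 7.0: same document, paragraph and sentence
print(scores[("CLB2", "SWI4")])   # 1.0: same document only
```

In this toy input, CDC28 and CLN2 share a sentence and so score highest, while CLB2 is linked to the others only at the document level. A real pipeline would need a dictionary of synonyms and some normalisation for how often each name is mentioned overall.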