This document provides information about several nucleotide and protein sequence databases including:
- INSDC (International Nucleotide Sequence Database Collaboration), which comprises GenBank, EMBL, and DDBJ.
- GenBank, which contains over 80 billion nucleotide bases from 76 million sequences and doubles in size every 18 months. The most heavily represented species are human, mouse, rat, cattle, and maize.
- EMBL and DDBJ, which are similar to GenBank in content and format but are maintained by different organizations.
- Secondary databases such as UniProt, PROSITE, and PRINTS/BLOCKS, which provide additional annotation and analysis of sequences.
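
To make the GenBank growth figure concrete, here is a minimal sketch of what an 18-month doubling time implies. It assumes the doubling time stays constant, which the summary does not claim for any particular period, and the function name `projected_bases` is made up for this illustration.

```python
def projected_bases(initial_bases: float, months: float,
                    doubling_months: float = 18.0) -> float:
    """Project database size, assuming a constant doubling time.

    Uses size(t) = initial * 2 ** (t / doubling_time).
    """
    return initial_bases * 2 ** (months / doubling_months)


# Starting point taken from the summary: about 80 billion bases.
START_BASES = 80e9

for months in (0, 18, 36, 72):
    size = projected_bases(START_BASES, months)
    print(f"after {months:3d} months: {size / 1e9:,.0f} billion bases")
```

Under that assumption the collection grows fourfold in three years and sixteenfold in six, which is why storage and search tools for these databases have to be designed with exponential growth in mind.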