This document discusses mining the data availability statements of publications in Europe PMC to find statements about genomic data from genome-wide association studies (GWAS). The GWAS Catalog contains over 4,000 publications and 7,600 studies linking genetic variants to traits. Machine learning has made identifying relevant publications for the catalog more efficient than manual searching. Data availability statements commonly mention making data publicly available in repositories such as dbGaP and EGA, which are cited in millions of publications.
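
As a rough illustration of what mining a data availability statement can involve, the sketch below scans a statement for repository accessions. It is not the method used in the work described here. The function name `find_accessions`, the pattern table, and the example statement are hypothetical, and the accession formats (dbGaP `phs` followed by six digits with an optional version suffix, EGA `EGAS` or `EGAD` followed by eleven digits) are assumptions about how these repositories commonly write their identifiers.

```python
import re

# Assumed accession formats (not taken from the source document):
# dbGaP studies look like "phs000123", optionally followed by ".v1.p1";
# EGA studies/datasets look like "EGAS" or "EGAD" followed by 11 digits.
REPOSITORY_PATTERNS = {
    "dbGaP": re.compile(r"\bphs\d{6}(?:\.v\d+(?:\.p\d+)?)?\b"),
    "EGA": re.compile(r"\bEGA[SD]\d{11}\b"),
}


def find_accessions(statement):
    """Return a dict mapping repository name to the accessions found in the text."""
    hits = {}
    for repo, pattern in REPOSITORY_PATTERNS.items():
        matches = pattern.findall(statement)
        if matches:  # only report repositories that were actually mentioned
            hits[repo] = matches
    return hits


# Illustrative statement with made-up accession strings.
example = (
    "Summary statistics are available from dbGaP under accession "
    "phs000123.v1.p1 and from EGA under EGAS00001000001."
)
print(find_accessions(example))
```

A pattern match of this kind only flags candidate statements. Deciding whether a flagged statement really describes GWAS data would still need a classifier or manual review.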