GWAS causal gene survey, May 2024 edition
How does this relate to this month's GWASes?


Welcome friends,

This is the May 2024 review of the most extreme p-values added to the GWAS Catalog in the past month, including one with a p-value of 10^-3553. Plus a special bonus feature: How can you tell if a cis-pQTL is a technical artifact or reflects real biology?

If you’re wondering what this is a picture of, you’ll have to wait till the end.

             

Most significant association for a novel trait!

My definition of a novel trait is one whose specific trait code has never previously been used for a genome-wide significant association in the GWAS Catalog.
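As a rough sketch of that definition: a trait code counts as novel for the month if it has a genome-wide significant association added this month and none added earlier. The record fields (`efo`, `date_added`, `neg_log10_p`) and the toy rows are hypothetical placeholders, not the GWAS Catalog's actual schema or data.

```python
import math
from datetime import date

# Conventional genome-wide significance threshold (p <= 5e-8), kept on the
# -log10 scale so that extremely small p-values never underflow.
GWS_NEG_LOG10 = -math.log10(5e-8)

def novel_trait_codes(records, month_start):
    """Return trait codes whose first significant association is on or after month_start."""
    seen_before = set()
    seen_this_month = set()
    for rec in records:
        if rec["neg_log10_p"] < GWS_NEG_LOG10:
            continue  # not genome-wide significant, so it does not count
        if rec["date_added"] < month_start:
            seen_before.add(rec["efo"])
        else:
            seen_this_month.add(rec["efo"])
    return seen_this_month - seen_before

# Hypothetical toy rows, for illustration only.
records = [
    {"efo": "EFO_A", "date_added": date(2023, 1, 10), "neg_log10_p": 12.0},
    {"efo": "EFO_A", "date_added": date(2024, 5, 3), "neg_log10_p": 30.0},
    {"efo": "EFO_B", "date_added": date(2024, 5, 7), "neg_log10_p": 71.2},
    {"efo": "EFO_C", "date_added": date(2024, 5, 9), "neg_log10_p": 5.0},
]
novel = novel_trait_codes(records, date(2024, 5, 1))
```

Here only EFO_B qualifies: EFO_A was already used before May, and EFO_C never reaches genome-wide significance.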

For May, that code is EFO_0002890, which is “renal carcinoma”. The strongest association for “renal carcinoma” added this month is at rs7948643 with a p-value of 6e-72, coming from Purdue MP, Dutta D, Machiela MJ, et al. Multi-ancestry genome-wide association study of kidney cancer identifies 63 susceptibility regions. Nat Genet. 2024;56(5):809-818.

The closest gene to rs7948643 is MYEOV, otherwise known as Myeloma-overexpressed gene protein. It has been implicated in multiple cancers, including pancreatic and non-small cell lung cancer, but as far as I know its role in kidney cancer has not previously been studied. See, for example, Liang E, Lu Y, Shi Y, Zhou Q, Zhi F. MYEOV increases HES1 expression and promotes pancreatic cancer progression by enhancing SOX9 transactivity. Oncogene. 2020;39(41):6437-6450.

              I will note that while EFO_0002890 appears for the first time this month, there are many associations for some of the “child terms”, such as clear cell renal cell carcinoma.

Most significant cis-pQTL!

New protein QTLs can come from massive platform-based studies, or from a more focused GWAS on just one or two specific proteins. May’s strongest cis-pQTL comes from the paper “Meta-GWAS on PCSK9 concentrations reveals associations of novel loci outside the PCSK9 locus in White populations” by Azin Kheirkhah and colleagues. While this was published in December of last year, the associations were added to the GWAS Catalog in May.

As the title suggests, the protein being studied was PCSK9. As is usually observed, the strongest genetic variants for levels of the PCSK9 protein sit close to the PCSK9 gene. This observation is some of the strongest evidence that the causal gene (for all GWAS) is usually the closest gene.

The top of that peak in the Manhattan plot sits at rs11591147 with a p-value of 7e-189. This is the Arg46Leu missense variant in the PCSK9 protein. With a protein QTL there is always the caveat that the variant may simply be affecting the detection efficiency. That is, if the antibody or aptamer binds better to the arginine (or the leucine) variant, that would show up as a pQTL even though there might be no actual change in the level of PCSK9.

One way to guard against this is to check whether that same variant has any other GWAS associations. In this case, rs11591147 is associated with a whole slew of traits directly downstream of PCSK9 activity, like hypercholesterolemia (see LAVAA plot below, data from https://r10.finngen.fi/variant/1:55039974-G-T). This suggests that rs11591147 has biological and not just technical consequences.
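That check can be sketched in a few lines. This is only a minimal illustration, assuming you already have a flat list of associations for the variant. The field names, the variant ID `rsEXAMPLE` and the numbers are hypothetical placeholders, not values from FinnGen or the GWAS Catalog.

```python
import math

GWS_NEG_LOG10 = -math.log10(5e-8)  # conventional genome-wide significance

def has_independent_support(variant, associations, protein_trait):
    """True if the variant has a genome-wide significant association with
    any trait other than the measured protein level itself."""
    return any(
        a["variant"] == variant
        and a["trait"] != protein_trait
        and a["neg_log10_p"] >= GWS_NEG_LOG10
        for a in associations
    )

# Hypothetical toy rows, for illustration only.
associations = [
    {"variant": "rsEXAMPLE", "trait": "protein X level", "neg_log10_p": 188.0},
    {"variant": "rsEXAMPLE", "trait": "downstream disease", "neg_log10_p": 40.0},
    {"variant": "rsOTHER", "trait": "protein Y level", "neg_log10_p": 90.0},
]
supported = has_independent_support("rsEXAMPLE", associations, "protein X level")
unsupported = has_independent_support("rsOTHER", associations, "protein Y level")
```

A missing second association does not prove the pQTL is an artifact. It only means this particular line of evidence is unavailable.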

LAVAA plot for PCSK9

 

Most significant trans-pQTLs with and without obvious causal genes

I am combining the “explained” and “unexplained” trans-pQTLs to highlight the growing grey zone that emerges as these protein QTL studies get larger and larger. A “simple” trans-pQTL is when protein A directly interacts with protein B; for example, variants in the IL6 receptor gene influence levels of the IL6 protein (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457199/).

So what to make of an association between rs1260326 and levels of gamma glutamyl transferase with a p-value of 1e-164? This association comes to us from Ghouse J, Sveinbjörnsson G, Vujkovic M, et al. Integrative common and rare variant analyses provide insights into the genetic architecture of liver cirrhosis. Nat Genet. 2024;56(5):827-837. The variant is the well-known and highly pleiotropic missense variant in GCKR, which encodes a glucokinase inhibitor that at first blush has no direct relationship to extracellular glutathione, the substrate of the gamma glutamyl transferase reaction.

But GCKR doesn’t just interact with glucokinase; by its activity it has a huge influence on the production of hepatic triglycerides. And gamma glutamyl transferase isn’t just a metabolic enzyme. It is a key biomarker of liver health.

So a likely hypothesis is that GCKR activity leads to liver fat which leads to liver damage which results in increased liver function test enzymes. This is consistent with the direction of effect of rs1260326 on triglyceride levels and gamma glutamyl transferase levels (https://genetics.opentargets.org/Variant/2_27508073_T_C/associations)
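Checking direction of effect only makes sense once both associations are expressed relative to the same allele. Here is a minimal sketch of that harmonization step for a biallelic variant. The betas and field names are made-up placeholders, not the Open Targets values for rs1260326.

```python
def align_beta(beta, effect_allele, target_allele, alleles):
    """Re-express a beta relative to target_allele at a biallelic variant."""
    if effect_allele not in alleles or target_allele not in alleles:
        raise ValueError("allele not present at this variant")
    # Flipping the effect allele flips the sign of the effect estimate.
    return beta if effect_allele == target_allele else -beta

def same_direction(assoc_a, assoc_b, target_allele, alleles):
    """True if both associations point the same way for target_allele."""
    a = align_beta(assoc_a["beta"], assoc_a["effect_allele"], target_allele, alleles)
    b = align_beta(assoc_b["beta"], assoc_b["effect_allele"], target_allele, alleles)
    return (a > 0) == (b > 0)

# Hypothetical effect sizes reported against different alleles.
alleles = ("T", "C")
trait_one = {"beta": 0.10, "effect_allele": "T"}
trait_two = {"beta": -0.03, "effect_allele": "C"}
concordant = same_direction(trait_one, trait_two, "T", alleles)
```

In this toy case the two betas have opposite signs as reported, but they agree once both are expressed relative to the T allele.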

I think this is a nice reminder that when you’re trying to decipher a GWAS hit it is often helpful to look at the “strongest” genetic association at that same variant, ideally confirmed by colocalization.

Most significant metabolite QTLs with and without obvious causal genes

May’s featured metabolite is urate which allows me to reuse the figure from last month’s survey, but with a change in signage:

DALL-E's version of a urate transporter

Just as with protein QTLs, sometimes new metabolite QTLs come not from a broad panel but rather a deep dive into a few or even just one metabolite.

For May we have this GWAS of urate in over one million people! Cho C, Kim B, Kim DS, et al. Large-scale cross-ancestry genome-wide meta-analysis of serum urate. Nat Commun. 2024;15(1):3441.

The strongest association has a p-value of 9e-3553 and sits at rs3733588, which lies in an intron of the gene SLC2A9. As recorded in the EGEA resource, SLC2A9 encodes a urate transporter, which was first demonstrated in 2008 (Vitart V, Rudan I, Hayward C, et al. SLC2A9 is a newly identified urate transporter influencing serum urate concentration, urate excretion and gout. Nat Genet. 2008;40(4):437-442).
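A practical aside: a p-value like 9e-3553 is far below the smallest positive double-precision number (about 5e-324), so parsing it as a float silently gives zero. A simple workaround, sketched here with a hypothetical helper name, is to keep the p-value as text and compute -log10(p) from the mantissa and exponent separately.

```python
import math

def neg_log10_p(p_text):
    """Compute -log10(p) from a p-value written as text, e.g. '9e-3553'."""
    mantissa, _, exponent = p_text.lower().partition("e")
    exp = int(exponent) if exponent else 0
    # log10(m * 10**k) = log10(m) + k, which never underflows.
    return -(math.log10(float(mantissa)) + exp)

underflowed = float("9e-3553")      # too small for a double, becomes 0.0
urate_top = neg_log10_p("9e-3553")  # roughly 3552.05
second_hit = neg_log10_p("5e-267")  # roughly 266.30
```

Working on the -log10 scale also makes it easy to rank associations whose raw p-values would all collapse to zero.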

This same study gives us the strongest metabolite QTL for which the causal gene is not obvious. The variant rs2292123 sits in an intron of SLC29A2 and has a p-value of (only) 5e-267 for uric acid. SLC29A2 encodes the equilibrative nucleoside transporter 2 (ENT2), which can transport inosine, hypoxanthine and guanine among other nucleosides and nucleobases. These are upstream of urate (uric acid), making SLC29A2 an attractive candidate gene for urate levels.

based on: Physiology of Hyperuricemia and Urate-Lowering Treatments

 

Most significant disease/phenotype trait with and without obvious causal genes

 

And now we return to this figure, which I originally used on Twitter to illustrate a “Who’s That Causal Gene” thread on uterine fibroids in 2021.

You'll have to visit X Twitter to find the answer

Sorry for the long delay in getting back to this figure in this review; consider it a pregnant pause.

 

For this month’s strongest “explained” and “unexplained” variants I’m turning to this study: Xiao C, Wu X, Gallagher CS, Rasooly D, Jiang X, Morton CC. Genetic contribution of reproductive traits to risk of uterine leiomyomata: a large-scale, genome-wide, cross-trait analysis. Am J Obstet Gynecol. 2024;230(4):438.e1-438.e15.

With a p-value of 1e-324, the strongest association reported in this paper is located at rs16991615 for the trait “Age at natural menopause and uterine leiomyomata”. I really dislike pleiotropic trait definitions because usually a gene is related to one or the other, and so before you can figure out what the causal gene is, you have to figure out which trait you’re trying to explain.

In this case, rs16991615 falls near the MCM8 gene and the locus has a very strong association with age at natural menopause; see for example: Ruth KS, Day FR, Hussain J, et al. Genetic insights into biological mechanisms governing human ovarian ageing. Nature. 2021;596(7872):393-397. MCM8 encodes the minichromosome maintenance 8 homologous recombination repair factor, which plays a significant role in DNA repair in general and also in meiosis specifically (Helderman NC, Terlouw D, Bonjoch L, et al. Molecular functions of MCM8 and MCM9 and their associated pathologies. iScience. 2023;26(6):106737). This makes MCM8 a reasonable causal gene for the trait of age at menopause, which coincides with the cessation of ovulatory function.

This same paper also reports an association for the same trait (“Age at natural menopause and uterine leiomyomata”) at rs11740768, which sits in an intron of UIMC1, with HK3, ZNF346, FGFR4, and NSD1 also nearby. None of these genes has any obvious link to age at menopause, which is why I am leaving this association as “unsolved”.

 

And that’s a wrap for this month!

What was your favorite association? What would you like to see more of in these surveys? Please let me know in the comments!

I’ll leave you with this:

What do you call a nucleus dressed up for a formal event?

A nucleo-tied!

Sana Herireche

Sales Manager for an innovative startup developing AI solutions for medicine, biology, and environmental sectors

1y

Thank you for sharing ! We have recently published a new GWAS technology that we call Next Gen GWAS. This method represents the first complete 2D epistatic interaction map to date. The 2D GWAS technology we offer is a world first. Until now, it has been nearly impossible to perform rapid calculations at this scale. Conventional GWAS models face efficiency problems, taking years to fully explore combinatorial epistatic interactions. Our 2D GWAS technology allows the evaluation of more than 80 million epistatic interactions within a few minutes. Additionally, we have developed a tool that enables the visualization of these epistatic interactions, providing valuable insights that were previously difficult to obtain. The link to the article is available here: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03202-0

Sook Wah Yee

Membrane transporters, Pharmacogenomics, Transporter biomarkers

1y

Thank you for sharing this Eric Fauman ! I am so happy to hear that SLC29A2 is strongly associated with uric acid. Yes, uric acid is a substrate of SLC29A2. I did an experiment few years ago. We did not publish it because we feel it is not so surprise as they have similar structure and also part of the pathway.

Hena Jose

Genomics | Healthtech | Lifescience | Information Technology

1y

The way you present the round up from GWAS studies is really insightful. You described about the novel trait in Renal carcinoma, then at Pcsk9 pQTL and mQTL associated with Urate all in one post. Continue your great work
