🧫Macrophage TBK1 Signaling in Breast Cancer🎗️, 🕸️NetStart 2.0: Prediction of Eukaryotic Translation Initiation🏁, ⚠️GeneRiskCalc: Analysis webtool🧮

Stay Updated with the Latest in Bioinformatics!

Issue: 💯 | Date: 22 August 2025

👋 Welcome to the Bioinformer Weekly Roundup!

In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you are a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we have got you covered. Subscribe now to stay ahead in the exciting realm of Bioinformatics!

👩‍🔬 Featured Research

A novel anoikis related gene prognostic model for colorectal cancer based on single cell sequencing and bulk transcriptome analyses | Scientific Reports

The study aimed to identify prognostic biomarkers for colorectal cancer (CRC) by analyzing anoikis-related genes (ANRGs) using single-cell and bulk transcriptomic data. A 10-gene prognostic model was developed through LASSO-Cox regression, showing strong predictive performance with AUCs of 0.744, 0.797, and 0.755 for 1-, 3-, and 5-year survival. These ANRGs offer potential for CRC risk stratification and prognosis assessment.

Druggable genome-wide Mendelian randomization integrating GWAS and eQTL/pQTL data identifies targets for lung squamous cell carcinoma | Scientific Reports

The study aimed to identify causal genes associated with lung squamous cell carcinoma (LUSC) using genome-wide Mendelian randomization (MR) with eQTL and pQTL data. Eight genes—DNMT1, ACSS2, YBX1, SELENOS, PPARA, MST1, CPA4, and MPO—were linked to LUSC risk. Prognostic analysis showed gene expression correlated with survival outcomes, and immune infiltration was reduced in tumor samples. Single-cell analysis revealed cell-type-specific expression patterns, suggesting potential therapeutic targets.

Proteogenomic approach to immunopeptidomics of ovarian tumors identifies shared peptide vaccine candidates | npj Vaccines

The study aimed to identify shared tumor antigens in high-grade serous ovarian cancer (HGSC) using a proteogenomic immunopeptidomics approach. By analyzing HLA class I-presented peptides from 11 patient tumors, it integrated canonical and personalized transcriptomes to uncover candidate antigens. Thirteen peptides were selected based on binding predictions and validated for immunogenicity. The findings support personalized antigen discovery for HGSC immunotherapy.

Integrated multi-omic analysis reveals novel subtype-specific regulatory interactions in pediatric B-cell acute lymphoblastic leukemia | bioRxiv

Researchers applied an integrated multi-omic framework to study pediatric B-cell acute lymphoblastic leukemia. By combining genomic, transcriptomic, and epigenomic data, the analysis uncovered subtype-specific regulatory interactions. These findings highlight mechanisms that may influence disease development and heterogeneity across subtypes.

Exploring the shared genetic basis between autism spectrum disorder and gastrointestinal disorders: a bioinformatic study | Scientific Reports

This bioinformatic study examined genetic correlations between autism spectrum disorder and gastrointestinal conditions. The analysis identified shared genetic loci and biological pathways linking the two. Results suggest overlapping molecular mechanisms that could help explain the frequent co-occurrence of these disorders.

Root zone diazotrophs of wheat in coastal saline soils from water-scarce regions of the Bohai Sea | Scientific Data

The study characterized root-associated diazotrophic bacterial communities in wheat grown in saline soils near the Bohai Sea. Data on microbial diversity, abundance, and community composition were collected under water-scarce conditions. Findings provide a resource for understanding microbial contributions to wheat growth in challenging environments. 

Explicit Scale Simulation for analysis of RNA-sequencing count data with ALDEx2 | NAR Genomics and Bioinformatics

This paper presents Explicit Scale Simulation as an approach to evaluate the performance of RNA-seq differential expression methods. Using ALDEx2 as a case study, the work assesses how compositional and technical factors affect analysis outcomes. The framework provides a way to test robustness of statistical tools on RNA-seq count data.

Macrophage TBK1 signaling drives the development and outgrowth of breast cancer brain metastasis | PNAS

The study investigates the role of macrophage TBK1 signaling in breast cancer brain metastasis (BCBM). It identifies that BCBM cell-derived MMP1 activates TBK1 in tumor-associated macrophages (TAMs), which then produce GM-CSF to promote cancer cell migration and invasion. Inhibiting TBK1 or GM-CSF signaling suppresses brain metastatic outgrowth. These findings highlight TBK1 as a potential therapeutic target in BCBM.

G-quadruplex stabilization induces DNA breaks in pericentromeric repetitive DNA sequences in B lymphocytes | PNAS

The study examines how pyridostatin (PDS)-induced stabilization of G-quadruplex (G4) DNA structures affects genome stability in B lymphocytes. PDS triggers DNA damage and chromosomal rearrangements in ribosomal and pericentromeric regions. Primary cells show tetraploidy and dicentric chromosomes, while malignant cells activate checkpoints to mitigate these effects, revealing distinct cellular responses and potential for selective cancer targeting.

🧰 Latest Tools

SV-MeCa: an XGBoost-based meta-caller approach for structural variant calling from short-read data | BMC Bioinformatics

This tool merges outputs from seven short-read SV callers using SURVIVOR, then extracts quality features and feeds them into pre-trained insertion- or deletion-specific XGBoost classifiers to assign a probability score for each consensus SV. Output can be ranked flexibly based on probability thresholds.

The source code is available here.

Learning-based parallel acceleration for HaplotypeCaller | BMC Bioinformatics

This article presents LPA (Learning-based Parallel Acceleration), a framework designed to speed up GATK HaplotypeCaller, a widely used variant-calling tool in genomics. It uses a learned model to predict the computational complexity of genomic data blocks, enabling adaptive segmentation and balanced task scheduling formulated as a Multi-Knapsack Problem. LPA reduces computational skew and long-tail latency, delivering a significant speedup while maintaining >99.9% accuracy. The framework is implemented in Spark and shows good performance and resource utilization across diverse datasets.

The source code is available here. 
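To make the scheduling idea concrete, here is a minimal sketch of balancing blocks across workers by predicted cost. It is not LPA's implementation: it swaps the Multi-Knapsack formulation for a simple greedy "longest processing time first" heuristic, and the function name, block names, and cost values are hypothetical.

```python
import heapq

def balance_blocks(predicted_costs, n_workers):
    """Greedy LPT heuristic (an assumption, not LPA's solver):
    hand the costliest remaining block to the least-loaded worker."""
    # Min-heap of (current_load, worker_id) so the lightest worker pops first.
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    # Visit blocks from most to least expensive according to the predictions.
    ordered = sorted(predicted_costs.items(), key=lambda kv: kv[1], reverse=True)
    for block, cost in ordered:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(block)
        heapq.heappush(heap, (load + cost, worker))
    return assignment

# Hypothetical predicted costs for five genomic blocks.
costs = {"block1": 8, "block2": 7, "block3": 6, "block4": 5, "block5": 4}
plan = balance_blocks(costs, n_workers=2)
for worker, blocks in plan.items():
    print(worker, blocks, sum(costs[b] for b in blocks))
```

The greedy rule keeps the gap between the busiest and idlest worker no larger than the most expensive single block, which is the kind of skew reduction the summary describes. An exact knapsack-style solver could do better.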

NetStart 2.0: prediction of eukaryotic translation initiation sites using a protein language model | BMC Bioinformatics

This article presents NetStart 2.0, a deep learning model that predicts translation initiation sites (TISs) in eukaryotic transcripts using the ESM-2 protein language model. It integrates peptide-level context with nucleotide-level features to distinguish coding from non-coding regions across 60 diverse species. Trained as a single model, NetStart 2.0 achieves state-of-the-art accuracy in identifying correct TISs even in complex transcript structures. Its success highlights the power of protein language models in bridging transcriptomic and proteomic data for biological prediction tasks.

The source code is available here.

consHLA: a next generation sequencing consensus-based HLA typing workflow | BMC Bioinformatics

This work introduces consHLA, a workflow for HLA typing using consensus-based analysis of next-generation sequencing data. It integrates multiple HLA typing tools and applies a consensus strategy to improve allele assignment consistency. The pipeline is designed to handle diverse sequencing platforms and data formats.

The source code is available here.
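As a rough illustration of what a consensus strategy can look like, the sketch below takes a majority vote over per-tool genotype calls. The summary above does not say which rule consHLA applies, so the voting rule, the `min_support` parameter, and the tool names are all assumptions made for this example.

```python
from collections import Counter

def consensus_genotypes(calls, min_support=2):
    """Majority vote per locus (a hypothetical rule, not consHLA's own).

    calls maps tool name -> {locus: (allele1, allele2)}.
    Returns {locus: sorted allele pair, or None when no call has
    enough support or the top two calls are tied}.
    """
    votes = {}
    for tool_calls in calls.values():
        for locus, pair in tool_calls.items():
            # Sort the pair so allele order does not split the vote.
            votes.setdefault(locus, Counter())[tuple(sorted(pair))] += 1
    consensus = {}
    for locus, counts in votes.items():
        (top, n), *rest = counts.most_common(2)
        tied = bool(rest) and rest[0][1] == n
        consensus[locus] = top if n >= min_support and not tied else None
    return consensus

# Hypothetical calls from three hypothetical tools.
calls = {
    "tool_a": {"HLA-A": ("A*02:01", "A*01:01"), "HLA-B": ("B*07:02", "B*08:01")},
    "tool_b": {"HLA-A": ("A*01:01", "A*02:01"), "HLA-B": ("B*07:02", "B*44:02")},
    "tool_c": {"HLA-A": ("A*01:01", "A*03:01"), "HLA-B": ("B*15:01", "B*08:01")},
}
print(consensus_genotypes(calls))
```

Loci where the tools disagree completely come back as `None`, leaving them for manual review instead of guessing.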

GeneRiskCalc: a web-based tool for genetic risk association analysis in case–control studies | BMC Bioinformatics

GeneRiskCalc is a web-based application developed for analyzing genetic risk associations in case–control studies. It supports various statistical models and includes features for data preprocessing, visualization, and interpretation. The tool is designed to facilitate reproducible and accessible genetic risk analysis.

The tool is available here.

Autoencoders with shared and specific embeddings for multi-omics data integration | BMC Bioinformatics   

This study proposes a deep learning framework using autoencoders with shared and specific embeddings to integrate multi-omics data. The model captures both common and modality-specific features across omics layers. It is evaluated on benchmark datasets to assess its performance in clustering and classification tasks.

The source code is available here.

fSuSiE enables fine-mapping of QTLs from genome-scale molecular profiles | bioRxiv

This study introduces fSuSiE, a statistical method designed to improve fine-mapping of quantitative trait loci (QTLs) across genome-scale datasets. The tool models molecular profiles with greater precision to identify causal variants. It leverages functional genomic data to enhance detection of regulatory variants underlying complex traits.

The source code for fsusieR is available here.

Rakaia: interactive discovery of spatial biology at scale | bioRxiv

This study introduces Rakaia, a browser-based tool designed for interactive analysis of spatial biology data, including multiplexed imaging and spatial transcriptomics. It supports visualization, annotation, and feature-based querying across thousands of images without requiring code. The platform was applied to over 200 images of non-malignant human breast tissue to identify cell types of interest. Rakaia aims to facilitate scalable exploration of spatial datasets.

The tool is available here.

PLNMFG: Pseudo-label guided non-negative matrix factorization model with graph constraint for single-cell multi-omics data clustering | PLOS Computational Biology

This research presents PLNMFG, a non-negative matrix factorization model tailored for clustering single-cell multi-omics data. The model integrates latent representation and cluster structure learning, incorporates prior biological knowledge, and performs adaptive imputation for dropout handling. It applies graph Laplacian constraints and learns omic-specific weights to preserve structural and similarity information. Evaluation on eight benchmark datasets highlights its clustering performance and computational efficiency.

The source code is available here.

G4SNVHunter: An R/Bioconductor Package for Evaluating SNV-Induced Disruption of G-Quadruplex Structures Leveraging the G4Hunter Algorithm | PLOS Computational Biology

This study introduces G4SNVHunter, an R/Bioconductor package designed to detect single-nucleotide variants that may affect G-quadruplex (G4) formation. Built on the G4Hunter algorithm, it quantitatively evaluates G4-forming potential in genomic regions. The tool was applied to archaic introgressed variants from Neandertal and Denisovan genomes, identifying ~5,800 variants within G4 regions, with ~230 potentially reducing G4 formation propensity. The package supports targeted experimental design for further functional validation.

The source code is available here.
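For readers unfamiliar with G4Hunter-style scoring, here is a simplified sketch of the idea. It is not the G4SNVHunter API. It assumes the commonly described scoring rule: each G in a run of n consecutive Gs scores min(n, 4), each C scores the negative equivalent, A and T score 0, and the sequence score is the mean. The function names and the example sequence are hypothetical.

```python
from itertools import groupby

def g4hunter_score(seq):
    """Mean G4Hunter-style score of a sequence (simplified, whole-sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    total = 0
    for base, run in groupby(seq):
        n = len(list(run))
        per_base = min(n, 4)  # runs longer than 4 are capped at 4
        if base == "G":
            total += per_base * n
        elif base == "C":
            total -= per_base * n
    return total / len(seq)

def apply_snv(seq, pos, alt):
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]

# Hypothetical G-rich region and a G>A variant that breaks the second G-run.
ref = "GGGTGGGTGGGTGGG"
alt = apply_snv(ref, 5, "A")
print(g4hunter_score(ref), g4hunter_score(alt))
```

Comparing the reference and variant scores shows how a single substitution inside a G-run can lower the predicted G4-forming propensity. A real analysis would score sliding windows across the genome, not one short string.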

🗞️ Community News

UCSF researchers uncover how breast cancer steals energy from fat cells | News Medical Life Sciences

Triple-negative breast cancer cells form gap junctions with adjacent adipocytes, triggering lipolysis and energy release that supports tumor growth. Blocking these junctions inhibited tumor progression in models. The study used patient-derived samples and lab models to elucidate this mechanism.

Genetic variants influencing vitamin D synthesis, metabolism, and transport | News Medical Life Sciences

A Canadian review outlines how polymorphisms in genes like DHCR7, CYP2R1, CYP27B1, and GC affect vitamin D synthesis, transport, and metabolism. These variants influence circulating levels and response to supplementation, with implications for personalized nutrition and disease risk.

Metabolic syndrome linked to higher risk of Parkinson's disease | News Medical Life Sciences

A longitudinal study of over 467,000 individuals found that metabolic syndrome increases Parkinson’s disease risk by ~40%. Meta-analysis confirmed a 29% elevated risk. Genetic predisposition compounded the effect, suggesting metabolic health may modulate neurodegenerative susceptibility.

Epigenetic Noise Helps Cells Adopt New Identities to Train the Immune System | Genetic Engineering & Biotechnology News

Researchers identified a reversible epigenetic switch involving Tcf7 silencing that governs T-cell memory formation. Memory fate decisions occur at multiple infection stages, allowing adaptive flexibility. The study combined transcriptomics, live-cell imaging, and mathematical modeling.

Rapidly Evolving Human Genomic Region Tied to Neural Development, Flexible Thinking | Genetic Engineering & Biotechnology News

HAR123, a human-accelerated noncoding enhancer, was shown to regulate neuroectoderm formation and neural progenitor cell development. Knockout models exhibited altered neuron/glia ratios and impaired cognitive flexibility, suggesting a role in human-specific brain traits.

Scientists program cells to create biological qubit in multidisciplinary research | Phys.Org

University of Chicago researchers engineered fluorescent proteins into spin qubits within living cells. These protein-based quantum sensors enable nanoscale biological measurements and may advance quantum-enabled imaging and material design using cellular self-assembly.

Ionic liquids turn whole organs transparent like glass while preserving intricate tissue details | Phys.Org

The VIVIT technique uses ionic liquids to vitrify tissues, rendering whole organs optically transparent without structural distortion. It enhances fluorescent signals and enables high-resolution 3D imaging across scales, supported by a reconstruction software system (TARRS).

🗓️ Upcoming Events

BTEP: Introduction to Quarto for Scientific Writing | NIH National Cancer Institute

The workshop introduces Quarto, an open-source scientific publishing system that integrates code, narrative, and results into dynamic documents. Participants learn to create reproducible reports using markdown, YAML headers, and code chunks, with support for multiple programming languages. The session emphasizes Quarto’s compatibility with RStudio, VS Code, and Jupyter, and its alignment with FAIR data principles. A hands-on example using differential expression data demonstrates its utility in scientific communication.

🎓 Educational Corner

How to calculate partial correlation controlling cancer types | Chatomics

This tutorial explains how to compute partial correlations between variables while adjusting for cancer type as a confounding factor. It uses linear modeling to isolate direct relationships in multi-cancer datasets. The approach helps clarify associations that may otherwise be masked by disease heterogeneity.
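A minimal sketch of the idea, assuming a single categorical covariate: regressing a variable on cancer type alone gives residuals equal to the values minus their per-type mean, so the partial correlation is the Pearson correlation of the type-centered values. The function names and toy data below are hypothetical and are not taken from the tutorial.

```python
from collections import defaultdict
from math import sqrt

def center_within_groups(values, groups):
    """Subtract each group's mean (the residuals of a model value ~ group)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

def partial_corr_given_group(x, y, groups):
    """Correlation of x and y after removing group-level mean differences."""
    return pearson(center_within_groups(x, groups),
                   center_within_groups(y, groups))

# Toy data: two cancer types with different baselines for both variables.
x = [1, 2, 3, 11, 12, 13]
y = [3, 2, 1, 13, 12, 11]
cancer_type = ["BRCA"] * 3 + ["LUAD"] * 3
print("pooled:", pearson(x, y))
print("adjusted:", partial_corr_given_group(x, y, cancer_type))
```

In this toy example the pooled correlation is strongly positive only because the two cancer types sit at different baselines. Within each type the relationship is negative, which is what the adjusted value reports.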

Learning Antimicrobial Resistance (AMR) genes with Bioconductor | R-bloggers

The post introduces workflows for identifying antimicrobial resistance genes using Bioconductor tools. It covers data import, annotation, and visualization techniques tailored to resistance profiling. The guide emphasizes reproducibility and integration with existing genomic datasets.

ATAC Sequence Analysis | Bioinformatics Workbook

This resource outlines a complete workflow for ATAC-seq data analysis, including preprocessing, peak calling, and downstream interpretation. It incorporates command-line tools and R-based visualization to support reproducible research. The guide is structured to assist users in exploring chromatin accessibility across samples.

Tertiary analysis of MAS-Seq single cell data with popular community tools | Iso-Seq

The article describes how to perform tertiary analysis on MAS-Seq single-cell transcriptomic data using community-supported tools. It includes steps for isoform quantification, clustering, and visualization of transcript diversity. The workflow supports exploration of cell-type-specific isoform expression.

cBioPortal API Workshop | cBioPortal

This workshop introduces participants to the cBioPortal API for accessing cancer genomics data programmatically. It covers authentication, data querying, and integration into custom analysis pipelines. The session is designed to support researchers in building automated workflows using public cancer datasets.

Connect with Us

Stay connected and engage with us on social media for daily updates, discussions, and more!

· Website

· LinkedIn

· GitHub

📬 Subscribe

Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.

Subscribe Now

We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!


Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.

Contact: bioinformatics@zifornd.com

Copyright © 2025, Bioinformer Weekly Roundup. All rights reserved.

