🍟 CRISPRware: Guide RNA Library Design🧬, 🧪hDNApipe: streamlining human genome analysis👤, 🦠MiCoDe: Microbiome Community Detection🔍

Stay Updated with the Latest in Bioinformatics!

Issue: 93 | Date: 04 July 2025

👋 Welcome to the Bioinformer Weekly Roundup!

In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you are a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we have got you covered. Subscribe now to stay ahead in the exciting realm of Bioinformatics!

🔬 Featured Research

69.9-kb long inverted repeat increases genome instability in a strain of Lactobacillus crispatus | Oxford Academic

This study examines a 69.9-kb inverted repeat in the genome of an L. crispatus strain and its role in increasing genome instability. Such a repeat may facilitate recombination events or structural rearrangements, affecting genome integrity and possibly influencing strain-specific traits.

Comprehensive profiling of integrative conjugative elements (ICEs) in Mollicutes: distinct catalysts of gene flow and genome shaping | Oxford Academic

This research profiles ICEs across Mollicutes, a class of wall-less bacteria. It may detail how ICEs contribute to horizontal gene transfer, genome plasticity, and adaptation, highlighting their structural diversity and evolutionary significance in shaping microbial genomes.

Inferring metabolite states from spatial transcriptomes using multiple graph neural network | bioRxiv

This study introduces MGFEA, a graph-based algorithm that infers metabolite levels from spatial and single-cell transcriptomic data. It integrates gene interaction and spatial graphs guided by genome-scale metabolic models to estimate metabolic fluxes. MGFEA improves inference accuracy by incorporating metabolome data and addresses limitations of prior models like scFEA and Compass.

A systematic assessment of phylogenomic approaches for microbial species tree reconstruction | bioRxiv

The authors evaluate various phylogenomic methods for reconstructing microbial species trees, focusing on gene-level evolutionary histories and their impact on genome-wide phylogenies. They propose a visualization framework using low-dimensional tree space to identify outlier gene histories and improve species tree estimation. The approach aids in selecting gene sets for robust phylogenomic inference.

Machine learning differentiates between bulk and pseudo-bulk RNA-seq datasets | bioRxiv

This research presents bulk2sc, a variational autoencoder model that generates synthetic single-cell RNA-seq data from bulk RNA-seq. It deconvolves pseudo-bulk datasets by learning cell-type distributions, enabling single-cell level insights from bulk data. The model is validated against real scRNA-seq data and offers a cost-effective alternative for disease studies.

Novel binning-based methods for model fitting and data splitting improved machine learning imbalanced data | bioRxiv

The study benchmarks deep learning-based metagenomic binning tools, highlighting COMEBin and GenomeFace for their accuracy and speed. It emphasizes the effectiveness of multi-sample binning and embedding space partitioning for low-coverage datasets. The work provides standardized workflows for evaluating binning performance and improving MAG recovery.

Extensive data mining uncovers novel diversity among members of the rare biosphere within the Thermoplasmatota | BMC Microbiome

Researchers identified three novel orders within the class Ca. Penumbrarchaeia of Thermoplasmatota using metagenomic mining and enrichments. These rare biosphere members exhibit unique gene content and potential roles in organic matter degradation in anoxic environments. The study highlights their functional novelty and habitat specificity.

Lineage-specific expansions of polinton-like viruses in photosynthetic cryptophytes | BMC Microbiome

Using long-read sequencing, the study uncovers over a thousand polinton-like viruses (PLVs) in cryptophyte genomes, particularly Rhodomonas lacustris. These PLVs show lineage-specific expansions and diverse replication strategies. The findings link PLVs to host-virus interactions and suggest their role as endogenous viral elements in freshwater protists.

Comparative transcriptomic analysis reveals the important role of hepatic fatty acid metabolism in the acute heat stress response in chickens | BMC Genomics

This study analyses transcriptomic changes across multiple chicken tissues under acute heat stress. The liver shows significant differential gene expression, with fatty acid metabolism pathways playing a central role. Functional validation of FASN in hepatocytes confirms its involvement in mitigating heat-induced metabolic disruptions.

Complete chloroplast genomes of 25 mulberry plants: insight into genome characteristics, comparative analysis and phylogenetic relationships | BMC Genomics

The authors sequenced and analysed the chloroplast genomes of 25 mulberry (Morus) plants, identifying conserved structures and SSR polymorphisms. Phylogenetic analyses grouped them into three clades based on usage (leaf, fruit, wild). The study provides SSR markers for classification and insights into mulberry phylogeny.

🛠️ Latest Tools

2dSpAn-Auto: an automated tool for analysis of two-dimensional dendritic spine images | BMC Bioinformatics

2dSpAn-Auto provides two workflows—binary skeletonization (2dSpAn-Auto.b) and fuzzy skeletonization (2dSpAn-Auto.f)—to segment and quantify dendritic spines in 2D maximum intensity projection images. It extracts spine density and morphometry metrics (area, length, head width, neck widths) along with total dendrite length via automated batch processing with optional expert parameter tuning through a GUI. Validation across in vitro, ex vivo, and in vivo imaging demonstrates high accuracy and reproducibility under varying protocols. The open-source tool, released under GPL v3, addresses the need for fast, modality-agnostic spine analysis in neurological research and clinical studies.

The source code is available here.

LabOps: A flexible self-hosted workflow of open-source tools for efficient collaboration within research laboratories | PLOS Computational Biology

LabOps introduces a self-hosted Free and Open-Source Software workflow that integrates tools for collaborative writing, instant messaging, data storage, and more, tailored for academic research labs. It provides ready-to-deploy YAML configurations for Mattermost, Nextcloud, Radicale, and OnlyOffice, enabling secure, customizable communication and resource sharing without proprietary constraints. The paper outlines adoption strategies, discusses limitations of FOSS versus commercial suites, and presents a case study of cross-lab collaboration. LabOps aims to enhance data sovereignty and long-term accessibility in research environments.

The source code is available here.

scRepertoire 2: Enhanced and efficient toolkit for single-cell immune profiling | PLOS Computational Biology

scRepertoire 2 is an R package update for integrated analysis of single-cell adaptive immune receptor sequencing alongside transcriptomic data. New features include expanded clonotype tracking workflows, diversity metrics, longitudinal and comparative visualization modules, and seamless compatibility with Seurat and SingleCellExperiment frameworks. Benchmarking shows an 85.1% speed improvement and 91.9% memory reduction over version 1 across diverse repertoire sizes. The toolkit, available under the MIT license via Bioconductor and GitHub, supports end-to-end immune profiling in health and disease studies.

The source code is available here.
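As a rough illustration of what a repertoire diversity metric measures, here is a minimal Python sketch of the Shannon index computed over clonotype labels. It is not scRepertoire's R code or API; the function name and the example clonotype labels are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(clonotypes):
    """Shannon index H = -sum(p_i * ln p_i) over clonotype frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical clonotype labels, one per cell.
expanded = ["clone1"] * 8 + ["clone2", "clone3"]      # one dominant clone
even = ["clone1", "clone2", "clone3", "clone4"] * 3   # evenly spread clones

print("expanded repertoire:", round(shannon_diversity(expanded), 3))
print("even repertoire:", round(shannon_diversity(even), 3))
```

A repertoire dominated by one expanded clone scores lower than an evenly distributed one, which is why such indices are used to compare samples or time points.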

CRISPRware: a software package for contextual gRNA library design | BMC Genomics

CRISPRware is a locally installable tool for high-throughput design of guide RNA libraries targeting coding, noncoding, and translated genomic regions. It integrates modern on-target scoring algorithms (including deep learning–based predictors) and ensemble strategies, sensitive off-target search, and comprehensive annotations (gene, TSS, SNPs) for five CRISPR modalities (knockout, activation, inhibition, base editing, knockdown). The authors demonstrate genome-wide gRNA generation for six model organisms and host results via UCSC Genome Browser sessions. CRISPRware enhances flexibility and customizability over existing web portals.

The source code is available here.

MiCoDe: A web tool for performing microbiome community detection using a Bayesian weighted stochastic block model | Oxford Academic

MiCoDe is a user-friendly web application for unsupervised detection of microbial communities from taxonomic abundance data. It implements a Bayesian weighted stochastic block model tailored to address high dimensionality, compositionality, zero inflation, and nonlinearity inherent to microbiome sequencing. Users upload a CSV of taxa abundances (samples × taxa), select transformations, network estimation methods, and community numbers (with sensible defaults), and receive interactive network visualizations. The source R code is available on GitHub for local use.

The source code is available here.
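To make the compositionality and zero-inflation issues concrete, here is a minimal Python sketch of the centered log-ratio (CLR) transform, a common choice for compositional count data. Whether MiCoDe offers this exact transformation is an assumption; the pseudocount value and the example table are hypothetical, and this is independent of MiCoDe's own R code.

```python
import math


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxa counts.

    A pseudocount is added first so that zero counts have a defined log.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical samples x taxa table (rows are samples, columns are taxa).
abundances = {
    "sample_A": [120, 0, 30, 850],
    "sample_B": [10, 5, 0, 985],
}

clr_table = {name: clr_transform(row) for name, row in abundances.items()}
for name, row in clr_table.items():
    print(name, [round(v, 3) for v in row])
```

Each transformed row sums to zero, so values express abundance relative to the sample's geometric mean rather than to sequencing depth.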

AutoPM3: Enhancing Variant Interpretation via LLM-driven PM3 Evidence Extraction from Scientific Literature | Oxford Academic

AutoPM3 automates extraction of PM3 criterion evidence—variant co-occurrence in trans—from literature using open-source large language models. The pipeline comprises four modules: variant augmentation for alternative representations, a Retrieval-Augmented Generation system to locate relevant text passages, a TableLLM with Text2SQL for table parsing, and an evidence synthesizer. Evaluation on PM3-Bench (1,027 variant-publication pairs) shows improvements in variant hit rate and in trans identification over baseline methods. A Streamlit interface facilitates local deployment for clinical variant interpretation workflows.

The source code is available here.

AOP-helpFinder 3.0: from text mining to network visualization of key event relationships, and knowledge integration from multiple sources | Oxford Academic

AOP-helpFinder 3.0 extends previous text-mining tools by incorporating additional data sources and graph-based methods to identify stressor–event and event–event associations across molecular initiating events, key events, and adverse outcomes. It automatically annotates mined relationships with toxicological database entries (AOP-Wiki, KEGG, Reactome, DisGeNET, etc.) and offers interactive network visualization on the web server. The updated pipeline enhances integrative toxicology by streamlining adverse outcome pathway development directly from PubMed abstracts.

The source code is available here.

hDNApipe: streamlining human genome analysis and interpretation with an intuitive and user-friendly interface | Oxford Academic

hDNApipe is an end-to-end pipeline for human genomic sequencing data that delivers variant calling (SNVs, INDELs, SVs, CNVs), annotation, and optional visualization through both a command-line interface and a Tkinter-based GUI. Distributed as a Docker container for easy setup, it supports WGS, WES, and targeted panels in germline and somatic contexts. Benchmarking against existing pipelines demonstrates competitive precision, sensitivity, and runtime efficiency. hDNApipe simplifies customization through parameter files and dual-mode operation, facilitating rapid genomic analysis deployment.

The source code is available here.

📰 Community News

Scientists trace leprosy’s roots in South America back 4,000 years | News Medical Life Sciences

Researchers sequenced 4,000-year-old genomes from skeletal remains in Chile, identifying Mycobacterium lepromatosis, a pathogen linked to severe forms of leprosy. The findings suggest leprosy was present in South America long before European contact, challenging previous assumptions about its historical spread.

Targeting a key enzyme could reverse early Parkinson's effects | News Medical Life Sciences

A Stanford-led study in mice found that inhibiting the overactive LRRK2 enzyme, linked to a genetic form of Parkinson’s, may help preserve dopamine-producing neurons. The research highlights a potential therapeutic strategy for early-stage intervention.

Immune Markers Help Identify Subgroups of ME/CFS Patients | The Scientist

A study profiled cerebrospinal fluid from ME/CFS patients and identified distinct neuroinflammatory protein signatures. These immune markers revealed subgroups within the disease, offering insights into its heterogeneous nature and potential diagnostic pathways.

MexOMICs Maps the Genetic and Social Landscape of Disease in Mexico | The Scientist

The MexOmics consortium is collecting genetic, clinical, and social data from twins and patients with lupus and Parkinson’s disease. By integrating functional genomics with community engagement, the initiative aims to understand how genetic and environmental factors shape disease in Mexico.

Google DeepMind Unveils AlphaGenome: a Unified AI Model for High-Resolution Genome Interpretation | InfoQ

AlphaGenome is a new AI model developed to predict the regulatory impact of genetic variants across the genome. It integrates convolutional and transformer architectures to analyse long-range DNA sequences and supports variant interpretation across multiple molecular modalities.

Single-cell transcriptomes of immune cells offer insight into juvenile idiopathic arthritis | News Medical Life Sciences

Using single-cell RNA sequencing, researchers analysed immune cells from JIA patients and identified subtype-specific inflammatory profiles. The study revealed distinct cellular interactions and signalling pathways, contributing to improved classification and understanding of JIA pathogenesis.

📅 Upcoming Events

Decision Trees, Survival Trees, and Random Forest: Practical Examples with R Programming | BTEP

This training demonstrates practical applications of decision trees, survival analysis, and random forest models using R. It includes techniques for handling censored data and building predictive models, with a focus on statistical programming and interpretation of model outputs.

Microbiology Week Virtual Event Series 2025 | labroots

This event will focus on advances in microbiology, including infectious diseases, antimicrobial resistance, and microbiome diagnostics. Expected outcomes include insights into AI-driven genomics, machine learning for gene annotation, and multi-omics approaches for microbial community profiling. Sessions will also address microbial adaptation to climate change and environmental microbiology.

Determination Of Cell Surface Markers On Circulating Immune Cells As Biomarkers For Disease Monitoring Using Spectral Flow Cytometry | labroots

This webinar will present spectral flow cytometry as a tool for deep immune profiling to identify blood-based biomarkers in diseases like COVID-19 and Giant Cell Arteritis. It will highlight distinct cell surface markers and phenotypic shifts linked to disease severity, validated through scRNA-seq integration. The session aims to demonstrate the method’s value in translational immunology and non-invasive disease monitoring.

📚 Educational Corner

Building Trust with Code: Validating Shiny Apps in Regulated Environments | R-bloggers

The article outlines best practices for validating Shiny applications in regulated domains such as healthcare, pharma, and finance. It emphasizes the importance of reproducibility, traceability, and documentation to meet compliance standards. Key recommendations include modular code design, separation of UI and logic, version control, and reproducible environments. Common pitfalls like hardcoded paths, global variables, and lack of testing are identified, with practical solutions offered to enhance reliability and maintainability.

How to Use R with Excel workshop | R-bloggers

The workshop introduces practical methods for integrating R with Excel, focusing on importing/exporting workbooks and performing typical Excel-based analyses within R. It covers data visualization, row and column operations, and reproducibility techniques. Led by a biostatistics PhD candidate, the session emphasizes public health and biomedical research applications. Participants receive resources for continued learning and can engage through Q&A segments.

Counting Digits Quickly | R-bloggers

The article explores performance optimization in R by comparing digit-counting implementations across languages, including R, Julia, and Fortran. It introduces the {quickr} package, which transpiles R code into Fortran to enhance execution speed. The author tests various approaches, highlighting differences in syntax, memory management, and computational efficiency. The post includes reflections on Fortran’s continued relevance in scientific computing and its integration with R for high-performance tasks.
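For readers who want a feel for this kind of comparison, here is a small Python sketch under one assumed reading of "digit counting": the number of decimal digits in an integer. The exact problem in the article may differ, and both function names are hypothetical. It contrasts a string-based approach with repeated integer division and times them with the standard timeit module.

```python
import timeit


def count_digits_str(n: int) -> int:
    """Count decimal digits by converting to a string."""
    return len(str(abs(n)))


def count_digits_div(n: int) -> int:
    """Count decimal digits by repeated integer division by 10."""
    n = abs(n)
    digits = 1
    while n >= 10:
        n //= 10
        digits += 1
    return digits


# Both approaches should agree on every input.
sample = [0, 7, 10, 999, 1000, 123456789]
print([(n, count_digits_str(n), count_digits_div(n)) for n in sample])

# Rough timing comparison; absolute numbers depend on the machine.
for fn in (count_digits_str, count_digits_div):
    elapsed = timeit.timeit(lambda: [fn(n) for n in sample], number=1000)
    print(fn.__name__, f"{elapsed:.4f}s")
```

Integer arithmetic is used instead of a logarithm because floating-point rounding can miscount digits near powers of ten.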

Tool Calling with Local LLMs: A Practical Evaluation | Docker

The article evaluates local large language models (LLMs) for structured tool calling in agentic applications using Docker Model Runner. It documents manual and scaled testing across models under 10B parameters, including xLAM-2-8b-fc-r and watt-tool-8B. Key issues observed include premature tool invocation, incorrect tool selection, malformed arguments, and ignored responses. A leaderboard ranks models based on performance in a simulated shopping assistant scenario, highlighting variability in tool-handling capabilities among local LLMs.
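Failure modes such as incorrect tool selection and malformed arguments can be caught before a tool is executed. The following Python sketch shows one way to validate a model-emitted tool call; the tool registry, tool names, and JSON shape are hypothetical illustrations for a shopping-assistant scenario, not the evaluation harness described in the article.

```python
import json

# Hypothetical registry: tool name -> required argument names and types.
TOOLS = {
    "search_products": {"query": str},
    "add_to_cart": {"product_id": str, "quantity": int},
}


def validate_tool_call(raw: str) -> list[str]:
    """Return a list of problems in a tool call (empty list means it looks valid)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["tool call must be a JSON object"]

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return [f"unknown tool: {name!r}"]

    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    for key, expected in TOOLS[name].items():
        if key not in args:
            problems.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            problems.append(f"argument {key} should be {expected.__name__}")
    for key in args:
        if key not in TOOLS[name]:
            problems.append(f"unexpected argument: {key}")
    return problems


good = '{"name": "add_to_cart", "arguments": {"product_id": "sku-1", "quantity": 2}}'
bad = '{"name": "add_to_cart", "arguments": {"product_id": "sku-1", "quantity": "2"}}'
print(validate_tool_call(good))
print(validate_tool_call(bad))
```

Counting how often each kind of problem appears across many prompts gives a simple per-model score of tool-handling reliability.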

Connect with Us

Stay connected and engage with us on social media for daily updates, discussions, and more!

📬 Subscribe

Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.

Subscribe Now

We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!


Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.

Contact: bioinformatics@zifornd.com

Copyright © 2025, Bioinformer Weekly Roundup. All rights reserved.


