Computational Biology Resources

Explore top LinkedIn content from expert professionals.

  • View profile for Michael Bass, M.D.
    Michael Bass, M.D. is an Influencer

    LinkedIn Top Voice | Gastroenterologist | Medical Director @ Oshi Health

    30,085 followers

    AI just ran its own multidisciplinary tumor board. And nailed the diagnosis + treatment.

    This was a full-stack oncology reasoning engine, pulling from imaging, pathology, genomics, guidelines, and literature in real time. A new paper in Nature Cancer describes how researchers built a GPT-4-powered multitool agent that:
    • Interprets CT & MRI scans with MedSAM
    • Identifies KRAS, BRAF, and MSI status from histology
    • Calculates tumor growth over time
    • Searches PubMed + OncoKB
    • Synthesizes everything into a cited, evidence-based treatment plan

    In short: it acts like a multidisciplinary team.

    Results:
    • Accuracy jumped from 30% (GPT-4 alone) to 87%
    • Correct treatment plans in 91% of complex cases
    • Every conclusion backed by a verifiable citation

    This is bigger than oncology. Any field that relies on multi-modal data and cross-domain reasoning, like my field of GI (GI + mental health + nutrition + exercise), could benefit from this collaborative AI architecture.

    Despite the visual, it doesn’t replace the human team; it augments it. Providers still decide. But now, they do it faster, with more context, and less cognitive fatigue.

    #AI #HealthcareOnLinkedIn #Healthcare #Cancer
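    The orchestration pattern described in this post (route one case through specialist tools, then have the model synthesize a cited plan) can be sketched in a few lines of Python. Everything below is illustrative: the tool names, return values, and PMID placeholder are hypothetical stand-ins, not the paper's actual implementation.

    ```python
    # Hypothetical sketch of a multitool "tumor board" agent loop.
    # Tool names and return shapes are illustrative, not the paper's API.
    from dataclasses import dataclass, field

    @dataclass
    class Evidence:
        claim: str
        citation: str            # e.g. a PubMed ID or OncoKB entry

    @dataclass
    class CaseContext:
        imaging: dict
        histology: dict
        findings: list = field(default_factory=list)
        evidence: list = field(default_factory=list)

    def run_tumor_board(case: CaseContext, tools: dict) -> dict:
        """Route a case through imaging, pathology, and literature tools,
        then ask the LLM to synthesize a cited treatment plan."""
        case.findings.append(tools["segment_lesions"](case.imaging))    # e.g. MedSAM-style segmentation
        case.findings.append(tools["call_markers"](case.histology))     # e.g. KRAS/BRAF/MSI status
        for finding in case.findings:
            case.evidence.extend(tools["search_literature"](finding))   # e.g. PubMed + OncoKB lookup
        return tools["synthesize_plan"](case.findings, case.evidence)   # LLM drafts plan, citing evidence

    # Stub tools so the sketch runs end-to-end; real ones would wrap MedSAM,
    # a histology classifier, PubMed/OncoKB clients, and an LLM call.
    stub_tools = {
        "segment_lesions":   lambda img: {"lesion_volume_ml": 12.3},
        "call_markers":      lambda hist: {"KRAS": "G12D", "MSI": "stable"},
        "search_literature": lambda finding: [Evidence(str(finding), "PMID:0000000")],
        "synthesize_plan":   lambda f, e: {"plan": "stub", "citations": [ev.citation for ev in e]},
    }
    print(run_tumor_board(CaseContext(imaging={}, histology={}), stub_tools))
    ```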

  • View profile for Sumeet Pandey, PhD

    Translational Immunology & Multi-omics

    3,491 followers

    Histopathological images to develop “#TissueClocks”: deep learning-based predictors of biological age using histopathological images from 40 tissue types (GTEx dataset).

    #KeyFindings
    > Accuracy: Mean age prediction error was 4.9 years, correlating with telomere attrition, subclinical pathologies, and comorbidities.
    > Ageing patterns: Tissue-specific and non-uniform rates of ageing, with some organs ageing earlier and others showing bimodal changes.
    > Innovative strategy: Combined histology, gene expression, and clinical data to predict age gaps, validated in external healthy and diseased cohorts.

    #Significance
    > Focus: Shifts from molecular/cellular changes to tissue structure.
    > Insights: Provides a multi-layered understanding of ageing and potential interventions for age-related diseases.
    > Applications: Enables biological age prediction, risk assessment, and monitoring of anti-ageing therapies.

    #Limitations: Invasive sampling, sex imbalance in GTEx, reliance on postmortem samples, and lack of longitudinal data.

    Got value from this? Share 🔄

    #TranslationalResearch #MultiOmics
    https://lnkd.in/eW277U7s
    https://lnkd.in/eKqrUc6r
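    For readers new to the "age gap" idea used above: the gap is simply predicted biological age minus chronological age, and the 4.9-year figure is a mean error over such predictions. A minimal illustration with made-up numbers (not the paper's data):

    ```python
    import numpy as np

    # Illustrative values only; not taken from the paper.
    chronological_age = np.array([34.0, 52.0, 61.0, 70.0])
    predicted_age     = np.array([38.5, 49.0, 66.0, 72.5])   # tissue-clock output

    age_gap = predicted_age - chronological_age               # positive = tissue "older" than expected
    mae = np.mean(np.abs(age_gap))                            # the paper reports ~4.9 years on real data
    print(age_gap, mae)
    ```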

  • View profile for Andrew Dunn
    Andrew Dunn is an Influencer

    Senior Biopharma Correspondent at Endpoints News

    19,162 followers

    “The first thing just worked,” Boris Power, head of OpenAI’s applied research team, told me ahead of the company’s latest announcement. “That’s rarely the case in research. We were skeptical for a very long time.”

    Power was talking about GPT-4b micro, a protein-focused variant of OpenAI’s GPT-4o model built in collaboration with Retro Biosciences. The research shows how LLMs could be applied to life sciences research: in this case, by making variants of the famed Yamanaka factors that are more efficient at turning mature cells back into stem cells.

    Retro CEO Joe Betts-LaCroix expects to use some of these AI-made proteins in a preclinical research program seeking to reprogram patients’ cells. “Because reprogramming is in the loop, the timing and efficiency of it matter for the patient in terms of how many starting cells do you need, how long does the patient have to wait around,” Betts-LaCroix said. “It can work with canonical Yamanaka factors, but as we’re optimizing, we’re like, ‘Why?’ They’re worse.”

    My latest exclusive on the AI bio frontier at Endpoints News: https://lnkd.in/gU8_8ppf

  • View profile for Heather Couture, PhD

    Making vision AI work in the real world • Consultant, Applied Scientist, Writer & Host of Impact AI Podcast

    15,803 followers

    𝐅𝐢𝐫𝐬𝐭 𝐅𝐨𝐮𝐧𝐝𝐚𝐭𝐢𝐨𝐧 𝐌𝐨𝐝𝐞𝐥 𝐟𝐨𝐫 𝐒𝐩𝐚𝐭𝐢𝐚𝐥 𝐏𝐫𝐨𝐭𝐞𝐨𝐦𝐢𝐜𝐬

    Understanding where proteins are located within tissues is crucial for cancer diagnosis, drug development, and precision medicine. But analyzing these complex spatial patterns has remained largely manual and inconsistent across laboratories. Muhammad Shaban et al. developed KRONOS, a foundation model specifically designed for analyzing spatial proteomics data - imaging that maps protein expression at single-cell resolution within tissues.

    𝗧𝗵𝗲 𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲: Current spatial proteomics analysis typically relies on cell segmentation followed by rule-based classification. While effective for well-defined cell types, this approach struggles with complex tissue regions and treats each protein marker independently, potentially missing important spatial relationships.

    𝗧𝗵𝗲 𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵: KRONOS was trained using self-supervised learning on 47 million image patches from 175 protein markers across 16 tissue types and 8 imaging platforms. The model uses a Vision Transformer architecture adapted for the variable number of protein channels in multiplex imaging.

    𝗞𝗲𝘆 𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗙𝗶𝗻𝗱𝗶𝗻𝗴𝘀: The research identified several important architectural choices:
    • 𝗠𝗮𝗿𝗸𝗲𝗿 𝗲𝗻𝗰𝗼𝗱𝗶𝗻𝗴 𝗶𝗺𝗽𝗿𝗼𝘃𝗲𝗺𝗲𝗻𝘁𝘀: Adding dedicated sinusoidal encoding for different protein markers yielded a large increase in balanced accuracy on Hodgkin lymphoma data
    • 𝗧𝗼𝗸𝗲𝗻 𝘀𝗶𝘇𝗲 𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Using smaller 4×4 pixel tokens improved accuracy compared to standard 16×16 tokens, though overlapping tokens with 50% overlap achieved similar performance
    • 𝗘𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴 𝘀𝘁𝗿𝗮𝘁𝗲𝗴𝘆: Replacing image-level (CLS token) embeddings with marker-specific embeddings led to substantial performance gains

    𝗔𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀 𝗗𝗲𝗺𝗼𝗻𝘀𝘁𝗿𝗮𝘁𝗲𝗱:
    - Cell phenotyping without requiring cell segmentation
    - Cross-dataset generalization across different imaging platforms
    - Few-shot learning with limited labeled examples
    - Patient stratification for treatment response prediction
    - Tissue region classification and artifact detection

    This work represents a step toward more automated and scalable analysis of spatial proteomics data, which could be valuable for biomarker discovery and understanding tissue architecture in disease.

    paper: https://lnkd.in/eDebvrXy
    blog: https://lnkd.in/ePAxM7zZ
    code: https://lnkd.in/e7f95fZK
    model: https://lnkd.in/eFnJFYYy

    #SpatialProteomics #ComputationalBiology #MachineLearning #Biomedical #Research
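    As a rough illustration of the "dedicated sinusoidal encoding for different protein markers" mentioned above, here is a transformer-style sinusoidal embedding indexed by marker ID and added to that marker's patch tokens. This is a generic sketch of the idea, not KRONOS's actual code (see the linked repository for that); the dimensions are arbitrary toy values.

    ```python
    import numpy as np

    def sinusoidal_marker_encoding(marker_id: int, dim: int = 384) -> np.ndarray:
        """Transformer-style sinusoidal embedding indexed by a protein-marker ID."""
        i = np.arange(dim // 2)
        freq = 1.0 / (10000 ** (2 * i / dim))
        enc = np.empty(dim)
        enc[0::2] = np.sin(marker_id * freq)
        enc[1::2] = np.cos(marker_id * freq)
        return enc

    # Patch tokens from one marker channel get that marker's encoding added,
    # so the transformer can tell, say, channel 7 apart from channel 42.
    tokens = np.random.randn(196, 384)                 # (num_patches, embed_dim), toy data
    tokens = tokens + sinusoidal_marker_encoding(marker_id=7)
    ```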

  • View profile for Kermen Bolaeva

    Area Sales Rep Middle East & CIS@ New England Biolabs | Molecular Biology

    2,181 followers

    ❓ ONT, Illumina & MGI – What’s the Difference?

    🔬 Next-Generation Sequencing (NGS) allows scientists to read genetic code by sequencing millions (or billions) of DNA fragments in parallel. Let’s explore some key platforms:

    1️⃣ Illumina
    1) Sample & Library Preparation: DNA/RNA is purified, fragmented, and ligated with adapters containing cluster recognition sites (which bind to specific spots on the flow cell), index sequences (which identify the sample), and primer binding sites. The NEBNext UltraExpress® FS DNA Library Prep Kit https://lnkd.in/dMjcZphg is widely used for high-quality library preparation.
    2) Cluster Generation: The flow cell carries oligonucleotides complementary to the adapters, allowing fragments to bind. A PCR-like process (bridge amplification) forms clusters. Multiple copies of each strand ensure that the fluorescent signal during sequencing will be strong enough.
    3) Sequencing: Fluorescently labeled nucleotides (G, C, A, T) with terminators bind one at a time to all single strands in the cluster (at any given moment, only one type of nucleotide binds, emitting a specific color). A camera records fluorescence to identify nucleotides. Terminator groups are cleaved to allow the next cycle.
    4) Reverse Strand Sequencing: Index sequences are read, the reverse strand is synthesized, and sequencing is repeated.
    5) Data Analysis: Low-quality reads are filtered, and sequences are aligned.

    2️⃣ MGI
    1) Sample & Library Preparation: DNA is fragmented, ligated with adapters, and circularized into ssCirDNA. The NEBNext® FS DNA Library Prep Kit for MGI® https://lnkd.in/dQMtgbNd provides a reliable solution for generating high-complexity libraries with an optimized workflow.
    2) DNB Generation by Rolling Circle Amplification: ssCirDNA acts as a template for continuous amplification, forming dense DNA Nanoballs (DNBs) with multiple copies of the sequence.
    3) Loading DNBs: DNBs bind to specific spots on the flow cell.
    4) Sequencing: Fluorescently labeled nucleotides with terminators bind one at a time to all sequences in the DNBs simultaneously. A camera records fluorescence to identify each nucleotide. Terminators are cleaved to allow the next cycle.

    3️⃣ Oxford Nanopore Technologies
    1) Sample & Library Preparation: DNA/RNA is extracted, purified, and ligated with motor protein adapters.
    2) Loading the Flow Cell: The library is added to a flow cell containing thousands of nanopores.
    3) Sequencing: The motor protein unzips the DNA, guiding it through the nanopore one base at a time. Each nucleotide disrupts the ionic current in a unique way, producing a signal used to determine the sequence.
    4) Base Calling & Data Analysis: Signals are converted into nucleotide sequences, followed by read alignment and error correction.

    #NGS #Sequencing #Genomics #Bioinformatics #Illumina #Nanopore #MGI #Biotech
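    To make the "low-quality reads are filtered" step concrete, here is a minimal, platform-agnostic sketch of mean-quality filtering on a FASTQ file. This is an assumption-laden toy (a hand-rolled parser and a simple mean-Phred cutoff); production pipelines typically use dedicated tools such as fastp or Trimmomatic instead.

    ```python
    # Minimal sketch of filtering reads by mean base quality from a FASTQ file.

    def mean_phred(quality_line: str, offset: int = 33) -> float:
        """Average Phred score of one read from its ASCII quality string."""
        return sum(ord(c) - offset for c in quality_line) / len(quality_line)

    def filter_fastq(path: str, min_mean_q: float = 20.0):
        """Yield (header, sequence, quality) for reads passing the quality cutoff."""
        with open(path) as fh:
            while True:
                header = fh.readline().rstrip()
                if not header:
                    break
                seq = fh.readline().rstrip()
                fh.readline()                    # '+' separator line
                qual = fh.readline().rstrip()
                if mean_phred(qual) >= min_mean_q:
                    yield header, seq, qual

    # Usage (hypothetical file name): kept = list(filter_fastq("reads.fastq", 20.0))
    ```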

  • View profile for Sreedath Panat

    MIT PhD | IITM | 100K+ LinkedIn | Co-founder Vizuara & Videsh | Making AI accessible for all

    113,214 followers

    A simple ML model for forecasting the pandemic

    This article is about Scientific ML, an emerging field that combines mechanistic models' interpretability with ML models' predictive power.

    Take the example of COVID. How would we model the spread of COVID in a population? The SIR model is one of the simplest yet foundational models in epidemiology. It categorizes a population into three compartments: Susceptible (S), Infected (I), and Recovered (R). Initially, the entire population is susceptible except for a few infected individuals. As the virus spreads, susceptible individuals become infected, and over time, infected individuals recover and gain immunity. This transition is modeled using a system of three ordinary differential equations, each representing the rate of change of the S, I, and R populations. The model uses parameters like τ_SI (the rate at which susceptible people become infected) and τ_IR (the recovery rate).

    One of the biggest problems with the SIR model is that the interaction parameters τ_SI and τ_IR are difficult to estimate in real life. So how about using a neural network to overcome these limitations? We can use neural networks to create a mapping from initial conditions such as S(t=0), I(t=0), and R(t=0) to the number of infections at the end of a month. For this, we will need real-world data from different regions of the world. However, traditional machine learning models are black boxes and don’t leverage the known structure of these scientific equations.

    In the SIR model, we have some information. Although the interaction parameters may be unknown, we can still deduce the presence of the S, I, and R terms in the three equations using logic. So why should we throw away this knowledge and just use a black box? Scientific ML bridges this gap by integrating known physical or epidemiological relationships (like those in the SIR model) into the learning process, resulting in models that are both interpretable and more accurate, even with partially unknown parameters.

    Universal Differential Equations (UDEs) in SciML preserve the structure of known scientific laws - such as the system of ordinary differential equations (ODEs) in the SIR model - while replacing unknown or hard-to-estimate terms (like the interaction parameters τ_SI and τ_IR) with trainable neural networks.

    I have made a lecture video on UDEs (incorporating the SIR model) to forecast the spread of a pandemic and hosted it on Vizuara's YouTube channel. Do check it out. I hope you enjoy watching this lecture as much as I enjoyed making it: https://lnkd.in/gMdquisq

    If you wish to conduct research in SciML, check this out: https://lnkd.in/dSxFNzPb
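    To make the SIR equations concrete, here is a minimal integration with SciPy using arbitrary example parameters (not fitted to any real outbreak). In the UDE setting the post describes, the fixed τ_SI and τ_IR constants below are exactly the terms that would be replaced by small trainable neural networks.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    # Classic SIR model: dS/dt = -tau_SI*S*I/N, dI/dt = tau_SI*S*I/N - tau_IR*I, dR/dt = tau_IR*I
    tau_SI, tau_IR, N = 0.30, 0.10, 1_000_000     # example values only

    def sir(t, y):
        S, I, R = y
        dS = -tau_SI * S * I / N
        dI =  tau_SI * S * I / N - tau_IR * I
        dR =  tau_IR * I
        return [dS, dI, dR]

    y0 = [N - 10, 10, 0]                          # almost everyone susceptible, 10 infected
    sol = solve_ivp(sir, t_span=(0, 180), y0=y0, t_eval=np.linspace(0, 180, 181))
    S, I, R = sol.y                               # trajectories over 180 days
    print(f"Peak infections: {I.max():.0f} on day {I.argmax()}")
    ```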

  • View profile for MD MAHIDUL ISLAM

    Laboratory Manager & Scientist | Genetic counseling & Testing Expert | Scientific, QA & Clinical Affairs Consultant | Lead auditor BAB, CAP, & ISO-15189 | Science Leadership Strategist | Motivator | Mentor

    10,404 followers

    DNA sequencing is the process of determining the exact order of nucleotides (A, T, C, and G) in a DNA molecule. It is a crucial technique in genetics and molecular biology, allowing scientists to study genes, diagnose genetic disorders, and even track diseases like cancer and infectious pathogens.

    Types of DNA Sequencing:
    1. Sanger Sequencing (First-Generation): Developed by Frederick Sanger in the 1970s. Uses chain termination with fluorescent or radioactive labeling. Highly accurate but slow and expensive.
    2. Next-Generation Sequencing (NGS): Includes platforms like Illumina and Roche 454. Allows for sequencing millions of DNA fragments in parallel. Faster and cheaper than Sanger sequencing.
    3. Third-Generation Sequencing: Examples include PacBio and Oxford Nanopore. Can sequence long DNA fragments in real time. Useful for complex genome studies and detecting epigenetic modifications.

    Applications of DNA Sequencing:
    • Medical Diagnostics: Identifies genetic mutations linked to diseases.
    • Personalized Medicine: Helps tailor treatments based on a person’s genetic makeup.
    • Forensic Science: Used in crime investigations and human identification.
    • Evolutionary Biology: Helps trace ancestry and evolutionary relationships.
    • Agriculture & Biotechnology: Improves crops and livestock breeding.

    Would you like details on a specific aspect of DNA sequencing?

  • View profile for Jan Beger

    Healthcare needs AI ... because it needs the human touch.

    85,650 followers

    This paper provides an accessible guide for cancer researchers on how AI can be integrated into their research. It discusses AI’s role in image analysis, natural language processing, and drug discovery, focusing on practical applications rather than technical details.

    1️⃣ AI tools are now widely accessible to cancer researchers, offering productivity boosts in everyday workflows and enabling new discoveries from existing data.
    2️⃣ Researchers without programming skills can use off-the-shelf AI software, while those with computational expertise can develop custom pipelines.
    3️⃣ Deep learning techniques like convolutional neural networks (CNNs) and transformers dominate AI applications in medical imaging and language processing.
    4️⃣ AI helps in cancer research by automating tasks such as cell detection in microscopy, identifying genetic mutations, and discovering new drugs.
    5️⃣ The future of AI in cancer research includes "foundation models" capable of being fine-tuned for various tasks across multiple data types.
    6️⃣ AI models, particularly transformers, are now state-of-the-art in both image and language processing tasks, often outperforming traditional CNNs.
    7️⃣ Self-supervised learning and reinforcement learning are emerging methods that allow AI systems to learn from unlabelled data and optimize clinical trials or cancer screening protocols.
    8️⃣ AI-driven image analysis can predict cancer-related biomarkers, genetic alterations, and patient outcomes directly from routine pathology slides with high accuracy.
    9️⃣ Natural language processing (NLP) tools, especially LLMs, are increasingly used to summarize medical notes, extract clinical insights, and improve research communication.
    🔟 AI's integration of multimodal data (e.g., genomic, radiological, and clinical) promises more accurate cancer diagnosis and treatment recommendations.

    ✍🏻 Raquel Perez-Lopez, Narmin Ghaffari Laleh, Faisal Mahmood, Jakob Nikolas Kather. A guide to artificial intelligence for cancer researchers. Nature Reviews Cancer. 2024. DOI: 10.1038/s41568-024-00694-7

  • View profile for Jorge Bravo Abad

    AI/ML for Science & DeepTech | PI of the AI for Materials Lab | Prof. of Physics at UAM

    23,453 followers

    Bayesian network modeling for analyzing protein dynamics

    Proteins are constantly moving, and these structural shifts help determine their roles in biology. Capturing the shifting conformations is critical for applications like drug development, yet the sheer amount of data produced from molecular simulations can be overwhelming. New strategies are needed to identify which interactions matter most and how they shape a protein’s overall behavior.

    Mukhaleva et al. introduce BaNDyT, a specialized software that employs Bayesian network modeling, an interpretable machine learning method designed to uncover probabilistic relationships in high-dimensional data. In this framework, each residue or residue pair is modeled as a node, and edges represent direct dependencies rather than mere correlations. The approach involves converting continuous simulation output into data bins, systematically searching for the best-fitting network structure, and then measuring each node’s weighted degree to highlight particularly influential contacts or regions. By filtering out redundant connections, the software effectively pinpoints functionally significant interactions buried in large-scale simulation datasets.

    Using this method on G protein-coupled receptor systems, the authors discovered both local and long-range interactions that drive protein dynamics. The researchers showed how BaNDyT can identify critical residues and communication pathways, even in distant parts of the structure, offering fresh insights into protein allostery. This interpretable machine learning approach lays a foundation for more nuanced studies of molecular interactions, broadening possibilities for research and therapeutic innovation.

    Paper: https://lnkd.in/dw6ypcaK

    #MachineLearning #BayesianNetworks #DataScience #ProteinDynamics #StructuralBiology #ComputationalBiology #Bioinformatics #DrugDiscovery #ComputationalChemistry #Proteomics #Pharmacology #ProteinFunction #MolecularModeling #AIforScience #Biotech
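    A toy version of the general workflow (discretize simulation output into bins, learn a dependency structure, rank residues by weighted degree) is sketched below on synthetic data. Pairwise mutual information stands in for full Bayesian network structure learning here, so this is a simplified proxy for the idea, not BaNDyT's algorithm.

    ```python
    import numpy as np
    from itertools import combinations
    from sklearn.metrics import mutual_info_score

    rng = np.random.default_rng(0)
    traj = rng.normal(size=(5000, 8))              # (frames, per-residue features), synthetic data

    # 1) Discretize each feature into quartile bins (the "data bins" step).
    binned = np.column_stack([
        np.digitize(col, np.quantile(col, [0.25, 0.5, 0.75])) for col in traj.T
    ])

    # 2) Score pairwise dependencies (a crude proxy for learned network edges).
    n = binned.shape[1]
    weights = np.zeros((n, n))
    for i, j in combinations(range(n), 2):
        weights[i, j] = weights[j, i] = mutual_info_score(binned[:, i], binned[:, j])

    # 3) Weighted degree highlights the most "connected" residues.
    weighted_degree = weights.sum(axis=1)
    print(np.argsort(weighted_degree)[::-1])       # residues ranked by influence
    ```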

  • View profile for Luke Yun

    building AI computer fixer | AI Researcher @ Harvard Medical School, Oxford

    32,843 followers

    Unification of the analysis of bottom-up proteomics data across all major mass spectrometry acquisition methods.

    Proteomic data analysis has long been fragmented across different software for data-dependent acquisition (DDA), data-independent acquisition (DIA), and parallel reaction monitoring (PRM).

    𝗖𝗛𝗜𝗠𝗘𝗥𝗬𝗦 𝗶𝘀 𝗮 𝘀𝗽𝗲𝗰𝘁𝗿𝘂𝗺-𝗰𝗲𝗻𝘁𝗿𝗶𝗰 𝘀𝗲𝗮𝗿𝗰𝗵 𝗮𝗹𝗴𝗼𝗿𝗶𝘁𝗵𝗺 𝘁𝗵𝗮𝘁 𝗱𝗲𝗰𝗼𝗻𝘃𝗼𝗹𝘂𝘁𝗲𝘀 𝗰𝗵𝗶𝗺𝗲𝗿𝗶𝗰 𝘀𝗽𝗲𝗰𝘁𝗿𝗮 𝗮𝗻𝗱 𝘂𝗻𝗶𝗳𝗶𝗲𝘀 𝗽𝗲𝗽𝘁𝗶𝗱𝗲 𝗶𝗱𝗲𝗻𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗮𝗻𝗱 𝗾𝘂𝗮𝗻𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗮𝗰𝗿𝗼𝘀𝘀 𝗗𝗗𝗔, 𝗗𝗜𝗔, 𝗮𝗻𝗱 𝗣𝗥𝗠.

    1. Identified over 238,000 peptide-spectrum matches (PSMs) in a 2-hour HeLa DDA dataset, exceeding the identification rate of eight leading search engines while completing analysis faster than data acquisition time.
    2. Increased peptide group identifications in complex biological samples by up to 98% (acetylation-enriched samples) compared to traditional tools like Sequest HT and MSFragger.
    3. Demonstrated robust quantification, achieving a Pearson correlation of 0.99 with manually curated Skyline data across five orders of magnitude of protein abundance in PRM assays.
    4. Unified DDA and DIA analysis under one framework, revealing that DIA quantified up to 98.7% of peptide groups across replicates, while DDA quantified 61.7% under the same conditions.

    A couple of thoughts:
    • The use of entrapment experiments across isolation window widths was cool. They confirm that CHIMERYS’ q-values closely match the empirical FDR, which ensures trustworthy identifications even in highly chimeric spectra.
    • To broaden applicability and accommodate non-Thermo instruments, the deep-learning fragmentation models could be expanded to cover rare post-translational modifications and adopt open formats (mzML).
    • Could a lightweight, neural-based pre-scoring step filter unlikely peptide candidates before regression? I'm thinking benefits include shrinking the problem size and improving scalability for proteome-wide libraries.

    Here's the awesome work: https://lnkd.in/g94Earve

    Congrats to Martin Frejno, Michelle Tamara Berger, Johanna Tueshaus, Daniel P. Zolg, Mathias Wilhelm, and co!

    I post my takes on the latest developments in health AI – 𝗰𝗼𝗻𝗻𝗲𝗰𝘁 𝘄𝗶𝘁𝗵 𝗺𝗲 𝘁𝗼 𝘀𝘁𝗮𝘆 𝘂𝗽𝗱𝗮𝘁𝗲𝗱! Also, check out my health AI blog here: https://lnkd.in/g3nrQFxW
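    For context on the q-value/FDR point in the first bullet above: in a target-decoy search, a PSM's q-value is the minimum false discovery rate at which that PSM would still be accepted. A generic sketch of that computation (standard target-decoy bookkeeping, not CHIMERYS's scoring) looks like this:

    ```python
    import numpy as np

    def target_decoy_qvalues(scores: np.ndarray, is_decoy: np.ndarray) -> np.ndarray:
        """Generic target-decoy q-values: FDR ~ decoys/targets above each score
        threshold, then a cumulative minimum so q-values are monotone."""
        order = np.argsort(scores)[::-1]                  # best-scoring PSMs first
        decoy = is_decoy[order].astype(float)
        n_decoy = np.cumsum(decoy)
        n_target = np.cumsum(1.0 - decoy)
        fdr = n_decoy / np.maximum(n_target, 1.0)
        qvals = np.minimum.accumulate(fdr[::-1])[::-1]    # min FDR at this rank or any more lenient cutoff
        return qvals[np.argsort(order)]                   # back to the input order

    # Example: 6 PSMs, two of them decoys (toy numbers).
    scores = np.array([9.1, 8.7, 7.9, 6.5, 5.2, 4.8])
    decoys = np.array([0, 0, 1, 0, 0, 1], dtype=bool)
    print(target_decoy_qvalues(scores, decoys))
    ```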
