T-bioinfo overview
Typical Mass-use Pipelines
NGS (Next Generation Sequencing)
1. Total-RNA Analysis (RNA-seq, Non-Coding RNA, Repeats)
2. Epigenetics (ChIP-seq and Bisulfite-Seq)
3. Variant Calling
4. Microbiome (Metagenomics)
Mass Spec
1. Proteomics
2. Metabolomics
Structural Biology
1. Libraries of Small Molecules (Query, Clustering)
2. Docking (Including large molecules)
Machine Learning
1. Phenotypic Analysis and Modeling
2. Analysis of visual data
3. Standard Statistical methods
4. Integration of heterogeneous data sets
Complex Challenges and Workflows
CirSeq Mutation Analysis
1. Analysis of viral CirSeq data for precise mutation identification
2. Fitness of mutations reflecting viral adaptation
3. Identification of viral quasi-species
Mass Spec
1. Protein-protein interactions between host and viral proteins
2. Post-translational modifications of host proteins
Structural Biology
1. Libraries of Small Molecules (Query, Clustering)
2. Docking (Including large molecules)
NGS host data
1. Host gene expression variations in response to infectious quasi-species
T-BioInfo is a user-friendly computational platform that enables analysis and integration of big data. Mining -omics data for meaningful
patterns that can be applied in biomedical and agricultural research is a growing challenge as sequencing becomes cheaper and more precise. At the same time,
the complex networks of dependencies that define many conditions tend to require integration of huge heterogeneous data sets: SNPs, gene expression, epigenetic
markers, proteomic and metabolomic profiles, and even structural biology data. Our company has developed innovative, user-friendly workflows for analysis
and integration of these different datasets. We are now looking to test and commercialize web access to the platform.
Simple, Flexible and Consistent Interface Across All Sections
Integration of analysis types
One environment for all types of data and analysis
“One-button” approach to most areas of analysis
• Flexible analysis pipelines in the platform sections and easy-to-perform data input
• A user is assisted by the platform in constructing meaningful algorithmic pipelines for processing data: modules for pipeline continuation are highlighted by a black background and a yellow title.
Analysis of Total RNA
Concept: raw total transcriptome reads contain information not only about expressed splice variants (isoforms) of genes, but also about expressed transposons and regulatory non-coding RNAs. The complete analysis consists of three steps. First, the reads are mapped to isoforms to obtain isoform expression levels. Second, the previously unmapped reads are mapped to known repetitive elements (RE) and non-coding RNAs to obtain their expression levels. Third, the remaining reads are processed by a special clustering method (BiClustering) to find novel expressed RE and non-coding RNAs, together with their expression levels under the applied biological conditions. At the next stage, data integration can be performed: the interplay between expressed isoforms, transposons, and regulatory RNAs.
1. Detection of expressed isoforms and their expression levels by mapping the reads on constructed transcripts (√)
2. For unmapped reads: (√)
3. Detection of most expressed repeats and regulatory RNA from databases (√)
4. BiClustering: associations of k-mers and reads as a bicluster, and generation of Kchains of biclusters (√)
5. Extensions of Kchains (±)
6. Mapping of NGS reads on found Kchains: detection of most expressed novel transposons and regulatory RNAs (√)
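The tiered assignment of reads (isoforms first, then known repeats and non-coding RNAs, then leftovers for clustering) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the platform's implementation: exact k-mer sharing stands in for a real read mapper, the names `build_index`, `assign` and `tiered_counts` and the `min_shared` threshold are hypothetical, and the reads left over at the end are simply returned where a real pipeline would pass them on to BiClustering.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-mers contained in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def assign(read, index, k, min_shared=2):
    """Return the reference sharing the most k-mers with the read, or None.

    min_shared is an illustrative threshold, not a value from the source.
    """
    hits = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            hits[name] += 1
    if not hits:
        return None
    name, shared = hits.most_common(1)[0]
    return name if shared >= min_shared else None


def tiered_counts(reads, isoforms, repeats, k=4):
    """Count reads per isoform, then per known repeat, and collect the rest."""
    iso_index = build_index(isoforms, k)
    rep_index = build_index(repeats, k)
    iso_counts, rep_counts, leftover = Counter(), Counter(), []
    for read in reads:
        hit = assign(read, iso_index, k)       # step 1: isoforms
        if hit is not None:
            iso_counts[hit] += 1
            continue
        hit = assign(read, rep_index, k)       # step 2: known repeats / ncRNAs
        if hit is not None:
            rep_counts[hit] += 1
        else:
            leftover.append(read)              # step 3: input for clustering
    return iso_counts, rep_counts, leftover
```

Calling `tiered_counts(reads, isoforms, repeats)` with dictionaries of reference sequences returns two count tables and the list of unassigned reads. Raw counts would still need normalization before they can be compared as expression levels.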
T-Bioinfo RNA-seq/chip section
Example: Expression of Repeats
Algorithmic Approaches:
Analysis of “Junk” RNA
Epigenetic Analysis: Bisulfite DNA Methylation and ChIP-Seq
Bisulfite Concept: bisulfite treatment converts unmethylated cytosines to uracil, so a read shows T instead of C where the C of a genomic site (such as CpG) is unmethylated, while a methylated C is still read as C. Thus, detection of methylated sites and of genome fragments enriched in or depleted of methylation is based on a special type of read mapping and on segmentation of the whole-genome methylation profile. The analysis objectives include special mapping algorithms that tolerate the T-to-C mismatch, statistical estimation of the per-site methylation level, allele specificity of DNA methylation, and detection of over-methylated and under-methylated genomic regions.
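Two of the objectives above, mismatch-tolerant mapping and a confidence interval for the per-site methylation level, can be illustrated with a short sketch. The assumptions are mine: the source does not say which interval is used, so the Wilson score interval is shown as one common choice; only forward-strand reads are handled (reverse-strand reads would need the complementary A-to-G rule); and the function names are hypothetical.

```python
import math


def bisulfite_mismatches(read, ref):
    """Count mismatches between a read and an equally long reference window.

    A T in the read opposite a C in the reference is not penalized, because
    bisulfite conversion can produce it. The reverse case (C in the read,
    T in the reference) is still a real mismatch.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    return sum(
        1
        for r, g in zip(read, ref)
        if r != g and not (r == "T" and g == "C")
    )


def wilson_interval(methylated, total, z=1.96):
    """Wilson score interval for the methylated fraction at one site.

    methylated: number of reads called methylated at the site
    total:      number of reads covering the site
    z:          normal quantile (1.96 for roughly 95% confidence)
    """
    if total == 0:
        return (0.0, 1.0)  # no coverage, no information
    p = methylated / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (centre - half, centre + half)
```

The interval narrows as coverage grows, so a site with 8 of 10 methylated reads is scored far less confidently than one with 80 of 100, even though the point estimate is the same.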
ChIP-Seq Concept: epigenetic signals such as histone modifications of different types and DNA methylation events, as well as protein/DNA binding sites (TF binding sites), are detected by ChIP-seq and ChIP-chip experiments. Profiles of these whole-genome signals are analyzed by genome segmentation algorithms. The analysis objectives include identifying signal-enriched genome fragments as putative epigenetic events, and identifying a combination of enriched fragments on the positive and negative strands, with a certain distance between them, as a TF binding event. At the next analysis stage, data integration can be performed: the interplay between genome mutations and epigenetic signals on one side and expressed isoforms, transposons, and regulatory RNAs on the other. The network of gene regulation by a transcription factor can be reconstructed from the whole-genome TF binding positions and the expression of the downstream genes. Microarray datasets are transformed into pseudo NGS reads and analyzed by the same ChIP-seq pipelines.
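The idea of calling a TF binding event from a pair of enriched fragments on opposite strands can be sketched as below. This is a simplified illustration rather than the platform's algorithm: peaks are reduced to summit positions on one chromosome, the distance window (`min_gap`, `max_gap`) is an arbitrary assumption, and `pair_strand_peaks` is a hypothetical name.

```python
def pair_strand_peaks(plus_peaks, minus_peaks, min_gap=50, max_gap=300):
    """Pair each plus-strand summit with the nearest downstream minus-strand summit.

    Returns a list of (plus_summit, minus_summit, midpoint) tuples; the
    midpoint is taken as the putative binding position. Each minus-strand
    summit is used at most once.
    """
    events = []
    minus_sorted = sorted(minus_peaks)
    used = set()
    for p in sorted(plus_peaks):
        for m in minus_sorted:
            gap = m - p
            if gap < min_gap or m in used:
                continue          # too close, upstream, or already paired
            if gap > max_gap:
                break             # all remaining summits are even farther away
            events.append((p, m, (p + m) // 2))
            used.add(m)
            break
    return events
```

Plus-strand summits with no partner inside the window are dropped, which is one simple way to separate strand-paired binding signals from unpaired enrichment.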
T-Bioinfo ChIP-seq section
1. Preprocessing of raw data (√)
2. Mapping of NGS reads by bisulfite mapping algorithms: no penalty for T(read)-to-C(genome) mismatches (√)
3. Detection of the DNA methylated positions and their scores by the confidence interval method (√)
4. Allele specificity of the methylation in a position (-)
5. Detection of over-methylated and under-methylated genomic intervals by the segmentation algorithms (±)
6. Detection of differential DNA methylation (individual positions and intervals) between contrasting conditions (±)
Virology Pipeline
Mutation Fitness
Genome-wide fitness calculations enabled by CirSeq,
combined with structural information, can provide
high-definition, bias-free insights into structure-function
relationships, potentially revealing novel functions for
viral proteins and RNA structures, as well as nuanced
insights into a viral genome’s phenotypic space. Such
analyses have the power to reveal protein residues or
domains that directly correspond to viral functional
plasticity and may significantly inform our structural
and mechanistic understanding of host–pathogen
interactions.
Integration of Heterogeneous Data Sets
Concept: mutual association of the features of biological datasets is the most substantial part of integrating several analyses of a biological project into one story. We suggest several techniques for such associations.
Matching of metabolite and SNP profiles according to LB’s selection of SNPs
Patent Pending Technology for Drug Discovery
Fast screening and clustering of small molecules based on physico-chemical similarity (70-100 times faster than industry standard)
Small Molecule Candidate
Identifying a biologically active molecule (Polio)
Patent Pending: Ref. P-78368-US | App. No. 14/625,785 entitled
SYSTEMS AND METHODS OF IMPROVED MOLECULE SCREENING
Computational analysis of small molecules can be roughly divided into three sections: pre-processing analysis, virtual screening methods, and clustering. The aim of the conformer generation process is to build a set of representative conformers that covers the conformational space of a given molecule. There are two main classes of virtual screening methods: similarity-based methods (descriptor-based screening; geometric querying; shape-based querying; fingerprints) and receptor-based methods (docking). One of the greatest challenges for docking software is accounting for protein flexibility: these macromolecules are not static objects, and conformational changes are often key elements in ligand binding. T-Bioinfo provides a number of proprietary methods that can be combined into pipelines for drug discovery.
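As a generic illustration of similarity-based screening and clustering with fingerprints, the sketch below uses Tanimoto similarity on sets of "on" fingerprint bits together with a greedy leader clustering. It is a textbook-style example under my own assumptions, not the proprietary or patent-pending method mentioned above: the function names, the 0.5 threshold and the set-of-bits fingerprint representation are all illustrative.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def screen(query, library, threshold=0.5):
    """Return (name, similarity) pairs at or above threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


def leader_cluster(library, threshold=0.5):
    """Greedy leader clustering.

    Each molecule joins the first cluster whose leader is similar enough;
    otherwise it starts a new cluster and becomes its leader.
    """
    clusters = []  # list of (leader_fingerprint, [member names])
    for name, fp in library.items():
        for leader_fp, members in clusters:
            if tanimoto(leader_fp, fp) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((fp, [name]))
    return [members for _, members in clusters]
```

Here `library` is a dictionary mapping molecule names to sets of bit indices. Leader clustering depends on input order and is chosen only because it is short; it makes a single pass over the library, which is why variants of it are common for large collections.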
Tauber Bioinformatics Research Center
The Tauber Bioinformatics Research Center (TBRC) at the University of Haifa has a proven track record in bioinformatics, with scientific collaborations with hospitals and top US universities, involvement in government-funded projects, and multiple publications in leading journals such as Science and Nature.
Pine Biotech holds an exclusive license for commercialization of tools developed at the TBRC for research, industry applications and education. The startup is located at the BioInnovation Center in New Orleans, LA. In collaboration with TBRC staff, Pine Biotech is completing several pilot projects to validate our approach.
Early Adopters and Collaborators: Aleph Therapeutics
