A pathway and SVM based tool for tumor classification
Academic Year 2016/2017
Candidate: Luca Vitale
Student ID: 0522500362
Supervisors:
Prof. Roberto Tagliaferri
Dr. Angela Serra
Goals:
1. Classify with pathways
2. Identify relations among pathways
3. Build a graph of interactions between pathways
The Data
• Lung Squamous Cell Carcinoma (LSCC)
• 106 patients
• 11837 genes
• 23074 methylation values
• 352 miRNAs
• Survival information
Pipeline
Similarity Network Fusion - SNF
• SNF is an intermediate multi-view clustering methodology for patient subtyping.
[Figure: per-view patient similarity networks (miRNAs, methylation), fusion iterations, fused patient similarity network]
SNF - Grid search
• The algorithm was run multiple times with the following parameters:
• Number of iterations: 200
• K (number of nearest neighbors): 10 to 30 in steps of 1
• 𝛼 (variance for the local model): 0.3 to 0.8 in steps of 0.1
• For each combination of K and 𝛼, the number of clusters was estimated with two heuristics: eigen-gaps K12 and eigen-gaps K2.
• Each clustering was evaluated by survival analysis using the log-rank test.
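The grid described above can be enumerated as in the following minimal sketch. The `evaluate` callable is a hypothetical placeholder for the real step (run SNF, choose the number of clusters, compute the log-rank p-value); it is not part of any library, and the toy function below exists only so that the example runs.

```python
from itertools import product

# Parameter grid taken from the slide.
K_VALUES = list(range(10, 31))                               # 10..30, step 1
ALPHA_VALUES = [round(0.3 + 0.1 * i, 1) for i in range(6)]   # 0.3..0.8, step 0.1
N_ITERATIONS = 200


def grid_search(evaluate):
    """Return the (p_value, K, alpha) triple with the smallest p-value.

    `evaluate(K, alpha, n_iter)` is a hypothetical callable that would run
    SNF, cluster the fused network and return the log-rank p-value.
    """
    results = [
        (evaluate(k, alpha, N_ITERATIONS), k, alpha)
        for k, alpha in product(K_VALUES, ALPHA_VALUES)
    ]
    return min(results)


def toy_evaluate(k, alpha, n_iter):
    # Toy stand-in, NOT a survival analysis: smallest at K=15, alpha=0.5.
    return (k - 15) ** 2 + (alpha - 0.5) ** 2


n_combinations = len(K_VALUES) * len(ALPHA_VALUES)
best = grid_search(toy_evaluate)
print(n_combinations, best)
```

With 21 values of K and 6 values of 𝛼, the grid contains 126 combinations, each of which requires a full SNF run.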
SNF - The results
• P-value = 0.0015
• K = 23
• 𝛼 = 0.6
• Number of iterations = 200
Feature selection
• Discriminant Fuzzy Pattern (DFP): identify discriminant genes
• Enrichment Analysis: identify which pathways are significantly represented by the genes selected by the DFP algorithm
Discriminant Fuzzy Pattern - Grid search
1. Skip factor: 0, 1, 2, 3
• The skipFactor value controls how outliers are skipped. Higher values imply that fewer genes are considered outliers; a skipFactor equal to 0 does not skip any.
2. Zeta: 0.35, 0.4, 0.45, 0.5
• The zeta parameter sets the threshold value that controls the activation of a linguistic label.
3. piVal: 0.4 to 0.8 in steps of 0.05
• The piVal parameter is the percentage of values of a class used to determine the fuzzy patterns. It can take values in the interval [0,1].
4. Overlapping: 1, 2
• Determines the number of discrete labels.
Enrichment Analysis
• Enrichment analysis was performed for each group of genes selected by a combination of DFP parameters
• The p-value is calculated with the hypergeometric model
• Only the KEGG and Reactome databases were used
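As an illustration of the hypergeometric model mentioned above, the sketch below computes the usual over-representation p-value: the probability of seeing at least the observed number of pathway genes among the selected genes. The function name, its arguments and the numbers in the usage lines are assumptions made for this example; the slides do not state the gene universe or the exact tail used.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_universe: total number of genes considered
    n_pathway:  genes in the universe that belong to the pathway
    n_selected: genes selected (here, by DFP)
    n_overlap:  selected genes that belong to the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers, chosen only to keep the arithmetic checkable by hand.
p_value = hypergeom_enrichment_pvalue(
    n_universe=20, n_pathway=5, n_selected=4, n_overlap=2
)
print(round(p_value, 4))
```

A small p-value means the pathway contains more selected genes than random selection of the same size would explain.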
Evaluation of DFP results
• We selected the dataset that reached the maximum number of pathways, containing only the genes selected by DFP.
• The selected combination has 1384 genes, 67 pathways and piVal 0.6.
The pathways
• 67 pathways were selected: 28 from KEGG and 39 from Reactome
• 1384 genes were selected. The DFP parameters are:
• skip factor 2
• zeta 0.3
• piVal 0.6
• overlapping 1
Classification with SVM
• For each pathway, a linear SVM was trained on each pair of classes
• Two-level cross-validation
• 3 outer folds
• 2 inner folds
• C: 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6
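The two-level cross-validation above can be sketched as follows: the inner folds choose C, and the outer folds estimate the accuracy with that C. This is a structural sketch only. `fit_and_score` is a hypothetical callable standing in for "train a linear SVM on the training indices and return accuracy on the test indices", and the fold-splitting scheme (shuffled, unstratified) is an assumption.

```python
import random

C_GRID = [10.0 ** e for e in range(-5, 7)]  # 1e-5 ... 1e6, as on the slide


def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and return n_folds (train, test) index pairs."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(n_folds)
    ]


def nested_cv(n_samples, fit_and_score, outer=3, inner=2, seed=0):
    """Mean outer-fold accuracy, with C chosen on the inner folds."""
    rng = random.Random(seed)
    outer_scores = []
    for train, test in k_fold_indices(n_samples, outer, rng):
        # Inner splits are positions into `train`, fixed for every C.
        inner_splits = k_fold_indices(len(train), inner, rng)

        def inner_score(C):
            return sum(
                fit_and_score([train[i] for i in tr], [train[i] for i in va], C)
                for tr, va in inner_splits
            ) / inner

        best_C = max(C_GRID, key=inner_score)
        outer_scores.append(fit_and_score(train, test, best_C))
    return sum(outer_scores) / outer


def toy_fit_and_score(train_idx, test_idx, C):
    # Toy stand-in so the sketch runs; replace with a real linear SVM.
    return 1.0 if C == 1.0 else 0.5


mean_accuracy = nested_cv(n_samples=12, fit_and_score=toy_fit_and_score)
print(mean_accuracy)
```

Keeping the selection of C inside the inner loop means the outer test fold is never used for tuning, so the outer accuracy is not biased by the search over C.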
Permutation test
• The goal of the permutation test is to identify the pathways that are statistically significant for the classification
Pathway name                            Pathway ID   p-value  Accuracy  Size  Classes
Cytokine-cytokine receptor interaction  K _hsa04060  0.04     0.93      21    5vs2
Cell cycle                              K _hsa04110  0.03     0.89      12    1vs2
Cell cycle                              K _hsa04110  0.02     0.97      13    2vs3
Cell cycle                              K _hsa04110  0.05     0.90      11    5vs3
Osteoclast differentiation              K _hsa04380  0.03     1.00      10    1vs4
Antigen processing and presentation     K _hsa04612  0.03     1.00      8     1vs4
Antigen processing and presentation     K _hsa04612  0.05     0.92      7     2vs3
Antigen processing and presentation     K _hsa04612  0.03     1.00      7     5vs4
T cell receptor signaling pathway       K _hsa04640  0.04     1.00      11    1vs4
T cell receptor signaling pathway       K _hsa04640  0.01     0.81      12    5vs1
Th1 and Th2 cell differentiation        K _hsa04658  0.04     1.00      9     1vs4
Th1 and Th2 cell differentiation        K _hsa04658  0.04     0.93      9     2vs3
Th17 cell differentiation               K _hsa04659  0.04     1.00      11    1vs4
Th17 cell differentiation               K _hsa04659  0.04     0.96      12    2vs3
T cell receptor signaling pathway       K _hsa04660  0.03     1.00      8     1vs4
T cell receptor signaling pathway       K _hsa04660  0.04     1.00      8     5vs4
B cell receptor signaling pathway       K _hsa04662  0.04     0.84      7     1vs2
B cell receptor signaling pathway       K _hsa04662  0.03     0.89      8     5vs2
Leukocyte transendothelial migration    K _hsa04670  0.01     0.91      12    1vs2
Leukocyte transendothelial migration    K _hsa04670  0.01     0.90      13    1vs3
Leukocyte transendothelial migration    K _hsa04670  0.04     0.92      14    5vs3
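A label-permutation test of the kind used to obtain such p-values can be sketched as below. The slides do not give the number of permutations or the exact statistic, so `n_permutations`, the `score_fn` callable and the toy threshold classifier are all assumptions of this example; in the real pipeline `score_fn` would retrain and cross-validate the SVM on the shuffled labels.

```python
import random


def permutation_p_value(labels, score_fn, n_permutations=1000, seed=0):
    """Share of label shufflings that score at least as well as the real labels.

    `score_fn(labels) -> float` is a hypothetical callable returning the
    classification accuracy obtained with the given labelling.
    """
    rng = random.Random(seed)
    observed = score_fn(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_fn(shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so the estimate
    # is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: one feature that separates the two classes perfectly.
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]


def toy_accuracy(labels):
    # Fixed rule "value > 0.5 means class 1", scored against `labels`.
    predictions = [1 if v > 0.5 else 0 for v in values]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


p_value = permutation_p_value(true_labels, toy_accuracy)
print(p_value)
```

A pathway would be reported as significant when this p-value falls below the chosen threshold, which is consistent with every row above having a p-value of at most 0.05.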
Second step of classification: pathway probabilities combinations
• For each pair of classes, we combine the pathways by using the SVM class probabilities as new features.
• We tried all combinations of pathways with a linear SVM
• C: 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6
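The second step can be pictured with the sketch below: each pathway contributes one column of SVM class probabilities, and every subset of pathways gives a candidate feature matrix for a second linear SVM. The helper names, the `min_size` parameter and the probability values are made up for illustration; the slides do not say whether single-pathway subsets were included.

```python
from itertools import combinations


def pathway_subsets(pathways, min_size=1):
    """Yield every combination of pathways with at least `min_size` members."""
    for size in range(min_size, len(pathways) + 1):
        yield from combinations(pathways, size)


def stack_probabilities(prob_by_pathway, subset):
    """One row per patient, one column per pathway in `subset`."""
    return [list(row) for row in zip(*(prob_by_pathway[p] for p in subset))]


# Toy class probabilities for three patients (illustrative values only).
prob_by_pathway = {
    "hsa04110": [0.9, 0.2, 0.7],
    "hsa04612": [0.8, 0.4, 0.1],
    "hsa04670": [0.6, 0.3, 0.5],
}

subsets = list(pathway_subsets(sorted(prob_by_pathway)))
features = stack_probabilities(prob_by_pathway, ("hsa04110", "hsa04670"))
print(len(subsets), features)
```

The number of subsets grows as 2^n - 1 for n pathways, so an exhaustive search is practical only because few pathways pass the permutation test for each pair of classes.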
Interaction graph
• We create an interaction graph for each combination of classes
• The vertices of the graph are the genes in the pathways
• The size of a vertex is equal to the SVM weight of its gene; if a gene is shared between pathways, we take the maximum weight.
• For the edges:
1. We calculate the correlation between the genes
2. We split the correlations into positive and negative and calculate the MST of each
3. Only the edges belonging to an MST are kept in the final graph
• We highlighted the pathways with shapes of different colours
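The edge-selection steps above can be sketched as follows. Several choices here are assumptions, because the slides do not specify them: MST is read as minimum spanning tree, the correlation is Pearson, and the edge distance is 1 - |r| so that the strongest correlations are kept. Genes with constant expression are assumed to have been removed beforehand.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kruskal_mst(nodes, weighted_edges):
    """Minimum spanning forest of (weight, u, v) edges (Kruskal, union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    kept = []
    for _, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components: keep it
            parent[root_u] = root_v
            kept.append((u, v))
    return kept


def interaction_edges(expr):
    """expr maps gene -> expression values across patients."""
    genes = sorted(expr)
    positive, negative = [], []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            # Assumed distance: 1 - |r| (strong correlation = short edge).
            if r > 0:
                positive.append((1 - abs(r), g, h))
            elif r < 0:
                negative.append((1 - abs(r), g, h))
    return kruskal_mst(genes, positive), kruskal_mst(genes, negative)


# Toy expression profiles for four hypothetical genes.
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8.5],
    "C": [4, 3, 2, 1],
    "D": [8, 6, 5, 2],
}
pos_edges, neg_edges = interaction_edges(expr)
print(pos_edges, neg_edges)
```

Building one tree for positive and one for negative correlations keeps the final graph sparse while still connecting each gene to its most strongly correlated and anti-correlated partners.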
Conclusion
• The pathways are good features for the classification problem.
• The pipeline can be tested on other datasets to assess its generalization ability.