S1: Let me introduce a work titled “Anticancer Thiazolidinones Design: Mining
of 60-Cell Lines Experimental Data”. The search for anticancer agents
containing a thiazolidinone scaffold is a promising trend in modern medicinal
chemistry. Computational chemistry, particularly QSAR and docking, is one
of the success factors in this direction. In the present research we tried to
extract all the valuable knowledge hidden in our in-house database of 60-cell-line
anticancer screen results that will be useful in further QSAR studies.
S2: The in vitro cell line screening is carried out under the Developmental
Therapeutics Program of the National Cancer Institute (USA). The screen
uses 60 different human tumor cell lines, representing leukemia,
melanoma and cancers of the lung, colon, brain, ovary, breast, prostate, and
kidney. The screening is a two-stage process, beginning with the evaluation of
all compounds against the 60 cell lines at a single dose of 10 µM. The output
of the single-dose screen is reported as a mean graph. Compounds which
exhibit significant growth inhibition are then evaluated against the 60-cell panel
at five concentration levels (with 10 µM as one of them). The results of both
stages are compared to a control test and are expressed as growth percent.
S3: The following problems had to be solved in the current computational study:
- Are the same-dose results of these two stages statistically similar enough
to be treated together in future QSAR modelling?
- Where is a rational border between active and inactive compounds?
- Are there different mechanisms of antitumor action associated with the
investigated compounds?
S4: The hypothesis that the same-dose results are homogeneous was
investigated. If it is true, the following conclusions will be useful for further
investigations:
First, these results can be combined to increase the overall amount of
data.
Second, the deviation between the results for the same compound is the error
of the experiment.
And this experimental error is the minimal error for any QSAR model based
on these data.
S5: According to statistical concepts, if the same-dose results are homogeneous,
then the deviations between the results for the same compounds form a normally
distributed random sample with zero mean and unknown variance. This null
hypothesis was evaluated with Student’s t-test for each of the 60 cell lines,
using 73 pairs of same-compound results, and was rejected for 41 cell lines at
the default significance level of 0.05. For the other 19 cell lines we cannot
reject the null hypothesis, which means that either it is true or there is
insufficient data to reject it. That is why we reject the investigated
hypothesis in general.
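The per-cell-line test described on this slide can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function name, the array layout (compounds in rows, cell lines in columns) and the synthetic numbers are all assumptions, and the test is taken to be a one-sample t-test of the paired differences against zero.

```python
import numpy as np
from scipy import stats

def homogeneity_by_cell_line(stage1, stage2, alpha=0.05):
    """Test, per cell line, whether stage-2 minus stage-1 growth percent has zero mean.

    stage1, stage2: arrays of shape (n_compounds, n_cell_lines), same dose.
    Returns the t statistics, p-values and a boolean mask of rejected cell lines.
    """
    diff = np.asarray(stage2, dtype=float) - np.asarray(stage1, dtype=float)
    # One-sample t-test of the paired differences against a zero mean, column by column.
    t_stat, p_value = stats.ttest_1samp(diff, popmean=0.0, axis=0)
    return t_stat, p_value, p_value < alpha

# Synthetic data only (hypothetical): 73 compounds, 60 cell lines.
rng = np.random.default_rng(0)
stage1 = rng.normal(90.0, 15.0, size=(73, 60))
stage2 = stage1 + rng.normal(0.0, 10.0, size=(73, 60))
stage2[:, :30] += 8.0  # systematic shift injected into the first 30 cell lines

t_stat, p_value, rejected = homogeneity_by_cell_line(stage1, stage2)
print(int(rejected.sum()), "of", rejected.size, "cell lines reject homogeneity")
```

With real data the two arrays would hold the 10 µM growth percents of the same compounds from the first and the second stage.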
S6: Looking at the distribution of the mean deviations of growth percent across
cell lines, a shift toward positive numbers can be seen. It means that the
results of the second testing stage are more optimistic than those of the first.
S7: The distribution of the mean deviations of growth percent across
compounds indicates the presence of extreme errors that are still not corrected
by averaging. Treating a deviation of 100% between the two values of a single
result pair as an outlier, an extreme error rate above 4% was found.
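The outlier count can be illustrated with a few lines of code. This is only a sketch under an assumption: "100% deviation" is read here as an absolute difference of at least 100 growth-percent points between the two stages. The function name and the toy values are hypothetical.

```python
import numpy as np

def extreme_error_rate(stage1, stage2, threshold=100.0):
    """Fraction of valid result pairs whose absolute deviation reaches the threshold."""
    diff = np.abs(np.asarray(stage2, dtype=float) - np.asarray(stage1, dtype=float))
    valid = ~np.isnan(diff)  # pairs with a missing measurement are skipped
    return np.count_nonzero(diff[valid] >= threshold) / np.count_nonzero(valid)

# Toy values (hypothetical): 3 compounds x 2 cell lines, one missing measurement.
stage1 = np.array([[95.0, 20.0], [100.0, -50.0], [80.0, 110.0]])
stage2 = np.array([[90.0, 125.0], [105.0, 60.0], [np.nan, 100.0]])
rate = extreme_error_rate(stage1, stage2)
print(f"extreme error rate: {rate:.0%}")
```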
S8: Testing results for non-active compounds have to be normally distributed
with a mean growth percent of cancer cells of 100% and unknown variance.
Assuming the absence of tumor growth enhancers among
the investigated compounds, it can be stated that all mean growth percent values
above 100% form the right tail of this distribution. So the left tail can be found
statistically. For this purpose the t-test was evaluated repeatedly with a slowly
changing cut-off, and the first failure to reject the null hypothesis occurred
at a minimum growth percent of 86%. Simply put, all compounds with a mean
growth percent above 86% have to be treated as non-active. Introducing this
border between active and non-active compounds lets us form rational data
arrays for further QSAR investigations.
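One possible reading of this cut-off search is sketched below. The scan direction (downward from 100%), the step of one growth-percent point and the function name are assumptions, since the slide does not state them, and the data are synthetic.

```python
import numpy as np
from scipy import stats

def find_inactive_cutoff(mean_gp, start=100.0, stop=0.0, step=1.0, alpha=0.05):
    """Lower the cut-off until the values above it are compatible with a mean of 100%."""
    values = np.asarray(mean_gp, dtype=float)
    for cutoff in np.arange(start, stop - step, -step):
        sample = values[values >= cutoff]
        if sample.size < 3:
            continue  # too few values for a meaningful test
        p_value = stats.ttest_1samp(sample, popmean=100.0).pvalue
        if p_value >= alpha:
            return float(cutoff)  # first failure to reject the null hypothesis
    return None

# Synthetic mean growth percents (hypothetical): 300 inactive and 60 active compounds.
rng = np.random.default_rng(1)
inactive = rng.normal(0.0, 7.0, size=300)
inactive = 100.0 + inactive - inactive.mean()  # centred exactly on 100%
active = rng.uniform(0.0, 60.0, size=60)
cutoff = find_inactive_cutoff(np.concatenate([inactive, active]))
print("compounds above", cutoff, "% growth are treated as non-active")
```

The value printed here depends on the synthetic spread and is unrelated to the 86% reported on the slide.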
S9: Principal component analysis finds linear combinations of the variables such
that the projection of the initial data onto the obtained vectors has maximal
variance. Principal component analysis can be used under the assumption
that the experimental error is smaller than the difference in sensitivity
patterns between mechanisms. The first principal component is expected to carry
the information about mean growth percent, and the other principal components
cover the differences in mechanisms and the errors of the experiment. The change
in explained variance between two consecutive principal components was selected
as the criterion separating mechanisms from errors. Prior to the calculations
the data were normalized by cell line to give every analyzed cell line equal
influence. Since the change in explained variance at the second principal
component is 5 times greater than the next one, the presence of two different
mechanisms is indicated. Approximate clusters of compounds with different modes
of action are outlined by ellipses in the figure. As you can see, it is
difficult to establish exact borders, and the role of the intermediate compounds
remains unclear. That is why neural network modelling, as a more powerful
computational approach, was used.
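The explained-variance comparison can be sketched with plain NumPy. This is an illustration under assumptions: "normalized by cell lines" is taken to mean z-scoring each column, the function name is hypothetical, and the data are synthetic with one potency factor and one mechanism contrast built in.

```python
import numpy as np

def explained_variance_ratio(gp):
    """PCA via SVD on data standardized per cell line (columns)."""
    x = np.asarray(gp, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    singular_values = np.linalg.svd(z, compute_uv=False)
    variance = singular_values**2
    return variance / variance.sum()

# Synthetic growth data (hypothetical): 100 compounds x 60 cell lines.
rng = np.random.default_rng(2)
potency = rng.normal(size=(100, 1))               # overall activity of a compound
mechanism = rng.normal(size=(100, 1))             # position between two modes of action
pattern = np.where(np.arange(60) < 30, 1.0, -1.0) # cell lines favouring one mode or the other
gp = 3.0 * potency + 1.5 * mechanism * pattern + 0.5 * rng.normal(size=(100, 60))

ratio = explained_variance_ratio(gp)
drops = ratio[:-1] - ratio[1:]  # change in explained variance between consecutive components
print("explained variance, first 4 components:", np.round(ratio[:4], 3))
print("consecutive drops, first 3:", np.round(drops[:3], 3))
```

In this toy set the drop after the second component dwarfs the following one, which is the kind of break the slide uses to separate mechanisms from experimental error.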
S10: A Kohonen self-organizing map of 6 by 6 neurons was trained without
supervision for 5,000 epochs. Prior to the calculations the data were normalized
by compound, so the mean activity information was removed. As a result, clusters
with active compounds also contain inactive ones because of the randomness of
the experimental error. The distribution of the whole compound set over the
neural network is shown in the left figure, and the distribution of only the
active molecules, together with the distances between neurons, is shown in the
right figure. We still cannot clearly separate the different mechanisms,
because the importance of a cluster depends not on the number of active compounds
but on the values of growth percent.
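A compact self-organizing map of this kind can be sketched as follows. It is not the software used in the study: the learning-rate and neighbourhood schedules, the function names and the synthetic profiles are assumptions, and the demo runs far fewer than 5,000 epochs to stay quick.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rows x cols Kohonen map with a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    weights = rng.normal(scale=0.1, size=(rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighbourhood radius shrinks
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best matching unit
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma**2))
            weights += lr * influence[:, None] * (x - weights)
    return weights

def best_matching_units(data, weights):
    """Index of the closest neuron for every row of data."""
    dist2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

# Synthetic profiles (hypothetical): two opposite selectivity patterns over 60 cell lines.
rng = np.random.default_rng(3)
pattern = np.where(np.arange(60) < 30, 1.0, -1.0)
profiles = np.vstack([pattern + 0.3 * rng.normal(size=(20, 60)),
                      -pattern + 0.3 * rng.normal(size=(20, 60))])
# Normalization by compound (rows) removes the mean activity of each compound.
profiles = (profiles - profiles.mean(axis=1, keepdims=True)) / profiles.std(axis=1, keepdims=True)

weights = train_som(profiles)
bmu = best_matching_units(profiles, weights)
print("neurons used by group 1:", sorted(set(bmu[:20].tolist())))
print("neurons used by group 2:", sorted(set(bmu[20:].tolist())))
```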
S11: So an integrated activity measure, calculated as the sum of the
contributions of the compounds in the same cluster, was introduced. The surface
of integrated activity over the neural network is presented in the figure. It
makes it possible to distinguish three classes, two different mechanisms (A and
C) and a mixed one (B), and to separate them clearly.
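The integrated activity surface can be sketched as a weighted count per neuron. The slide does not define a compound's contribution, so the formula below (distance of the mean growth percent below the 86% border, zero for non-active compounds) is purely an assumption, as are the function name and the toy values.

```python
import numpy as np

def integrated_activity(bmu, mean_gp, rows=6, cols=6, cutoff=86.0):
    """Sum the per-compound contributions within each neuron of the map."""
    # Assumed contribution: how far the mean growth percent lies below the cut-off.
    contribution = np.clip(cutoff - np.asarray(mean_gp, dtype=float), 0.0, None)
    total = np.bincount(np.asarray(bmu), weights=contribution, minlength=rows * cols)
    return total.reshape(rows, cols)

# Toy assignment (hypothetical): neuron index and mean growth percent of 6 compounds.
bmu = np.array([0, 0, 7, 35, 35, 35])
mean_gp = np.array([50.0, 95.0, 80.0, 10.0, 60.0, 86.0])
surface = integrated_activity(bmu, mean_gp)
print(surface)
```

A neuron holding a few strongly active compounds then scores higher than one holding many weakly active compounds, which is the point made on the previous slide.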
S12: The weight planes of every cell line in the neural network allow us to
analyze selectivity patterns for the active compounds. In these figures a cell
line is more sensitive to compounds from dark clusters and less sensitive to
compounds from light clusters. It should be pointed out that NCI-H460 is more
sensitive than the other lines to all three classes, the second line is
sensitive mostly to mechanism C, and M14 and SK-MEL-5 are sensitive mostly to
mechanism A. M14 is rather insensitive to cluster C.
S13: On the other hand, all lines on this slide are less sensitive to class A,
and OVCAR-5 and SNB-19 are less sensitive to class C. We can also see that the
weights of mechanism B lie mostly in the interval formed by the A and C weights.
That confirms the hypothesis of a mixed mechanism of tumor growth inhibition by
compounds from cluster B.
S14: Conclusions:
- The homogeneity of the human tumor cell line screen results obtained from
different testing stages is rejected, so they cannot be combined in further
computational investigations.
- About 4% of the testing results are found to be extreme errors, which is
useful for outlier detection in future QSAR models.
- A rational border between active and non-active compounds is introduced, and
proper data arrays for further QSAR are formed.
- Two independent mechanisms and one mixed mechanism of 4-thiazolidinone
antitumor activity are identified.
- Some selectivity of separate cell lines, linked with the different modes of
action, is highlighted.
