Bulletin of Electrical Engineering and Informatics
Vol. 10, No. 6, December 2021, pp. 3121~3126
ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.3179 3121
Journal homepage: http://beei.org
Twin support vector machine using kernel function
for colorectal cancer detection
Zuherman Rustam, Fildzah Zhafarina, Jane Eva Aurelia, Yasirly Amalia
Department of Mathematics, University of Indonesia, Indonesia
Article Info ABSTRACT
Article history:
Received Jul 22, 2020
Revised Jun 14, 2021
Accepted Oct 12, 2021
Machine learning technology is increasingly needed in the medical field, and this
research applies it to a medical problem. Many cases of colorectal cancer are diagnosed late,
so by the time the cancer is detected it is usually well developed.
Machine learning, a branch of artificial intelligence, can help
detect colorectal cancer early. This study discusses colorectal cancer detection
using the twin support vector machine (twin SVM) method with four kernel functions:
linear, polynomial, RBF, and Gaussian kernels. Comparing
accuracy and running time shows which kernel is
better at classifying the colorectal cancer dataset obtained from Al-Islam
Hospital, Bandung, Indonesia. The results show that the polynomial kernel has
the best accuracy and running time: twin SVM with the polynomial kernel reaches a
maximum accuracy of 86% with a running time of 0.502 seconds.
Keywords:
Colorectal cancer
Detection
Kernel
Machine learning
Twin support vector machine
This is an open access article under the CC BY-SA license.
Corresponding Author:
Zuherman Rustam
Department of Mathematics
University of Indonesia
Jl. Prof DR. Sudjono D. Pusponegoro, Pondok Cina, Depok, Jawa Barat 16424, Indonesia
Email: rustam@ui.ac.id
1. INTRODUCTION
Cancer is one of the diseases that cause death worldwide. It is the second leading cause of
death globally [1]. Detecting cancer at an early stage is associated with markedly improved
survival prospects [2], [3], and early-stage cancer is more likely to be treatable [4]. Colorectal cancer has
the third-highest cancer death rate and is responsible for around 600,000 deaths per year worldwide [5]-[8].
Information technology has an important role in the field of medicine, and cancer is a disease that can be
detected by machine learning. Data are very useful in the medical field, as shown by the rapid growth of data
mining in medical science. This growth is reflected in accurate predictions that can reduce treatment
costs, increase patients' chances of recovery, and support life-saving decisions [9], [10].
Machine learning is an application of artificial intelligence that gives systems the ability to
automatically learn and improve from experience without being explicitly programmed [11]. One method
that is popular because of its very good learning performance is the twin support vector machine (twin SVM) [12].
The kernel method uses kernel functions so that an algorithm can operate in a higher-dimensional feature
space. It relies on inner products between the images of all pairs of data points in that feature space. This
method is used directly or indirectly by SVM and twin SVM to classify data [13]. The kernel functions commonly
used for SVM methods are the linear, polynomial, RBF, and Gaussian kernels. This paper proposes the
twin SVM method with these four kernels as a novel approach for the early detection of colorectal cancer, and
compares the performance of the twin SVM with each kernel to find the best kernel for the detection of
colorectal cancer.
2. RESEARCH METHOD
2.1. Twin support vector machine
A standard SVM finds a single hyperplane to classify samples. Jayadeva et al. [14] proposed the twin SVM,
which finds two hyperplanes and assigns each sample to a class according to its distance from
them. The equations of the two hyperplanes are:
$$w_1^T x_s + b_1 = 0$$
$$w_2^T x_s + b_2 = 0$$
Here, $w_i$ and $b_i$ are the parameters of the $i$-th hyperplane. The two hyperplanes are non-parallel,
and each one lies closest to the samples of its own class and farthest from the samples of the opposite
class. Assume a binary classification task with classes +1 and −1, and let the matrices
$A \in \mathbb{R}^{n_1 \times d}$ and $B \in \mathbb{R}^{n_2 \times d}$ hold the samples of class +1 and
class −1, respectively [15]. Each matrix row represents one sample of the corresponding class. The two
hyperplanes of the twin SVM are obtained from (1) and (2):
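$$\min \; \frac{1}{2}(A w_1 + e b_1)^T (A w_1 + e b_1) + \rho_1 e^T \xi \quad \text{s.t.} \quad -(B w_1 + e b_1) + \xi \ge e, \; \xi \ge 0 \tag{1}$$
$$\min \; \frac{1}{2}(B w_2 + e b_2)^T (B w_2 + e b_2) + \rho_2 e^T \xi \quad \text{s.t.} \quad -(A w_2 + e b_2) + \xi \ge e, \; \xi \ge 0 \tag{2}$$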
Here ξ is the vector of slack variables, whose components are non-negative (ξ ≥ 0), and e is a vector of
ones of the appropriate dimension. Allowing the decision margin to make a few mistakes is the standard
approach when the samples cannot be separated linearly (for example, when some points lie inside the margin
or on its wrong side). Each non-zero element of the slack variable vector sets the cost of a misclassified
sample, which is proportional to the distance between the sample and the decision margin. In these
equations, 𝜌1 and 𝜌2 are penalty parameters. Twin SVM is in great demand in various fields, and many
versions of the algorithm have been proposed [16]. Recently, several fuzzy formulations of twin SVM have
also been proposed [17].
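To make problems (1) and (2) concrete, the following is a minimal sketch of a linear twin SVM, not the authors' implementation. It assumes NumPy, solves each problem through its Lagrangian dual (a box-constrained quadratic program) with plain projected gradient ascent, and adds a small ridge term `eps` for numerical stability. The function names, the solver choice, the iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def solve_box_qp(Q, upper, n_iter=5000):
    """Maximise e^T a - 0.5 a^T Q a subject to 0 <= a <= upper (projected gradient ascent)."""
    Q = 0.5 * (Q + Q.T)  # remove round-off asymmetry
    a = np.zeros(Q.shape[0])
    # Step size 1/L, with L the largest eigenvalue of the positive semidefinite matrix Q.
    step = 1.0 / max(np.linalg.eigvalsh(Q)[-1], 1e-12)
    for _ in range(n_iter):
        grad = 1.0 - Q @ a
        a = np.clip(a + step * grad, 0.0, upper)
    return a


def fit_twin_svm(A, B, rho1=1.0, rho2=1.0, eps=1e-6):
    """Return ((w1, b1), (w2, b2)) for problems (1) and (2), solved via their duals."""
    H = np.hstack([A, np.ones((A.shape[0], 1))])  # [A e]
    G = np.hstack([B, np.ones((B.shape[0], 1))])  # [B e]
    ridge = eps * np.eye(H.shape[1])              # assumption: small regulariser
    # Problem (1): [w1; b1] = -(H^T H)^{-1} G^T alpha, with 0 <= alpha <= rho1.
    HtH_inv = np.linalg.inv(H.T @ H + ridge)
    alpha = solve_box_qp(G @ HtH_inv @ G.T, rho1)
    u1 = -HtH_inv @ G.T @ alpha
    # Problem (2): the same construction with the roles of A and B swapped.
    GtG_inv = np.linalg.inv(G.T @ G + ridge)
    beta = solve_box_qp(H @ GtG_inv @ H.T, rho2)
    u2 = -GtG_inv @ H.T @ beta
    return (u1[:-1], u1[-1]), (u2[:-1], u2[-1])


def predict(X, plane1, plane2):
    """Assign +1 if a sample is nearer to hyperplane 1, otherwise -1."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)


# Toy data invented for illustration only (not the hospital dataset).
rng = np.random.default_rng(0)
A = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(30, 2))    # class +1
B = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(30, 2))  # class -1
plane1, plane2 = fit_twin_svm(A, B)
labels_A = predict(A, plane1, plane2)
labels_B = predict(B, plane1, plane2)
```

The prediction rule follows the description above: a sample is assigned to the class whose hyperplane is nearer in perpendicular distance. A dedicated QP solver would normally replace the simple projected-gradient loop.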
2.2. Kernel function
The kernel method uses kernel functions so that algorithms can operate in feature spaces of
higher dimension. It relies on inner products between the images of all pairs of data points in the
feature space [18]. In high-dimensional data sets it is difficult to assign objects accurately to the right
cluster by measuring Euclidean distances, as k-means, c-means, or fuzzy c-medoids do. Distribution
data can be represented to validate the truly central cluster. This difficulty can be overcome by using the
kernel method [19]. Let $X^n$ be an input space, $F$ a feature space, and $\varphi : X^n \to F$. The kernel
function is defined by (3) [20], [21]:
$$K(x_1, x_2) = \varphi(x_1)^T \varphi(x_2) \tag{3}$$
where $x_1, x_2 \in X^n$.
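As a small worked check of definition (3), the sketch below uses the homogeneous degree-2 polynomial kernel in two dimensions, whose explicit feature map is known in closed form. This kernel, the map `phi`, and the sample points are chosen only for illustration and are not taken from the paper.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map of K(x, y) = (x^T y)^2 for 2-D inputs (maps to 3-D)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def kernel(x, y):
    """The same quantity computed directly in the input space."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, 4.0)
in_input_space = kernel(x, y)             # no feature map needed
in_feature_space = dot(phi(x), phi(y))    # inner product of the images
```

Both values agree up to floating-point rounding, which is why the kernel can stand in for the inner product without ever forming the feature vectors.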
Commonly used kernel functions are the linear, polynomial, RBF, and Gaussian kernels.
Table 1 lists their formulas [22], [23]:
Table 1. Formulas of the kernel functions
Kernel Function      Formula
Linear Kernel        $K(x_1, x_2) = x_1^T x_2 + c$
Polynomial Kernel    $K(x_1, x_2) = (\gamma x_1^T x_2 + c)^d$; $\gamma > 0$
RBF Kernel           $K(x_1, x_2) = e^{-\gamma \|x_1 - x_2\|^2}$; $\gamma > 0$
Gaussian Kernel      $K(x_1, x_2) = e^{-\|x_1 - x_2\|^2 / (2\sigma^2)}$
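The four formulas in Table 1 can be written down directly. The sketch below uses only the Python standard library. The function names are illustrative, and the default value 1 for γ, c, and σ is an assumption; the degree d has no default here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def linear_kernel(x1, x2, c=1.0):
    return dot(x1, x2) + c


def polynomial_kernel(x1, x2, d, gamma=1.0, c=1.0):
    return (gamma * dot(x1, x2) + c) ** d


def rbf_kernel(x1, x2, gamma=1.0):
    return math.exp(-gamma * squared_distance(x1, x2))


def gaussian_kernel(x1, x2, sigma=1.0):
    return math.exp(-squared_distance(x1, x2) / (2.0 * sigma ** 2))


x1, x2 = (1.0, 2.0), (3.0, 4.0)
values = {
    "linear": linear_kernel(x1, x2),
    "polynomial": polynomial_kernel(x1, x2, d=2),
    "rbf": rbf_kernel(x1, x2),
    "gaussian": gaussian_kernel(x1, x2),
}
```

From the formulas themselves, the RBF and Gaussian kernels coincide when γ = 1/(2σ²). With γ = 1 and σ = 1 they are different functions, since σ = 1 corresponds to γ = 0.5.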
2.3. k-Fold cross validation
The dataset is divided into two parts, training data and testing data, so that the resulting
model can be evaluated. The machine studies and recognizes colorectal cancer data patterns
from the training data. The testing data are used to evaluate the model obtained after the machine has
learned the data patterns [24]. The k-fold cross validation method is used to divide the dataset into
training data and testing data [25]. It works by dividing the dataset into k parts of the same size. The
process is repeated k times, and in each repetition the model is tested on a different part, which is held
out as validation data while the remaining parts are used for training.
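The splitting step can be sketched with the standard library alone. The example uses the sample count, k-value, and random state that the paper reports (210, 10, and 45). The function name is hypothetical, and Python's `random` module will not reproduce the authors' actual split, so the seed only mimics their setting.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=210, k=10, seed=45))
fold_sizes = [len(test) for _, test in folds]
```

Every sample lands in exactly one test fold, and with 210 samples and k = 10 each fold holds 21 samples.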
2.4. Proposed method
This study proposes several stages. In the first stage, the data are divided into training and testing
data using k-fold cross validation. The chosen k-value was 10, with a random state of 45, so
the dataset was divided into 10 parts of the same size. In the second stage, the twin SVM method
with the linear, polynomial, RBF, and Gaussian kernels uses the training data
to study data patterns and build classification models. In the next stage, the models obtained are used
for classification and evaluated on accuracy and running time. To find the best kernel, the evaluation
parameters produced by each kernel are compared.
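The evaluation stage can be sketched as a small harness that records accuracy and running time for any model over a set of folds. This is an illustration only: `evaluate` and `majority_class` are hypothetical names, the stand-in model merely predicts the most frequent training label, and the toy data are invented. A twin SVM with a chosen kernel would be plugged in as `fit_predict`. The paper does not say which steps its running time covers, so timing the whole loop is an assumption.

```python
import time


def evaluate(fit_predict, X, y, folds):
    """Return (mean accuracy in percent, running time in seconds) over the folds."""
    start = time.perf_counter()
    accuracies = []
    for train_idx, test_idx in folds:
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        y_pred = fit_predict(X_train, y_train, X_test)
        correct = sum(p == y[i] for p, i in zip(y_pred, test_idx))
        accuracies.append(100.0 * correct / len(test_idx))
    elapsed = time.perf_counter() - start
    return sum(accuracies) / len(accuracies), elapsed


def majority_class(X_train, y_train, X_test):
    """Stand-in model: always predicts the most frequent training label."""
    label = max(set(y_train), key=y_train.count)
    return [label for _ in X_test]


# Toy data invented for illustration only (not the hospital dataset).
X = [[float(i)] for i in range(20)]
y = [0, 0, 0, 1] * 5
folds = [
    (list(range(10, 20)), list(range(0, 10))),  # train on second half, test on first
    (list(range(0, 10)), list(range(10, 20))),  # train on first half, test on second
]
models = {"stand-in": majority_class}
results = {name: evaluate(fn, X, y, folds) for name, fn in models.items()}
```

Adding one entry per kernel to `models` would produce the accuracy and running-time pairs that the comparison step needs.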
3. RESULTS AND ANALYSIS
This research uses Jupyter Notebook to run the twin SVM program with the linear,
polynomial, RBF, and Gaussian kernels. All stages in this paper were implemented in the
Python 3 programming language.
3.1. Data
In this study, the data consist of 210 samples and seven features: age, CEA, hemoglobin,
leukocytes, hematocrit, platelets, and diagnosis. The diagnosis feature is the target feature in
detecting colorectal cancer. The data are colorectal cancer data obtained from Al-Islam Hospital, Bandung,
Indonesia, with diagnoses of cancer (1) and no cancer (0). Table 2 shows part of the data:
Table 2. Part of colorectal cancer data
Age CEA Hemoglobin Leukocyte Hematocrit Platelets Diagnosis
74 3.26 11.8 19400 37.3 341000 0
84 29.12 8 12400 26.6 465000 1
81 4.5 8.8 19900 26.2 468000 0
56 0.96 13.9 9400 41.5 260000 0
75 3.24 7.7 13500 22.5 377000 0
58 0.71 11 18200 34 259000 0
63 1.65 10.1 19900 32.1 151000 0
73 36.49 11.1 9700 33.4 267000 1
3.2. Confusion matrix
In this paper, a confusion matrix was used to help calculate the evaluation parameters of the
classification model. Table 3 shows the confusion matrix used to evaluate the kernel-based twin SVM
classification model for the diagnosis of colorectal cancer.
Table 3. Confusion matrix
                         Predicted Cancer (Y)    Predicted Non-Cancer (N)
Actual Cancer (Y)        TP                      FN
Actual Non-Cancer (N)    FP                      TN
Explanation:
TP (true positive): number of colorectal cancer cases that are predicted correctly
TN (true negative): number of non-colorectal-cancer cases that are predicted correctly
FP (false positive): number of non-colorectal-cancer cases that are predicted wrongly (predicted as colorectal
cancer)
FN (false negative): number of colorectal cancer cases that are predicted wrongly (predicted as not colorectal cancer)
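The four counts can be tallied from true and predicted labels as sketched below. The true labels are the diagnosis column of the eight rows shown in Table 2. The predictions are hypothetical and invented purely to exercise every cell of the matrix, and `confusion_counts` is an illustrative name.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN), treating `positive` as the cancer label."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


# Diagnosis column of the rows in Table 2 (1 = cancer, 0 = no cancer).
y_true = [0, 1, 0, 0, 0, 0, 0, 1]
# Hypothetical predictions, invented for illustration only.
y_pred = [0, 1, 1, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```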
3.3. Evaluation parameters
The parameters used to evaluate the performance of the twin SVM classification model were accuracy and
running time. Equation (4) gives the formula for accuracy:
$$\text{Accuracy} = \frac{TN + TP}{FN + TP + FP + TN} \times 100\% \tag{4}$$
Accuracy compares the number of colorectal cancer and non-colorectal-cancer cases that are
identified correctly with the total number of cases.
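The accuracy formula translates directly into code. The counts in the example call are made up for illustration and are not the paper's confusion matrix; `accuracy_percent` is a hypothetical name.

```python
def accuracy_percent(tp, tn, fp, fn):
    """Accuracy in percent: correctly identified cases over all cases."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (tp + tn) / total


# Made-up counts for illustration only.
acc = accuracy_percent(tp=3, tn=5, fp=1, fn=1)
```

With these counts, 8 of the 10 cases are identified correctly, so the accuracy is 80%.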
3.4. Results
In this section, we discuss the performance evaluation of the twin SVM classification model for
detecting colorectal cancer with the linear, polynomial, RBF (radial basis function), and Gaussian
kernels. In this research, the highest accuracy comes from the polynomial kernel, which indicates that
the polynomial kernel is the appropriate kernel for detecting colorectal cancer with a twin support vector
machine. All kernel parameters were set to 1. The performance evaluation parameters compared are accuracy
and running time. Table 4 shows the accuracy and running time of the kernel-based twin SVM classification
models.
Table 4. Results of the twin SVM classification model based on kernel
Classification Model Accuracy (%) Running Time (seconds)
Linear Kernel 81% 0.565
Polynomial Kernel 86% 0.502
RBF Kernel 76% 1.605
Gaussian Kernel 76% 1.612
Table 4 shows that the twin SVM model reached its highest accuracy, 86%,
with the polynomial kernel, at a running time of 0.502 seconds. The lowest accuracy, 76%, was
recorded with the RBF and Gaussian kernels, with running times of 1.605 seconds and 1.612 seconds,
respectively. In terms of running time, the twin SVM model with the polynomial kernel is the
fastest, at about 0.502 s, while the model with the Gaussian kernel is the slowest, at about 1.612 s.
Based on these results, the polynomial kernel gives the best results in both accuracy and running time. Thus,
the polynomial kernel is the best kernel for the twin SVM on this colorectal cancer dataset.
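The comparison can be reproduced mechanically from the figures in Table 4. The ranking rule used here, highest accuracy first with shorter running time as the tie-breaker, is an assumption about how the two criteria are combined.

```python
# Accuracy (%) and running time (seconds) copied from Table 4.
results = {
    "linear": (81, 0.565),
    "polynomial": (86, 0.502),
    "rbf": (76, 1.605),
    "gaussian": (76, 1.612),
}

# Rank by highest accuracy first, then by shortest running time.
ranking = sorted(results, key=lambda name: (-results[name][0], results[name][1]))
best = ranking[0]
fastest = min(results, key=lambda name: results[name][1])
```

Under this rule the polynomial kernel ranks first on both criteria, in line with the conclusion drawn above.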
4. CONCLUSION
Detecting colorectal cancer quickly is very important, because it allows the cancer to be treated before
it spreads to other organs of the body. However, this is difficult because colorectal cancer has no specific
symptoms. The twin SVM method can help detect colorectal cancer based on blood tests and age. The most
appropriate kernel for the twin SVM method in detecting colorectal cancer is the polynomial kernel, which
produces an accuracy of 86% with a running time of 0.502 seconds.
ACKNOWLEDGEMENTS
Zuherman Rustam was supported financially by the University of Indonesia with a Hibah FMIPA 2021
grant scheme. Fildzah Zhafarina, Jane Eva Aurelia, and Yasirly Amalia were supported financially by the
PUTI 2020 grant scheme.
REFERENCES
[1] T. Nadira and Z. Rustam, “Classification of cancer data using support vector machines with features selection
method based on global artificial bee colony,” AIP Conference Proceedings, vol. 2023, no. 1, pp. 1-8, 2018, doi:
10.1063/1.5064202.
[2] N. Bannister and J. Broggio, “Cancer survival by stage at diagnosis for England (experimental statistics): adults
diagnosed 2012, 2013 and 2014 and followed up to 2015,” Produced in collaboration with Public Health England,
2015.
[3] P. Muller, S. Walters, M. P. Coleman and L. Woods, “Which indicators of early cancer diagnosis from population-
based data sources are associated with short-term mortality and survival?,” Cancer epidemiology, vol. 56, pp. 161-
170, 2018, doi: 10.1016/j.canep.2018.07.010.
[4] R. Siegel, C. Desantis and A. Jemal, “Colorectal cancer statistics, 2014,” CA: a cancer journal for clinicians, vol.
64, no. 2, pp. 104-117, 2014, doi: 10.3322/caac.21220.
[5] W. C. Shangkuan, “Risk analysis of colorectal cancer incidence by gene expression analysis,” PeerJ, 5, e3003,
2017, doi: 10.7717/peerj.3003.
[6] M. S. Kim, D. Kim and J. -R. Kim, “Stage-Dependent Gene Expression Profiling in Colorectal Cancer,”
IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 16, no. 5, pp. 1685-1692, 2019, doi:
10.1109/TCBB.2018.2814043.
[7] A. Calon, et al., “Stromal gene expression defines poor-prognosis subtypes in colorectal cancer,” Nature genetics,
vol. 47, no. 4, pp. 320-329, 2015, doi: 10.1038/ng.3225.
[8] P. F. Simmonds, et al., “Surgery for colorectal cancer in elderly patients: a systematic review,” The Lancet, vol.
356, no. 9234, pp. 968-974, 2000, doi: 10.1016/s0140-6736(00)02707-0.
[9] H. Asri, H. Mousannif, H. A. Moatassime and T. Noel, “Using machine learning algorithms for breast cancer risk
prediction and diagnosis,” Procedia Computer Science, vol. 83, pp. 1064-1069, 2016, doi:
10.1016/j.procs.2016.04.224.
[10] Z. Rustam, V. A. W. Hapsari and M. R. Solihin, “Optimal cervical cancer classification using Gauss-Newton
representation based algorithm,” AIP Conference Proceedings, vol. 2168, no. 1, pp. 020045 1-6, 2019, doi:
10.1063/1.5132472.
[11] Z. Rustam and N. P. A. Ariantari, “Support Vector Machines for Classifying Policyholders Satisfactorily in
Automobile Insurance,” Journal of Physics: Conference Series, vol. 1028, no. 1, pp. 1-9, 2018, doi:
10.1088/1742-6596/1028/1/012005.
[12] H. Huajuan, W. Xiuxi and Z. Yongquan, “Twin support vector machines: A survey,” Neurocomputing, vol. 300,
pp. 34-43, 2018, doi: 10.1016/j.neucom.2018.01.093.
[13] J. Shawe-Taylor and N. Cristianini, “Kernel methods for pattern analysis,” Cambridge University Press,
2004.
[14] Jayadeva, R. Khemchandani and S. Chandra, “Twin Support Vector Machines for Pattern Classification,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 905-910, 2007, doi:
10.1109/TPAMI.2007.1068.
[15] M. Tzelepi and A. Tefas, “Improving the performance of lightweight CNNs for binary classification using
Quadratic Mutual Information regularization,” Pattern Recognition, vol. 106, no. 107407, 2020, doi:
10.1016/j.patcog.2020.107407.
[16] S. Ding, Y. An, X. Zhang, F. Wu and Y. Xue, “Wavelet twin support vector machines based on glowworm swarm
optimization,” Neurocomputing, vol. 225, pp. 157-163, 2017, doi: 10.1016/j.neucom.2016.11.026.
[17] D. Gupta, B. Richhariya and P. Borah, “A fuzzy twin support vector machine based on information entropy for
class imbalance learning,” Neural Computing and Applications, vol. 31, no. 11, pp. 7153-7164, 2019, doi:
10.1007/s00521-018-3551-9.
[18] W. Sadewo, Z. Rustam, H. Hamidah and A. R. Chusmarsyah, “Pancreatic Cancer Early Detection Using Twin
Support Vector Machine Based on Kernel,” Symmetry, vol. 12, no. 667, pp. 1-8, 2020, doi: 10.3390/sym12040667.
[19] Z. Rustam and A. S. Talita, “Fuzzy kernel K-medoids algorithm for anomaly detection problems,” AIP Conference
Proceedings, vol. 1862, no. 030154, pp. 1-8, 2016, doi: 10.1063/1.4991258.
[20] Z. Rustam and R. Faradina, “Face recognition to identify look-alike faces using support vector machine,” Journal
of Physics: Conference Series, vol. 1108, no. 012071, pp. 1-7, 2018, doi: 10.1088/1742-6596/1108/1/012071.
[21] C. Bishop, “Pattern recognition and machine learning,” Springer, 2006.
[22] A. Zheng, “Evaluating Machine Learning Models: A Beginner’s Guide to Key Concepts and Pitfalls,” Sebastopol,
CA: O’Reilly Media, Inc, 2015.
[23] Arfiani, Z. Rustam, J. Pandelaki and A. Siahan, “Kernel spherical k-means and support vector machine for acute
sinusitis classification,” IOP Conference Series: Materials Science and Engineering, vol. 546, no. 052011, pp. 1-
10, 2019, doi: 10.1088/1757-899X/546/5/052011.
[24] H. Glanz, L. Calvanho, D. S. Menashe and M. A. Frield, “A parametric model for classifying land cover and
evaluating training data based on multi-temporal remote sensing data,” ISPRS journal of photogrammetry and
remote sensing, vol. 97, pp. 219-228, 2014, doi: 10.1016/j.isprsjprs.2014.09.004.
[25] S. Saud, B. Jamil, Y. Upadhyay and K. Irshad, “Performance improvement of empirical models for estimation of
global solar radiation in India: A k-fold cross-validation approach,” Sustainable Energy Technologies and
Assessments, vol. 40, no. 100768, 2020, doi: 10.1016/j.seta.2020.100768.
BIOGRAPHIES OF AUTHORS
Zuherman Rustam is an Associate Professor and a lecturer in computational intelligence at
the Department of Mathematics, University of Indonesia. He obtained his Master of Science in
informatics in 1989 from Paris Diderot University, France, and completed his Ph.D. in computer
science in 2006 at the University of Indonesia. Assoc. Prof. Dr. Rustam is a member of IEEE
who is actively researching machine learning, pattern recognition, neural networks, and artificial
intelligence.
Fildzah Zhafarina is a final year student in the Department of Mathematics, University of
Indonesia. Ms. Zhafarina is passionately researching machine learning, computer vision, neural
networks, and deep learning in various fields.
Jane Eva Aurelia was born in Jakarta on 19 June 1998. She is a final year student in the
Department of Mathematics, University of Indonesia. She is currently working on her thesis,
which is about applied mathematics using machine learning. Ms. Jane's
research specialties are machine learning, mathematical modeling, and data
mining.
Yasirly Amalia is a final semester student in the Department of Mathematics, University of
Indonesia. Ms. Yasirly is passionate about machine learning, mathematical modelling, and data
mining.

More Related Content

PDF
Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...
PDF
Az4102375381
PDF
Breast cancer diagnosis and recurrence prediction using machine learning tech...
PDF
Ijetcas14 327
PDF
IRJET - Survey on Analysis of Breast Cancer Prediction
PDF
M sc research_project_report_x18134599
PDF
My own Machine Learning project - Breast Cancer Prediction
PPTX
Predictive Analysis of Breast Cancer Detection using Classification Algorithm
Possibilistic Fuzzy C Means Algorithm For Mass classificaion In Digital Mammo...
Az4102375381
Breast cancer diagnosis and recurrence prediction using machine learning tech...
Ijetcas14 327
IRJET - Survey on Analysis of Breast Cancer Prediction
M sc research_project_report_x18134599
My own Machine Learning project - Breast Cancer Prediction
Predictive Analysis of Breast Cancer Detection using Classification Algorithm

What's hot (20)

PPTX
Breast cancer classification
PDF
Iganfis Data Mining Approach for Forecasting Cancer Threats
PDF
Breast Cancer Detection using Convolution Neural Network
PDF
40120130405013
PDF
A new model for large dataset dimensionality reduction based on teaching lear...
PDF
Classification of Breast Masses Using Convolutional Neural Network as Feature...
PDF
Quantitative Comparison of Artificial Honey Bee Colony Clustering and Enhance...
PPTX
a novel approach for breast cancer detection using data mining tool weka
DOCX
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
PDF
Breast Mass Segmentation Using a Semi-automatic Procedure Based on Fuzzy C-me...
PDF
F43043034
PDF
Skin lesion detection from dermoscopic images using Convolutional Neural Netw...
PDF
IRJET- Detection of Breast Cancer using Machine Learning Techniques
PDF
Classification of pneumonia from X-ray images using siamese convolutional net...
PDF
Hybrid Technique Based on N-GRAM and Neural Networks for Classification of Ma...
PDF
Early Detection of Lung Cancer Using Neural Network Techniques
PDF
25 17 dec16 13743 28032-1-sm(edit)
PDF
Deep learning model for thorax diseases detection
PPTX
On Predicting and Analyzing Breast Cancer using Data Mining Approach
DOCX
Report (1)
Breast cancer classification
Iganfis Data Mining Approach for Forecasting Cancer Threats
Breast Cancer Detection using Convolution Neural Network
40120130405013
A new model for large dataset dimensionality reduction based on teaching lear...
Classification of Breast Masses Using Convolutional Neural Network as Feature...
Quantitative Comparison of Artificial Honey Bee Colony Clustering and Enhance...
a novel approach for breast cancer detection using data mining tool weka
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Breast Mass Segmentation Using a Semi-automatic Procedure Based on Fuzzy C-me...
F43043034
Skin lesion detection from dermoscopic images using Convolutional Neural Netw...
IRJET- Detection of Breast Cancer using Machine Learning Techniques
Classification of pneumonia from X-ray images using siamese convolutional net...
Hybrid Technique Based on N-GRAM and Neural Networks for Classification of Ma...
Early Detection of Lung Cancer Using Neural Network Techniques
25 17 dec16 13743 28032-1-sm(edit)
Deep learning model for thorax diseases detection
On Predicting and Analyzing Breast Cancer using Data Mining Approach
Report (1)
Ad

Similar to Twin support vector machine using kernel function for colorectal cancer detection (20)

DOC
Introduction to Support Vector Machines
PDF
An Approach for Disease Data Classification Using Fuzzy Support Vector Machine
PDF
The comparison study of kernel KC-means and support vector machines for class...
PDF
Classification of Breast Cancer Diseases using Data Mining Techniques
PPTX
Module 3 -Support Vector Machines data mining
PDF
A Fuzzy Interactive BI-objective Model for SVM to Identify the Best Compromis...
PDF
A FUZZY INTERACTIVE BI-OBJECTIVE MODEL FOR SVM TO IDENTIFY THE BEST COMPROMIS...
PDF
Single to multiple kernel learning with four popular svm kernels (survey)
PDF
Application of combined support vector machines in process fault diagnosis
PDF
Support Vector Machine Optimal Kernel Selection
PDF
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
PDF
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
PPT
PPT
SVM (2).ppt
PPTX
Support Vector Machines Simply
PDF
Epsrcws08 campbell kbm_01
PDF
Extra Lecture - Support Vector Machines (SVM), a lecture in subject module St...
PPT
Introduction to Support Vector Machine 221 CMU.ppt
PDF
TextCategorization support vector_0308.pdf
Introduction to Support Vector Machines
An Approach for Disease Data Classification Using Fuzzy Support Vector Machine
The comparison study of kernel KC-means and support vector machines for class...
Classification of Breast Cancer Diseases using Data Mining Techniques
Module 3 -Support Vector Machines data mining
A Fuzzy Interactive BI-objective Model for SVM to Identify the Best Compromis...
A FUZZY INTERACTIVE BI-OBJECTIVE MODEL FOR SVM TO IDENTIFY THE BEST COMPROMIS...
Single to multiple kernel learning with four popular svm kernels (survey)
Application of combined support vector machines in process fault diagnosis
Support Vector Machine Optimal Kernel Selection
IRJET- Breast Cancer Relapse Prognosis by Classic and Modern Structures o...
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
SVM (2).ppt
Support Vector Machines Simply
Epsrcws08 campbell kbm_01
Extra Lecture - Support Vector Machines (SVM), a lecture in subject module St...
Introduction to Support Vector Machine 221 CMU.ppt
TextCategorization support vector_0308.pdf
Ad

More from journalBEEI (20)

PDF
Square transposition: an approach to the transposition process in block cipher
PDF
Hyper-parameter optimization of convolutional neural network based on particl...
PDF
Supervised machine learning based liver disease prediction approach with LASS...
PDF
A secure and energy saving protocol for wireless sensor networks
PDF
Plant leaf identification system using convolutional neural network
PDF
Customized moodle-based learning management system for socially disadvantaged...
PDF
Understanding the role of individual learner in adaptive and personalized e-l...
PDF
Prototype mobile contactless transaction system in traditional markets to sup...
PDF
Wireless HART stack using multiprocessor technique with laxity algorithm
PDF
Implementation of double-layer loaded on octagon microstrip yagi antenna
PDF
The calculation of the field of an antenna located near the human head
PDF
Exact secure outage probability performance of uplinkdownlink multiple access...
PDF
Design of a dual-band antenna for energy harvesting application
PDF
Transforming data-centric eXtensible markup language into relational database...
PDF
Key performance requirement of future next wireless networks (6G)
PDF
Noise resistance territorial intensity-based optical flow using inverse confi...
PDF
Modeling climate phenomenon with software grids analysis and display system i...
PDF
An approach of re-organizing input dataset to enhance the quality of emotion ...
PDF
Parking detection system using background subtraction and HSV color segmentation
PDF
Quality of service performances of video and voice transmission in universal ...
Square transposition: an approach to the transposition process in block cipher
Hyper-parameter optimization of convolutional neural network based on particl...
Supervised machine learning based liver disease prediction approach with LASS...
A secure and energy saving protocol for wireless sensor networks
Plant leaf identification system using convolutional neural network
Customized moodle-based learning management system for socially disadvantaged...
Understanding the role of individual learner in adaptive and personalized e-l...
Prototype mobile contactless transaction system in traditional markets to sup...
Wireless HART stack using multiprocessor technique with laxity algorithm
Implementation of double-layer loaded on octagon microstrip yagi antenna
The calculation of the field of an antenna located near the human head
Exact secure outage probability performance of uplinkdownlink multiple access...
Design of a dual-band antenna for energy harvesting application
Transforming data-centric eXtensible markup language into relational database...
Key performance requirement of future next wireless networks (6G)
Noise resistance territorial intensity-based optical flow using inverse confi...
Modeling climate phenomenon with software grids analysis and display system i...
An approach of re-organizing input dataset to enhance the quality of emotion ...
Parking detection system using background subtraction and HSV color segmentation
Quality of service performances of video and voice transmission in universal ...

Recently uploaded (20)

PDF
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PPT
Project quality management in manufacturing
PPTX
Sustainable Sites - Green Building Construction
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
bas. eng. economics group 4 presentation 1.pptx
PDF
Well-logging-methods_new................
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
web development for engineering and engineering
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
Internet of Things (IOT) - A guide to understanding
PPTX
Artificial Intelligence
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPTX
CH1 Production IntroductoryConcepts.pptx
DOCX
573137875-Attendance-Management-System-original
PPTX
Safety Seminar civil to be ensured for safe working.
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
R24 SURVEYING LAB MANUAL for civil enggi
Project quality management in manufacturing
Sustainable Sites - Green Building Construction
CYBER-CRIMES AND SECURITY A guide to understanding
bas. eng. economics group 4 presentation 1.pptx
Well-logging-methods_new................
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
web development for engineering and engineering
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
UNIT-1 - COAL BASED THERMAL POWER PLANTS
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
Internet of Things (IOT) - A guide to understanding
Artificial Intelligence
Foundation to blockchain - A guide to Blockchain Tech
Embodied AI: Ushering in the Next Era of Intelligent Systems
CH1 Production IntroductoryConcepts.pptx
573137875-Attendance-Management-System-original
Safety Seminar civil to be ensured for safe working.

Twin support vector machine using kernel function for colorectal cancer detection

  • 1. Bulletin of Electrical Engineering and Informatics Vol. 10, No. 6, December 2021, pp. 3121~3126 ISSN: 2302-9285, DOI: 10.11591/eei.v10i6.3179 3121 Journal homepage: http://beei.org Twin support vector machine using kernel function for colorectal cancer detection Zuherman Rustam, Fildzah Zhafarina, Jane Eva Aurelia, Yasirly Amalia Department of Mathematics, University of Indonesia, Indonesia Article Info ABSTRACT Article history: Received Jul 22, 2020 Revised Jun 14, 2021 Accepted Oct 12, 2021 Nowadays, machine learning technology is needed in the medical field. therefore, this research is useful for solving problems in the medical field by using machine learning. Many cases of colorectal cancer are diagnosed late. When colorectal cancer is detected, the cancer is usually well developed. Machine learning is an approach that is part of artificial intelligence and can detect colorectal cancer early. This study discusses colorectal cancer detection using twin support vector machine (SVM) method and kernel function i.e. linear kernels, polynomial kernels, RBF kernels, and gaussian kernels. By comparing the accuracy and running time, then we will know which method is better in classifying the colorectal cancer dataset that we get from Al-Islam Hospital, Bandung, Indonesia. The results showed that polynomial kernels has better accuracy and running time. It can be seen with a maximum accuracy of twin SVM using polynomial kernels 86% and 0.502 seconds running time. Keywords: Colorectal cancer Detection Kernel Machine learning Twin support vector machine This is an open access article under the CC BY-SA license. Corresponding Author: Zuherman Rustam Department of Mathematics University of Indonesia Jl. Prof DR. Sudjono D. Pusponegoro, Pondok Cina, Depok, Jawa Barat 16424, Indonesia Email: rustam@ui.ac.id 1. INTRODUCTION One of the diseases that cause death in the world is cancer. Cancer is the second leading cause of death globally [1]. 
Detecting cancer at an early stage is associated with markedly improved survival prospects [2], [3], and early-stage cancer is more likely to be treatable [4]. Colorectal cancer has the third-highest death rate among cancers and is responsible for around 600,000 deaths per year worldwide [5]-[8].
Information technology has an important role in medicine, and cancer is a disease that can be detected with machine learning. Data are very useful in the medical field, as shown by the rapid growth of data mining in medical science. Its benefits include high prediction accuracy, lower treatment costs, better chances of patient recovery, and support for life-saving decisions [9], [10]. Machine learning is an application of artificial intelligence that gives systems the ability to learn and improve automatically from experience without being explicitly programmed [11]. One popular method with very good learning performance is the twin support vector machine (SVM) [12].
The kernel method uses kernel functions so that an algorithm can operate in a higher-dimensional feature space. It relies on inner products between the images of all pairs of data points in that feature space, and it is used directly or indirectly by the SVM and the twin SVM to classify data [13]. The kernel functions commonly used with SVM methods are the linear, polynomial, RBF, and Gaussian kernels.
This paper proposes the twin SVM method as a novel approach for the early detection of colorectal cancer, using the linear, polynomial, RBF, and Gaussian kernels. It compares the performance of the twin SVM with each kernel to find the best kernel for colorectal cancer detection.
  • 2.
2. RESEARCH METHOD
2.1. Twin support vector machine
SVM is a method that finds a single hyperplane to classify samples. The twin SVM proposed in [14] instead uses two hyperplanes and assigns each sample to a class according to its distance from each hyperplane. For a sample $x_s$, the equations of the two hyperplanes are:
$$w_1^T x_s + b_1 = 0, \qquad w_2^T x_s + b_2 = 0$$
The parameters of the i-th hyperplane are $w_i$ and $b_i$. The two hyperplanes are non-parallel; each is closest to the samples of its own class and farthest from the samples of the opposite class. Assume a binary classification task with classes +1 and −1, and let $A \in \mathbb{R}^{n_1 \times d}$ and $B \in \mathbb{R}^{n_2 \times d}$ be the matrices holding the samples of class +1 and class −1, respectively [15]. Each matrix row is one sample of the corresponding class. The two hyperplanes of the twin SVM are obtained from (1) and (2):
$$\min \ \frac{1}{2}(A w_1 + e b_1)^T (A w_1 + e b_1) + \rho_1 e^T \xi \quad \text{s.t.} \quad -(B w_1 + e b_1) + \xi \ge e,\ \xi \ge 0 \tag{1}$$
$$\min \ \frac{1}{2}(B w_2 + e b_2)^T (B w_2 + e b_2) + \rho_2 e^T \xi \quad \text{s.t.} \quad -(A w_2 + e b_2) + \xi \ge e,\ \xi \ge 0 \tag{2}$$
Here $\xi$ is the vector of slack variables, whose components are non-negative ($\xi \ge 0$), and $e$ is a vector of ones of the appropriate dimension. Allowing the decision margin to make a few mistakes is the standard approach when the samples cannot be separated linearly (for example, when some points lie inside the margin or on the wrong side of it). Each nonzero element of the slack vector determines the cost of a misclassified sample, which is proportional to the distance between the sample and the decision margin. In these equations, $\rho_1$ and $\rho_2$ are penalty parameters. The twin SVM is in great demand in various fields, and many versions of the algorithm have been proposed [16]. Recently, several fuzzy formulations of the twin SVM have also been proposed [17].
2.2. Kernel function
The kernel method uses kernel functions so that algorithms can operate in higher-dimensional feature spaces.
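To make the idea of operating in a higher-dimensional feature space concrete, the following minimal Python sketch compares a kernel evaluated in the input space with the inner product of explicitly mapped points. It assumes a degree-2 polynomial kernel with γ = 1 and c = 0 on 2-D inputs; the feature map `phi` and the helper names are illustrative choices and do not come from the paper.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map R^2 -> R^3 for the kernel K(x, y) = (x^T y)^2 (assumed example)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel with gamma = 1 and c = 0, computed in the input space."""
    return dot(x, y) ** 2


# Two arbitrary illustrative points (not from the colorectal cancer dataset)
x, y = (1.0, 2.0), (3.0, -1.0)

explicit = dot(phi(x), phi(y))   # inner product in the 3-D feature space
implicit = poly2_kernel(x, y)    # same quantity without ever building phi
```

Both values agree up to floating-point rounding, which is why a kernel-based classifier never needs to construct the feature-space images explicitly.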
This method uses inner products between the images of all pairs of data points in the feature space [18]. With methods such as k-means, c-means, or fuzzy c-medoids, which measure Euclidean distances, it is difficult to assign objects accurately to the right cluster in high-dimensional data sets. The distribution of the data can be represented to validate the true cluster centers. This difficulty can be overcome by using the kernel method [19]. Let $X^n$ be an input space, $F$ a feature space, and $\phi : X^n \to F$. Equation (3) defines the kernel function [20], [21]:
$$K(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle \tag{3}$$
where $x_1, x_2 \in X^n$. The kernel functions that are often used are the linear, polynomial, RBF, and Gaussian kernels. Table 1 lists their formulas [22], [23]:
Table 1. The formula of kernel function
Kernel function | Formula
Linear kernel | $K(x_1, x_2) = x_1^T x_2 + c$
Polynomial kernel | $K(x_1, x_2) = (\gamma x_1^T x_2 + c)^d,\ \gamma > 0$
RBF kernel | $K(x_1, x_2) = e^{-\gamma \lVert x_1 - x_2 \rVert^2},\ \gamma > 0$
Gaussian kernel | $K(x_1, x_2) = e^{-\lVert x_1 - x_2 \rVert^2 / (2\sigma^2)}$
2.3. k-Fold cross validation
The dataset is divided into two parts: training data and testing data. This split makes it possible to build a model and then evaluate it. The machine studies and recognizes colorectal cancer data patterns from the training data.
  • 3. Testing data are used to evaluate the model obtained after the machine has learned the data patterns [24]. The k-fold cross validation method is used to divide the dataset into training and testing data [25]. It works by dividing the dataset into k parts of equal size; the model is then trained and tested k times, with each part taken once as validation data.
2.4. Proposed method
This study proposes several stages. First, the data are divided into training and testing data using k-fold cross validation, with k = 10 and a random state of 45, so the dataset is divided into 10 parts of equal size. Second, the twin SVM method with the linear, polynomial, RBF, and Gaussian kernels uses the training data to study data patterns and build classification models. Next, the resulting models are used for classification and evaluated on accuracy and running time. Finally, the evaluation parameters produced by each kernel are compared to find the best kernel.
3. RESULTS AND ANALYSIS
This research used Jupyter Notebook to run the twin SVM program with the linear, polynomial, RBF, and Gaussian kernels. All stages were carried out in the Python 3 programming language.
3.1. Data
The data in this study consist of 210 samples and seven features: CEA, hemoglobin, leukocytes, hematocrit, platelets, age, and diagnosis. The diagnosis feature is the target feature for detecting colorectal cancer. The data are colorectal cancer data obtained from Al-Islam Hospital, Bandung, Indonesia, with the diagnosis coded as cancer (1) or no cancer (0).
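The 10-fold split from Section 2.4 can be sketched in plain Python for the 210 samples described above. This is a minimal illustration, not the authors' code: the function name `k_fold_indices` is hypothetical, and although the seed 45 mirrors the stated random state, Python's `random` module will not necessarily reproduce the split used in the study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=45):
    """Yield (train_idx, test_idx) index lists for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle so the split is repeatable

    # Spread any remainder over the first folds so sizes differ by at most one
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 210 samples and k = 10 give ten folds of 21 test samples and 189 training samples
folds = list(k_fold_indices(210, k=10, seed=45))
```

Each sample appears in exactly one test fold, so every sample is used once for validation and nine times for training.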
Table 2 shows part of the data:
Table 2. Part of colorectal cancer data
Age | CEA | Hemoglobin | Leukocyte | Hematocrit | Platelets | Diagnosis
74 | 3.26 | 11.8 | 19400 | 37.3 | 341000 | 0
84 | 29.12 | 8 | 12400 | 26.6 | 465000 | 1
81 | 4.5 | 8.8 | 19900 | 26.2 | 468000 | 0
56 | 0.96 | 13.9 | 9400 | 41.5 | 260000 | 0
75 | 3.24 | 7.7 | 13500 | 22.5 | 377000 | 0
58 | 0.71 | 11 | 18200 | 34 | 259000 | 0
63 | 1.65 | 10.1 | 19900 | 32.1 | 151000 | 0
73 | 36.49 | 11.1 | 9700 | 33.4 | 267000 | 1
3.2. Confusion matrix
In this paper, a confusion matrix was used to help calculate the evaluation parameters of the classification model. Table 3 shows the confusion matrix used to evaluate the kernel-based twin SVM classification model for the diagnosis of colorectal cancer.
Table 3. Confusion matrix
 | Predicted: Cancer (Y) | Predicted: Non-cancer (N)
Actual: Cancer (Y) | TP | FN
Actual: Non-cancer (N) | FP | TN
Explanation:
TP (true positive): the number of colorectal cancer cases that are predicted correctly
TN (true negative): the number of non-colorectal-cancer cases that are predicted correctly
FP (false positive): the number of non-colorectal-cancer cases that are predicted wrongly (predicted as colorectal cancer)
FN (false negative): the number of colorectal cancer cases that are predicted wrongly (predicted as not colorectal cancer)
3.3. Evaluation parameters
The parameters used to evaluate the performance of the twin SVM classification model were accuracy and required running time. Equation (4) gives the formula for accuracy:
  • 4.
$$\text{Accuracy} = \frac{TN + TP}{FN + TP + FP + TN} \times 100\% \tag{4}$$
Accuracy compares the number of cases, both colorectal cancer and not colorectal cancer, that are identified correctly with the total number of cases.
3.4. Results
This section discusses the performance evaluation of the twin SVM classification model built with the linear, polynomial, RBF (radial basis function), and Gaussian kernels to detect colorectal cancer. The highest accuracy was obtained with the polynomial kernel, which indicates that it is the appropriate kernel for detecting colorectal cancer with a twin support vector machine. Table 4 compares the accuracy and running time of the twin SVM with each kernel. All kernel parameters were set to 1.
Table 4. Results of the twin SVM classification model based on kernel
Classification model | Accuracy (%) | Running time (seconds)
Linear kernel | 81 | 0.565
Polynomial kernel | 86 | 0.502
RBF kernel | 76 | 1.605
Gaussian kernel | 76 | 1.612
Based on Table 4, the highest accuracy, 86%, was recorded with the polynomial kernel, with a running time of 0.502 seconds. The lowest accuracy, 76%, was recorded with the RBF and Gaussian kernels, with running times of 1.605 seconds and 1.612 seconds, respectively.
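Equation (4) can be checked with a short Python function. The counts below are hypothetical values for a single 21-sample test fold (210 samples divided into 10 folds) and are not taken from the study's results; the function name is likewise illustrative.

```python
def accuracy_percent(tp, tn, fp, fn):
    """Accuracy from confusion-matrix counts, as a percentage (Equation 4)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (tp + tn) / total


# Hypothetical counts for one 21-sample test fold (illustrative only)
example = accuracy_percent(tp=6, tn=13, fp=1, fn=1)  # 19 of 21 correct, about 90.5%
```

Because accuracy counts both error types equally, it does not distinguish a missed cancer case (FN) from a false alarm (FP); the separate cells of the confusion matrix remain the place to look for that distinction.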
In terms of running time, the twin SVM model with the polynomial kernel is the fastest of the four, at about 0.502 s, while the model with the Gaussian kernel is the slowest, at about 1.612 s. Based on these results, the polynomial kernel gives the best results in both accuracy and running time. Thus, the polynomial kernel is the best kernel for the twin SVM on this colorectal cancer dataset.
4. CONCLUSION
Detecting colorectal cancer quickly is very important, because it allows the cancer to be treated before it spreads to other organs of the body. However, this is difficult because colorectal cancer has no specific symptoms. The twin SVM method can help detect colorectal cancer based on blood tests and age. The most appropriate kernel for the twin SVM method in detecting colorectal cancer is the polynomial kernel, which produces an accuracy of 86% with a running time of 0.502 seconds.
ACKNOWLEDGEMENTS
Zuherman Rustam was supported financially by the University of Indonesia through the Hibah FMIPA 2021 grant scheme. Fildzah Zhafarina, Jane Eva Aurelia, and Yasirly Amalia were supported financially by the PUTI 2020 grant scheme.
REFERENCES
[1] T. Nadira and Z. Rustam, “Classification of cancer data using support vector machines with features selection method based on global artificial bee colony,” AIP Conference Proceedings, vol. 2023, no. 1, pp. 1-8, 2018, doi: 10.1063/1.5064202.
[2] N. Bannister and J. Broggio, “Cancer survival by stage at diagnosis for England (experimental statistics): adults diagnosed 2012, 2013 and 2014 and followed up to 2015,” Produced in collaboration with Public Health England, 2015.
  • 5. [3] P. Muller, S. Walters, M. P. Coleman and L. Woods, “Which indicators of early cancer diagnosis from population-based data sources are associated with short-term mortality and survival?,” Cancer epidemiology, vol. 56, pp. 161-170, 2018, doi: 10.1016/j.canep.2018.07.010.
[4] R. Siegel, C. Desantis and A. Jemal, “Colorectal cancer statistics, 2014,” CA: a cancer journal for clinicians, vol. 64, no. 2, pp. 104-117, 2014, doi: 10.3322/caac.21220.
[5] W. C. Shangkuan, “Risk analysis of colorectal cancer incidence by gene expression analysis,” PeerJ, vol. 5, e3003, 2017, doi: 10.7717/peerj.3003.
[6] M. S. Kim, D. Kim and J.-R. Kim, “Stage-Dependent Gene Expression Profiling in Colorectal Cancer,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 16, no. 5, pp. 1685-1692, 2019, doi: 10.1109/TCBB.2018.2814043.
[7] A. Calon, et al., “Stromal gene expression defines poor-prognosis subtypes in colorectal cancer,” Nature genetics, vol. 47, no. 4, pp. 320-329, 2015, doi: 10.1038/ng.3225.
[8] P. F. Simmonds, et al., “Surgery for colorectal cancer in elderly patients: a systematic review,” The Lancet, vol. 356, no. 9234, pp. 968-974, 2000, doi: 10.1016/s0140-6736(00)02707-0.
[9] H. Asri, H. Mousannif, H. A. Moatassime and T. Noel, “Using machine learning algorithms for breast cancer risk prediction and diagnosis,” Procedia Computer Science, vol. 83, pp. 1064-1069, 2016, doi: 10.1016/j.procs.2016.04.224.
[10] Z. Rustam, V. A. W. Hapsari and M. R. Solihin, “Optimal cervical cancer classification using Gauss-Newton representation based algorithm,” AIP Conference Proceedings, vol. 2168, no. 1, pp. 020045 1-6, 2019, doi: 10.1063/1.5132472.
[11] Z. Rustam and N. P. A. Ariantari, “Support Vector Machines for Classifying Policyholders Satisfactorily in Automobile Insurance,” Journal of Physics: Conference Series, vol. 1028, no. 1, pp. 1-9, 2018, doi: 10.1088/1742-6596/1028/1/012005.
[12] H. Huajuan, W. Xiuxi and Z. Yongquan, “Twin support vector machines: A survey,” Neurocomputing, vol. 300, pp. 34-43, 2018, doi: 10.1016/j.neucom.2018.01.093.
[13] H. J. S. Taylor, J. S. Taylor and N. Cristiani, “Kernel methods for pattern analysis,” Cambridge university press, 2004.
[14] Jayadeva, R. Khemchandani and S. Chandra, “Twin Support Vector Machines for Pattern Classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 905-910, 2007, doi: 10.1109/TPAMI.2007.1068.
[15] M. Tzelepi and A. Tefas, “Improving the performance of lightweight CNNs for binary classification using Quadratic Mutual Information regularization,” Pattern Recognition, vol. 106, no. 107407, 2020, doi: 10.1016/j.patcog.2020.107407.
[16] S. Ding, Y. An, X. Zhang, F. Wu and Y. Xue, “Wavelet twin support vector machines based on glowworm swarm optimization,” Neurocomputing, vol. 225, pp. 157-163, 2017, doi: 10.1016/j.neucom.2016.11.026.
[17] D. Gupta, B. Richhariya and P. Borah, “A fuzzy twin support vector machine based on information entropy for class imbalance learning,” Neural Computing and Applications, vol. 31, no. 11, pp. 7153-7164, 2019, doi: 10.1007/s00521-018-3551-9.
[18] W. Sadewo, Z. Rustam, H. Hamidah and A. R. Chusmarsyah, “Pancreatic Cancer Early Detection Using Twin Support Vector Machine Based on Kernel,” Symmetry, vol. 12, no. 667, pp. 1-8, 2020, doi: 10.3390/sym12040667.
[19] Z. Rustam and A. S. Talita, “Fuzzy kernel K-medoids algorithm for anomaly detection problems,” AIP Conference Proceedings, vol. 1862, no. 030154, pp. 1-8, 2016, doi: 10.1063/1.4991258.
[20] Z. Rustam and R. Faradina, “Face recognition to identify look-alike faces using support vector machine,” Journal of Physics: Conference Series, vol. 1108, no. 012071, pp. 1-7, 2018, doi: 10.1088/1742-6596/1108/1/012071.
[21] C. Bishop, “Pattern recognition and machine learning,” Springer, 2006.
[22] A. Zheng, “Evaluating Machine Learning Models: A Beginner’s Guide to Key Concepts and Pitfalls,” Sebastopol, CA: O’Reilly Media, Inc, 2015.
[23] Arfiani, Z. Rustam, J. Pandelaki and A. Siahan, “Kernel spherical k-means and support vector machine for acute sinusitis classification,” IOP Conference Series: Materials Science and Engineering, vol. 546, no. 052011, pp. 1-10, 2019, doi: 10.1088/1757-899X/546/5/052011.
[24] H. Glanz, L. Calvanho, D. S. Menashe and M. A. Frield, “A parametric model for classifying land cover and evaluating training data based on multi-temporal remote sensing data,” ISPRS journal of photogrammetry and remote sensing, vol. 97, pp. 219-228, 2014, doi: 10.1016/j.isprsjprs.2014.09.004.
[25] S. Saud, B. Jamil, Y. Upadhyay and K. Irshad, “Performance improvement of empirical models for estimation of global solar radiation in India: A k-fold cross-validation approach,” Sustainable Energy Technologies and Assessments, vol. 40, no. 100768, 2020, doi: 10.1016/j.seta.2020.100768.
  • 6. BIOGRAPHIES OF AUTHORS
Zuherman Rustam is an Associate Professor and a lecturer in computational intelligence at the Department of Mathematics, University of Indonesia. He obtained his Master of Science in informatics in 1989 from Paris Diderot University, France, and completed his Ph.D. in computer science in 2006 at the University of Indonesia. Assoc. Prof. Dr. Rustam is a member of IEEE who actively researches machine learning, pattern recognition, neural networks, and artificial intelligence.
Fildzah Zhafarina is a final-year student in the Department of Mathematics, University of Indonesia. Ms. Zhafarina is passionate about researching machine learning, computer vision, neural networks, and deep learning in various fields.
Jane Eva Aurelia was born in Jakarta on 19 June 1998. She is a final-year student in the Department of Mathematics, University of Indonesia. She is currently working on her thesis, which concerns applied mathematics using machine learning. Ms. Jane's research specialties are mostly machine learning, mathematical modeling, and data mining.
Yasirly Amalia is a final-semester student in the Department of Mathematics, University of Indonesia. Ms. Yasirly is passionate about machine learning, mathematical modelling, and data mining.