A study of feature extraction for Arabic calligraphy characters recognition

International Journal of Electrical and Computer Engineering (IJECE)
Vol. 14, No. 1, February 2024, pp. 870~877
ISSN: 2088-8708, DOI: 10.11591/ijece.v14i1.pp870-877
Journal homepage: http://ijece.iaescore.com
Abdelhay Zoizou, Chaimae Errebiai, Arsalane Zarghili, Ilham Chaker
Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technology, University Sidi Mohamed Ben Abdellah,
Route Immouzer, Fez, Morocco
Article Info
Article history:
Received Feb 22, 2023
Revised Jun 24, 2023
Accepted Aug 8, 2023
ABSTRACT
Optical character recognition (OCR) is one of the most widely used pattern
recognition systems. However, research on ancient Arabic writing
recognition has suffered from a lack of interest for decades, despite the
availability of thousands of historical documents. One of the reasons for this
lack of interest is the absence of a standard dataset, which is fundamental for
building and evaluating an OCR system. In 2022, we published a database of
ancient Arabic words as the only public dataset of characters written in
Al-Mojawhar Moroccan calligraphy. Such a database needs to be
studied and evaluated. In this paper, we explore the proposed database and
investigate the recognition of Al-Mojawhar Arabic characters. We study
feature extraction using the most popular descriptors in Arabic
OCR. The studied descriptors are combined with different machine
learning classifiers to build recognition models and verify their performance.
To compare learned and handcrafted features on the proposed
dataset, we propose a deep convolutional neural network for character
recognition. Given the complexity of the character shapes, the results
obtained are very promising, especially with the convolutional neural
network model, which gave the highest accuracy score.
Keywords:
Ancient text recognition
Arabic words recognition
Deep learning
Handcrafted feature extraction
Optical character recognition
This is an open access article under the CC BY-SA license.
Corresponding Author:
Abdelhay Zoizou
Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technology, University Sidi
Mohamed Ben Abdellah
Route Immouzer, Fez, Morocco
Email: Abdelhay.zoizou@usmba.ac.ma
1. INTRODUCTION
As a subfield of pattern recognition, optical character recognition (OCR) has seen significant
development, especially for languages that use Latin characters. As a result, Latin OCR
systems have achieved very high levels of accuracy. Many commercial products that
automatically transform a text image into machine-editable and readable text are available
worldwide. In contrast, handwriting recognition has attracted the interest of researchers only in
recent years, and its results remain modest compared to those of Latin-based OCR systems.
There is also a significant lack of research interest in Arabic OCR for historical
documents. This lack has several causes: on the one hand, the absence of a public database of Arabic
words and characters, and, on the other hand, the diversity of shapes and sizes of each character, where a
single character can have up to five different forms, as shown in Figure 1.
Arabic historical documents represent a great wealth worldwide, mainly
for literature, art, history, and other fields. Therefore, the need for an Arabic historical document processing
system for ancient calligraphic styles creates a fruitful new research dimension for OCR. Indeed, during the
past five years, few works have been conducted to process Arabic historical documents [1]–[5]. The optical
recognition of these documents is very challenging because they are written in a calligraphic style, which is
usually more difficult to recognize than ordinary handwriting. In fact, ancient Arabic characters are very
different from modern handwriting, mainly because of the shape variation of the same character. While a
standard Arabic character changes shape depending on its position within a word, ancient characters may
have different shapes for the same position.
Ancient Arabic is distinguished from other scripts by the connectivity of its characters. Its artistic
aspect makes it widely used in the decoration of old buildings and palaces. Al-Mojawhar, for example, is one
of the most widely used ancient calligraphies. It was used to write public and private letters, as well as
scientific and artistic books, as shown in Figure 2.
Figure 1. Different shapes for the same letter "KAF - كاف"
Figure 2. A sample of an ancient Arabic document
Recently, Zoizou et al. [6] published a database of words, sub-words, and characters extracted from
historical documents written in Al-Mojawhar calligraphy. As it is a new database, it has to be explored and
studied thoroughly through experiments and analysis. This work explores the proposed database. Several
experiments were conducted using this new database to find the most suitable combination of feature
description and classification methods. For this purpose, we used some popular classifiers, namely
multi-layer perceptron (MLP), support vector machines (SVM), k-nearest neighbor (KNN), and random forest
(RF). For each category of feature extraction methods, we chose the ones most used in the optical character
recognition field, namely scale-invariant feature transform (SIFT) and histogram of oriented gradients
(HOG) as distribution-based descriptors, Zernike moments as a moment-based descriptor, and the Gabor filter
as a spatial frequency-based descriptor. We also tested the studied classifiers with raw pixel values as
features. In a final experiment, we built a deep convolutional neural network to classify the character images.
Research on Arabic OCR did not attract much interest until recent years. Only a few works have
been published that contribute to developing printed and handwritten Arabic OCR systems. Nevertheless, the
results are still considered modest due to the complexity of Arabic letter shapes, which require more
sophisticated shape descriptors to produce accurate features.
Elleuch et al. [7] used Gabor filter features as input to an SVM classifier based on radial basis
function (RBF) and polynomial kernels for Arabic handwriting recognition. To test the suggested model, the
handwritten Arabic character database (HACDB), with 66 different classes, is used. The reported
classification success rates are 88.77% for the RBF-based SVM and 70.82% for the polynomial-based
SVM.
Alternatively, Jebril et al. [8] combined an SVM with the HOG descriptor as a feature extractor. Before
extracting features, the input images undergo several consecutive pre-processing operations such as
cleaning, binarization, color normalization, and segmentation of words into small windows. The resulting
feature vectors are fed to an SVM. A recognition success rate of 99% was reported after applying the
proposed model to a private dataset of Jordanian city names.
Hassen and Khemakhem [9] presented a comparative study of some feature extraction methods used
for Arabic handwriting recognition. The selected methods are the Gabor filter, wavelet transform, Fourier
transform, and Hough transform. They are compared in terms of their capability to extract invariant features.
To evaluate the precision of the different methods, a Euclidean minimum distance classifier (EMDC) is used as
the classifier. Each feature extraction method is tested on the IFN/ENIT dataset. The reported results show that
the Hough transform and the Gabor filter are more precise in extracting representative and invariant features.
Elleuch et al. [10] proposed a system for Arabic handwritten character recognition. The suggested
system is a multi-class SVM with an RBF kernel combined with a HOG feature extractor. The use of the Gabor
filter as a handcrafted feature descriptor is also investigated. The proposed HOG-SVM system was evaluated
on the IFN/ENIT database and obtained a recognition accuracy of 98.5%, while the Gabor-SVM system
reached 92.8%.
Hassan et al. [11] proposed a model for Arabic word recognition based on SIFT as the feature extraction
method and SVM as the classifier. The features extracted with SIFT were clustered into groups using k-means.
The proposed system was tested on the Arabic handwritten database (AHDB) and achieved a
recognition success rate of 99.08%.
Althobaiti and Lu [12] used Freeman chain code for Arabic character identification. The process
starts with setting a bounding box as the smallest rectangle containing the character with all its auxiliary
parts, i.e., dots and symbols. The chain code is extracted from the character shape and reduced to a minimal
7-digit form. The chain code is encoded, and more statistical features are added to construct a feature
sequence of 11 digits. A confusion matrix is calculated to check the utility of the proposed method. It is
claimed that an accuracy of 92% to 97% is reached.
Altwaijry and Al-Turaiki [13] proposed a convolutional neural network (CNN) model for Arabic
handwritten character recognition. The proposed model consists of three convolutional layers, three
maxpooling layers, and two fully connected layers, followed by an output layer of 29 units. The rectified
linear unit (ReLU) is chosen as the activation function for the convolutional layers, while SoftMax is used for
the output layer. The model is tested on a private database built by the authors containing 47,000 images, and
on the well-known Arabic handwritten characters database (AHCD) [14]. The CNN model reached an accuracy of 97%
on the AHCD database, while on the private one, it did not exceed 88%.
Wagaa et al. [15] proposed a CNN model for Arabic character recognition. The model consists of
four convolutional layers, two maxpooling layers, and three fully connected layers. Except for the output
layer, which uses SoftMax, every layer uses the ReLU activation function. The authors studied the use of
several optimization and data augmentation algorithms. The proposed model was tested on the AHCD and Hijja
datasets, with reported recognition success rates of 98% and 91%.
Bai et al. [16] introduced the shared-hidden-layer convolutional neural network (SHL-CNN) for
character recognition in different languages. The proposed model architecture consists of two parts. The first
one is shared by all languages and has two convolutional layers, two maxpooling layers, two contrast
normalization layers, and two local convolutional layers. The second part is a non-shared layer, which
consists of a SoftMax fully connected layer. Although the study does not include Arabic letters, it obtained
some promising results.
Shams et al. [17] proposed a hybrid model for handwritten character recognition based on a CNN and
an SVM. The proposed CNN architecture includes three convolutional layers, each followed by a
maxpooling layer. These are followed by one fully connected layer and the output layer, which is passed
through dropout and fed to the SVM classifier. The deep CNN-SVM model was tested on a private database
of 16,800 images and presented a classification error rate of 4.9%, which is very promising.
Younis and Khateeb [18] presented a CNN model for offline Arabic character recognition. It consists of
three convolution layers with one fully connected layer and one output layer. Dropout is used with a
probability of 0.5 to reduce overfitting. The model was evaluated on both the AHCD and AIA9K databases and
obtained 97.6% and 94.8%, respectively.
Alrobah and Albahli [19] presented a novel study using a hybrid model for Arabic handwritten
character recognition. They proposed a recognition model based on a deep convolutional neural network as a
feature extractor. The extracted features are then fed to three machine learning models for classification,
namely SVM, XGBoost, and neural network as SoftMax fully connected layers. They tested multiple
architecture combinations with different parameters on the Hijja dataset. The highest recognition rate was
reported for the CNN-SVM combination, with a success rate of 96.3%.
2. METHOD
In this study, we used different methods of feature extraction and classification to perform
Al-Mojawhar recognition. These two phases are known to be crucial and delicate in the process of building an
optical character recognition system. It is in these phases that the graphical shape of the character is
converted into a numeric representation that can be manipulated and edited digitally. The literature presents
several effective techniques to extract representative features and classify characters. However, in the case of
ancient Arabic characters, the complexity imposed by shape variation and additional signs is much higher than
in modern writing. For feature extraction, we considered the scale-invariant feature transform (SIFT) [20],
histogram of oriented gradients (HOG) [21], Zernike moments [22], the Gabor filter [23], and
contour-based features [24]. We also studied the use of raw pixel data to build feature vectors.
SIFT is a scale-invariant descriptor. It allows the extraction of features regardless of changes in
scale. According to Lowe [20], the principle behind SIFT is to convert the image pixel values into
rotation- and scale-invariant coordinates relative to local features, as shown in Figure 3. In the same
category, the HOG descriptor is known as one of the most powerful descriptors for feature extraction in pattern
recognition. To calculate HOG features, the image is split into many equal, connected zones. Then, for
each zone, the edge gradients and orientations are computed and combined to form a 1-D histogram. The global
feature vector is the concatenation of all of these histograms.
Figure 3. SIFT key point detection
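The HOG computation described above can be illustrated with a minimal pure-Python sketch. It is a simplified version that only builds one orientation histogram per zone and concatenates them; it is not the implementation used in the experiments. The cell size of 10 pixels, the 9 orientation bins, and the function name `hog_features` are assumptions chosen for illustration.

```python
import math

def hog_features(image, cell_size=10, n_bins=9):
    """Simplified HOG: one unsigned-orientation histogram per cell, concatenated.

    `image` is a list of rows of numeric pixel values. Block normalization,
    which full HOG implementations usually add, is omitted here.
    """
    height, width = len(image), len(image[0])
    cells_y, cells_x = height // cell_size, width // cell_size
    histograms = [[0.0] * n_bins for _ in range(cells_y * cells_x)]
    bin_width = 180.0 / n_bins
    for y in range(cells_y * cell_size):
        for x in range(cells_x * cell_size):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = min(int(angle // bin_width), n_bins - 1)
            cell = (y // cell_size) * cells_x + (x // cell_size)
            # Each pixel votes for its orientation bin, weighted by magnitude.
            histograms[cell][bin_index] += magnitude
    # The global feature vector is the concatenation of all cell histograms.
    return [value for hist in histograms for value in hist]

# Toy 50x50 image with a single vertical edge in the middle.
image = [[1 if x >= 25 else 0 for x in range(50)] for _ in range(50)]
features = hog_features(image)
print(len(features))  # 5 x 5 cells x 9 bins
```

With these assumed settings, a 50×50 image yields 25 zones and therefore a 225-dimensional feature vector.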
Zernike moments were introduced in the 1930s by the physicist and Nobel Prize winner Frits
Zernike. Zernike moments are based on orthogonal radial polynomials. They provide a unique description of the
object that does not contain any redundant information. Unlike most of the previous feature extraction
methods, which require high-quality thresholding and pre-processing, the Gabor filter is well known
for its ability to extract representative information from multi-channel images. The feature extraction
performance of the Gabor filter comes from the following property: invariance to rotation, scale, and translation.
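As an illustration of the orthogonal radial polynomials on which Zernike moments are built, the sketch below evaluates the standard radial polynomial R_nm(ρ). It covers only this building block, not the full moment computation over an image, and the function name `zernike_radial` is an assumption made for this example.

```python
from math import factorial

def zernike_radial(n, m, rho):
    """Evaluate the Zernike radial polynomial R_nm at radius rho (0 <= rho <= 1).

    Requires |m| <= n and n - |m| even, as in the standard definition.
    """
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coefficient = ((-1) ** k * factorial(n - k)) / (
            factorial(k)
            * factorial((n + m) // 2 - k)
            * factorial((n - m) // 2 - k)
        )
        total += coefficient * rho ** (n - 2 * k)
    return total

# R_20(rho) = 2*rho^2 - 1 and R_31(rho) = 3*rho^3 - 2*rho
print(zernike_radial(2, 0, 0.5), zernike_radial(3, 1, 0.5))
```

A Zernike moment of order (n, m) is then obtained by projecting the image, mapped onto the unit disk, on R_nm(ρ) multiplied by the angular term exp(jmθ).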
For classification, we used a multi-layer perceptron, support vector machines, KNN,
and random forest. Finally, we used a deep convolutional neural network for its feature extraction and
classification capabilities. One of the primary goals of this study is to compare, in terms of efficiency, the
handcrafted features calculated with different descriptors and the learned features generated with a deep
convolutional model.
Support vector machines are popular supervised methods used for both classification and regression
problems. They are known for their low memory cost because their decision function uses only a subset of the
training points. For our experiments, we used SVMs with RBF and polynomial kernels, both with a gamma
parameter ranging from 0.05 to 0.5.
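To make the role of the gamma parameter concrete, the two kernel functions can be sketched as follows. The paper does not state which library or exact kernel parameterization was used, so the forms below (the common `exp(-gamma * ||x - y||^2)` and `(gamma * <x, y> + coef0) ** degree` definitions) and the `coef0` parameter are assumptions.

```python
import math

def rbf_kernel(x, y, gamma=0.05):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def polynomial_kernel(x, y, gamma=0.05, degree=3, coef0=0.0):
    """Polynomial kernel: (gamma * dot(x, y) + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree

# Two toy feature vectors (hypothetical values).
u, v = [1.0, 2.0], [3.0, 4.0]
print(rbf_kernel(u, v, gamma=0.5), polynomial_kernel(u, v, gamma=0.5))
```

A larger gamma makes the RBF kernel decay faster with distance, so each training point influences a smaller neighborhood.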
Random forest is a supervised learning algorithm that operates by constructing multiple decision
trees. It is one of the most widely used algorithms due to its accuracy, simplicity, and flexibility. It randomly
chooses features, makes observations, builds a forest of decision trees, and averages the results.
The KNN algorithm is a simple, non-parametric, and supervised learning algorithm that uses proximity
to make classifications or predictions. It is generally used as a classification algorithm based on the assumption
that similar points can be found next to each other. Hence, the KNN algorithm predicts the appropriate class for
the test data by computing the distance between the test data and all the training points.
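The distance-based voting described above can be sketched in a few lines. The function name `knn_predict`, the toy 2-D points, and the class labels are hypothetical; real feature vectors would come from one of the descriptors.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Distance from the query to every training point, nearest first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters with hypothetical class names.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["alif", "alif", "alif", "ba", "ba", "ba"]
print(knn_predict(points, labels, (0.5, 0.5)))
```

Because every prediction scans the whole training set, the cost of this naive version grows linearly with the number of training samples.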
CNNs are a particular sub-category of feed-forward networks used mainly for image processing,
either as a shape descriptor or as a complete classification model. “The advantage of CNN is that it
automatically extracts the salient features which are invariant to a certain degree to shift and shape
distortions of the input characters” [25]. A CNN model is generally composed of two main parts: the first
uses convolution and pooling layers to act as a feature extractor, while the second is a neural network
of several fully connected layers that works as a classifier.
To compare ancient Arabic with modern handwriting in terms of processing difficulty, we also
used the AHCD database in our experiments. It consists of 16,800 images of modern Arabic handwritten
characters distributed equally over 28 classes. AHCD is known for its robustness and is widely used for
evaluating Arabic handwriting recognition systems.
3. EXPERIMENTAL RESULTS AND DISCUSSION
To evaluate the different methods, we used character images from the Al-Mojawhar database (MOJ-DB).
The initial data was pre-processed with binarization and denoising. It was then normalized to a size of
50×50 pixels. The initial database has 60 images for each of the 76 character classes.
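The binarization and size normalization steps applied to the character images can be sketched as below. This is only an illustration under stated assumptions: a fixed global threshold of 128, ink encoded as bright pixels, and nearest-neighbor resizing are all choices made for the example, and the denoising step is not shown.

```python
def binarize(image, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 0/1 using a global threshold."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]

def resize_nearest(image, new_height=50, new_width=50):
    """Resize an image to new_height x new_width with nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[y * height // new_height][x * width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]

# Toy 100x100 grayscale image: bright left half, dark right half.
gray = [[200 if x < 50 else 30 for x in range(100)] for _ in range(100)]
normalized = resize_nearest(binarize(gray))
print(len(normalized), len(normalized[0]))
```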
The following experiments were performed in a Python-OpenCV environment installed on a
quad-core Ryzen 7 PC with a clock speed of 2.8 GHz and 16 GB of RAM. We used an augmented database
containing 600 instances per class. Table 1 presents the test accuracies (%) obtained in the conducted
experiments.
3.1. Handcrafted features
We built feature vectors using HOG, SIFT, Zernike moments, the Gabor filter, and raw pixel data
extracted from MOJ-DB. In a first round, we trained a multi-layer perceptron (MLP) on the selected
features. The neural network is composed of four layers, each followed by dropout with a ratio of 0.5.
The model is trained for 10 epochs with a batch size of 50. In a second experiment, we
fed RBF-based and polynomial-based SVMs with the different feature sets. The best results were obtained
with the parameter combination C = 5, gamma = 0.05, degree = 3. After that, several experiments
were conducted using a KNN classifier with different values of k; the best results were
obtained with k = 3. We then trained a random forest to classify the selected features. Different numbers of
estimators were tested, and the best results were obtained with 64.
Table 1. The test accuracy (%) of classification for the different model combinations
Feature set        MLP     SVM-RBF   SVM-Poly   KNN     RF
Raw data           73.2    81.8      83.1       73.5    69.3
HOG                82.2    84.6      88.8       83.1    84.2
SIFT               70.7    68.7      75.8       66.8    77.6
Zernike moment     76.1    79.7      74.6       82.1    81.6
Gabor filter       77.2    76.0      77.8       78.5    86.8
In a final experiment on handcrafted features, we compared the four descriptors in terms of time cost. To
this end, we randomly selected 700 character images from MOJ-DB, resized to 50×50 pixels,
and measured the processing time of feature extraction with each descriptor for the entire set of
images. Table 2 summarizes the obtained results.
Table 2. Time cost for each feature extractor
Feature extractor   Time cost (s)
HOG                 1.88
SIFT                1.36
Zernike moment      2.3
Gabor filter        0.46
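A timing measurement of this kind can be sketched with a small harness that runs an extractor over the whole image set and reports the elapsed wall-clock time. The helper `time_extractor` and the stand-in descriptor `mean_pixel` are hypothetical; in practice the extractor would be HOG, SIFT, Zernike moments, or the Gabor filter.

```python
import time

def time_extractor(extractor, images):
    """Return the wall-clock seconds needed to run `extractor` on every image."""
    start = time.perf_counter()
    for image in images:
        extractor(image)
    return time.perf_counter() - start

def mean_pixel(image):
    # Hypothetical stand-in for a real descriptor.
    pixels = [pixel for row in image for pixel in row]
    return sum(pixels) / len(pixels)

# 700 blank 50x50 images, matching the size of the timing experiment.
images = [[[0] * 50 for _ in range(50)] for _ in range(700)]
elapsed = time_extractor(mean_pixel, images)
print(f"{elapsed:.3f} s for {len(images)} images")
```

Absolute timings depend on the machine and implementation, so only the relative ordering of descriptors measured on the same setup is meaningful.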
When the available amount of data is not sufficient for machine learning training to produce
accurate models, data augmentation is one of the techniques that may solve the problem. It consists of
transforming the available data instances to generate new data. In this work, we augmented the initial data
using distortion and a slight rotation of the initial character images. We generated two other databases. The
first contains 240 instances for each character class, while the second contains 600. To check the effect
of data augmentation on MOJ-DB characters, we trained a polynomial-kernel SVM fed with HOG
features on the three databases, since this model presented the best results in the previous experiments.
The results are summarized in Table 3.
Table 3. The results of classification using SVM-HOG on the augmented databases
Database Test Accuracy
MOJ-DB_60 81.4%
MOJ-DB_240 84.6%
MOJ-DB_600 88.8%
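The rotation part of the augmentation can be illustrated with a minimal sketch that rotates an image around its center using inverse mapping and nearest-neighbor sampling. The angles of ±5 degrees and the function names are assumptions, since the paper only mentions "a slight rotation", and the distortion transform is not shown.

```python
import math

def rotate_image(image, angle_degrees, background=0):
    """Rotate an image (list of rows) around its center, keeping the same size."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_a = math.cos(math.radians(angle_degrees))
    sin_a = math.sin(math.radians(angle_degrees))
    rotated = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: find the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            src_x = int(round(cx + cos_a * dx + sin_a * dy))
            src_y = int(round(cy - sin_a * dx + cos_a * dy))
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = image[src_y][src_x]
    return rotated

def augment(image, angles=(-5, 5)):
    """Generate one slightly rotated copy of `image` per angle."""
    return [rotate_image(image, angle) for angle in angles]

# Toy 50x50 character image: a vertical stroke.
stroke = [[1 if 23 <= x <= 26 else 0 for x in range(50)] for _ in range(50)]
copies = augment(stroke)
print(len(copies))
```

Pixels whose source falls outside the original image are filled with the background value, which is acceptable for small angles on centered characters.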
3.2. Deep learning
Deep learning has shown high accuracy in pattern recognition, especially for Latin character
recognition. In this work, we study the usefulness of deep learning for Al-Mojawhar Arabic character
recognition. We evaluated the performance of several convolutional neural network architectures on the MOJ-DB
and AHCD databases. The best results were obtained by the following model:
input(50 × 50 × 1) → conv(32) → maxpool(2 × 2) → conv(64) → maxpool(2 × 2)
→ conv(128) → maxpool(2 × 2) → flat → FC(128) → FC(128) → output(76).
The model takes 50×50 binary images as input and applies three convolution layers, each followed by a
maxpooling layer. The classifier consists of two fully connected layers and an output layer of 76 neurons.
The model is summarized in Figure 4.
Figure 4. CNN architecture for ancient Arabic character recognition
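To see how the feature maps shrink through this architecture, the sketch below traces the tensor shapes layer by layer. The paper does not state the convolution kernel size, stride, or padding, so the 3×3 kernel, stride 1, and the two padding modes shown here are assumptions, as are the function names.

```python
def conv_output_size(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution with 'same' or 'valid' padding."""
    return size if padding == "same" else size - kernel + 1

def trace_shapes(input_size=50, filters=(32, 64, 128), kernel=3,
                 padding="same", pool=2):
    """List the output shape of each conv/maxpool stage and the flattened size."""
    shapes = []
    size, channels = input_size, 1
    for n_filters in filters:
        size = conv_output_size(size, kernel, padding)
        channels = n_filters
        shapes.append(("conv", size, size, channels))
        size //= pool  # 2x2 max pooling, rounding down
        shapes.append(("maxpool", size, size, channels))
    shapes.append(("flatten", size * size * channels))
    return shapes

for shape in trace_shapes():
    print(shape)
```

Under the 'same' padding assumption the flattened vector feeding the first fully connected layer has 6 × 6 × 128 = 4608 values; with 'valid' padding it would have 4 × 4 × 128 = 2048.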
The optimization function significantly influences model learning. For this model, we therefore also
examined the effect of the Adam and root mean squared propagation (RMSProp) optimizers on
the recognition accuracy. After 30 epochs of training with a learning rate of 0.0001, the model reached the
results shown in Table 4.
Table 4. The results of classification using the CNN model
Optimizer   Test accuracy on MOJ-DB_600   Test accuracy on AHCD_600
Adam        95.6%                         98.5%
RMSProp     95.2%                         97.9%
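The difference between the two optimizers can be illustrated with their update rules for a single scalar parameter. Only the learning rate of 0.0001 comes from the experiments; the decay rates (`rho`, `beta1`, `beta2`) and `eps` are commonly used default values and are assumptions here, as are the function names.

```python
import math

def rmsprop_step(param, grad, state, lr=1e-4, rho=0.9, eps=1e-8):
    """One RMSProp update: scale the step by a running average of squared gradients."""
    state["v"] = rho * state.get("v", 0.0) + (1 - rho) * grad ** 2
    return param - lr * grad / (math.sqrt(state["v"]) + eps)

def adam_step(param, grad, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum plus squared-gradient scaling, with bias correction."""
    state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * grad
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

# Minimize f(w) = w^2 (gradient 2w) from w = 1.0 for a few steps.
w_adam, w_rms = 1.0, 1.0
adam_state, rms_state = {}, {}
for _ in range(100):
    w_adam = adam_step(w_adam, 2 * w_adam, adam_state)
    w_rms = rmsprop_step(w_rms, 2 * w_rms, rms_state)
print(w_adam, w_rms)
```

Both rules adapt the step size per parameter; Adam additionally keeps a running mean of the gradient itself and corrects the bias of both running averages during the first steps.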
3.3. Discussion
Most of the previous experiments performed on MOJ-DB show that the Al-Mojawhar style presents
more complexity than modern handwritten characters. The classification results indicate that ancient Arabic
needs more effort and interest from researchers. The experiments show that the ability of the multi-layer
perceptron to classify ancient Arabic character images is limited compared to the other classifiers. Also,
HOG and the Gabor filter outperformed the other feature extraction methods. Moreover, the features
extracted with these two descriptors maintained high representation quality with all the classifiers used in
the experiments. In terms of computation speed, Gabor is the fastest descriptor (0.46 s), while SIFT and HOG
take broadly similar processing times, with an advantage for SIFT (1.36 s versus 1.88 s). The experiments
confirmed that invariant moments are the slowest descriptors for feature extraction (2.3 s for Zernike
moments). For the classifiers, the polynomial-based SVM, followed by random forest, presented the best
classification rates for almost all the feature sets, outperforming the multi-layer perceptron.
Overall, among all the combinations, the one built with SVM-Poly and HOG gave the best classification
accuracy, which is 88.8%.
Deep learning has become widely used for character recognition as new architectures and
optimization methods are developed. For the Al-Mojawhar style, the proposed model presented the best
results, outperforming SVM, KNN, MLP, and random forest. The obtained results (95.6% using Adam and
95.2% using RMSProp) are explained by the ability of the convolutional network to extract pertinent
features. These results indicate that learned features are more accurate and representative than
handcrafted features for ancient Arabic handwriting. As data augmentation helps generate new instances for
each of the initial data classes, it remains the easiest way for machine learning engineers to feed models
with sufficient data. Augmenting the data 10 times increased the recognition rate of the SVM-HOG model by
about 7 percentage points (from 81.4% to 88.8%), which is very significant in the case of character recognition.
4. CONCLUSION
In this work, we studied the MOJ-DB database through several experiments to investigate suitable
features and classifiers for the recognition of Al-Mojawhar calligraphy. This study used some of
the most popular classifiers, including a multi-layer perceptron, support vector machines, random forest, KNN,
and a convolutional neural network. We compared the most used feature descriptors in optical character
recognition, including HOG, SIFT, Zernike moments, and the Gabor filter. We built different combinations of
classifiers and descriptors. SVM achieved high accuracy when associated with a polynomial kernel and fed
with HOG feature vectors (88.8%). Using data augmentation, we could enlarge our database and improve the
classification results of the SVM-HOG model from 81.4% to 88.8%. We then built a CNN model capable of
recognizing the Al-Mojawhar writing style with a success rate of 95.6%. Although we obtained good results,
they are still not sufficient. The comparison of the CNN recognition results on MOJ-DB and AHCD
showed that ancient Arabic writing presents more processing difficulty and complexity. Therefore, more
effort is needed to develop robust systems for ancient Arabic writing recognition in order to preserve the
content and value of historical documents. In future work, we aim to evaluate MOJ-DB for word and
sub-word recognition, especially with feature extraction methods that present high precision for scripts of
other languages. We also need to develop more robust systems for ancient Arabic writing recognition by
introducing the latest techniques in shape description and machine learning.
REFERENCES
[1] A. Zoizou, A. Zarghili, and I. Chaker, “Skew correction and text line extraction of Arabic historical documents,”
Communications in Computer and Information Science, vol. 1108, pp. 181–193, 2019, doi: 10.1007/978-3-030-32959-4_13.
[2] M. Kassis, A. Abdalhaleem, A. Droby, R. Alaasam, and J. El-Sana, “VML-HD: The historical Arabic documents dataset for
recognition systems,” 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017, pp. 11–14, 2017,
doi: 10.1109/ASAR.2017.8067751.
[3] B. K. Barakat and J. El-Sana, “Binarization free layout analysis for Arabic historical documents using fully convolutional
networks,” 2nd IEEE International Workshop on Arabic and Derived Script Analysis and Recognition, ASAR 2018, pp. 151–155,
2018, doi: 10.1109/ASAR.2018.8480333.
[4] M. Ibn Khedher, H. Jmila, and M. A. El-Yacoubi, “Automatic processing of historical Arabic documents: A comprehensive
survey,” Pattern Recognition, vol. 100, p. 107144, 2020, doi: 10.1016/j.patcog.2019.107144.
[5] B. Alrehali, N. Alsaedi, H. Alahmadi, and N. Abid, “Historical Arabic manuscripts text recognition using convolutional neural
network,” Proceedings - 2020 6th Conference on Data Science and Machine Learning Applications, CDMA 2020, pp. 37–42,
2020, doi: 10.1109/CDMA47397.2020.00012.
[6] A. Zoizou, A. Zarghili, and I. Chaker, “MOJ-DB: A new database of Arabic historical handwriting and a novel approach for
subwords extraction,” Pattern Recognition Letters, vol. 159, pp. 54–60, 2022, doi: 10.1016/j.patrec.2022.04.040.
[7] M. Elleuch, H. Lahiani, and M. Kherallah, “Recognizing Arabic handwritten script using support vector machine classifier,”
International Conference on Intelligent Systems Design and Applications, ISDA, vol. 2016-June, pp. 551–556, 2016, doi:
10.1109/ISDA.2015.7489176.
[8] N. A. Jebril, H. R. Al-Zoubi, and Q. Abu Al-Haija, “Recognition of handwritten Arabic characters using histograms of
oriented gradient (HOG),” Pattern Recognition and Image Analysis, vol. 28, no. 2, pp. 321–345, 2018, doi:
10.1134/S1054661818020141.
[9] H. Hassen and M. Khemakhem, “A comparative study of Arabic handwritten characters invariant feature,” International Journal
of Advanced Computer Science and Applications, vol. 2, no. 12, pp. 62–68, 2011, doi: 10.14569/ijacsa.2011.021209.
[10] M. Elleuch, A. Hani, and M. Kherallah, “Arabic handwritten script recognition system based on HOG and Gabor features,”
International Arab Journal of Information Technology, vol. 14, no. 4A Special Issue, pp. 639–646, 2017.
[11] A. K. A. Hassan, B. S. Mahdi, and A. A. Mohammed, “Arabic handwriting word recognition based on scale invariant feature
transform and support vector machine,” Iraqi Journal of Science, vol. 60, no. 2, pp. 381–387, 2019, doi:
10.24996/ijs.2019.60.2.18.
[12] H. Althobaiti and C. Lu, “A survey on Arabic optical character recognition and an isolated handwritten Arabic character
recognition algorithm using encoded freeman chain code,” 2017 51st Annual Conference on Information Sciences and Systems,
CISS 2017, pp. 2–7, 2017, doi: 10.1109/CISS.2017.7926062.
[13] N. Altwaijry and I. Al-Turaiki, “Arabic handwriting recognition system using convolutional neural network,” Neural Computing
and Applications, vol. 33, no. 7, pp. 2249–2261, 2021, doi: 10.1007/s00521-020-05070-8.
[14] A. El Sawy, H. El-Bakry, and M. Loey, “CNN for handwritten Arabic digits recognition based on LeNet-5,” Advances in
Intelligent Systems and Computing, vol. 533, pp. 565–575, 2017, doi: 10.1007/978-3-319-48308-5_54.
[15] N. Wagaa, H. Kallel, and N. Mellouli, “Improved Arabic alphabet characters classification using convolutional neural networks
(CNN),” Computational Intelligence and Neuroscience, vol. 2022, p. 16, 2022, doi: 10.1155/2022/9965426.
[16] J. Bai, Z. Chen, B. Feng, and B. Xu, “Image character recognition using deep convolutional neural network learned from different
languages,” 2014 IEEE International Conference on Image Processing, ICIP 2014, pp. 2560–2564, 2014, doi:
10.1109/ICIP.2014.7025518.
[17] M. Shams, A. A. Elsonbaty, and W. Z. El Sawy, “Arabic handwritten character recognition based on convolution neural networks
and support vector machine,” International Journal of Advanced Computer Science and Applications, vol. 11, no. 8, pp. 144–149,
2020, doi: 10.14569/IJACSA.2020.0110819.
[18] K. Younis and A. Khateeb, “Arabic hand-written character recognition based on deep convolutional neural networks,” Jordanian
Journal of Computers and Information Technology, vol. 3, no. 3, p. 186, 2017, doi: 10.5455/jjcit.71-1498142206.
[19] N. Alrobah and S. Albahli, “A hybrid deep model for recognizing Arabic handwritten characters,” IEEE Access, vol. 9,
pp. 87058–87069, 2021, doi: 10.1109/ACCESS.2021.3087647.
[20] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60,
no. 2, pp. 91–110, 2004, doi: 10.1023/B:VISI.0000029664.99615.94.

[21] J. Gall, P. Gehler, and B. Leibe, “Preface,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), 2015, vol. 9358, pp. v–vi. doi: 10.1007/978-3-319-24947-6.
[22] A. Khotanzad and Y. H. Hong, “Invariant image recognition by Zernike moments,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 12, no. 5, pp. 489–497, 1990, doi: 10.1109/34.55109.
[23] X. Wang, X. Ding, and C. Liu, “Gabor filters-based feature extraction for character recognition,” Pattern Recognition, vol. 38,
no. 3, pp. 369–379, 2005, doi: 10.1016/j.patcog.2004.08.004.
[24] S. Deshmukh and L. Ragha, “Analysis of directional features - Stroke and contour for handwritten character recognition,” 2009
IEEE International Advance Computing Conference, IACC 2009, no. March, pp. 1114–1118, 2009, doi:
10.1109/IADCC.2009.4809170.
[25] H. C. Shin et al., “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics
and transfer learning,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016, doi:
10.1109/TMI.2016.2528162.
BIOGRAPHIES OF AUTHORS
Abdelhay Zoizou received in 2017 a master’s degree in intelligent systems and
networks from Sidi Mohamed Ben Abdellah University, Morocco. Currently, he serves as
head of the IT subdepartment at the Ministry of Justice of Morocco. He is also a Ph.D. student
in the Laboratory of Intelligent Systems and Applications at the Faculty of Sciences and
Technologies of Fez. His research is concerned with the applications of machine learning and
artificial intelligence for the optical segmentation and recognition of Arabic handwriting and
historical manuscripts. He can be contacted at abdelhay.zoizou@usmba.ac.ma.
Chaimae Errebiai is a laureate of the master’s degree in intelligent systems and
networks from the Faculty of Sciences and Technologies of Fez in 2022 (USMBA, Morocco),
and is currently a student in the second year of a master’s degree in Computer Science,
Artificial Intelligence Foundations and Applications-Data and Knowledge at the University of
Paul Sabatier in Toulouse, France. Her main research interests include artificial intelligence,
image processing, knowledge management, and graph databases. She can be contacted at
chaimae.errebiai@univ-tlse3.fr.
Arsalane Zarghili is a doctor of science from Sidi Mohamed Ben Abdellah
University (Fez-Morocco). He received his PhD in 2001 in information processing and joined
the same University in 2002 as assistant professor at the Computer Science Department of the
Faculty of Science and Technology of Fez (FST). In 2007, he was head of the Computer
Science Department and chair of the Master of Software Quality at FST-Fez. In 2008, he was
promoted to associate professor and to full professor in 2015. In 2011, he was the co-founder
and the head of the Laboratory of Intelligent Systems and Applications. He is a member of the
steering committee of the Department of Computer Sciences and was a member of the faculty
board. In 2011, he was the chair of Master Intelligent Systems and Networks. In 2020, he was
an elected member of the University Board and a member of the Scientific Research
Committee and the Cooperation and Academic Affairs Committee. He has also been an IEEE
member since 2011. His main research is about pattern recognition, image indexing, and
retrieval systems in cultural heritage, biometrics, and healthcare. He also works on Arabic
Natural Language Processing. He can be contacted at arsalane.zarghili@usmba.ac.ma.
Ilham Chaker is a professor of computer science at the Faculty of Sciences and
Technologies, University of Sidi Mohamed Ben Abdellah (USMBA), Fez. She received a
Ph.D. degree in computer science from USMBA, Fez, in 2011. She is a member of the
Laboratory Intelligent Systems and Applications. Her main research interests include artificial
intelligence, machine learning, optical character recognition, knowledge management, and
natural language processing. She has served on program committees of conferences such as
AI2SD, ISCV, and ICDS. She can be contacted at Ilham.chaker@usmba.ac.ma.
