IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 13, No. 1, March 2024, pp. 500~508
ISSN: 2252-8938, DOI: 10.11591/ijai.v13.i1.pp500-508
Journal homepage: http://ijai.iaescore.com
Word embedding for detecting cyberbullying based on
recurrent neural networks
Noor Haydar Shaker, Ban N. Dhannoon
Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq
Article Info ABSTRACT
Article history:
Received Jan 27, 2023
Revised Mar 19, 2023
Accepted Mar 27, 2023
Cyberbullying has spread widely and has become one of the biggest problems facing users of social
media sites, with significant adverse effects on society and on victims in particular. Finding appropriate
solutions to detect and reduce cyberbullying has therefore become necessary to mitigate these negative
impacts. Twitter comments from two datasets are used to detect cyberbullying: the first is an Arabic
cyberbullying dataset and the second an English cyberbullying dataset. Three different pre-trained
global vectors (GloVe) corpora with different dimensions were used on the original and preprocessed
datasets to represent the words. Recurrent neural network (RNN), long short-term memory (LSTM),
bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU (BiGRU) classifiers
were utilized, evaluated, and compared. The GRU outperforms the other classifiers on both datasets;
its accuracy on the Arabic cyberbullying dataset using the 256D Arabic GloVe corpus is 87.83%, while
its accuracy on the English dataset using the 100D pre-trained GloVe corpus is 93.38%.
Keywords:
Deep learning classifiers
Gated recurrent unit
GloVe word embedding
Long short-term memory
Recurrent neural networks
This is an open access article under the CC BY-SA license.
Corresponding Author:
Noor Haydar Shaker
Department of Computer Science, College of Science, Al-Nahrain University
Baghdad, Iraq
Email: noor.haidar21@ced.nahrainuniv.edu.iq
1. INTRODUCTION
The development of digital technologies and the growing number of social media users, including users
who try to harm others, have led to the spread of cyberbullying. Cyberbullying is a type of bullying in which
one or more persons (the bully) purposefully and repeatedly harm another person (the victim) through
digital technologies. Cyberbullies use mobile phones, computers, or other electronic devices to send emails
and instant text messages, post comments on social media or in chat rooms, or otherwise harass their
victims [1], [2]. Cyberbullying may have serious and long-term consequences for its victims, including physical,
mental, and emotional effects that leave them feeling scared, furious, humiliated, or exhausted, or with
symptoms such as headaches or stomach pains. Victims of cyberbullying may begin to feel ashamed, nervous,
anxious, and insecure about what people say or think about them. This can lead to withdrawal from friends
and family, and in extreme cases to the victim's suicide [3], [4]. It has therefore become necessary to find
solutions that detect cyberbullying messages. Many attempts have been made in the field of artificial
intelligence to detect cyberbullying using machine learning and deep learning techniques, and efforts continue
toward better results and appropriate solutions that reduce its negative effects on society, especially on
teenagers, who are more exposed to cyberbullying than other groups.
In this research, we used deep learning classifiers with two labelled datasets (Arabic and English) to
detect cyberbullying. For word representation, we used pre-trained global vectors (pre-trained GloVe) to
obtain vector representations of the words. Since most electronic devices, including computers, only process
numerical values, converting words into vectors is an essential step; each vector contains a set of numbers
that represents a word and makes it easy to handle computationally. The five deep learning classifiers used
in this research to detect cyberbullying are: standard recurrent neural networks (RNN), long short-term
memory (LSTM), bidirectional long short-term memory (BiLSTM), gated recurrent unit (GRU), and
bidirectional gated recurrent unit (BiGRU). All are based on a chain of repeating neural network modules
with internal memory, which makes them well suited to sequential data. This research is organized as follows:
Section 2 presents related works, including details of the datasets, word representations, classifiers, and their
corresponding results. Section 3 explains the basic concepts used in the practical part of this research.
Section 4 describes the methodology followed to achieve the results. Section 5 presents the experimental
results and discussion.
2. RELATED WORK
Since cyberbullying is one of the major problems we are facing, many researchers have contributed
to developing models based on machine learning and deep learning to detect this type of bullying. A review
of previous work shows that relatively little research has addressed Arabic cyberbullying in particular. This
can be attributed to several challenges related to the Arabic language itself, such as: 1) the lack of large
datasets for building prediction models, 2) the use of colloquial language, and 3) limited library support for
Arabic [5].
Tyagi et al. [6] employed a convolutional neural network (CNN) combined with LSTM as a deep learning
module (CNN-LSTM) on 1.6 million English tweets, categorized into two classes (negative and positive).
The accuracy was 81.20% for the CNN-LSTM module with a 300D GloVe word embedding model.
Al-Bayati et al. [7] used the large scale Arabic book reviews (LABR) dataset, which contains 16,448 rows
with positive labels (1) and negative labels (0). The dataset was preprocessed by removing any non-Arabic
words, followed by normalization, stemming, stopword removal, and other steps. The dataset was split into
67% for training, 17% for testing, and 16% for validation, and was trained and tested with an LSTM deep
learning classifier and a pre-trained embedding layer for word representation. The accuracy was 82% with
the LSTM classifier, a batch size of 256, and 10 epochs, which was the best result in that study.
The work in [8] used an English cyberbullying dataset from Kaggle, collected from social media
sites like Twitter, Instagram, and Facebook. The dataset includes 100,000 comments and was preprocessed
with several steps such as text cleaning, tokenization, stemming, lemmatization, and stopword removal. The
research used LSTM, BiLSTM, GRU, and RNN as deep learning classifiers. The accuracy was 80.86% with
LSTM, 82.18% with BiLSTM, 81.46% with GRU, and 81.01% with RNN; the highest accuracy, 82.18%, was
achieved with the BiLSTM.
Janardhana et al. [9] used the movie review (MR) dataset, which includes 12,500 positive and 12,500
negative reviews. The dataset was preprocessed with several steps such as eliminating stopwords and
removing punctuation. The paper used a GloVe word embedding of dimension 200 with three deep learning
classifiers: LSTM, CNN, and CRNN (a generalized CNN combined with a BiLSTM). The accuracy was
79.47% with LSTM, 72.32% with CNN, and 84% with CRNN; the best accuracy, 84%, was achieved with
the CRNN classifier. In [10], an LSTM deep neural network was used with a sentiment-specific word
embedding (SSWE) layer for word representation. The dataset was compiled from three sources, Twitter,
Formspring, and Wikipedia, with each platform contributing 3,000 examples for a total of 9,000. The dataset
was preprocessed with several steps, such as removing numbers, punctuation marks, symbols, and blank
spaces. The accuracy per platform was 79.1% for Twitter, 72% for Formspring, and 75.5% for Wikipedia,
and 77.9% over all examples with the LSTM classifier. The proposed module has some limitations, such as
the small dataset size and the single deep learning classifier tried. Venkatesh et al. [11] applied an English
Twitter dataset including 10,007 tweet comments, preprocessed with several steps such as converting all
characters to lowercase and removing links, punctuation, and whitespace. The authors tried both deep
learning and machine learning models; the best accuracy achieved was 85% with CNN-LSTM and GloVe
word embedding. Almutiry et al. [12] utilized an Arabic Twitter dataset of 17,748 tweet comments, including
14,178 cyberbullying tweets and 3,570 non-cyberbullying tweets. This dataset achieved 84.03% with a
support vector machine (SVM) classifier and term frequency-inverse document frequency (TF-IDF) as the
feature extraction method for word representation.
Table 1 summarizes the datasets, feature extraction/word embedding methods, and classifiers used in the
related work, together with their highest accuracies.
Table 1. The highest related work accuracy for each classifier on the used dataset

| Research | Dataset | Feature extraction/Word embedding | Classifier | Accuracy |
|---|---|---|---|---|
| [6] | 1.6 million English tweets | GloVe | CNN-LSTM | 81.20% |
| [7] | 16,448 rows from LABR | Pre-trained embedding layer | LSTM | 82% |
| [8] | 100,000 English comments cyberbullying dataset from Kaggle | Embedding layer | LSTM | 80.86% |
| | | | BiLSTM | 82.18% |
| | | | GRU | 81.46% |
| | | | RNN | 81.01% |
| [9] | 25,000 reviews from Movie Review | GloVe | LSTM | 79.47% |
| | | | CNN | 72.32% |
| | | | CRNN | 84% |
| [10] | 9,000 examples compiled from Twitter, Formspring, and Wikipedia | SSWE layer | LSTM | 77.9% |
| [11] | 10,007 comments on English tweets | GloVe | CNN-LSTM | 85% |
| [12] | 17,748 Arabic comment tweets | TF-IDF | SVM | 84.03% |
3. PRELIMINARIES
Global vectors (GloVe) is an algorithm trained on a huge number of words in an unsupervised
manner to obtain an embedding matrix for those words, capturing how close words are to each other and
placing related words near one another in the vector space. GloVe relies on co-occurrence statistics and
ratios of co-occurrence probabilities to generate the embedding matrix. Because the computer processes only
numerical data, words must be converted into numerical values before they can be handled. GloVe therefore
represents words through an embedding matrix: each word corresponds to a vector of numerical values, and
these embedding vectors are then used as the input layer for the deep learning classifiers [13], [14].
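More precisely, GloVe fits word vectors to the logarithm of the word co-occurrence counts. The weighted
least-squares objective below is the standard GloVe formulation; the notation is not reproduced from the paper:

$$
J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2
$$

where $X_{ij}$ is the number of times word $j$ occurs in the context of word $i$, $w_i$ and $\tilde{w}_j$ are
the word and context vectors, $b_i$ and $\tilde{b}_j$ are bias terms, $V$ is the vocabulary size, and $f$ is a
weighting function that down-weights very rare and very frequent co-occurrences.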
A recurrent neural network (RNN) is a type of deep learning classifier that keeps the output of a layer and
feeds it back to the input to predict the layer's next output, but it suffers from the vanishing and exploding
gradient problems. The RNN has been developed into different variants to achieve better results and to
alleviate the problems the basic RNN classifier suffers from [15].
Long short-term memory (LSTM) is a development of the RNN that mitigates the vanishing and exploding
gradient problems, especially for long text sequences. The LSTM contains a memory that retains the most
important information and discards less important information through four components: the forget gate,
input gate, cell state, and output gate. Figure 1 shows the LSTM structure [4], [16].
Figure 1. The LSTM structure [16]
In Figure 1, A denotes the LSTM cell, the inputs to the cells are Xt-1, Xt, and Xt+1, and the outputs are
ht-1, ht, and ht+1. The two connections passed from each cell to the next carry the cell state and the hidden
state, and σ is the sigmoid activation function [17], [18]. Bidirectional long short-term memory (BiLSTM) is
also a type of recurrent neural network (RNN). This sequence processing model consists of two LSTMs: the
first processes the input in the forward direction and the other in the backward direction. The BiLSTM
effectively increases the information available to the network and improves the context available to the algorithm.
Figure 2 shows how the BiLSTM works: the inputs to the cells are Xt-1, Xt, and Xt+1, the outputs are
yt-1, yt, and yt+1, and σ is the sigmoid nonlinear activation function [19], [20].
Figure 2. The BiLSTM structure [20]
Gated recurrent unit (GRU) is a type of recurrent neural network (RNN) that addresses the vanishing and
exploding gradient problems of the basic RNN. The GRU is similar to the LSTM classifier but has fewer
parameters, which generally makes it faster and easier to train [21]–[23]. Figure 3 shows the structure of the
GRU classifier [24].
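The GRU achieves this with two gates, an update gate z_t and a reset gate r_t. A standard formulation is
shown below (conventions for the update gate vary across references; this notation is not taken from the paper):

$$
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r), \\
\tilde{h}_t &= \tanh\!\left(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\right), \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t.
\end{aligned}
$$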
A typical RNN learns sequential information in one direction only, i.e., the output at time step t depends only
on the previous time steps, so potentially useful information from later steps is lost. The BiGRU was therefore
proposed: a second GRU layer is added to process the data backward, so that the output yt at time t is based
on both the information of the previous time steps (Ht−1) and the information of the following time steps
(Ht+1) [25].
Figure 3. The structure of GRU deep learning classifier [24]
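As a concrete illustration of the bidirectional idea, the short sketch below shows how a GRU layer is wrapped
to process a sequence in both directions. Keras is assumed here (the paper does not name its framework), and
the unit count of 64 is an illustrative assumption.

```python
# Minimal sketch: a unidirectional GRU layer versus its bidirectional counterpart.
from tensorflow.keras import layers

gru = layers.GRU(64)                          # forward pass only
bigru = layers.Bidirectional(layers.GRU(64))  # forward and backward passes, outputs concatenated
```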
4. METHODOLOGY
In this research, two cyberbullying datasets are used, the first in Arabic and the second in English, each of
which was processed with several operations in the preprocessing step. Three pre-trained corpora with
different dimensions were then used to represent the words numerically, since the computer processes only
numerical values. In the classification step, several deep learning classifiers were applied to classify and
detect cyberbullying. Figure 4 shows the overall methodology, and each step is explained in detail below.
Figure 4. The methodology of deep learning used to detect cyberbullying in this research
4.1. The input dataset
Finding a dataset appropriate to the research topic is necessary for every practical research project, followed
by studying its size, labels, and other details. In this research, two Kaggle datasets of tweets related to
cyberbullying were used. Tweets are posts or messages that individuals publish on the Twitter platform to
exchange information with each other all over the world [26]. The first dataset is the Arabic cyberbullying
dataset, containing 17,748 Arabic tweets, of which 14,178 are cyberbullying tweets and 3,570 are
non-cyberbullying tweets [12]. The second dataset is the English cyberbullying dataset, containing 47k
English tweets, of which 7,631 are non-cyberbullying and the rest are cyberbullying tweets with harassing
comments related to religion, age, and other categories [27]. The two datasets are used in this research to
detect and classify cyberbullying comments on Twitter, with the goal of reducing and preventing this phenomenon.
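As an illustration only, the two datasets can be loaded and their label balance inspected as sketched below.
The file and column names are assumptions, since the paper does not list the exact Kaggle file names.

```python
# Hypothetical file and column names for the two Kaggle datasets described above.
import pandas as pd

arabic_df = pd.read_csv("arabic_cyberbullying_tweets.csv")    # ~17,748 Arabic tweets
english_df = pd.read_csv("english_cyberbullying_tweets.csv")  # ~47k English tweets

print(arabic_df["label"].value_counts())    # expected: 14,178 cyberbullying vs 3,570 non-cyberbullying
print(english_df["label"].value_counts())   # expected: 7,631 non-cyberbullying, rest cyberbullying
```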
4.2. The preprocessing dataset
After selecting and studying the input dataset, it must be processed in the preprocessing stage to achieve
better results, since the raw data may contain noise. Preprocessing reduces the number of words and
sentences by eliminating unnecessary words from tweets and by mapping words with the same or similar
meanings closer together, among other techniques. The preprocessing in this research is divided into two
groups of operations: one for the Arabic cyberbullying dataset and one for the English cyberbullying dataset.
The Arabic cyberbullying dataset is preprocessed in two main steps, normalization and stemming.
Normalization includes several operations such as tokenization and the removal of Arabic stopwords, extra
spaces, numbers, and repeated characters. Stemming includes light stemming, root stemming, and
lemmatization. The English cyberbullying dataset is preprocessed using normalization (tokenization and the
removal of English stopwords, extra spaces, punctuation, numbers, repeated characters, and others). After
preprocessing, each dataset is split into training and testing data at a ratio of 8:2. The training and testing
data are then used to detect and classify whether a Twitter comment is cyberbullying or not.
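A minimal sketch of the English normalization pipeline and the 8:2 split is shown below. The exact libraries
and cleaning rules used by the authors are not specified, so NLTK and scikit-learn are assumptions here, and
`english_df` with its `tweet_text` and `label` columns refers to the hypothetical DataFrame from the earlier sketch.

```python
import re
import nltk
from nltk.corpus import stopwords
from sklearn.model_selection import train_test_split

nltk.download("punkt")
nltk.download("stopwords")
stop_words = set(stopwords.words("english"))

def normalize(tweet: str) -> str:
    tweet = tweet.lower()
    tweet = re.sub(r"[^a-z\s]", " ", tweet)        # drop punctuation and numbers
    tweet = re.sub(r"(.)\1{2,}", r"\1\1", tweet)   # collapse characters repeated 3+ times
    tweet = re.sub(r"\s+", " ", tweet).strip()     # remove extra spaces
    tokens = [t for t in nltk.word_tokenize(tweet) if t not in stop_words]
    return " ".join(tokens)

english_df["clean_text"] = english_df["tweet_text"].apply(normalize)

# 8:2 training/testing split, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    english_df["clean_text"], english_df["label"], test_size=0.2, random_state=42
)
```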
4.3. GloVe word embedding
The computer processes only numerical data. For this reason, each word must be represented as a vector of
numbers so that it can be handled easily. In this research, we used pre-trained GloVe word embeddings to
represent the words of the tweet comments in the Arabic and English cyberbullying datasets. The publicly
available GloVe vectors provide pre-defined dense vectors trained on a corpus of around 6 billion tokens of
English text, covering common words along with many general-use characters such as commas, braces, and
semicolons. Four varieties of GloVe are available: 50D, 100D, 200D, and 300D, where D stands for
dimension; 100D means that each word has an equivalent vector of size 100. GloVe files are simple text files
in the form of a dictionary, where words are keys and dense vectors are the values.
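The sketch below shows how such a GloVe file can be read into a word-to-vector dictionary and turned into
an embedding matrix aligned with a tokenizer vocabulary. The file name and the use of the Keras tokenizer
are assumptions for illustration, and `X_train` comes from the preprocessing sketch above.

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

# Read the GloVe text file: each line is a word followed by its dense vector.
embeddings_index = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:   # hypothetical 100D file name
    for line in f:
        parts = line.rstrip().split(" ")
        embeddings_index[parts[0]] = np.asarray(parts[1:], dtype="float32")

# Build an embedding matrix whose row i holds the GloVe vector of tokenizer word index i.
tokenizer = Tokenizer()
tokenizer.fit_on_texts(X_train)
vocab_size = len(tokenizer.word_index) + 1
embedding_dim = 100

embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in tokenizer.word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector       # out-of-vocabulary words stay as zero vectors
```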
Three pre-trained GloVe corpora from Kaggle are utilized. The first is an Arabic corpus containing
1,538,616 Arabic words with 256D vectors. The second is an English corpus containing over a million
English words with 100D vectors. The third is a multilingual corpus, including Arabic and English among
other languages, containing 1,193,514 words with 50D, 100D, and 200D vectors.
4.4. The classifiers
A classifier is an algorithm trained on a dataset, and its quality depends on finding the weights that maximize
accuracy on the test data. Five deep learning classifiers are used to classify and detect cyberbullying in the
two datasets (Arabic and English) with pre-trained GloVe: standard recurrent neural networks (RNN), long
short-term memory (LSTM) networks, bidirectional LSTM (BiLSTM), gated recurrent units (GRU), and
bidirectional GRU (BiGRU) networks, in different experiments.
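A minimal sketch of one such classifier is given below, using the GRU as an example and continuing the
earlier sketches (`tokenizer`, `vocab_size`, `embedding_dim`, `embedding_matrix`, and the train/test splits).
The layer sizes and maximum sequence length are assumptions; the batch size of 256 and 10 epochs follow
the experimental settings reported in section 5.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GRU, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 50   # assumed maximum tweet length in tokens
X_train_seq = pad_sequences(tokenizer.texts_to_sequences(X_train), maxlen=max_len)
X_test_seq = pad_sequences(tokenizer.texts_to_sequences(X_test), maxlen=max_len)

model = Sequential([
    Embedding(vocab_size, embedding_dim, weights=[embedding_matrix],
              input_length=max_len, trainable=False),   # frozen pre-trained GloVe vectors
    GRU(64),                          # swap for LSTM(64), Bidirectional(GRU(64)), SimpleRNN(64), etc.
    Dense(1, activation="sigmoid"),   # binary output: cyberbullying or not
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train_seq, y_train, batch_size=256, epochs=10,
          validation_data=(X_test_seq, y_test))
```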
5. EXPERIMENT RESULTS AND DISCUSSION
In this section, three experiments with deep learning classifiers and pre-trained GloVe corpora are carried
out to classify and detect cyberbullying. Each experiment produced a set of results, obtained by implementing
the models in the Python language. Each experiment is explained separately in sections 5.1 to 5.3.
5.1. The first experiment
The first experiment applies the Arabic pre-trained GloVe corpus of 256D to the Arabic cyberbullying
dataset. The dataset was trained and tested with a batch size of 256, 10 epochs, and an 80%/20%
training/testing split. The accuracy results of this experiment are shown in Table 2.
Table 2. The accuracy of deep learning classifiers with Arabic GloVe corpus 256D

| Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU |
|---|---|---|---|---|---|---|
| Arabic cyberbullying dataset | Original | 83.79% | 85.77% | 86.50% | 87.83% | 86.16% |
| | Normalized | 84.35% | 85.85% | 85.77% | 86.30% | 86.95% |
| | Light stemming | 84.24% | 85.85% | 86.19% | 86.56% | 86.73% |
| | Root stemming | 84.33% | 85.71% | 85.40% | 85.63% | 85.29% |
| | Lemmatization | 80.09% | 86.25% | 86.16% | 86.22% | 86.13% |
From the classifier point of view, the best accuracy on the Arabic cyberbullying dataset, 87.83%, is achieved
using the GRU classifier applied to the original dataset. Looking at the remaining results, the GRU and
BiGRU classifiers mostly achieved better results than the other classifiers. The root stemming process mostly
achieved lower results than the other preprocessing operations in this experiment and thus failed to improve
performance compared to them. In contrast, the BiGRU and RNN achieved their best results after
normalization of the dataset.
5.2. The second experiment
This experiment uses the pre-trained multilingual GloVe corpus, which includes Arabic and English, with
50D, 100D, and 200D vectors. Both datasets were trained and tested with a batch size of 256, 10 epochs, and
an 80%/20% training/testing split. The accuracy results are shown in Tables 3-5. The Arabic cyberbullying
dataset achieved its best results with the GRU classifier applied after the lemmatization process across the
different dimensions (50, 100, and 200). Increasing the corpus dimension enhances the accuracy, so the 200D
corpus achieved the best accuracy among these corpora with 86.59%. The GRU classifier also achieved the
best accuracy on the English cyberbullying dataset; the experiments were applied on the 50, 100, and 200
dimensions, with 93.38% accuracy achieved using the 100D corpus applied to the normalized dataset.
According to the results, the GRU, LSTM, and BiGRU classifiers mostly achieved better results than
the rest. The root stemmer failed to achieve good results on the Arabic cyberbullying dataset compared to the
other preprocessing operations. Normalization of the English cyberbullying dataset enhances its accuracy.
Table 3. The accuracy of deep learning classifiers with GloVe corpus 50D

| Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU |
|---|---|---|---|---|---|---|
| Arabic cyberbullying dataset | Original | 83.93% | 84.04% | 85.74% | 84.86% | 85.15% |
| | Normalized | 84.07% | 84.19% | 84.64% | 85.29% | 83.90% |
| | Light stemming | 83.42% | 84.55% | 84.44% | 85.12% | 84.61% |
| | Root stemming | 83.23% | 84.52% | 84.07% | 85.20% | 83.56% |
| | Lemmatization | 83.99% | 85.46% | 85.09% | 86.19% | 85.34% |
| English cyberbullying dataset | Original | 90.92% | 92.23% | 92.80% | 92.94% | 92.30% |
| | Normalized | 91.14% | 92.74% | 92.81% | 93.19% | 92.85% |

Table 4. The accuracy of deep learning classifiers with GloVe corpus 100D

| Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU |
|---|---|---|---|---|---|---|
| Arabic cyberbullying dataset | Original | 84.47% | 84.50% | 84.44% | 85.09% | 85.60% |
| | Normalized | 84.92% | 85.20% | 85.15% | 85.34% | 85.34% |
| | Light stemming | 84.47% | 85.34% | 85.71% | 85.54% | 85.23% |
| | Root stemming | 84.33% | 84.47% | 85.12% | 85.03% | 84.13% |
| | Lemmatization | 83.03% | 85.46% | 85.63% | 86.19% | 85.63% |
| English cyberbullying dataset | Original | 91.42% | 92.44% | 92.62% | 92.84% | 92.62% |
| | Normalized | 91.74% | 92.45% | 92.96% | 93.38% | 93.14% |

Table 5. The accuracy of deep learning classifiers with GloVe corpus 200D

| Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU |
|---|---|---|---|---|---|---|
| Arabic cyberbullying dataset | Original | 84.98% | 85.15% | 85.96% | 85.85% | 85.46% |
| | Normalized | 84.44% | 85.54% | 84.89% | 85.99% | 85.82% |
| | Light stemming | 84.92% | 85.03% | 84.92% | 85.88% | 85.68% |
| | Root stemming | 82.72% | 85.34% | 84.21% | 85.65% | 85.03% |
| | Lemmatization | 83.87% | 85.79% | 85.63% | 86.59% | 85.48% |
| English cyberbullying dataset | Original | 91.27% | 92.19% | 92.38% | 92.88% | 92.80% |
| | Normalized | 91.70% | 92.98% | 92.44% | 93.08% | 92.66% |
5.3. The third experiment
An English pre-trained GloVe corpus of 100D was applied to the English cyberbullying dataset. The dataset
was trained and tested with a batch size of 256, 10 epochs, and an 80%/20% training/testing split. Table 6
shows the accuracy of the tested dataset for the different preprocessing options and classifiers.
Table 6. The accuracy of deep learning classifiers with English GloVe corpus 100D

| Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU |
|---|---|---|---|---|---|---|
| English cyberbullying dataset | Original | 90.50% | 91.59% | 92.32% | 92.40% | 92.19% |
| | Normalized | 90.63% | 92.27% | 92.45% | 92.50% | 92.25% |
After the normalization process, the GRU classifier achieved the highest accuracy of 92.50% among the
classifiers RNN, LSTM, BiLSTM, and BiGRU. The results show that the normalization process is essential
when using the English dataset. There is a trade-off between increasing the corpus dimension and the
accuracy of the results. The Arabic corpus with 256D outperforms the other corpora, does not require any
dataset preprocessing, and is recommended with the GRU classifier. GloVe with 50D, 100D, and 200D is
evaluated in the second, multilingual pre-trained corpus; the 100D vectors outperform the other dimensions
when applied to the normalized Arabic and English datasets. The third corpus, the English one with 100D,
does not outperform the second, multilingual pre-trained corpus. From the classifier's point of view, the
GRU classifier outperforms the other classifiers.
6. CONCLUSION
Due to the spread of cyberbullying and its adverse effects, it has become necessary to find appropriate
solutions to detect it using modern artificial intelligence technologies. Current deep learning techniques
(RNN, LSTM, BiLSTM, GRU, and BiGRU) were applied to two datasets (the Arabic and English
cyberbullying datasets) with three different pre-trained GloVe corpora: an Arabic pre-trained GloVe corpus
of 256D, a multilingual pre-trained GloVe corpus with 50D, 100D, and 200D vectors, and an English
pre-trained GloVe corpus of 100D. The best result for the Arabic cyberbullying dataset, 87.83%, was
achieved using the 256D GloVe corpus and the GRU classifier applied to the original dataset, compared with
[12], which reached an accuracy of 84.03%. The best result for the English cyberbullying dataset, 93.38%,
was achieved using the 100D GloVe corpus and the GRU classifier after the normalization process.
REFERENCES
[1] T. Alsubait and D. Alfageh, “Comparison of machine learning techniques for cyberbullying detection on YouTube Arabic
comments,” International Journal of Computer Science & Network Security, vol. 21, no. 1, pp. 1–5, 2021.
[2] A. Ali and A. M. Syed, “Cyberbullying Detection Using Machine Learning,” Pakistan Journal of Engineering and Technology
(PakJET), vol. SI, no. 01, pp. 45–50, 2020.
[3] M. Anand and R. Eswari, “Classification of abusive comments in social media using deep learning,” in Proceedings of the 3rd
International Conference on Computing Methodologies and Communication, ICCMC 2019, Mar. 2019, pp. 974–977, doi:
10.1109/ICCMC.2019.8819734.
[4] T. H. H. Aldhyani, M. H. Al-Adhaileh, and S. N. Alsubari, “Cyberbullying identification system based deep learning algorithms,”
Electronics (Switzerland), vol. 11, no. 20, p. 3273, Oct. 2022, doi: 10.3390/electronics11203273.
[5] Z. K. Hussien and B. N. Dhannoon, “Anomaly detection approach based on deep neural network and dropout,” Baghdad Science
Journal, vol. 17, no. 2, pp. 701–709, Jun. 2020, doi: 10.21123/bsj.2020.17.2(SI).0701.
[6] V. Tyagi, A. Kumar, and S. Das, “Sentiment analysis on twitter data using deep learning approach,” in Proceedings - IEEE 2020
2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN 2020, Dec. 2020,
pp. 187–190, doi: 10.1109/ICACCCN51052.2020.9362853.
[7] A. Q. Al-Bayati, A. S. Al-Araji, and S. H. Ameen, “Arabic sentiment analysis (ASA) using deep learning approach,” Journal of
Engineering, vol. 26, no. 6, pp. 85–93, Jun. 2020, doi: 10.31026/j.eng.2020.06.07.
[8] C. Iwendi, G. Srivastava, S. Khan, and P. K. R. Maddikunta, “Cyberbullying detection solutions based on deep learning
architectures,” Multimedia Systems, vol. 29, no. 3, pp. 1839–1852, Oct. 2020, doi: 10.1007/s00530-020-00701-5.
[9] D. R. Janardhana, C. P. Vijay, G. B. J. Swamy, and K. Ganaraj, “Feature enhancement based text sentiment classification using
deep learning model,” Oct. 2020, doi: 10.1109/ICCCS49678.2020.9277109.
[10] M. Mahat, “Detecting cyberbullying across multiple social media platforms using deep learning,” in 2021 International Conference
on Advance Computing and Innovative Technologies in Engineering, ICACITE 2021, Mar. 2021, pp. 299–301, doi:
10.1109/ICACITE51222.2021.9404736.
[11] Venkatesh, S. U. Hegde, A. S. Zaiba, and Y. Nagaraju, “Hybrid CNN-LSTM model with glove word vector for sentiment analysis
on football specific tweets,” Proceedings of the 2021 1st International Conference on Advances in Electrical, Computing,
Communications and Sustainable Technologies, ICAECT 2021, 2021, doi: 10.1109/ICAECT49130.2021.9392516.
[12] S. Almutiry and M. Abdel Fattah, “Arabic cyberbullying detection using Arabic sentiment analysis,” The Egyptian Journal of
Language Engineering, vol. 8, no. 1, pp. 39–50, Apr. 2021, doi: 10.21608/ejle.2021.50240.1017.
[13] N. A. Hamzah and B. N. Dhannoon, “The detection of sexual harassment and chat predators using artificial neural network,”
Karbala International Journal of Modern Science, vol. 7, no. 4, pp. 301–312, Dec. 2021, doi: 10.33640/2405-609X.3157.
[14] T. Hossain, H. Z. Mauni, and R. Rab, “Reducing the effect of imbalance in text classification using SVD and glove with ensemble
and deep learning,” Computing and Informatics, vol. 41, no. 1, pp. 98–115, 2022, doi: 10.31577/CAI_2022_1_98.
[15] M. A. Akbar, A. Jazlan, M. Mahbuburrashid, H. F. M. Zaki, M. N. Akhter, and A. H. Embong, “Solar thermal process parameters
forecasting for evacuated tube collectors (Etc) based on RNN-LSTM,” IIUM Engineering Journal, vol. 24, no. 1, pp. 256–268, Jan.
2023, doi: 10.31436/iiumej.v24i1.2374.
[16] P. Zheng, W. Zhao, Y. Lv, L. Qian, and Y. Li, “Health status-based predictive maintenance decision-making via LSTM and markov
decision process,” Mathematics, vol. 11, no. 1, p. 109, Dec. 2023, doi: 10.3390/math11010109.
[17] T. A. Wotaifi and B. N. Dhannoon, “An effective hybrid deep neural network for arabic fake news detection,” Baghdad Science
Journal, Jan. 2023, doi: 10.21123/bsj.2023.7427.
[18] P. Hu, J. Qi, J. Bo, Y. Xia, C.-M. Jiao, and M.-T. Huang, “Research on LSTM-based industrial added value prediction under the
framework of federated learning,” in Proceedings of the 2022 3rd International Conference on Big Data and Informatization
Education (ICBDIE 2022), Atlantis Press International BV, 2023, pp. 426–434.
[19] A. Pratomo, M. O. Jatmika, B. Rahmat, and Y. S. Triana, “Transfer learning implementation on BiLSTM with optimizer for
predicting non-ferrous metals prices,” 2022.
[20] D. Naik and C. D. Jaidhar, “A novel multi-layer attention framework for visual description prediction using bidirectional LSTM,”
Journal of Big Data, vol. 9, no. 1, Nov. 2022, doi: 10.1186/s40537-022-00664-6.
[21] G. Shen, Q. Tan, H. Zhang, P. Zeng, and J. Xu, “Deep learning with gated recurrent unit networks for financial sequence
predictions,” Procedia Computer Science, vol. 131, pp. 895–903, 2018, doi: 10.1016/j.procs.2018.04.298.
[22] M. Li et al., “Internet financial credit risk assessment with sliding window and attention mechanism LSTM model,” Tehnicki
Vjesnik, vol. 30, no. 1, pp. 1–7, Feb. 2023, doi: 10.17559/TV-20221110173532.
[23] Y. Liu, X. Liu, Y. Zhang, and S. Li, “CEGH: A hybrid model using CEEMD, entropy, GRU, and history attention for intraday stock
market forecasting,” Entropy, vol. 25, no. 1, p. 71, Dec. 2023, doi: 10.3390/e25010071.
[24] Z. Liu, J. Mei, D. Wang, Y. Guo, and L. Wu, “A novel damage identification method for steel catenary risers based on a novel
CNN-GRU model optimized by PSO,” Journal of Marine Science and Engineering, vol. 11, no. 1, p. 200, Jan. 2023, doi:
10.3390/jmse11010200.
[25] T. Saghi, D. Bustan, and S. S. Aphale, “Bearing fault diagnosis based on multi-scale CNN and bidirectional GRU,” Vibration, vol.
6, no. 1, pp. 11–28, Dec. 2022, doi: 10.3390/vibration6010002.
[26] T. A. Wotaifi and B. N. Dhannoon, “Improving prediction of arabic fake news using fuzzy logic and modified random forest model,”
Karbala International Journal of Modern Science, vol. 8, no. 3, pp. 477–485, Aug. 2022, doi: 10.33640/2405-609X.3241.
[27] J. Wang, K. Fu, and C. T. Lu, “SOSNet: A graph convolutional network approach to fine-grained cyberbullying detection,” in
Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020, Dec. 2020, pp. 1699–1708, doi:
10.1109/BigData50022.2020.9378065.
BIOGRAPHIES OF AUTHORS
Noor Haydar Shaker holds a bachelor's degree in computer science from Al-Nahrain University, Iraq,
obtained in 2019. She is currently a master's student at Al-Nahrain University, specializing in artificial
intelligence, and is conducting research on deep learning algorithms, the field of the master's thesis she is
currently preparing. She can be contacted at email: noor.haidar21@ced.nahrainuniv.edu.iq.
Ban N. Dhannoon has held a Ph.D. in computer science since 2001 from the University of Technology,
Baghdad, Iraq, with the dissertation "Fuzzy Rule Extraction". She has been a professor in the Department of
Computer Science, College of Science, Al-Nahrain University since 2013. Her research interests are artificial
intelligence (natural language processing, machine learning, and deep learning), digital image processing,
and pattern recognition. She can be contacted at email: ban.n.dhannoon@nahrainuniv.edu.iq.

More Related Content

PDF
Social cyber-criminal, towards automatic real time recognition of malicious p...
PDF
BINARY TEXT CLASSIFICATION OF CYBER HARASSMENT USING DEEP LEARNING
PDF
Detecting cyberbullying text using the approaches with machine learning model...
PDF
A Machine Learning Ensemble Model for the Detection of Cyberbullying
PDF
A MACHINE LEARNING ENSEMBLE MODEL FOR THE DETECTION OF CYBERBULLYING
PDF
A Machine Learning Ensemble Model for the Detection of Cyberbullying
PDF
A study of cyberbullying detection using Deep Learning and Machine Learning T...
PDF
A study of cyberbullying detection using Deep Learning and Machine Learning T...
Social cyber-criminal, towards automatic real time recognition of malicious p...
BINARY TEXT CLASSIFICATION OF CYBER HARASSMENT USING DEEP LEARNING
Detecting cyberbullying text using the approaches with machine learning model...
A Machine Learning Ensemble Model for the Detection of Cyberbullying
A MACHINE LEARNING ENSEMBLE MODEL FOR THE DETECTION OF CYBERBULLYING
A Machine Learning Ensemble Model for the Detection of Cyberbullying
A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...

Similar to Word embedding for detecting cyberbullying based on recurrent neural networks (20)

PDF
Detection of Cyberbullying on Social Media using Machine Learning
PDF
Predicting cyber bullying on t witter using machine learning
PDF
Smart detection of offensive words in social media using the soundex algorith...
PDF
An evolutionary approach to comparative analysis of detecting Bangla abusive ...
PDF
G05913234
PDF
A hybrid approach based on personality traits for hate speech detection in Ar...
PDF
Security and privacy recommendation of mobile app for Arabic speaking
PDF
Classification of Disastrous Tweets on Twitter using BERT Model
PDF
Automatic detection of safety requests in web and mobile applications using n...
PDF
Categorize balanced dataset for troll detection
PDF
A benchmark study of machine learning models for online fake news detection
PDF
Sentence embedding to improve rumour detection performance model
PDF
Comparison of word embedding features using deep learning in sentiment analysis
PDF
MACHINE LEARNING AND DEEP LEARNING TECHNIQUES FOR DETECTING ABUSIVE CONTENT O...
PDF
Cyber bullying detection project documents free downloas
PDF
MACHINE LEARNING APPLICATIONS IN MALWARE CLASSIFICATION: A METAANALYSIS LITER...
DOCX
Research Paper TopicITS835 – Enterprise Risk Managemen.docx
PDF
IRJET- Semantic Question Matching
PDF
Fake accounts detection system based on bidirectional gated recurrent unit n...
PPTX
1069391_Sharayu Mogare_CyberbullyingDetection on social networks using machin...
Detection of Cyberbullying on Social Media using Machine Learning
Predicting cyber bullying on t witter using machine learning
Smart detection of offensive words in social media using the soundex algorith...
An evolutionary approach to comparative analysis of detecting Bangla abusive ...
G05913234
A hybrid approach based on personality traits for hate speech detection in Ar...
Security and privacy recommendation of mobile app for Arabic speaking
Classification of Disastrous Tweets on Twitter using BERT Model
Automatic detection of safety requests in web and mobile applications using n...
Categorize balanced dataset for troll detection
A benchmark study of machine learning models for online fake news detection
Sentence embedding to improve rumour detection performance model
Comparison of word embedding features using deep learning in sentiment analysis
MACHINE LEARNING AND DEEP LEARNING TECHNIQUES FOR DETECTING ABUSIVE CONTENT O...
Cyber bullying detection project documents free downloas
MACHINE LEARNING APPLICATIONS IN MALWARE CLASSIFICATION: A METAANALYSIS LITER...
Research Paper TopicITS835 – Enterprise Risk Managemen.docx
IRJET- Semantic Question Matching
Fake accounts detection system based on bidirectional gated recurrent unit n...
1069391_Sharayu Mogare_CyberbullyingDetection on social networks using machin...
Ad

More from IAESIJAI (20)

PDF
Hybrid model detection and classification of lung cancer
PDF
Adaptive kernel integration in visual geometry group 16 for enhanced classifi...
PDF
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
PDF
Enhancing fall detection and classification using Jarratt‐butterfly optimizat...
PDF
Deep ensemble learning with uncertainty aware prediction ranking for cervical...
PDF
Event detection in soccer matches through audio classification using transfer...
PDF
Detecting road damage utilizing retinaNet and mobileNet models on edge devices
PDF
Optimizing deep learning models from multi-objective perspective via Bayesian...
PDF
Squeeze-excitation half U-Net and synthetic minority oversampling technique o...
PDF
A novel scalable deep ensemble learning framework for big data classification...
PDF
Exploring DenseNet architectures with particle swarm optimization: efficient ...
PDF
A transfer learning-based deep neural network for tomato plant disease classi...
PDF
U-Net for wheel rim contour detection in robotic deburring
PDF
Deep learning-based classifier for geometric dimensioning and tolerancing sym...
PDF
Enhancing fire detection capabilities: Leveraging you only look once for swif...
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PDF
Depression detection through transformers-based emotion recognition in multiv...
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
Enhancing financial cybersecurity via advanced machine learning: analysis, co...
PDF
Crop classification using object-oriented method and Google Earth Engine
Hybrid model detection and classification of lung cancer
Adaptive kernel integration in visual geometry group 16 for enhanced classifi...
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
Enhancing fall detection and classification using Jarratt‐butterfly optimizat...
Deep ensemble learning with uncertainty aware prediction ranking for cervical...
Event detection in soccer matches through audio classification using transfer...
Detecting road damage utilizing retinaNet and mobileNet models on edge devices
Optimizing deep learning models from multi-objective perspective via Bayesian...
Squeeze-excitation half U-Net and synthetic minority oversampling technique o...
A novel scalable deep ensemble learning framework for big data classification...
Exploring DenseNet architectures with particle swarm optimization: efficient ...
A transfer learning-based deep neural network for tomato plant disease classi...
U-Net for wheel rim contour detection in robotic deburring
Deep learning-based classifier for geometric dimensioning and tolerancing sym...
Enhancing fire detection capabilities: Leveraging you only look once for swif...
Accuracy of neural networks in brain wave diagnosis of schizophrenia
Depression detection through transformers-based emotion recognition in multiv...
A comparative analysis of optical character recognition models for extracting...
Enhancing financial cybersecurity via advanced machine learning: analysis, co...
Crop classification using object-oriented method and Google Earth Engine
Ad

Recently uploaded (20)

PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
cuic standard and advanced reporting.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
Spectroscopy.pptx food analysis technology
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Review of recent advances in non-invasive hemoglobin estimation
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
Big Data Technologies - Introduction.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Electronic commerce courselecture one. Pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Encapsulation theory and applications.pdf
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Understanding_Digital_Forensics_Presentation.pptx
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
cuic standard and advanced reporting.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Spectroscopy.pptx food analysis technology
Diabetes mellitus diagnosis method based random forest with bat algorithm
Unlocking AI with Model Context Protocol (MCP)
Review of recent advances in non-invasive hemoglobin estimation
The AUB Centre for AI in Media Proposal.docx
Big Data Technologies - Introduction.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
Reach Out and Touch Someone: Haptics and Empathic Computing
Electronic commerce courselecture one. Pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Encapsulation theory and applications.pdf

Word embedding for detecting cyberbullying based on recurrent neural networks

  • 1. IAES International Journal of Artificial Intelligence (IJ-AI) Vol. 13, No. 1, March 2024, pp. 500~508 ISSN: 2252-8938, DOI: 10.11591/ijai.v13.i1.pp500-508  500 Journal homepage: http://ijai.iaescore.com Word embedding for detecting cyberbullying based on recurrent neural networks Noor Haydar Shaker, Ban N. Dhannoon Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq Article Info ABSTRACT Article history: Received Jan 27, 2023 Revised Mar 19, 2023 Accepted Mar 27, 2023 The phenomenon of cyberbullying has spread and has become one of the biggest problems facing users of social media sites and generated significant adverse effects on society and the victim in particular. Finding appropriate solutions to detect and reduce cyberbullying has become necessary to mitigate its negative impacts on society and the victim. Twitter comments on two datasets are used to detect cyberbullying, the first dataset was the Arabic cyberbullying dataset, and the second was the English cyberbullying dataset. Three different pre-trained global vectors (GloVe) corpora with different dimensions were used on the original and preprocessed datasets to represent the words. Recurrent neural networks (RNN), long short-term memory (LSTM), Bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and Bidirectional GRU (BiGRU) classifiers utilized, evaluated and compared. The GRU outperform other classifiers on both datasets; its accuracy on the Arabic cyberbullying dataset using the Arabic GloVe corpus of dimension equal to 256D is 87.83%, while the accuracy on the English datasets using 100 D pre- trained GloVe corpus is 93.38%. Keywords: Deep learning classifiers Gated recurrent unit GloVe word embedding Long short-term memory Recurrent neural networks This is an open access article under the CC BY-SA license. Corresponding Author: Noor Haydar Shaker Department of Computer Science, College of Science, Al-Nahrain University Baghdad, Iraq Email: noor.haidar21@ced.nahrainuniv.edu.iq 1. INTRODUCTION The development of technological technologies and the increase in the number of users of social media sites, including users who try to harm others, led to the spread of cyberbullying. Cyberbullying is a type of bullying in which one or more persons (the bully) purposefully and frequently cause harm to another person (the victim) through using technological technologies. Cyberbullies utilize technological technologies like mobile phones, computers, or other electronic devices to send emails, instant text messages, make comments on social media or in chat rooms, or otherwise to harass their victims [1], [2]. Cyberbullying may have serious and long-term consequences for its victims, like a physical, mental, and emotional impact on the victim that leaves them feeling scared, furious, humiliated, exhausted, or have symptoms such as headaches or stomach pains. When victims experience cyberbullying, they might start to feel ashamed, nervous, anxious, and insecure about what people say or think about them. This can lead to withdrawal from friends and family, and it may lead to the victim's suicide [3], [4]. So, it has become necessary to search for and find solutions to detect cyberbullying messages. 
Many attempts have been made in the field of artificial intelligence to detect the phenomenon of cyberbullying by using machine learning and deep learning techniques, and attempts are continuing to find the best results and appropriate solutions to detect this phenomenon to reduce the negative effects that generate in society, especially on the category of teenagers who are more exposed to cyberbullying than the rest category of society.
  • 2. Int J Artif Intell ISSN: 2252-8938  Word embedding for detecting cyberbullying based on recurrent neural networks (Noor Haydar) 501 In this research, we used deep learning classifiers with two labelled datasets (Arabic and English) to detect the phenomenon of cyberbullying. In the step of word representation, we used a pre-trained global vector for word representation (Pre-trained GloVe) for obtaining vector representations for words, which facilitates dealing with these words inside the computer since most electronic devices, including the computer, only understand and deal with digital values, so it became a step to represent words and convert them into vectors, the most important step. Each vector contains a number of numbers to represent this word and facilitate dealing with it inside the computer. Five deep learning classifiers we used in this research to detect cyberbullying are: standard recurrent neural networks (RNN), long short-term memory (LSTM), bidirectional long short-term memory (BiLSTM), gated recurrent unit (GRU), and Bidirectional gated recurrent unit (BiGRU) based on a powerful and robust form of a chain of repeating modules of neural networks with internal memory used for sequential data. This research is organized: Section 2 presents related works, details of the dataset, word representations, the classifier, and their corresponding results. Section 3 explains the basic concepts in the practical part of this research. Section 4 provides the methodology that this research followed to achieve the results. Section 5 presents the experimental results and discussion. 2. RELATED WORK Since cyberbullying is one of the major problems we are facing, many researchers have contributed to developing models based on machine learning and deep learning to detect this type of bullying. Reviewing previous work found that there was not enough research done to identify Arab cyberbullying in particular. This can be attributed to many challenges and problems related to the Arabic language itself, such as 1) the lack of a large data set for adopting it to build prediction models. 2) using colloquial language in speaking, and 3) not all libraries support the Arabic language [5]. Tyagi et al. [6] employed convolution neural network (CNN) with LSTM as a deep learning module (CNN-LSTM) on 1.6 million English tweets, which categorize into two classes (negative and positive class). The accuracy was 81.20% in the CNN-LSTM module with GloVe word embedding model dimension equal to 300D. Al-Bayati et al. [7] used an Arabic dataset taken from the internet, which is called large scale Arabic book reviews (LABR), and contains over 16,448 rows, including positive labels (1) and negative labels (0). The dataset was preprocessed by removing any words found in the dataset that are not in Arabic, normalization, stemming, removing stopwords, and others. The dataset is split into 67% for training, 17% for testing, and 16% for validation. The dataset is trained and tested with LSTM as a deep learning classifier and a pre-trained embedding layer as word embedding for word representation. The accuracy was 82% with the LSTM classifier, batch size 256, and epoch 10, which was the best result in this study. The result in [8] an English cyberbullying dataset from Kaggle, which was collected from social media sites like Twitter, Instagram, and Facebook. 
The dataset includes 100,000 comments, and the dataset was preprocessed in several processes such as text cleaning, tokenization, stemming, lemmatization, and stopwords removal. This research used LSTM, BiLSTM, GRU, and RNN as deep learning classifiers. The accuracy was 80.86% with an LSTM, 82.18% with BiLSTM, 81.46% with GRU, and 81.01% with RNN. Higher accuracy was achieved in this research 82.18% with a BiLSTM. Janardhana et al. [9] used the movie review (MR) dataset, which included 12,500 positive and 12,500 negative reviews. The dataset was preprocessed in several processes such as eliminating the stopwords and removing the punctuation. This paper used a GloVe word embedding dimension of 200 with three deep learning classifiers like LSTM, CNN, and CRNN (Generalized CNN combined with the BiLSTM). The accuracy was 79.47% with LSTM, 72.32% with CNN, and 84% with CRNN. The better accuracy was achieved at 84% with the CRNN deep learning classifier. The LSTM as a deep neural network has been used with a sentiment- specific word embedding (SSWE) layer for word representation as can be seen in [10]. The dataset was compiled from three sources: Twitter, Formspring, and Wikipedia, with each platform contributing 3,000 examples for 9,000. The dataset was preprocessed in several processes, like removing numbers, punctuation marks, symbols, blank spaces, and other processes. The accuracy of each separate platform was 79.1% with Twitter, 72% with Formspring, 75.5% with Wikipedia, and 77.9% from the total examples with the LSTM deep learning classifier. The proposed module in this research has some limitations, like the small size of the dataset used and the one deep learning classifier tried in this research. Venkatesh et al. [11] applied an English Twitter dataset including 10,007 comments on tweets, and the dataset was preprocessed in several processes, such as converting all characters to lowercase, removing the links, removing punctuation, removing whitespace, and others. The authors tried to use deep learning and machine learning modules to achieve the best result. The best accuracy achieved was 85% with CNN-LSTM and GloVe word embedding. Almutiry et al. [12] utilized Arabic comments Twitter dataset size of 17,748 comments tweets, which included 14,178 cyberbullying tweets and 3,570 non-cyberbullying tweets. The Arabic comments Twitter dataset achieved 84.03% with support vector machine (SVM) as the classifier and term frequency-inverse document
  • 3.  ISSN: 2252-8938 Int J Artif Intell, Vol. 13, No. 1, March 2024: 500-508 502 frequency (TD-IDF) as feature extraction to word representation of the dataset. Table 1 shows some recently used methods for feature extraction and the data set with their highest accuracy. Table 1. The highest related work accuracy for each classifier on the used dataset Research Dataset Feature extraction/Word embedding Classifier Accuracy [6] 1.6 million English tweets GloVe CNN-LSTM 81.20% [7] 16,448 rows from (LABR) pre-trained embedding layer LSTM 82% [8] 100,000 English comments cyberbullying dataset from Kaggle embedding layer LSTM 80.86% BiLSTM 82.18% GRU 81.46% RNN 81.01% [9] 25,000 reviews from Movie Review GloVe LSTM 79.47% CNN 72.32% CRNN 84% [10] 9,000 examples compiled from Twitter, Formspring, and Wikipedia SSWE layer LSTM 77.9% [11] 10,007 comments on English tweets GloVe CNN-LSTM 85% [12] 17,748 Arabic comments tweet TF-IDF SVM 84.03% 3. PRELIMINARIES Global vectors (GloVe) is an algorithm that was trained on a huge number of words using unsupervised training to obtain the embedding matrix for the words, knowing how close the words are to each other and drawing the words nearest or furthest from each other. GloVe depends on co-occurrence statistics and a probability ratio statistic of the words to generate an embedding matrix for these words. Because the computer understands only digital data, this requires converting words into digital values to make them easier to understand and deal with inside the computer. GloVe is used to represent words using an embedding matrix containing many words. Each of these words corresponds to several numerical values, representing the vectors embedding this word, which are then employed as the input layer for neural networks of deep learning classifiers [13], [14]. Recurrent neural network (RNN) is one type of deep learning classifier based on keeping the output of a certain layer and feeding it back to the input to predict the layer's output, but it suffers from the problem of vanishing and exploding gradients. RNN has been developed into different types of classifiers to achieve better results and possibly solve the problems that RNN's deep learning classifier suffers from [15]. Long short-term memory (LSTM) is one of the types and developments of the RNN that Solves the problem of vanishing and exploding gradients, especially when faced with long text sentences. The LSTM contains a memory that saves the most important information and neglects the less important information through four gates: forgets gate, input gate, cell state, and output gate. Figure 1 shows the LSTM structure [4], [16]. Figure 1. The LSTM structure [16] Where the A are the neurons of LSTM, the input gates of the neurons are Xt, Xt-1, Xt+1, and the output gates are ht, ht-1, ht+1. The two outputs from each neuron to the next neuron represent the forget gate and cell state. The  is the sigmoid activation function [17], [18]. Bidirectional long short-term memory (BiLSTM) is also a type of recurrent neural network (RNN). The sequence processing model consists of two LSTMs: the first takes the input in a forward direction and the other in a backward direction. The BiLSTM is working to effectively increase the information available to the network and improve the context available to the algorithm.
Figure 2 shows how the BiLSTM works, where the inputs to the cells are Xt-1, Xt, and Xt+1, the outputs are yt-1, yt, and yt+1, and σ is the sigmoid nonlinear activation function [19], [20].
Figure 2. The BiLSTM structure [20]
The gated recurrent unit (GRU) is another type of recurrent neural network; it also mitigates the vanishing and exploding gradient problems that affect the RNN. The GRU is similar to the LSTM classifier but has fewer parameters, so it is generally faster and easier to train [21]-[23]. Figure 3 shows the structure of the GRU classifier [24]. A typical RNN learns sequential information in one direction only, i.e., the dependence of time step t on the previous time steps, so potentially useful information from later steps is lost. The bidirectional GRU (BiGRU) addresses this by adding a GRU layer that processes the sequence backward, so the output yt at time t is based on the information of both the previous time steps (Ht−1) and the following time steps (Ht+1) [25].
Figure 3. The structure of the GRU deep learning classifier [24]
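To make the relationship between these recurrent layers concrete, the following minimal Keras sketch shows how the five classifiers compared in this work could be defined on top of a shared GloVe embedding layer. The layer size (64 units), the single sigmoid output, and the function and variable names are illustrative assumptions, not the exact configuration used by the authors.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, SimpleRNN, LSTM, GRU,
                                     Bidirectional, Dense)
from tensorflow.keras.initializers import Constant

def build_classifier(kind, embedding_matrix, units=64):
    """Return a binary classifier using the requested recurrent layer."""
    vocab_size, dim = embedding_matrix.shape
    recurrent = {
        "RNN": SimpleRNN(units),
        "LSTM": LSTM(units),
        "BiLSTM": Bidirectional(LSTM(units)),   # forward + backward LSTM
        "GRU": GRU(units),
        "BiGRU": Bidirectional(GRU(units)),     # forward + backward GRU
    }[kind]
    model = Sequential([
        # GloVe vectors as a frozen input (embedding) layer
        Embedding(vocab_size, dim,
                  embeddings_initializer=Constant(embedding_matrix),
                  trainable=False),
        recurrent,
        Dense(1, activation="sigmoid"),  # cyberbullying vs. not cyberbullying
    ])
    return model
```

The bidirectional variants simply wrap the same recurrent layer in a Bidirectional layer, which is why they roughly double the number of recurrent parameters while keeping the rest of the model unchanged.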
4. METHODOLOGY
In this research, two cyberbullying datasets are used, the first in Arabic and the second in English, and each is processed with several operations in the preprocessing step. Three types of pre-trained corpora with different dimensions are then used to represent the words; since the computer understands only numeric values, the words must be converted into numeric vectors before they can be processed. In the classification step, several deep learning classifiers are applied to achieve the best results in classifying and detecting cyberbullying. Figure 4 shows the overall methodology, and each step is explained in detail below.
Figure 4. The methodology of deep learning used to detect cyberbullying in this research
4.1. The input dataset
Finding a dataset appropriate to the research topic is necessary for every practical research project, followed by studying its size, labels, and other details. In this research, two Kaggle datasets were used, each containing tweets related to the research topic, cyberbullying. Tweets are posts or messages that individuals publish on the Twitter platform to exchange information with each other all over the world [26]. The first dataset is the Arabic cyberbullying dataset; it contains 17,748 Arabic tweets, including 14,178 cyberbullying tweets and 3,570 non-cyberbullying tweets [12]. The second dataset is the English cyberbullying dataset; it contains about 47k English tweets, of which 7,631 are not cyberbullying and the rest are cyberbullying tweets with harassing comments about religion, age, and other topics [27]. These two datasets are used to detect and classify cyberbullying in Twitter comments in order to reduce and prevent this phenomenon.
4.2. The preprocessing dataset
After selecting and studying the input dataset, it must be processed in the preprocessing stage, since the raw data may contain noise. Preprocessing reduces the number of words and sentences by eliminating unnecessary words from the tweets and by mapping words with the same or similar meanings closer together, using several techniques. The preprocessing in this research is divided into two groups of operations: one for the Arabic cyberbullying dataset and one for the English cyberbullying dataset. The Arabic cyberbullying dataset is preprocessed in two main steps, normalization and stemming. Normalization includes operations such as tokenization and removing Arabic stopwords, extra spaces, numbers, and repeated characters. Stemming includes light stemming, root stemming, and lemmatization. The English cyberbullying dataset is preprocessed using normalization only (tokenization and removing English stopwords, extra spaces, punctuation, numbers, repeated characters, and others). After preprocessing, each dataset is split into training and testing data at a ratio of 8:2, respectively. The training and testing data are used to detect and classify whether a Twitter comment is cyberbullying or not.
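As a minimal sketch of what the English normalization and the 8:2 split could look like, the following Python code assumes NLTK and scikit-learn; the Arabic pipeline would substitute an Arabic stopword list, light/root stemmers, and a lemmatizer, and the exact operations used by the authors may differ.

```python
import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from sklearn.model_selection import train_test_split

english_stopwords = set(stopwords.words("english"))

def normalize(tweet):
    """Lowercase, strip links/punctuation/numbers, collapse repeats, drop stopwords."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", " ", tweet)     # remove links
    tweet = re.sub(r"[^a-z\s]", " ", tweet)         # remove punctuation and numbers
    tweet = re.sub(r"(.)\1{2,}", r"\1", tweet)      # collapse repeated characters
    tokens = [t for t in word_tokenize(tweet) if t not in english_stopwords]
    return " ".join(tokens)

# Illustrative usage: `tweets` and `labels` stand for the loaded dataset columns.
tweets = ["You are soooo stupid!!!", "Have a nice day :)"]
labels = [1, 0]
cleaned = [normalize(t) for t in tweets]

# 8:2 train/test split, as described in section 4.2.
x_train, x_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.2, random_state=42)
```

For the Arabic dataset, the same split would be applied after the light stemming, root stemming, or lemmatization variants that are compared in section 5.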
4.3. GloVe word embedding
The computer understands only numeric data, so each word must be represented as a vector of numbers that the machine can process. In this research, pre-trained GloVe word embeddings are used to represent the words of the tweet comments in both the Arabic and English cyberbullying datasets. The standard GloVe embeddings were trained on a corpus of around 6 billion tokens of English text and also cover many general-use characters such as commas, braces, and semicolons. Four varieties of GloVe are available: 50 D, 100 D, 200 D, and 300 D, where D stands for dimension; 100 D means that each word is represented by a vector of size 100. GloVe files are simple text files in the form of a dictionary, where the words are the keys and the dense vectors are the values. Three pre-trained GloVe corpora from Kaggle are utilized. The first is an Arabic corpus with 1,538,616 Arabic words of 256 D. The second is an English corpus containing over a million English words of 100 D. The third corpus is multilingual, covering Arabic and English among other languages, and contains 1,193,514 words with 50 D, 100 D, and 200 D variants.
4.4. The classifiers
A classifier is an algorithm trained on a dataset, and its quality depends on finding the weights that maximize accuracy on the tested data. Five deep learning classifiers are used to classify and detect cyberbullying on the two datasets (Arabic and English) with the pre-trained GloVe embeddings: the standard recurrent neural network (RNN), long short-term memory (LSTM), Bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and Bidirectional GRU (BiGRU), in different experiments.
5. EXPERIMENT RESULTS AND DISCUSSION
In this section, three experiments with the deep learning classifiers and the pre-trained GloVe embeddings are used to classify and detect cyberbullying. Each experiment produced a set of results, obtained by running the corresponding code written in Python, one of the most widely used programming languages in computer science. Each experiment is explained separately in detail in sections 5.1 to 5.3.
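As a rough indication of what the core of that code could look like, the following sketch reuses the illustrative `tokenizer`, `embedding_matrix`, `build_classifier`, and split variables from the earlier sketches and applies the training settings reported in the experiments below (batch size 256, 10 epochs, 8:2 split); the optimizer, loss, and padding length are assumptions, since the paper does not state them.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 50  # assumed maximum tweet length in tokens

# Convert the cleaned tweets into padded index sequences using the fitted tokenizer.
x_train_seq = pad_sequences(tokenizer.texts_to_sequences(x_train), maxlen=max_len)
x_test_seq = pad_sequences(tokenizer.texts_to_sequences(x_test), maxlen=max_len)
y_train_arr = np.asarray(y_train)
y_test_arr = np.asarray(y_test)

# Build and train one classifier (GRU shown; the other four are built the same way).
model = build_classifier("GRU", embedding_matrix)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train_seq, y_train_arr, validation_data=(x_test_seq, y_test_arr),
          batch_size=256, epochs=10)

# Accuracy on the 20% held-out test split.
loss, accuracy = model.evaluate(x_test_seq, y_test_arr)
```

Repeating this loop over the five classifiers, the three corpora, and the preprocessing variants is what produces the accuracy tables reported below.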
5.1. The first experiment
The first experiment applies the Arabic pre-trained GloVe corpus of 256 D to the Arabic cyberbullying dataset. The dataset was trained and tested with a batch size of 256, 10 epochs, and an 80%/20% split for training and testing. The accuracy results of this experiment are shown in Table 2.

Table 2. The accuracy of deep learning classifiers with the Arabic GloVe corpus of 256 D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic cyberbullying dataset | Original | 83.79% | 85.77% | 86.50% | 87.83% | 86.16%
 | Normalized | 84.35% | 85.85% | 85.77% | 86.30% | 86.95%
 | Light stemming | 84.24% | 85.85% | 86.19% | 86.56% | 86.73%
 | Root stemming | 84.33% | 85.71% | 85.40% | 85.63% | 85.29%
 | Lemmatization | 80.09% | 86.25% | 86.16% | 86.22% | 86.13%

From the classifier point of view, the best accuracy on the Arabic cyberbullying dataset, 87.83%, is achieved with the GRU classifier applied to the original dataset. Looking at the rest of the results obtained in the practical part of this research, the GRU and BiGRU classifiers mostly achieved better results than the other classifiers. We also observe that root stemming mostly gave lower results than the other preprocessing operations in this experiment, so root stemming largely failed to improve the results. In contrast, the BiGRU and RNN produced their best results after the normalization of the dataset.
5.2. The second experiment
This experiment uses the multilingual pre-trained GloVe corpus, which covers Arabic and English among other languages, with 50 D, 100 D, and 200 D. Both datasets were trained and tested with a batch size of 256, 10 epochs, and an 80%/20% split for training and testing. The accuracy results of this experiment are shown in Tables 3-5. The Arabic cyberbullying dataset achieved its best result with the GRU classifier applied after the lemmatization process for all three dimensions (50, 100, and 200). Increasing the corpus dimension enhances the accuracy here, so 200 D achieved the best accuracy among these corpora with 86.59%. The GRU classifier also achieved the best accuracy when applied to the English cyberbullying dataset.
The experiments were applied with the 50 D, 100 D, and 200 D dimensions, and the best accuracy of 93.38% was achieved using the 100 D corpus applied to the normalized English dataset. According to the results, the GRU, LSTM, and BiGRU classifiers mostly performed better than the rest. Root stemming failed to achieve good results on the Arabic cyberbullying dataset compared to the other preprocessing operations, while normalization of the English cyberbullying dataset enhanced its accuracy.

Table 3. The accuracy of deep learning classifiers with the GloVe corpus of 50 D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic cyberbullying dataset | Original | 83.93% | 84.04% | 85.74% | 84.86% | 85.15%
 | Normalized | 84.07% | 84.19% | 84.64% | 85.29% | 83.90%
 | Light stemming | 83.42% | 84.55% | 84.44% | 85.12% | 84.61%
 | Root stemming | 83.23% | 84.52% | 84.07% | 85.20% | 83.56%
 | Lemmatization | 83.99% | 85.46% | 85.09% | 86.19% | 85.34%
English cyberbullying dataset | Original | 90.92% | 92.23% | 92.80% | 92.94% | 92.30%
 | Normalized | 91.14% | 92.74% | 92.81% | 93.19% | 92.85%

Table 4. The accuracy of deep learning classifiers with the GloVe corpus of 100 D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic cyberbullying dataset | Original | 84.47% | 84.50% | 84.44% | 85.09% | 85.60%
 | Normalized | 84.92% | 85.20% | 85.15% | 85.34% | 85.34%
 | Light stemming | 84.47% | 85.34% | 85.71% | 85.54% | 85.23%
 | Root stemming | 84.33% | 84.47% | 85.12% | 85.03% | 84.13%
 | Lemmatization | 83.03% | 85.46% | 85.63% | 86.19% | 85.63%
English cyberbullying dataset | Original | 91.42% | 92.44% | 92.62% | 92.84% | 92.62%
 | Normalized | 91.74% | 92.45% | 92.96% | 93.38% | 93.14%

Table 5. The accuracy of deep learning classifiers with the GloVe corpus of 200 D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
Arabic cyberbullying dataset | Original | 84.98% | 85.15% | 85.96% | 85.85% | 85.46%
 | Normalized | 84.44% | 85.54% | 84.89% | 85.99% | 85.82%
 | Light stemming | 84.92% | 85.03% | 84.92% | 85.88% | 85.68%
 | Root stemming | 82.72% | 85.34% | 84.21% | 85.65% | 85.03%
 | Lemmatization | 83.87% | 85.79% | 85.63% | 86.59% | 85.48%
English cyberbullying dataset | Original | 91.27% | 92.19% | 92.38% | 92.88% | 92.80%
 | Normalized | 91.70% | 92.98% | 92.44% | 93.08% | 92.66%

5.3. The third experiment
In this experiment, the English pre-trained GloVe corpus of 100 D was applied to the English cyberbullying dataset. The dataset was trained and tested with a batch size of 256, 10 epochs, and an 80%/20% split for training and testing. Table 6 shows the accuracy of the tested dataset with the different preprocessing operations and classifiers.

Table 6. The accuracy of deep learning classifiers with the English GloVe corpus of 100 D
Dataset | Preprocessing | RNN | LSTM | BiLSTM | GRU | BiGRU
English cyberbullying dataset | Original | 90.50% | 91.59% | 92.32% | 92.40% | 92.19%
 | Normalized | 90.63% | 92.27% | 92.45% | 92.50% | 92.25%

After normalization, the GRU classifier achieved the highest accuracy, 92.50%, among the other classifiers (RNN, LSTM, BiLSTM, and BiGRU). The results show that normalization is essential for the English dataset. Increasing the corpus dimension does not always increase accuracy. The Arabic corpus of 256 D outperforms the other corpora on the Arabic dataset, needs no dataset preprocessing, and is recommended with the GRU classifier. The multilingual pre-trained corpus is evaluated at 50 D, 100 D, and 200 D: 100 D gives the best English result when applied to the normalized dataset, while 200 D gives the best Arabic result. The English-only corpus of 100 D does not outperform the multilingual pre-trained corpus.
From the classifier's point of view, the GRU classifier outperforms other classifiers.
6. CONCLUSION
Due to the spread of cyberbullying and its adverse effects, it has become necessary to find appropriate solutions for detecting it with modern artificial intelligence techniques. Current deep learning classifiers (RNN, LSTM, BiLSTM, GRU, and BiGRU) were applied to two datasets, the Arabic and the English cyberbullying datasets, with three different pre-trained GloVe corpora: the Arabic corpus of 256 D, the multilingual corpus with 50 D, 100 D, and 200 D, and the English corpus of 100 D. The best result for the Arabic cyberbullying dataset, 87.83%, was achieved using the 256 D GloVe corpus and the GRU classifier applied to the original dataset, compared with [12], which reached an accuracy of 84.03% on the same data. The best result for the English cyberbullying dataset, 93.38%, was achieved using the 100 D multilingual GloVe corpus and the GRU classifier after the normalization process.

REFERENCES
[1] T. Alsubait and D. Alfageh, “Comparison of machine learning techniques for cyberbullying detection on YouTube Arabic comments,” International Journal of Computer Science & Network Security, vol. 21, no. 1, pp. 1–5, 2021.
[2] A. Ali and A. M. Syed, “Cyberbullying detection using machine learning,” Pakistan Journal of Engineering and Technology (PakJET), vol. SI, no. 01, pp. 45–50, 2020.
[3] M. Anand and R. Eswari, “Classification of abusive comments in social media using deep learning,” in Proceedings of the 3rd International Conference on Computing Methodologies and Communication, ICCMC 2019, Mar. 2019, pp. 974–977, doi: 10.1109/ICCMC.2019.8819734.
[4] T. H. H. Aldhyani, M. H. Al-Adhaileh, and S. N. Alsubari, “Cyberbullying identification system based deep learning algorithms,” Electronics (Switzerland), vol. 11, no. 20, p. 3273, Oct. 2022, doi: 10.3390/electronics11203273.
[5] Z. K. Hussien and B. N. Dhannoon, “Anomaly detection approach based on deep neural network and dropout,” Baghdad Science Journal, vol. 17, no. 2, pp. 701–709, Jun. 2020, doi: 10.21123/bsj.2020.17.2(SI).0701.
[6] V. Tyagi, A. Kumar, and S. Das, “Sentiment analysis on twitter data using deep learning approach,” in Proceedings - IEEE 2020 2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN 2020, Dec. 2020, pp. 187–190, doi: 10.1109/ICACCCN51052.2020.9362853.
[7] A. Q. Al-Bayati, A. S. Al-Araji, and S. H. Ameen, “Arabic sentiment analysis (ASA) using deep learning approach,” Journal of Engineering, vol. 26, no. 6, pp. 85–93, Jun. 2020, doi: 10.31026/j.eng.2020.06.07.
[8] C. Iwendi, G. Srivastava, S. Khan, and P. K. R. Maddikunta, “Cyberbullying detection solutions based on deep learning architectures,” Multimedia Systems, vol. 29, no. 3, pp. 1839–1852, Oct. 2020, doi: 10.1007/s00530-020-00701-5.
[9] D. R. Janardhana, C. P. Vijay, G. B. J. Swamy, and K. Ganaraj, “Feature enhancement based text sentiment classification using deep learning model,” Oct. 2020, doi: 10.1109/ICCCS49678.2020.9277109.
[10] M. Mahat, “Detecting cyberbullying across multiple social media platforms using deep learning,” in 2021 International Conference on Advance Computing and Innovative Technologies in Engineering, ICACITE 2021, Mar. 2021, pp. 299–301, doi: 10.1109/ICACITE51222.2021.9404736.
[11] Venkatesh, S. U. Hegde, A. S. Zaiba, and Y.
Nagaraju, “Hybrid CNN-LSTM model with glove word vector for sentiment analysis on football specific tweets,” in Proceedings of the 2021 1st International Conference on Advances in Electrical, Computing, Communications and Sustainable Technologies, ICAECT 2021, 2021, doi: 10.1109/ICAECT49130.2021.9392516.
[12] S. Almutiry and M. Abdel Fattah, “Arabic cyberbullying detection using Arabic sentiment analysis,” The Egyptian Journal of Language Engineering, vol. 8, no. 1, pp. 39–50, Apr. 2021, doi: 10.21608/ejle.2021.50240.1017.
[13] N. A. Hamzah and B. N. Dhannoon, “The detection of sexual harassment and chat predators using artificial neural network,” Karbala International Journal of Modern Science, vol. 7, no. 4, pp. 301–312, Dec. 2021, doi: 10.33640/2405-609X.3157.
[14] T. Hossain, H. Z. Mauni, and R. Rab, “Reducing the effect of imbalance in text classification using SVD and glove with ensemble and deep learning,” Computing and Informatics, vol. 41, no. 1, pp. 98–115, 2022, doi: 10.31577/CAI_2022_1_98.
[15] M. A. Akbar, A. Jazlan, M. Mahbuburrashid, H. F. M. Zaki, M. N. Akhter, and A. H. Embong, “Solar thermal process parameters forecasting for evacuated tube collectors (ETC) based on RNN-LSTM,” IIUM Engineering Journal, vol. 24, no. 1, pp. 256–268, Jan. 2023, doi: 10.31436/iiumej.v24i1.2374.
[16] P. Zheng, W. Zhao, Y. Lv, L. Qian, and Y. Li, “Health status-based predictive maintenance decision-making via LSTM and markov decision process,” Mathematics, vol. 11, no. 1, p. 109, Dec. 2023, doi: 10.3390/math11010109.
[17] T. A. Wotaifi and B. N. Dhannoon, “An effective hybrid deep neural network for arabic fake news detection,” Baghdad Science Journal, Jan. 2023, doi: 10.21123/bsj.2023.7427.
[18] P. Hu, J. Qi, J. Bo, Y. Xia, C.-M. Jiao, and M.-T. Huang, “Research on LSTM-based industrial added value prediction under the framework of federated learning,” in Proceedings of the 2022 3rd International Conference on Big Data and Informatization Education (ICBDIE 2022), Atlantis Press International BV, 2023, pp. 426–434.
[19] A. Pratomo, M. O. Jatmika, B. Rahmat, and Y. S. Triana, “Transfer learning implementation on BiLSTM with optimizer for predicting non-ferrous metals prices,” 2022.
[20] D. Naik and C. D. Jaidhar, “A novel multi-layer attention framework for visual description prediction using bidirectional LSTM,” Journal of Big Data, vol. 9, no. 1, Nov. 2022, doi: 10.1186/s40537-022-00664-6.
[21] G. Shen, Q. Tan, H. Zhang, P. Zeng, and J. Xu, “Deep learning with gated recurrent unit networks for financial sequence predictions,” Procedia Computer Science, vol. 131, pp. 895–903, 2018, doi: 10.1016/j.procs.2018.04.298.
[22] M. Li et al., “Internet financial credit risk assessment with sliding window and attention mechanism LSTM model,” Tehnicki Vjesnik, vol. 30, no. 1, pp. 1–7, Feb. 2023, doi: 10.17559/TV-20221110173532.
[23] Y. Liu, X. Liu, Y. Zhang, and S. Li, “CEGH: A hybrid model using CEEMD, entropy, GRU, and history attention for intraday stock market forecasting,” Entropy, vol. 25, no. 1, p. 71, Dec. 2023, doi: 10.3390/e25010071.
[24] Z. Liu, J. Mei, D. Wang, Y. Guo, and L. Wu, “A novel damage identification method for steel catenary risers based on a novel CNN-GRU model optimized by PSO,” Journal of Marine Science and Engineering, vol. 11, no. 1, p. 200, Jan. 2023, doi: 10.3390/jmse11010200.
[25] T. Saghi, D. Bustan, and S. S. Aphale, “Bearing fault diagnosis based on multi-scale CNN and bidirectional GRU,” Vibration, vol. 6, no. 1, pp. 11–28, Dec. 2022, doi: 10.3390/vibration6010002.
[26] T. A. Wotaifi and B. N. Dhannoon, “Improving prediction of arabic fake news using fuzzy logic and modified random forest model,” Karbala International Journal of Modern Science, vol. 8, no. 3, pp. 477–485, Aug. 2022, doi: 10.33640/2405-609X.3241.
[27] J. Wang, K. Fu, and C. T. Lu, “SOSNet: A graph convolutional network approach to fine-grained cyberbullying detection,” in Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020, Dec. 2020, pp. 1699–1708, doi: 10.1109/BigData50022.2020.9378065.

BIOGRAPHIES OF AUTHORS
Noor Haydar Shaker received a bachelor's degree in computer science from Al-Nahrain University, Iraq, in 2019. She is currently a master's student at Al-Nahrain University, specializing in artificial intelligence, and is conducting research in deep learning algorithms, the topic of her master's thesis. She can be contacted at email: noor.haidar21@ced.nahrainuniv.edu.iq.
Ban N. Dhannoon received her Ph.D. in computer science in 2001 from the University of Technology, Baghdad, Iraq, with the dissertation "Fuzzy Rule Extraction". She has been a professor in the Computer Science Department, College of Science, Al-Nahrain University since 2013. Her research interests are artificial intelligence (natural language processing, machine learning, and deep learning), digital image processing, and pattern recognition. She can be contacted at email: ban.n.dhannoon@nahrainuniv.edu.iq.