IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 13, No. 4, December 2024, pp. 4106~4112
ISSN: 2252-8938, DOI: 10.11591/ijai.v13.i4.pp4106-4112
Journal homepage: http://ijai.iaescore.com
Detecting fraudulent financial statement under imbalanced data
using neural network
Hendra Tjahyadi, Yosua Efraim Young
Study Program of Informatics, Faculty of Computer Science, Universitas Pelita Harapan, Jakarta, Indonesia
Article Info
Article history:
Received Dec 29, 2023
Revised Apr 19, 2024
Accepted Jun 8, 2024
ABSTRACT
In this paper, a novel approach for detecting fraudulent financial statements by
employing a combination of neural networks and the synthetic minority
over-sampling technique (SMOTE) is introduced. This approach is designed to tackle
the problem of imbalanced datasets prevalent in fraud cases, which, if left
unaddressed, hinders a model's ability to accurately identify fraud. Three neural
network models are developed, each using a different set of fraud predictors as the
input layer: 28 raw financial data items; 14 financial ratios; and a 42-input
combination of both. Experimental validation using established research datasets is
conducted to assess the performance of the proposed method. Performance metrics,
namely area under the curve (AUC), precision, and sensitivity, are used for evaluation,
comparing the proposed model against existing benchmark models found in the
literature. Results indicate that the proposed model achieves an AUC score of
70.6% and a precision score of 2.89%, comparable to the existing models,
while its sensitivity score of 83% outperforms all counterparts. The high
sensitivity rate of the proposed model underscores its practical utility for
auditors and regulators, as it minimizes the risk of false negatives, thereby
enhancing confidence in fraud detection.
Keywords:
Fraudulent financial statements
Machine learning
Neural network
Supervised learning
Synthetic minority over-sampling technique
This is an open access article under the CC BY-SA license.
Corresponding Author:
Hendra Tjahyadi
Study Program of Informatics, Faculty of Computer Science, Universitas Pelita Harapan
Jakarta, Indonesia
Email: hendra.tjahyadi@uph.edu
1. INTRODUCTION
Financial statement misstatements may arise from either fraud or error, as stated by the International
Federation of Accountants [1]. It is the auditor’s responsibility to provide reasonable assurance that the
financial statements are free from material misstatement. Misleading financial statements can incur significant
costs, especially for investors, regulators, and society at large, as demonstrated in the Enron scandal—one of
the most notable audit and accounting scandals in history and literature [2]–[4]. It began when Enron shocked
the public by reporting a $638 million loss. This case implicated its auditor, Arthur Andersen, which failed to
detect the misstatement and engaged in document shredding related to Enron audits. This highlights the
difficulty in detecting accounting misstatements.
Detecting accounting misstatements can be challenging due to several reasons such as the complexity
of financial transactions, sophisticated fraud schemes, vast amount of data, and human error and bias. These
challenges underscore the need for innovative approaches such as data analytics and machine learning in auditing
[3]–[5]. These technologies offer the potential to enhance audit effectiveness, improve risk assessment, and
mitigate the impact of human limitations on audit quality.
Although data analytics and machine learning are expected to provide superior methods, they
are seldom used in performing audit procedures. It remains relatively unknown whether data analytics and
machine learning are indeed transformational for the audit [6]. A large body of research has addressed
fraudulent financial statement detection, utilizing both supervised and
unsupervised learning. Supervised learning approaches include models such as neural networks [7]–[11],
genetic algorithms [12], decision trees (DT) [8]–[10], [13], Bayesian networks [8], [9], support vector machines
(SVM) [8], [13]–[15], and logistic regression (LR) [16]. Unsupervised learning implementations use algorithms
such as self-organizing maps [17], [18] and k-means clustering [17]. One significant obstacle in machine
learning is the imbalanced data challenge, where unequal class representation leads to inaccurate detection,
with majority classes overshadowing minorities. Publicly available financial statements often exhibit severe
imbalance due to the rarity of fraudulent instances compared to non-fraudulent ones. Therefore, it is crucial for
models to address this imbalance.
This research aims to develop a model for predicting fraudulent financial statements from real public
datasets and to tackle the imbalanced data issue. Three neural network models with different types of inputs,
namely raw financial data, financial ratios data, and a combination thereof, combined with synthetic minority
over-sampling technique (SMOTE), are proposed in this study. The rest of this paper is organized as follows:
in section 2, we outline previous efforts by researchers to detect fraudulence in financial statements, both using
commonly balanced simulation data and imbalanced real data. Section 3 details the method we propose,
utilizing a combination of neural networks and SMOTE to detect fraudulence in highly imbalanced real data.
In section 4, the experimental results are presented and compared with those of previous researchers. Finally,
in section 5, we conclude with a summary of our findings.
2. LITERATURE REVIEW
The existing literature has focused considerable attention on financial data as crucial indicators of
fraud, encompassing both raw financial data and financial ratios. As fundamental components of financial
statements, financial data have the potential to indicate fraud risk. For example, a liquidity ratio derived directly
from raw financial data could serve as an effective measure of a company's financial pressure. This underscores
the superiority of certain ratios over others [19], particularly those financial data points closely linked to the
fraud triangle theory. By leveraging financial data, several approaches utilizing machine learning and data
mining to detect fraudulent financial statements are found in the literature.
Green and Choi [7] demonstrated the potential of neural network applications in fraud investigation
and utilized it as a detection tool, employing 172 samples, with 86 samples each for fraudulent and
non-fraudulent cases. The model achieved an accuracy rate of 74%. Kotsiantis et al. [8] conducted experiments
on DT, artificial neural network (ANNs), Bayesian networks, rule learners, nearest neighbors, and SVM. This
study demonstrated that DT outperformed other models with 91.2% accuracy using a balanced dataset of 164
Greek companies listed on the Athens stock exchange, comprising 41 fraudulent and 123 non-fraudulent cases.
In similar work, Kirkos et al. [9] conducted experiments using DT, neural networks, and Bayesian belief
networks, revealing that Bayesian belief networks outperformed others with 90.3% accuracy using a balanced
dataset of 76 Greek manufacturing companies, including 38 fraudulent and 38 non-fraudulent cases.
Cecchini et al. [14] using SVM, accurately identified 80% of fraudulent cases and 90.6% of
non-fraudulent cases from a dataset comprising 6,427 non-fraudulent and 205 fraudulent samples. This study
was considered a pioneering work in the field. Dechow et al. [16] presented an alternative method using LR
with financial ratios to detect fraudulent financial statements, signaling the likelihood of misstatement.
Perols [10], with a larger dataset of 15,934 non-fraudulent and 51 fraudulent cases, demonstrated that LR and
SVM outperformed neural networks, bagging, C4.5, and stacking algorithms. These findings were consistent with
those of Yao et al. [20], which showed that SVM had the highest accuracy among various classification methods.
Randhawa et al. [21] investigated the effectiveness of single and hybrid methods, employing
under-sampling to detect credit card fraud. Their study revealed that combining AdaBoost and majority voting
methods yielded the best results. Bao et al. [15] extended this research by using a large public dataset and
compared the results of re-implementing the models proposed in [14], [16] with a new state-of-the-art model
using RUSBoost. The proposed method outperformed the previous models with an area under curve (AUC) of
72.5% and sensitivity and precision of 4.88% and 4.48%, respectively. Hoang et al. [22] employed XGBoost
and f-XGBoost on the dataset used in [15], obtaining AUC scores of 68.9% and 69.3%, sensitivity of 3.56% and
5.00%, and precision of 3.36% and 4.22%, respectively. Ashtiani and Raahemi [3] found that a single
model outperformed both ensemble and hybrid approaches. They highlighted the approach of Temponeras et al. [23],
who employed a deep dense multilayer perceptron and achieved an accuracy of 93.7% using a dataset of 164 Greek
companies. Craja et al. [24], using a text mining approach to detect fraudulent financial statements from annual
reports, demonstrated the effectiveness of and preference for ANNs, emphasizing their ability to capture complex
relationships among variables. Inspired by the effectiveness of neural networks, this study proposes an approach
combining neural networks and SMOTE to detect fraudulent financial statements in an imbalanced dataset.
3. METHOD
The detection model for fraudulent financial statement proposed in this study utilizes a combination
of neural networks and SMOTE. Initially, a severely imbalanced public dataset containing real financial
statements is acquired. Subsequently, the dataset undergoes preprocessing to address the class imbalance
using SMOTE, which generates synthetic samples for the minority class. The preprocessed data is then used
for training and experimentation on the proposed network models. Finally, the results obtained from employing
neural networks are compared to those achieved by the state-of-the-art algorithm proposed in previous
literature. The overall process workflow is illustrated in Figure 1.
Figure 1. Proposed fraud detection workflow
3.1. Data and variables
The dataset is retrieved from previous research by Bao et al. [15], comprising 146,045 records
collected from the COMPUSTAT database, covering all publicly listed U.S. firms from 1990 to 2014. It
includes 42 features, consisting of 28 raw financial data items derived from research by Cecchini et al. [14] and 14
financial ratios as researched by Dechow et al. [16]. Following the previous study, the training dataset for each
test year spans from 1991 up to two years before that test year, leaving a two-year gap.
Serial fraud, defined as fraudulent cases spanning multiple years, is present in the dataset. The impact
of serial fraud is that it can inflate the model's performance, as the same fraud case may be included in both the
train and test data [15]. Therefore, to prevent overstated results and benchmark against previous literature, the
dataset is preprocessed by recoding all serial fraud as non-fraudulent.
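A minimal sketch of this preprocessing step in pandas is given below. The firm identifier, fiscal-year, and label column names (gvkey, fyear, misstate) are assumptions, and the treatment of the two-year gap and of serial fraud reflects one possible reading of the description above rather than the exact implementation used in this study.

import pandas as pd

def prepare_split(df: pd.DataFrame, test_start: int, test_end: int, gap: int = 2):
    """Year-based train/test split with a gap, plus recoding of serial fraud."""
    df = df.copy()
    # One interpretation of recoding serial fraud: keep each firm's first
    # fraudulent year and relabel its later fraud years as non-fraudulent,
    # so the same multi-year fraud case cannot appear in both train and test.
    fraud_rows = df["misstate"] == 1
    first_fraud = df.loc[fraud_rows].groupby("gvkey")["fyear"].transform("min")
    serial = fraud_rows & (df["fyear"] > first_fraud.reindex(df.index))
    df.loc[serial, "misstate"] = 0
    # Training data runs from 1991 up to `gap` years before the first test year.
    train = df[(df["fyear"] >= 1991) & (df["fyear"] <= test_start - gap)]
    test = df[(df["fyear"] >= test_start) & (df["fyear"] <= test_end)]
    return train, test

# Example: test period 2003-2008 with a two-year gap (training 1991-2001).
# train_df, test_df = prepare_split(data, test_start=2003, test_end=2008)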
To address the severe imbalance between fraudulent and non-fraudulent cases in the dataset, we
employ a minority oversampling technique called SMOTE [25]. This technique is necessary to address the
challenge where the minority class is often neglected; for example, fraudulent cases represent only 0.67% of
the population, which is the class of interest. SMOTE generates new synthetic samples by interpolating between
each minority-class point and its nearest minority-class neighbors. It proves to be an effective method for addressing existing
imbalance cases and improving classification performance [26].
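A minimal sketch of the oversampling step, using the SMOTE implementation from the imbalanced-learn package, is shown below; the sampling ratio and neighbor count are assumptions, as the paper does not report them, and the sketch applies oversampling only to the training data so that evaluation still reflects the original class distribution.

from imblearn.over_sampling import SMOTE

# X_train: 2-D array of fraud predictors (28, 14, or 42 columns);
# y_train: binary labels (1 = fraudulent, 0 = non-fraudulent).
# With only ~0.67% positive cases, SMOTE synthesizes new minority samples by
# interpolating between each minority point and its nearest minority neighbors.
smote = SMOTE(sampling_strategy=1.0, k_neighbors=5, random_state=42)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)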
3.2. Proposed artificial neural networks
This study experimented with the utilization of ANN, specifically feedforward networks, to detect
fraudulent financial statements from a severely imbalanced dataset. The experiments involved three models or
networks, each representing different fraud predictors as the input layer. The first network used the 28 raw financial
data items derived from the fraud predictors of Cecchini et al. [14] as its input layer. The second network used the 14
financial ratios derived from the fraud predictors of Dechow et al. [16]. Lastly, the third network
used the combined set of predictors from Cecchini and Dechow as its input layer.
The overall architecture of the proposed networks is illustrated in Figure 2. The architecture of the
three networks comprises an input layer followed by three hidden layers and an output layer. The size of the
input layer depends on the fraud-predictor scenario: 28, 14, or 42 inputs, respectively. Inspired by [23]–[25], the first and second hidden layers consist of
fully connected layers with LeakyReLU (alpha of 0.05) as the activation function to address complex patterns
and relationships of the fraud predictors and handle non-linearity issues. This is followed by L2 regularization
with a coefficient of 0.005 to add a penalty term to the network to avoid overfitting issues [27]. The Adam
optimizer is chosen for its capabilities of efficient computation, minimal memory requirements, and suitability
for large datasets [28]. Additionally, a dropout layer is added with a rate of 0.7 to randomly drop out neurons
in an attempt to prevent overfitting [29]. Finally, an output layer with a sigmoid function is added to perform
binary classification tasks.
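The description above maps onto a compact Keras feedforward model. The sketch below follows the stated hyperparameters (LeakyReLU with a slope of 0.05, an L2 coefficient of 0.005, a dropout rate of 0.7, the Adam optimizer, and a sigmoid output), while the hidden-layer widths and the exact composition of the third hidden layer are assumptions, as they are not reported in the paper.

from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_network(n_inputs: int, hidden_units: int = 64) -> keras.Model:
    """Feedforward fraud classifier; n_inputs is 28, 14, or 42 depending on the scenario."""
    model = keras.Sequential([
        keras.Input(shape=(n_inputs,)),
        # First and second hidden layers: dense + LeakyReLU with an L2 penalty.
        layers.Dense(hidden_units, kernel_regularizer=regularizers.l2(0.005)),
        layers.LeakyReLU(0.05),
        layers.Dense(hidden_units, kernel_regularizer=regularizers.l2(0.005)),
        layers.LeakyReLU(0.05),
        # Third hidden stage (assumed): another dense layer followed by dropout.
        layers.Dense(hidden_units, kernel_regularizer=regularizers.l2(0.005)),
        layers.LeakyReLU(0.05),
        layers.Dropout(0.7),
        # Sigmoid output for binary fraud / non-fraud classification.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(name="auc")])
    return model

# Example: the 28-input network trained on the SMOTE-resampled training data.
# model = build_network(28)
# model.fit(X_resampled, y_resampled, epochs=50, batch_size=256)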
3.3. Performance evaluation
The performance of the proposed model is evaluated using three metrics: AUC, sensitivity, and
precision. AUC is a metric used to evaluate the performance of binary classification [30]. It is employed to assess
the accuracy of the proposed model due to the imbalance in the dataset, where the occurrence of fraudulent
samples is not adequately captured in standard accuracy metrics [31]. Therefore, the AUC score provides a more
representative measure of accuracy in this context compared to commonly used accuracy scores.
Following previous works [15], sensitivity and precision are measured on the top 1% of observations ranked by
the model's decision value. This choice is driven by practical considerations, as
regulators may not be able to observe all companies predicted as fraudulent due to resource constraints.
Additionally, this decision is influenced by the results of leading research by Cecchini et al. [14], which
reported a high number of false positives in their SVM performance, correctly classifying 80% of fraud cases
and 90.6% of non-fraud cases. Therefore, to mitigate the allocation of excessive resources toward many false
positives, the focus is on the top 1% of observations.
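A sketch of this evaluation protocol is given below, assuming the model outputs a fraud probability for each test observation; AUC is computed over the full test set, while sensitivity and precision are computed only on the top 1% of observations ranked by the decision value, following one reading of the procedure in [15].

import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate(y_true: np.ndarray, y_score: np.ndarray, top_frac: float = 0.01):
    """AUC over all observations; sensitivity and precision over the top-ranked 1%."""
    auc = roc_auc_score(y_true, y_score)
    # Flag only the top 1% of observations by predicted fraud probability.
    k = max(1, int(np.ceil(top_frac * len(y_score))))
    top_idx = np.argsort(y_score)[::-1][:k]
    y_pred = np.zeros_like(y_true)
    y_pred[top_idx] = 1
    tp = np.sum((y_pred == 1) & (y_true == 1))
    sensitivity = tp / max(1, np.sum(y_true == 1))  # share of all fraud cases caught
    precision = tp / max(1, np.sum(y_pred == 1))    # share of flagged cases that are fraud
    return auc, sensitivity, precision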
Figure 2. The architecture of the proposed networks
4. RESULTS AND DISCUSSION
This study proposed an approach for detecting fraudulent financial statements by combining neural
networks and SMOTE. While earlier literature has explored and demonstrated the capability of various
algorithms to detect fraudulent financial statements, there is still room for improvement, specifically in terms of
accuracy. Therefore, developing a model with better accuracy is important, especially a model that can be
relied upon for practical adoption.
Three networks with different fraud predictors as the input layer are employed. The results obtained are
summarized in Table 1. In the first network, employing raw financial data as the fraud predictor, the proposed
network achieved an AUC of 0.706, with a sensitivity of 50% and a precision of 1.39%. The second proposed network
achieved an AUC of 0.693, a sensitivity of 67%, and a precision of 2.89% by employing financial ratios as the
fraud predictor. Lastly, the third network employing both raw financial data and financial ratios as fraud
predictor resulted in AUC of 0.672, followed by sensitivity of 83% and precision of 1.91%.
Table 1. Summary of comparison (test period 2003–2008)
Fraud predictor AUC Sensitivity (%) Precision (%)
28 raw financial data 0.706 50 1.39
14 financial ratios 0.693 67 2.89
Both 0.672 83 1.91
As presented in Table 1, the proposed neural network can detect fraudulent financial
statements, with raw financial data being the best fraud predictor as measured by AUC. These results
demonstrate that combining raw financial data and financial ratios as the fraud predictor does not yield a
higher accuracy, measured by AUC score. From this experimentation, the highest precision is obtained through
employing financial ratios as fraud predictor, whereas the highest sensitivity is obtained through employing
both raw financial data and financial ratios as fraud predictor.
Table 2 provides results obtained from the proposed network for the test period of 2003–2008 in
comparison with previous literature. These results demonstrate that the proposed network has an AUC
score and precision comparable to results obtained in previous literature with SVM, LR, RUSBoost, XGBoost,
and f-XGBoost. In contrast, the proposed network demonstrates a
superior sensitivity score, indicating that the model is able to identify fraud without producing a high number
of false negatives, which would translate to undetected fraud. Hence, this supports the model's reliability
and practical utility for adoption. To show the robustness of the proposed network, three
alternative test periods are added. Consistent with previous studies [15], [22], the additional test periods are
2003–2005, 2003–2011, and 2003–2014. The performance metrics for these additional
test periods are presented in Tables 3 to 5.
Table 2. Summary of comparison with previous study (test period 2003–2008)
Fraud predictor Model AUC Sensitivity (%) Precision (%) Reference
28 raw financial data SVM 0.626 2.53 1.92 [15]
Logistic 0.690 0.73 0.85 [15]
RUSBoost 0.725 4.88 4.48 [15]
XGBoost 0.689 3.56 3.36 [22]
f-XGBoost 0.693 5.00 4.22 [22]
ANN 0.706 50 1.39 This study
14 financial ratios Logistic 0.672 3.99 2.63 [15]
ANN 0.693 67 2.89 This study
Both RUSBoost 0.696 3.19 2.54 [15]
ANN 0.672 83 1.91 This study
Table 3. Summary of comparison with previous study (test period 2003–2005)
Fraud predictor Model AUC Sensitivity (%) Precision (%) Reference
28 raw financial data SVM 0.637 2.28 2.53 [15]
Logistic 0.685 1.45 1.69 [15]
RUSBoost 0.753 7.64 7.83 [15]
f-XGBoost 0.691 6.59 6.71 [22]
ANN 0.694 100 2.79 This study
14 financial ratios Logistic 0.649 1.37 1.29 [15]
ANN 0.667 67 2.53 This study
Both ANN 0.656 100 2.52 This study
Table 4. Summary of comparison with previous study (test period 2003–2011)
Fraud predictor Model AUC Sensitivity (%) Precision (%) Reference
28 raw financial data SVM 0.647 3.07 1.98 [15]
Logistic 0.702 1.87 1.19 [15]
RUSBoost 0.710 4.40 3.60 [15]
f-XGBoost 0.678 3.69 3.02 [22]
ANN 0.720 56 1.34 This study
14 financial ratios Logistic 0.672 3.49 2.23 [15]
ANN 0.685 67 2.40 This study
Both ANN 0.693 89 2.45 This study
Table 5. Summary of comparison with previous study (test period 2003–2014)
Fraud predictor Model AUC Sensitivity (%) Precision (%) Reference
28 raw financial data SVM 0.628 2.30 1.48 [15]
Logistic 0.709 1.84 1.04 [15]
RUSBoost 0.717 3.30 2.70 [15]
f-XGBoost 0.678 2.77 2.26 [22]
ANN 0.718 50 1.15 This study
14 financial ratios Logistic 0.702 3.45 1.86 [15]
ANN 0.694 58 1.99 This study
Both ANN 0.686 75 2.02 This study
The results obtained are compared with previous literature and summarized in Figure 3. As shown in
Figure 3, the SVM model demonstrates fluctuations in performance, while both RUSBoost and XGBoost
demonstrate a performance decline when the test period is extended. These results accord with
Hoang et al. [22], who assume that undetected fraud grows over time, making a longer test period less
reliable. However, in contrast to the SVM, RUSBoost, and XGBoost models, the proposed network and the logistic
models show a slight performance improvement as the test period is extended. This demonstrates the
robustness of both models, which are expected to perform stably when tested with new, unseen data.
Employing raw financial data as the fraud predictor, the proposed network achieved the highest AUC among
the compared models in the scenario using the full test set of 2003–2014, as shown in Table 5, scoring an AUC of 0.718 with a
precision of 1.15% and a sensitivity of 50%. Considering the stability of AUC as an indicator of robustness, the
proposed network scored AUCs of 0.694 and 0.720 in the test periods 2003–2005 and 2003–2011, as shown in
Tables 3 and 4, respectively. This shows that expanding the dataset improves the performance of the proposed
network. Then, in the scenario using the period 2003–2014, the AUC score dropped to 0.718, slightly lower
than in the previous scenario of 2003–2011. This indicates that while expanding the dataset improves performance,
there may be diminishing returns beyond a certain period length. Consistent with the results of Bao et al. [15],
this study demonstrates that, for the same model or network, using the 28 raw
financial data items derived from [14] leads to better results than using the other fraud predictor set,
the 14 financial ratios derived from [16].
The results of this study show that combining a neural network and SMOTE can detect fraudulent financial
statements in a severely imbalanced dataset using raw financial data, financial ratios, or both combined as the
fraud predictor. While the proposed network demonstrated promising utility, it is important to acknowledge
that the dataset consists of historical data coming from specific demographics and time periods. This
may limit generalizability and hence require further calibration or updates to maintain the model's
effectiveness in the current dynamic environment.
Figure 3. Summary of AUC scores over the additional test periods
5. CONCLUSION
This paper introduces a neural network designed to detect fraudulent financial statements within an
imbalanced dataset, addressing the severe imbalance issue through the utilization of SMOTE. Our experimental
results indicate that the model achieves useful detection capability, with an AUC score of 70.6%, a sensitivity rate
of 83%, and a precision rate of 2.89%. This study contributes significantly by advocating for the integration of
ANN in auditing practices, particularly during the initial audit phase, such as risk assessment procedures. The
proposed model's high sensitivity rate underscores its superiority over similar models, offering practical utility
for auditors and regulators by minimizing the risk of false negatives. However, limitations exist, including the
reliance solely on numerical financial data extracted from financial statements. Future research avenues could
explore the combination of non-financial data and the application of unsupervised learning to address
mislabeling issues, potentially through the implementation of generative artificial intelligence to generate
fraudulent data for training purposes or describing fraud characteristics.
REFERENCES
[1] “International standard on auditing 240: the auditor’s responsibilities relating to fraud in an audit of financial statements,” IFAC, 2013.
Accessed: Dec. 27, 2023. [Online]. Available: https://www.ifac.org/_flysystem/azure-private/publications/files/A012 2013 IAASB
Handbook ISA 240.pdf.
[2] Y. J. Chen, W. C. Liou, Y. M. Chen, and J. H. Wu, “Fraud detection for financial statements of business groups,” International
Journal of Accounting Information Systems, vol. 32, pp. 1–23, 2019, doi: 10.1016/j.accinf.2018.11.004.
[3] M. N. Ashtiani and B. Raahemi, “Intelligent fraud detection in financial statements using machine learning and data mining: a
systematic literature review,” IEEE Access, vol. 10, pp. 72504–72525, 2022, doi: 10.1109/ACCESS.2021.3096799.
[4] W. Xiuguo and D. Shengyong, “An analysis on financial statement fraud detection for Chinese listed companies using deep
learning,” IEEE Access, vol. 10, pp. 22516–22532, 2022, doi: 10.1109/ACCESS.2022.3153478.
[5] D. Botez, “Recent challenge for auditors: using data analytics in the audit of the financial statements,” Brain-broad Research in
Artificial Intelligence and Neuroscience, vol. 9, no. 4, pp. 61–72, 2018.
[6] G. Salijeni, A. S. -Taddei, and S. Turley, “Big data and changes in audit technology: contemplating a research agenda,” Accounting
and Business Research, vol. 49, no. 1, pp. 95–119, 2019, doi: 10.1080/00014788.2018.1459458.
[7] B. P. Green and J. H. Choi, “Assessing the risk of management fraud through neural network technology,” Auditing, vol. 16, no. 1,
pp. 25–28, 1997.
[8] S. Kotsiantis, E. Koumanakos, D. Tzelepis, and V. Tampakas, “Forecasting fraudulent financial statements using data mining,”
International Journal of Computational Intelligence, vol. 3, no. 2, pp. 104–110, 2006.
[9] E. Kirkos, C. Spathis, and Y. Manolopoulos, “Data mining techniques for the detection of fraudulent financial statements,” Expert
Systems with Applications, vol. 32, no. 4, pp. 995–1003, 2007, doi: 10.1016/j.eswa.2006.02.016.
[10] J. Perols, “Financial statement fraud detection: an analysis of statistical and machine learning algorithms,” Auditing, vol. 30, no. 2,
pp. 19–50, 2011, doi: 10.2308/ajpt-50009.
[11] C. L. Jan, “Detection of financial statement fraud using deep learning for sustainable development of capital markets under
information asymmetry,” Sustainability, vol. 13, no. 17, pp. 9879–9898, 2021, doi: 10.3390/su13179879.
[12] T. Kiehl, B. Hoogs, L. Christina, and S. Deniz, “Evolving multi-variate time-series patterns for the discrimination of fraudulent
financial filings,” Genetic and Evolutionary Computation Conference, pp. 1-8, 2005.
[13] J. Bertomeu, E. Cheynel, E. Floyd, and W. Pan, “Using machine learning to detect misstatements,” Review of Accounting Studies,
vol. 26, no. 2, pp. 468–519, 2021, doi: 10.1007/s11142-020-09563-8.
[14] M. Cecchini, H. Aytug, G. J. Koehler, and P. Pathak, “Detecting management fraud in public companies,” Management Science,
vol. 56, no. 7, pp. 1146–1160, 2010, doi: 10.1287/mnsc.1100.1174.
[15] Y. Bao, B. Ke, B. Li, Y. J. Yu, and J. Zhang, “Detecting accounting fraud in publicly traded U.S. firms using a machine learning
approach,” Journal of Accounting Research, vol. 58, no. 1, pp. 199–235, 2020, doi: 10.1111/1475-679X.12292.
[16] P. M. Dechow, W. Ge, C. R. Larson, and R. G. Sloan, “Predicting material accounting misstatements,” Contemporary
Accounting Research, vol. 28, no. 1, pp. 17-82, 2011, doi: 10.1111/j.1911-3846.2010.01041.x.
[17] Q. Deng and G. Mei, “Combining self-organizing map and k-means clustering for detecting fraudulent financial statements,” in
2009 IEEE International Conference on Granular Computing, GRC, 2009, pp. 126–131, doi: 10.1109/GRC.2009.5255148.
[18] S. Y. Huang, R. H. Tsaih, and W. Y. Lin, “Unsupervised neural networks approach for understanding fraudulent financial reporting,”
Industrial Management and Data Systems, vol. 112, no. 2, pp. 224–244, 2012, doi: 10.1108/02635571211204272.
[19] T. R. Izzalqurny, B. Subroto, and A. Ghofar, “Relationship between financial ratio and financial statement fraud risk moderated by auditor
quality,” International Journal of Research in Business and Social Science, vol. 8, no. 4, pp. 34–43, 2019, doi: 10.20525/ijrbs.v8i4.281.
[20] J. Yao, Y. Pan, S. Yang, Y. Chen, and Y. Li, “Detecting fraudulent financial statements for the sustainable development of the
socio-economy in China: a multi-analytic approach,” Sustainability, vol. 11, no. 6, 2019, doi: 10.3390/su11061579.
[21] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim, and A. K. Nandi, “Credit card fraud detection using adaBoost and majority voting,”
IEEE Access, vol. 6, pp. 14277–14284, 2018, doi: 10.1109/ACCESS.2018.2806420.
[22] M. N. Hoang, H. T. L. Nguyen, and H. N. Viet, “A model for detecting accounting frauds by using machine learning,” in The Annual
Hawaii International Conference on System Sciences, 2022, vol. 2022, pp. 1552–1561, doi: 10.24251/hicss.2022.193.
[23] G. S. Temponeras, S. A. N. Alexandropoulos, S. B. Kotsiantis, and M. N. Vrahatis, “Financial fraudulent statements detection
through a deep dense artificial neural network,” in 10th International Conference on Information, Intelligence, Systems and
Applications, IISA 2019, 2019, pp. 1–5, doi: 10.1109/IISA.2019.8900741.
[24] P. Craja, A. Kim, and S. Lessmann, “Deep learning for detecting financial statement fraud,” Decision Support Systems, vol. 139,
2020, doi: 10.1016/j.dss.2020.113421.
[25] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal
of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002, doi: 10.1613/jair.953.
[26] D. Elreedy and A. F. Atiya, “A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class
imbalance,” Information Sciences, vol. 505, pp. 32–64, 2019, doi: 10.1016/j.ins.2019.07.070.
[27] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, Cambridge, Massachusetts: MIT Press, 2016.
[28] D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” arXiv-Computer Science, pp. 1-15, 2017, doi:
10.48550/arXiv.1412.6980.
[29] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[30] T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006, doi:
10.1016/j.patrec.2005.10.010.
[31] N. Japkowicz, “Assessment metrics for imbalanced learning,” Imbalanced Learning: Foundations, Algorithms, and Applications,
pp. 187–206, 2013, doi: 10.1002/9781118646106.ch8.
BIOGRAPHIES OF AUTHORS
Yosua Efraim Young is a graduate candidate in the Informatics Graduate Program
at Universitas Pelita Harapan. He earned his Bachelor's Degree in Accounting from
Universitas Pelita Harapan in Indonesia. He can be contacted at email:
youngyosua911@gmail.com.
Hendra Tjahyadi is an Associate Professor in the Informatics Study Program at
Universitas Pelita Harapan. He earned his Bachelor's Degree in Electrical Engineering from
Universitas Kristen Maranatha, Master’s Degree in Instrumentation and Control from Institut
Teknologi Bandung, and Ph.D. in Control Engineering from School of Engineering, Flinders
University. His research interests are in adaptive control, signal processing, and artificial
intelligence. He can be contacted at email: hendra.tjahyadi@uph.edu.

More Related Content

PDF
A rule-based machine learning model for financial fraud detection
PDF
A benchmark of health insurance fraud detection using machine learning techni...
PDF
Detecting Fraud Using Transaction Frequency Data
PDF
IRJET- A Comparative Study to Detect Fraud Financial Statement using Data Min...
PDF
Welcome to International Journal of Engineering Research and Development (IJERD)
PDF
A novel ensemble model for detecting fake news
PDF
Development of a Web Application for Fake News Classification using Machine l...
PDF
A Comparative Study on Credit Card Fraud Detection
A rule-based machine learning model for financial fraud detection
A benchmark of health insurance fraud detection using machine learning techni...
Detecting Fraud Using Transaction Frequency Data
IRJET- A Comparative Study to Detect Fraud Financial Statement using Data Min...
Welcome to International Journal of Engineering Research and Development (IJERD)
A novel ensemble model for detecting fake news
Development of a Web Application for Fake News Classification using Machine l...
A Comparative Study on Credit Card Fraud Detection

Similar to Detecting fraudulent financial statement under imbalanced data using neural network (20)

PDF
MapReduce-iterative support vector machine classifier: novel fraud detection...
PDF
Fake News Detection Using Machine Learning
PDF
CREDIT CARD FRAUD DETECTION AND AUTHENTICATION SYSTEM USING MACHINE LEARNING
PDF
PDF
Concept drift and machine learning model for detecting fraudulent transaction...
PDF
Phishing Websites Detection Using Back Propagation Algorithm: A Review
PDF
Detecting outliers and anomalies in data streams
PDF
Comparative Study of Classification Method on Customer Candidate Data to Pred...
PDF
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
DOCX
A Distributed Knowledge Distillation Framework for Financial Fraud Detection ...
PDF
Inspection of Certain RNN-ELM Algorithms for Societal Applications
PDF
CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING
PDF
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION
PDF
Fraud detection in electric power distribution networks using an ann based kn...
PDF
Online Transaction Fraud Detection using Hidden Markov Model & Behavior Analysis
PDF
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
PDF
Enhancing Time Series Anomaly Detection: A Hybrid Model Fusion Approach
PDF
Fake accounts detection on social media using stack ensemble system
PDF
Computer aided audit techniques and fraud detection
PDF
A predictive model for mapping crime using big data analytics
MapReduce-iterative support vector machine classifier: novel fraud detection...
Fake News Detection Using Machine Learning
CREDIT CARD FRAUD DETECTION AND AUTHENTICATION SYSTEM USING MACHINE LEARNING
Concept drift and machine learning model for detecting fraudulent transaction...
Phishing Websites Detection Using Back Propagation Algorithm: A Review
Detecting outliers and anomalies in data streams
Comparative Study of Classification Method on Customer Candidate Data to Pred...
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
A Distributed Knowledge Distillation Framework for Financial Fraud Detection ...
Inspection of Certain RNN-ELM Algorithms for Societal Applications
CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION
Fraud detection in electric power distribution networks using an ann based kn...
Online Transaction Fraud Detection using Hidden Markov Model & Behavior Analysis
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
Enhancing Time Series Anomaly Detection: A Hybrid Model Fusion Approach
Fake accounts detection on social media using stack ensemble system
Computer aided audit techniques and fraud detection
A predictive model for mapping crime using big data analytics
Ad

More from IAESIJAI (20)

PDF
Hybrid model detection and classification of lung cancer
PDF
Adaptive kernel integration in visual geometry group 16 for enhanced classifi...
PDF
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
PDF
Enhancing fall detection and classification using Jarratt‐butterfly optimizat...
PDF
Deep ensemble learning with uncertainty aware prediction ranking for cervical...
PDF
Event detection in soccer matches through audio classification using transfer...
PDF
Detecting road damage utilizing retinaNet and mobileNet models on edge devices
PDF
Optimizing deep learning models from multi-objective perspective via Bayesian...
PDF
Squeeze-excitation half U-Net and synthetic minority oversampling technique o...
PDF
A novel scalable deep ensemble learning framework for big data classification...
PDF
Exploring DenseNet architectures with particle swarm optimization: efficient ...
PDF
A transfer learning-based deep neural network for tomato plant disease classi...
PDF
U-Net for wheel rim contour detection in robotic deburring
PDF
Deep learning-based classifier for geometric dimensioning and tolerancing sym...
PDF
Enhancing fire detection capabilities: Leveraging you only look once for swif...
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PDF
Depression detection through transformers-based emotion recognition in multiv...
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
Enhancing financial cybersecurity via advanced machine learning: analysis, co...
PDF
Crop classification using object-oriented method and Google Earth Engine
Hybrid model detection and classification of lung cancer
Adaptive kernel integration in visual geometry group 16 for enhanced classifi...
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
Enhancing fall detection and classification using Jarratt‐butterfly optimizat...
Deep ensemble learning with uncertainty aware prediction ranking for cervical...
Event detection in soccer matches through audio classification using transfer...
Detecting road damage utilizing retinaNet and mobileNet models on edge devices
Optimizing deep learning models from multi-objective perspective via Bayesian...
Squeeze-excitation half U-Net and synthetic minority oversampling technique o...
A novel scalable deep ensemble learning framework for big data classification...
Exploring DenseNet architectures with particle swarm optimization: efficient ...
A transfer learning-based deep neural network for tomato plant disease classi...
U-Net for wheel rim contour detection in robotic deburring
Deep learning-based classifier for geometric dimensioning and tolerancing sym...
Enhancing fire detection capabilities: Leveraging you only look once for swif...
Accuracy of neural networks in brain wave diagnosis of schizophrenia
Depression detection through transformers-based emotion recognition in multiv...
A comparative analysis of optical character recognition models for extracting...
Enhancing financial cybersecurity via advanced machine learning: analysis, co...
Crop classification using object-oriented method and Google Earth Engine
Ad

Recently uploaded (20)

PDF
Modernizing your data center with Dell and AMD
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Empathic Computing: Creating Shared Understanding
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
Big Data Technologies - Introduction.pptx
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
DOCX
The AUB Centre for AI in Media Proposal.docx
Modernizing your data center with Dell and AMD
Diabetes mellitus diagnosis method based random forest with bat algorithm
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Agricultural_Statistics_at_a_Glance_2022_0.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Empathic Computing: Creating Shared Understanding
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
Big Data Technologies - Introduction.pptx
Spectral efficient network and resource selection model in 5G networks
Unlocking AI with Model Context Protocol (MCP)
Per capita expenditure prediction using model stacking based on satellite ima...
Review of recent advances in non-invasive hemoglobin estimation
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Advanced methodologies resolving dimensionality complications for autism neur...
NewMind AI Monthly Chronicles - July 2025
Mobile App Security Testing_ A Comprehensive Guide.pdf
The Rise and Fall of 3GPP – Time for a Sabbatical?
The AUB Centre for AI in Media Proposal.docx

Detecting fraudulent financial statement under imbalanced data using neural network

  • 1. IAES International Journal of Artificial Intelligence (IJ-AI) Vol. 13, No. 4, December 2024, pp. 4106~4112 ISSN: 2252-8938, DOI: 10.11591/ijai.v13.i4.pp4106-4112  4106 Journal homepage: http://ijai.iaescore.com Detecting fraudulent financial statement under imbalanced data using neural network Hendra Tjahyadi, Yosua Efraim Young Study Program of Informatics, Faculty of Computer Science, Universitas Pelita Harapan, Jakarta, Indonesia Article Info ABSTRACT Article history: Received Dec 29, 2023 Revised Apr 19, 2024 Accepted Jun 8, 2024 In this paper a novel approach for detecting fraudulent financial statements by employing a combination of neural networks and synthetic minority over- sampling technique (SMOTE) is introduced. This approach is designed to tackle the problem of imbalanced datasets prevalent in fraudulent cases, which if left unaddressed will hinder the model to accurately identify fraud. Three neural network models, each representing different fraud predictors as the input layer: 28 inputs raw financial data; 14 inputs financial ratios data; and 42 inputs combination both raw financial and financial ratios data are developed. Experimental validation using established research datasets is conducted to assess the performance of the proposed method. Performance metrics, namely area under the curve (AUC), precision, and sensitivity, are used for evaluation, comparing the proposed model against existing benchmark models found in literature. Results indicate that the proposed model achieves an AUC score of 70.6% and a precision score of 2.89%, in comparable to the existing models, with a sensitivity score of 83% outperforming all counterparts. The high sensitivity rate of the proposed model underscores its practical utility for auditors and regulators, as it minimizes the risk of false negatives, thereby enhancing confidence in fraud detection. Keywords: Fraudulent financial statements Machine learning Neural network Supervised learning Synthetic minority over- sampling technique This is an open access article under the CC BY-SA license. Corresponding Author: Hendra Tjahyadi Study Program of Informatics, Faculty of Computer Science, Universitas Pelita Harapan Jakarta, Indonesia Email: hendra.tjahyadi@uph.edu 1. INTRODUCTION Financial statement misstatements may arise from either fraud or error, as stated by the International Federation of Accountants [1]. It is the auditor’s responsibility to provide reasonable assurance that the financial statements are free from material misstatement. Misleading financial statements can incur significant costs, especially for investors, regulators, and society at large, as demonstrated in the Enron scandal—one of the most notable audit and accounting scandals in history and literature [2]–[4]. It began when Enron shocked the public by reporting a $638 million loss. This case implicated its auditor, Arthur Andersen, which failed to detect the misstatement and engaged in document shredding related to Enron audits. This highlights the difficulty in detecting accounting misstatements. Detecting accounting misstatements can be challenging due to several reasons such as the complexity of financial transactions, sophisticated fraud schemes, vast amount of data, and human error and bias. These challenges underscore the need for innovative approach such as data analytics and machine learning in auditing [3]–[5]. 
These technologies offer the potential to enhance audit effectiveness, improve risk assessment, and mitigate the impact of human limitations on audit quality. Although data analytics and machine learning are expected to demonstrate a superior method, they are seldom to use on performing audit procedures. It is relatively unknown whether usage of data analytics and
  • 2. Int J Artif Intell ISSN: 2252-8938  Detecting fraudulent financial statement under imbalanced data using neural network (Yosua Efraim Young) 4107 machine learning are indeed transformational for the audit [6]. Various research has been conducted in searching of fraudulent financial statement detection, including utilization of supervised learning and unsupervised learning. Supervised learning is used including various models such as neural network [7]–[11], genetic algorithm [12], decision tree (DT) [8]–[10], [13], Bayesian network [8], [9], support vector machines (SVM) [8], [13]–[15], and logistic regression (LR) [16]. Unsupervised learning implementation use algorithm such as self-organizing map [17], [18] and k-means clustering [17]. One significant obstacle in machine learning is the imbalanced data challenge, where unequal class representation leads to inaccurate detection, with majority classes overshadowing minorities. Publicly available financial statements often exhibit severe imbalance due to the rarity of fraudulent instances compared to non-fraudulent ones. Therefore, it is crucial for models to address this imbalance. This research aims to develop a model for predicting fraudulent financial statements from real public datasets and to tackle the imbalanced data issue. Three neural networks models with different types of inputs, namely raw financial data, financial ratios data, and a combination thereof, combined with synthetic minority over-sampling technique (SMOTE), are proposed in this study. The rest of this paper is organized as follows: in section 2, we outline previous efforts by researchers to detect fraudulence in financial statement, both using commonly balanced simulation data and imbalanced real data. Section 3 details the method we propose, utilizing a combination of neural networks and SMOTE to detect fraudulence in highly imbalanced real data. In section 4, the experimental results are presented and compared with those of previous researchers. Finally, in section 5, we conclude with a summary of our findings. 2. LITERATURE REVIEW The existing literature has focused considerable attention on financial data as crucial indicators of fraud, encompassing both raw financial data and financial ratios. As fundamental components of financial statements, financial data have the potential to indicate fraud risk. For example, a liquidity ratio derived directly from raw financial data could serve as an effective measure of a company's financial pressure. This underscores the superiority of certain ratios over others [19], particularly those financial data points closely linked to the fraud triangle theory. By leveraging financial data, several approaches utilizing machine learning and data mining to detect fraudulent financial statement are found in the literature. Green and Choi [7] demonstrated the potential of neural network applications in fraud investigation and utilized it as a detection tool, employing 172 samples, with 86 samples for both fraudulent and non-fraudulent cases. The model achieved an accuracy rate of 74%. Kotsiantis et al. [8] conducted experiments on DT, artificial neural network (ANNs), Bayesian networks, rule learners, nearest neighbors, and SVM. This study demonstrated that DT outperformed other models with 91.2% accuracy using a balanced dataset of 164 Greek companies listed on the Athens stock exchange, comprising 41 fraudulent and 123 non-fraudulent cases. In a similar works, Kirkos et al. 
[9] conducted experiments using DT, neural network, and Bayesian belief networks, revealing that Bayesian belief networks outperformed others with 90.3% accuracy using a balanced dataset of 76 Greek manufacturing companies, including 38 fraudulent and 38 non-fraudulent cases. Cecchini et al. [14] using SVM, accurately identified 80% of fraudulent cases and 90.6% of non-fraudulent cases from a dataset comprising 6,427 non-fraudulent and 205 fraudulent samples. This study was considered a pioneering work in the field. Dechow et al. [16] presented an alternative method using LR with financial ratios to detect fraudulent financial statements, signaling the likelihood of misstatement. Perols [10], with a larger dataset of 15,934 non-fraudulent and 51 fraudulent cases, demonstrated that LR and SVM outperformed neural network, bagging, C.45, and stacking algorithm. These findings were consistent with those of Yao et al. [20], which showed that SVM had the highest accuracy among various classification methods. Randhawa et al. [21] investigated the effectiveness of single and hybrid methods, employing under-sampling to detect credit card fraud. Their study revealed that combining AdaBoost and majority voting methods yielded the best results. Bao et al. [15] extended this research by using a large public dataset and compared the results of re-implementing the models proposed in [14], [16] with a new state-of-the-art model using RUSBoost. The proposed method outperformed the previous models with an area under curve (AUC) of 72.5% and sensitivity and precision of 4.88% and 4.48%, respectively. Hoang et al. [22] employed XGBoost and f-XGBoost on the dataset used in [15], resulting in AUC scores of 68.9% and 69.3%, and precision and sensitivity of 3.56%, 5%, and 3.36%, and 4.22%, respectively. Ashtiani and Raahemi [3] found that a single model outperformed both ensemble and hybrid approaches. They highlighted Temponeras et al. [23] approach of employing a deep dense multilayer perceptron, achieving an accuracy of 93.7% using a dataset of 164 Greek companies. Craja et al. [24] using a text mining approach to detect fraudulent financial statements from annual reports, demonstrated the effectiveness and preference for ANN, emphasizing their ability to capture complex relationships among variables. Inspired by the effectiveness of neural networks, this study proposes an approach combining neural networks and SMOTE to detect fraudulent financial statements in an imbalanced dataset.
  • 3.  ISSN: 2252-8938 Int J Artif Intell, Vol. 13, No. 4, December 2024: 4106-4112 4108 3. METHOD The detection model for fraudulent financial statement proposed in this study utilizes a combination of neural networks and SMOTE. Initially, a severely imbalanced public dataset containing real financial statements is acquired. Subsequently, the dataset undergoes preprocessing to address the imbalanced dataset using SMOTE, which generates synthetic samples for the minority class. The preprocessed data is then used for training and experimentation on the proposed network models. Finally, the results obtained from employing neural networks are compared to those achieved by the state-of-the-art algorithm proposed in previous literature. The overall process workflow is illustrated in Figure 1. Figure 1. Proposed fraud detection workflow 3.1. Data and variables The dataset is retrieved from previous research by Bao et al. [15], comprising 146,045 records collected from the COMPUSTAT database, covering all publicly listed U.S. firms from 1990 to 2014. It includes 42 features, consisting of 28 raw financial data derived from research by Cecchini et al. [14] and 14 financial ratios as researched by Dechow et al. [16]. Following the previous study, the training dataset spans from 1991 to the test year, with a two-year gap. Serial fraud, defined as fraudulent cases spanning multiple years, is present in the dataset. The impact of serial fraud is that it can inflate the model's performance, as the same fraud case may be included in both the train and test data [15]. Therefore, to prevent overstated results and benchmark against previous literature, the dataset is preprocessed by recoding all serial fraud as non-fraudulent. To address the severe imbalance between fraudulent and non-fraudulent cases in the dataset, we employ a minority oversampling technique called SMOTE [25]. This technique is necessary to address the challenge where the minority class is often neglected; for example, fraudulent cases represent only 0.67% of the population, which is the focus of our attention. SMOTE generates new synthetic data through an iterative process targeting each point in the minority class. It proves to be an effective method for addressing existing imbalance cases and improving classification performance [26]. 3.2. Proposed artificial neural networks This study experimented with the utilization of ANN, specifically feedforward networks, to detect fraudulent financial statements from a severely imbalanced dataset. The experiments involved three models or networks, each representing different fraud predictors as the input layer. The first network used 28 raw financial data as input layers derived from the fraud predictors of Cecchini et al. [14]. The second network used 14 financial ratios as input layers derived from the fraud predictors of Dechow et al. [16]. Lastly, the third network used a combined approach from Cecchini and Dechow as the input layer. The overall architecture of the proposed networks is illustrated in Figure 2. The architecture of the three networks comprises an input layer followed by three hidden layers and an output layer. The input layer encompasses three different scenarios, representing different fraud predictors, which can be represented by input layers of 28, 14, and 42, respectively. 
Inspired in [23]–[25], the first and second hidden layers consist of fully connected layers with LeakyReLU (alpha of 0.05) as the activation function to address complex patterns and relationships of the fraud predictors and handle non-linearity issues. This is followed by L2 regularization with a coefficient of 0.005 to add a penalty term to the network to avoid overfitting issues [27]. The Adam optimizer is chosen for its capabilities of efficient computation, minimal memory requirements, and suitability for large datasets [28]. Additionally, a dropout layer is added with a rate of 0.7 to randomly drop out neurons in an attempt to prevent overfitting [29]. Finally, an output layer with a sigmoid function is added to perform binary classification tasks. 3.3. Performance evaluation The performance of the proposed model is evaluated using three metrics: AUC, sensitivity, and precision. AUC is a metric used to evaluate the performance of binary classification [30]. It is employed to assess the accuracy of the proposed model due to the imbalance in the dataset, where the occurrence of fraudulent samples is not adequately captured in standard accuracy metrics [31]. Therefore, the AUC score provides a more representative measure of accuracy in this context compared to commonly used accuracy scores.
  • 4. Int J Artif Intell ISSN: 2252-8938  Detecting fraudulent financial statement under imbalanced data using neural network (Yosua Efraim Young) 4109 Following previous works [15], the measurement of sensitivity and precision is based on data from the top 1% of observations from the decision value. This choice is driven by practical considerations, as regulators may not be able to observe all companies predicted as fraudulent due to resource constraints. Additionally, this decision is influenced by the results of leading research by Cecchini et al. [14], which reported a high number of false positives in their SVM performance, correctly classifying 80% of fraud cases and 90.6% of non-fraud cases. Therefore, to mitigate the allocation of excessive resources toward many false positives, the focus is on the top 1% of observations. Figure 2. The architecture of the proposed networks 4. RESULTS AND DISCUSSION This study proposed an approach for detecting fraudulent financial statements by combining neural networks and SMOTE. While earlier literature has explored and demonstrated the capability of various algorithms to detect fraudulent financial statements, there is still enhancement needed specifically in terms of accuracy. Therefore, developing a model with better accuracy is important, especially a model that can be relied upon for practical adoption. Three networks with different fraud predictor as input layer are employed. The results obtained are summarized in Table 1. In the first network, employing raw financial data as fraud predictor, the proposed network scored an AUC score of 0.706, with a sensitivity of 50% and 1.39%. The second proposed network achieved AUC score of 0.693, sensitivity of 67%, and precision of 2.89%, by employing financial ratios as fraud predictor. Lastly, the third network employing both raw financial data and financial ratios as fraud predictor resulted in AUC of 0.672, followed by sensitivity of 83% and precision of 1.91%. Table 1. Summary of comparison (test period 2003–2008) Fraud predictor AUC Sensitivity (%) Precision (%) 28 raw financial data 0.706 50 1.39 14 financial ratios 0.693 67 2.89 Both 0.672 83 1.91 As presented in Table 1, it is found that the proposed neural network can detect fraudulent financial statements, with raw financial data is the best fraud predictor measured by AUC score. These results demonstrated that combining both raw financial data and financial ratios as fraud predictor does not yield to a higher accuracy, measured by AUC score. From this experimentation, the highest precision is obtained through employing financial ratios as fraud predictor, whereas the highest sensitivity is obtained through employing both raw financial data and financial ratios as fraud predictor. Table 2 provides results obtained from the proposed network for the test period of 2003–2008 in comparison with previous literature. These results demonstrate that the proposed network has comparable AUC score and precision in comparison with results obtained from previous literatures by employing algorithms with SVM, LR, RUSBoost, XGBoost, and f-XGBoost. In contrast, the proposed network demonstrates a superior sensitivity score, indicating that the model is able to identify fraud without producing a high number of false negatives, which could translate to undetected fraud. Hence, this demonstrates the model’s assurance by ensuring reliability and practical utility or adoption. 
To show the robustness of the proposed network, three alternative test periods are added. Consistent with previous studies [15], [22], the additional test periods are
2003–2005, 2003–2011, and 2003–2014. The performance metrics for these additional test periods are presented in Tables 3 to 5.

Table 2. Summary of comparison with previous study (test period 2003–2008)
Fraud predictor          Model      AUC     Sensitivity (%)   Precision (%)   Reference
28 raw financial data    SVM        0.626   2.53              1.92            [15]
                         Logistic   0.690   0.73              0.85            [15]
                         RUSBoost   0.725   4.88              4.48            [15]
                         XGBoost    0.689   3.56              3.36            [22]
                         f-XGBoost  0.693   5.00              4.22            [22]
                         ANN        0.706   50                1.39            This study
14 financial ratios      Logistic   0.672   3.99              2.63            [15]
                         ANN        0.693   67                2.89            This study
Both                     RUSBoost   0.696   3.19              2.54            [15]
                         ANN        0.672   83                1.91            This study

Table 3. Summary of comparison with previous study (test period 2003–2005)
Fraud predictor          Model      AUC     Sensitivity (%)   Precision (%)   Reference
28 raw financial data    SVM        0.637   2.28              2.53            [15]
                         Logistic   0.685   1.45              1.69            [15]
                         RUSBoost   0.753   7.64              7.83            [15]
                         f-XGBoost  0.691   6.59              6.71            [22]
                         ANN        0.694   100               2.79            This study
14 financial ratios      Logistic   0.649   1.37              1.29            [15]
                         ANN        0.667   67                2.53            This study
Both                     ANN        0.656   100               2.52            This study

Table 4. Summary of comparison with previous study (test period 2003–2011)
Fraud predictor          Model      AUC     Sensitivity (%)   Precision (%)   Reference
28 raw financial data    SVM        0.647   3.07              1.98            [15]
                         Logistic   0.702   1.87              1.19            [15]
                         RUSBoost   0.710   4.40              3.60            [15]
                         f-XGBoost  0.678   3.69              3.02            [22]
                         ANN        0.720   56                1.34            This study
14 financial ratios      Logistic   0.672   3.49              2.23            [15]
                         ANN        0.685   67                2.40            This study
Both                     ANN        0.693   89                2.45            This study

Table 5. Summary of comparison with previous study (test period 2003–2014)
Fraud predictor          Model      AUC     Sensitivity (%)   Precision (%)   Reference
28 raw financial data    SVM        0.628   2.30              1.48            [15]
                         Logistic   0.709   1.84              1.04            [15]
                         RUSBoost   0.717   3.30              2.70            [15]
                         f-XGBoost  0.678   2.77              2.26            [22]
                         ANN        0.718   50                1.15            This study
14 financial ratios      Logistic   0.702   3.45              1.86            [15]
                         ANN        0.694   58                1.99            This study
Both                     ANN        0.686   75                2.02            This study

The results over the different test periods are compared with previous literature and summarized in Figure 3. As shown in Figure 3, the SVM model fluctuates in performance, while both RUSBoost and XGBoost decline as the test period is extended. This accords with Hoang et al. [22], who argue that because the amount of undetected fraud is assumed to grow over time, a longer test period is less reliable. In contrast to the SVM, RUSBoost, and XGBoost models, the proposed network and the logistic model show a slight performance improvement as the test period is extended. This demonstrates the robustness of both models, which are therefore expected to perform stably on new, unseen data. Employing raw financial data as the fraud predictor, the proposed network achieved the best AUC among the compared models in the full 2003–2014 test set (Table 5), with an AUC of 0.718, a precision of 1.15%, and a sensitivity of 50%. Considering the stability of the AUC as a measure of robustness, the proposed network scored an AUC of 0.694 and 0.720 in the 2003–2005 and 2003–2011 test periods, respectively (Tables 3 and 4). This shows that expanding the test period improves the performance of the proposed network. However, in the final scenario covering 2003–2014, the AUC dropped to 0.718, slightly below the 2003–2011 result. This indicates that while expanding the test period improves performance,
there may be diminishing returns beyond a certain period length. Consistent with the results of Bao et al. [15], this study shows that, for the same network, using the 28 raw financial data items derived from [14] leads to better results than using the 14 financial ratios derived from [16]. Overall, this study shows that combining a neural network with SMOTE can detect fraudulent financial statements in a severely imbalanced dataset, using raw financial data, financial ratios, or both as the fraud predictor. While the proposed network demonstrates promising utility, it is important to acknowledge that the dataset consists of historical data drawn from specific demographics and time periods. This may limit generalizability, so the model may require further calibration or updates to remain effective in the current dynamic environment.

Figure 3. Summary of AUC scores over the additional test periods

5. CONCLUSION
This paper introduces a neural network designed to detect fraudulent financial statements within an imbalanced dataset, addressing the severe imbalance through the use of SMOTE. Our experimental results indicate that the model achieves detection capability, with an AUC score of 70.6%, a sensitivity rate of 83%, and a precision rate of 2.89%. This study contributes by advocating the integration of ANNs into auditing practices, particularly during the initial audit phase, such as risk assessment procedures. The proposed model's high sensitivity rate underscores its superiority over similar models, offering practical utility for auditors and regulators by minimizing the risk of false negatives. However, limitations exist, including the reliance solely on numerical financial data extracted from financial statements. Future research could explore the incorporation of non-financial data and the application of unsupervised learning to address mislabeling issues, potentially using generative artificial intelligence to generate fraudulent data for training or to describe fraud characteristics.

REFERENCES
[1] “International standard on auditing 240: the auditor’s responsibilities relating to fraud in an audit of financial statements,” IFAC, 2013. Accessed: Dec. 27, 2023. [Online]. Available: https://www.ifac.org/_flysystem/azure-private/publications/files/A012 2013 IAASB Handbook ISA 240.pdf.
[2] Y. J. Chen, W. C. Liou, Y. M. Chen, and J. H. Wu, “Fraud detection for financial statements of business groups,” International Journal of Accounting Information Systems, vol. 32, pp. 1–23, 2019, doi: 10.1016/j.accinf.2018.11.004.
[3] M. N. Ashtiani and B. Raahemi, “Intelligent fraud detection in financial statements using machine learning and data mining: a systematic literature review,” IEEE Access, vol. 10, pp. 72504–72525, 2022, doi: 10.1109/ACCESS.2021.3096799.
[4] W. Xiuguo and D. Shengyong, “An analysis on financial statement fraud detection for Chinese listed companies using deep learning,” IEEE Access, vol. 10, pp. 22516–22532, 2022, doi: 10.1109/ACCESS.2022.3153478.
[5] D. Botez, “Recent challenge for auditors: using data analytics in the audit of the financial statements,” Brain-Broad Research in Artificial Intelligence and Neuroscience, vol. 9, no. 4, pp. 61–72, 2018.
[6] G. Salijeni, A. S.-Taddei, and S. Turley, “Big data and changes in audit technology: contemplating a research agenda,” Accounting and Business Research, vol. 49, no. 1, pp. 95–119, 2019, doi: 10.1080/00014788.2018.1459458.
[7] B. P. Green and J. H. Choi, “Assessing the risk of management fraud through neural network technology,” Auditing, vol. 16, no. 1, pp. 25–28, 1997.
[8] S. Kotsiantis, E. Koumanakos, D. Tzelepis, and V. Tampakas, “Forecasting fraudulent financial statements using data mining,” International Journal of Computational Intelligence, vol. 3, no. 2, pp. 104–110, 2006.
[9] E. Kirkos, C. Spathis, and Y. Manolopoulos, “Data mining techniques for the detection of fraudulent financial statements,” Expert Systems with Applications, vol. 32, no. 4, pp. 995–1003, 2007, doi: 10.1016/j.eswa.2006.02.016.
[10] J. Perols, “Financial statement fraud detection: an analysis of statistical and machine learning algorithms,” Auditing, vol. 30, no. 2, pp. 19–50, 2011, doi: 10.2308/ajpt-50009.
[11] C. L. Jan, “Detection of financial statement fraud using deep learning for sustainable development of capital markets under information asymmetry,” Sustainability, vol. 13, no. 17, pp. 9879–9898, 2021, doi: 10.3390/su13179879.
[12] T. Kiehl, B. Hoogs, L. Christina, and S. Deniz, “Evolving multi-variate time-series patterns for the discrimination of fraudulent financial filings,” Genetic and Evolutionary Computation Conference, pp. 1–8, 2005.
[13] J. Bertomeu, E. Cheynel, E. Floyd, and W. Pan, “Using machine learning to detect misstatements,” Review of Accounting Studies, vol. 26, no. 2, pp. 468–519, 2021, doi: 10.1007/s11142-020-09563-8.
[14] M. Cecchini, H. Aytug, G. J. Koehler, and P. Pathak, “Detecting management fraud in public companies,” Management Science, vol. 56, no. 7, pp. 1146–1160, 2010, doi: 10.1287/mnsc.1100.1174.
[15] Y. Bao, B. Ke, B. Li, Y. J. Yu, and J. Zhang, “Detecting accounting fraud in publicly traded U.S. firms using a machine learning approach,” Journal of Accounting Research, vol. 58, no. 1, pp. 199–235, 2020, doi: 10.1111/1475-679X.12292.
[16] P. M. Dechow, W. Ge, C. R. Larson, and R. G. Sloan, “Predicting material accounting misstatements,” 39th Annual Contemporary Accounting Research Conference, vol. 28, no. 1, pp. 17–82, 2011, doi: 10.1111/j.1911-3846.2010.01041.x.
[17] Q. Deng and G. Mei, “Combining self-organizing map and k-means clustering for detecting fraudulent financial statements,” in 2009 IEEE International Conference on Granular Computing, GRC, 2009, pp. 126–131, doi: 10.1109/GRC.2009.5255148.
[18] S. Y. Huang, R. H. Tsaih, and W. Y. Lin, “Unsupervised neural networks approach for understanding fraudulent financial reporting,” Industrial Management and Data Systems, vol. 112, no. 2, pp. 224–244, 2012, doi: 10.1108/02635571211204272.
[19] T. R. Izzalqurny, B. Subroto, and A. Ghofar, “Relationship between financial ratio and financial statement fraud risk moderated by auditor quality,” International Journal of Research in Business and Social Science, vol. 8, no. 4, pp. 34–43, 2019, doi: 10.20525/ijrbs.v8i4.281.
[20] J. Yao, Y. Pan, S. Yang, Y. Chen, and Y. Li, “Detecting fraudulent financial statements for the sustainable development of the socio-economy in China: a multi-analytic approach,” Sustainability, vol. 11, no. 6, 2019, doi: 10.3390/su11061579.
[21] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim, and A. K. Nandi, “Credit card fraud detection using adaBoost and majority voting,” IEEE Access, vol. 6, pp. 14277–14284, 2018, doi: 10.1109/ACCESS.2018.2806420.
[22] M. N. Hoang, H. T. L. Nguyen, and H. N. Viet, “A model for detecting accounting frauds by using machine learning,” in The Annual Hawaii International Conference on System Sciences, 2022, vol. 2022, pp. 1552–1561, doi: 10.24251/hicss.2022.193.
[23] G. S. Temponeras, S. A. N. Alexandropoulos, S. B. Kotsiantis, and M. N. Vrahatis, “Financial fraudulent statements detection through a deep dense artificial neural network,” in 10th International Conference on Information, Intelligence, Systems and Applications, IISA 2019, 2019, pp. 1–5, doi: 10.1109/IISA.2019.8900741.
[24] P. Craja, A. Kim, and S. Lessmann, “Deep learning for detecting financial statement fraud,” Decision Support Systems, vol. 139, 2020, doi: 10.1016/j.dss.2020.113421.
[25] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002, doi: 10.1613/jair.953.
[26] D. Elreedy and A. F. Atiya, “A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class imbalance,” Information Sciences, vol. 505, pp. 32–64, 2019, doi: 10.1016/j.ins.2019.07.070.
[27] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, Cambridge, Massachusetts: MIT Press, 2016.
[28] D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” arXiv-Computer Science, pp. 1–15, 2017, doi: 10.48550/arXiv.1412.6980.
[29] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[30] T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006, doi: 10.1016/j.patrec.2005.10.010.
[31] N. Japkowicz, “Assessment metrics for imbalanced learning,” Imbalanced Learning: Foundations, Algorithms, and Applications, pp. 187–206, 2013, doi: 10.1002/9781118646106.ch8.

BIOGRAPHIES OF AUTHORS
Yosua Efraim Young is a graduate candidate in the Informatics Graduate Program at Universitas Pelita Harapan. He earned his Bachelor’s degree in Accounting from Universitas Pelita Harapan in Indonesia. He can be contacted at email: youngyosua911@gmail.com.
Hendra Tjahyadi is an Associate Professor in the Informatics Study Program at Universitas Pelita Harapan. He earned his Bachelor’s degree in Electrical Engineering from Universitas Kristen Maranatha, Master’s degree in Instrumentation and Control from Institut Teknologi Bandung, and Ph.D. in Control Engineering from the School of Engineering, Flinders University. His research interests are in adaptive control, signal processing, and artificial intelligence. He can be contacted at email: hendra.tjahyadi@uph.edu.