A Comparative Study of the Performance of Machine Learning Models on a Tax Dataset of Yemen to Detect Levels of Tax Evasion
Keywords:
ML techniques, K-Fold Validation, Dataset of Tax, Performance measures
Abstract
The performance of a machine learning classification model depends on many factors, including the learning technique used, and accuracy varies from one method to another. This paper compares the performance of models built with different machine learning techniques (KNN, NB, SVM, DT, RF, and MLP). The models are trained on data provided by the Tax Authority of Yemen on the commercial and industrial profits tax; after preprocessing, the dataset consists of 760 attributes. The dataset was partitioned using k-fold validation. The results show that the Naïve Bayes (NB) classifier achieved the highest accuracy and the highest scores on the other measures. KNN, SVM, and RF all reached the same accuracy, 99.87%. SVM and KNN also matched each other on the remaining measures, while RF scored 97.91%, 99.95%, and 98.91% in recall, precision, and F-score, respectively. MLP reached 98.42% accuracy with 66.62%, 64.21%, and 64.40% in recall, precision, and F-score, and DT reached 97.76% accuracy with 57.006%, 99.24%, and 72.41% in recall, precision, and F-score.
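The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn. It is not the authors' pipeline. The tax dataset is not available here, so the code uses synthetic data from `make_classification`. The number of folds, the macro averaging of recall, precision, and F-score, the default hyperparameters, and the helper name `compare_models` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the tax data (assumption: the real dataset is not public here).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# One model per technique named in the abstract, with default settings (assumption).
# Distance- and gradient-based models are wrapped with feature scaling.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "NB": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "MLP": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)
    ),
}

# Measures reported in the abstract; macro averaging is an assumption.
scoring = ["accuracy", "recall_macro", "precision_macro", "f1_macro"]

# k = 5 is an assumption; the abstract does not state the number of folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def compare_models(models, X, y, cv, scoring):
    """Return the mean of each measure across folds for every model."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}
    return results


results = compare_models(models, X, y, cv, scoring)
for name, metrics in results.items():
    row = ", ".join(f"{m}={v:.4f}" for m, v in metrics.items())
    print(f"{name}: {row}")
```

Every model is evaluated on the same stratified folds, so differences in the printed scores reflect the technique rather than the data split. The numbers produced on synthetic data bear no relation to the results reported in the abstract.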
Copyright (c) 2023 Abeer Abdullah Shuja'aaddeen Shujaaaddeen, Fadl M.M. Ba-Alwi
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.