Comparative Analysis and Evaluation of Stemming and Preprocessing Techniques for Arabic Text

Abdualmajed A. G. Al-Khulaidi; Samer Mohammed Yaseen

doi:10.59628/jast.v1i4.588

https://doi.org/10.59628/jast.v1i4.588

Authors

Abdualmajed A. G. Al-Khulaidi, Department of Computer Science, Faculty of Computer and Information Technology, Sana'a University, Sana'a, Yemen.
Samer Mohammed Yaseen, Department of Computer Science, Faculty of Computer and Information Technology, Sana'a University, Sana'a, Yemen.

Keywords:

Natural Language Processing, Information Retrieval, Arabic Information Retrieval, Stemming, Text Preprocessing

Abstract

Arabic information retrieval is challenging because of the language's complex morphology and syntax. Preprocessing and stemming can improve both the accuracy and the efficiency of Arabic information retrieval. This paper reviews the existing literature on Arabic preprocessing and stemming techniques, identifies their limitations and challenges, and underscores the need for further research to improve Arabic information retrieval. The study also evaluates ten stemmers on a public dataset. The root-based stemmers Lucene and Khoja achieved the highest reduction rates, 90.9% and 85% respectively. The results indicate that root-based stemmers conflate similar terms well, whereas light-based stemmers tend to under-stem them.
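
To make the reported metric concrete, the sketch below shows one common way a reduction rate can be computed: the percentage drop in the number of distinct terms after stemming. This is an assumption about the metric, since the abstract does not give the paper's exact formula. The `toy_light_stem` function is a hypothetical stand-in written only for illustration; it is not any of the ten stemmers evaluated in the study.

```python
def reduction_rate(words, stem):
    """Percentage decrease in distinct terms after applying `stem`.

    Assumed definition: 100 * (distinct words - distinct stems) / distinct words.
    """
    vocab = set(words)
    if not vocab:
        return 0.0
    stems = {stem(w) for w in vocab}
    return 100.0 * (len(vocab) - len(stems)) / len(vocab)


def toy_light_stem(word):
    """Hypothetical toy light stemmer, for illustration only.

    Strips the definite article and at most one common suffix,
    keeping at least three letters of the word.
    """
    if word.startswith("ال") and len(word) > 4:
        word = word[2:]
    for suffix in ("ات", "ون", "ين", "ة"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Small made-up word list (not the paper's dataset).
words = ["كتاب", "الكتاب", "كتابات", "معلم", "المعلم", "معلمون", "معلمين", "مدرسة"]
rate = reduction_rate(words, toy_light_stem)
print(f"reduction rate: {rate:.1f}%")
```

Under this definition, a higher rate means more surface forms were merged into shared stems. It does not show whether the merges are correct, so an aggressive root-based stemmer can score highly while also conflating unrelated words.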

Published

2023-12-21

How to Cite

Al-Khulaidi, A. A. G., & Yaseen, S. M. (2023). Comparative Analysis and Evaluation of Stemming and Preprocessing Techniques for Arabic Text. Sana'a University Journal of Applied Sciences and Technology, 1(4). https://doi.org/10.59628/jast.v1i4.588

Issue

Vol. 1 No. 4 (2023): Sana'a University Journal of Applied Sciences and Technology

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
