This paper investigates how different stemming approaches affect Arabic named entity recognition (NER), a task in which stemming plays a significant role because Arabic is morphologically rich. It compares light stemming with root extraction through experiments on established datasets such as ANERcorp and the AQMAR Arabic Wikipedia Named Entity Corpus. The paper also details the challenges of Arabic NER, including the absence of capitalization and the prevalence of spelling variants, and discusses how effectively the various stemming methods improve NER performance.