Arabic Parts of Speech Tagging

2025-06-28 16:30:17 - Adil Khan

Project Title

Arabic Parts of Speech Tagging

Project Area of Specialization

Software Engineering

Project Summary

Arabic Part-of-Speech Tagging is a software system that combines morphological analysis with a Hidden Markov Model (HMM) and relies on the structure of the Arabic sentence. On the one hand, morphological analysis reduces the size of the tag lexicon by segmenting Arabic words into their prefixes, stems, and suffixes, which is possible because Arabic is a derivational language. On the other hand, the HMM represents the Arabic sentence structure so that the logical linguistic sequencing of words is taken into account. For these purposes, a tagging system is proposed that represents the main Arabic parts of speech hierarchically, which allows easy expansion whenever it is needed. Each tag in this system represents a possible state of the HMM, and the transitions between tags (states) are governed by the syntax of the sentence.
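The segmentation step described above can be illustrated with a minimal Python sketch. It assumes simple longest-match affix stripping, which may differ from the analyzer the project actually uses. The affix lists, the function name segment, and the min_stem parameter are hypothetical, and words are written in a Buckwalter-style Latin transliteration rather than Arabic script.

```python
# Hypothetical affix lists in Buckwalter-style transliteration (illustration only).
# A real morphological analyzer would rely on a much larger lexicon.
PREFIXES = ["wAl", "bAl", "Al", "w", "b", "l"]
SUFFIXES = ["At", "wn", "yn", "hA", "p"]


def segment(word, min_stem=2):
    """Split a word into (prefix, stem, suffix) by longest-match affix stripping.

    An affix is only removed if at least `min_stem` characters remain.
    """
    prefix = ""
    for p in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= min_stem:
            prefix = p
            break

    rest = word[len(prefix):]
    suffix = ""
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if rest.endswith(s) and len(rest) - len(s) >= min_stem:
            suffix = s
            break

    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix


# "wAlmElmwn" (roughly "and the teachers") splits into wAl + mElm + wn.
print(segment("wAlmElmwn"))
print(segment("ktb"))
```

Because many surface forms share one stem, the tag lexicon only needs entries for stems and affixes rather than for every inflected word.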

Project Objectives

Project Implementation Method

HIDDEN MARKOV MODEL:

The other approach is statistical and is based on the HMM algorithm. An HMM builds a model from a set of input sequences and works like a finite state machine that moves through states s1, s2, and so on. It has two kinds of states, hidden states and visible states, represented by W and V respectively. The statistical approach relies on the internal structure of the Arabic sentence. When Arabic text is entered, the system recognizes the morphological characteristics of each word. Using the linguistic inner structure of the Arabic sentence makes it possible to recognize the logical sequence of words and, as a result, their corresponding tags. Because the probability that a certain word occurs depends on the word before it, the HMM is a suitable statistical model for keeping track of this history. A linguistic study is conducted to determine the Arabic sentence structure by identifying the main forms of nominal and verbal sentences. Every state of the HMM corresponds to a possible tag in the lexicon, and the transitions between states are governed by the syntax of the sentence.
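To make the idea of tags as hidden states concrete, here is a small Python sketch of Viterbi decoding, the standard way to find the most probable tag sequence in an HMM. It is an illustration under stated assumptions, not the project's implementation: the three tags, all probabilities, and the unk fallback for unseen words are invented for the example, and words are in a Buckwalter-style transliteration. In practice the probabilities would be estimated from a tagged corpus.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    # best[i][t] = (probability of the best path ending in tag t at word i, previous tag)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], unk), None) for t in tags}]
    for w in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(w, unk), p)
                for p in tags
            )
        best.append(column)

    # Backtrack from the most probable final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy model: every number below is a made-up assumption.
TAGS = ["NOUN", "VERB", "PREP"]
START = {"NOUN": 0.4, "VERB": 0.5, "PREP": 0.1}
TRANS = {
    "NOUN": {"NOUN": 0.4, "VERB": 0.3, "PREP": 0.3},
    "VERB": {"NOUN": 0.7, "VERB": 0.1, "PREP": 0.2},
    "PREP": {"NOUN": 0.9, "VERB": 0.05, "PREP": 0.05},
}
EMIT = {
    "NOUN": {"ktb": 0.1, "Alwld": 0.3, "Aldrs": 0.3, "Albyt": 0.3},
    "VERB": {"ktb": 0.6},
    "PREP": {"fy": 0.5},
}

# Verbal sentence "ktb Alwld Aldrs" ("the boy wrote the lesson").
print(viterbi(["ktb", "Alwld", "Aldrs"], TAGS, START, TRANS, EMIT))
# Nominal sentence "Alwld fy Albyt" ("the boy is in the house").
print(viterbi(["Alwld", "fy", "Albyt"], TAGS, START, TRANS, EMIT))
```

In the first sentence the word "ktb" is ambiguous between a verb and a noun; the transition probabilities encode that a verbal sentence typically starts with a verb followed by a noun, which is how sentence structure resolves the ambiguity.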

RULE BASED METHOD:

This method assigns POS tags based on rules. For example, in Arabic we can have a rule that says words ending with “???” or starting with “??” must be tagged as nouns. Rule-based techniques can be used along with lexicon-based approaches to tag words that are not present in the training corpus but appear in the test data.
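A rule-based fallback of this kind might look like the following Python sketch. The specific affixes in the original example are not recoverable, so the two rules here (the definite article "Al" as a prefix and the ending "p" as a suffix, in Buckwalter-style transliteration) are assumptions chosen for illustration, as are the lexicon entries and the function name tag_word.

```python
import re

# Hypothetical lexicon of words seen in training (illustration only).
LEXICON = {"ktb": "VERB", "fy": "PREP"}

# Hypothetical affix rules; these are assumed examples, not the rules
# from the original text, whose affixes were lost.
RULES = [
    (re.compile(r"^Al"), "NOUN"),  # word starts with the definite article
    (re.compile(r"p$"), "NOUN"),   # word ends with a typical nominal suffix
]


def tag_word(word, default="UNK"):
    """Look the word up in the lexicon first, then fall back to affix rules."""
    if word in LEXICON:
        return LEXICON[word]
    for pattern, tag in RULES:
        if pattern.search(word):
            return tag
    return default


for w in ["ktb", "Almdrsp", "mdrsp", "yktb"]:
    print(w, tag_word(w))
```

The lexicon is consulted first so that rules only apply to words missing from the training data, which matches the combined use described above.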

Benefits of the Project

Technical Details of Final Deliverable

Final Deliverable of the Project

Software System

Core Industry

Education

Other Industries

Others

Core Technology

Artificial Intelligence (AI)

Other Technologies

Others

Sustainable Development Goals

Quality Education

Required Resources
