Institutional Repository of Université Ferhat ABBAS - Sétif 1 >
Faculty of Sciences >
Department of Computer Science >
Master's Theses >
Please use this address to cite this document:
http://dspace.univ-setif.dz:8888/jspui/handle/123456789/5598

| Title: | Sentiment Analysis of Arabic Texts Using GPT Model, Deep Learning, and Machine Learning Algorithms |
| Author(s): | Sana, Abdelkadar |
| Keywords: | Sentiment Analysis; ML: Machine Learning; DL: Deep Learning; RNN: Recurrent Neural Networks; NLP: Natural Language Processing |
| Publication date: | 2025 |
| Abstract: | With the widespread adoption of Web 2.0 technologies and the rise of social media platforms like
Twitter, Facebook, and YouTube, sentiment analysis has emerged as a central task in Natural
Language Processing (NLP), especially in Arabic, which presents unique challenges due to its
morphological complexity and dialectal diversity.
This study focuses on sentiment analysis of Arabic tweets across multiple dialects, using
a range of traditional and modern techniques. Four classification algorithms were employed:
SVM, Naïve Bayes, Logistic Regression, and Random Forest. In addition, the generative
AraGPT2 model (a Generative Pretrained Transformer, not originally designed for
classification) was repurposed to assess its ability to generalize beyond its intended scope,
and a BiLSTM (Bidirectional Long Short-Term Memory) deep learning model was integrated to
evaluate its effectiveness in handling dialect-rich Arabic texts.
AraBERT was used to extract contextual embeddings, while MARBERTv2 served as a fine-tuned
model for direct sentiment classification. The study introduced several technical innovations:
- A hybrid text representation combining TF-IDF and FastText embeddings, blending statistical
weighting with semantic richness.
- A Curriculum Learning strategy that trained the model incrementally through phased data
segmentation, enabling training in low-resource environments such as Google Colab.
- A Fast Convergence approach to reach optimal performance with a minimal number of
training epochs.
Results showed stable performance for traditional classifiers, outstanding effectiveness of
MARBERTv2 for dialectal Arabic, and surprisingly competitive results from AraGPT2 despite
its generative nature. Embedding combinations and staged training significantly improved
memory efficiency and model scalability.
This work contributes to advancing AI research for Arabic and opens promising directions
for building efficient, deployable models. |
| URI/URL: | http://dspace.univ-setif.dz:8888/jspui/handle/123456789/5598 |
| Collection(s): | Master's Theses |
File(s) in this document:
There are no files associated with this document.
All documents in DSpace are protected by copyright, with all rights reserved.