Sentiment Analysis of Multilingual Tweets Using Hybrid AI Models

Authors

  • Dr. Jaspreet Khurana, Waheguru Meher Education Services pvt ltd, 5660 176a St, Surrey, BC V3S 4H1, Canada (drjaspreetkhurana@gmail.com)

Keywords:

multilingual sentiment analysis, hybrid AI, transformers, code-switching, lexical features, emoji sentiment, stacked ensembling, social media analytics

Abstract

Social media platforms produce vast, multilingual streams of short, noisy text where sentiment is often signaled through code-switching, slang, emojis, and cultural references. While large multilingual transformers (e.g., mBERT, XLM-R) have improved cross-lingual sentiment classification, performance still degrades on low-resource languages, code-switched text, and sarcasm. This manuscript presents a hybrid AI approach that combines (i) a strong multilingual transformer encoder, (ii) lightweight language-specific lexical/emoji features, (iii) code-switch and script-aware preprocessing, and (iv) stacked ensembling with a shallow meta-learner.
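The abstract does not define how code-switch intensity is computed, so the following is only a minimal sketch under a stated assumption: intensity is taken to be the fraction of adjacent word pairs whose dominant Unicode script differs. The function names `token_script` and `code_switch_intensity` are hypothetical, and the first word of each character's Unicode name is used as a rough proxy for its script.

```python
import unicodedata
from typing import Optional


def token_script(token: str) -> Optional[str]:
    """Return the most frequent script among a token's letters, or None if it has no letters.

    Assumption: the first word of the Unicode character name (e.g. "LATIN",
    "DEVANAGARI", "ARABIC", "BENGALI") is a good enough proxy for the script.
    """
    counts = {}
    for ch in token:
        if ch.isalpha():  # skips digits, punctuation, emojis, and combining marks
            script = unicodedata.name(ch, "UNKNOWN").split()[0]
            counts[script] = counts.get(script, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def code_switch_intensity(text: str) -> float:
    """Fraction of adjacent letter-bearing tokens whose scripts differ (0.0 to 1.0).

    Hypothetical definition for illustration; not taken from the manuscript.
    """
    scripts = [s for s in map(token_script, text.split()) if s is not None]
    if len(scripts) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(scripts, scripts[1:]))
    return switches / (len(scripts) - 1)


if __name__ == "__main__":
    print(code_switch_intensity("this movie is great"))
    print(code_switch_intensity("yeh movie बहुत अच्छी hai"))
```

A score of this kind only detects script mixing such as Latin and Devanagari in one tweet. Romanized code-switching (for example, Hindi written in Latin letters) would need token-level language identification instead, which this sketch does not attempt.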

Using a balanced corpus of 200k tweets across five languages (English, Hindi, Spanish, Arabic, and Bengali) with three sentiment classes (positive, negative, neutral), we benchmark three baselines (TF–IDF + SVM, mBERT, and XLM-R) against two hybrid variants. Our proposed model fuses sentence-level transformer embeddings with affective lexicon counts, emoji sentiment priors, punctuation patterns, and a code-switch intensity score, then feeds them to a gradient-boosting meta-classifier on top of a fine-tuned transformer head. In simulated experiments with stratified splits (70/10/20), the proposed hybrid improves average macro-F1 by 4.0 points over a fine-tuned XLM-R baseline, with significant gains (p < .05) for code-switched and emoji-heavy tweets. Error analysis shows reduced confusion between the neutral and mildly positive classes and improved robustness to script mixing (Latin–Devanagari). We discuss model design, training regime, and statistical validation, and we highlight implications for multilingual customer analytics, public-health monitoring, and civic sentiment tracking.
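The evaluation protocol above rests on two standard pieces, a stratified split and macro-F1, which can be sketched with the standard library alone. This is an illustration, not the authors' code: the helper names `stratified_split` and `macro_f1` are hypothetical, and reading 70/10/20 as train/development/test proportions is an assumption, since the abstract does not label the three parts.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return three index lists whose class proportions mirror the full label list.

    Assumption: the three fractions correspond to train/dev/test.
    The last split receives whatever remains after rounding the first two.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, dev, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class before slicing
        n_train = round(fractions[0] * len(indices))
        n_dev = round(fractions[1] * len(indices))
        train += indices[:n_train]
        dev += indices[n_train:n_train + n_dev]
        test += indices[n_train + n_dev:]
    return train, dev, test


def macro_f1(y_true, y_pred, classes=("positive", "negative", "neutral")):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    labels = ["positive"] * 100 + ["negative"] * 100 + ["neutral"] * 100
    train, dev, test = stratified_split(labels)
    print(len(train), len(dev), len(test))
```

Because macro-F1 averages per-class scores without weighting by class size, a gain on it reflects improvement across all three sentiment classes rather than on the most frequent one alone.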

Published

2026-03-01

How to Cite

Khurana, Dr. Jaspreet. “Sentiment Analysis of Multilingual Tweets Using Hybrid AI Models”. International Journal of Advanced Research in Computer Science and Engineering (IJARCSE) 2, no. 1 (March 1, 2026): Mar (12–21). Accessed March 5, 2026. https://ijarcse.org/index.php/ijarcse/article/view/116.
