Sentiment and emotion in Malay news: A comprehensive analysis using sentiment analysis

Authors

  • Mohd Aftar Abu Bakar Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.
  • Wan Nurul Huda W Mamat Saufi Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.
  • Noratiqah Mohd Ariff Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.

DOI:

https://doi.org/10.24200/jonus.vol10iss2pp342-367

Abstract

Background and Purpose: Emotional framing of news can shape public perception and behaviour. This study examines sentiment in Malay-language headlines from Berita Harian (April 2021–April 2023) to reveal underlying emotions, recurring themes, and societal implications.

Methodology: Headlines were collected from the Berita Harian archive and preprocessed with tokenization, stop-word removal, and normalization. A pre-trained Malay sentiment transformer assigned initial labels (positive, negative, neutral), and a manually verified subset of these labels was used to train a Support Vector Machine (SVM). Model performance was measured on a test set using accuracy, precision, recall, and F1-score. Word clouds and count plots were used to highlight frequent sentiment features.
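
The preprocessing steps named above, together with the TF-IDF features listed in the keywords, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the stop-word list is a tiny hypothetical sample, normalization is assumed to mean lowercasing and punctuation stripping, the three headlines are made up, and the function names `preprocess` and `tfidf` are introduced here only for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word sample for illustration; not the study's list.
STOP_WORDS = {"di", "ke", "dan", "yang", "untuk", "dari", "pada", "ini", "itu"}


def preprocess(headline):
    """Lowercase, strip punctuation, split on whitespace, drop stop words."""
    text = headline.lower()
    # Assumed normalization: keep letters, digits, and hyphens only.
    text = re.sub(r"[^a-z0-9\s-]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


# Made-up headlines, used only to exercise the functions.
headlines = [
    "Mangsa banjir di Johor meningkat",
    "PRU15: Kempen bermula hari ini",
    "Kes COVID-19 menurun di Selangor",
]
tokens = [preprocess(h) for h in headlines]
vectors = tfidf(tokens)
```

Each entry of `vectors` is a sparse feature dictionary that a classifier such as an SVM could consume once it is mapped to a fixed vocabulary. A term that appears in every headline receives weight zero under this unsmoothed formula, which is one reason library implementations usually add smoothing.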

Findings: The SVM achieved high precision and recall for positive sentiment (0.87 and 0.85, respectively) but lower recall for neutral sentiment (0.62), indicating that neutral headlines are harder to detect. Dominant topics included COVID-19, PRU15 (the 15th general election), and mangsa ("victim").
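
To make the per-class figures above concrete, the following sketch shows how precision, recall, and F1-score can be computed for each sentiment class from true and predicted labels. The labels are made up for illustration and do not reproduce the study's results; `per_class_metrics` is a hypothetical helper name.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1 for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Made-up labels, used only to exercise the function.
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "neutral", "negative", "positive"]
report = per_class_metrics(y_true, y_pred, ["positive", "neutral", "negative"])
```

In this toy example the neutral class has perfect precision but a recall of only 0.5, because one of the two neutral headlines was assigned to another class. A low neutral recall of the kind reported above reflects the same pattern.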

Contributions: By combining transformer-based labeling with SVM classification, this work extends sentiment analysis to Malay news media. It informs journalists and policymakers about emotional framing in Malaysian headlines.

Keywords: Sentiment analysis, news headlines, TF-IDF features, Support Vector Machine (SVM), Malay language.

Author Biographies

  • Mohd Aftar Abu Bakar, Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.

    Mohd Aftar Abu Bakar is an associate professor at the Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia. He holds a PhD in Mathematics and Computer Science. His research interests span time series analysis, machine learning, and data analytics. He is also the Editor-in-Chief of the Journal of Quality Measurement and Analysis.

  • Wan Nurul Huda W Mamat Saufi, Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.

    Wan Nurul Huda Binti W Mamat Saufi is a PhD candidate at the Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia. Her research focuses on natural language processing and sentiment analysis in the Malay language. She holds a master's degree in mathematical science and has presented several papers at academic conferences.

  • Noratiqah Mohd Ariff, Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia.

    Noratiqah Mohd Ariff is an associate professor at the Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia. She obtained a Doctor of Philosophy (PhD) in Statistics from Universiti Kebangsaan Malaysia. Her research focuses on applied statistics, computational statistics, data analysis, and statistical modelling.

Published

2025-07-31

How to Cite

Sentiment and emotion in Malay news: A comprehensive analysis using sentiment analysis. (2025). Journal of Nusantara Studies (JONUS), 10(2), 342-367. https://doi.org/10.24200/jonus.vol10iss2pp342-367