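Comparative Analysis of Machine Learning Algorithms for Sentiment Classification of TikTok App User Reviews on the Google Play Store (original title: Analisis Perbandingan Algoritma Machine Learning untuk Klasifikasi Sentimen Ulasan Pengguna Aplikasi TikTok di Google Play Store)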

Desta Tri Lestari; Putri Mahirah Syahla; Satya Wibisono; Khairul Rizal; Susliansyah Susliansyah; Rahmat Hidayat

Authors

Desta Tri Lestari, Informatics Study Program, Universitas Bina Sarana Informatika
Putri Mahirah Syahla, Informatics Study Program, Universitas Bina Sarana Informatika
Satya Wibisono, Informatics Study Program, Universitas Bina Sarana Informatika
Khairul Rizal, Information Technology Study Program, Universitas Bina Sarana Informatika
Susliansyah Susliansyah, Information Systems Study Program, Universitas Bina Sarana Informatika
Rahmat Hidayat, Information Systems Study Program, Universitas Bina Sarana Informatika

Abstract

This research conducts sentiment analysis on user reviews of the TikTok application obtained from the Google Play Store using machine learning approaches. The dataset was collected through web scraping, resulting in 8,097 Indonesian-language reviews. All text went through several preprocessing stages: text cleaning, removal of irrelevant characters, normalization, tokenization, stopword removal, and stemming with the Sastrawi stemmer. Sentiment labels were assigned automatically from the star rating: 1–2 stars were categorized as negative, 3 stars as neutral, and 4–5 stars as positive. Feature extraction used the Term Frequency–Inverse Document Frequency (TF-IDF) method to convert text into numerical representations. Four machine learning algorithms were implemented: Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), and Random Forest. The performance of each model was evaluated using accuracy, precision, recall, and F1-score, along with confusion matrix analysis to observe misclassification patterns. The results show that positive sentiment dominates the dataset, indicating that users generally give favorable feedback on the TikTok application. Experimental results reveal that Naïve Bayes achieved the highest accuracy, while Logistic Regression produced the best precision and F1-score. Random Forest showed the lowest performance, whereas SVM remained competitive, with stable results across metrics. In addition, Logistic Regression and Naïve Bayes had the shortest computation times, while SVM and Random Forest required longer processing time due to model complexity. Overall, Logistic Regression can be considered the most suitable model in this study because of its balanced evaluation results and computational efficiency.
These findings demonstrate that machine learning can classify public opinion automatically and effectively, and that the resulting classifications can serve as valuable input for improving service quality within the TikTok application.
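The rating-based labeling rule and the TF-IDF step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `label_from_rating` and `tfidf`, the sample reviews, and the specific TF-IDF variant (term frequency normalized by document length, unsmoothed `log(N / df)` for the inverse document frequency) are assumptions made for illustration. The abstract does not state which variant or library was used, and libraries such as scikit-learn apply a smoothed formula by default.

```python
import math
from collections import Counter


def label_from_rating(rating: int) -> str:
    """Map a 1-5 star rating to a sentiment label (rule stated in the abstract)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Compute TF-IDF weights for tokenized documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical reviews, assumed to be already cleaned, tokenized, and stemmed.
reviews = [
    (["aplikasi", "bagus", "suka"], 5),
    (["aplikasi", "lambat", "error"], 1),
    (["aplikasi", "biasa"], 3),
]

labels = [label_from_rating(rating) for _, rating in reviews]
vectors = tfidf([tokens for tokens, _ in reviews])
```

In this sketch, a term that appears in every review (here `aplikasi`) receives a weight of zero, while terms confined to a single review receive the largest weights. The resulting sparse vectors are the kind of numerical representation that classifiers such as Naïve Bayes or Logistic Regression take as input.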
