Comparative Analysis of Machine Learning Algorithms for Sentiment Classification of TikTok User Reviews on the Google Play Store
Abstract
This study performs sentiment analysis on user reviews of the TikTok application collected from the Google Play Store, comparing four machine learning algorithms. The dataset was gathered by web scraping and contains 8,097 Indonesian-language reviews. The text was preprocessed through cleaning, removal of irrelevant characters, normalization, tokenization, stopword removal, and stemming with the Sastrawi stemmer. Sentiment labels were assigned automatically from the star rating: 1–2 stars as negative, 3 stars as neutral, and 4–5 stars as positive. Features were extracted with Term Frequency–Inverse Document Frequency (TF-IDF) to convert the text into numerical vectors. The four algorithms implemented were Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), and Random Forest. Each model was evaluated on accuracy, precision, recall, and F1-score, and its confusion matrix was examined for misclassification patterns. Positive sentiment dominates the dataset, indicating that users generally give the TikTok application favorable feedback. Naïve Bayes achieved the highest accuracy, while Logistic Regression achieved the best precision and F1-score. Random Forest performed worst, and SVM remained competitive with stable results across metrics. Logistic Regression and Naïve Bayes were also the fastest to compute, whereas SVM and Random Forest took longer because of their model complexity. Overall, Logistic Regression is the most suitable model in this study because it balances evaluation performance with computational efficiency.
These findings show that machine learning can classify public opinion automatically and effectively, and that the resulting classifications can serve as input for improving the service quality of the TikTok application.
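The rating-based labeling rule and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the study's code. The function names (`label_from_rating`, `per_class_scores`, `accuracy`) are hypothetical, and reporting scores per class is an assumption, since the abstract does not say how precision, recall, and F1-score were averaged across the three classes.

```python
from collections import Counter


def label_from_rating(rating):
    """Map a 1-5 star rating to a sentiment label (rule stated in the abstract)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_scores(y_true, y_pred, labels=("negative", "neutral", "positive")):
    """Return {label: (precision, recall, f1)} from confusion-matrix counts.

    Assumption: scores are reported per class; the abstract does not state
    the averaging scheme used in the study.
    """
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    scores = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy data invented for illustration only; these are not results from the study.
ratings = [5, 4, 1, 3, 2, 5]
y_true = [label_from_rating(r) for r in ratings]
y_pred = ["positive", "positive", "negative", "positive", "negative", "neutral"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
for label, (p, r, f1) in per_class_scores(y_true, y_pred).items():
    print(f"{label:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

On a dataset dominated by positive reviews, as reported here, per-class scores show whether a model handles the minority negative and neutral classes well, which accuracy alone can hide.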


