Talk Show Program Segmentation System Based on Twitter Social Media Using the K-Medoids Clustering Method
Keywords:
Twitter social media segmentation, k-medoids clustering, cosine similarity, data transformation, silhouette coefficient, rating
Abstract
Innovation in television talk shows can be a threat to established programs: the audience splits across competing shows, which can lower a program's rating. Ratings influence which companies buy advertising services, and because advertising sales are a television company's largest source of income, falling ratings put the company at risk of bankruptcy. One way to address this is to analyze public opinion, since the results indicate how attractive the program is to the public. Manual analysis, however, takes a long time and can be done only by a competent person, so a faster process that anyone can run is needed. This study uses K-Medoids Clustering to identify public opinion. Clustering, an unsupervised learning technique, is combined with a labeling process: tweets about the previous episode are labeled, and those labels are used to predict the labels of the other cluster members. There are three label types: 1) theme, 2) resource persons, and 3) program. Before clustering, the tweets go through text preprocessing and are then transformed into numeric form based on word occurrence. The transformed data are clustered using Cosine Similarity as the proximity measure, and the label of each cluster's medoid is applied to the unlabeled tweets in that cluster. The clustering result was evaluated with the Silhouette Coefficient and scored 0.19, a low value on the coefficient's scale of -1 to 1. Even so, the method predicted public opinion with an accuracy of 80%.
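The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tweets, the whitespace tokenizer standing in for text preprocessing, the alternating medoid update, the choice of labeled tweets as initial medoids, and the majority-vote fallback for an unlabeled medoid are all assumptions made for illustration.

```python
import math
from collections import Counter

def vectorize(tweets):
    """Word-occurrence vectors over a shared vocabulary (assumed tokenizer)."""
    tokenized = [t.lower().split() for t in tweets]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] for w in vocab])
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return max(0.0, 1.0 - dot / norm) if norm else 1.0

def distance_matrix(vectors):
    n = len(vectors)
    return [[0.0 if i == j else cosine_distance(vectors[i], vectors[j])
             for j in range(n)] for i in range(n)]

def assign(dist, medoids):
    """For each point, the position (in medoids) of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, medoids, max_iter=100):
    """Alternate between assigning points and re-picking each medoid."""
    medoids = list(medoids)
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, ci in enumerate(clusters) if ci == c]
            if not members:  # keep the old medoid if its cluster emptied
                updated.append(old)
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)

def propagate_labels(medoids, clusters, known):
    """Give every tweet its cluster's label: the medoid's label if it has one,
    otherwise (assumed fallback) the majority label of labeled members."""
    cluster_label = {}
    for c, m in enumerate(medoids):
        if m in known:
            cluster_label[c] = known[m]
        else:
            votes = Counter(known[i] for i, ci in enumerate(clusters)
                            if ci == c and i in known)
            cluster_label[c] = votes.most_common(1)[0][0] if votes else None
    return [cluster_label[c] for c in clusters]

def silhouette(dist, clusters):
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, ci in enumerate(clusters):
        by_cluster = {}
        for j, cj in enumerate(clusters):
            if j != i:
                by_cluster.setdefault(cj, []).append(dist[i][j])
        same = by_cluster.pop(ci, [])
        if not same or not by_cluster:  # singleton cluster or a single cluster
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: the first three tweets are labeled, the rest are not.
tweets = [
    "the topic tonight is great",
    "the guest speaker is smart",
    "the program schedule is late",
    "great topic tonight",
    "smart guest speaker",
    "program schedule late again",
]
known = {0: "theme", 1: "resource person", 2: "program"}

dist = distance_matrix(vectorize(tweets))
medoids, clusters = k_medoids(dist, medoids=[0, 1, 2])
predicted = propagate_labels(medoids, clusters, known)
score = silhouette(dist, clusters)
print(predicted, round(score, 2))
```

Starting the medoids at labeled tweets keeps the label step simple in this sketch; a real run would need a documented initialization strategy and proper preprocessing (case folding, stop-word removal, stemming) before vectorizing.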
License
Copyright (c) 2020

This work is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).