Harnessing Machine Learning to Decode YouTube Subscriber Dynamics: Regression Predictive Models and Correlations

Authors

  • Sri Mulyati, Universitas Padjajaran
  • Samidi Samidi, Budiluhur University

DOI:

https://doi.org/10.57152/malcom.v5i3.2084

Keywords:

Machine Learning, Networks, Regression, Subscribers, YouTube

Abstract

YouTube has grown into a digital media giant, yet content creators still struggle to predict subscriber growth. Because viewers' interests shift and the volume of available information is vast, it is difficult to determine which factors most influence subscription behavior. Optimizing content strategy and sustaining channel growth require an understanding of these factors. This study uses linear regression (LR), neural network (NN), and Gaussian process (GP) models to predict YouTube subscribers, and examines category correlations using video data from various topics. The analysis reports an absolute root mean square error (RMSE) of 26256351, with the NN prediction model outperforming the LR and GP models. The correlation matrix indicates a slight positive correlation of 0.067 among the YouTube categories. Specifically, the correlation coefficients for population, unemployment rate, and urban population are 0.080, -0.012, and 0.082, respectively. These findings point to future research on creating more intentional content and on identifying the factors that significantly increase viewership and audience growth.
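The two quantities the abstract reports, RMSE and correlation coefficients, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`rmse`, `pearson`, `fit_line`) and the toy numbers are assumptions made for illustration, and a single-feature least-squares line stands in for the LR baseline only.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient, the usual entry of a correlation matrix."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("cannot fit a line to a constant feature")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data (in millions), NOT taken from the study.
views = [1.2, 3.4, 2.2, 5.1, 4.0]
subscribers = [0.9, 2.8, 2.5, 4.4, 3.1]

slope, intercept = fit_line(views, subscribers)
predicted = [slope * v + intercept for v in views]
print("correlation:", round(pearson(views, subscribers), 3))
print("RMSE of the fitted line:", round(rmse(subscribers, predicted), 3))
```

RMSE is expressed in the units of the target, so a value in the tens of millions is on the scale of raw subscriber counts. A coefficient near zero, such as those reported for population or unemployment rate, indicates almost no linear relationship with the target.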

Author Biography

Samidi Samidi, Budiluhur University

Master of Computer Science

Published

2025-07-31

How to Cite

Mulyati, S., & Samidi, S. (2025). Harnessing Machine Learning to Decode YouTube Subscriber Dynamics: Regression Predictive Models and Correlations. MALCOM: Indonesian Journal of Machine Learning and Computer Science, 5(3), 990-999. https://doi.org/10.57152/malcom.v5i3.2084