Predictive Sales Analysis in Coffee Shops Using the Random Forest Algorithm

Authors

  • Shella Norma Windrasari, Airlangga University
  • Hendro Margono, Airlangga University
  • Yudistira Ardi Nugraha Setyawan Putra, Airlangga University

DOI:

https://doi.org/10.57152/malcom.v5i3.2023

Keywords:

Data Analysis, Coffee Shop, Machine Learning, Random Forest, Sales Prediction

Abstract

The coffee shop industry has grown significantly and has become a highly competitive market in which customers expect specialty coffee and personalized experiences. Although data-driven strategies are crucial for optimizing operations, many owners still struggle to use their sales data to understand changing customer behavior and to improve decision-making. To address this gap, this study applies machine learning (ML), specifically the Random Forest Regressor model, to predict sales performance in the coffee shop business environment. By analyzing factors such as transaction timing, store location, product type, and day of the week, the research aims to uncover patterns that can improve inventory management and customer engagement. The Random Forest model was evaluated through cross-validation and yielded an average Mean Squared Error (MSE) of 80.97 across folds, which indicates moderate predictive accuracy and represents an improvement over traditional forecasting methods commonly used in the industry. Feature importance analysis showed that Premium Beans is the most influential predictor, followed by seasonal trends (month), time of day, and weekend sales patterns. These findings underscore the importance of incorporating temporal and contextual factors into forecasting models.
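
The evaluation described in the abstract (a Random Forest Regressor scored by cross-validated MSE, followed by a feature importance ranking) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn and NumPy, the data are synthetic, and the column names (premium_beans, month, hour, is_weekend), the 5-fold split, and all coefficients are hypothetical choices made for the example. It will not reproduce the reported MSE of 80.97.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (NOT the study's dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
feature_names = ["premium_beans", "month", "hour", "is_weekend"]
X = np.column_stack([
    rng.integers(0, 2, n),    # 1 if the product is premium beans
    rng.integers(1, 13, n),   # calendar month, 1..12
    rng.integers(6, 21, n),   # hour of day, 6..20
    rng.integers(0, 2, n),    # 1 if the sale fell on a weekend
])
# Hypothetical sales signal dominated by the premium_beans flag, plus noise.
y = (20.0 * X[:, 0] + 0.5 * X[:, 1] + 0.3 * X[:, 2]
     + 2.0 * X[:, 3] + rng.normal(0, 1, n))

model = RandomForestRegressor(n_estimators=50, random_state=0)

# scikit-learn reports negated MSE so that higher is better; flip the sign back.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
fold_mse = -cross_val_score(model, X, y, cv=cv,
                            scoring="neg_mean_squared_error")
mean_mse = fold_mse.mean()  # the "mean MSE" across folds

# Fit on all rows to read impurity-based feature importances (they sum to 1).
model.fit(X, y)
importances = dict(zip(feature_names, model.feature_importances_))
top_feature = max(importances, key=importances.get)

print(f"mean CV MSE: {mean_mse:.2f}")
print(f"most important feature: {top_feature}")
```

Because MSE is expressed in squared units of the target, a value such as 80.97 can only be judged against the scale of the sales figures being predicted; taking its square root (RMSE) gives an error in the original units.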

Published

2025-07-31

How to Cite

Windrasari, S. N., Margono, H., & Putra, Y. A. N. S. (2025). Predictive Sales Analysis in Coffee Shops Using the Random Forest Algorithm. MALCOM: Indonesian Journal of Machine Learning and Computer Science, 5(3), 1000-1011. https://doi.org/10.57152/malcom.v5i3.2023