Machine Learning Modeling for Forecasting Repeat Purchases in Online Shopping

Authors

  • Lianzhai Duan, President University

DOI:

https://doi.org/10.57152/malcom.v4i3.1388

Keywords:

Data Analysis, Data Modeling, Machine Learning, Online Shopping, Repeat Purchase Forecast

Abstract

Online shopping merchants run marketing campaigns to attract customers, but in many cases most new customers do not make repeat purchases, which does not serve the merchants' long-term interests. It is therefore important for merchants to target users who are more likely to repurchase, since doing so can reduce marketing costs and increase return on investment (ROI). Using a dataset provided by an online shopping website, this paper performs data mining and exploratory analysis, applies feature engineering, and builds machine learning models with LightGBM, logistic regression, and XGBoost. Parameter optimization and model evaluation are also carried out. The comparative analysis identifies LightGBM as the best prediction model, which can support efficient marketing decisions for the operation of online shopping stores.
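The comparison step summarized in the abstract can be illustrated with a minimal sketch. It assumes that candidate models are ranked by ROC AUC on held-out data; the abstract does not name the evaluation metric, so this choice is an assumption. The labels, the scores, and the names `model_a` and `model_b` are hypothetical illustration data, not results from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out labels: 1 = customer repurchased, 0 = did not.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]

# Hypothetical predicted repurchase probabilities from two candidate models.
candidate_scores = {
    "model_a": [0.60, 0.40, 0.55, 0.50, 0.20, 0.70, 0.65, 0.10],
    "model_b": [0.85, 0.30, 0.35, 0.60, 0.20, 0.90, 0.70, 0.10],
}

# Score every candidate on the same held-out set and keep the highest AUC.
auc_by_model = {name: roc_auc(y_true, s) for name, s in candidate_scores.items()}
best_model = max(auc_by_model, key=auc_by_model.get)

for name, auc in auc_by_model.items():
    print(f"{name}: AUC = {auc:.3f}")
print("best:", best_model)
```

In practice the scores would come from fitted LightGBM, logistic regression, and XGBoost models, and a library routine would normally replace the quadratic-time loop used here for clarity.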

Published

2024-05-25