Customer Churn Prediction Analysis in the E-Commerce Sector Based on Transaction Behavior Using a Machine Learning Approach
DOI:
https://doi.org/10.61132/jumbidter.v3i1.1228
Keywords:
Customer Churn, E-Commerce, Machine Learning, Random Forest, Transaction Behavior
Abstract
Customer churn is a critical issue for the e-commerce industry because it directly affects revenue, customer loyalty, and long-term business sustainability. A high churn rate indicates that a business is failing to retain existing customers and must bear the higher cost of acquiring new ones. A precise analytical approach is therefore needed to identify the behavior patterns of customers who are likely to churn. This study analyzes and predicts customer churn using machine learning methods. It uses the E-Commerce Customer Churn 2025 dataset obtained from Kaggle, which consists of 10,000 customer records and fifteen variables covering transaction behavior, customer characteristics, and churn status. The research stages comprised data preprocessing, descriptive analysis, exploratory data analysis (EDA), and the development of classification models using the Logistic Regression and Random Forest algorithms. The models were evaluated with a confusion matrix and a Receiver Operating Characteristic (ROC) curve to assess their accuracy and their ability to distinguish churned from non-churned customers. The results show that the Random Forest model outperformed Logistic Regression, achieving an ROC-AUC of 1.00. Feature importance analysis further revealed that the days_since_last_purchase variable was the most dominant factor in predicting customer churn. These findings are expected to help e-commerce companies design more effective, data-driven customer retention strategies.
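As an illustration of the two evaluation tools named in the abstract, the following is a minimal sketch of how a confusion matrix and the area under the ROC curve can be computed for a binary churn classifier. It is not the authors' code: the labels, scores, 0.5 threshold, and function names are hypothetical, and the sketch uses only the Python standard library. A real pipeline would more likely rely on a library such as scikit-learn.

```python
from typing import Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, where 1 means 'churned'."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen churner receives a
    higher score than a randomly chosen non-churner (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): true churn labels and model scores.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.6, 0.2, 0.4, 0.8, 0.9, 0.7]

# Turn scores into hard predictions with an assumed 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("ROC-AUC:", roc_auc(y_true, scores))
```

Under this pairwise definition, an ROC-AUC of 1.00 means that every churned customer in the evaluation set received a higher score than every non-churned customer.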
Copyright (c) 2026 Jurnal Manajemen Bisnis Digital Terkini

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


