Customer Churn Prediction in Subscription-Based Businesses Using AdaBoost and Random Forest
Keywords:
Customer Churn, Ensemble Learning, AdaBoost, RF, Combined Model, Feature Importance, Subscription Services
Abstract
Churn analysis is essential in subscription-based industries because it can help improve sales, customer lifetime value, and business efficiency. Conventional statistical techniques have limited capacity to capture complex relationships between variables, which can degrade prediction quality. Ensemble learning methods such as AdaBoost and Random Forest (RF) offer a promising way to address this challenge in terms of both prediction accuracy and interpretability. The aim of this study is to build a model that employs AdaBoost, RF, and a combination of the two to predict customer churn. The Kaggle Customer Churn dataset was used, comprising 7,043 entries with features covering demographics, account characteristics, and service usage. Preprocessing consisted of handling missing values, encoding categorical variables, and normalization. Feature importance analysis was carried out to identify the key predictors of churn. Compared individually, RF outperformed AdaBoost, achieving 0.88 accuracy, 0.85 precision, 0.82 recall, 0.83 F1 score, and 0.90 ROC-AUC. Applying both models together could further improve performance, with the combined model reaching a final ROC-AUC of 0.90. The key predictors were Tenure (0.21), Monthly Charge (0.18), Contract (0.15), Payment Method (0.12), and Tech Support (0.08). These results enable organizations to identify high-risk customers and strengthen their retention strategies. The adoption of ensemble methods, and particularly the combination of AdaBoost and RF, shows that efficient churn prediction is possible.
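The workflow summarized above (train RF and AdaBoost separately, combine them, compare ROC-AUC, and inspect feature importance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it uses scikit-learn, a synthetic dataset from `make_classification` in place of the Kaggle data, default-style hyperparameters, and a soft-voting `VotingClassifier` as one plausible way to combine the two models, since the abstract does not specify the combination scheme. The importances shown are RF's impurity-based importances, which is also an assumption about how the reported values were obtained.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption: NOT the Kaggle dataset).
# The class weights only mimic an imbalanced churn / no-churn split.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.73, 0.27],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Individual models; hyperparameters are illustrative, not the paper's.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
ada = AdaBoostClassifier(n_estimators=100, random_state=0)

# Assumed combination scheme: average the predicted probabilities.
combined = VotingClassifier(
    estimators=[("rf", rf), ("ada", ada)], voting="soft"
)

# Fit each model and record its ROC-AUC on the held-out split.
scores = {}
for name, model in [("rf", rf), ("ada", ada), ("combined", combined)]:
    model.fit(X_train, y_train)
    churn_proba = model.predict_proba(X_test)[:, 1]
    scores[name] = roc_auc_score(y_test, churn_proba)

# Impurity-based importances from the fitted RF; they sum to 1.
importances = rf.feature_importances_

for name, auc in scores.items():
    print(f"{name}: ROC-AUC = {auc:.3f}")
print("Top feature indices:", importances.argsort()[::-1][:5])
```

On real data, the synthetic arrays would be replaced by the preprocessed feature matrix (missing values handled, categorical variables encoded, numeric features normalized), and the scores printed here will not match the figures reported in the abstract.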




