Midwest Social Sciences Journal


Customer lifetime value (CLV), a fundamental concept in customer relationship management, is a crucial metric for identifying profitable retail customers. Various methods are available to predict CLV in different contexts, and with the growth of consumer big data, modern statistical methods and machine learning algorithms have gradually been adopted in CLV modeling. We introduce two machine learning algorithms, the gradient boosting decision tree (GBDT) and the random forest (RF), into CLV modeling for retail customers and compare their predictive performance with that of two classical models, the Pareto/NBD (HB) and the Pareto/GGG. Using 43 weeks of customer transaction data from a large retailer in China, we predict customer value over the following 20 weeks. The results show that GBDT and RF generally predict better than the Pareto/NBD (HB) and Pareto/GGG models. Because the four models' predictions are not entirely consistent, we combine them to make CLV prediction and customer identification more robust and to determine which customers are the most, or least, profitable.