Predicting e-commerce customer satisfaction: traditional machine learning vs. deep learning approaches



JOURNAL OF RETAILING AND CONSUMER SERVICES Volume 79, July 2024, 103865 https://doi.org/10.1016/j.jretconser.2024.103865

ABSTRACT

The rapid growth of e-commerce has


increased the need for retailers to understand and predict customer satisfaction to support data-driven managerial decisions. This study analyzes online consumer behavior through a


comparative machine learning modeling approach to forecast future customer satisfaction based on review ratings. Using a large dataset of over 100 k online orders from a major retailer,


traditional machine learning models including random forest and support vector machines are benchmarked against deep learning techniques like multi-layer perceptrons. The predictive models


are assessed for their ability to accurately predict customer satisfaction scores for the next orders based on key e-commerce features including delivery time, order value, and location. The


findings demonstrate that the random forest model can predict future satisfaction with 92% accuracy, outperforming deep learning. The analysis further identifies core drivers of


satisfaction such as delivery time and order accuracy. These insights enable retail managers to make targeted improvements, like optimizing logistics, to increase customer loyalty and


revenue. This study provides a framework for leveraging predictive analytics and machine learning to unlock data-driven insights into online consumer behavior and satisfaction for superior


retail decision-making. The focus on generalizable insights across a major retailer enhances the practical applicability of the machine learning approach for the retail sector.

INTRODUCTION


Retailers use customer satisfaction as one of the main ways to measure how well their business is doing. Different studies have shown that overall customer satisfaction is strongly linked to


a company's profits (Bernhardt et al., 2000). Customer satisfaction is the factor that determines whether a company's product or service matches customer needs and is a metric


that can provide business owners with the current status of the business, allowing them to enhance profits and reduce marketing expenses (Gómez et al., 2004). Consumer feedback may assist in


examining previously unconsidered elements, such as delivery, safe packaging, professional and available customer service specialists, and an informative website. Asking clients for their


opinions and respecting their feedback makes them feel valued (Supriyanto et al., 2021). Online


customer reviews help people who are thinking about buying a product, business, or service figure out how good it is. Reports show that online customer reviews have a big effect on a lot of


people's decisions about what to buy (Riaz et al., 2021). Also, companies may learn more about their clients and improve their services as a result of feedback left on review sites


(Zhao et al., 2019). The number of online reviews is growing rapidly. According to data from the review platform Yelp, the number of written reviews exceeded


233 million by the fourth quarter of 2021, and this number is still increasing (Mewada and Dewang, 2023). There are advantages and disadvantages for both businesses and consumers in the


rapidly expanding pool of customer evaluations available on the Internet (Bilal et al., 2021). Information overload from the variety and volume of Internet customer evaluations makes it


difficult for potential buyers, whose capacity to process information is limited, to identify useful reviews. Genuine, favorable evaluations boost business credibility. A negative review can highlight a


company's customer service issues and suggest improvements (Hu and Krishen, 2019; Roetzel, 2019). There are three parts to a customer's shopping experience: before the sale, in


the store, and after the sale (Terblanche, 2018). In the pre-sale phase, the client establishes goals for using the service. In the second, in-store phase, the consumer leaves the store with the purchased goods or


services. In the last, “post-sale” stage, the buyer assesses how their complaints and requests for help were resolved. In our research, we


want to know what a customer thinks right after a shopping trip, so the second phase is the focus of this study. Machine learning algorithms are used in e-commerce to improve customer


experience, increase sales, reduce costs, analyze customer data, and provide personalized recommendations for products and services. They can also be used to detect fraud and automate customer


service tasks. Additionally, machine learning can be used to optimize pricing strategies, improve search engine results, and automate marketing campaigns (Pallathadka et al., 2023). Online


businesses face the challenge of predicting customer satisfaction and review ratings before receiving feedback. Specifically, can businesses estimate the rating a customer is likely to


provide in their next order review before it is submitted? Developing models to effectively forecast ratings enables proactive identification of customer sentiment, allowing businesses to


address issues preemptively and improve experiences. This study aims to predict the next order review (positive or negative) based on historical data on customer orders using machine


learning algorithms and deep learning, enabling businesses to understand satisfaction levels earlier and take corrective actions as needed. This data includes the date of the order, the


product purchased, the customer's review rating, and other customer-related information. Then, we may use these models to identify clients who appear pleased and those who may be


unsatisfied with a product or service. Marketers may utilize this data to adjust their strategies and offer products more in line with what their customers want. The current study also aims to


identify the most important factors that affect customer satisfaction in e-commerce and to build a model that can predict whether a customer will be satisfied. The remaining sections of this work are


structured as follows: Section 2 covers the relevant literature, Section 3 presents the proposed model, Section 4 reports the experiments and findings, Section 5 provides a discussion, and Section 6 gives the conclusion and future work. The principal contribution of this paper is a set of machine learning and deep learning models for next-order review prediction based on a historical dataset from the retail company. In detail, the contributions are:

• A review of the most recent studies in review rating, classification, and prediction.
• A proposal for a framework for classifying customer reviews based on customer historical data.
• A comparison of different supervised machine learning algorithms, namely Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boost Classifier (GBC), alongside a Multi-Layer Perceptron (MLP) as the deep learning model.
• A framework that treats the task as a binary classification problem: a customer with a review score of 1 or 2 is classified as a not-satisfied customer (negative review), and a customer with a review score of 3, 4, or 5 is classified as a satisfied customer (positive review).
• A study of the effects of creating new features, feature selection, and resampling techniques on the quality and efficiency of customer review classification.
• Identification of the key features and drivers of customer satisfaction, which align with prior research, such as delivery performance, pricing, and service experience elements.
• Evidence that the best classifier is RF, with an accuracy of 92%, AUC-ROC of 90%, weighted F1-score of 92%, weighted recall of 91%, and weighted precision of 91%.

SECTION SNIPPETS

RELATED WORK

Related work will be divided into two sections. The first


section will review previous studies using traditional machine learning algorithms, and the second section will review previous studies using deep learning algorithms.

PROPOSED MODEL

This


paper presents a model designed to predict customer reviews in order to enhance the user experience in the e-commerce domain. The proposed model encompasses a comparative analysis between


traditional machine learning models and a deep learning model (see Fig. 1). The problem is framed as a binary classification problem, where customers


receiving scores of 1 and 2 are categorized as dissatisfied (indicating a negative review), while those with review scores of ...

EXPERIMENTAL SETUP

Experiments were conducted using Python 3.7.3


in Jupyter Notebooks on a standard laptop equipped with an Intel i7 processor and 12 GB RAM. To build, evaluate, and visualize predictive models, specialized Python data science libraries


such as TensorFlow, Scikit-Learn, Pandas, and Matplotlib were leveraged.

EVALUATION PERFORMANCE

Because the classification datasets have imbalanced classes, care must be taken in


selecting appropriate evaluation metrics that can provide insights into model performance on the ...

DISCUSSION

This section provides an overview of the experimental findings. Additionally,


it presents a comparative analysis between the proposed model and prior research.

CONCLUSION AND FUTURE WORK

This study demonstrates a successful machine learning approach for predicting online


retail customer satisfaction based on review ratings. The random forest model performed robustly on a large dataset from a major e-commerce retailer, achieving an accuracy of 92% in


classifying satisfied customers. These predictive insights into future consumer behavior enable data-driven managerial decisions to improve customer loyalty and retail performance. The top


features identified by the RF model align ...

FUNDING

No funding.

CREDIT AUTHORSHIP CONTRIBUTION STATEMENT

MAHA ZAGHLOUL: Writing – original draft, Visualization, Validation, Software, Methodology, Conceptualization. SHERIF BARAKAT: Writing – review & editing, Supervision. AMIRA REZK: Writing – review & editing, Supervision, Conceptualization.

DECLARATION OF COMPETING INTEREST

The authors confirm that they do not have any conflicts of interest to declare about this manuscript. They have no financial interests, affiliations, or other competing interests that could potentially bias their involvement in the publication process. Their contributions to this manuscript are made in good faith and with full transparency.

ACKNOWLEDGMENTS


This work is part of a doctoral dissertation that is now in its last stages of completion.

REFERENCES (54)

* B. Bansal et al. SENTIMENT CLASSIFICATION OF ONLINE CONSUMER REVIEWS USING WORD VECTOR REPRESENTATIONS. PROCEDIA COMPUT. SCI. (2018)
* K.L. Bernhardt et al. A LONGITUDINAL ANALYSIS OF SATISFACTION AND PROFITABILITY. J. BUS. RES. (2000)
* M. Bilal et al. PROFILING REVIEWERS' SOCIAL NETWORK STRENGTH AND PREDICTING THE “HELPFULNESS” OF ONLINE CUSTOMER REVIEWS. ELECTRON. COMMER. RES. APPL. (2021)
* M. Buda et al. A SYSTEMATIC STUDY OF THE CLASS IMBALANCE PROBLEM IN CONVOLUTIONAL NEURAL NETWORKS. NEURAL NETWORK. (2018)
* S. Dong et al. A SURVEY ON DEEP LEARNING AND ITS APPLICATIONS. COMPUTER SCIENCE REVIEW (2021)
* M.I. Gómez et al. CUSTOMER SATISFACTION AND RETAIL SALES PERFORMANCE: AN EMPIRICAL INVESTIGATION. J. RETAILING (2004)
* H. Hu et al. WHEN IS ENOUGH, ENOUGH? INVESTIGATING PRODUCT REVIEWS AND INFORMATION OVERLOAD FROM A CONSUMER EMPOWERMENT PERSPECTIVE. J. BUS. RES. (2019)
* H. Izadkhah. TRAINING MULTILAYER NEURAL NETWORKS
* R. Khalid et al. A SURVEY ON HYPERPARAMETERS OPTIMIZATION ALGORITHMS OF FORECASTING MODELS IN SMART GRID. SUSTAIN. CITIES SOC. (2020)
* P. Kumar et al. NSL-BP: A META CLASSIFIER MODEL BASED PREDICTION OF AMAZON PRODUCT REVIEWS. INT. J. INT. MULTIMEDIA AND ARTIFICIAL INTELLIGENCE (2021)
* H. Pallathadka et al. APPLICATIONS OF ARTIFICIAL INTELLIGENCE IN BUSINESS MANAGEMENT, E-COMMERCE AND FINANCE. MATER. TODAY: PROC. (2023)
* D.A. Pisner et al. SUPPORT VECTOR MACHINE
* B.S. Raghuwanshi et al. SMOTE BASED CLASS-SPECIFIC EXTREME LEARNING MACHINE FOR IMBALANCED LEARNING. KNOWL. BASE SYST. (2020)
* N. Ravikumar et al. DEEP LEARNING FUNDAMENTALS
* N.S. Terblanche. REVISITING THE SUPERMARKET IN-STORE CUSTOMER SHOPPING EXPERIENCE. J. RETAILING CONSUM. SERV. (2018)
* Y. Zhao et al. PREDICTING OVERALL CUSTOMER SATISFACTION: BIG DATA EVIDENCE FROM HOTEL ONLINE TEXTUAL REVIEWS. INT. J. HOSPIT. MANAG. (2019)
* B.H. Ahmed et al. REVIEW RATING PREDICTION FRAMEWORK USING DEEP LEARNING. J. AMBIENT INTELL. HUM. COMPUT. (2022)
* P. Bahad et al. STUDY OF ADABOOST AND GRADIENT BOOSTING ALGORITHMS FOR PREDICTIVE ANALYTICS (2020)
* V. Balakrishnan et al. A DEEP LEARNING APPROACH IN PREDICTING PRODUCTS' SENTIMENT RATINGS: A COMPARATIVE ANALYSIS. J. SUPERCOMPUT. (2022)
* S.D. Bappon et al. SENTIMENT ANALYSIS OF BENGALI TEXTS ON ONLINE TECH GADGET REVIEWS USING MACHINE LEARNING
* R. Battiti. USING MUTUAL INFORMATION FOR SELECTING FEATURES IN SUPERVISED NEURAL NET LEARNING. IEEE TRANS. NEURAL NETWORK. (1994)
* P. Branco et al. A SURVEY OF PREDICTIVE MODELING ON IMBALANCED DOMAINS. ACM COMPUT. SURV. (2017)
* Y. Cao et al. MIAC: MUTUAL-INFORMATION CLASSIFIER WITH ADASYN FOR IMBALANCED CLASSIFICATION
* W. Cao et al. USER ONLINE PURCHASE BEHAVIOR PREDICTION BASED ON FUSION MODEL OF CATBOOST AND LOGIT
* C. Colón-Ruiz et al. COMPARING DEEP LEARNING ARCHITECTURES FOR SENTIMENT ANALYSIS ON DRUG REVIEWS. J. BIOMED. INF. (2020)
* A. Das. LOGISTIC REGRESSION
* M. Fouad et al. EFFECTIVE E-COMMERCE BASED ON PREDICTING THE LEVEL OF CONSUMER SATISFACTION (2023)

CITED BY (28)

* USING MACHINE


LEARNING TO DEVELOP CUSTOMER INSIGHTS FROM USER-GENERATED CONTENT. 2024, Journal of Retailing and Consumer Services. Citation excerpt: By leveraging customer information to justify or


explain past actions and future decisions, symbolic insights help firms uncover the symbolic meaning associated with customers’ behavior, encouraging a deeper exploration of stakeholder


dynamics in the context of brand evaluation (Bharadwaj et al., 2013; Hollebeek et al., 2022; Kühl et al., 2020). Looking ahead, CI will see the likely integration of sophisticated (e.g.,


deep or reinforcement learning) AI models (Zaghloul et al., 2024), which can handle more intricate data structures, thus providing deeper insight. Moreover, multimodal data analysis, which


combines text, images, and videos, is gaining traction (Mohri et al., 2018). Abstract: Uncovering customer insights (CI) is indispensable for contemporary marketing strategies. The


widespread availability of user-generated content (UGC) presents a unique opportunity for firms to gain a nuanced understanding of their customers. However, the size and complexity of UGC


datasets pose significant challenges for traditional market research methods, limiting their effectiveness in this context. To address this challenge, this study leverages natural language


processing (NLP) and machine learning (ML) techniques to extract nuanced insights from UGC. By integrating sentiment analysis and topic modeling algorithms, we analyzed a dataset of


approximately four million X posts (formerly tweets) encompassing 20 global brands across industries. The findings reveal primary brand-related emotions and identify the top 10 keywords


indicative of brand-related sentiment. Using FedEx as a case study, we identify five prominent areas of customer concern: parcel tracking, small business services, the firm's


comparative performance, package delivery dynamics, and customer service. Overall, this study offers a roadmap for academics to navigate the complex landscape of generating CI from UGC


datasets. It thus raises pertinent practical implications, including boosting customer service, refining marketing strategies, and better understanding customer needs and preferences,


thereby contributing to more effective, more responsive business strategies.

* A METHOD FOR EXPLORING CONSUMER SATISFACTION FACTORS USING ONLINE REVIEWS: A STUDY ON ANTI-COLD DRUGS. 2024,


Journal of Retailing and Consumer Services. Abstract: Anti-cold drugs prove effective in alleviating cold symptoms and minimizing complications. In recent years, there has been a growing


tendency among the public to purchase these drugs online, driven by the prevalence of influenza. Maintaining a high level of consumer satisfaction is crucial for online pharmaceutical


platforms. This study introduces a Multi-Attribute Decision Making (MADM) method to analyze online reviews from 31,392 consumers at JD Pharmacy. Unlike previous studies, multiple analytical


tools are employed jointly to identify factors influencing consumer satisfaction. Four dosage forms of anti-cold drugs (granule, tablet, syrup, and spray) were analyzed using the Latent


Dirichlet Allocation Model (LDA). Three tools (topic coherence curve, topic perplexity curve, and pyLDAvis visualization) were jointly applied. Then five satisfaction factors were identified


as logistics distribution, drug efficacy, online consultation service, drug prices, and packaging quality. PROMETHEE II method was applied to rank the above factors. The results revealed


variations in consumer satisfaction among different drug dosage forms. We conducted a detailed analysis of these distinctions. The results from this study will provide an effective reference


to improve consumer satisfaction.

* AN EXPLAINABLE MACHINE LEARNING MODEL FOR SENTIMENT ANALYSIS OF ONLINE REVIEWS. 2024, Knowledge-Based Systems. Citation excerpt: Despite its advantages,


deep learning does not consistently outperform traditional machine learning methods. Therefore, the choice between the two should be based on the specific context and requirements of the


task [19,20]. For instance, deep learning methods are not suitable for small datasets; instead, they require large datasets for training, as the effectiveness of the algorithm tends to


increase with dataset size [21,22]. Abstract: Over the last two decades and with the widespread use of social media and e-commerce sites, scientific research in the field of sentiment


analysis (SA) has made considerable progress in terms of obtained results and the number of published articles. The greatest part of this progress has been achieved by SA systems based on


machine learning. However, most of these systems lack transparency and explainability, making it difficult to understand their internal processes and consequently to trust their decisions


and predictions. To address this problem, we propose an easy-to-use machine learning model based on an intuitive geometric approach for SA of online reviews. For linearly separable data, we


adopt an iterative algorithm called the explainable algorithm for binary linear classification (EABLC) to determine the maximum-margin separating hyperplane based on the geometric concept of


the convex hull. As an extension of EABLC, two new algorithms are further proposed, namely, the soft explainable algorithm for binary classification and the explainable algorithm for binary


polyhedral classification, to avoid outliers and deal with linearly nonseparable data. Aside from its simplicity and intuitiveness, experimental results on the Amazon product and movie


review sentiment datasets demonstrate the efficiency and robustness of our model, which outperforms ten benchmark classification algorithms.

* ONLINE REVIEWS MEET VISUAL ATTENTION: A STUDY ON CONSUMER PATTERNS IN ADVERTISING, ANALYZING CUSTOMER SATISFACTION, VISUAL ENGAGEMENT, AND PURCHASE INTENTION. 2024, Journal of Theoretical and Applied Electronic Commerce Research
* MANAGEMENT AND SALES FORECASTING OF AN E-COMMERCE INFORMATION SYSTEM USING DATA MINING AND CONVOLUTIONAL NEURAL NETWORKS. 2024, Indian Journal of Information Sources and Services
* SMART DISTRIBUTION IN E-COMMERCE: HARNESSING MACHINE LEARNING AND DEEP LEARNING APPROACHES FOR IMPROVED LOGISTICS. 2024, International Journal of Computational and Experimental Science and Engineering

© 2024 Elsevier Ltd. All rights reserved.
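
ILLUSTRATIVE SKETCH: BINARY LABELLING AND WEIGHTED METRICS

Returning to the study itself: the paper frames satisfaction as a binary target (review scores 1 and 2 are negative, 3 to 5 are positive) and reports weighted precision, recall, and F1 because the classes are imbalanced. The following is a minimal sketch of those two ideas using only the Python standard library. It is not the authors' code (they report using Scikit-Learn and TensorFlow); the function names `to_binary_label` and `weighted_scores` and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def to_binary_label(review_score: int) -> int:
    """Map a 1-5 review score to the binary target described in the paper.

    Scores 1-2 -> 0 (negative review), scores 3-5 -> 1 (positive review).
    """
    if review_score not in (1, 2, 3, 4, 5):
        raise ValueError(f"review score must be 1..5, got {review_score!r}")
    return 1 if review_score >= 3 else 0


def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall, and F1.

    Each class's metric is weighted by its share of the true labels,
    which is the usual meaning of a "weighted" average for these metrics.
    """
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        p_c = true_pos / n_pred if n_pred else 0.0
        r_c = true_pos / n_true
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n_true / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: eight review scores and a classifier's predictions.
scores = [5, 4, 1, 3, 2, 5, 5, 4]
y_true = [to_binary_label(s) for s in scores]
y_pred = [1, 1, 1, 1, 0, 1, 1, 1]  # one negative review is missed
metrics = weighted_scores(y_true, y_pred)
print(metrics)
```

In this toy example the positive class dominates, so a classifier that misses one of the two negative reviews still obtains a weighted recall of 0.875. This is why, under class imbalance, per-class and weighted metrics are worth reading alongside plain accuracy.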