ISSN 1309-1581

Bankacılıkta Müşteri Verilerini Anlama: CRISP-DM Yaklaşımı ve Makine Öğrenimi Uygulamaları

Understanding Customer Data in Banking: CRISP-DM Approach and Machine Learning Applications
DOI: 10.5824/ajite.2025.01.004.x
Pages: 68-87
Abstract

This study highlights the importance of customer information for businesses and examines how banks, whose work centers on financial transactions, can process and analyze that information to inform management strategy. Using an anonymized dataset provided by Moro et al. (2014) and revised for the purposes of the study, it examines the factors that influence whether bank customers subscribe to a term deposit and offers data-supported recommendations for marketing strategy. The study followed the CRISP-DM methodology through its Business Understanding, Data Understanding, Data Preparation, Modeling, and Evaluation stages and applied several machine learning techniques (Logistic Regression, Decision Trees, Random Forest, Support Vector Machines (SVM), and XGBoost) to identify the variables that influence customer behavior. Logistic Regression was the best-performing model. The findings indicate that being retired or a student, a high education level, success in a previous campaign, and the month of March increase the likelihood that a customer subscribes to a term deposit, whereas financial obligations such as a mortgage or credit debt decrease it. Recommendations include building strategies around the significant variables (e.g., occupation, education level, marital status) and targeting marketing at individuals with a high level of education. Because financial constraints are associated with a lower likelihood of subscription, tailored recommendations can be developed for these groups as well. Limitations include the age of the dataset and the reliance on an oversampling technique to address class imbalance. Future research may achieve more comprehensive results by using different machine learning models and larger datasets.
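The abstract combines two technical steps: oversampling to address class imbalance, and a Logistic Regression model whose coefficients show which variables raise or lower the likelihood of subscription. The minimal Python sketch below illustrates that combination on synthetic data. It is not the authors' pipeline. The two features (`prev_success`, `housing_loan`), the data-generating coefficients, the use of simple random oversampling (the abstract does not name the technique), and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_data(n=2000, seed=1):
    """Synthetic customers; features are [prev_success, housing_loan] (hypothetical)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        prev_success = 1.0 if rng.random() < 0.1 else 0.0
        housing_loan = 1.0 if rng.random() < 0.5 else 0.0
        # Assumed ground truth: past success helps, a housing loan hurts.
        logit = -2.5 + 2.0 * prev_success - 1.0 * housing_loan
        labels.append(1 if rng.random() < sigmoid(logit) else 0)
        rows.append([prev_success, housing_loan])
    return rows, labels


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def fit_logistic(rows, labels, lr=0.5, epochs=200):
    """Batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rows, labels = make_toy_data()
bal_rows, bal_labels = oversample_minority(rows, labels)
weights, bias = fit_logistic(bal_rows, bal_labels)

# exp(coefficient) is the odds ratio for a one-unit change in that feature.
odds_ratios = [math.exp(w) for w in weights]
for name, w, ratio in zip(["prev_success", "housing_loan"], weights, odds_ratios):
    print(f"{name}: coefficient={w:+.2f}, odds ratio={ratio:.2f}")
```

A coefficient above zero (odds ratio above 1) raises the modeled odds of subscription, and a coefficient below zero lowers them. Training on a rebalanced sample shifts the intercept, so predicted probabilities from such a model are not calibrated to the original class frequencies. This is one reason oversampling is reasonably listed as a limitation.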
References
  1. Abdulsalam, T. A., & Tajudeen, R. B. (2024). Artificial intelligence (AI) in the banking industry: A review of service areas and customer service journeys in emerging economies. Business & Management Compass, 68(3), 19-43. https://doi.org/10.56065/9hfvrq20
  2. Agarwal, V., Taware, S., Yadav, S., Gangodkar, D., Rao, A., & Srivastav, V. (2022). Customer-churn prediction using machine learning. Proceedings of ICTACS, 899. https://doi.org/10.1109/ICTACS56270.2022.9988187
  3. Bach, M. P., Juković, S., Dumičić, K., & Šarlija, N. (2013). Business client segmentation in banking using self-organizing maps. South East European Journal of Economics and Business, 8(2), 32-41. https://doi.org/10.2478/jeb-2013-0007
  4. Balamurugan, M. (2024). AI-driven adaptive content marketing: Automating strategy adjustments for enhanced consumer engagement. International Journal For Multidisciplinary Research, 6. https://doi.org/10.36948/ijfmr.2024.v06i05.27940
  5. Bhatore, S., Mohan, L., & Reddy, Y. R. (2020). Machine learning techniques for credit risk evaluation: A systematic literature review. Journal of Banking and Financial Technology, 4(1), 111-138. https://doi.org/10.1007/s42786-020-00020-3
  6. Bilge, C., & Nur, T. (2023). Finansal performansı etkileyen içsel faktörler: Mevduat ve katılım bankaları üzerine bir uygulama. Ardahan Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi, 5(2), Article 2. https://doi.org/10.58588/aru-jfeas.1292408
  7. Bumin, M. (2023). Türk bankacılık sektöründe mevduat bankalarının karlılık performansını etkileyen faktörlerin panel regresyon analizi ile belirlenmesi. Muhasebe ve Finansman Dergisi, 100, Article 100. https://doi.org/10.25095/mufad.1326939
  8. Campbell, J. Y., & Cocco, J. F. (2003). Household risk management and optimal mortgage choice. The Quarterly Journal of Economics, 118(4), 1449-1494.
  9. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357. https://doi.org/10.1613/jair.953
  10. Çımrın, A. H., Kaya, İ., & Bahadır, H. (2023). Türkiye'de mavi yakalı çalışan olma hali üzerine bir analiz. Sosyoloji Dergisi, 45, Article 45. https://doi.org/10.59572/sosder.1225697
  11. Claessens, S., & Laeven, L. (2004). What drives bank competition? Some international evidence. Journal of Money, Credit and Banking, 36(3), 563-583.
  12. Demirel, S. (2024). Bankacılıkta dijitalleşmenin müşteri davranışları üzerine etkisi. Retrieved from https://www.gazikitabevi.com.tr/bankacilikta-dijitallesmenin-musteri-davranislari-uzerine-etkisi
  13. Duarte, V., Zuniga-Jara, S., & Contreras, S. (2022). Machine learning and marketing: A systematic literature review. IEEE Access, 10, 93273-93288. https://doi.org/10.1109/ACCESS.2022.3202896
  14. Eckerson, W. W., Hanlon, N., & Barquin, R. (2000). Director of education and research. 5(4).
  15. Fawcett, T., & Provost, F. (2013). Data science for business.
  16. Goodfellow, I. (2016). Deep learning. MIT Press. Retrieved from https://books.google.com/books?hl=en&lr=&id=omivDQAAQBAJ&oi=fnd&pg=PR5&dq=Goodfellow,+I.,+Bengio,+Y.,+%26+Courville,+A.+(2016).+Deep+Learning.+MIT+Press.&ots=MOO5bmsHTY&sig=43Ua_77nITneAJaXBCXMRPWQYbU
  17. Haddadi, S. J., Farshidvard, A., Silva, F. dos S., dos Reis, J. C., & da Silva Reis, M. (2024). Customer churn prediction in imbalanced datasets with resampling methods: A comparative study. Expert Systems with Applications, 246, 123086. https://doi.org/10.1016/j.eswa.2023.123086
  18. He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263-1284. https://doi.org/10.1109/TKDE.2008.239
  19. Ho, S. C., Wong, K. C., Yau, Y. K., & Yip, C. K. (2022). A machine learning approach for predicting bank customer behavior in the banking industry. In Research Anthology on Machine Learning Techniques, Methods, and Applications (pp. 1210-1232). IGI Global. https://doi.org/10.4018/978-1-6684-6291-1.ch063
  20. Hoang, D., & Wiegratz, K. (2023). Machine learning methods in finance: Recent applications and prospects. European Financial Management, 29(5), 1657-1701. https://doi.org/10.1111/eufm.12408
  21. Kotler, P., Keller, K. L., Brady, M., Goodman, M., & Hansen, T. (2016). Marketing management (3rd ed.). Pearson Higher Ed.
  22. Kumar, K., Kuhar, N., & Sharma, M. (2024). Artificial intelligence in the Indian banking system: A systematic literature review (SSRN Scholarly Paper No. 5088937). Social Science Research Network. https://doi.org/10.2139/ssrn.5088937
  23. Lalwani, P., Mishra, M. K., Chadha, J. S., & Sethi, P. (2022). Customer churn prediction system: A machine learning approach. Computing, 104(2), 271-294. https://doi.org/10.1007/s00607-021-00908-y
  24. Leo, M., Sharma, S., & Maddulety, K. (2019). Machine learning in banking risk management: A literature review. Risks, 7(1), 29. https://doi.org/10.3390/risks7010029
  25. Lusardi, A., & Mitchell, O. S. (2014). The economic importance of financial literacy: Theory and evidence. Journal of Economic Literature, 52(1), 5-44. https://doi.org/10.1257/jel.52.1.5
  26. Mahalakshmi, V., Kulkarni, N., Pradeep Kumar, K. V., Suresh Kumar, K., Nidhi Sree, D., & Durga, S. (2022). The role of implementing artificial intelligence and machine learning technologies in the financial services industry for creating competitive intelligence. Materials Today: Proceedings, 56, 2252-2255. https://doi.org/10.1016/j.matpr.2021.11.577
  27. Marbán, S., Mariscal, G., & Segovia, J. (2009). A data mining & knowledge discovery process model. In J. Ponce & A. Karahoca (Eds.), Data mining and knowledge discovery in real life applications. I-Tech Education and Publishing. https://doi.org/10.5772/6438
  28. Marqués, A. I., García, V., & Sánchez, J. S. (2013). A literature review on the application of evolutionary computing to credit scoring. Journal of the Operational Research Society, 64(9), 1384-1399. https://doi.org/10.1057/jors.2012.145
  29. Martínez-Plumed, F., Contreras-Ochando, L., Ferri, C., Hernández-Orallo, J., Kull, M., Lachiche, N., Ramírez-Quintana, M. J., & Flach, P. (2021). CRISP-DM twenty years later: From data mining processes to data science trajectories. IEEE Transactions on Knowledge and Data Engineering, 33(8), 3048-3061. https://doi.org/10.1109/TKDE.2019.2962680
  30. Mienye, I. D., & Jere, N. (2024). Deep learning for credit card fraud detection: A review of algorithms, challenges, and solutions. IEEE Access, 12, 96893-96910. https://doi.org/10.1109/ACCESS.2024.3426955
  31. Moro, S., Cortez, P., & Rita, P. (2014). A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 62, 22-31. https://doi.org/10.1016/j.dss.2014.03.001
  32. Nayak, S. (2024). Enhancing credit risk assessment using machine learning: A case study on early payment risk prediction.
  33. Noriega, J. P., Rivera, L. A., & Herrera, J. A. (2023). Machine learning for credit risk prediction: A systematic literature review. Data, 8(11), 169. https://doi.org/10.3390/data8110169
  34. Osei, F., Ampomah, G., Kankam-Kwarteng, C., Opoku Bediako, D., & Mensah, R. (2021). Customer satisfaction analysis of banks: The role of market segmentation. Science Journal of Business and Management, 9(2), 126. https://doi.org/10.11648/j.sjbm.20210902.19
  35. Özdemir, G. A. (2021). Dijital bankacılıkta müşteri deneyiminin öncüllerinin ve ardıllarının analizi [Ph.D.]. Retrieved from https://www.proquest.com/docview/2637686536/abstract/C61D402B481E4562PQ/1
  36. Plotnikova, V., Dumas, M., & Milani, F. P. (2022). Applying the CRISP-DM data mining process in the financial services industry: Elicitation of adaptation requirements. Data & Knowledge Engineering, 139, 102013. https://doi.org/10.1016/j.datak.2022.102013
  37. Probesto. (2024). Etkili müşteri segmentasyonu: Yapay zekanın gücünü açığa çıkarma-Probesto. Retrieved from https://www.probesto.com/tr/etkili-musteri-segmentasyonu-yapay-zekanin-gucunu-aciga-cikarma/
  38. Rane, N., Choudhary, S., & Rane, J. (2023). Explainable artificial intelligence (XAI) approaches for transparency and accountability in financial decision-making (SSRN Scholarly Paper No. 4640316). Social Science Research Network. https://doi.org/10.2139/ssrn.4640316
  39. Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K.-R. (2021). Explaining deep neural networks and beyond: A review of methods and applications. Proceedings of the IEEE, 109(3), 247-278. https://doi.org/10.1109/JPROC.2021.3060483
  40. Sandhya Kona, S. (2020). Customer segmentation and personalization in banking services: Investigating the use of big data analytics to segment banking customers based on their behavior, demographics, and preferences, and leveraging these insights to personalize banking services and marketing campaigns. International Journal of Science and Research (IJSR), 9(8), 1566-1570. https://doi.org/10.21275/SR24522131706
  41. Shi, S., Tse, R., Luo, W., D'Addona, S., & Pau, G. (2022). Machine learning-driven credit risk: A systemic review. Neural Computing and Applications, 34(17), 14327-14339. https://doi.org/10.1007/s00521-022-07472-2
  42. Shmueli, G., Bruce, P. C., Yahav, I., Patel, N. R., & Lichtendahl, K. C., Jr. (2017). Data mining for business analytics: Concepts, techniques, and applications in R. John Wiley & Sons.
  43. Smeureanu, I., Ruxanda, G., & Badea, L. M. (2013). Customer segmentation in private banking sector using machine learning techniques. Journal of Business Economics and Management, 14(5), 923-939. https://doi.org/10.3846/16111699.2012.749807
  44. Verhoef, P., Donkers, B., Langerak, F., Leeflang, P., & Lemon, L. (2003). Understanding the effect of customer relationship management efforts on customer retention and customer share development. Journal of Marketing, 67, 30-45. https://doi.org/10.1509/jmkg.67.4.30.18685
  45. Wang, S., Asif, M., Shahzad, M. F., & Ashfaq, M. (2024). Data privacy and cybersecurity challenges in the digital transformation of the banking sector. Computers & Security, 147, 104051. https://doi.org/10.1016/j.cose.2024.104051
  46. Xiang, S., Zhu, M., Cheng, D., Li, E., Zhao, R., Ouyang, Y., Chen, L., & Zheng, Y. (2023). Semi-supervised credit card fraud detection via attribute-driven graph representation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(12), 14557-14565. https://doi.org/10.1609/aaai.v37i12.26702
  47. Zaki, A. M., Khodadadi, N., Lim, W. H., & Towfek, S. K. (2024). Predictive analytics and machine learning in direct marketing for anticipating bank term deposit subscriptions. American Journal of Business and Operations Research, 11(1), 79-88. https://doi.org/10.54216/AJBOR.110110
  48. Zhuang, Q. R., Yao, Y. W., & Liu, O. (2018). Application of data mining in term deposit marketing. Hong Kong.