(Day 195) Reading about bank term deposit subscription prediction models

Ivan Ivanov · July 14, 2024

Hello :) Today is Day 195!

One of my lab mates sent me a few papers to skim through to help for his team’s project related to predicting bank deposit subscriptions. Thanks to ChatGPT skimming is very easy now. Below are the outputs from ChatGPT on the five papers I got.

Predictive Analytics and Machine Learning in Direct Marketing for Anticipating Bank Term Deposit Subscriptions

  • Introduction:
    • Direct marketing is essential for personalized client communication in banking.
    • Predictive analytics and machine learning offer new opportunities for refining marketing strategies.
    • The research aims to enhance direct marketing’s effectiveness by applying sophisticated analytical models.
  • Literature Review:
    • Examines eight studies on machine learning and data mining in banking.
    • Highlights methodologies like the S_Kohonen network, Improved Whale Optimization Algorithm, META-DES-AAP, and various machine learning models.
    • Emphasizes the importance of time deposits, customer credit products, and telemarketing success in banks.
  • Proposed Methodology:
    • Data Collection and Exploration: Utilizes Kaggle datasets and explores them using visualization tools.
    • Exploratory Data Analysis (EDA): Uses techniques like crosstabulation and heatmaps to uncover patterns and relationships in the data.
    • Feature Engineering and Preprocessing: Transforms and prepares data for machine learning models through encoding and scaling.
    • Model Implementation: Applies models like SGD Classifier, K-Nearest Neighbors, Logistic Regression, Gaussian Naive Bayes, Decision Tree, and Random Forest Classifiers. Evaluates their performance using accuracy, precision, and F1 score.
  • Results and Evaluation:
    • Presents classification results, highlighting the Random Forest Classifier as the best performer.
    • Evaluates models based on metrics like accuracy, sensitivity, specificity, PPV, NPV, and F1 score.
  • Conclusion:
    • Predictive analytics and machine learning significantly impact direct marketing in banking.
    • The study provides a comprehensive analysis and evaluation of different machine learning models.
    • Findings guide future strategies for direct marketing in the banking sector.

Bank Predictions for Prospective Long-Term Deposit Investors Using Machine Learning LightGBM and SMOTE

  • Objective: Investigates methods for predicting potential long-term deposit investors using historical bank data and machine learning techniques.
  • Methods:
    • Uses machine learning algorithms such as logistic regression, Random Forest, and LightGBM to predict consumer interest based on features like age, job, marital status, and more.
    • Applies SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset, improving prediction accuracy.
  • Data: Kaggle dataset with 32,950 entries and 16 features. Data preprocessing includes cleaning and removing irrelevant data.
  • Results:
    • LightGBM: Achieved the highest accuracy at 90.63%.
    • Random Forest: Accuracy at 90.34%.
    • Logistic Regression: Accuracy at 88.89%.
  • Conclusion: LightGBM combined with SMOTE provides the most accurate predictions for long-term deposit investments, demonstrating higher prediction accuracy compared to previous studies.

Identifying the Best Machine Learning Model for Predicting Bank Term Deposits: An Empirical Study Using Public, Post Financial Crisis Data

  • Introduction:
    • Bank term deposits are a strategy for banks to secure stable funds for lending and investment.
    • Importance of net interest margin, the difference between interest earned from borrowers and interest paid to depositors.
    • Research question: “Which classification model can accurately predict bank term deposits from the public?”
  • Methodology:
    • Bias-Variance Tradeoff: Essential in machine learning to minimize overfitting and underfitting.
    • Accuracy Paradox: Imbalanced dataset with many more instances of customers not making deposits.
    • Statistical Models:
      • Binomial Logistic Regression: Predicts the binary target variable using the logistic function.
      • Decision Tree Classifier: Uses a tree-like structure of nodes to make decisions, avoiding overfitting with pruning and cross-validation.
      • Artificial Neural Network (ANN): Uses interconnected neurons in layers, adjusting weights and biases through training.
      • Support Vector Machine (SVM): Finds the optimal hyperplane to separate classes using kernel functions.
  • Conclusion: The paper compares models to determine the best predictor of bank term deposits, focusing on balancing precision and recall to effectively target potential customers.

Bank Deposit Prediction Using Ensemble Learning

  • Background: Predicting bank deposits is crucial for managing liquidity and making informed decisions.
  • Objective: Develop an ensemble learning approach to predict bank deposits using a combination of machine learning algorithms.
  • Methodology:
    • Data Collection: 10,000 customer records from a Bangladeshi bank with 14 attributes.
    • Data Preprocessing: Cleaning, transforming, and normalizing the dataset.
    • Feature Selection: Correlation-based feature selection method to select relevant features.
    • Ensemble Learning:
      • Trained three base models: Decision Tree, Random Forest, and Gradient Boosting.
      • Combined these models using stacking and voting techniques to create an ensemble model.
  • Results:
    • Ensemble model achieved a higher accuracy of 92.15% in predicting bank deposits.
    • Stacking ensemble approach outperformed the voting approach.
    • Most significant predictors: average balance, age, and occupation.
  • Conclusion: The study demonstrates the effectiveness of ensemble learning in predicting bank deposits, highlighting the potential of machine learning techniques in improving prediction accuracy.

Applying Machine Learning to the Development of Prediction Models for Bank Deposit Subscription

  • Background: Bank deposit subscription is crucial for banking, and predicting customer behavior is essential for effective marketing strategies.
  • Objective: Develop prediction models using machine learning techniques to forecast bank deposit subscription behavior.
  • Methodology:
    • Collected a dataset of 10,000 customers from a commercial bank, including demographic, financial, and behavioral features.
    • Applied four machine learning algorithms: Decision Trees, Random Forest, Support Vector Machines (SVM), and Gradient Boosting.
    • Trained and evaluated models using a 70:30 split of the dataset for training and testing.
    • Used performance metrics like accuracy, precision, recall, F1-score, and area under the ROC curve (AUC).
  • Results:
    • Gradient Boosting algorithm achieved the best performance with an accuracy of 86.2%, precision of 84.5%, recall of 88.1%, F1-score of 86.3%, and AUC of 0.93.
    • Most important features: customer age, income, credit score, and deposit history.
  • Conclusion: The study shows the effectiveness of machine learning in predicting bank deposit subscription behavior, suggesting banks can use these models for targeted marketing and improved customer retention.

On another note, I am staying up late today to watch the EURO final, but while waiting for 4am to come, I am transferring posts to my GitHub blog. I hope I get a decent amount in.

That is all for today!

See you tomorrow :)