Abstract:
In recent years, machine learning models have gained popularity in credit scoring applications
due to their ability to handle large volumes of data and capture complex patterns.
However, the lack of transparency and interpretability in these models raises concerns
regarding their trustworthiness and fairness. This study aims to address this matter by
employing the Shapley Additive Explanations (SHAP) approach to analyse the explainability
of credit scoring machine learning models. The Lending Club dataset, a comprehensive
collection of loan applications and associated attributes, is utilised for this analysis.
The methodology involves training and evaluating various credit scoring models, including
Random Forest, XGBoost, and CatBoost, and generating SHAP values to quantify the
importance of input features in the prediction process. The results reveal valuable insights
into the factors influencing credit scoring decisions and provide a holistic understanding
of the models’ behaviour. By utilising SHAP explanations, we gain interpretability and
can identify the features that most strongly influence credit scoring outcomes. This knowledge
can help stakeholders, including lenders and regulators, make informed decisions and
improve the transparency and accountability of credit scoring systems. The findings
of this study contribute to the growing field of explainable artificial intelligence (AI) and its
application to credit risk management. By enhancing the explainability of
credit scoring models, we aim to increase trust, fairness, and accountability in the lending
process, ultimately shaping a more inclusive and responsible financial ecosystem.
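To make the SHAP idea referenced above concrete, the sketch below computes exact Shapley values for a toy linear "credit score" over three applicant features, averaging each feature's marginal contribution across all orderings. The model, feature names, and weights are purely illustrative assumptions, not the study's models or data; in practice the study applies the `shap` library to trained Random Forest, XGBoost, and CatBoost models.

```python
from itertools import permutations

# Hypothetical weighted-sum "credit score" model over three features.
# Weights are illustrative only, not taken from the study.
WEIGHTS = {"income": 0.5, "debt_ratio": -0.3, "loan_amount": -0.2}

def score(features):
    """Score using only the features present; a missing feature
    contributes 0 (the baseline for an unrevealed feature)."""
    return sum(WEIGHTS[f] * v for f, v in features.items())

def shapley_values(applicant):
    """Exact Shapley values: average each feature's marginal
    contribution over every order in which features are revealed."""
    names = list(applicant)
    contrib = {f: 0.0 for f in names}
    orderings = list(permutations(names))
    for order in orderings:
        revealed = {}
        for f in order:
            before = score(revealed)
            revealed[f] = applicant[f]
            contrib[f] += score(revealed) - before
    return {f: c / len(orderings) for f, c in contrib.items()}

applicant = {"income": 60.0, "debt_ratio": 0.4, "loan_amount": 20.0}
phi = shapley_values(applicant)
# Additivity property: the Shapley values sum to the prediction
# minus the (zero) baseline.
assert abs(sum(phi.values()) - score(applicant)) < 1e-9
```

For a linear model each feature's Shapley value reduces to its weight times its value, which is why practical SHAP tooling exists for the tree ensembles named in the abstract, where no such closed form applies.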