The client is an Indian financial services company. Among its loan products, the volume of SME loans is so low that developing machine learning models in the traditional way is not feasible.
- The client wanted to develop a quantitative scorecard to underwrite its SME portfolios.
- Leverage different techniques that increase the stability and generalizability of the models.
- Detailed analysis of target definitions, along with reject inference, to mimic the Eventual-NPA.
- Use of 3-fold cross-validation (k=3) for robust model assessment.
- Use of data augmentation to synthetically create records of both the majority and minority classes in the cross-validated training datasets, increasing the stability of the models.
- Leverage random forest variable importance to select the best variables for the scorecard.
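The interaction between the cross-validation and data augmentation steps above can be sketched roughly as follows. This is a minimal illustration, not the method used in the engagement: the augmentation here (jittered copies of resampled rows) is an assumed stand-in, since the actual technique is not named, and the function names, the `noise` and `n_per_class` parameters, and the toy data are all hypothetical. The point it shows is that synthetic rows for both classes are added only to each fold's training portion, so the held-out fold contains real records only.

```python
import random

def stratified_folds(labels, k=3, seed=0):
    """Split row indices into k folds, keeping the class mix similar in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def augment(rows, labels, n_per_class, noise=0.05, seed=0):
    """Append n_per_class jittered copies of randomly drawn rows for each class.

    Assumption: Gaussian jitter on numeric features stands in for whatever
    augmentation technique was actually used.
    """
    rng = random.Random(seed)
    out_rows, out_labels = list(rows), list(labels)
    for cls in sorted(set(labels)):
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(n_per_class):
            base = rng.choice(pool)
            out_rows.append([v + rng.gauss(0.0, noise) for v in base])
            out_labels.append(cls)
    return out_rows, out_labels

def cv_splits_with_augmentation(rows, labels, k=3, n_per_class=20, seed=0):
    """Yield (train_rows, train_labels, test_rows, test_labels) for each fold.

    Only the training portion is augmented; the held-out fold is left untouched.
    """
    folds = stratified_folds(labels, k, seed)
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        tr_rows = [rows[i] for i in train_idx]
        tr_labels = [labels[i] for i in train_idx]
        aug_rows, aug_labels = augment(tr_rows, tr_labels, n_per_class, seed=seed + f)
        yield (aug_rows, aug_labels,
               [rows[i] for i in test_idx], [labels[i] for i in test_idx])

# Hypothetical toy portfolio: 30 loans, 6 of them defaults (label 1).
rng = random.Random(42)
labels = [1] * 6 + [0] * 24
rows = [[rng.random(), rng.random()] for _ in labels]
splits = list(cv_splits_with_augmentation(rows, labels))
```

Each augmented training set could then be passed to a tree ensemble (for example, scikit-learn's `RandomForestClassifier` and its `feature_importances_` attribute) to rank candidate scorecard variables per fold; variables that rank highly in all three folds are the more stable choices.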
Despite the poor quality and limited quantity of the data, performance on the client’s out-of-time validation datasets was close to the performance observed during model development.