A Novel Default Risk Prediction and Feature Importance Analysis Technique for Marketplace Lending using Machine Learning

Imam, S., Huhn, S., Hornuf, L., & Drechsler, R. (2023). A Novel Default Risk Prediction and Feature Importance Analysis Technique for Marketplace Lending using Machine Learning. Credit and Capital Markets – Kredit und Kapital, 56(1), 27–62. https://doi.org/10.3790/ccm.56.1.27

Imam, Sana Hassan | Huhn, Sebastian | Hornuf, Lars | Drechsler, Rolf

Credit and Capital Markets – Kredit und Kapital, Vol. 56 (2023), Iss. 1 : pp. 27–62

Author Details

Sana Hassan Imam, Department of Informatics, University of Bremen.

Sebastian Huhn, University of Bremen/DFKI GmbH, 28359 Bremen, Germany.

Lars Hornuf, Faculty of Business and Economics, Technische Universität Dresden, 01069 Dresden, Germany.

Rolf Drechsler, University of Bremen/DFKI GmbH, 28359 Bremen, Germany.

Abstract

Marketplace lending has fundamentally changed the relationship between borrowers and lenders in financial markets. As with many other financial products that have emerged in recent years, internet-based investors may be inexperienced in marketplace lending, which makes it important to forecast default rates and to evaluate default-related features such as the loan amount, interest rate, and FICO score. Potential borrowers on marketplace lending platforms may already have been rejected by banks as too risky to lend to, which amplifies the problem of asymmetric information. This paper proposes a holistic data processing flow for classifying the loan status in multivariate time series data from marketplace lending. The flow uses a Bidirectional Long Short-Term Memory (BiLSTM) model to predict the loan status “non-default,” “distressed,” or “default,” and this approach outperforms conventional techniques. We adopt SHapley Additive exPlanations (SHAP) and a four-step-ahead model, which allows us to extract the most significant features for default risk assessment. Using our approach, lenders and regulators can identify the most relevant features to improve default risk assessment over time, in addition to predicting risk early.
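To make the feature-attribution idea concrete, the following minimal sketch computes exact Shapley values, the quantity that SHAP approximates, for a toy risk score. It is an illustration under stated assumptions, not the paper's pipeline: the scoring function `toy_risk_score`, its coefficients, the feature names, and the example borrower and baseline values are all hypothetical, and a simple formula stands in for the BiLSTM.

```python
from itertools import combinations
from math import factorial

# Feature names follow the abstract's examples; everything else is hypothetical.
FEATURES = ["loan_amount", "interest_rate", "fico_score"]


def toy_risk_score(x):
    """Hypothetical default-risk score (a stand-in, NOT the paper's BiLSTM)."""
    score = 0.00001 * x["loan_amount"]
    score += 2.0 * x["interest_rate"]
    score -= 0.001 * x["fico_score"]
    # An interaction term, so attributions are not just the linear weights.
    score += 0.00002 * x["loan_amount"] * x["interest_rate"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline)."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take the instance value, the rest the baseline.
        mixed = {f: instance[f] if f in subset else baseline[f] for f in FEATURES}
        return model(mixed)

    phi = {}
    for feature in FEATURES:
        others = [g for g in FEATURES if g != feature]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = set(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        phi[feature] = total
    return phi


# Hypothetical borrower and reference ("average") borrower.
instance = {"loan_amount": 20000, "interest_rate": 0.18, "fico_score": 640}
baseline = {"loan_amount": 10000, "interest_rate": 0.10, "fico_score": 700}

phi = shapley_values(toy_risk_score, instance, baseline)

# Rank features by the magnitude of their contribution to the score change.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the score for the instance and the score for the baseline, which is the property that makes such attributions usable for ranking features. Exact enumeration is only feasible for a handful of features; for a real model one would rely on an approximation such as those in the SHAP library.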