A Novel Default Risk Prediction and Feature Importance Analysis Technique for Marketplace Lending using Machine Learning
Imam, Sana Hassan | Huhn, Sebastian | Hornuf, Lars | Drechsler, Rolf
Credit and Capital Markets – Kredit und Kapital, Vol. 56 (2023), Iss. 1 : pp. 27–62
Author Details
Sana Hassan Imam, Department of Informatics, University of Bremen.
Sebastian Huhn, University of Bremen/DFKI GmbH, 28359 Bremen, Germany.
Lars Hornuf, Faculty of Business and Economics, Technische Universität Dresden, 01069 Dresden, Germany.
Rolf Drechsler, University of Bremen/DFKI GmbH, 28359 Bremen, Germany.
Abstract
Marketplace lending has fundamentally changed the relationship between borrowers and lenders in financial markets. As with many other financial products that have emerged in recent years, internet-based investors may be inexperienced in marketplace lending, highlighting the importance of forecasting default rates and evaluating default features such as the loan amount, interest rates, and FICO score. Potential borrowers on marketplace lending platforms may already have been rejected by banks as too risky to lend to, which amplifies the problem of asymmetric information. This paper proposes a holistic data processing flow for classifying loan status in multivariate time series data from marketplace lending. The flow uses a Bidirectional Long Short-Term Memory (BiLSTM) model to predict “non-default,” “distressed,” and “default” loan status and outperforms conventional techniques. We adopt SHapley Additive exPlanations (SHAP) and a four-step ahead model, allowing us to extract the most significant features for default risk assessment. Using our approach, lenders and regulators can identify the most relevant features to enhance the default risk assessment method over time, in addition to predicting risk early.
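To illustrate the idea behind the Shapley-based feature attribution mentioned in the abstract, the following minimal sketch computes exact Shapley values for a toy risk score. It is not the paper's method: the linear model, its weights, the standardized input values, and the zero baseline are all assumptions made for illustration, and the SHAP library used with a BiLSTM would approximate these values rather than enumerate every feature coalition as done here.

```python
from itertools import combinations
from math import factorial

# Feature names follow the examples in the abstract; the values are hypothetical.
FEATURES = ["loan_amount", "interest_rate", "fico_score"]


def toy_risk_score(x):
    """Illustrative linear risk score. The weights are assumptions, not from the paper."""
    return 0.5 * x["loan_amount"] + 2.0 * x["interest_rate"] - 1.0 * x["fico_score"]


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every coalition of the other features.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(FEATURES)
    values = {}
    for feature in FEATURES:
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: instance[f] if f in coalition else baseline[f]
                    for f in FEATURES
                }
                with_feature = dict(without)
                with_feature[feature] = instance[feature]
                # Marginal contribution of the feature to this coalition.
                total += weight * (model(with_feature) - model(without))
        values[feature] = total
    return values


# Hypothetical standardized loan record and an all-zero reference point.
instance = {"loan_amount": 1.0, "interest_rate": 2.0, "fico_score": -1.0}
baseline = {"loan_amount": 0.0, "interest_rate": 0.0, "fico_score": 0.0}

phi = shapley_values(toy_risk_score, instance, baseline)
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
```

In this toy setting the attributions sum to the difference between the score of the instance and the score of the baseline, which is the additivity property that makes Shapley values usable for ranking features by importance.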