P2P Lending Default Prediction Based on AI and Statistical Models

Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not tightly governed by relevant laws yet, as their development speed has far exceeded that of regulations. Therefore, P2P lending operations are still...

Bibliographic Details
Main Authors: Ko, Po-Chang; Lin, Ping-Chen; Do, Hoang-Thu; Huang, You-Fu
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9222552/
https://www.ncbi.nlm.nih.gov/pubmed/35741522
http://dx.doi.org/10.3390/e24060801
collection PubMed
description Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not tightly governed by relevant laws yet, as their development speed has far exceeded that of regulations. Therefore, P2P lending operations are still subject to risks. This paper proposes prediction models to mitigate the risks of default and asymmetric information on P2P lending platforms. Specifically, we designed sophisticated procedures to pre-process mass data extracted from Lending Club in 2018 Q3–2019 Q2. After that, three statistical models, namely, Logistic Regression, Bayesian Classifier, and Linear Discriminant Analysis (LDA), and five AI models, namely, Decision Tree, Random Forest, LightGBM, Artificial Neural Network (ANN), and Convolutional Neural Network (CNN), were utilized for data analysis. The loan statuses of Lending Club’s customers were rationally classified. To evaluate the models, we adopted the confusion matrix series of metrics, AUC-ROC curve, Kolmogorov–Smirnov chart (KS), and Student’s t-test. Empirical studies show that LightGBM produces the best performance and is 2.91% more accurate than the other models, resulting in a revenue improvement of nearly USD 24 million for Lending Club. Student’s t-test proves that the differences between models are statistically significant.
id pubmed-9222552
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9222552 2022-06-24 Entropy (Basel) Article MDPI 2022-06-08 /pmc/articles/PMC9222552/ /pubmed/35741522 http://dx.doi.org/10.3390/e24060801 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
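
The description above names the AUC-ROC curve and the Kolmogorov–Smirnov (KS) chart among the evaluation metrics. As a rough illustration of what those two numbers measure, the sketch below computes both for a binary default classifier in plain Python. The function names (`ks_statistic`, `auc_roc`) and the toy labels and scores are assumptions made for this example; they are not taken from the article or from Lending Club data, and the article's own computation may differ.

```python
def ks_statistic(y_true, scores):
    """KS statistic: largest gap between the cumulative score
    distributions of the positive (1) and negative (0) classes."""
    pairs = sorted(zip(scores, y_true))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all observations that share the same score together,
        # so ties do not create an artificial gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def auc_roc(y_true, scores):
    """AUC: probability that a randomly chosen positive scores higher
    than a randomly chosen negative (ties count as one half)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = default, 0 = fully paid.
labels = [0, 0, 1, 0, 1, 1]
predicted = [0.1, 0.2, 0.35, 0.4, 0.8, 0.9]

print("KS :", round(ks_statistic(labels, predicted), 3))
print("AUC:", round(auc_roc(labels, predicted), 3))
```

The pairwise AUC loop is quadratic in the number of observations, which is fine for a toy example; for data at the scale described in the abstract, a library routine would be the more practical choice.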