Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science

We develop a number of data-driven investment strategies that demonstrate how machine learning and data analytics can be used to guide investments in peer-to-peer loans. We detail the process starting with the acquisition of (real) data from a peer-to-peer lending platform all the way to the develop...

Bibliographic Details
Main authors: Cohen, Maxime C., Guetta, C. Daniel, Jiao, Kevin, Provost, Foster
Format: Online Article Text
Language: English
Published: Mary Ann Liebert, Inc., publishers 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6154448/
https://www.ncbi.nlm.nih.gov/pubmed/30283728
http://dx.doi.org/10.1089/big.2018.0092
_version_ 1783357692517023744
author Cohen, Maxime C.
Guetta, C. Daniel
Jiao, Kevin
Provost, Foster
author_facet Cohen, Maxime C.
Guetta, C. Daniel
Jiao, Kevin
Provost, Foster
author_sort Cohen, Maxime C.
collection PubMed
description We develop a number of data-driven investment strategies that demonstrate how machine learning and data analytics can be used to guide investments in peer-to-peer loans. We detail the process starting with the acquisition of (real) data from a peer-to-peer lending platform all the way to the development and evaluation of investment strategies based on a variety of approaches. We focus heavily on how to apply and evaluate the data science methods, and resulting strategies, in a real-world business setting. The material presented in this article can be used by instructors who teach data science courses, at the undergraduate or graduate levels. Importantly, we go beyond just evaluating predictive performance of models, to assess how well the strategies would actually perform, using real, publicly available data. Our treatment is comprehensive and ranges from qualitative to technical, but is also modular—which gives instructors the flexibility to focus on specific parts of the case, depending on the topics they want to cover. The learning concepts include the following: data cleaning and ingestion, classification/probability estimation modeling, regression modeling, analytical engineering, calibration curves, data leakage, evaluation of model performance, basic portfolio optimization, evaluation of investment strategies, and using Python for data science.
format Online
Article
Text
id pubmed-6154448
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Mary Ann Liebert, Inc., publishers
record_format MEDLINE/PubMed
spelling pubmed-61544482018-10-03 Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science Cohen, Maxime C. Guetta, C. Daniel Jiao, Kevin Provost, Foster Big Data Original Articles We develop a number of data-driven investment strategies that demonstrate how machine learning and data analytics can be used to guide investments in peer-to-peer loans. We detail the process starting with the acquisition of (real) data from a peer-to-peer lending platform all the way to the development and evaluation of investment strategies based on a variety of approaches. We focus heavily on how to apply and evaluate the data science methods, and resulting strategies, in a real-world business setting. The material presented in this article can be used by instructors who teach data science courses, at the undergraduate or graduate levels. Importantly, we go beyond just evaluating predictive performance of models, to assess how well the strategies would actually perform, using real, publicly available data. Our treatment is comprehensive and ranges from qualitative to technical, but is also modular—which gives instructors the flexibility to focus on specific parts of the case, depending on the topics they want to cover. The learning concepts include the following: data cleaning and ingestion, classification/probability estimation modeling, regression modeling, analytical engineering, calibration curves, data leakage, evaluation of model performance, basic portfolio optimization, evaluation of investment strategies, and using Python for data science. Mary Ann Liebert, Inc., publishers 2018-09-01 2018-09-17 /pmc/articles/PMC6154448/ /pubmed/30283728 http://dx.doi.org/10.1089/big.2018.0092 Text en © Maxime C. Cohen et al., 2018; Published by Mary Ann Liebert, Inc. 
This Open Access article is distributed under the terms of the Creative Commons Attribution Noncommercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and the source are cited.
spellingShingle Original Articles
Cohen, Maxime C.
Guetta, C. Daniel
Jiao, Kevin
Provost, Foster
Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title_full Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title_fullStr Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title_full_unstemmed Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title_short Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science
title_sort data-driven investment strategies for peer-to-peer lending: a case study for teaching data science
topic Original Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6154448/
https://www.ncbi.nlm.nih.gov/pubmed/30283728
http://dx.doi.org/10.1089/big.2018.0092
work_keys_str_mv AT cohenmaximec datadriveninvestmentstrategiesforpeertopeerlendingacasestudyforteachingdatascience
AT guettacdaniel datadriveninvestmentstrategiesforpeertopeerlendingacasestudyforteachingdatascience
AT jiaokevin datadriveninvestmentstrategiesforpeertopeerlendingacasestudyforteachingdatascience
AT provostfoster datadriveninvestmentstrategiesforpeertopeerlendingacasestudyforteachingdatascience