Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm
This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for th...
Main authors: | Avalos, Marta; Pouyes, Hélène; Grandvalet, Yves; Orriols, Ludivine; Lagarde, Emmanuel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4416185/ https://www.ncbi.nlm.nih.gov/pubmed/25916593 http://dx.doi.org/10.1186/1471-2105-16-S6-S1 |
_version_ | 1782369193089302528 |
---|---|
author | Avalos, Marta Pouyes, Hélène Grandvalet, Yves Orriols, Ludivine Lagarde, Emmanuel |
author_sort | Avalos, Marta |
collection | PubMed |
description | This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for the adaptation of the Lasso and related methods to the conditional logistic regression model. Our proposal relies on the simplification of the calculations involved in the likelihood function. Then, the proposed algorithm iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety. |
format | Online Article Text |
id | pubmed-4416185 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4416185 2015-05-07 Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm Avalos, Marta Pouyes, Hélène Grandvalet, Yves Orriols, Ludivine Lagarde, Emmanuel BMC Bioinformatics Research This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for the adaptation of the Lasso and related methods to the conditional logistic regression model. Our proposal relies on the simplification of the calculations involved in the likelihood function. Then, the proposed algorithm iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety. BioMed Central 2015-04-17 /pmc/articles/PMC4416185/ /pubmed/25916593 http://dx.doi.org/10.1186/1471-2105-16-S6-S1 Text en Copyright © 2015 Avalos et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4416185/ https://www.ncbi.nlm.nih.gov/pubmed/25916593 http://dx.doi.org/10.1186/1471-2105-16-S6-S1 |
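
The description in this record mentions iteratively solving reweighted Lasso problems by cyclical coordinate descent along a regularization path. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes 1:1 matching, in which case the conditional likelihood of a pair depends only on the case-minus-control covariate difference and takes the form of a logistic model without intercept and with a constant response. The function names (`fit_pairs_lasso`, `lasso_path`), the toy data, and all tuning constants are hypothetical and chosen for illustration only.

```python
import math


def soft_threshold(value, threshold):
    """Soft-thresholding operator used in Lasso coordinate updates."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_pairs_lasso(diffs, lam, beta=None, outer_iters=25, inner_iters=50, tol=1e-8):
    """Sketch: L1-penalized conditional logistic fit for 1:1 matched pairs.

    diffs[i][j] is x_case - x_control for pair i and predictor j. Minimizes
    (1/n) * sum_i log(1 + exp(-diffs[i] . beta)) + lam * ||beta||_1.
    """
    n, p = len(diffs), len(diffs[0])
    beta = list(beta) if beta is not None else [0.0] * p
    for _ in range(outer_iters):
        # Reweighting step: quadratic approximation around the current beta.
        eta = [sum(d[j] * beta[j] for j in range(p)) for d in diffs]
        prob = [sigmoid(e) for e in eta]
        w = [max(q * (1.0 - q), 1e-5) for q in prob]
        # Working response minus linear predictor (the response is always 1).
        resid = [(1.0 - prob[i]) / w[i] for i in range(n)]
        previous = list(beta)
        for _ in range(inner_iters):
            # Cyclical coordinate descent on the weighted Lasso problem.
            max_change = 0.0
            for j in range(p):
                denom = sum(w[i] * diffs[i][j] ** 2 for i in range(n)) / n
                if denom == 0.0:
                    continue  # predictor never differs within a pair
                grad = sum(w[i] * diffs[i][j] * resid[i] for i in range(n)) / n
                new_bj = soft_threshold(grad + denom * beta[j], lam) / denom
                delta = new_bj - beta[j]
                if delta != 0.0:
                    for i in range(n):
                        resid[i] -= diffs[i][j] * delta
                    beta[j] = new_bj
                    max_change = max(max_change, abs(delta))
            if max_change < tol:
                break
        if max(abs(a - b) for a, b in zip(beta, previous)) < tol:
            break
    return beta


def lasso_path(diffs, n_lambdas=5, ratio=0.05):
    """Fit along a decreasing grid of penalties (n_lambdas >= 2), with warm starts."""
    n, p = len(diffs), len(diffs[0])
    # Smallest penalty at which every coefficient is zero.
    lam_max = max(abs(sum(d[j] for d in diffs)) for j in range(p)) * 0.5 / n
    path, beta = [], [0.0] * p
    for k in range(n_lambdas):
        lam = lam_max * ratio ** (k / (n_lambdas - 1))
        beta = fit_pairs_lasso(diffs, lam, beta)
        path.append((lam, list(beta)))
    return path


# Toy data (invented): 8 matched pairs, 3 binary exposures, so each
# difference is -1, 0, or 1; the second exposure never differs within a pair.
diffs = [[1, 0, 0], [1, 0, 1], [1, 0, -1], [0, 0, 1],
         [-1, 0, -1], [1, 0, 0], [0, 0, -1], [1, 0, 0]]
path = lasso_path(diffs)
for lam, coefs in path:
    print(round(lam, 4), [round(c, 3) for c in coefs])
```

Because binary exposures produce many zero differences, most terms in the coordinate sums vanish; a sparse storage format could exploit this, although the sketch above uses dense lists for readability.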