Implementation and validation of a probabilistic linkage method for population databases without identification variables

Bibliographic Details
Main authors: Quezada-Sánchez, Amado D., Espín-Arellano, Iván, Morales-Carmona, Evangelina, Molina-Vélez, Diana, Palacio-Mejía, Lina Sofía, González-González, Edgar Leonel, Alvarez Aceves, Mariana, Hernández-Ávila, Juan Eugenio
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793263/
https://www.ncbi.nlm.nih.gov/pubmed/36582715
http://dx.doi.org/10.1016/j.heliyon.2022.e12311
Description
Summary: Linking records of the same person from different sources makes it possible to build administrative cohorts and perform longitudinal analyses as an alternative to traditional cohort studies, and has important practical implications for producing knowledge in public health. We applied the Fellegi-Sunter probabilistic linkage method to a sample of records from the Mexican Automated System for Hospital Discharges and the Statistical and Epidemiological System for Deaths and evaluated its performance. The records in each source were randomly divided into a training sample (25%) and a validation sample (75%). We evaluated different types of blocking in terms of complexity reduction and pairs completeness, and record linkage in terms of sensitivity and positive predictive value. In the validation sample, a blocking scheme based on trigrams of the full name achieved 95.76% pairs completeness and 99.9996% complexity reduction. After pairs classification, we achieved a sensitivity of 90.72% and a positive predictive value of 97.10% in the validation sample. Both values were about one percentage point higher than those obtained with automatic classification without clerical review of potential pairs. We concluded that the linkage algorithm performed well in terms of sensitivity and positive predictive value and can be used to build administrative cohorts for the epidemiological analysis of populations with records in health information systems.
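
The four evaluation measures named in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `min_shared` threshold, and the toy records are all hypothetical, and the blocking rule (keep a pair when the two full names share at least a given number of character trigrams) is only one plausible reading of "blocking based on trigrams of the full name". For clarity the sketch compares every pair, which a real blocking step would avoid by using an index.

```python
from itertools import product


def trigrams(name):
    """Return the set of character trigrams of a name (lowercased, spaces removed)."""
    s = "".join(name.lower().split())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def candidate_pairs(source_a, source_b, min_shared=2):
    """Keep record pairs whose names share at least `min_shared` trigrams.

    Assumption: `min_shared` is a hypothetical threshold, not a value from the article.
    """
    pairs = set()
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        if len(trigrams(name_a) & trigrams(name_b)) >= min_shared:
            pairs.add((id_a, id_b))
    return pairs


def pairs_completeness(candidates, true_matches):
    """Share of true matches that survive blocking."""
    return len(candidates & true_matches) / len(true_matches)


def complexity_reduction(candidates, n_a, n_b):
    """Share of the n_a * n_b possible comparisons that blocking removes."""
    return 1 - len(candidates) / (n_a * n_b)


def sensitivity(predicted, true_matches):
    """Share of true matches that were classified as links."""
    return len(predicted & true_matches) / len(true_matches)


def positive_predictive_value(predicted, true_matches):
    """Share of classified links that are true matches."""
    return len(predicted & true_matches) / len(predicted)


# Toy data, invented for illustration only.
hospital = {"h1": "Maria Lopez Garcia", "h2": "Juan Perez Soto", "h3": "Ana Ruiz Mora"}
deaths = {"d1": "Maria Lopes Garcia", "d2": "Juan Peres Soto", "d3": "Luis Vega Diaz"}
true_matches = {("h1", "d1"), ("h2", "d2")}

candidates = candidate_pairs(hospital, deaths)
pc = pairs_completeness(candidates, true_matches)
cr = complexity_reduction(candidates, len(hospital), len(deaths))

# A hypothetical classifier output: one correct link and one false link.
predicted = {("h1", "d1"), ("h3", "d3")}
sens = sensitivity(predicted, true_matches)
ppv = positive_predictive_value(predicted, true_matches)
```

In this toy case blocking keeps only the two true pairs out of nine possible comparisons, so pairs completeness is 1 and complexity reduction is 7/9. The hypothetical classifier output finds one of the two true matches and adds one false link, so sensitivity and positive predictive value are both 0.5.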