A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality

BACKGROUND: Graph theoretical methods are extensively used in the field of computational chemistry to search datasets of compounds to see if they contain particular molecular sub-structures or patterns. We describe a preliminary application of a graph theoretical method, developed in computational c...

Bibliographic Details
Main Authors: Maheswaran, Ravi, Craigs, Cheryl, Read, Simon, Bath, Peter A, Willett, Peter
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686691/
https://www.ncbi.nlm.nih.gov/pubmed/19439082
http://dx.doi.org/10.1186/1476-072X-8-28
_version_ 1782167458988163072
author Maheswaran, Ravi
Craigs, Cheryl
Read, Simon
Bath, Peter A
Willett, Peter
author_sort Maheswaran, Ravi
collection PubMed
description BACKGROUND: Graph-theoretical methods are extensively used in computational chemistry to search datasets of compounds for particular molecular sub-structures or patterns. We describe a preliminary application of a graph-theoretical method, developed in computational chemistry, to geographical epidemiology in relation to testing a prior hypothesis. We tested the methodology on the hypothesis that a socioeconomically deprived neighbourhood situated in a wider deprived area would experience greater adverse effects on mortality than a similarly deprived neighbourhood situated in a wider area with generally less deprivation.
METHODS: We used the Trent Region Health Authority area for this study, which contained 10,665 census enumeration districts (CEDs). Graphs are mathematical representations of objects and their relationships; in this study, nodes represented CEDs and edges connected CEDs that were neighbours (shared a common boundary). The overall area was represented by one large graph comprising all CEDs in the region, along with their adjacency information. We used mortality data from 1988–1998, CED-level population estimates and the Townsend Material Deprivation Index as an indicator of neighbourhood-level deprivation. We defined deprived CEDs as those in the top 20% most deprived in the Region. We then set out to classify these deprived CEDs into seven groups defined by increasing deprivation levels in the neighbouring CEDs. Of the deprived CEDs, 506 (24.2%) had five adjacent CEDs, and we limited pattern development and searching to these. We developed seven query patterns and used the RASCAL (Rapid Similarity Calculator) program to search for each of them. This program used a maximum common subgraph isomorphism method which was modified to handle geographical data.
RESULTS: Of the 506 deprived CEDs, 10 were not identified as belonging to any of the seven groups because they were adjacent to a CED with a missing deprivation category quintile, and none fell within query Group 1 (a deprived CED for which all five adjacent CEDs were affluent). Only four CEDs fell within Group 2, which was defined as having four affluent adjacent CEDs and one non-affluent adjacent CED. The numbers of CEDs in Groups 3–7 were 17, 214, 95, 81 and 85 respectively. Age- and sex-adjusted mortality rate ratios showed a non-significant trend towards increasing mortality risk across Groups (Chi-square = 3.26, df = 1, p = 0.07).
CONCLUSION: Graph-theoretical methods developed in computational chemistry may be a useful addition to the current GIS-based methods available for geographical epidemiology, but further developmental work is required. An important requirement will be the development of methods for specifying multiple complex search patterns. Further work is also required to examine the utility of using distance, as opposed to adjacency, to describe edges in graphs, and to examine methods for pattern specification when the nodes have multiple attributes attached to them.
format Text
id pubmed-2686691
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2686691 2009-05-27 Int J Health Geogr Methodology BioMed Central 2009-05-13 /pmc/articles/PMC2686691/ /pubmed/19439082 http://dx.doi.org/10.1186/1476-072X-8-28 Text en Copyright © 2009 Maheswaran et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686691/
https://www.ncbi.nlm.nih.gov/pubmed/19439082
http://dx.doi.org/10.1186/1476-072X-8-28