A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality
BACKGROUND: Graph theoretical methods are extensively used in the field of computational chemistry to search datasets of compounds to see if they contain particular molecular sub-structures or patterns. We describe a preliminary application of a graph theoretical method, developed in computational c...
Main authors: | Maheswaran, Ravi; Craigs, Cheryl; Read, Simon; Bath, Peter A; Willett, Peter |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686691/ https://www.ncbi.nlm.nih.gov/pubmed/19439082 http://dx.doi.org/10.1186/1476-072X-8-28 |
_version_ | 1782167458988163072 |
---|---|
author | Maheswaran, Ravi Craigs, Cheryl Read, Simon Bath, Peter A Willett, Peter |
author_facet | Maheswaran, Ravi Craigs, Cheryl Read, Simon Bath, Peter A Willett, Peter |
author_sort | Maheswaran, Ravi |
collection | PubMed |
description | BACKGROUND: Graph theoretical methods are extensively used in the field of computational chemistry to search datasets of compounds to see if they contain particular molecular sub-structures or patterns. We describe a preliminary application of a graph theoretical method, developed in computational chemistry, to geographical epidemiology in relation to testing a prior hypothesis. We tested the methodology on the hypothesis that if a socioeconomically deprived neighbourhood is situated in a wider deprived area, then that neighbourhood would experience greater adverse effects on mortality compared with a similarly deprived neighbourhood which is situated in a wider area with generally less deprivation. METHODS: We used the Trent Region Health Authority area for this study, which contained 10,665 census enumeration districts (CED). Graphs are mathematical representations of objects and their relationships and within the context of this study, nodes represented CEDs and edges were determined by whether or not CEDs were neighbours (shared a common boundary). The overall area in this study was represented by one large graph comprising all CEDs in the region, along with their adjacency information. We used mortality data from 1988–1998, CED level population estimates and the Townsend Material Deprivation Index as an indicator of neighbourhood level deprivation. We defined deprived CEDs as those in the top 20% most deprived in the Region. We then set out to classify these deprived CEDs into seven groups defined by increasing deprivation levels in the neighbouring CEDs. 506 (24.2%) of the deprived CEDs had five adjacent CEDs and we limited pattern development and searching to these CEDs. We developed seven query patterns and used the RASCAL (Rapid Similarity Calculator) program to carry out the search for each of the query patterns. This program used a maximum common subgraph isomorphism method which was modified to handle geographical data. RESULTS: Of the 506 deprived CEDs, 10 were not identified as belonging to any of the seven groups because they were adjacent to a CED with a missing deprivation category quintile, and none fell within query Group 1 (a deprived CED for which all five adjacent CEDs were affluent). Only four CEDs fell within Group 2, which was defined as having four affluent adjacent CEDs and one non-affluent adjacent CED. The numbers of CEDs in Groups 3–7 were 17, 214, 95, 81 and 85 respectively. Age and sex adjusted mortality rate ratios showed a non-significant trend towards increasing mortality risk across Groups (Chi-square = 3.26, df = 1, p = 0.07). CONCLUSION: Graph theoretical methods developed in computational chemistry may be a useful addition to the current GIS based methods available for geographical epidemiology but further developmental work is required. An important requirement will be the development of methods for specifying multiple complex search patterns. Further work is also required to examine the utility of using distance, as opposed to adjacency, to describe edges in graphs, and to examine methods for pattern specification when the nodes have multiple attributes attached to them. |
format | Text |
id | pubmed-2686691 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2686691 2009-05-27 A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality Maheswaran, Ravi Craigs, Cheryl Read, Simon Bath, Peter A Willett, Peter Int J Health Geogr Methodology BioMed Central 2009-05-13 /pmc/articles/PMC2686691/ /pubmed/19439082 http://dx.doi.org/10.1186/1476-072X-8-28 Text en Copyright © 2009 Maheswaran et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Maheswaran, Ravi Craigs, Cheryl Read, Simon Bath, Peter A Willett, Peter A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality |
title | A graph-theory method for pattern identification in geographical epidemiology – a preliminary application to deprivation and mortality |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686691/ https://www.ncbi.nlm.nih.gov/pubmed/19439082 http://dx.doi.org/10.1186/1476-072X-8-28 |
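
The description field above represents census enumeration districts (CEDs) as graph nodes joined by an edge when they share a boundary, and sorts deprived CEDs with five neighbours into groups by how affluent those neighbours are. The sketch below illustrates that idea on toy data only. It is not the RASCAL program and does not perform maximum common subgraph isomorphism; it swaps in a plain count of affluent neighbours. The function names, the quintile coding (1 = most affluent, 5 = most deprived, `None` = missing) and the reading of "affluent" as quintile 1 are assumptions made for illustration. Only Groups 1 and 2 are defined in the abstract, so every other pattern is left in a single catch-all label.

```python
from collections import defaultdict

# Assumed coding, not taken from the record: quintile 5 = most deprived 20%,
# quintile 1 = "affluent", None = missing deprivation quintile.
DEPRIVED, AFFLUENT = 5, 1


def build_adjacency(edges):
    """Build an undirected adjacency map from (ced_a, ced_b) boundary pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def classify(node, adj, quintile, degree=5):
    """Label a deprived node that has exactly `degree` neighbours.

    Returns None when the node is outside the search (not deprived,
    or a different number of neighbours).
    """
    neighbours = adj.get(node, set())
    if quintile.get(node) != DEPRIVED or len(neighbours) != degree:
        return None
    neighbour_q = [quintile.get(n) for n in neighbours]
    if any(q is None for q in neighbour_q):
        return "unclassified"  # a neighbour has a missing quintile
    affluent = sum(1 for q in neighbour_q if q == AFFLUENT)
    if affluent == degree:
        return "group 1"  # all adjacent CEDs affluent
    if affluent == degree - 1:
        return "group 2"  # four affluent, one non-affluent
    return "groups 3-7"  # definitions not given in the abstract


if __name__ == "__main__":
    # Toy star graph: deprived CED "A" with five neighbours "B".."F".
    adj = build_adjacency([("A", n) for n in "BCDEF"])
    quintile = {"A": 5, "B": 1, "C": 1, "D": 1, "E": 1, "F": 3}
    print(classify("A", adj, quintile))
```

A neighbour count suffices here because each query pattern in the abstract is a star around one deprived CED. A subgraph-matching method such as the one the record describes would be needed for patterns that also constrain how the neighbours connect to each other.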