Residential scene classification for gridded population sampling in developing countries using deep convolutional neural networks on satellite imagery

BACKGROUND: Conducting surveys in low- and middle-income countries is often challenging because many areas lack a complete sampling frame, have outdated census information, or have limited data available for designing and selecting a representative sample. Geosampling is a probability-based, gridded...

Bibliographic Details
Main Authors:	Chew, Robert F., Amer, Safaa, Jones, Kasey, Unangst, Jennifer, Cajka, James, Allpress, Justine, Bruhn, Mark
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5944062/ https://www.ncbi.nlm.nih.gov/pubmed/29743081 http://dx.doi.org/10.1186/s12942-018-0132-1

author	Chew, Robert F. Amer, Safaa Jones, Kasey Unangst, Jennifer Cajka, James Allpress, Justine Bruhn, Mark
collection	PubMed
description	BACKGROUND: Conducting surveys in low- and middle-income countries is often challenging because many areas lack a complete sampling frame, have outdated census information, or have limited data available for designing and selecting a representative sample. Geosampling is a probability-based, gridded population sampling method that addresses some of these issues by using geographic information system (GIS) tools to create logistically manageable area units for sampling. GIS grid cells are overlaid to partition a country’s existing administrative boundaries into area units that vary in size from 50 m × 50 m to 150 m × 150 m. To avoid sending interviewers to unoccupied areas, researchers manually classify grid cells as “residential” or “nonresidential” through visual inspection of aerial images. “Nonresidential” units are then excluded from sampling and data collection. This process of manually classifying sampling units has drawbacks since it is labor intensive, prone to human error, and creates the need for simplifying assumptions during calculation of design-based sampling weights. In this paper, we discuss the development of a deep learning classification model to predict whether aerial images are residential or nonresidential, thus reducing manual labor and eliminating the need for simplifying assumptions. RESULTS: On our test sets, the model performs comparable to a human-level baseline in both Nigeria (94.5% accuracy) and Guatemala (96.4% accuracy), and outperforms baseline machine learning models trained on crowdsourced or remote-sensed geospatial features. Additionally, our findings suggest that this approach can work well in new areas with relatively modest amounts of training data. CONCLUSIONS: Gridded population sampling methods like geosampling are becoming increasingly popular in countries with outdated or inaccurate census data because of their timeliness, flexibility, and cost. Using deep learning models directly on satellite images, we provide a novel method for sample frame construction that identifies residential gridded aerial units. In cases where manual classification of satellite images is used to (1) correct for errors in gridded population data sets or (2) classify grids where population estimates are unavailable, this methodology can help reduce annotation burden with comparable quality to human analysts.
format	Online Article Text
id	pubmed-5944062
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5944062 2018-05-14 Int J Health Geogr Methodology BioMed Central 2018-05-09 /pmc/articles/PMC5944062/ /pubmed/29743081 http://dx.doi.org/10.1186/s12942-018-0132-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Residential scene classification for gridded population sampling in developing countries using deep convolutional neural networks on satellite imagery
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5944062/ https://www.ncbi.nlm.nih.gov/pubmed/29743081 http://dx.doi.org/10.1186/s12942-018-0132-1

Similar Items