DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data †
One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more poten...
Main Authors: | Putri, Fadhilah Kurnia; Song, Giltae; Kwon, Joonho; Rao, Praveen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5677113/ https://www.ncbi.nlm.nih.gov/pubmed/28946679 http://dx.doi.org/10.3390/s17102201 |
_version_ | 1783277177971671040 |
---|---|
author | Putri, Fadhilah Kurnia Song, Giltae Kwon, Joonho Rao, Praveen |
author_facet | Putri, Fadhilah Kurnia Song, Giltae Kwon, Joonho Rao, Praveen |
author_sort | Putri, Fadhilah Kurnia |
collection | PubMed |
description | One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ) which efficiently identifies profitable areas by exploiting the Apache Software Foundation’s Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Then, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose local Z-Skyline optimization, which reduces the number of dominant tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries from huge amounts of big taxi trip data. |
format | Online Article Text |
id | pubmed-5677113 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-56771132017-11-17 DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † Putri, Fadhilah Kurnia Song, Giltae Kwon, Joonho Rao, Praveen Sensors (Basel) Article One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ) which efficiently identifies profitable areas by exploiting the Apache Software Foundation’s Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Then, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose local Z-Skyline optimization, which reduces the number of dominant tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries from huge amounts of big taxi trip data. MDPI 2017-09-25 /pmc/articles/PMC5677113/ /pubmed/28946679 http://dx.doi.org/10.3390/s17102201 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Putri, Fadhilah Kurnia Song, Giltae Kwon, Joonho Rao, Praveen DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title | DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title_full | DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title_fullStr | DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title_full_unstemmed | DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title_short | DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † |
title_sort | dispaq: distributed profitable-area query from big taxi trip data † |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5677113/ https://www.ncbi.nlm.nih.gov/pubmed/28946679 http://dx.doi.org/10.3390/s17102201 |
work_keys_str_mv | AT putrifadhilahkurnia dispaqdistributedprofitableareaqueryfrombigtaxitripdata AT songgiltae dispaqdistributedprofitableareaqueryfrombigtaxitripdata AT kwonjoonho dispaqdistributedprofitableareaqueryfrombigtaxitripdata AT raopraveen dispaqdistributedprofitableareaqueryfrombigtaxitripdata |
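
The description in this record mentions refining candidate profitable areas with skyline processing ordered by a Z-order space-filling curve. The sketch below illustrates that general idea only; it is not the authors' Z-Skyline algorithm or their local Z-Skyline optimization, and it does not involve Spark or MongoDB. The two attributes per area (for example, a profit score and a demand score, both "higher is better" non-negative integers) and all function names are assumptions made for illustration. It relies on one property of the Z-order (Morton) code: if one point dominates another, its code is strictly larger, so visiting points in descending code order means an accepted skyline point is never displaced later.

```python
from typing import List, Sequence, Tuple

# Hypothetical candidate area: (profit_score, demand_score), both non-negative
# integers where larger is better. These attributes are assumed, not from the paper.
Area = Tuple[int, int]


def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return code


def dominates(a: Area, b: Area) -> bool:
    """a dominates b if a is at least as good in both attributes and differs from b."""
    return a != b and a[0] >= b[0] and a[1] >= b[1]


def skyline_by_z_order(areas: Sequence[Area]) -> List[Area]:
    """Return the non-dominated areas, scanning in descending Z-order.

    A dominating point always has a larger Z-order code than the point it
    dominates, so each area only needs to be tested against the skyline
    collected so far.
    """
    skyline: List[Area] = []
    ordered = sorted(set(areas), key=lambda p: z_order(p[0], p[1]), reverse=True)
    for area in ordered:
        if not any(dominates(kept, area) for kept in skyline):
            skyline.append(area)
    return skyline


if __name__ == "__main__":
    candidates = [(1, 9), (4, 6), (3, 3), (7, 2), (2, 5), (7, 1)]
    print(skyline_by_z_order(candidates))
```

In this simplified form the Z-order code serves only as a sort key; a full Z-Skyline implementation would, as far as the description suggests, also use the curve to prune whole regions and to distribute work across cluster nodes.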