DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data †

One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more poten...

Bibliographic Details
Main authors: Putri, Fadhilah Kurnia; Song, Giltae; Kwon, Joonho; Rao, Praveen
Format: Online Article Text
Language: English
Published: MDPI 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5677113/
https://www.ncbi.nlm.nih.gov/pubmed/28946679
http://dx.doi.org/10.3390/s17102201
_version_ 1783277177971671040
author Putri, Fadhilah Kurnia
Song, Giltae
Kwon, Joonho
Rao, Praveen
author_facet Putri, Fadhilah Kurnia
Song, Giltae
Kwon, Joonho
Rao, Praveen
author_sort Putri, Fadhilah Kurnia
collection PubMed
description One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ), which efficiently identifies profitable areas by exploiting the Apache Software Foundation’s Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Next, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space-filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose local Z-Skyline optimization, which reduces the number of dominance tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries over large volumes of taxi trip data.
format Online
Article
Text
id pubmed-5677113
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-5677113 2017-11-17 DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data † Putri, Fadhilah Kurnia Song, Giltae Kwon, Joonho Rao, Praveen Sensors (Basel) Article MDPI 2017-09-25 /pmc/articles/PMC5677113/ /pubmed/28946679 http://dx.doi.org/10.3390/s17102201 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Putri, Fadhilah Kurnia
Song, Giltae
Kwon, Joonho
Rao, Praveen
DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data †
title DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5677113/
https://www.ncbi.nlm.nih.gov/pubmed/28946679
http://dx.doi.org/10.3390/s17102201
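
The description above mentions skyline processing combined with a Z-order space-filling curve. The following is a minimal sketch of that general idea only, not the DISPAQ Z-Skyline algorithm or its distributed optimization. It assumes each candidate area is reduced to two hypothetical non-negative integer attributes that are both to be minimized; the names z_address, dominates, and z_skyline are illustrative and do not come from the paper. The sketch relies on one property of Z-order: if one point dominates another, its Z-address is smaller, so after sorting by Z-address a point only needs to be tested against the skyline found so far.

```python
from typing import List, Tuple

# Hypothetical candidate area: two non-negative integer attributes,
# both assumed to be "smaller is better" for this illustration.
Point = Tuple[int, int]


def z_address(p: Point, bits: int = 16) -> int:
    """Interleave the bits of both coordinates (Morton / Z-order code)."""
    x, y = p
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)  # x bits go to odd positions
        z |= ((y >> i) & 1) << (2 * i)      # y bits go to even positions
    return z


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse everywhere and differs from q."""
    return all(a <= b for a, b in zip(p, q)) and p != q


def z_skyline(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, visited in ascending Z-order.

    Any point that dominates p has a smaller Z-address than p, so p
    can only be dominated by something already in `skyline`.
    """
    skyline: List[Point] = []
    for p in sorted(set(points), key=z_address):
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline


if __name__ == "__main__":
    candidates = [(1, 9), (2, 3), (3, 2), (4, 4), (9, 1), (5, 5), (2, 3)]
    print(z_skyline(candidates))
```

In this sketch the sort replaces the pairwise comparison of every point against every other point: each candidate is compared only against skyline members found earlier. A real system would map measured attributes (and any attribute to be maximized) onto such integer coordinates first, which is outside the scope of this illustration.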