Improving Website Hyperlink Structure Using Server Logs
Good websites should be easy to navigate via hyperlinks, yet maintaining a high-quality link structure is difficult. Identifying pairs of pages that should be linked may be hard for human editors, especially if the site is large and changes frequently. Further, given a set of useful link candidates, the task of incorporating them into the site can be expensive, since it typically involves humans editing pages. …
Main Authors: | Paranjape, Ashwin; West, Robert; Zia, Leila; Leskovec, Jure |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2016 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5365094/ https://www.ncbi.nlm.nih.gov/pubmed/28345077 http://dx.doi.org/10.1145/2835776.2835832 |
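The abstract describes choosing which new links to add under a budget constraint, ranking candidate links by their predicted usefulness. As a purely hypothetical illustration (not the authors' algorithm), the selection step can be sketched as a greedy pick of the highest-value candidates, where the per-link values would come from a usefulness model fit on server logs:

```python
# Hypothetical sketch: greedy link placement under a global budget.
# `candidates` maps a (source_page, target_page) pair to an estimated
# click value; the estimation itself is assumed to exist elsewhere.

def place_links(candidates, budget):
    """Return up to `budget` candidate links with the highest estimated value."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [link for link, value in ranked[:budget]]

# Toy estimated values (invented for illustration only).
est = {("A", "B"): 120.0, ("A", "C"): 45.5, ("B", "D"): 300.2}
print(place_links(est, 2))  # → [('B', 'D'), ('A', 'B')]
```

The paper's actual method additionally models the future clickthrough of nonexistent links from implicit log signals; this sketch only shows the final ranked selection under a budget.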
_version_ | 1782517453174079488 |
---|---|
author | Paranjape, Ashwin West, Robert Zia, Leila Leskovec, Jure |
author_facet | Paranjape, Ashwin West, Robert Zia, Leila Leskovec, Jure |
author_sort | Paranjape, Ashwin |
collection | PubMed |
description | Good websites should be easy to navigate via hyperlinks, yet maintaining a high-quality link structure is difficult. Identifying pairs of pages that should be linked may be hard for human editors, especially if the site is large and changes frequently. Further, given a set of useful link candidates, the task of incorporating them into the site can be expensive, since it typically involves humans editing pages. In the light of these challenges, it is desirable to develop data-driven methods for automating the link placement task. Here we develop an approach for automatically finding useful hyperlinks to add to a website. We show that passively collected server logs, beyond telling us which existing links are useful, also contain implicit signals indicating which nonexistent links would be useful if they were to be introduced. We leverage these signals to model the future usefulness of yet nonexistent links. Based on our model, we define the problem of link placement under budget constraints and propose an efficient algorithm for solving it. We demonstrate the effectiveness of our approach by evaluating it on Wikipedia, a large website for which we have access to both server logs (used for finding useful new links) and the complete revision history (containing a ground truth of new links). As our method is based exclusively on standard server logs, it may also be applied to any other website, as we show with the example of the biomedical research site Simtk. |
format | Online Article Text |
id | pubmed-5365094 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
record_format | MEDLINE/PubMed |
spelling | pubmed-5365094 2017-03-24 Improving Website Hyperlink Structure Using Server Logs Paranjape, Ashwin West, Robert Zia, Leila Leskovec, Jure Proc Int Conf Web Search Data Min Article Good websites should be easy to navigate via hyperlinks, yet maintaining a high-quality link structure is difficult. Identifying pairs of pages that should be linked may be hard for human editors, especially if the site is large and changes frequently. Further, given a set of useful link candidates, the task of incorporating them into the site can be expensive, since it typically involves humans editing pages. In the light of these challenges, it is desirable to develop data-driven methods for automating the link placement task. Here we develop an approach for automatically finding useful hyperlinks to add to a website. We show that passively collected server logs, beyond telling us which existing links are useful, also contain implicit signals indicating which nonexistent links would be useful if they were to be introduced. We leverage these signals to model the future usefulness of yet nonexistent links. Based on our model, we define the problem of link placement under budget constraints and propose an efficient algorithm for solving it. We demonstrate the effectiveness of our approach by evaluating it on Wikipedia, a large website for which we have access to both server logs (used for finding useful new links) and the complete revision history (containing a ground truth of new links). As our method is based exclusively on standard server logs, it may also be applied to any other website, as we show with the example of the biomedical research site Simtk. 2016-02 /pmc/articles/PMC5365094/ /pubmed/28345077 http://dx.doi.org/10.1145/2835776.2835832 Text en http://creativecommons.org/licenses/by-sa/4.0/ This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. |
spellingShingle | Article Paranjape, Ashwin West, Robert Zia, Leila Leskovec, Jure Improving Website Hyperlink Structure Using Server Logs |
title | Improving Website Hyperlink Structure Using Server Logs |
title_full | Improving Website Hyperlink Structure Using Server Logs |
title_fullStr | Improving Website Hyperlink Structure Using Server Logs |
title_full_unstemmed | Improving Website Hyperlink Structure Using Server Logs |
title_short | Improving Website Hyperlink Structure Using Server Logs |
title_sort | improving website hyperlink structure using server logs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5365094/ https://www.ncbi.nlm.nih.gov/pubmed/28345077 http://dx.doi.org/10.1145/2835776.2835832 |
work_keys_str_mv | AT paranjapeashwin improvingwebsitehyperlinkstructureusingserverlogs AT westrobert improvingwebsitehyperlinkstructureusingserverlogs AT zialeila improvingwebsitehyperlinkstructureusingserverlogs AT leskovecjure improvingwebsitehyperlinkstructureusingserverlogs |