Predicting and understanding law-making with word vectors and an ensemble model

Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses,...

Bibliographic Details
Main author:	Nay, John J.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425031/ https://www.ncbi.nlm.nih.gov/pubmed/28489868 http://dx.doi.org/10.1371/journal.pone.0176999

_version_	1783235238902628352
author	Nay, John J.
author_facet	Nay, John J.
author_sort	Nay, John J.
collection	PubMed
description	Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill’s sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction context-only predictions outperform text-only, and with the newest data text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment.
format	Online Article Text
id	pubmed-5425031
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-5425031 2017-05-15 Predicting and understanding law-making with word vectors and an ensemble model Nay, John J. PLoS One Research Article Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill’s sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction context-only predictions outperform text-only, and with the newest data text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment. Public Library of Science 2017-05-10 /pmc/articles/PMC5425031/ /pubmed/28489868 http://dx.doi.org/10.1371/journal.pone.0176999 Text en © 2017 John J. Nay http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Nay, John J. Predicting and understanding law-making with word vectors and an ensemble model
title	Predicting and understanding law-making with word vectors and an ensemble model
title_full	Predicting and understanding law-making with word vectors and an ensemble model
title_fullStr	Predicting and understanding law-making with word vectors and an ensemble model
title_full_unstemmed	Predicting and understanding law-making with word vectors and an ensemble model
title_short	Predicting and understanding law-making with word vectors and an ensemble model
title_sort	predicting and understanding law-making with word vectors and an ensemble model
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425031/ https://www.ncbi.nlm.nih.gov/pubmed/28489868 http://dx.doi.org/10.1371/journal.pone.0176999
work_keys_str_mv	AT nayjohnj predictingandunderstandinglawmakingwithwordvectorsandanensemblemodel
