But where might a state poll go wrong? The most obvious answer is with independent voters, that is, voters who have not yet made up their minds. For example, Rasmussen's latest Colorado poll has 5% of respondents undecided, 6% supporting a third-party candidate, and equal percentages for the two major-party candidates [link]. State-level polls are reliable within a month of an election: by then, vice presidential candidates have been chosen, undecided voters are becoming less and less common, primary struggles are long finished, and party conventions are over. But as of the writing of this paper, these factors, crucial for the accuracy of polls, are not yet settled.
A key to predicting a presidential election is using swing states as the focus of the analysis. In the United States, swing states such as Ohio, Florida, Colorado, Virginia, and Nevada differ from other states in that they sit on the border between Republican and Democratic, with vote shares hovering around 50% in recent elections. These swing states are often bellwethers for the presidency. For example, no president has been elected without winning Ohio since John F. Kennedy, who lost Ohio in 1960. Focusing on the state level also makes the data more meaningful. For example, while the national unemployment rate is hovering around 8.2% in the June jobs report from the Bureau of Labor Statistics, North Dakota has unemployment below 5%, whereas Nevada and California have statewide unemployment rates above 10% [link]. Unemployment levels by county can vary even more from the national average [link]. The point here is that while national averages indicate the state of the country as a whole, they are less meaningful for the individual states in question.
One important insight that comes from the level of detail available at the state and local level concerns the relationship between counties in a state. Two observations are critical for the prediction method described herein. First, strongly Republican counties are usually not directly adjacent to strongly Democratic counties; the change between Democratic and Republican counties in a state is gradual rather than sudden. Second, the difference between two counties is not random and does not vary greatly between elections. If one knows the voting percentages of one county, it is possible to estimate the voting percentages of the neighboring counties using historical data. To take advantage of these facts, we can model the counties as a Markov Random Field and find the most likely outcome for each county (maximum a posteriori inference).
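As a rough illustration of the second observation, the sketch below estimates one county's vote share from a neighbor's known share and the average historical gap between the two. It is a simplification for intuition only, not the Markov Random Field method itself; the function name `estimate_neighbor` and all vote shares are hypothetical and invented for this example.

```python
from statistics import mean

# Hypothetical Democratic vote shares (percent) for two neighboring counties
# in three past elections. These numbers are invented for illustration.
history_a = [52.0, 54.0, 50.0]
history_b = [47.0, 50.0, 44.0]


def estimate_neighbor(known_share, known_history, neighbor_history):
    """Estimate a neighbor's share by adding the average historical gap."""
    gaps = [n - k for k, n in zip(known_history, neighbor_history)]
    return known_share + mean(gaps)


# Suppose county A is known to be at 53% in the current election.
estimate_b = estimate_neighbor(53.0, history_a, history_b)
print(f"Estimated share for county B: {estimate_b:.1f}%")
```

Because the gap between the two counties is assumed to be stable across elections, a single average is enough here; a real model would also account for the variation in that gap.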
A Markov Random Field (or Markov network) is an undirected probabilistic graphical model. It consists of a set of nodes connected by a set of edges, where each node takes exactly one state. The nodes in a Markov Random Field have the Markov property in the sense that a node's state depends only on the nodes it is connected to and on its own built-in preferences. Each assignment of states over the network yields a level of energy, which is a measure of the probability of that assignment occurring given the model. The exact probability of a particular assignment is its energy divided by the sum of the energies of all possible assignments (this normalizing sum is known as the partition function). In our research, we focus on pairwise Markov Random Fields, in which any potential function describing the energy attained from an assignment of nodes' states is a function over no more than two nodes. We note that any Markov Random Field can be converted to a pairwise Markov Random Field [link].
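The definitions above can be sketched on a toy pairwise network of three counties in a chain. This is a minimal brute-force example, not the inference procedure used in this research. It assumes that the energy of an assignment is the product of its node and edge potentials, which is one common reading of the description above (much of the literature instead uses "energy" for the negative logarithm of this score). The county names, potential values, and function names are all hypothetical.

```python
from itertools import product

STATES = ("D", "R")
COUNTIES = ("A", "B", "C")            # hypothetical counties in a chain A - B - C
EDGES = (("A", "B"), ("B", "C"))

# Hypothetical node potentials: each county's built-in preference.
NODE_POTENTIAL = {
    "A": {"D": 0.7, "R": 0.3},
    "B": {"D": 0.6, "R": 0.4},
    "C": {"D": 0.2, "R": 0.8},
}


def edge_potential(s, t):
    """Hypothetical pairwise potential: neighboring counties tend to agree."""
    return 2.0 if s == t else 1.0


def energy(assignment):
    """Unnormalized score of one assignment (assumed product of potentials)."""
    score = 1.0
    for county in COUNTIES:
        score *= NODE_POTENTIAL[county][assignment[county]]
    for u, v in EDGES:
        score *= edge_potential(assignment[u], assignment[v])
    return score


# Enumerate all 2^3 assignments; this is feasible only for tiny networks.
assignments = [
    dict(zip(COUNTIES, combo))
    for combo in product(STATES, repeat=len(COUNTIES))
]
total = sum(energy(a) for a in assignments)       # sum over all assignments
probabilities = [energy(a) / total for a in assignments]

# MAP inference: the single assignment with the highest probability.
map_assignment = max(assignments, key=energy)
print(map_assignment, energy(map_assignment) / total)
```

Every potential here depends on at most two nodes, so the network is pairwise. Exhaustive enumeration grows exponentially with the number of counties, so a state-sized network would need an approximate or structured inference method instead.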