DDHQ’s election research on applications of correlated simulations was recently published in the Harvard Data Science Review. We found that, with some minor mathematical simplification, we could simulate elections faster than we previously could. Our strategy rests on the fact that adding large matrices is computationally much cheaper than multiplying them. We also wanted to make sure that every combination of individual race outcomes made sense. For example, Republican overperformance in one Midwestern state should increase Republican chances of winning other Midwestern states, not decrease them. In our paper, we explain how some of the most well-respected election models, such as those from FiveThirtyEight and The Economist, can produce some very bizarre combinations of outcomes, and we describe how we avoid this problem by separating states into regions and analyzing the states in each region together.
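To make the idea of addition-only correlated simulations concrete, here is a minimal Python sketch. It is not the method from the paper. It only assumes that each state’s simulated margin is the sum of a forecast mean, a national shift, a shift shared by the state’s region, and state-level noise. The function name, the standard deviations, and the example margins are all hypothetical.

```python
import numpy as np

def simulate_margins(mean_margins, region_ids, national_sd=0.03,
                     regional_sd=0.03, state_sd=0.03, n_sims=10_000, seed=0):
    """Simulate correlated margins using only additions (illustrative sketch).

    mean_margins: forecast margin per state (positive = Republican lead).
    region_ids:   integer region label per state, numbered from 0.
    All standard deviations are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    mean_margins = np.asarray(mean_margins, dtype=float)
    region_ids = np.asarray(region_ids)
    n_regions = int(region_ids.max()) + 1

    # One national shift per simulation, shared by every state.
    national = rng.normal(0.0, national_sd, size=(n_sims, 1))
    # One shift per region per simulation, shared by states in that region.
    regional = rng.normal(0.0, regional_sd, size=(n_sims, n_regions))
    # Independent noise for each state.
    local = rng.normal(0.0, state_sd, size=(n_sims, mean_margins.size))

    # Only additions: no covariance matrix is factorized or multiplied.
    return mean_margins + national + regional[:, region_ids] + local

# Four hypothetical states: the first two share region 0, the last two region 1.
sims = simulate_margins([0.02, -0.01, 0.05, -0.04], [0, 0, 1, 1])
corr = np.corrcoef(sims, rowvar=False)
print(corr.round(2))
```

Because every shared term is added with a positive sign, states can only be positively correlated in this sketch, and states in the same region are more strongly correlated than states in different regions.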
We also tackled the problem of handling elections with three or more viable candidates. With multiple candidates, one has to model each candidate’s vote share separately, rather than the margin between two candidates. We resolved the issue during this year’s primary elections by creating a geographic model that used the results from completed counties and the locations of the candidates’ home bases to estimate each candidate’s final vote share in real time, along with their probability of winning. The theory is a little complicated, but we assume not only that counties that have reported are predictive of nearby counties yet to come in, but also that candidates perform strongest around their home counties. This process enabled us to see the eventual outcomes at a very early stage in states like Nebraska and North Carolina during the primary season. Looking ahead to the general election, we also devised a way to estimate the outcome of the US House elections.
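The two assumptions in this paragraph can be sketched in a few lines of Python. This is an illustrative toy, not DDHQ’s actual geographic model: the exponential distance kernel, the `distance_scale` and `home_boost` parameters, the planar coordinates, and the function name are all assumptions made for the example. It estimates vote shares only and does not model win probabilities.

```python
import numpy as np

def estimate_outstanding_shares(coords, reported_shares, reported_mask,
                                home_coords, distance_scale=50.0,
                                home_boost=0.10):
    """Estimate vote shares in unreported counties (illustrative sketch).

    coords:          (n_counties, 2) planar county coordinates, e.g. in km.
    reported_shares: (n_reported, n_candidates) shares in reported counties,
                     in the same order as the True entries of reported_mask.
    reported_mask:   (n_counties,) True where a county has fully reported.
    home_coords:     (n_candidates, 2) location of each candidate's home base.
    distance_scale and home_boost are hypothetical tuning parameters.
    """
    coords = np.asarray(coords, dtype=float)
    shares = np.asarray(reported_shares, dtype=float)
    mask = np.asarray(reported_mask, dtype=bool)
    home = np.asarray(home_coords, dtype=float)

    # Assumption 2: a candidate gets a bonus that decays with distance
    # from their home base.
    home_dist = np.linalg.norm(coords[:, None, :] - home[None, :, :], axis=2)
    home_effect = home_boost * np.exp(-home_dist / distance_scale)

    # Strip the home bonus from reported counties to get a baseline.
    baseline = shares - home_effect[mask]

    # Assumption 1: nearby reported counties are more predictive, so weight
    # them by an exponential kernel of county-to-county distance.
    dist = np.linalg.norm(
        coords[~mask][:, None, :] - coords[mask][None, :, :], axis=2)
    weights = np.exp(-dist / distance_scale)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted baseline plus each unreported county's own home bonus,
    # renormalized so the shares in each county sum to 1.
    estimate = np.clip(weights @ baseline + home_effect[~mask], 1e-9, None)
    return estimate / estimate.sum(axis=1, keepdims=True)

# Hypothetical example: two reported counties, two outstanding, three candidates.
est = estimate_outstanding_shares(
    coords=[[0, 0], [100, 0], [10, 0], [90, 0]],
    reported_shares=[[0.5, 0.3, 0.2], [0.3, 0.5, 0.2]],
    reported_mask=[True, True, False, False],
    home_coords=[[0, 0], [100, 0], [50, 1000]],
)
print(est.round(3))
```

In this toy example, the outstanding county near the first candidate’s home base is estimated to favor that candidate, and the one near the second candidate’s home base favors the second.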
When DDHQ calls a race, we are never certain what the exact final margin will be; we are only sure which candidate, the Republican or the Democrat, will win. Consequently, we have to take the remaining uncertainty about a called race’s margin into account. For example, Democrats winning every district in Massachusetts could be consistent with anything from a blue wave to a red wave, depending on the margins in each race. If we guess the margins, we artificially narrow the range of possible outcomes, and guessing exact margins is not a good idea in an era when large blue shifts and red shifts occur in different states. We have combined these ideas with our fast correlated simulation methodology to create a live seat calculator for the House that adjusts its estimate as we call races on election night. Our calculator is quite flexible: it can quickly adjust to wave scenarios (though it is probably not flexible enough to account for an AOC or MTG loss), and it provides confidence intervals and what-if scenarios.
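One simple way to condition on a call without guessing its margin is to keep only the simulations in which the called party wins that district, whatever the margin. The Python sketch below does this by filtering simulated margins. It is an assumption-laden illustration, not DDHQ’s calculator: the function name, the five hypothetical districts, and the noise levels are made up for the example.

```python
import numpy as np

def seat_distribution(sim_margins, calls):
    """Republican seat counts across simulations consistent with the calls.

    sim_margins: (n_sims, n_districts) margins, positive = Republican lead.
    calls:       dict mapping district index -> "R" or "D" for called races.
    Only the winner is conditioned on; the margin in a called race stays
    uncertain, so a call still shifts correlated uncalled districts.
    """
    sim_margins = np.asarray(sim_margins, dtype=float)
    keep = np.ones(sim_margins.shape[0], dtype=bool)
    for district, winner in calls.items():
        if winner == "R":
            keep &= sim_margins[:, district] > 0
        else:
            keep &= sim_margins[:, district] < 0
    kept = sim_margins[keep]
    if kept.shape[0] == 0:
        raise ValueError("no simulation is consistent with the calls")
    return (kept > 0).sum(axis=1)

# Hypothetical correlated simulations for five districts.
rng = np.random.default_rng(1)
means = np.array([0.00, 0.02, -0.02, 0.10, -0.10])
shared = rng.normal(0.0, 0.04, size=(20_000, 1))   # shift common to all
local = rng.normal(0.0, 0.04, size=(20_000, 5))    # district-level noise
sims = means + shared + local

before = seat_distribution(sims, {})
after = seat_distribution(sims, {0: "R"})          # district 0 called for R
low, high = np.percentile(after, [5, 95])          # 90% interval on R seats
print(before.mean(), after.mean(), (low, high))
```

A what-if scenario is the same call with hypothetical entries in `calls`, for example `seat_distribution(sims, {0: "R", 2: "D"})`. Filtering becomes inefficient once many races are called, since few simulations remain, so a production calculator would need a more efficient conditioning step.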
Check out the full article at this link: https://hdsr.mitpress.mit.edu/pub/ftwzgrij/release/1?readingCollection=7e5ac077