Seeing the Forest From the Trees: Modeling Voter Contact Data

By Mike O'Neill

In the early years of doing call centre-based voter contact in Canada, I was often asked by clients, “How am I doing?” It was not always easy to give them an answer! In seeking a solution to this problem, we first developed a data model that would later evolve into a micro-targeting program.

Strict spending limits mean few of our campaigns at the district level have access to professional polling based on valid statistical samples of the population. Campaigns might commission a pre-election poll but not district-specific nightly tracking during the election. Although we were conducting voter identification with a large number of observations, up to 100,000 per night, our sample was not truly random, as valid polling requires. Our challenge was to build a model from the most granular data, make inferences about the whole electorate and, ultimately, predict whether our clients would win or lose their elections.

“How am I doing?” Our experience showed that if a certain percentage of voters indicated they were supporting the candidate and the Liberal Party (we work exclusively for Liberals), the candidate was very likely to win. This heuristic worked well enough but could be confounded by three-way races or electoral districts with unusually high or low voter turnout. In addition, clients could bias the sample of voters we were calling by the order in which they ranked their precincts for calling. They could also prioritize calling past supporters or could reserve these past supporters for contact by their own volunteers. These factors skewed the sample and made close races especially hard to call.

We started working on a solution in advance of the 2003 Ontario General Election, when we would be working on behalf of 60 candidates. We developed a model by assigning every voter (about six million) a probability from 0 to 1.0 that they would respond by saying they intended to vote Liberal. This probability was the expected value. We then took the range of possible answers to the VoterID script and converted them to probabilities ranging from 0 to 1.0. These were the actual values. At the end of each day we took the sample and compared the expected versus actual values, made a determination of whether we were doing better (actual > expected) or worse (actual < expected), then reviewed the previous election results and made predictions about whether we would win or lose the district.
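The daily comparison described above can be sketched in Python. This is only an illustration under stated assumptions, not the model we actually ran: the response labels and the values in `RESPONSE_VALUE` are invented for the example, as are the function name and the sample data.

```python
from statistics import mean

# Hypothetical conversion table from VoterID script answers to "actual" values.
# These labels and numbers are assumptions for illustration only.
RESPONSE_VALUE = {
    "supporter": 1.0,
    "leaning": 0.75,
    "undecided": 0.5,
    "leaning_other": 0.25,
    "opposed": 0.0,
}

def daily_comparison(contacts):
    """Compare expected and actual support for one day's sample.

    contacts: list of (expected_probability, response_label) pairs,
    one pair per voter reached that day.
    Returns (expected, actual, gap), where gap > 0 means doing better
    than expected and gap < 0 means doing worse.
    """
    expected = mean(p for p, _ in contacts)
    actual = mean(RESPONSE_VALUE[label] for _, label in contacts)
    return expected, actual, actual - expected

# Made-up sample of four contacted voters.
contacts = [
    (0.5, "supporter"),
    (0.25, "undecided"),
    (0.75, "opposed"),
    (0.5, "leaning"),
]
expected, actual, gap = daily_comparison(contacts)
print(f"expected={expected} actual={actual} gap={gap}")
```

Because each voter carries their own expected value, the comparison stays meaningful even when the day's sample is not random: a sample weighted toward past supporters raises the expected figure along with the actual one.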

This first iteration of the model worked. We immediately noticed sharp spikes in support in some unexpected electoral districts and the model ultimately predicted the result of the election quite closely.

Notwithstanding this early success, the model needed further refinement. I had made up the table converting voters’ answers (such as “I am undecided at this time”) to numerical probabilities based on experience rather than statistics. We were able to tell clients they were doing better or worse, but not how much better or worse. As well, we felt that we could create two attributes to describe voters: a probability of voting and a probability of voting Liberal. Finally, we wanted to validate our results to see how closely they tracked with actual voting patterns on election day.

We turned to Professor Matt Lebo, a Canadian teaching political science at Stony Brook University on Long Island, NY, and his colleague Professor Andrew Sidman, a recent PhD graduate embarking on a career at City College in New York. Lebo has published a volume of work on electoral modeling and has studied elections in the United States and the United Kingdom.

Lebo and Sidman first validated the 10 years of data we had collected, which showed a high degree of correlation between our results and actual election results at both the precinct and electoral district levels. Using more advanced statistical techniques, they then re-created the coefficients for voters’ answers. Lastly, they described every Canadian voter in terms of a probability of voting Liberal and a probability of voting. We then put the model to work in the 2007 Ontario and 2008 federal general elections, each day comparing expected and actual results, and it proved successful in predicting outcomes.

The model has also become a micro-targeting tool. We can display a graph that plots every voter in a riding according to their voting behaviour and identify core supporters, supporters who vote infrequently, swing voters, supporters of other parties and non-participants. In the 2008 election we targeted infrequent voters with GOTV calls during the advance polls, and targeted swing voters with persuasion calls.
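As a rough sketch of how two probabilities per voter can be turned into the groups named above, consider the Python below. The cut-off values (0.3 and 0.7) and the function name are assumptions chosen for illustration; they are not the thresholds used in our model.

```python
def segment(p_vote, p_liberal, lo=0.3, hi=0.7):
    """Assign a voter to a targeting group.

    p_vote: probability the voter turns out (0 to 1.0).
    p_liberal: probability the voter supports the Liberal candidate (0 to 1.0).
    lo, hi: hypothetical cut-offs, not taken from the article.
    """
    if p_vote < lo:
        return "non-participant"
    if p_liberal >= hi:
        # Supporters are split by how reliably they vote.
        return "core supporter" if p_vote >= hi else "infrequent supporter"
    if p_liberal <= lo:
        return "other-party supporter"
    return "swing voter"

# Made-up voters: (probability of voting, probability of voting Liberal).
examples = [(0.9, 0.9), (0.5, 0.9), (0.9, 0.5), (0.9, 0.1), (0.1, 0.9)]
labels = [segment(pv, pl) for pv, pl in examples]
print(labels)
```

Under this scheme the "infrequent supporter" group would receive GOTV calls and the "swing voter" group persuasion calls, matching the way the two groups were handled in 2008.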

We can also display a matrix showing the number of voters in each of these groups so that our clients can see how to build their winning plurality. Do I need to activate my core vote, or do I need to recover voters who moved to other parties? We expect increased interest in this type of targeting following the success of the Obama campaign in “re-shaping the electorate” and convincing traditional non-participants to vote in 2008.
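A matrix of this kind can be approximated by counting voters in a grid of turnout band by support band. The Python below is a minimal sketch: the three bands, their cut-offs and the sample voters are all assumptions made for the example.

```python
from collections import Counter

LEVELS = ("low", "mid", "high")

def band(p, lo=0.3, hi=0.7):
    """Bucket a probability into a band; the cut-offs are hypothetical."""
    if p < lo:
        return "low"
    return "high" if p >= hi else "mid"

def voter_matrix(voters):
    """Count voters by (turnout band, Liberal support band).

    voters: iterable of (p_vote, p_liberal) pairs.
    Returns a nested dict: matrix[turnout_band][support_band] -> count.
    """
    counts = Counter((band(pv), band(pl)) for pv, pl in voters)
    return {t: {s: counts[(t, s)] for s in LEVELS} for t in LEVELS}

# Made-up riding of five voters.
voters = [(0.9, 0.9), (0.9, 0.8), (0.5, 0.9), (0.9, 0.5), (0.1, 0.2)]
matrix = voter_matrix(voters)

# Print rows as turnout bands and columns as support bands.
for turnout in LEVELS:
    print(turnout, [matrix[turnout][support] for support in LEVELS])
```

Reading the grid answers the client's question directly: a large count in the mid-turnout, high-support cell suggests activating the core vote, while a large count in the high-turnout, mid-support cell points toward persuasion.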


Mike O’Neill is the President of First Contact, a Canadian company that specializes in voter contact management for Liberal clients, and of First Voter Contact, a U.S. company serving Democrats. O’Neill has been active in politics for 20 years as a staff member, campaign manager and service provider, and has been a frequent speaker at campaign training events including Campaigns and Elections.
