Although the polling-rich election season ended months ago, there remains plenty of debate surrounding public opinion polling. Nowadays social media and cable news chatter centers on Donald Trump’s approval ratings. The data, which has come in the form of 97 different opinion polls as of February 27th (see note 1), has been interpreted very differently by separate sects of the public, including by the president himself:
The same people who did the phony election polls, and were so wrong, are now doing approval rating polls. They are rigged just like before.— Donald J. Trump (@realDonaldTrump) January 17, 2017
It’s unlikely that data (of any kind) could sway Trump. But the wide-ranging reaction by the public is more understandable as these polls are telling somewhat different stories. Of the 95 approval rating polls that have been conducted, the average net rating is -3.4 points (net rating = Approve % - Disapprove %, where negative values indicate more people disapprove of Trump than approve of him). But this net approval rating has ranged from as low as -18 points to +18 points.
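To make the net rating arithmetic concrete, here is a minimal Python sketch. The three polls and their numbers are made up for illustration and are not taken from the HuffPost Pollster data.

```python
from statistics import mean

# Hypothetical (approve %, disapprove %) pairs, for illustration only.
polls = [(45, 47), (40, 55), (51, 44)]


def net_rating(approve, disapprove):
    """Net rating = Approve % - Disapprove %; negative means more disapprove."""
    return approve - disapprove


net_ratings = [net_rating(a, d) for a, d in polls]
average_net = mean(net_ratings)
lowest, highest = min(net_ratings), max(net_ratings)

print(net_ratings, round(average_net, 1), lowest, highest)
```

The same three summary numbers (average, lowest, highest) are what the paragraph above reports for the real set of polls.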
Given how a poll can differ by several factors such as the medium through which it’s conducted and the portion of the public it surveys, this variability should not come as a total surprise. What’s important to do in this case is to measure what qualities of polls lead to a more favorable or unfavorable result for Trump in his approval rating. In this way, interpreting the influx of polls out in the public domain becomes easier.
Trying to measure the effects of different factors all at once makes this situation ripe for multivariate regression analysis. Natalie Jackson of HuffPost Pollster took a key first step on this front, finding that the Rasmussen poll, polls of registered voters, and polls conducted online all had significant positive effects in a regression predicting Trump net approval. Below, we expand on this by first running a more recent regression, for both net approval rating and approval percentage, and then calculating house effects for each pollster.
1. What Affects Trump’s Approval Rating?
In order to isolate the effects of different factors, we first have to determine which factors of a poll to look at. In the table below (scroll to the bottom), we ran models that predicted net approval rating (% approving of Trump in a poll minus % disapproving of Trump), appearing in columns 1-3, and approval percentage (% approving of Trump in a poll), appearing in columns 4-6. The data came from the HuffPost Pollster website. We used a few different independent variables for both sets of models:
- Survey population (i.e. polling universe): We looked at polls surveying all adults in the United States, only registered voters, or "likely voters" (modeled on their likelihood to vote in the next election). The "Adult Population" effect serves as the baseline for the estimate for this variable, with the table below showing the effects of a "Registered Voter Population" and "Likely Voter Population" relative to the "Adult Population" effect.
- Survey mode: This variable takes into account how a survey is conducted: through a live phone interview, a self-administered online questionnaire, or a mix of interactive voice response and online surveys (IVR/Online). We limit our scope to these three survey modes. As with the previous variable, we have a baseline ("Live Phone" polls), with effects for "IVR/Online" and "Online" polls measured relative to this baseline appearing in the regression table.
- Days since the inauguration: Calculated as the days between a poll's end field date and January 20th, 2017.
- Poll field time: Calculated as the difference in days between the start and end date of a poll's period in the field.
- No opinion percentage: Calculated by subtracting the percent approving and disapproving of Trump from 100, leaving us with people who weren't sure or had no opinion about Trump in a poll.
- Pollster: The survey house that conducted the poll.
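The calculated variables in this list can be sketched in a few lines of Python. This is only an illustration of the definitions above: the function name, the dictionary keys, and the example poll are our own inventions, not the names used in the actual analysis code.

```python
from datetime import date

INAUGURATION = date(2017, 1, 20)


def derive_poll_variables(start, end, approve, disapprove):
    """Compute the derived variables for one poll (hypothetical field names)."""
    return {
        "net_approval": approve - disapprove,
        # Everyone who neither approves nor disapproves.
        "no_opinion": 100 - approve - disapprove,
        # Days between the end of the field period and the inauguration.
        "days_since_inauguration": (end - INAUGURATION).days,
        # Length of the field period in days.
        "field_days": (end - start).days,
    }


# A made-up poll fielded February 20-24, 2017, with 44% approve, 50% disapprove.
example = derive_poll_variables(date(2017, 2, 20), date(2017, 2, 24), 44, 50)
print(example)
```

Note that with this definition a poll that starts and ends on the same day has a field time of zero; whether the original analysis counted that as zero or one day is not stated.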
For both net approval and approval, we fit three models of Trump approval:
- Model 1: Mode, population, days since inauguration, poll field time, and no opinion percentage
- Model 2: Mode, population, days since inauguration, poll field time, no opinion percentage, and pollster
- Model 3* (my favorite!): Mode, population, days since inauguration, and pollster
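For readers who want to see how the three specifications differ mechanically, here is a rough Python sketch of how one poll becomes one row of predictors under each model. It assumes standard dummy coding against the baselines described above; the function names, dictionary keys, and the three placeholder pollsters are hypothetical.

```python
# The first entry of each tuple is the baseline category.
MODES = ("Live Phone", "IVR/Online", "Online")
POPULATIONS = ("Adults", "Registered Voters", "Likely Voters")


def dummy_code(value, levels):
    """Return one 0/1 indicator per non-baseline level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]


def design_row(poll, model, pollsters):
    """Build the predictor row for Model 1, 2, or 3."""
    row = [1]  # intercept
    row += dummy_code(poll["mode"], MODES)
    row += dummy_code(poll["population"], POPULATIONS)
    row.append(poll["days_since_inauguration"])
    if model in (1, 2):  # Model 3 drops field time and no opinion
        row.append(poll["field_days"])
        row.append(poll["no_opinion"])
    if model in (2, 3):  # Models 2 and 3 add pollster indicators
        row += dummy_code(poll["pollster"], pollsters)
    return row


pollsters = ("Pollster A", "Pollster B", "Pollster C")  # placeholders
poll = {
    "mode": "Online",
    "population": "Registered Voters",
    "days_since_inauguration": 10,
    "field_days": 3,
    "no_opinion": 8,
    "pollster": "Pollster B",
}
for model in (1, 2, 3):
    print(model, design_row(poll, model, pollsters))
```

Under this coding, Model 1 has seven predictors plus an intercept, which lines up with the "df = 7" reported for it in the regression table at the bottom of the page.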
Both sets of models come up with several statistically significant independent variables, and they explain a large amount (anywhere from 74% to 91%) of the variance in Trump approval opinion polls. The output for these models is at the bottom of the page and I encourage you to check it out.
In model 1 predicting net approval, polls that survey registered voters result in a net 4.9 points more for Trump’s approval ratings than polls that survey all adults. Likely voter polls are even nicer to Trump, giving him a 5.2 point boost relative to polls of all adults. These findings are very similar to those of Charles Franklin (law professor and pollster extraordinaire), who looked at the average differences between modes, populations, and pollsters.
Although our analysis is more “statistical” (i.e., it uses a model), it’s worth also recreating his piece.
Firstly, there are differences in pollster modes:
Then, there are differences between populations:
Our analysis echoes this.
We find that the population of a poll (whether it is a poll of all adults, registered voters, or likely voters) has a very significant impact on Trump’s net approval rating. A generic poll of registered voters gives President Donald Trump a net approval rating 4.9 points higher than a poll of all adults. Moving even more to the right, polls of likely voters give Trump a 5.2 point bounce.
Relative to live phone surveys, IVR/online polls are a net 8.9 points and internet polls a net 7.3 points more favorable to Trump. This makes the early mode effect in Trump approval rating polls very clear: surveys conducted online and without a live interviewer result in much better net approval ratings for Trump than surveys conducted over the phone by live interviewers.
The variable for days since the inauguration is also statistically significant, but in the negative direction: with each day we get further away from the inauguration, Trump’s net approval gets 0.26 points worse. This makes sense given that events during his presidency have likely only tarnished his image rather than improved it. Trump’s travel-ban executive order comes to mind.
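Because the model is additive, these coefficients can simply be summed to compare two kinds of polls. The sketch below uses the rounded Model 1 net approval coefficients quoted in the text. It only computes a shift relative to the baseline (a live phone poll of all adults ending on inauguration day), since we don't quote the intercept here, and it holds field time and no opinion percentage fixed. The function name is our own.

```python
# Rounded Model 1 net-approval coefficients quoted in the text (in points).
POPULATION_EFFECT = {"Adults": 0.0, "Registered Voters": 4.9, "Likely Voters": 5.2}
MODE_EFFECT = {"Live Phone": 0.0, "IVR/Online": 8.9, "Online": 7.3}
PER_DAY_EFFECT = -0.26


def shift_from_baseline(population, mode, days_since_inauguration):
    """Expected change in net approval versus a live phone poll of adults on day 0."""
    return (
        POPULATION_EFFECT[population]
        + MODE_EFFECT[mode]
        + PER_DAY_EFFECT * days_since_inauguration
    )


# An online poll of registered voters ending 30 days after the inauguration.
print(round(shift_from_baseline("Registered Voters", "Online", 30), 1))
```

In this example the population and mode effects add up to 12.2 points in Trump's favor, and 30 days of decline takes away 7.8 of them, leaving a shift of about 4.4 points.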
The variable for the number of days a poll was in the field is also significant and negative, which would indicate that the longer a poll was fielded, the worse Trump’s net approval came out. However, it’s hard to see what actual mechanism is causing this, and it’s likely that this variable picks up the effect of another variable (e.g. survey quality, or pollster), so this significant effect is not very meaningful. We drop it entirely when analyzing pollster house effects, although it still has a significant effect on the accuracy of the model.
The second group of models regresses approval percentage, rather than net approval, on all the aforementioned predictors. The same significant effects (coefficients) result and are in the same direction as those in model 1: registered voter populations, IVR/online survey modes, internet only survey modes, fewer days since the inauguration, and shorter field periods result in higher percentages approving of Trump. The variable for no opinion percentage comes up as significant and negative, but this is an artifact of it being related to the dependent variable in this model; approve % and 100 - (approve % + disapprove %) are part of the same 100% of all respondents, so a change in one of these variables will tend to be negatively associated with a change in the other.
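The mechanical link between approve % and no opinion % is easy to see in a toy simulation. The Python sketch below is an illustration only: it draws approve and disapprove percentages independently from made-up ranges (real polls will not behave exactly like this), derives no opinion as the remainder, and measures the correlation.

```python
import random

random.seed(0)

# Made-up ranges, drawn independently; an assumption for illustration only.
approve = [random.uniform(38, 50) for _ in range(1000)]
disapprove = [random.uniform(40, 50) for _ in range(1000)]
no_opinion = [100 - a - d for a, d in zip(approve, disapprove)]


def correlation(xs, ys):
    """Pearson correlation computed from scratch."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


corr = correlation(approve, no_opinion)
print(round(corr, 2))
```

Even though nothing in the simulation "causes" approval to fall when no opinion rises, the correlation comes out strongly negative simply because the two share the same 100%.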
2. Survey House Effects
Evidence of these mode, population, and period effects is not new. Where we add a new layer of understanding is in calculating survey house effects below. As we mentioned above, Charles Franklin recently took a look into house effects. Here’s what he found, in terms of the differences between each pollster’s average Trump approval rating.
We must disclose a caveat to this research into house effects: at this early stage in Trump’s presidency, there aren’t as many approval rating polls to evaluate as we would like. There are currently 36 from Gallup and 23 from Rasmussen, but no other pollster has conducted more than five polls asking about Trump approval. This presents a problem for calculating house effects, as any model based on a small sample of polls from a given pollster will have some error driven by just a single poll.
This being said, survey house effects are likely fairly variable in these first few months of the Trump era. Although effects that appear at this point could easily change over the course of the next few months, it’s still worth taking a look at the differences between Rasmussen, Pew, etc. now. But keep this caveat in mind when viewing the house effect calculations below: they give only a good early picture of house effects, and not as clear a signal as we would get in a few months.
Method: We downloaded our approval rating data for Trump from the HuffPost Pollster website. Including only data for a poll’s entire population (and not just Republicans or Democrats, for example), we created 17 different variables for the 17 different pollsters who have asked about the president’s approval rating. These variables individually went into different regressions predicting Trump approval percentage (or his net approval rating), along with population (adults–the baseline–registered voters, and likely voters), mode (live phone–the baseline–Internet, and IVR/Online), days since the inauguration, poll field time, and the no opinion percentage (as described before). In this way, for each pollster, we were able to make all other polls the baseline in a regression, and then calculate the effect of each pollster on Trump approval (or net rating) relative to a baseline of all other polls. We term this effect–the coefficient from each different regression for each different pollster–the “house effect” of a given pollster.
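The one-regression-per-pollster idea can be sketched in plain Python. This is a simplified illustration, not our actual code: it uses a single control (a registered voter indicator) instead of the full set listed above, a hand-rolled least squares solver instead of a statistics package, and six made-up polls from three placeholder pollsters.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def house_effects(polls):
    """Fit one regression per pollster; its indicator's coefficient is the effect."""
    y = [p["approve"] for p in polls]
    effects = {}
    for house in sorted({p["pollster"] for p in polls}):
        rows = [
            [1.0, float(p["registered_voters"]), float(p["pollster"] == house)]
            for p in polls
        ]
        effects[house] = ols(rows, y)[-1]  # all other polls are the baseline
    return effects


# Made-up polls: approve = 40 + 3 * registered_voters, plus 5 for pollster "A".
polls = [
    {"pollster": "A", "registered_voters": 0, "approve": 45.0},
    {"pollster": "A", "registered_voters": 1, "approve": 48.0},
    {"pollster": "B", "registered_voters": 0, "approve": 40.0},
    {"pollster": "B", "registered_voters": 1, "approve": 43.0},
    {"pollster": "C", "registered_voters": 0, "approve": 40.0},
    {"pollster": "C", "registered_voters": 1, "approve": 43.0},
]
effects = house_effects(polls)
print({house: round(value, 2) for house, value in effects.items()})
```

One thing this toy example makes visible: because each pollster is compared against all other polls, pollster A's built-in 5 points show up as a positive effect for A and also as a negative effect for B and C, whose own polls carry no offset at all. With so few polls per pollster, effects near the extremes can pull the others around in the same way.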
The graph below plots the survey house effect for 17 different pollsters when using approval rating as the dependent variable in 17 different regressions, from greatest effect against Trump in blue to greatest effect in favor of his net rating in red:
After controlling for various different survey characteristics, PPP polls have the strongest in-house effect against Trump out of all polls measuring approval rating of the new president. On the other end of the spectrum, Rasmussen polls have the strongest in-house effect in favor of Trump in terms of producing greater approval.
The table below lays out all the survey house effects for each of the 17 pollsters and for net approval and approval percentage. Let’s use Approval as an example for how to interpret these numbers. Rasmussen polls have an in-house bias that makes Trump net approval 6.9 points better relative to all other polls. Meanwhile, PPP has the opposite effect: relative to all other pollsters, its in-house bias is 6.9 net points worse for Trump. Gallup, at a net -0.7 points, is currently the poll with the smallest in-house survey effect in either direction. The effects for all the other pollsters follow the same scheme: negative values indicate a survey house bias against Trump, and positive values indicate a survey house bias in favor of Trump.
|Pollster|House effect (points)|Direction|
|---|---|---|
|PPP (D)|-6.91|Democratic Bias|
|Politico/Morning Consult|2.69|Republican Bias|
3. Wrapping Up
In sum, we have found that:
- Polls of registered and likely voters are more favorable towards President Trump.
- Polls conducted over the Internet or automatically via phone are also more favorable to POTUS.
- Rasmussen and Public Policy Polling are extreme examples of pollster house effects, where aspects of how a pollster conducts its polls, other than mode and population, affect its read on Trump approval ratings.
We have built on previous work from Natalie Jackson and Charles Franklin, who applied regression models and boxplots (oh, how we love boxplots…) to polls of Trump’s approval rating and found similar results.
As Trump’s presidency progresses, Gallup will likely continue to provide the best look at Trump’s approval rating.
I am in the process of automating the analysis outlined above and publishing an interactive that updates whenever a new poll becomes available. Stay tuned for that!
If you enjoyed the analysis, please consider following Alexander and me on Twitter. We do a lot of work on our blogs, but even more of our musings appear there, in 140 characters or less.
As always, you can find the code for reproducing this analysis here.
I’ll leave you with this ancient insight:
"We hold these truths to be self-evident, that all polls aren't created equal, that they are endowed by their pollster with certain unalienable bias, that among these are mode, population and survey house. That to correct these bias, analysts are instituted among Rstudio, deriving their just powers from the regressions thereof" Thomas Jefferson
1 The previous piece published by Alexander has 95 polls in its analysis. This one has ninety-seven. Although it does not make that much of a difference, you’ll notice that Alex’s piece has Gallup as the “best” pollster, whereas this analysis awards that title to Marist. This is exactly why we emphasize the temporary nature of this analysis; it quite literally changed overnight.
4. Additional Information
If you’re enlightened to the meaning of statistical significance, feel free to read the below regression coefficients. If not, thanks for reading, and we’ll talk next time!
| |Net approval (1)|Net approval (2)|Net approval (3)|Approval (4)|Approval (5)|Approval (6)|
|---|---|---|---|---|---|---|
|Population: Registered Voters|4.953*|-6.261|-6.831|2.477|-3.130|-5.293*|
|Population: Likely Voters|5.232|5.798|4.885|2.616|2.899|4.549|
|Residual Std. Error|4.203 (df = 89)|3.023 (df = 75)|3.161 (df = 77)|2.101 (df = 89)|1.512 (df = 75)|1.692 (df = 77)|
|F Statistic|36.185 (df = 7; 89)|27.929 (df = 21; 75)|27.799 (df = 19; 77)|66.166 (df = 7; 89)|47.242 (df = 21; 75)|40.750* (df = 19; 77)|
|Note:|*p<0.1; **p<0.05; ***p<0.01|