"The Polls Were Wrong" Is the Easiest Answer

Since Sunday’s election I have been in the improbable position of defending Turkish pollsters. To be clear, I have had many, many quibbles with their lack of methodological transparency and the perception of bias that causes.

Unlike other forms of public opinion or policy research, however, election polling provides a day of reckoning: you're right or you're wrong. It’s there for everyone to see. If you can't get the election right, within the margin of error of your sample, no one should pay attention to your data. If you repeatedly get it right, you deserve a degree of credibility.

Here's the thing: several Turkish pollsters were pretty damn close to getting the 7 June parliamentary election correct. And when I say "correct" I mean "correct within the appropriate margin of error in the days immediately before the election." Therefore, their November polls, which missed the mark, should not be dismissed outright, especially since multiple pollsters reported similar results.

I'm going to digress a bit. It’s absolutely fundamental to understand the basic principles of probability sampling if you’re going to comment on pollsters’ performance.

A poll reflects voters' views on the days the survey was conducted, within a margin of error. Here's what that means: if you draw a truly random sample of, let's say, n=1000 people from a universe (Turkey, for example), write the questions and implement the survey in ways that diminish the biases inherent in survey research, and your sample reflects the demographics of your universe, then your data will reflect the views of that universe on that day within a standard margin of error of plus or minus 3.1 percentage points (characterized, in shorthand, as MoE +/- 3). That means any data point could be as much as three points higher or three points lower. This is critical to take into consideration when declaring a poll "right" or "wrong" relative to election results or stating a candidate/party "is ahead in the polls."
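For anyone wondering where a figure like 3.1 comes from, here is a minimal sketch of the usual calculation. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst case of an even 50/50 split; the function name margin_of_error is mine, and real pollsters may also adjust for weighting and sample design.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error in percentage points for a simple random sample of size n.

    Assumes a 95% confidence level (z=1.96) and, by default, the worst-case
    proportion p=0.5. These defaults are illustrative assumptions.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)

for n in (1000, 3000, 5000):
    print(f"n={n}: +/-{margin_of_error(n):.1f}")
```

Quadrupling the sample only halves the margin, which is why pollsters rarely go much beyond a few thousand interviews.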

Two important takeaways:

• The only way to achieve a margin of error of zero is to interview every single person in the universe (Turkey, for example). That's impossible and is why we rely on probability sampling. The trade-off is we have to accept and accommodate the margin of error in our analysis. If we fail to do that, we're wrong. Period.

• Pre-election polls are not designed to project what will happen on election day (you can do that through modeling, but it's risky). This is why everyone (especially candidates who are about to lose) says the only poll that matters is the one on election day -- it's the only one that's a 100% accurate report of voters' views with no margin of error.

If you don't believe all this, go take a statistics class and then we'll argue about it. It's science, not magic. Also, please do not give me an exegesis on academic research. Like these pollsters, I work in the real world with budgets and time constraints.

So, let's look at the last three public polls taken before the 7 June election. I chose these three because 1) fieldwork was conducted the week or two before the election and 2) they shared their sample sizes so we know the margin of error. (There may be others, but I found these data here). We want to look at polls conducted as close as possible to the election because they’ll capture the effects of late-breaking campaign dynamics. (Also, not rounding is an affectation. I round. Personal opinion).

 

 

Pollster       AKP   CHP   MHP   HDP   Sample Size   MOE       Date
MAK            44    25    16    19    n=2155        +/-2.1    18-26 May
SONAR          41    26    18    10    n=3000        +/-1.8    25 May
Gezici         39    29    17    12    n=4860        +/-1.4    23-24 May
Andy Ar        42    26    16    11    n=4166        +/-1.5    21-24 May
June Results   41    25    17    13    n/a           0         7 June

 

I draw two conclusions.

First, putting aside ORC, which overrepresented AKP and underrepresented HDP, Konda and Gezici were pretty damn close to the final result (by that I mean close to within the MoE), considering the data were collected a week before election day.

Second, though it can be risky to compare data collected by different operations, their data are very similar, which suggests they are using similar methodology and making similar assumptions. That’s the way it should be.

Next, let’s look at publicly released data for the November election. I borrowed most of these data from the delightful James in Turkey and he did not always include the margin of error. I will take that up with him at a future date. Let’s assume the pollsters who did not report a sample size interviewed between n=3000 and n=5000 people (that’s what they did in June), so their margin of error will be between +/-1 and +/-2.

 

 

Pollster           AKP   CHP   MHP   HDP   Sample Size   MOE         Date
Andy R             44    27    14    13    n=2400        +/-2        24-29 Oct
Konda              42    28    14    14    n=2900        +/-1.8      24-25 Oct
A&G                47    25    14    12    n=4536        +/-1.4      24-25 Oct
Metropoll          43    26    14    13    not given     not given   15 Oct
Gezici             43    26    15    12    not given     not given   15 Oct
ORC                43    27    14    12    not given     not given   15 Oct
AKAM               40    28    14    14    not given     not given   15 Oct
Konsensus          43    29    13    12    not given     not given   15 Oct
Unofficial Final   49    25    12    11    N/A           +/-0        1 November

 

On these rounded figures, AKP’s final number falls outside every poll’s MoE. A&G came closest, missing by two points against a margin of +/-1.4. The next closest, Andy R, conducted the latest fieldwork so was in the best position to capture emerging trends, such as a surge in AKP support. Andy R still underestimated AKP support by five percentage points. That’s a lot. A&G didn’t release any tracking data so it’s hard to know if it’s an outlier or ahead of the others in capturing the AKP surge. The latter is possible and I will address it in a future post.
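A quick way to check a claim like this is to set each poll's gap from the result against its stated margin. Here is a minimal sketch using only the rounded AKP figures and margins from the table above, for the three pollsters that reported sample sizes; the names polls and gap_report are mine. Note that on these rounded numbers A&G's two-point gap is itself a bit larger than its +/-1.4 margin, although it is by far the closest of the three.

```python
# AKP share and stated margin of error (percentage points), taken from the
# November table above. These are rounded figures, as in the table.
polls = {
    "Andy R": (44, 2.0),
    "Konda": (42, 1.8),
    "A&G": (47, 1.4),
}
AKP_RESULT = 49  # unofficial final, rounded

def gap_report(polls, result):
    """Return {pollster: (gap in points, True if the gap is within the MoE)}."""
    return {
        name: (result - share, abs(result - share) <= moe)
        for name, (share, moe) in polls.items()
    }

for name, (gap, within) in gap_report(polls, AKP_RESULT).items():
    print(f"{name}: missed AKP by {gap} points, within MoE: {within}")
```

The same check works for any party or any election: swap in the relevant column and result.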

If consistent sampling methodologies and questions are used, it’s possible to track data over time to see if they change. Big unexplainable differences from one dataset to another could indicate a problem in the methodology. I like it when pollsters provide election tracking data. It suggests sound sampling and alerts us to important trends in public opinion.

For fun, let’s take a look at two of those who did:

 

KONDA                       AKP   CHP   MHP   HDP
June 7 Results              41    25    17    13
Aug 8-9                     44    26    15    13
5-6 Sept                    42    25    16    12
3-4 Oct                     41    29    15    12
17-18 Oct                   42    28    15    13
24-25 Oct                   42    28    14    14
Unofficial November Final   49    25    12    11

 

 

GEZICI                      AKP   CHP   MHP   HDP
June 7 Results              41    25    17    13
3-4 Oct                     41    28    17    14
17-18 Oct                   41    27    16    13
24-25 Oct                   43    26    15    12
Unofficial November Final   49    25    12    11

 

Not only are these two pollsters consistent over time, they are also consistent with the final June results and compare favorably with each other. Nothing in either of their datasets suggests a big shift in opinion toward AKP (they do indicate a slight upward AKP trend, which is plausible). Yet, in the end, their last polls are wrong, wrong, wrong about the November result. That’s really troubling.

How could pollsters who nailed it in June have missed it in November? How can they be consistent over time and with each other and be wrong on election day? Falling back on “the polls are wrong” as analysis is simply inadequate. If you’re going to disregard months of consistent data, you should provide an explanation for how it went wrong.

I honestly can’t give an adequate explanation. Because I have other things to do and you have short attention spans when it comes to statistics, I will address what I think are the three most likely polling error culprits in future posts. These include (in no particular order of likelihood):

• Errors in methodology (this will address the absurd argument that since UK and US pollsters were wrong, it follows that polls in Turkey are also wrong. I can’t believe this is even necessary)

• Errors in analysis (not reporting or considering Undecideds or softening support, which is my current theory of choice)

• Election dynamics that cannot be captured by polling

 

NOTES: If you want to look at a few other pollsters’ June data, here they are. I don’t think it’s totally fair to judge their accuracy based on data collected weeks before election day, but, with the exception of under-representing HDP, most of them (except MAK) actually are pretty close and provide more evidence of the consistency of public opinion. Being off on HDP can be forgiven because HDP had what campaign people refer to as momentum, and it is plausible HDP’s support increased in the final weeks.

 

 

Pollster       AKP   CHP   MHP   HDP   Sample Size   MOE       Date
MAK            44    25    16    19    n=2155        +/-2.1    18-26 May
SONAR          41    26    18    10    n=3000        +/-1.8    25 May
Gezici         39    29    17    12    n=4860        +/-1.4    23-24 May
Andy Ar        42    26    16    11    n=4166        +/-1.5    21-24 May
June Results   41    25    17    13    N/A           0         7 June