Percentages for Sceptics: Part II

In the first percentages for sceptics post, I showed that, if you are given a percentage, you can work out the minimum number of people to whom you would have to pose a yes-or-no question to be able to get that percentage. Ideally, I hope to add to your scepticism of percentages that are unaccompanied by the number of respondents. It’s easy to be suspicious of nice, round percentages like 10%, 20%, 50% etc., but in fact all but 14 of the whole number percentages can come from polls with 20 or fewer people.

The aim of this post is to take this approach to the next level. After a quick quiz, I’ll go through two examples; the second, where I reverse-engineer a pie chart, is the cleaner of the two. Don’t get hung up on the particulars of the numbers, especially in the dating example. What they are isn’t important; what matters is that we can get them. Most of the post functions as a demonstration of the principle.

Warm-up puzzle: A special case

In some survey 22% of people answered “yes”, 79% answered “no” (both to zero decimal places). Each person interviewed chose exactly one of the two options. What is the least number of people that could have been interviewed to get this result? Answer at the end of this post. It’s an on-topic mathematical question, not involving any silly tricks.

Dating data

Let’s take a horrible press release reported as news by the Daily Mail (and commented on by the Neurobonkers blog) under the succinct headline: The dating rule book is being rewritten with one in four single girls dating three men at a time and a third happy to propose.

Given the trivial nature of the survey, alarm bells should be ringing; and the fact that it is “according to the study by restaurant chain T.G.I. Friday’s” means, like their food, this ‘research’ might best be taken with a pinch of salt.

We’ll concentrate on the following paragraph:

The research showed 24 per cent of unmarried women would happily ask a man to marry them. They might want to think twice, as one in seven men said they would be horrified at the prospect.

But 54 per cent of men said they would be quite happy if their girlfriend proposed and 16 per cent would be ‘over the moon’.

It is not just proposals where women are prepared to be forward, with 65 per cent asking men out on dates, according to the study by restaurant chain T.G.I. Friday’s.

A quarter of women admit they like the feeling of control that comes from asking a man out and 27 per cent say ‘traditional dating rules no longer apply’.

We could go through and apply the percentage sceptic table to each percentage (24% needs 17 people, 54% needs 13, 65% needs 17 again, and 27% requires just 11), but this is fairly mindless. Can we do any better?
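The table lookups can be reproduced directly. Here is a minimal Python sketch, assuming percentages are rounded half up to the nearest whole number; the function name `min_respondents` and the `limit` parameter are illustrative, not taken from the original table.

```python
def min_respondents(percent, limit=100):
    """Smallest poll size that can yield `percent` after rounding to 0 d.p.

    Assumes round-half-up; returns None if no size up to `limit` works.
    """
    for n in range(1, limit + 1):
        for k in range(n + 1):
            # Integer form of floor(100*k/n + 1/2), which avoids float error.
            if (200 * k + n) // (2 * n) == percent:
                return n
    return None

for p in (24, 54, 65, 27):
    print(p, min_respondents(p))
```

Sticking to integer arithmetic sidesteps any doubt about how floating-point values near a half get rounded.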

Neurobonkers gives a link to the original press release, so we can delve further into this word problem. We’ll make the following assumptions:

1. TGI Friday’s PR company interviewed some number of people and asked some multiple choice questions.
2. For each question they took the number who gave a certain response, divided by the number who replied to that question, and multiplied by 100 to give a percentage.
3. They correctly rounded this percentage to 0 decimal places, i.e. to the nearest whole number.
4. Each block quote below has percentages from a single question. (Or, more weakly, the same number of people answered every question posed in each block quote.)

In particular, they asked different questions to men and women, so we definitely shouldn’t mix the two (and hopefully they didn’t either). Let’s begin!

Six per cent of women saying they would definitely ask a man to marry them and a further 24 per cent saying they would consider it.

The idea is to check which numbers of respondents (less than 100) could simultaneously give percentages of 6% and 24% when rounded. Clearly, you can always do this with 100 people, and it’s always possible with any higher number. Consequently, unless we suspect that the number of people questioned is less than a hundred there’s no point doing the analysis (this threshold is 1000 in the second section). I’ve made a simple computer program that calculates this by brute force, so let’s see some results.

The lowest number that can give both 6% and 24% is 17, no better than the list-checking approach, because 1/17 = 5.88…% ≈ 6% (0 d.p.), and 4/17 = 23.52…% ≈ 24% (0 d.p.). However, we can also generate the list of all numbers up to 100 that are possibilities: 17, 33, 34, 49–51, 54, 62, 63, 66–68, 70–72, 78–80, 82–90, and every number from 93 upwards (6/93 = 6.45…% and 22/93 = 23.65…%). Notice the possibilities get closer together the higher up you go.
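The brute-force check described above can be sketched in a few lines of Python. This is not the original program, just one way to do it under the stated assumptions (whole-number percentages, rounded half up); `achievable` and `feasible_sizes` are hypothetical names.

```python
def achievable(n):
    """Set of whole-number percentages obtainable from n respondents."""
    # (200*k + n) // (2*n) equals floor(100*k/n + 1/2), i.e. round half up.
    return {(200 * k + n) // (2 * n) for k in range(n + 1)}

def feasible_sizes(percentages, max_n=100):
    """All n <= max_n for which every given percentage is achievable."""
    wanted = set(percentages)
    return [n for n in range(1, max_n + 1) if wanted <= achievable(n)]

print(feasible_sizes([6, 24], max_n=60))
# [17, 33, 34, 49, 50, 51, 54]
```

Adding more percentages from the same question only ever shrinks the list, which is what makes several figures together more informative than each one alone.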

Nearly a third of women (27 per cent) don’t think that traditional dating rules still apply. A quarter (24 per cent) of women like the feeling of control when they ask a man out. 17 per cent of women love taking the lead in relationships.

To get 17%, 24% and 27% needs a much higher minimum: 41. The list starts 41, 59, 63, 66, 70, etc. Now that you’ve hopefully got the idea, I’ll postpone any further ‘analysis’ until later. To reiterate: like the survey, don’t take the numbers in this post too seriously.

Humble Pie

Here’s a cleaner example. In 2009, Travel PR asked some people which celebrity they would most like to go diving with. They presented some suitable options known to enjoy diving, like Kate Humble, Monty Halls (not of Monty Hall problem fame), Jacques Cousteau, and David Attenborough, but allowed people to nominate someone not on the list. Then they gave all the results to one decimal place in what they deem to be a pie chart. Kate Humble won, and hence they call it a “Humble Pie Chart”. The question is, how big was Kate Humble’s win?

Only a blurry picture has been archived, so here’s my recreation of it, minus celebrities.

Let me just say I find this press release harmless. The fact that the data couldn’t be confused for a serious statistic purporting to be representative of a population helps. The full release of data, sufficient transparency in how they got the information, and the name-play all contribute positively too. Also, they don’t present it as ‘research’. And the over-precision I excuse for the following reason.

With data to one decimal place, we can take a survey which we suspect has queried fewer than one thousand people and be much more confident about the program’s output. The single question means many of the numbered assumptions that we had to make previously are trivially true. We can also sum the percentages to check whether they add up to about 100%. They do, exactly.

Putting these percentages (1.5%, 3.0%, 4.5%, 10.4%, 14.9%, 16.4%, and 17.9%) through my program (“Computer: set decimal places to one”) gives 154 suggestions below 1000, starting: 67, 134, 201, 268, 335, 336, 396, 402, 403, 463, 469, 470, 530…
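For percentages quoted to one decimal place, the same search can be done with exact fractions. The sketch below is an assumption about how such a check might be written, not the program used for the list above. It asks, for each poll size n, whether some whole number of votes k falls in the interval that rounds (half up) to the published figure; `fits` and `feasible_sizes` are illustrative names.

```python
from fractions import Fraction
from math import ceil

def fits(n, percent, places):
    """Can some k out of n round (half up) to `percent` at `places` d.p.?"""
    target = Fraction(percent)              # exact, e.g. Fraction("10.4")
    half = Fraction(1, 2 * 10 ** places)    # half a unit in the last place
    low = (target - half) * n / 100         # need low <= k < high
    high = (target + half) * n / 100
    k = ceil(low)                           # smallest candidate vote count
    return 0 <= k <= n and k < high

def feasible_sizes(percentages, max_n, places=1):
    """All n <= max_n consistent with every published percentage."""
    return [n for n in range(1, max_n + 1)
            if all(fits(n, p, places) for p in percentages)]

humble = ["1.5", "3.0", "4.5", "10.4", "14.9", "16.4", "17.9"]
print(feasible_sizes(humble, max_n=200))
# [67, 134]
```

Passing the percentages as strings keeps values like 10.4 exact, which matters when the rounding boundaries are only 0.05 away.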

But we don’t actually need to take the computer’s word for it (or mine for that matter). Take another look at the percentages. It seems completely clear to me that the 13 lots of 1.5% are single votes, with many of them probably being the interviewees’ unique suggestions (such as Bill Bailey: what might it be like to dive with him?). It seems unlikely that suggestions such as Mick Jagger, Buzz Aldrin and Gordon Brown would each be repeated or, even if they were on the original list, would all receive exactly two votes, with no-one getting only one.

Each other percentage is clearly a multiple of the lowest vote too (at least approximately, to account for rounding). For 1.5% to represent anything other than a single person, all the other votes would have been effectively made in pairs or triples. We might estimate the chance of all the votes in each of the 21 categories being even, as approximately $\frac{1}{2^{21}}$ or about one in two million. That was assuming odds and evens come up with equal probability: with the chance to make suggestions, I effectively claimed earlier that one is more likely than two.

So that means that the numbers out of 67 clearly are: 1, 2, 3, 7, 10, 11, 12. That means that Kate Humble won by a single vote, and should embrace her surname—at least when it comes to the public’s decisions about sub-aqueous companions.
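As a sanity check on those counts, here is a short Python sketch that assumes 67 respondents, converts each published percentage to the nearest whole number of votes, and confirms each count rounds back to the published figure. The variable and function names are illustrative.

```python
from fractions import Fraction

n = 67  # assumed number of respondents
percentages = ["1.5", "3.0", "4.5", "10.4", "14.9", "16.4", "17.9"]

# Nearest whole number of votes implied by each published percentage.
votes = [round(Fraction(p) * n / 100) for p in percentages]

def tenths_of_percent(k, n):
    """100*k/n expressed in tenths of a percent, rounded half up."""
    return (2000 * k + n) // (2 * n)

# Every vote count should reproduce the published one-decimal figure.
consistent = all(tenths_of_percent(k, n) == Fraction(p) * 10
                 for k, p in zip(votes, percentages))

print(votes, consistent)
# [1, 2, 3, 7, 10, 11, 12] True
```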

More dating data

Over half of men (54 per cent) would be happy if their girlfriend popped the question and 16 per cent would be ‘over the moon’.  But a word of warning – one in ten (ten per cent) would be horrified if it happened, according to a new dating survey of unmarried Brits from restaurant chain T.G.I. Friday’s.

Using the three percentages 10%, 16% and 54% results in: 50, 61, 63, 67–70, 79–83, 87, 89–94, 96+.

Men are more than happy to share the responsibility too, one sixth of men (16 per cent) now claiming that traditional dating rules are outdated and sexist. A third of men (31 per cent) find it refreshing to be asked out.

Possibilities giving 16% and 31% are: 32, 45, 49, 51, 55, 58, 61, and lots of higher ones.

Just one in six women (18 per cent) expects a man to pay for the first date and 39 per cent think that splitting the bill is most appropriate.

18% and 39% could have: 28, 33, 38, 44, 49, 51, 56, 57, 61, or many larger numbers of respondents.

Men are also backing this change in attitude with a quarter (25 per cent) saying they would expect a women to contribute to the bill and a further 24 per cent would expect to ‘go Dutch’.

Because 24% and 25% are so close, they lead to our highest minimum: 51, 55, 59, 63, 67, 68, 71, etc. However, we could interpret this result as a two-phase question: 49% expect a contribution, and a subset of these people, 24% of the total, want to ‘go Dutch’ (everyone pays their own way). Though this doesn’t make a big difference to the results, it does to our reverse-engineering, changing the returned numbers radically to: 37, 41, 45, 49, and then all the previous possibilities, with a few extras thrown in.

I’ll ignore single percentages peppered about the press release.

No breaking up

Finally, under one last, slightly dubious assumption, let’s collect all the percentages for women together, and then all those for men together.

The assumption made is one of the following options (which I judge to be in decreasing order of likelihood):

• They persuaded all women who responded to any one question to respond to all of them.
• The PR agency kept women who didn’t respond to one question in the denominator (if they did, then it was wrong to do so).
• The number of women who responded to each question coincidentally happened to be the same for each question. This is pretty unlikely if the first option is false.

Feeding it all the female stats, 6%, 17%, 18%, 24%, 27%, and 39%, the percentage calculatron spits out the possibilities: 66, 71, 82–84, 88–90, 93–96, 98+.

Under the same assumptions for the menfolk, with 10%, 16%, 24%, 25%, 31%, and 54%, we get back: 67, 68, 80, 83, 87, 89, 91, 93, 96, 97, 99+.

Without further information, my best guess is the minimum in each case: 66 women and 67 men. This is a very sceptical estimate. In support of this, it happens to coincide with the number asked in the Humble survey above: it’s not implausible that it could be a rule-of-thumb for survey sizes used by PR companies.

One possible way of being more certain is checking whether the minimum is a local minimum: if you change some of the input percentages by a little bit, does it tend to raise the minimum? In a very unscientific check of a few arbitrary cases for both, small perturbations removed the options under seventy. This test, if performed more rigorously with the same results, might relate to some probabilistic assumptions under which 66/67/68 is the best guess for these surveys, perhaps assuming that nearby responses are equally likely, as is the number of respondents.

For instance, if someone picks either a 6-sided or a 20-sided die at random (decided by flipping a fair coin), and we’re told the outcome is a 5, it’s more likely to have come from the 6-sided die: a 10 in 13 chance, using conditional probability. (Conversely, if we’re told it’s a 7, then we know for certain which die was rolled.)
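The 10 in 13 figure follows from Bayes’ theorem. Writing $D_6$ and $D_{20}$ for the events that the 6-sided or 20-sided die was chosen (notation introduced here purely for illustration), each with prior probability $\frac{1}{2}$:

$$P(D_6 \mid 5) = \frac{\frac{1}{2}\cdot\frac{1}{6}}{\frac{1}{2}\cdot\frac{1}{6} + \frac{1}{2}\cdot\frac{1}{20}} = \frac{\frac{1}{12}}{\frac{1}{12} + \frac{1}{40}} = \frac{\frac{10}{120}}{\frac{13}{120}} = \frac{10}{13}.$$

For a roll of 7 the first term vanishes, since $P(7 \mid D_6) = 0$, so $P(D_{20} \mid 7) = 1$.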

Of course, this all still relies on the additional unreliable assumption above, that everyone answered every question. And survey sizes aren’t chosen uniformly at random (though if smaller sizes are more likely, this works in favour of guessing the minimum).

[Edit: my assumptions didn’t hold. Apparently they asked 2000 singles. As 1000 women is a lot more than 100, I had no chance of being close. Of course, remember, sample size isn’t everything.]

A special case: answer

The minimum number of people we could have interviewed to get 22% and 79% is 200. Note that 22 + 79 = 101, so both percentages must have been rounded up, which can only happen if the pre-rounded results were exactly 21.5% and 78.5%. We therefore need the least $n$ such that $\frac{215n}{1000}=\frac{43n}{200}$ is an integer. As 43 and 200 are coprime, $n=200$ is clearly the least number of interviewees, with 43 saying “yes”.

For a higher number of interviewees to be a valid option, it must also be a multiple of 200.
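The answer can also be confirmed by brute force. This is a minimal Python sketch, assuming halves are rounded up (which the puzzle needs, since both figures sit exactly on a half); the function names and the `limit` parameter are illustrative.

```python
def round_half_up_percent(k, n):
    """100*k/n rounded to the nearest whole number, halves rounding up."""
    return (200 * k + n) // (2 * n)

def smallest_poll(yes_pct, no_pct, limit=1000):
    """Smallest n where some yes/no split of n people gives both figures."""
    for n in range(1, limit + 1):
        for yes in range(n + 1):
            if (round_half_up_percent(yes, n) == yes_pct
                    and round_half_up_percent(n - yes, n) == no_pct):
                return n, yes
    return None

print(smallest_poll(22, 79))
# (200, 43)
```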