Friday, October 24, 2014

Draw Fixing

There has always been a belief in the tennis world that draws aren't as random as they are supposed to be. It is a big claim, but based on the statistics it appears to have some validity. However, the statisticians who are trying to prove that the draws aren't entirely random use an old mathematical trick to make their point. The trick isn't false, but it is misleading.

One of the best examples of using this kind of mathematical persuasion about draw-fixing is in this article. It is a very well written article in which the writer clearly has a very good understanding of statistics. However, the writer also has an agenda, which can result in drawing the wrong conclusions.

It was written by Katarina Pijetlovic and the trick she used was simple, but well disguised. I'll explain it with an example: Consider the match between John Isner and Nicolas Mahut. There were 137 holds to start the fifth set. Based on the stats, there was an 85% chance of Mahut holding in each game and an 87% chance of Isner holding in each game. That means there was a 0.000000106% chance of there being 137 holds to start the fifth set.
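
The figure above can be reproduced with a short calculation. This is only a minimal sketch: it assumes every service game is independent, and it assumes the 137 holds split into 69 for Isner and 68 for Mahut. That split and the function name are illustrative assumptions, not something taken from the match stats.

```python
def prob_all_holds(hold_a: float, games_a: int, hold_b: float, games_b: int) -> float:
    """Probability that both servers hold every service game, assuming independence."""
    return (hold_a ** games_a) * (hold_b ** games_b)


# Assumed split of the 137 holds: 69 for Isner (87% hold rate), 68 for Mahut (85%).
p = prob_all_holds(0.87, 69, 0.85, 68)

print(f"{p:.3e}")           # probability as a fraction
print(f"{p * 100:.10f}%")   # the same value expressed as a percentage
```

Swapping the split to 68 and 69 changes the result only slightly, so the conclusion does not depend on who served first.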

Based on this, you could conclude that Mahut and Isner decided that they wanted to hold the record for longest match ever so neither of them tried to break serve. Then maybe even add a narrative to make the argument even stronger. Say Isner and Mahut were trying to get players to receive more prize money for first round exits, so they made the match last three days to strengthen their argument that they deserve it.

However, this conclusion is certainly false. Nobody who watched the match thought this was happening on purpose. Although the statistical probability makes this argument seem undeniable, an honest look at the events that took place shows that the match wasn't fixed.

Pijetlovic has made the same argument in her article. She took an event that has already occurred. Then she went back and measured the statistical odds of it. After finding the odds very low, she drew a wrong conclusion that she supported with a narrative about tournament directors wanting Rafael Nadal and Roger Federer in finals.

This can be done with almost anything.

There is one other stat that Pijetlovic uses that has nothing to do with odds that jumps out at readers. It's that Federer and Djokovic were on the same half of the draw in 12 consecutive grand slam events on hard and grass courts.

However, this stat is flawed too. Look at all the qualifiers that exist in this stat: Federer, Djokovic, slam event, hard, and grass court. That is five different qualifiers. If you have enough qualifiers, anything can be an impressive stat.

Consider Tobias Kamke - the No. 93 player in the world. He is a decent player, but most tennis fans probably won't remember him long after he retires. Yet, he is the best player that was born in Germany, plays a two-handed backhand, hasn't had a 29th birthday yet, and has a win over a top 10 opponent.

With just four qualifiers, I made Kamke the best player in the world. Pijetlovic uses five! With every qualifier added, her number loses statistical significance. After all five qualifiers, 12 seems almost expected - not a sign of fixing.
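
One way to put a number on the qualifier argument is to ask how likely it is that at least one equally rare streak shows up somewhere, given how many combinations of players, surfaces, and time windows one could have looked at. The sketch below assumes each candidate pattern is independent and as unlikely as one specific 12-for-12 streak under a fair 50/50 placement; the pattern counts and the function name are hypothetical, chosen only for illustration.

```python
def prob_at_least_one(p_single: float, n_patterns: int) -> float:
    """Chance that at least one of n independent patterns, each with probability p_single, occurs."""
    return 1.0 - (1.0 - p_single) ** n_patterns


# One specific 12-for-12 streak, assuming a fair 50/50 placement in every draw.
p_single = 0.5 ** 12

# Hypothetical numbers of player/surface/time-window combinations one could search.
for n in (1, 100, 1000, 10000):
    print(n, round(prob_at_least_one(p_single, n), 4))
```

The more combinations that are available to search after the fact, the less surprising it is that one of them produces a striking streak.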

Pijetlovic did a great job using these tricks and I would do the same thing if I wanted to prove a theory like that. It's a very effective strategy, but it doesn't work this time because draw-fixing simply doesn't happen on the ATP or in the ITF.

7 comments:

  1. Very interesting. I have a few comments to make, though.

    I think the example that you present with the Mahut-Isner match is not really relevant. Of course, once an event has happened, one can go backwards and compute the probability of exactly every factor involved happening; the more factors one includes, the lower the probability. Nobody would jump to the conclusion that the people involved were planning that precise outcome just because the probability of it happening is very small. A tennis match is not governed by random acts; there is more decision making involved. They don't toss a coin to decide which way to serve, or whether to try to break serve or not. There is action and reaction. To assume that once an event has occurred, the persons involved must have planned it exactly that way and then to assume a motivation for doing so would indeed be non-rational, to put it mildly.

    This is not the claim of the article you linked.

    There are certain events which, by nature, are planned to be random. Like tennis draws or the national lottery, one hopes.

    Imagine a national lottery in which every year the son of the president wins it. Somebody has to win it, right? If there are only 2 tickets sold, one might believe that the son of the president was lucky. The more tickets that you sell, and the more years that the son keeps getting the prize, the more suspicious one gets. It starts to look less random.

    That's the claim of the article: that to have 12 Grand Slam draws and to get Novak always in Roger's half and Murray always in Rafa's half is a bit suspicious. A truly random event would be 6 and 6, or 5 and 7... or even 4 and 8. To go 12 and 0 is a little suspicious, to put it mildly.

    The fact that only in Roland Garros one gets truly random-looking draws is also interesting. So Novak got to face Rafa on clay and nowhere else.

    I had read the article that she mentions, analysing the computer-generated part of the Grand Slam draws. They present statistical anomalies as well.

    I am very aware that statistical anomalies are not conclusive proof, but they cannot be dismissed by the examples that you gave. They are not dealing with the same kind of phenomena.

    Could you explain the Kamke example? I didn't get why you concluded that "12 would be expected".

    One of the many misunderstandings of probability I have encountered is used to prove the existence of God. The argument goes: the probability that we have such a planet, with all the right conditions for us to thrive, is so low that somebody must have planned it. It's the other way round: we are here precisely because those conditions were met. Otherwise we wouldn't be here.

    Think of the probability that the right sperm fertilised the right egg and as a result you were born. It's almost 0. Yet you are here. That doesn't mean that somebody rigged the draws. LOL

    But it is a totally different situation if you get somebody with a coin and you start betting. If it lands on heads 12 consecutive times, you would be perfectly justified in asking yourself whether the coin is quite even or it has a little flaw.

    Lastly, you finish the article like this:

    "It's a very effective strategy, but it doesn't work this time because draw-fixing simply doesn't happen on the ATP or in the ITF."

    Nothing in your article justifies that conclusion. Saying it simply doesn't happen doesn't mean it doesn't.

    I'm not saying it does: I don't have enough proof. All I am saying is that draws don't look completely random. And they haven't for years. That's all.




  2. "One of the many misunderstandings of probability I have encountered is used to prove the existence of God. The argument goes: the probability that we have such a planet, with all the right conditions for us to thrive, is so low that somebody must have planned it. It's the other way round: we are here precisely because those conditions were met. Otherwise we wouldn't be here."

    What you said here was great. I was going to use this example, but I switched it to the Isner-Mahut example to keep it about tennis.

    Of course you are right that someone would be justified to question the process if the stats were as shocking as they appear to be. But I think readers of Pijetlovic's article look at her reasons to question the process as proof that the process is rigged. I'm saying that is an illogical leap.

    About the 12 statistic: The original stat reads as: The draws produced "the same result 12 out of 12 times"
    But there were also the qualifiers, which were:
    1. From 2008 to 2011
    2. At grand slam events
    3. On grass courts or hard courts
    4. With Federer and Djokovic
    5. On the same side

    You take the original stat, and you say there is a very low probability of that. Then you consider the qualifiers and realize it isn't too shocking.

    My example: Tobias Kamke is the best player in the world
    Qualifiers: Among players that
    1. were born in Germany
    2. play a two-handed backhand
    3. haven't had a 29th birthday yet
    4. have a win over a top 10 opponent

    When you consider the qualifiers you realize that there are probably many players better than Kamke in the world.

    Also, my conclusion that draws are not rigged was not based on anything I said in this article. It's just a belief I have. My goal in this article is to show that the evidence for draw fixing is misleading and that there is no reason to think that it does happen.

    Thanks for your comment. Those are some valuable insights and you clearly have a strong understanding of probability and statistics.

  3. I just wrote an answer and it got lost. Grrr... I'll type it again, but I need to check that it doesn't disappear.

  4. Full disclosure: I didn't read the whole of Pijetlovic's article. I could sense that she was going for the overkill and would end up hurting her case. :-)
    But I did read, a few years ago, the study she mentions about a statistical analysis of GS draws. The study checked how random the players that the top seeds had to face in the first rounds actually were. It appears that the Roland Garros draws looked quite random, but the other slams presented statistical anomalies. Not surprisingly, at least to me, this is the ranking of the Grand Slams according to how random their draws looked, from most random to least random:

    Roland Garros
    Australian Open
    Wimbledon
    USO

    (I know that Wimbledon likes to give the image that they don't care about advertising and sponsors and mundane things like money, but it's all a facade, of which we have had some evidence not too long ago lol)

    In any case, I ignored all talk of qualifiers and concentrated on one fact that I had already noticed: Novak ending up in Roger's half in 12/12 slams on hard courts and grass. The probability of that is 1/4096. Enough to raise an eyebrow. In contrast, during that same period, Novak was drawn in Roger's half 2 times and in Rafa's half 2 times at Roland Garros... It does look a little different, does it not? There's a different feeling to it. :-)

    I know that fact alone is not conclusive evidence, it is just odd. But it is not misleading evidence. Maybe insufficient.

    It is your belief that draws are not rigged... It used to be mine as well (with a few exceptions, understandable and irrelevant in the large scheme of things). But recent draws have forced me to discard that belief. I don't have enough evidence to believe that draws are indeed rigged. So I'm on the fence on this one right now.

    During the Christmas break, if I have the time and the inclination, I may sit down and analyse recent draws properly. For the time being, I'm just paying attention to certain patterns that bother me a little. I have to do some proper work before reaching any conclusions.

    Still: I am sad that I had to drop that belief. I love this sport.

    PS I don't really have a good grasp of statistics and probability. I took a course at university as an undergraduate, but I'm a pure mathematician. I don't really like probability and statistics. LOL

  5. "But it is not misleading evidence. Maybe insufficient."
    You're right about that part. The evidence itself isn't misleading (numbers can never be misleading), but the way it was interpreted is misleading. Yes, the odds of this occurring are 1/4096, but lots of things occur in draws that are just as rare if given enough qualifiers. This one just happens to involve two top players.

    What I don't understand is why people think the tournaments would want the top two seeds (and not 3 and 4) to have particularly easy or hard draws, when they almost always win them regardless of how hard they are.

  6. "What I don't understand is why people think the tournaments would want the top two seeds (and not 3 and 4) to have particularly easy or hard draws, when they almost always win them regardless of how hard they are."

    That's not really what the study was about. The ESPN study analysed draws with reference to all the seeded players. In the US, especially, where TV ratings are of paramount importance, it makes sense to protect names who are recognisable to the casual viewer, because that would increase ratings. And, not surprisingly, the study found the most anomalies, with seeded players getting easier draws in general, precisely in the USO draws.

    The case I was referring to, and which I find incredibly odd, is the fact that Murray was drawn in Rafa's half and Novak in Roger's in 12/12 GS tournaments on hard courts and grass. Let's remember that in their early careers, Murray had a much better rate of success against Roger than against Rafa, and for Novak it was the other way round. The Fedal rivalry was a product very successfully marketed and firmly in place. It is no wonder that organisers preferred a Fedal final to any other final. Hence making it difficult for the newcomers to prevent the dream final from happening. On clay it was irrelevant... And we get random draws... Or maybe the French are more moral, whatever the prejudices. :-)

    The fact is that sponsors/TV announcers/tournament organisers got the winning ticket out of a possible 4096 in the lottery organised by themselves. :-)

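
The 1/4096 figure and the "6 and 6, or 5 and 7" intuition from the comments can be checked against the binomial distribution. This is a minimal sketch, assuming each of the 12 draws independently places the two players in the same half with probability 1/2; the function name is hypothetical.

```python
from math import comb


def prob_same_half(k: int, n: int = 12, p: float = 0.5) -> float:
    """Binomial probability of exactly k same-half placements in n independent draws."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Print the full distribution for 12 draws, from 0 to 12 same-half placements.
for k in range(13):
    print(k, f"{prob_same_half(k):.5f}")
```

Under these assumptions a 12 and 0 split has probability 1/4096, while splits near 6 and 6 carry most of the probability. The calculation says nothing about how many other pairings or time windows could have been examined.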