Friday, March 28, 2014

Myth vs. Math: Big Servers in Tiebreakers

This is the first installment in a five-part series called Myth vs. Math. In this series, I am going to take a look at five widely-accepted statements that tennis writers, analysts, fans, and commentators frequently make. I'm going to take these statements and see how they hold up against the numbers. The first statement in this series is "big servers have a notable advantage in tiebreakers."

The Myth

In tennis, the serve is an absolutely essential part of every player's game. It's the shot that starts every point. Having a big serve like Goran Ivanisevic, Andy Roddick, or Ivo Karlovic can be a huge weapon for getting out of trouble. However, having a weak serve means working a lot harder to win service games.

Any time one of the big servers reaches a tiebreaker against a player who relies more on other parts of his game, someone is bound to claim that the bigger server has the advantage. Before a tiebreaker, when one commentator asks the other whom they are picking to win it, the response is often something along the lines of "John Doe has the bigger serve, so I would have to give him the advantage in this tiebreaker."

And these kinds of statements seem to make sense. Sets involving big servers have a higher probability of reaching a tiebreaker, so the big servers have more experience in those situations. Also, taking care of the points on serve is particularly crucial when one mini-break can determine a set.

However, does that mean the big servers actually have an advantage? Every point in a tiebreaker matters, whether it is on serve or not. And the big servers tend to be weak returners. So is being a big server really an advantage in a tiebreaker, or does it just make up for a weak return game?

The Math

The statistical mark of any big server is aces. So what I want to see is how high the correlation between aces and tiebreak win percentage is. If there is a high correlation, then the claim holds up, but if there isn't a notable correlation, then the math does not support it.

The ATP began tracking aces back in 1991. Since then, 42 players have played at least 600 tour-level matches. I used those 42 players as my sample to create this scatterplot.

On the x-axis is each player's aces per match, while the y-axis is each player's win percentage in tiebreaks multiplied by 50. I multiplied it by 50 simply to make the ranges of the two sets of data comparable. Each 'x' on the scatterplot represents one of the 42 players. The line through the middle of the data is a linear least squares fit. Its slope shows how much, on average across these players, tiebreak win percentage rises with each additional ace per match.
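For readers who want to try a fit like this themselves, here is a minimal Python sketch of an ordinary least-squares line. The function name `fit_line` and the four toy data points are hypothetical placeholders, not the 42-player data behind the scatterplot, and this is not necessarily how the original line was computed.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations between x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up numbers for illustration only (NOT real player statistics).
toy_aces_per_match = [2.0, 4.0, 6.0, 8.0]
toy_tiebreak_win_pct = [50.0, 52.0, 51.0, 55.0]

slope, intercept = fit_line(toy_aces_per_match, toy_tiebreak_win_pct)
print(f"slope = {slope:.3f} points of win percentage per extra ace per match")
print(f"intercept = {intercept:.3f}")
```

The slope is the number to read: it is the change in tiebreak win percentage that goes with one more ace per match. Rescaling the y-axis (for example, multiplying by 50) rescales the slope by the same factor but does not change how tightly the points cluster around the line.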
 
To make this a little more tangible, let's look at how individual players fit on the scatterplot. The 'x' furthest to the right of the graph is Goran Ivanisevic, who won 58.9% of his tiebreakers, which is better than expected for a player with over 13 aces per match. The 'x' in the upper-left is Rafael Nadal, who strikes an average of just under three aces per match, but still wins 63.8% of his tiebreakers. At the bottom of the graph is Vincent Spadea, who won just 41.9% of his tiebreakers, while hitting 3.5 aces per match. The 'x' that is closest to sitting perfectly on the line is Tim Henman, who hit 5.9 aces per match, while winning 53.6% of his tiebreakers.
 
The slope of the linear regression is very mild, indicating that while there is a connection between aces and winning percentage in tiebreakers, it is very weak. The correlation coefficient is merely .277. Squaring it gives the share of the variation explained, which means that aces per match account for only about 7.7% of the differences in tiebreak win percentage among these players.
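The step from a correlation coefficient to a "percent explained" figure is just squaring it. Here is a short Python sketch of both pieces, assuming the Pearson correlation is the coefficient in question. The function name `pearson_r` and the toy data are hypothetical; only the .277 comes from the analysis above.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("each list needs some variation")
    return sxy / math.sqrt(sxx * syy)


# Made-up numbers for illustration only (NOT real player statistics).
toy_r = pearson_r([2.0, 4.0, 6.0, 8.0], [50.0, 52.0, 51.0, 55.0])
print(f"toy correlation = {toy_r:.3f}")

# The coefficient reported in the post, squared to get the share of variation explained.
r_reported = 0.277
r_squared = r_reported ** 2
print(f"share of variation explained = {r_squared:.1%}")
```

With the rounded coefficient of .277, the square comes out to roughly 7.7%, so more than 90% of the player-to-player differences in tiebreak win percentage are left unexplained by aces per match.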
 
Flaws in the Math
 
First, the scatterplot does nothing to account for differences in surface; it treats all surfaces as equal. If it were broken down by surface, we might find that the statement holds up better on one surface than on another. However, the plot does show that overall, the connection is hardly impressive.

Second, every point can involve two kinds of serves: a first serve and a second serve. Aces, however, are primarily a first-serve statistic, since they are so rare off the second serve. The biggest servers, though, are able to get a substantial number of unreturned serves on the second delivery. A comparison of unreturned serves to tiebreak win percentage would produce a more accurate scatterplot. However, stats on unreturned serves aren't tracked at every match.
 
The third potential flaw is the sample itself. Since I only looked at players with at least 600 career matches played, I looked only at players who won often enough to play 600 matches. As a result, the majority of the players in the scatterplot had a winning record. In fact, the average win percentage in tiebreakers of the 42 players is 54%. A look at more players with a losing record in tiebreakers could create a more complete scatterplot.
 
Conclusion

The math seems to disprove the claim that "big servers have a notable advantage in tiebreakers." Having a big serve does seem to be a slight benefit in tiebreakers, but it is not much of a factor in determining the winner of a tiebreaker.
 
Supporters of the claim will often point to John Isner, who was averaging exactly 16 aces per match after his run to the semifinals in Indian Wells and has a 65% winning mark in tiebreakers. However, the data shows that Isner's stats are an outlier and not indicative of other players. So perhaps the reason for Isner's success in tiebreakers has to do with something more than just having a monstrous serve.
 
So the widely-accepted claim that having a big serve is an advantage in tiebreakers should be rejected.
