Category Archives: Research

Avoiding Double Faults When It Matters

The more gut-wrenching the moment, the more likely it is to stick in memory.  We easily recall our favorite player double-faulting away an important game; we quickly forget the double fault at 30-0 in the middle of the previous set.  Which one is more common? The mega-choke or the irrelevancy?

There are three main factors that contribute to double faults:

  1. Aggressiveness on second serve. Go for too much, you’ll hit more double faults.  Go for too little, your opponent will hit better returns.
  2. Weakness under pressure. If you miss this one, you lose the point. The bigger the point, the more pressure to deliver.
  3. Chance. No server is perfect, and every once in a while, a second serve will go wrong for no good reason.  (Also, wind shifts, distractions, broken strings, and so on.)

In this post, I’ll introduce a method to help us measure how much each of those factors influences double faults on the ATP tour. We’ll soon have some answers.

In-game volatility

At 30-40, there’s more at stake than at 0-0 or 30-0.  If you believe double faults are largely a function of server weakness under pressure, you would expect more double faults at 30-40 than at lower-pressure moments.  To properly address the question, we need to attach some numbers to the concepts of “high pressure” and “low pressure.”

That’s where volatility comes in.  It quantifies how much a point matters by considering several win probabilities.  An average server on the ATP tour starts a game with an 81.2% chance of holding serve.  If he wins the first point, his chances of winning the game increase to 89.4%. If he loses, the odds fall to 66.7%.  The volatility of that first point is defined as the difference between those two outcomes: 89.4% – 66.7% = 22.7%.

(Of course, any number of things can tweak the odds. A big server, a fast surface, or a crappy returner will increase the hold percentages. These are all averages.)

The least volatile point is 40-0, when the volatility is 3.1%. If the server wins, he wins the game (after which, his probability of winning the game is, well, 100%). If he loses, he falls to 40-15, where the heavy server bias of the men’s game means he still has a 96.9% chance of holding serve.

The most volatile point is 30-40 (or ad-out, which is logically equivalent), when the volatility is 76.0%.  If the server wins, he gets back to deuce, which is strongly in his favor. If he loses, he’s been broken.
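For readers who want to reproduce numbers like these, here is a minimal sketch. It assumes every point on serve is won independently with the same probability `p`, which is a simplification and not necessarily the model behind the figures quoted above. The function names and the value `p = 0.64` are my own illustrative choices.

```python
from functools import lru_cache

def hold_probability(server_pts, returner_pts, p):
    """Chance the server wins the game from a score counted in points won.

    Points are capped at 3 each: deuce is (3, 3), ad-in is treated as
    40-30 (3, 2) and ad-out as 30-40 (2, 3).
    """
    q = 1.0 - p

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == 4:
            return 1.0
        if b == 4:
            return 0.0
        if a == 3 and b == 3:
            # From deuce, the server must win two straight points
            # before the returner does.
            return p * p / (p * p + q * q)
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_pts, returner_pts)

def volatility(server_pts, returner_pts, p):
    """Hold probability if the server wins the point minus if he loses it."""
    if (server_pts, returner_pts) == (3, 3):
        # After deuce the score moves to ad-in or ad-out.
        return hold_probability(3, 2, p) - hold_probability(2, 3, p)
    return (hold_probability(server_pts + 1, returner_pts, p)
            - hold_probability(server_pts, returner_pts + 1, p))

p = 0.64  # assumed share of service points won; not taken from the post
for label, score in [("0-0", (0, 0)), ("40-0", (3, 0)), ("30-40", (2, 3))]:
    print(label, round(100 * volatility(*score, p), 1))
```

With `p = 0.64`, this simple model lands close to the 22.7%, 3.1%, and 76.0% figures above, though a different `p` would move every number.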

Mixing in double faults

Using point-by-point data from 2012 Grand Slam tournaments, we can group double faults by game score.  At 40-0, the server double faulted 3.0% of points; at 30-0, 4.2%; at ad-out, 2.8%.

At any of the nine least volatile scores, servers double faulted 3.0% of points. At the nine most volatile scores, the rate was only 2.7%.

(At the end of this post, you can find more complete results.)

To be a little more sophisticated about it, we can measure the correlation between double-fault rate and volatility.  The relationship is obviously negative, with an r-squared of .367.  Given the relative rarity of double faults and the possibility that a player will simply lose concentration for a moment at any time, that’s a reasonably meaningful relationship.

And in fact, we can do better.  Scores like 30-0 and 40-0 are dominated by better servers, while weaker servers are more likely to end up at 30-40. To control for the slightly different populations, we can use “adjusted double faults” by estimating how many DFs we’d expect from these different populations.  For instance, we find that at 30-0, servers double fault 26.7% more than their season average, while at 30-40, they double fault 28.6% less than average.

Running the numbers with adjusted double fault rate instead of actual double faults, we get an r-squared of .444.  To a moderate extent, servers limit their double faults as the pressure builds against them.
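The adjustment can be sketched in a few lines. This is one plausible reading of "adjusted double faults," not necessarily the exact calculation used for the numbers above. It assumes each point record carries the game score, whether the point was a double fault, and the server's season-long double-fault rate. The record layout and the toy data are invented for illustration.

```python
from collections import defaultdict
from math import sqrt

def adjusted_df_by_score(points):
    """points: iterable of (score, was_double_fault, server_season_df_rate).

    Returns {score: actual DFs / expected DFs - 1}. Expected DFs is the sum
    of each server's season DF rate over the points played at that score.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    for score, was_df, season_rate in points:
        actual[score] += 1.0 if was_df else 0.0
        expected[score] += season_rate
    return {s: actual[s] / expected[s] - 1.0
            for s in expected if expected[s] > 0}

def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy / sqrt(sxx * syy)) ** 2

# Toy data, made up purely to exercise the code.
toy_points = (
    [("30-0", True, 0.03)] * 4 + [("30-0", False, 0.03)] * 96
    + [("30-40", True, 0.04)] * 3 + [("30-40", False, 0.04)] * 97
)
print(adjusted_df_by_score(toy_points))
```

Feeding `r_squared` one volatility value and one adjusted rate per game score would give a figure comparable to the r-squared values discussed here.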

More pressure on pressure

At any pivotal moment, one where a single point could decide the game, set, or match, servers double fault less than their seasonal average.  On break point, 19.1% less than average. With set point on their racket, 22.2% less. Facing set point, a whopping 45.2% less.

The numbers are equally dramatic on match point, though the limited sample means we can only read so much into them.  On match point, servers double faulted only 4 times in 296 opportunities (1.4%), while facing match point, they double faulted only 4 times in 191 chances (2.1%).

Better concentration or just backing off?

By now, it’s clear that double faults are less frequent on important points.  Idle psychologizing might lead us to conclude that players lose concentration on unimportant points, leading to double faults at 40-0. Or that they buckle down and focus on the big points.

While there is surely some truth in the psychologizing–after all, Ernests Gulbis is in our sample–it is more likely that players manage their double fault rates by changing their second-serve approach.  With a better than 9-in-10 chance of winning a game, why carefully spin it in when you can hit a flashy topspin gem into the corner?  At break point, there’s no thought of gems, just fighting on to play another point.

And here, the numbers back us up, at least a little bit.  If players are avoiding double faults by hitting more conservative second serves on important points, we would expect them to lose a few more second serve points when the serve lands in play.

It’s a weak relationship, but the data at least points in the expected direction.  The correlation between in-game volatility and percentage of second serve points won is negative (r = -0.282, r-squared = 0.08).  The results may be complicated by the returner’s own conservative approach on such points, when his initial goal is simply to keep the ball in play.

Clearly, chance plays a substantial role in double faults, as we expected from the beginning.  It’s also clear that there’s more to it.  Some players do succumb to the pressure and double fault some of the time, but those moments represent the minority.  Servers demonstrate the ability to limit double faults, and do so as the importance of the point increases.


The Unlikeliness of Inducing Double Faults

Some players are much better returners than others.  Many players are such good returners that everyone knows it, agrees upon it, and changes their game accordingly.  This much, I suspect, we can all agree on.

How far does that go? When players are altering their service tactics and changing their risk calculations based on the man on the other side of the net, does the effect show up in the numbers? Do players double fault more or less depending on their opponent?

Put it another way: Do some players consistently induce more double faults than others?

The conventional wisdom, to the extent the issue is raised, is yes.  When a server faces a strong returner, like Andy Murray or Gilles Simon, it’s not unusual to hear a commentator explain that the server is under more pressure, and when a second serve misses the box, the returner often gets the credit.

Credit where credit isn’t due

In the last 52 weeks, Jeremy Chardy’s opponents have hit double faults on 4.3% of their service points, the highest rate of anyone in the top 50.  At the other extreme, Simon’s opponents doubled only 2.8% of the time, with Novak Djokovic and Rafael Nadal just ahead of him at 2.9% and 3.0%, respectively.

The conventional wisdom isn’t off to a good start.

But the simple numbers are misleading–as the simple numbers so often are.  Djokovic and Nadal, going deep into tournaments almost every week, play tougher opponents.  Djokovic’s median opponent over the last year was ranked 21st, while Chardy’s was outside the top 50.  While it isn’t always true that higher-ranked opponents hit fewer double faults, it’s certainly something worth taking into consideration.  So even though Chardy has certainly benefited from some poorly aimed second serves, it may not be accurate to say he has benefited the most–he might have simply faced a schedule full of would-be Fernando Verdascos.

Looking now at the most recent full season, 2012, it turns out that Djokovic did face those players least likely to double fault.  His opponents DF’d on 2.9% of points, while Filippo Volandri’s did so on 3.9% of points.  While these are minor differences when compared to all points played, they are enormous when attempting to measure the returner’s impact on DF rate.  Djokovic “induced” double faults on 3.0% of points and Volandri did so on 3.9% of points, which shows the importance of considering their opponents.  Despite the difference in rates, neither player had much effect on his opponents, at least as far as double faulting is concerned.

This approach allows us to express opponents’ DF rate in a more efficient way, relative to “expected” DF rate.  Volandri benefited from 1% more doubles than expected, Chardy enjoyed a whopping 39% more than expected, and–to illustrate the other extreme–Simon received 31% fewer doubles than his opponents would be predicted to suffer through.
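As a rough sketch of the "relative to expected" idea: add up the double faults a returner actually received, then divide by the number his opponents would have hit at their own season-long rates. The tuple layout and the toy matches below are assumptions for illustration. The post does not spell out whether each opponent's season rate excludes matches against the returner in question.

```python
def df_vs_expected(matches):
    """matches: iterable of
    (opp_service_points, opp_double_faults, opp_season_df_rate).

    Returns actual DFs received relative to expected,
    e.g. 0.39 means 39% more double faults than expected.
    """
    matches = list(matches)
    actual = sum(dfs for _, dfs, _ in matches)
    expected = sum(points * rate for points, _, rate in matches)
    return actual / expected - 1.0

# Two made-up matches: 7 actual double faults against 6.4 expected.
toy_matches = [(80, 4, 0.03), (100, 3, 0.04)]
print(f"{df_vs_expected(toy_matches):+.1%}")
```

Weighting by service points means a long five-setter counts for more than a quick straight-sets win, which seems like the natural choice here.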

You can’t always get what you want

One thing is clear by now. Regardless of your method and its sophistication, some players got a lot more free return points in 2012 than others.  But is it a skill?

If it is a skill, we would expect the same players to top the leaderboard from one year to the next.  Or, at least, the same players would “induce” more double faults than expected from one year to the next.

They don’t.  I found 1405 consecutive pairs of “player-years” since 1991 with at least 30 matches against tour-level regulars in each season. Then I compared their adjusted opponents’ double fault rate in year one with the rate in year two.  The correlation is positive, but very weak: r = 0.13.
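Here is a sketch of how such a year-over-year check could be assembled. The dictionary layout, the function names, and the toy numbers are hypothetical. Only the 30-match threshold comes from the description above.

```python
from math import sqrt

def consecutive_pairs(player_years, min_matches=30):
    """player_years: {(player, year): (matches, adjusted_opp_df_rate)}.

    Returns (year-one rate, year-two rate) for every player with enough
    matches in two consecutive seasons.
    """
    pairs = []
    for (player, year), (matches, rate) in player_years.items():
        following = player_years.get((player, year + 1))
        if (following is not None and matches >= min_matches
                and following[0] >= min_matches):
            pairs.append((rate, following[1]))
    return pairs

def pearson_r(pairs):
    """Pearson correlation of the first and second element of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

# Toy player-years, invented for illustration.
toy = {
    ("A", 2010): (40, 0.10), ("A", 2011): (35, -0.05), ("A", 2012): (20, 0.20),
    ("B", 2010): (50, -0.10), ("B", 2011): (45, 0.00),
}
print(consecutive_pairs(toy))
```

Player A's 2011-2012 pair drops out because the 2012 season falls short of 30 matches. With the real data, `pearson_r(consecutive_pairs(...))` is the number to compare against the 0.13 reported here.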

Nadal, one player who we would expect to have an effect on his opponents, makes for a good illustration.  In the last nine years, he has had six seasons in which he received fewer doubles than expected, three with more.  In 2011, it was 15% fewer than expected; last year, it was 9% more. Murray has fluctuated between -18% and +25%. Lots of noise, very little signal.

There may be a very small number of players who affect the rate of double faults (positively or negatively) consistently over the course of their career, but a much greater amount of the variation between players is attributable to luck.  Let’s hope Chardy hasn’t built a new game plan around his ability to induce double faults.

The value of negative results

Regular readers of the blog shouldn’t be surprised to plow through 600 words just to reach a conclusion of “nothing to see here.”  Sorry about that. Positive findings are always more fun. Plus, they give you more interesting things to talk about at cocktail parties.

Despite the lack of excitement, there are two reasons to persist in publishing (and, on your end, understanding) negative findings.

First, negative results indicate when journalists and commentators are selling us a bill of goods. We all like stories, and commentators make their living “explaining” causal connections.  Sometimes they’re just making things up as they go along. “That’s bad luck” is a common explanation when a would-be winner clips the net cord, but rarely otherwise.  However, there’s a lot more luck in sport than these obvious instances.  We’re smarter, more rational fans when we understand this.

(Though I don’t know if being smarter or more rational helps us enjoy the sport more.  Sorry about that, too.)

Second, negative results can have predictive value. If a player has benefited or suffered from an extreme opponents’ double-fault rate (or tiebreak percentage) and we also know that there is little year-to-year correlation, we can expect that the stat will go back to normal next year. In Chardy’s case, we can predict he won’t get as many free return points, thus he won’t continue to win quite as many return points, thus his overall results might suffer.  Admittedly, in the case of this statistic, regression to the mean would have a tiny effect on something like winning percentage or ATP rank.
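To put a rough number on "back to normal," here is a back-of-the-envelope sketch. It assumes the textbook regression-to-the-mean rule, in which the forecast deviation is the year-to-year correlation times the observed deviation. It also assumes a tour-wide average deviation of zero and a similar spread in both seasons. That is my simplification, not a model taken from the analysis above.

```python
def regressed_forecast(observed_deviation, year_to_year_r):
    """Expected deviation from average next year, given this year's deviation."""
    return year_to_year_r * observed_deviation

# Chardy: 39% more opponent double faults than expected; r = 0.13.
forecast = regressed_forecast(0.39, 0.13)
print(f"{forecast:+.1%}")
```

Under those assumptions, a +39% season points to something near +5% the following year, which is barely distinguishable from average.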

So at Heavy Topspin, negative results are here to stay. More importantly, we can all stop trying to figure out how Jeremy Chardy is inducing all those double faults.


The Mirage of Surface Speed Convergence

Rafael Nadal won Indian Wells. Roger Federer won on the blue clay. Even Alessio Di Mauro won a match on a hard court last week.

That’s just a sliver of the anecdotal evidence for one of the most common complaints about contemporary ATP tennis: Surface speeds are converging. Hard courts used to play faster, allowing for more variety in the game and providing more opportunities to different types of players. Or so the story goes.

This debate skipped the stage of determining whether the convergence is actually happening. The media has moved straight to the more controversial subject of whether it should. (Coincidentally, it’s easier to churn out columns about the latter.)

We can test these things, and we’re going to in a minute.  First, it’s important to clarify what exactly we mean by surface speed, and what we can and cannot learn about it from traditional match statistics.

There are many factors that contribute to how fast a tennis ball moves through the air (altitude, humidity, ball type) and many that affect the nature of the bounce (all of the same, plus surface). If you’re actually on court, hitting balls, you’ll notice a lot of details: how high the ball is bouncing, how fast it seems to come off of your opponent’s racket, how the surface and the atmosphere are affecting spin, and more.  Hawkeye allows us to quantify some of those things, but the available data is very limited.

While things like ball bounce and shot speed can be quantified, they haven’t been tracked for long enough to help us here.  We’re stuck with the same old stats — aces, serve percentages, break points, and so on.

Thus, when we talk about “surface speed” or “court speed,” we’re not just talking about the immediate physical characteristics of the concrete, lawn, or dirt.  Instead, we’re referring to how the surface–together with the weather, the altitude, the balls, and a handful of other minor factors–affects play.  I can’t tell you whether balls bounced faster on hard courts in 2012 than in 1992.  But I can tell you that players hit about 25% more aces.

Quantifying the convergence

In what follows, we’ll use two stats: ace rate and break rate.  When courts play faster, there are more aces and fewer breaks of serve.  The slower the court, the more the advantage swings to the returner, limiting free points on serve and increasing the frequency of service breaks.

To compare hard courts to clay courts, I looked for instances where the same pair of players faced off during the same year on both surfaces.  There are plenty–about 100 such pairs for each of the last dozen years, and about 80 per year before that, back to 1991.  Focusing on these head-to-heads prevents us from giving too much weight to players who play almost exclusively on one surface.  Andy Roddick helped increase the ace rate and decrease the break rate on hard courts for years, but he barely influences the clay court numbers, since he skipped so many of those tournaments.

Thus, we’re comparing apples to apples, like the matches this year between David Ferrer and Fabio Fognini.  On clay, Ferrer aced Fognini only once per hundred service points; on hard, he did so six times as often.  Any one matchup could be misleading, but combine 100 of them and you have something worth looking at.  (This methodology, unfortunately, precludes measuring grass-court speed.  There simply aren’t enough matches on grass to give us a reliable sample.)

Aggregate all the clay court matches and all the hard court matches, and you have overall numbers that can be compared.  For instance, in 2012, service breaks accounted for 22.0% of these games on clay, against 20.5% of games on hard.  Divide one by the other, and we can see that the clay-court break rate is 7.4% higher than its hard-court counterpart.

That’s one of the smallest differences of the last 20 years, but it’s far from the whole story.  Run the same algorithm for every season back to 1991 (the extent of available stats), and you have everything from a 2.8% difference in 2002 to a 32.8% difference in 2003.  Smooth the outliers by calculating five-year moving averages, and you finally get something a bit more meaningful:

[Chart “breakdiff”: five-year moving averages of the clay-court vs. hard-court break rate difference]

The larger the number, the bigger the gap between hard and clay courts.  The most extreme five-year period in this span was 2003-07, when there were 25.4% more breaks on clay courts than on hard courts.  There has been a steady decline since then (to 16.9% for 2008-12), but not to as low a point as the early 90s (14.0% for 1991-1996), and only a bit lower than the turn of the century (17.8% for 1998-2002).  These numbers hardly identify the good old days when men were men and hard courts were hard.
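The two calculations behind these numbers, the clay-to-hard ratio and the five-year smoothing, can be sketched as follows. The function names are mine. I am also assuming the moving average is a plain unweighted mean of the yearly differences over consecutive seasons, which the description above does not specify.

```python
def relative_difference(clay_rate, hard_rate):
    """How much higher the clay-court rate is than the hard-court rate."""
    return clay_rate / hard_rate - 1.0

def trailing_moving_average(values_by_year, window=5):
    """Unweighted trailing average, keyed by the last year of each window.

    Assumes the years in values_by_year are consecutive.
    """
    years = sorted(values_by_year)
    averages = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        averages[years[i]] = sum(values_by_year[y] for y in span) / window
    return averages

# 2012 break rates as quoted: 22.0% on clay, 20.5% on hard.
print(f"{relative_difference(0.220, 0.205):.1%}")
```

From the rounded rates this gives about 7.3%. The 7.4% quoted above presumably comes from the unrounded figures.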

When we turn to ace rate, the trend provides even less support for the surface-convergence theory.  Here are the same 5-year averages, representing the difference between hard-court ace rate and clay-court ace rate:

[Chart “acediff2”: five-year moving averages of the hard-court vs. clay-court ace rate difference]

Here again, the most diverse results occurred during the 5-year span from 2003 to 2007, when hard-court aces were 51.3% higher than clay-court aces.  Since then, the difference has fallen to 46%, still a relatively large gap, one that only occurred in two single years before 2003.

If surfaces are converging, why is there a bigger difference in aces now than there was 10, 15, or 20 years ago? Why don’t we see hard-court break rates getting any closer to clay-court break rates?

However fast or high balls are bouncing off of today’s tennis surfaces, courts just aren’t playing any less diversely than they used to.  In the last 20 years, the game has changed in any number of ways, some of which can make hard-court matches look like clay-court contests and vice versa.  But with the profiles of clay and hard courts relatively unchanged over the last 20 years, it’s time for pundits to find something else to complain about.


Warming Up and Losing Out

This week’s pair of ATP warmups for the Australian Open provide quite the contrast.

In Sydney, only one seeded player (the hardly automatic Andreas Seppi) reached the semifinals, and only one other even made the quarters. Across the ditch in Auckland, three of the final four are among the top four seeds, and the fourth, Gael Monfils, would typically sport a ranking in the same range.

Sydney fits a conventional narrative, while Auckland confounds it. The week before a Grand Slam, many of the top players are out of action, while those who are in action … well, let’s just say warmups don’t always appear to be their top priority.

Winning in 250s

The ATP schedule gives us a convenient natural experiment in order to determine whether slam warmups really are different.

(For convenience, I’m using the term “warmups.” However, we’re only looking at tournaments the week before a slam starts. Sydney is included, but not Brisbane, even though events two weeks before the Australian Open and Wimbledon are generally called “warmups.”)

Since 2009, all of the lowest rung of tour-level events have been worth 250 points to the winner. Conveniently, all tourneys the week before slams have fallen into this category.

To see if players seem to treat slam warmups differently from other events, we can simply compare results from warmups to those from other 250s. It isn’t perfect, since a few 250s have draws of more than 32 players and the field quality isn’t identical in all tourneys at this level, but by looking at a few different metrics, we can limit the impact of those quibbles.

Who cares?

Let’s start by simply counting wins and losses of seeded players. In slam warmups from 2009 through 2012, seeds won about 61% of matches against unseeded opponents (224 of 365), while in other 250s, seeds won over 70% of those matches (1499 of 2129). That’s a substantial difference.

To eliminate the quirks of the bigger 250 draw at Queen’s Club, and perhaps toss out some first-round retirements as well, let’s consider the records that seeds have posted in specific rounds.

In the round of 16 at slam warmups, seeds have gone 71-50, for a winning percentage of 58.7%. At other 250s, seeds have won 591 against 223 losses, a percentage of 72.6%.

In the quarterfinals of slam warmups, seeds have beaten unseeded players in 33 of 46 matches–71.7% of encounters. In other 250s, similar matchups have gone to the seeded player 200 of 275 times, or 72.7% of the time.
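For anyone who wants to check the arithmetic, the records quoted so far can be tabulated in a few lines. The win and match counts are taken from the text above. The layout and names are just one convenient way to organize them.

```python
# (wins, matches) for seeds against unseeded opponents, 2009-2012.
records = {
    "all rounds":    {"warmups": (224, 365), "other 250s": (1499, 2129)},
    "round of 16":   {"warmups": (71, 71 + 50), "other 250s": (591, 591 + 223)},
    "quarterfinals": {"warmups": (33, 46), "other 250s": (200, 275)},
}

def win_pct(wins, matches):
    """Winning percentage on a 0-100 scale."""
    return 100.0 * wins / matches

for label, groups in records.items():
    warmup = win_pct(*groups["warmups"])
    other = win_pct(*groups["other 250s"])
    print(f"{label}: warmups {warmup:.1f}%, other 250s {other:.1f}%, "
          f"gap {other - warmup:.1f} points")
```

Laid out this way, the gap is wide in the early rounds and almost disappears by the quarterfinals.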

It seems that many top-ranked players show up at slam warmups with the intent of getting one or two matches under their belt. (Or perhaps fulfilling an obligation to a sponsor.) Those players don’t perform up to their usual standard. But as shown by the comparable records in quarterfinals, those who come to compete play at their usual level.

A few other looks

One issue that seems to have a particular impact in slam warmups is last-minute withdrawals, like that of second-seed Gilles Simon in Sydney this week. Those don’t show up in the won-loss records.

To consider the overall picture, including withdrawals, we can count the number of seeds who reach the semifinals in our different categories of ATP 250s.

In slam warmups, the semifinal fields in the last four years have consisted of 53 seeds and 43 nonseeds–about 55% top-ranked players. In other 250 semifinals, we’ve seen 365 seeds against 191 nonseeds–66% seeds.

Yet another angle is the performance of the top four seeds. In 250s, the 5 through 8 seeds are often barely distinguishable from the rest of the pack. For example, in Sydney this week, those last four seeds are Florian Mayer, Radek Stepanek, Jeremy Chardy, and Marcel Granollers. Not much difference between those guys and unseeded semifinalists Julien Benneteau, Kevin Anderson, and Bernard Tomic.

There’s no clear line between first-rank guys and the rest of the pack, but taking the top half of the seeds seems as good as any other option.

The results are similar to what we saw with the larger pool of seeds. Overall, when a top-four seed played a non-top-four opponent in a slam warmup, he won 65% of matches (129 of 199). In other 250s, he won 74% (978 of 1321).

In the round of 16, top-fours went 51-24 in slam warmups, for a record of 68%, compared to 76% (366-114) in other 250s.

Where the top four seeds differ from other seeds is in the quarterfinal round. In slam warmup QFs, top-fours went 31-20, winning 61% of matches. In other tourneys, they won 71% (261-105). Perhaps the first-round bye in many slam warmups means that top seeds want two warmup matches, but no more.

As mentioned, these experiments give us imprecise results, as they don’t take into account the exact field quality of the various 250s. While they may not be the final word on this question, these numbers do strongly indicate that higher-ranked players don’t view slam warmups as particularly important. Against a similar pool of opponents, they win far more matches in 250s at other times throughout the year.

Perhaps that’s one reason why winning an Aussie Open warmup doesn’t forecast any particular level of success in Melbourne–these are tournaments where some of your most highly-ranked opponents just aren’t trying as hard as usual.


Responding to Pressure at 5-5

In a post last week, I presented some data that suggested that servers weaken a bit under the pressure of a tiebreak.  It’s not a strong effect, but it’s a consistent one.  A possible explanation–that all that time between points gives servers a chance to psych themselves out, yet may not affect returners the same way–would apply almost as much to games toward the business end of a set, such as at 5-5 or 5-6.

In other words, if players don’t serve as well (or they return better) when things get tight, we’d expect to see more breaks toward the end of a set–more breaks than expected at 5-5, but perhaps fewer breaks than expected at 2-2.

This also opens up a possible method for evaluating players, as Carl Bialik has suggested.  If someone is losing more sets 5-7 than they are winning 7-5, it may be that they are wilting under the pressure of 5-5 more than the average player.  It would make sense if the players who consistently exceed tiebreak expectations also regularly outperform 7-5 expectations.

Within the constraints of the ATP’s Matchstats, 7-5 sets are a great way to identify these patterns.  While some 6-4 sets end with a break (or a break followed by a set-sealing hold), a 6-4 set doesn’t necessarily end that way.  But a 7-5 set must have reached 5-5 before one player took control.

If the hypothesis is correct that players get tighter on serve as the end of the set approaches, we would expect more 7-5 sets in the real world than simulations would imply.

To estimate the number of sets that should end 7-5, we need to take each player’s service points won from each match.  With that, we can calculate the probabilities that sets will end at any given score.  Repeat the process for every match over a period of time and we get a general idea of how often we should see 7-5 sets.
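A stripped-down version of that calculation might look like the sketch below. It assumes each player wins service points independently at a fixed rate for the whole set, that player A serves first, and that a set reaching 6-6 is simply recorded as 6-6. These are my simplifying assumptions and the function names are illustrative; this is not the exact model used for the figures in this post.

```python
from collections import defaultdict

def hold_prob(p):
    """Probability of holding serve when each service point is won with prob. p."""
    q = 1.0 - p
    deuce = p * p / (p * p + q * q)
    return p ** 4 * (1 + 4 * q + 10 * q * q) + 20 * p ** 3 * q ** 3 * deuce

def is_final(a, b):
    """True for completed set scores; 6-6 stands in for a tiebreak set."""
    high, low = max(a, b), min(a, b)
    return (high == 6 and low <= 4) or high == 7 or (a == 6 and b == 6)

def set_score_probs(p_a, p_b):
    """Probability of each final game score (A's games, B's games)."""
    hold_a, hold_b = hold_prob(p_a), hold_prob(p_b)
    live = {(0, 0): 1.0}
    final = defaultdict(float)
    while live:
        upcoming = defaultdict(float)
        for (a, b), prob in live.items():
            a_serving = (a + b) % 2 == 0
            a_wins_game = hold_a if a_serving else 1.0 - hold_b
            for state, weight in (((a + 1, b), a_wins_game),
                                  ((a, b + 1), 1.0 - a_wins_game)):
                target = final if is_final(*state) else upcoming
                target[state] += prob * weight
        live = upcoming
    return dict(final)

probs = set_score_probs(0.64, 0.64)
print(f"7-5 either way: {probs[(7, 5)] + probs[(5, 7)]:.1%}")
```

Running this for every set in the database and averaging the 7-5 probabilities gives the "expected" share to compare against reality.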

As it turns out, 7-5 sets should make up about 7.8% of all sets.  In fact, 8.8% of sets end 7-5.  Not a huge difference, but one that is fairly consistent from year to year.  Every year since 1991, where this dataset begins, there have always been more 7-5s than expected.  It certainly adds more weight to the claim that the balance of power swings to the returner toward the end of a tight set.

(My set-prediction model doesn’t exactly replicate reality, since players win more games than their service winning percentages predict, in large part because almost all servers are better in either the deuce or ad court, and the variance between them makes it more likely that the player wins a given service game.  When applying a crude adjustment for this, the crumbling-server hypothesis looks even better–the more games servers are predicted to win, the fewer predicted 7-5 sets.)

Identifying the unbreakable

This type of discussion must make you wonder: Which players are good at this stuff?  If it is true that late-set pressure results in more breaks, it seems obvious that some players are more prone to that pressure, and that other players take advantage of that pressure.

In an ideal world, we’d be able to identify some great 7-5 records, point out some 5-7 records, and have some great new insights into players.

As it is … we might.

As we saw last week with tiebreak analysis, we can’t simply count up a player’s 7-5 sets and compare that total to his 5-7 set losses.  Over the last three years, Andy Roddick won more than 55% of his 7-5 and 5-7 sets, but given the players he faced in those sets and their performances in those matches, he should have won 62%.

There are two ways to quantify player accomplishments in this department.  The first evaluates how well a player avoids losing 5-7 when he reaches 5-5; the other compares his ability to break for 7-5 against his proneness to being broken for 5-7.

Let’s call the first stat Five-Seven AVoidance, or FSAV.  For any player, we first add up the sets that reached 5-5, then count the sets that he won 7-5 or reached a tiebreak.  Then we use the general method described above to estimate how many times the player should have reached 5-5, and how many of those times he should have avoided 5-7.   Since the beginning of 2010, Kei Nishikori has avoided a 5-7 finish in about 92% of the sets in which he reached 5-5.  My model would have expected him to avoid 5-7 only about 84% of the time.  (The model expects that most players will avoid 5-7 about 82-90% of the time they reach 5-5.)

From those numbers, we discover that Nishikori lost 5-7 less than half as often as we would have expected him to.  No other player comes close to that mark. In everyday language, FSAV approximates how often a player was able to hold serve at 5-5 or 5-6.  Important skill, that.
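The counting behind the avoidance rate, and the "half as often" comparison, can be sketched like this. The score encoding and function names are my own, and the expected rate would come from a set-prediction model that is not reproduced here.

```python
# Set scores are (games won, games lost) from the player's point of view;
# 7-6 and 6-7 mark sets that went to a tiebreak.
REACHED_FIVE_ALL = {(7, 5), (5, 7), (7, 6), (6, 7)}

def five_all_summary(set_scores):
    """Return (sets that reached 5-5, sets among them not lost 5-7)."""
    reached = [s for s in set_scores if s in REACHED_FIVE_ALL]
    avoided = [s for s in reached if s != (5, 7)]
    return len(reached), len(avoided)

def relative_loss_rate(actual_avoidance, expected_avoidance):
    """Actual 5-7 loss rate as a fraction of the expected 5-7 loss rate."""
    return (1.0 - actual_avoidance) / (1.0 - expected_avoidance)

# Nishikori's rounded figures from above: 92% actual, 84% expected.
print(round(relative_loss_rate(0.92, 0.84), 2))
```

The rounded percentages work out to almost exactly half. The "less than half" in the text presumably reflects the unrounded rates.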

The second stat is more narrowly focused on 5-5 sets that do not reach a tiebreak.  Let’s call this one the Seven-Five Outperformance Rate, or SFOR, similar to the TBOR (TieBreak Outperformance Rate) I introduced last week.

Here, instead of comparing 5-7s to all 5-5 sets, we compare 5-7s to 7-5s.  In other words: Is the player more likely to break for 7-5 or be broken for 5-7?  As with the previous stat, after calculating the simple rate (that is, number of 7-5 sets divided by total number of 7-5 and 5-7 sets), we compare that to the results that the model would have expected the player to post.

Bizarrely enough, our three-year leader in SFOR is Ernests Gulbis, who has won about 73% of his 7-5 and 5-7 sets, compared to the 50% the model expects of him.  (It’s even more impressive when compared to the 7% that I personally would have expected from him.)

As the highlighting of Gulbis suggests, these stats probably don’t yet belong in our everyday toolbox.  There simply aren’t very many 7-5 sets, even if–as I established above–there are a few more than we would expect.  For reference, there are almost twice as many tiebreaks as 7-5s.

And to keep Gulbis in the spotlight, it may be that winning 7-5 sets is more a function of getting to 5-5 when you shouldn’t.  Perhaps many of those 7-5s racked up by the Latvian came when he should have put the set away 6-2.  Once 5-5 came along, he finally decided to get serious.  As Gulbis himself might tell you, it’s anybody’s guess.

Follow the jump for FSAV and SFOR on about 50 of the most active players, sorted by FSAV, and decide for yourself.  The numbers include all tour-level matches since the beginning of 2010, excluding Davis Cup.


How Good is Brian Baker?

In his remarkable comeback this year, Brian Baker has already recorded two top-20 scalps, along with seven other victories against players in the top 100.   In the same span of six months, he’s also lost to a player barely inside the top 400, and suffered another six defeats against guys outside the top 100.

This is inconsistency of historic magnitude.  The list of players he’s beaten may actually be more impressive than the list of those who have beaten him!  Adding to the confusion, we don’t have any other recent results from him.  We can’t just wave our hands and point to his 2011 performance level as an accurate indicator of his current level.

One measurement of player ability, the ATP ranking system, places him at #78, a number that seems just as ridiculous when he’s beating Philipp Kohlschreiber at a Masters event as when he’s losing to Maxime Authom at a challenger.  But overall, the ATP estimate doesn’t seem too far-fetched.  It’s certainly better than what jrank (my rating system) spits out.  That algorithm doesn’t know what to do with such a limited track record, so it places him far outside the top 100.

We can do better.  As we’ll see, Baker’s results suggest he belongs on the cusp of the top 50.

Uniquely limited results

Imagine a completely unknown player is given a wild card into a major event.  We don’t know where he came from or who he might have beaten in the past.  He’s a completely blank slate.  If we wanted to estimate his ability level, we would have to wait until we got some results.

If that player won an opening-round match against the 17th-best player in the world, our best guess would be that he is better than #17, but we wouldn’t know how much.  If he lost that opening round match, we would assume he is worse than #17.  We might use statistics from that match to estimate how much better or worse than #17.

As our unknown kept playing more matches, we would update our estimate, using additional data as it came in.

(You might protest that in the early going, we should regress our estimate to the mean, since if some random guy came out of nowhere, he probably isn’t one of the 16 best tennis players in the world–there was a reason he was nowhere.  And, in such a real-world scenario, you would be right.  But in such a case, what is the mean?  If a baseball player is called up from Triple-A, an intelligent observer, such as a scout or team executive, considers him at least marginally MLB-level, so we would regress our estimate to the level of marginal MLB players.  But if a player receives a wild card into a tennis tournament, what do we know?)

Few tennis players in history have come closer to this unknown than Brian Baker.  Sure, everyone has to start somewhere, but usually “somewhere” is a long string of futures tournaments, followed by an even longer string of challengers.  By the time a player bags his first top-20 scalp, we have lots and lots of data to work with.

When other players were racking up several dozen matches every year, Brian Baker was rehabbing injuries and coaching college tennis.  We can only judge him based on a small number of recent results.  And those results are particularly contradictory.

Working backward

Intuitively, it’s tough to accept that a single player has beaten a bunch of good players and lost to several weaker ones.  No matter how good that guy is, such a set of outcomes is unlikely.

But how unlikely?  That question is the key to estimating Baker’s current level.

Rather than assuming Baker is playing at a certain level (like that of #78) and scratching our heads at his inconsistency, we can work backwards–take his results and determine the likelihood that he is playing at various levels.

For instance, we could assume that Baker is #5 in the world.  If so, some of his results would be very predictable (like the two wins against Blake Strode) and others would be particularly jarring.  We could go further and calculate the probability that the #5 player in the world would amass Baker’s specific match record.  Those odds, of course, are vanishingly small.

If you repeat the process for every possible ranking, you get a probability that #5, or #12, or #77 would win the matches Baker has won and lose the matches he has lost.  One of those probabilities will be higher than the others, and that’s our best guess of how highly we should regard the American.
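Here is a minimal sketch of that scan in Python.  Everything specific in it is an assumption made for illustration: the win-probability formula (chance of winning proportional to ranking points), the grid of candidate levels, and the made-up match results.  It is not the model or the data behind the estimate in this post.

```python
import math

def win_prob(own_points, opp_points):
    """Assumed model: chance of winning is proportional to ranking points."""
    return own_points / (own_points + opp_points)

def log_likelihood(level, results):
    """Log-probability that a player at `level` produces exactly these results."""
    total = 0.0
    for opp_points, won in results:
        p = win_prob(level, opp_points)
        total += math.log(p if won else 1.0 - p)
    return total

def best_level(results, candidates):
    """Return the candidate level under which the observed results are most likely."""
    return max(candidates, key=lambda level: log_likelihood(level, results))

# Hypothetical results: (opponent's ranking points, did our player win?)
results = [
    (1500, True), (1200, True), (600, True), (100, True),
    (700, False), (300, False), (150, False),
]
candidates = range(50, 5001, 10)  # hypothetical grid of ability levels
estimate = best_level(results, candidates)
print(f"Most likely level: about {estimate} ranking points")
```

Working with log-probabilities keeps the product of many small numbers from underflowing.  Swapping in a different `win_prob` changes the estimate but not the procedure.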

(If you’re interested in methodology, click “Continue Reading” below.)

Using this method, we discover that Baker has played at the level of someone with about 820 ATP ranking points, putting him around #54, in a tight pack with Grigor Dimitrov, Gilles Muller, Alejandro Falla, and Lukas Lacko.  With every match he plays, we can continue to fine-tune our estimate.

There are many factors we need to ignore to do an analysis like this, largely because of the limited data that led us to the topic in the first place.  Many of Baker’s worst results have come on hard courts; perhaps he will prove over a longer period to be stronger on clay and grass.  If his ability level has changed over the last six months, as seems very likely, this approach fails to take it into consideration.

But because of the unique nature of Baker’s comeback, which makes it difficult to assume anything about his ability level, this approach allows us to make a reasonably good guess.  And with such a strange mix of great wins and rough losses, a good guess is all we can hope for.

Continue reading

Filed under Brian Baker, Research

Tommy Haas: Old and Winning

For all the talk of 30-somethings at the top of the modern men’s game, tennis players decline quickly.  30 may be the new 20, but 35 is still the same old 35, and 35-year-old tennis players are usually found on the champions tour, the doubles court, or national television.

Yet Tommy Haas, aged 34 years and 5 months, is enjoying a resurgence, having reached three finals in the last two months–on three different surfaces.  He’s one of the hottest players on tour of any age.

34-year-olds don’t do things like that.  In the last ten years, players 34 and older have accounted for fewer than 1% of wins on the ATP tour.  From 2008 to 2011, all 34-year-olds–combined–won a total of 17 tour-level matches.  In the five months since his birthday, Haas has won 22.

To find a point of comparison, we need to go back five years, to the 2007 campaign of Fabrice Santoro, and slightly earlier, to Andre Agassi’s 2004 season.  Agassi at 34 was better than Haas at 34, winning 37 tour-level matches and reaching two grand slam quarterfinals.  Agassi was the best “old” player since Jimmy Connors and the only man in the discussion since the 1970s.

Yet already, Haas is among the best 34-and-overs in ATP history.  His 22 wins since his 34th birthday are good for 28th on the all-time list, ahead of Fred Stolle and just behind Roy Emerson.  But that understates Haas’s accomplishment.  With the exceptions of Santoro, Agassi, and Connors (whose 178 wins-past-34 are good for 2nd on the all-time list, behind Ken Rosewall), everyone on the list retired more than 20 years ago.

Comparisons to Haas’s contemporaries do a better job of illustrating how unusual he is.  The only two older men to have won a match on tour this year are Arnaud Clement and Ruben Ramirez Hidalgo, neither of whom is a factor anywhere but the challenger tour.  The other 34-year-old to win some matches this season is hyper-fit warrior Michael Russell, who took advantage of the weak draws in Atlanta and Los Angeles.

As long as he stays healthy, Haas is far from finished.  According to jrank, he’s the 11th-best hard court player in the game right now. He may not have another grand slam final ahead of him, as Agassi did at the same age, but he has more wins in his future than most players a decade his junior.

Filed under Research, Tommy Haas

The Hangover Effect of a Marathon Fifth Set

Marathon sets are again the talk of tennis.  We won’t soon forget Roger Federer’s 19-17 third-set win over Juan Martin Del Potro … or Roger’s weak performance in the match that followed.

The unusual Olympic format–best-of-three, no final-set tiebreak–brought several issues to the fore.  Should best of three be enough for slams?  It certainly gave us plenty of dramatics last week.  And is it finally time to end the no-tiebreak madness?  For all of the occasional drama, do we really need to see even more service holds in John Isner matches?

Peter Bodo makes the case for a marathon-free world:

[M]y main reason for embracing the final-set tiebreaker is not the obvious one that would be cited by most time-sensitive television producers. The real problem with deuce sets is that when a match goes as long as Federer v. Delpo or even Jo-Wilfried Tsonga v. Milos Raonic (that one went 25-23, for Tsonga) the reward for the winner’s heroic feat is almost always a quick subsequent loss.

As Bodo goes on to illustrate, this seems anecdotally true.  But who cares about anecdotes?  This is a testable hypothesis.

As we’ll see, there is a noticeable hangover effect when a player has fought through a marathon fifth set.  But the alternative–a fifth-set tiebreak–produces nearly the same hangover.

There have been 146 marathon fifth sets–matches in which the final set reached 6-6–in Grand Slam tennis since the beginning of 2001.  The record of those 146 winners in their next round is dreadful: 43-103, or 29.5%.  It’s even worse than that, actually.  Four times, two marathon men went on to play each other, so four of those wins were inevitable.

However, that isn’t the end of the story.  To prove that fifth-set marathons significantly weaken their winners, we need to establish two things: (1) They had a decent shot at beating their next opponents anyway, and (2) if a fifth-set tiebreak were played, their chances would have been better.

Post-marathon underdogs

The first issue is a bit sneaky.  If a player has to go deep into the fifth set to win in the early rounds, he’s hardly a dominating presence in the draw.  Consider the extreme case of Yen Hsun Lu, who in 2010 beat Andy Roddick in a 9-7 fifth set, advancing to play Novak Djokovic in the Wimbledon quarters.  Sure, Lu was tired, but what were the odds of an upset even if Roddick lost in three?  Top players rarely need five hours to push through an early-round opponent.

To quantify this, we can turn to jrank-driven predictions.  Using these measures of each player’s ability level at the time of the match, we can estimate the actual chances of our 146 marathon men.

The marathon men would have been underdogs in their next match no matter what.  On average, each one had a 43.4% chance of winning, meaning that of the 146 matches, they should have won 63 of them.  Even adjusting for their underdog status, they seem to have suffered from their marathons–they won 43 of those matches, barely two-thirds of the number that they “should” have won.

Almost-but-not-quite marathons

We’ve established that once a player enters the uncharted territory beyond 6-6, his chances of winning the next match are substantially weakened.  But surely the fatigue didn’t set in right at the moment the chair umpire called “6-6.”  Even if the fifth set is a bagel, simply playing five sets of professional-level tennis is exhausting, and might impact one’s performance a day or two later.

The most relevant set of matches for comparison are US Open five-setters that went to a final-set tiebreak.  Since 2001, we have 40 of those.  In their next matches, the winners of the almost-marathons went a dismal 11-29 (27.5%)–worse than the marathon men!

Compared to their expectations, though, they did a bit better.  Those forty men, on average, had a 38% chance of winning their next matches, meaning we would expect them to win about 15 of the 40.  Relative to the predictions we would have made at the time, this small sample of fifth-set-tiebreak winners outperformed the marathon men, but just barely.

For a bigger sample, we can turn to the slightly shorter–but still epic–matches that end 7-5 in the fifth.  Of the 95 such matches since 2001, the 7-5 winners went on to win 49, or 51.6%, of their next matches!  This despite the fact they were collective underdogs, expected to win only 48%, or 46 of those matches.
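To put the three groups side by side, here is a short Python sketch that recomputes expected wins (matches times average win probability) and compares them to actual wins.  The match counts, win totals, and average probabilities are the ones quoted above; the dictionary layout and function name are just for illustration.

```python
# Figures quoted in the post: matches played, actual wins, average pre-match win probability.
groups = {
    "marathon (past 6-6)": {"matches": 146, "wins": 43, "avg_win_prob": 0.434},
    "fifth-set tiebreak": {"matches": 40, "wins": 11, "avg_win_prob": 0.38},
    "7-5 in the fifth": {"matches": 95, "wins": 49, "avg_win_prob": 0.48},
}

def summarize(matches, wins, avg_win_prob):
    """Compare actual wins to the number the pre-match odds would predict."""
    expected_wins = matches * avg_win_prob
    return {
        "win_rate": wins / matches,
        "expected_wins": expected_wins,
        "actual_vs_expected": wins / expected_wins,
    }

summary = {name: summarize(**figures) for name, figures in groups.items()}

for name, s in summary.items():
    print(
        f"{name}: won {s['win_rate']:.1%}, "
        f"expected {s['expected_wins']:.1f} wins, "
        f"actual/expected = {s['actual_vs_expected']:.2f}"
    )
```

A ratio below 1 means the group won fewer matches than the odds suggested; above 1, more.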

What now?

Since the 7-5 group performed so differently in their next matches, it’s tempting to speculate why they did so.  My best guess: If a player manages a break before the set goes 6-6, he’s relatively fresh, physically and mentally.  The sort of player who can break at 5-5 or 6-5 is one who can come back a day or two later and plow through another three or four hard-fought sets.

By contrast, matches that get to 6-6–whether they end in a tiebreak or not–are usually battles of attrition.  Think Isner-Mahut: The longer it lasted, the less likely either player could challenge the other’s serve.  That brand of tennis had set in before 6-6 in the fifth: If one of the players pulled out a 7-4 tiebreak, it wouldn’t say much about his fitness or mental stamina, simply that someone is bound to get lucky for a point or two.

Based on the limited data we have, there just isn’t much difference between the after-effects of fifth-set marathons and fifth-set tiebreaks.  In both cases, the marathon men weren’t going to be favored anyway, and their fatigue hurts them even more.  Changing format to fifth-set tiebreaks would have little effect on future outcomes–it would just make those matches a bit more dependent on a lucky bounce.

Filed under Research

Serving First in Marathon Sets

Last night, when Jo-Wilfried Tsonga finally defeated Milos Raonic, it was on a match-ending break of serve.  Conventional wisdom suggests that’s often how it goes.  Whoever serves first in a long set seems to have the advantage.  There’s less pressure to hold serve at 7-7 (or 47-47) than there is at 7-8.

Tsonga won his contest with a match-ending break point; Isner finished off his 70-68 set on Mahut’s serve; and when Federer and Roddick went to 14-14 in the 2009 Wimbledon final, Roger held for 15-14 before breaking the American.  Is it a trend?

As it turns out, those three high-profile matches have misled us.  Based on the limited data available, the first server in fifth-set epics has little or no advantage.

(Third-set epics are so rare that we might as well ignore them–the Olympics is the only tournament where men play best-of-three with no tiebreak in the final set.)

We don’t know who served first for every marathon fifth set in tennis history, but we can figure it out for some.  The ATP has limited stats for most matches back to 1991, and those stats include numbers of service games.  When the number of service games is equal for both players, we’re stuck at square one.  When one player has more than the other, that guy must have served the first game of the match–and the last.  Since marathon sets must contain an even number of games, we know who served first in the final set.
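The inference can be written down in a few lines of Python.  This is a sketch of the rule as stated above, with hypothetical function and label names.  It assumes the service-game counts differ by exactly one when they differ at all, and it ignores any complications from tiebreak sets earlier in the match.

```python
from typing import Optional

def first_server_in_final_set(service_games_a: int, service_games_b: int) -> Optional[str]:
    """Infer who served first in an even-length final set from service-game counts.

    Returns "A" or "B", or None when the counts don't settle the question.
    """
    difference = service_games_a - service_games_b
    if abs(difference) != 1:
        # Equal counts tell us nothing; a gap larger than one is treated
        # as unusable data (an assumption of this sketch).
        return None

    # The player with the extra service game served the first game of the
    # match and the last one.
    last_server = "A" if difference > 0 else "B"

    # In a set with an even number of games, the last game belongs to the
    # player who served second, so the first server is the other player.
    return "B" if last_server == "A" else "A"

print(first_server_in_final_set(30, 29))
print(first_server_in_final_set(30, 30))
```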

The result is a pool of 138 matches in which the fifth set ended at 8-6 or higher and we know who served first.  Of those, the guy who served first–at 0-0, 1-1, 6-6, and so on–won the match 67 times (48.6%).  It’s a coin toss.

If we take pressure out of the equation, this makes perfect sense.  If two guys have gotten to 6-6 in the fifth set, they’re playing as equally as two tennis players can play.  It’s only when we consider the stress of serving to stay in the match that we start to suspect that one player–but not the other–won’t be able to hold up his end.

For a bigger dataset, we can look to similar situations.  Consider 5-setters that end 7-5 in the fifth.  Those don’t have the cachet of matches that go farther, but they are quite epic in their own right.  We know who served first in 86 such matches, and of those, the man who served first won only 38 (44.2%).  It’s not exactly proof that the first server has a disadvantage, but it does cast more doubt on the conventional wisdom.

If we want more than 200 or so matches, we need to weaken our definition of “epic.”  Tiebreaks aren’t relevant here, since we’re looking for instances where one player was broken under pressure.  But we can use best-of-three contests that ended 7-5.

With so many more best-of-three matches on the schedule, our dataset is now much bigger.  We know who served first for 753 tour-level matches that ended 7-5 in the third.  Of these, the player who served first went 412-341, winning nearly 55% of matches.

If you want evidence that the conventional wisdom is correct, there you go.  If a match reaches 5-5 in the deciding set and ends with a break, there is, altogether, a 53% chance that the first server wins.
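For anyone who wants to check these numbers, the Python sketch below pools the three samples quoted above and runs an exact two-sided binomial test of each against a fair coin.  The test is my own addition here, one reasonable way to ask whether a sample differs from 50-50, and the names are hypothetical.

```python
from math import comb

# (first-server wins, matches) for each sample quoted in the post.
samples = {
    "fifth set 8-6 or longer": (67, 138),
    "7-5 in the fifth": (38, 86),
    "7-5 in the third": (412, 753),
}

def two_sided_p_value(wins, matches):
    """Exact two-sided binomial test against p = 0.5 (symmetric, so double one tail)."""
    k = min(wins, matches - wins)
    tail = sum(comb(matches, i) for i in range(k + 1))
    return min(1.0, 2 * tail / 2 ** matches)

total_wins = sum(wins for wins, _ in samples.values())
total_matches = sum(matches for _, matches in samples.values())
pooled_rate = total_wins / total_matches

p_values = {name: two_sided_p_value(*counts) for name, counts in samples.items()}

print(f"Pooled first-server win rate: {pooled_rate:.1%}")
for name, (wins, matches) in samples.items():
    print(f"{name}: {wins / matches:.1%} of {matches}, p = {p_values[name]:.3f}")
```

A large p-value means the sample is consistent with a coin toss; a small one suggests the first server’s edge (or deficit) is unlikely to be pure chance.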

But with our more limited data, it’s impossible to draw the same conclusion about five-setters once they head into the barely-charted territory beyond 6-6.

Filed under Research

Who Benefits From Byes?

Roughly two-thirds of ATP tour-level tournaments have byes in the draw.  31 events–including the two this week, in Kitzbuhel and Los Angeles–have 28-man fields, with first-round byes for the top four seeds.

The obvious beneficiaries are the top four seeds.  They get free passes into the second round, eliminating the chance they’ll be handed a first-round exit.  It’s also a guarantee of greater prize money and more ranking points.  First-round byes are such a feature of the ATP tour, at least in part, because they help smaller tournaments convince big-name players to sign up.

Of course, you can’t simply hand an advantage to the top four seeds without affecting others.  In this most common format, a 28-man field with eight seeds and four byes, there are three important groups: The top four seeds, the bottom four seeds, and the rest of the field.

The top four seeds: The main effect of byes on the top four seeds is that, as noted, they don’t have to play first-round matches.  The extent of that effect depends on how much of a threat the first-rounder would’ve been.

To quantify these effects, I ran simulations for the 2012 Estoril tournament.  First, I simulated the draw as the tournament was played, with 28 players and top seeds of Juan Martin Del Potro, Richard Gasquet, Stanislas Wawrinka, and Albert Ramos.  Second, I added the next four players on the alternate list to the draw in place of the byes.  To eliminate any bias stemming from the specific arrangement of the draw, I re-generated the brackets for each simulation.

In the 32-man field, Delpo won his first round match about 90% of the time, Gasquet and Wawrinka about 80%, and Ramos just under 60%.  Accordingly, Delpo didn’t benefit too much from the bye, but Ramos gained enormously.

However, when measured by expected ranking points, none of these four men gained as much as skipping the first round would suggest.  For instance, if Delpo would win only 90% of his first-round matches, removing that impediment would be expected to raise his other outcomes by (1/0.9 – 1), or 11%.  In fact, in the 28-man simulation, he gained only 9.5% over his 32-man expectation.

The slight difference is due to the other top seeds.  If Delpo is more likely to reach, say, the semifinals, then the same effect applies to Gasquet and Wawrinka, the two men who would be most likely to knock him out of the draw.  So while the bye itself increases Delpo’s expected ranking points by 11%, the increased probability of facing the other top seeds reduces it a bit.

Still, the net effect on the top four seeds is overwhelmingly positive.  For Gasquet and Wawrinka, the bye itself increases their expectations by 27% each, for a net effect of 24%, while for Ramos, the bye is a 74% increase, resulting in a net effect of 70%.
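The “bye itself” figures all come from the same arithmetic: if a player wins his first-round match with probability p, skipping that round multiplies his later-round outcomes by 1/p.  Here is a small Python sketch of that calculation and its inverse; the function names are hypothetical.

```python
def bye_boost(first_round_win_prob):
    """Fractional gain in later-round expectations from skipping the first round."""
    return 1.0 / first_round_win_prob - 1.0

def implied_win_prob(boost):
    """First-round win probability that would produce a given bye boost."""
    return 1.0 / (1.0 + boost)

# Delpo: about a 90% first-round win probability.
print(f"Boost at 90%: {bye_boost(0.90):.1%}")

# Working backward from the quoted boosts for the other three seeds.
print(f"Win probability implied by a 27% boost: {implied_win_prob(0.27):.1%}")
print(f"Win probability implied by a 74% boost: {implied_win_prob(0.74):.1%}")
```

Running the inverse on the 27% and 74% boosts gives win probabilities in line with the “about 80%” and “just under 60%” figures from the 32-man simulation.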

The next four seeds: The men seeded five through eight are the losers.  They still must play a first-round match–which, in the Estoril example, they each have about a 60% chance of winning–and they are more likely to face one of the top four seeds later on.

The average effect of adding byes to the draw is a 5% decrease in expected ranking points for these lower four seeds.  They aren’t guaranteed to reach the quarterfinal, but in the 28-man version, if they do reach the quarters, they are at least 10% more likely to face a higher-ranked opponent.

The rest of the pack: Nearly everyone else benefits.  The effect of byes touches unseeded players in two ways, which work in opposite directions.  First, and most significantly, no one has to play a top-four seed in the first round.  In Estoril, the toughest first-round opponent was 5th-seed Denis Istomin, not exactly a fearsome name in the locker room.  Because of the byes, nearly every player has at least a 40% chance of reaching the second round.

The countervailing force is a minor one–not enough to neutralize the advantage of missing top seeds in the first round.  When the field shrinks from 32 to 28, the average opponent is a bit better.  If four additional players were added to the Estoril field, they wouldn’t be automatically placed in the positions of the byes.  They would be randomly placed in the draw like everyone else.  Having those four lower-ranked players would give some players even easier first-round matches.

But on balance, for unseeded players, the goal is simply to win a match or two.  The best way to increase their chances of doing so is to keep the best players out of their path for as long as possible.  Byes take care of that.  The net benefit to unseeded players is an addition of 1% to 3% of their expected ranking points.  Generally speaking, the worse the player, the bigger the benefit.

The one exception to this rule is if an unseeded player is actually better than some of the seeds.  According to jrank, Igor Andreev was a better player than 8th-seed Flavio Cipolla going into Estoril.  Thus, the logic that applies to the bottom seeds applies to him.  He was likely to advance to the quarterfinals, so the effect of the byes was mainly to give him a tougher quarterfinal opponent.  In each tournament, this might affect one or two players–in Estoril, Andreev was the only one.

One more consideration: As we’ve seen, 23 of the 28 players benefited from the byes.  And the five players who were negatively affected didn’t lose too much.  How is that possible?

There’s one more group we haven’t talked about: The four players who would be included in a 32-man draw but miss out on a 28-man one.  They don’t have much of a chance of reaching the final rounds, but they wouldn’t be much worse than the rest of the unseeded pack.

One of the players I used for this example, Igor Sijsling, just missed the cut, but in a 32-man draw, he would have been expected to take home 23 ranking points and about $9,000.  By adding four byes, the tournament is essentially taking what it would have given to Sijsling and three other players and divvying it up among the remaining 28.  The pie is the same size, but fewer players can claim a slice.

In the end, those four “missing” players are the only real losers, and they always have the option to head to a challenger for the chance of picking up just as many points, even if they probably don’t come with as many dollars.

The winners, beyond the top seeds and the tournament organizers, are ultimately the fans.  When top players have more reason to play small tournaments, we get to watch more high-profile matchups, and ATP 250s look a bit less like Kitzbuhel and a bit more like Doha.

Filed under Research