Is the US Open Draw Truly Random?

Last week, an ESPN “Outside the Lines” article called into question the fairness of the U.S. Open main draw.  A researcher discovered that the top two seeds (both men and women) have gotten very easy first-round assignments.

This is one small step away from a direct accusation of draw-rigging by the USTA.  It’s a serious claim, and while the article’s author leans heavily on a single academic who supports the methodology used, it’s not at all clear that anything unacceptable is going on.

What they found

For some reason, the study focused on the top two seeds.  It’s not at all clear why it did so–I have no idea what the USTA’s motive would be for rigging the draw in favor of the top two seeds, regardless of their identity.  Sure, there were a few years when a Federer-Nadal final would have been particularly mouthwatering, or when American viewers craved a Serena-Venus showdown in Flushing, but why would the USTA be tweaking a draw in favor of Gustavo Kuerten?  Marat Safin?  Amelie Mauresmo? Dinara Safina?

For the moment, let’s set that major concern aside.  To quantify the difficulty of each player’s first-round opponents, the ESPN study invented a metric called “difficulty score.”  We’ll come back to “difficulty score” in a bit.

A simple look at the lists they assembled of first-round opponents does suggest that something untoward is going on.  In the last ten years of men’s draws, a top-two seed has faced a top-80 opponent only four times in 20 tries, and not once in the last five years.  Seeded players should face top-80 opponents about half the time, since ranks 33 through 80 account for 48 of the 96 unseeded players.
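As a rough check on how unusual that is, one can treat each of the 20 first-round draws (two seeds over ten years) as an independent coin flip with a 50% chance of producing a top-80 opponent.  That independence is an assumption made purely for illustration, since the two draws within a single year are not strictly independent, and the function name is hypothetical.

```python
from math import comb

def prob_at_most(k, n, p):
    """Binomial probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# 20 draws (top two seeds, ten years); a "success" is a top-80 opponent.
# p = 0.5 because ranks 33-80 are 48 of the 96 possible opponents.
p_four_or_fewer = prob_at_most(4, 20, 0.5)
print(f"P(4 or fewer top-80 opponents in 20 draws) = {p_four_or_fewer:.4f}")
```

Under that simplification the probability comes out to roughly 0.6%, which is small but says nothing about why it happened.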

If we are truly interested in the first-rounders assigned to top-two seeds, it’s clear that these players have been given an easier path than what would be statistically expected.  But it’s not yet clear that it’s anything other than good luck.

Breaking down “difficulty score”

Here’s the explanation of the metric that ESPN used:

So if a top two seed faced the 33rd-ranked player in the first round, he/she would get a difficulty score of 0.995 for that round; if he/she faced the 128th-ranked player in the first round, the score for that round would be 0.005. An average opponent (ranked around 80th or 81st), would correspond to a difficulty score near 0.500, which should be the average difficulty score over several years of draws.

I don’t understand why the ESPN study needed to switch from ordinal rankings (1 to 128) to difficulty scores between 0.005 and 0.995.  But I replicated the work using ordinal rankings instead of difficulty scores, and came up with the same results.
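One reading of the quoted description is a straight-line mapping from rank to score, which would explain why ordinal ranks and difficulty scores lead to the same conclusion.  The sketch below assumes that linear form; ESPN’s exact formula isn’t given in the quoted passage, and the function name and parameters are made up for illustration.

```python
def difficulty_score(rank, best=33, worst=128, hi=0.995, lo=0.005):
    """Linearly map an opponent's rank in the draw to a difficulty score.

    Assumed form: rank 33 -> 0.995 and rank 128 -> 0.005, evenly spaced between.
    """
    if not best <= rank <= worst:
        raise ValueError(f"rank must be between {best} and {worst}")
    step = (hi - lo) / (worst - best)
    return hi - (rank - best) * step

# The endpoints and the two "average" ranks mentioned in the quote.
for rank in (33, 80, 81, 128):
    print(rank, round(difficulty_score(rank), 3))
```

With this mapping, ranks 80 and 81 land just on either side of 0.500, which matches the quoted description of an average opponent.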

The average first round opponent for the top two seeds in each year’s men’s draw has been about the 98th-best player in the draw.  Given that seeds can draw anyone from 33 to 128, the average “should” be around 80.  With difficulty scores, ESPN says that the likelihood of the last ten years of easy draws is 0.3%.  With ordinal rankings, I found approximately the same.  The last thing the sports-analysis world needs is another superfluous metric, but at least this one doesn’t appear to be misleading.
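A Monte Carlo check of that figure is straightforward.  The sketch below is not the calculation used for this post.  It assumes each top-two seed’s opponent is drawn uniformly from ranks 33 to 128, with the two seeds in a given year getting different opponents, and it takes the rounded average of 98 as the threshold.  The function name, trial count, and random seed are arbitrary.

```python
import random

def simulate_easy_draws(threshold=98.0, years=10, trials=50_000, seed=2011):
    """Estimate how often 20 random first-round opponents average >= threshold."""
    rng = random.Random(seed)
    unseeded = range(33, 129)          # ranks 33..128 inclusive
    hits = 0
    for _ in range(trials):
        total = 0
        for _ in range(years):
            # Two distinct opponents per year, one for each of the top two seeds.
            total += sum(rng.sample(unseeded, 2))
        if total / (2 * years) >= threshold:
            hits += 1
    return hits / trials

estimate = simulate_easy_draws()
print(f"Estimated probability: {estimate:.4%}")
```

Under these assumptions the estimate should land at a few tenths of a percent, in the same neighborhood as the figures quoted above.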

What about better reasons for rigging?

The core problem here is this: Why do we care specifically about the draws for the first two seeds?  Or, why would the USTA care enough to compromise the fairness of the draw?

As ESPN highlighted, some of the first-round victims are American wild cards.  Scoville Jenkins, for instance, was fed to the wolves twice, once each against Federer and Roddick.  If we’re really fishing for an explanation, perhaps the USTA wants to put up-and-coming stars such as Jenkins, Devin Britton, and Coco Vandeweghe on a big stage, either to showcase these players, or to make otherwise pedestrian blowouts more interesting.  I suppose I’d rather watch Nadal play Jack Sock than, say, Diego Junqueira.

But that’s ex post facto reasoning of the most blatant sort.  If the USTA were going to rig the draw, wouldn’t they be more likely to do so in favor of top Americans?  Or in favor of a broader range of seeds, to better ensure marquee matchups for the second week?  Or rig second-round matchups for top players, to ensure that the big names make it to the middle weekend?

If no evidence of draw manipulation appears in any of those other scenarios, it would seem that ESPN discovered something more like the famous correlation between the S&P 500 and butter production in Bangladesh.  If your search for a newsworthy conclusion is sufficiently wide, you’re bound to find something.

The top seeds

As I’ve said, there’s no doubt that the top two seeds in the men’s draw have had an easy go of it in the last ten years, since the draws started seeding 32 players instead of 16.  The same is true of the women.

The top two seeds in both the men’s and women’s draws faced opponents who ranked, on average, roughly 98th in the 128-player field.  The odds of this happening on either side are tiny, about 0.25%.  The chances that a single tournament would randomly produce draws so easy for both the top two men and the top two women over ten years are effectively zero.
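The same order of magnitude falls out of a normal approximation, and squaring it gives a feel for the combined men’s and women’s result.  This is a back-of-the-envelope sketch: it treats all 20 draws as independent uniform picks from ranks 33 to 128, treats the two tours as independent of each other, and uses the rounded average of 98.

```python
from math import erfc, sqrt

N_RANKS = 96            # possible opponents: ranks 33..128
MEAN_RANK = 80.5        # midpoint of 33..128
N_DRAWS = 20            # two seeds x ten years

# Variance of a discrete uniform distribution on N_RANKS consecutive integers.
var_single = (N_RANKS**2 - 1) / 12
std_error = sqrt(var_single / N_DRAWS)

z = (98 - MEAN_RANK) / std_error
p_one_tour = 0.5 * erfc(z / sqrt(2))     # upper-tail probability of the normal
p_both_tours = p_one_tour ** 2           # assumes men's and women's draws are independent

print(f"z = {z:.2f}, one tour: {p_one_tour:.3%}, both tours: {p_both_tours:.1e}")
```

The one-tour figure comes out at roughly a quarter of a percent, and the two-tour figure at a few in a million.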

Beyond the top two, however, any suspicions quickly disappear.  The average opponent for the top four seeded men has been ranked about 89 out of 128, meaning that #3 and #4 face opponents around #80–dead average.  The average first-round assignment for the top eight seeded men has been around 87, meaning that seeds 5-8 face average opponents in the mid-80s.  Nothing to cause a raised eyebrow there, and the numbers are almost identical on the women’s side.
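The step from cumulative averages to the averages for the seeds in between is simple arithmetic, sketched below with the rounded figures quoted above (so the results are approximate).  The helper name is illustrative.

```python
def group_average(avg_outer, n_outer, avg_inner, n_inner):
    """Average for the seeds in the outer group that are not in the inner group."""
    return (avg_outer * n_outer - avg_inner * n_inner) / (n_outer - n_inner)

seeds_3_4 = group_average(89, 4, 98, 2)   # top four vs. top two
seeds_5_8 = group_average(87, 8, 89, 4)   # top eight vs. top four
print(seeds_3_4, seeds_5_8)               # 80.0 85.0
```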

To go one step further, there’s no evidence of manipulation in the second-round draws.  In fact, the top two women’s seeds faced particularly tough second-round opponents: there was only a 20% chance that those twenty women would be given second-round assignments as tough as the ones they got.

Before looking at the draws of U.S. players, a quick summary.  While the top two seeds were given very low-ranked opponents in the first round, the effect did not extend to the second round, or to any seeds beyond the top two.

The American draws

If the USTA were to tweak the draws, you’d expect them to do so in favor of the home players, if for no other reason than television ratings.  But they haven’t.

Let’s start with the American men.  The top two ranked American men each year have faced opponents ranked, on average, 79 of 128.  That’s a bit tougher than average.  If we expand the analysis to the top four ranked Americans, or just seeded Americans, the results stay around average.  If anyone is manipulating the draws in favor of American men, they are either doing it without regard for ATP rankings, or they aren’t doing a very good job.

More surprising is the average opponent of all American men.  The average first-round opponent of an American man in the last ten years has been ranked 61.2, considerably lower than 80, in part because unseeded men may draw seeded players in the first round.  But the average shouldn’t be that low.  In fact, there is only a 20% chance that American men would be given such a tough assignment.
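To see why unseeded players should expect a lower-numbered average opponent, consider a simplified model of a 128-player draw with 32 seeds: each seed faces an unseeded player, so 32 of the 96 unseeded players meet a seed and the remaining 64 play each other.  The sketch below assumes placement is uniformly random within that structure.  The function name and the example rank are hypothetical, and the model says nothing about the actual mix of seeded and unseeded Americans.

```python
def expected_opponent_rank(own_rank, draw_size=128, n_seeds=32):
    """Expected first-round opponent rank for an unseeded player under a uniform draw."""
    if not n_seeds < own_rank <= draw_size:
        raise ValueError("own_rank must belong to an unseeded player")
    n_unseeded = draw_size - n_seeds
    p_face_seed = n_seeds / n_unseeded            # 32 of 96 unseeded slots sit opposite a seed
    mean_seed = (1 + n_seeds) / 2                 # average rank of seeds 1..32
    unseeded_total = sum(range(n_seeds + 1, draw_size + 1))
    # Average rank of the other unseeded players (everyone unseeded except this one).
    mean_other_unseeded = (unseeded_total - own_rank) / (n_unseeded - 1)
    return p_face_seed * mean_seed + (1 - p_face_seed) * mean_other_unseeded

print(round(expected_opponent_rank(100), 1))
```

Under this model an unseeded player’s expected opponent sits around 59, well below the 80 or so that a seed should expect.  A group that mixes seeded and unseeded players would fall somewhere in between.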

Results for the women are mostly similar.  The top two American women each year have gotten a slightly easy draw: the average opponent rank is 83 of 128.  Keep in mind, however, that this overlaps with the analysis of the top two seeded women.  Five of the 20 top-two-seeded women were Americans, and in nearly every one of those five cases, those women faced one of the weakest players in the draw.  In other words, there’s more evidence that the draw is skewed in favor of the top two seeds than in favor of the top two Americans.

As with the men, American women in general have been given tough assignments.  In fact, there is only a 16% chance that American women would face such tough first round opponents as they have.

What this means

If the USTA (or anyone else) is messing with the US Open draws, they are doing so in a nearly inscrutable way.  The only evidence of manipulation is with each year’s top two seeds, as ESPN highlighted.

The theory I mentioned above, that it might be desirable to pit top players against up-and-coming Americans, is appealing, but also not supported by the evidence.  Only five of the 20 opponents of top-two men’s seeds (and six of 20 women’s opponents) have been American, despite the fact that the U.S. contributes five or six low-ranked wild cards each year, in addition to a disproportionate number of qualifiers.

It’s an odd situation.  The first-round opponents of the top two seeds make for a plausible target of draw manipulation, if not the most obvious one.

Postscript: One more question

I mentioned earlier that I’d rather watch Nadal play Jack Sock than Diego Junqueira.  I like up-and-comers, and it’s always interesting to see whether a new opponent forces a top player to change tactics.  It makes for a more interesting match than Nadal (or any top-tenner) against a 29-year-old who has hovered for years around #100.

My question, then: If you’re Rafa Nadal, and (presumably) you want to go deep at the U.S. Open, who would you rather play?  The American wild card ranked #450, or the veteran ranked #99?  A tougher question: Sock, or a veteran who was nearly seeded, like Fabio Fognini?  I can see different players making different choices, but I don’t think it’s clear-cut.

It is the draws of Jenkins, Britton, Glatch–in other words, the Jack Socks of previous years–that give us this evidence of manipulation.  On paper, the 127th-highest-ranked player in the draw looks like the 127th-best, but in practice, it’s not nearly so clear-cut.  And if these wild cards really are “wild cards,” what looks like an easy draw may not be much easier than yet another dissection of Sergiy Stakhovsky or Albert Montanes.

It may be true that at some stage, the US Open draws are being manipulated for (and only for) the top two seeds in each field.  But that doesn’t tell us whether those players are gaining anything from it.  It’s far from clear that the lowest-ranked players in each draw are the easiest opponents.

2 responses to “Is the US Open Draw Truly Random?”

  1. Pingback: US Open Draw Datasets | Heavy Topspin: A Tennis Blog

  2. Vinny Pop

    They do it to gain media advertising dollars.  It’s quite clear you will never see Fed face Nadal, or Djokovic face Murray or Fed or Nadal, in a first or second round.  This is disgraceful; it’s all about advertising.  Is this fair?  What it does is protect the ranking of the top 4 players as much as possible.  Because they never meet other than in semis and finals, they have their rankings protected.  Top tennis players know what’s going on, and so do the tennis organisations and media, but it is manipulation and ethically wrong, IMHO.
