Better Race Scheduling Using Big Data
Ben Vonwiller, Partner, McKinsey & Company

Stuart S. Janney III:

Thank you, Cathy, for giving us a better understanding of the ROAP program and of those who have gone through it.

Like many other sports enterprises, we've often turned to McKinsey & Company for guidance. In fact, the seminal McKinsey report outlining a national drug testing program was unveiled at this event in 1991. Six years ago Dan Singer of McKinsey shared the findings and recommendations emanating from a comprehensive economic study of our sport, called Driving Sustainable Growth for Racing and Breeding.

One aspect of that study focused on uncoordinated scheduling of races and how it was costing tracks significant handle. It's an issue that is yet to be resolved, and it continues to annoy our fans.

Today, Ben Vonwiller of McKinsey is with us to share some additional insights into race scheduling and the use of big data. Ben is no stranger to racing or The Jockey Club. He worked on a drug testing report that was featured in the 2014 Round Table Conference, and he moderated a panel featuring former NBA Commissioner David Stern at our Pan-Am Conference in New York in 2015. He's a leader of McKinsey's Global Media & Entertainment and Professional Sports Practices.

Please welcome Ben Vonwiller.

Ben Vonwiller:

Good morning. The topic for us today is how you can leverage big data and advanced analytics techniques to improve scheduling in Thoroughbred racing. I'm sure many of you are aware of the benefits that analytics are starting to bring to professional sports. We heard about it from Jim this morning in terms of injury predictions with Thoroughbreds. We've seen that with players in other professional sports as well. We've seen it in scouting. We've seen it in integrity monitoring. And we're also seeing real value being created by applying these analytical techniques to optimizing scheduling, effectively choosing the best possible coordination of individual events out of often trillions of potential choices.

We've done some work in this space. We have a team that worked with the NFL. And The Jockey Club had asked us whether there may be value in applying those same techniques to racing. So this was not a formal engagement, but we spent some time assessing whether the pre-conditions for scheduling analytics were in place and, if they were, whether there was real value that could be created.

Let me start by just sharing a case example. We have a team of applied mathematicians and operations researchers who had been asked by the NFL to see if they could improve the scheduling process that the NFL had undertaken in the past.

For those of you not familiar with the NFL, there are 32 teams. They're organized in two conferences; each conference has four divisions, and each team plays 16 regular season games in a 17-week period.

That sounds relatively simple to program. So what makes it hard? First of all, just the sheer volume of combinations that are possible. There are literally trillions and trillions of combinations the NFL could choose between in designing a schedule for any given year. You have individual team match-ups, you have days of the week where the teams could play, and you have broadcast slots on those days where you could assign them. When you add all these up together, you just get an overwhelming number of combinations.

Secondly, there are hard and soft constraints that need to be applied to those potential combinations. Hard constraints are things like stadium availability. It may be booked for a concert or another event on a given day. Or you have two teams in New York; they both can't play at home on the same day.

Then there are competitive balance issues. You don't want a team that's played Monday night to then play again on a Thursday night in the same week. A team that travels internationally for a game typically has a bye after that trip. All of these factors have to be considered.

Lastly, there are traditions that they're keen to respect and preserve. Classic example of this, the Cowboys play at home on Thanksgiving every year.

So our team came in. This is a process that typically takes months. We applied advanced analytics techniques to eliminate large numbers of unattractive, low-value schedules from consideration, so that the NFL could focus on the highest-value sets of schedules. They could view more schedules and better schedules, and view them faster, cutting through the processing time.

To give you an example of the power of this technique, our team in seven hours was able to find a schedule that was better than the last year's published schedule.

So we took that team and said, What can we do in racing? The problem statement we defined was overlapping schedules. So you can see an example on the screen right now. We picked a random day for this. It's in January of this year. You're looking at post times, the number of post times in any given slot on that day, and you can see there are multiple occasions where there are multiple races that are scheduled on or close to each other.

It gets even more complicated when you then look at off times instead of post times, and you can see on the screen a tweet here from the past week where you had the Whitney and the West Virginia Derby with off times within a minute of each other.

So this is clearly not an optimal fan experience. And we also believed, as a hypothesis for this work, that this had a direct financial impact on the industry as well.

Our hypothesis is that if you maximize the share of attention bettors can focus on any one race, they will bet more often. To test that, we took a data set of about 40,000 races from 2015. The first thing we needed to do was to build a model that predicted handle. So we ran a multivariate regression and tested an array of features to see which were significant in explaining handle. Our model can actually explain or predict handle pretty well. It's a pass grade of about 70%. And many of the features that predict handle have been seen in other attempts, in other models -- so field size, purse size, track, race type.

The new feature that we added into this exercise to get at the cost of scheduling we called concurrent purse. It was a metric we defined to act as a proxy for the share of attention any race gets from bettors.

How do we define concurrent purse? You can see on the screen the definition is the share of the aggregate total purse represented by each race in any given time period. What does that mean?

We took a race, we took the sum of that race's purse and then all of the purses that were represented by races that had off times within five minutes of that race. So that is the total available aggregate purse. Then we asked how much or what share of that aggregate purse did our race represent. If it had 100% share of concurrent purse, it meant it was the only race running in that time slot. If there were two races with the same purse size, you'd have a 50% share of concurrent purse for that race.
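The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the model used in the study: the `Race` record, the minutes-since-midnight time unit, and the choice to treat "within five minutes" as inclusive are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Race:
    name: str            # hypothetical label, for illustration only
    off_time_min: float  # off time in minutes since midnight (assumed unit)
    purse: float         # purse in dollars, assumed positive


def concurrent_purse_share(target, races, window_min=5.0):
    """Share of the aggregate purse that `target` represents among all
    races whose off times fall within `window_min` minutes of its own."""
    # Sum the purses of every *other* race running in the same window.
    competing = sum(
        r.purse
        for r in races
        if r is not target
        and abs(r.off_time_min - target.off_time_min) <= window_min
    )
    # Aggregate purse = this race's purse plus all competing purses.
    return target.purse / (target.purse + competing)


# Two races with equal purses, three minutes apart, plus one far away.
a = Race("A", 780.0, 50_000)
b = Race("B", 783.0, 50_000)
c = Race("C", 840.0, 50_000)
card = [a, b, c]

print(concurrent_purse_share(a, card))  # 0.5: shares the slot with B
print(concurrent_purse_share(c, card))  # 1.0: alone in its slot
```

The two printed values correspond to the two cases described in the talk: an equal-purse overlap gives a 50% share, and a race alone in its time slot gives 100%.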

What our model suggests is that the higher the share of concurrent purse you have, the more your handle will grow. It actually demonstrates that if you can maximize share of attention, people are more likely to bet.

We then asked, okay, how widespread is the problem? How much overlap do we see in racing? As you can see on the screen, we think it's actually a systemic problem across racing. We looked here at roughly the top 10% of races in that data set based on purse size, and we found there were about 1,500 races in that year where their share of aggregate purse was 60% or lower. So they simply weren't dominating bettors' attention at their off time.

So when we asked the question does scheduling matter, the answer is yes. The next question to ask is: Is there anything we can do about it? This may be a problem, but there may not be any fix that's available.

The next question we asked was: is there white space available in these race cards in our data set, which effectively represents unused inventory where you could schedule races? And the team found that there was.

So they conducted an exercise to effectively de-duplicate those race times and see what the predicted handle would be. You can see on the screen, the payoff is real and significant. Our model predicts a $400 million increase in handle across the industry from better scheduling by de-duplicating races.

The way the model got that was we took a set of the most heavily duplicated races and moved them to white space. We did not change any of the fundamental building blocks of scheduling, so we didn't move them across tracks, we didn't move them across days of the track, and we didn't move them outside the typical race hours of that track. We just found white space across the system. In doing so, we took the share of concurrent purses for the races we moved from 37% to above 80%, so effectively doubling their share of attention from bettors, and that's what produced the $400 million.
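As a rough illustration of what "moving a race to white space" could look like, the sketch below picks, from a list of allowed off times at the same track on the same day, the slot where the race would hold the largest share of concurrent purse. It is a simple greedy stand-in, not the algorithm the team used; the function names, the tuple layout, and the five-minute window are assumptions for the example.

```python
def share_at(purse, off_time, others, window_min=5.0):
    """Share of concurrent purse a race with `purse` would have if it went
    off at `off_time`. `others` is a list of (off_time_min, purse) tuples
    for races at other tracks (hypothetical layout)."""
    competing = sum(p for t, p in others if abs(t - off_time) <= window_min)
    return purse / (purse + competing)


def best_slot(purse, candidate_times, others, window_min=5.0):
    """Return the candidate off time with the highest share of concurrent
    purse. Candidates are assumed to already respect the track's own day
    and typical race hours; ties go to the earliest listed candidate."""
    return max(candidate_times,
               key=lambda t: share_at(purse, t, others, window_min))


# Races at other tracks: (off time in minutes since midnight, purse).
others = [(780, 100_000), (782, 50_000), (800, 20_000)]
candidates = [780, 790, 800, 810]  # allowed slots for the race being moved

print(share_at(100_000, 780, others))          # 0.4 in the crowded slot
print(best_slot(100_000, candidates, others))  # 790, an empty slot
```

A real scheduler would also have to keep minimum gaps between races on the same card and weigh the effect of each move on every other race, which this sketch ignores.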

In terms of the aggregate impact of the moves of the selected races, we took a 10-point increase in share of concurrent purse from that data set, from about 48% to 59%.

We then said, okay, there seems to be a clear-cut data-driven point of view emerging here. We wanted to test whether this answer in the model was applicable in the real world. So we spoke to race secretaries, race directors, broadcasters and others in the industry to get their reactions to what our model was telling us. We heard five potential challenges as to why this may not work in real life.

The first one we heard was a simple one, which is complete de-duplication of races is just not possible. If you want to have two races per hour at a track and you want to schedule races across the industry at five-minute increments, you pretty quickly get to a position where you can only run six tracks at any point in time.
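The arithmetic behind that limit is short enough to write down. The helper below is purely illustrative; its name and parameters are assumptions, and it assumes the slot length divides evenly into an hour.

```python
def max_non_overlapping_tracks(races_per_hour_per_track, slot_minutes):
    """How many tracks can run at once if every race across the industry
    must occupy its own time slot."""
    slots_per_hour = 60 // slot_minutes  # e.g. 12 five-minute slots
    return slots_per_hour // races_per_hour_per_track


print(max_non_overlapping_tracks(2, 5))  # 6
```

With five-minute increments there are 12 slots in an hour, and each track running two races per hour consumes two of them, which leaves room for six tracks.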

While this is correct, we think the real power of scheduling analytics is not complete de-duplication, it's actually optimizing the duplication you're bound to end up with. The secret is how you pair which races with each other to maximize each race's share of concurrent purse.

So, an end goal where we have no overlapping races is not possible. An end-goal where we have overlapping races that are truly optimized to maximize handle is absolutely possible, we believe.

The second objection we heard was this will only work if everyone cooperates. This $400 million is an externality, it's only attainable if we get industry-wide cooperation. That is true if we're talking about the $400 million. The question we wanted to ask was is there value that still could be unlocked if just certain actors in the industry cooperated.

So we went back to the data and said what would the model say if we just optimized schedules across the five largest track operators? What we found was you could still get nearly half of that total opportunity just by cooperation across these big five track operators. If the total handle increase opportunity was $400 million, when everybody cooperated, those five track operators stood to benefit by about $240 million. If they just cooperated amongst themselves, our model suggests there is still a $150 million incremental handle opportunity, which in each of their cases would represent about 2 to 3% growth in handle.

In addition to that $150 million, there would probably be a windfall benefit to other tracks of about $30 million, even though they weren't participating in the cooperation.

The next objection we considered was that those that understand the benefits of cooperation are already doing it, and those who aren't willing to cooperate, won't cooperate.

While we think it is clearly true that there is cooperation happening in the industry, and we heard about it, that cooperation to us felt like humans cooperating with humans, and the real benefit, even beyond a willingness to cooperate, is applying these scheduling algorithms to the races that were part of that cooperation set to further optimize communications.

So willingness to cooperate alone is not enough. Actually applying some of these tools to maximize the benefits and leverage of that cooperation will unlock further value.

The next challenge we heard was that different racetracks want to operate with different frequencies. Larger tracks may want a 32- to 34-minute gap between races; smaller tracks may have a shorter gap, maybe 28 to 30 minutes. Racetracks may vary gaps based on time of year -- when there are more daylight hours, they'd have larger intervals; as daylight hours shrink, they run shorter intervals. This adds to complexity.

That statement is absolutely true, but we would argue that is exactly the kind of complexity that advanced analytics are best placed to deal with. The algorithm can actually optimize across all of these constraints and still find the best possible scheduling outcomes.

The last challenge we heard was perhaps one of the more difficult ones for us to think through. As we spoke to people, they said, well, scheduling will apply to post times. As your model suggests, what bettors care about is off times, and post dragging will ultimately defeat any scheduling benefits you can create around post times. And when you look into it, we heard the vast majority of post dragging is driven by unavoidable or external factors -- a loose horse, weather conditions at the race.

So we went back and looked at the data and said is this true and is there anything we can do about it, is post dragging truly unavoidable. And the answer we found was as follows. We looked at the top 25 tracks in terms of handle. When we looked at the five worst-performing tracks in terms of post time adherence, we found between 50 and 70-plus percent of races weighted by handle were starting more than five minutes after post time. Actually when we looked across the full set of tracks, about 13 of the 25 tracks had about 30% of off times that were more than five minutes after a post time. So we thought, okay, maybe there is a point here.

Then we looked at the top five performers in terms of racetracks with post time integrity, and we found a completely different story. All five of these tracks were able to start races between 93 and 97% of the time in less than five minutes from post time.
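A handle-weighted lateness measure like the one described above could be computed along these lines. This is a sketch under stated assumptions: the tuple layout, the minutes-since-midnight unit, and the function name are invented for illustration, and the sample numbers are made up rather than taken from the study.

```python
def late_start_share(races, threshold_min=5.0):
    """Handle-weighted share of races whose off time is more than
    `threshold_min` minutes after the scheduled post time.
    `races` is a list of (post_time_min, off_time_min, handle) tuples."""
    total_handle = sum(handle for _, _, handle in races)
    late_handle = sum(
        handle
        for post, off, handle in races
        if off - post > threshold_min
    )
    return late_handle / total_handle


# Made-up card: the second and third races drag more than five minutes.
card = [
    (780, 783, 100.0),  # 3 minutes late: counts as on time
    (810, 818, 300.0),  # 8 minutes late
    (840, 846, 100.0),  # 6 minutes late
]

print(late_start_share(card))  # 0.8
```

Weighting by handle means a dragged feature race counts for more than a dragged low-handle race, which matches the "weighted by handle" framing in the talk.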

So we didn't see any feature that was explained to us that would explain variance like this based on external factors, and that led us to believe that there is still real potential for scheduling optimization in racing.

So let me recap. Our question was do the preconditions for schedule optimization exist in racing and can applying scheduling analytics really drive value. We left that exercise with real conviction that the answer is yes. There will need to be certain things that are done in order to unlock that value, and we've captured a couple of them on the slide here.

Firstly, we will need to build a scheduling algorithm that can help program races across tracks to find the best possible outcome across the trillions of outcomes that are possible.

Secondly, we need to achieve better post time integrity so that that scheduling algorithm can actually create value for us. That will involve better coordination between tracks.

Then it is also worth considering, we think, some form of master scheduling role. In other jurisdictions this exists. It could be a broadcaster in Australia -- in Australia it's Sky Racing -- it could be TVG here, it could be someone else. But someone to help coordinate scheduling across tracks.

The reason why we think this is worth that investment is that the payoff is real. $400 million, a 4% increase in handle, is a real step change that is worth the effort, we believe.

So I thank you for your time.
