Simulating the Rugby World Cup 2019 Japan in R


[This article was first published on R – stats on the cloud, and kindly contributed to R-bloggers].


I really like running simulation models before sporting events because they can give you a much greater depth of understanding of team performance compared to the ‘raw’ odds that you might get from the media or bookmakers, or the often varied opinions of different sports pundits.

Yes, Ireland usually get knocked out in the Quarter Finals, and this is what many people are saying is most likely to happen again this year – but does the data show this? Pool C is the ‘pool of death’, right? With England, France and Argentina vying for qualification in the top two spots. Or, does Pool D hold that title, with Wales, Australia and Fiji battling it out? Oh, and we know Italy is having a tough time right now, but if we could have 10,000 world cups, would they win it even once?

That’s where a simulation can provide a lot of extra oomph for a little extra effort. If we take the ‘raw’ odds and add in some volatility, we can understand the distribution of outcomes rather than just the single most likely one.
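To make the idea concrete, here is a minimal sketch of turning single-match odds into a distribution of outcomes. It is not the model used in this post: the three win probabilities are hypothetical values chosen purely for illustration, and the names `p_win`, `rounds_won` and `outcome_share` are my own.

```r
set.seed(2019)  # for reproducibility

n_sims <- 10000

# Hypothetical single-match win probabilities for one team in the
# quarter-final, semi-final and final (illustrative values only).
p_win <- c(quarter = 0.60, semi = 0.45, final = 0.40)

# One row per simulated tournament, one column per knockout round.
# A cell is TRUE if the team would win that round in that simulation.
wins <- matrix(runif(n_sims * length(p_win)), nrow = n_sims) <
  matrix(p_win, nrow = n_sims, ncol = length(p_win), byrow = TRUE)

# Number of consecutive wins before the first loss (0 to 3).
rounds_won <- apply(wins, 1, function(w) sum(cumprod(w)))

stage <- factor(rounds_won, levels = 0:3,
                labels = c("Out in QF", "Out in SF", "Runner-up", "Champion"))

# Share of simulated tournaments ending at each stage.
outcome_share <- prop.table(table(stage))
print(round(outcome_share, 3))
```

The point of the table is that it shows how often each outcome happens across many simulated tournaments, not just which outcome is the favourite.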

That being said, with the Rugby World Cup 2019 officially starting in just over two weeks on 20 September, I thought I’d best run a simulation model.  The model is relatively simple, but nevertheless I hope it can provide some insight into both the tournament and also more generally how to simulate a sporting competition!

Venues of the RWC 2019 Japan
Structure of the tournament. Source: RWC2019

Rating Data – How good is a team?  Given that, how likely is team A to beat team B?

I used two different data sources:

World Rugby, the global governing body of rugby union, publishes official world rankings weekly. These have some shortcomings, although they’re widely followed by the media and have been around for many years.

Rugby Pass, a rugby broadcaster from New Zealand, publishes its Rugby Pass Index. It’s a bit of a black box as they don’t publish their methodology, and it has only been around for a year, but they claim to use machine learning and it appears to work at player granularity.

When the two ratings are plotted against each other for each team, the relationship is fairly linear, although the World Rugby rankings show a smaller gap between the lower and the higher teams.
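A comparison like this can be checked with a simple linear fit. The sketch below uses placeholder numbers for five unnamed teams, not the real ratings, and the column names `world_rugby` and `rpi` are assumptions of mine. If both systems are on comparable scales, a slope below 1 would indicate that World Rugby compresses the gap between weaker and stronger teams.

```r
# Placeholder ratings for five unnamed teams (NOT real values).
ratings <- data.frame(
  team        = paste("Team", LETTERS[1:5]),
  world_rugby = c(89.0, 85.5, 81.0, 76.5, 72.0),
  rpi         = c(92.0, 86.0, 79.0, 70.0, 61.0)
)

# Fit a straight line: World Rugby points as a function of Rugby Pass Index.
fit <- lm(world_rugby ~ rpi, data = ratings)

slope     <- unname(coef(fit)["rpi"])   # World Rugby points per RPI point
r_squared <- summary(fit)$r.squared     # how close to linear the relationship is

print(c(slope = slope, r_squared = r_squared))
```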

I could write another post comparing the two ratings systems, but in a nutshell I believe the Rugby Pass Index more accurately represents the current form of teams, although it doesn’t track the lower tier nations as well as the World Rugby rankings do.

A source of recent volatility in the ratings has been the summer friendly matches, which are counted as full fixtures by both systems, even though some teams may have treated them as experimental warm-up matches.

So I decided to take a consensus, using the mean of both ratings for each team, taken both before and after the summer friendly matches. To allow for teams whose performance improves or drops between now and the start of the tournament, a rand(-5,5) adjustment was made to each rating per simulation. Finally, Japan was given a 1.5pt home advantage adjustment as host.
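The rating steps described above can be sketched as follows. This is an illustration under stated assumptions, not the post's actual code: the rating values are placeholders, "Team A" and "Team B" are stand-ins, I read rand(-5,5) as a continuous uniform draw between -5 and 5, and I assume the four ratings per team are weighted equally. The names `rating_cols`, `home_advantage` and `simulate_ratings` are hypothetical.

```r
set.seed(42)  # for reproducibility

# Placeholder ratings (NOT real values): both systems, before and after
# the summer friendlies.
ratings <- data.frame(
  team       = c("Japan", "Team A", "Team B"),
  wr_before  = c(76, 88, 80),
  wr_after   = c(77, 87, 81),
  rpi_before = c(70, 90, 78),
  rpi_after  = c(72, 89, 79)
)

# Consensus rating: plain mean of the four ratings (assumed equal weights).
rating_cols <- c("wr_before", "wr_after", "rpi_before", "rpi_after")
ratings$consensus <- rowMeans(ratings[, rating_cols])

# Host bonus: 1.5 points for Japan only.
home_advantage <- 1.5
ratings$base <- ratings$consensus +
  ifelse(ratings$team == "Japan", home_advantage, 0)

# One simulation's ratings: base rating plus a uniform draw in [-spread, spread].
simulate_ratings <- function(base, spread = 5) {
  base + runif(length(base), min = -spread, max = spread)
}

# Rows are teams, columns are simulations.
n_sims <- 10000
sim_ratings <- replicate(n_sims, simulate_ratings(ratings$base))
rownames(sim_ratings) <- ratings$team

print(round(rowMeans(sim_ratings), 1))
```

Drawing a fresh adjustment in every simulation means a team that is slightly behind on paper will be ahead in some share of the runs, which is what opens up alternative pathways through the tournament.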

The favourite team doesn’t always win – how much leeway for luck is there? We need a Distribution for Points Scored in a Rugby Match.

For those less familiar with rugby, these are examples of scorelines, from RWC 2015.

In a game like soccer, you get a lot of draws because goals don’t happen very often, at least compared to rugby where it’s not unusual to see 60pts or more sc

