UEFA CL Round of 16 Drawings 2014/15

A new article regarding the 2014/15 drawings is now available (see below). You can also simulate the drawing process with my newly created Shiny App:

Champions League Drawings – Shiny App

shiny_app

or have a look at Dr. Martin Theus' blog. The underlying R code and all the data can now be found here: code and all possible drawings as csv.

Note: The App is currently running on a low performance server (Amazon EC2 Micro Instance). Initial loading may take a bit.

The draw and its rules:

There are three rules for the round of 16 drawing (see Wikipedia):

1) the group winners will be drawn against the group runners-up, with the group winners hosting the second leg.

2) teams from the same group cannot be drawn against each other.

3) teams from the same association (league) cannot be drawn against each other.

These rules look fairly simple and reasonable, but the asymmetric nature of rule 3) has some interesting implications. This restriction only affects the leagues with more than one top team, like the Barclays Premier League and the German Bundesliga. This is the reason why English vs. German matches are disproportionately likely. For example, FC Bayern München vs. FC Arsenal has a probability of 24% in this year's draw; last year it was even higher at 30.8% (and this match was drawn in the end). If restriction 3) did not exist, this probability would only be 14.3% (1 out of 7). This high likelihood also holds for the other German and English teams, e.g. FC Chelsea vs. Bayer 04 Leverkusen has a chance of 28.7% and is thus the single most likely match we will see.

draw1

The intuition behind this is somewhat hard to get, but I will try an explanation: The likelihood of each match is calculated as the number of draws in which this match occurs divided by the number of all possible draws. If you account for all the restrictions you end up with just 4516 possible draws (see calculation below). Now consider two scenarios:

a) FC Bayern München is drawn against FC Arsenal (occurs in 1086 draws, i.e. 24%)

b) FC Bayern München is drawn against any single other possible opponent (occurs in 824 draws, i.e. 18.2%, or 958 draws, i.e. 21.2%, depending on the opponent)

So why is scenario a) much more likely? The main reason is that there must be a solution in the end, which means that every group winner must end up with exactly one allowed opponent. In practice this is a fourth restriction. You will notice this in the live draw on Monday: sometimes certain matches will be ruled out in advance because they would leave no valid solution. If FC Bayern München versus FC Arsenal is not drawn (scenario b)), there are fewer possibilities for a solution in the end, since the restrictions still have to be satisfied by the matches drawn later.

This does not only hold for German and English teams, but for all associations with at least one group winner and one group runner-up. This time that also includes France. This becomes pretty obvious once you play around a bit with the app: the draw can be practically over after just four draws if only the unlikely matches are selected first (there are actually several possibilities for this scenario), since then just one solution remains. Consider this example, where the match with the lowest conditional probability is always selected, with ties broken at random: Atletico Madrid vs. FC Basel, Real Madrid vs. Juventus Turin, FC Barcelona vs. Shakhtar Donetsk. For the moment we stop here. Now only 6 draws are possible (remember, this is just the third draw out of eight). The conditional probabilities look like this:

draw2

So now there are seven matches that would end the draw if selected next (each occurs with a likelihood of 16.7%, i.e. in 1 out of the 6 remaining draws). Of course these are only matches that do not resolve any restriction. If we chose FC Bayern vs. FC Arsenal here, we could go on with a fifth draw. We proceed and choose AS Monaco vs. Manchester City. Voilà, we are done! This is the only possible solution in this case:

draw3

This is why:

Borussia Dortmund has to play against Paris St. Germain: FC Schalke 04 and Bayer Leverkusen are not possible because of restriction 3), FC Arsenal is not possible because of restriction 2), and all other possible opponents are already drawn.

FC Bayern München has to face FC Arsenal since it also can’t play against FC Schalke 04 and Bayer Leverkusen, and we already know that Borussia Dortmund must face Paris St. Germain.

FC Chelsea has to face Bayer 04 Leverkusen, since FC Schalke 04 is not possible (same group, restriction 2) and all other possible opponents are already drawn.

FC Porto vs. FC Schalke 04 is clear now, since this is the only match that is left.

(Note that if we had not selected AS Monaco vs. Manchester City, and instead chosen FC Bayern München vs. FC Arsenal here, we could still get the same result, but would not be limited to this result.)
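The walkthrough above can be checked with a few lines of Python. This is only a sketch, not the code behind the app: it enumerates all permutations of the eight runners-up instead of using the pre-calculated table, and the association codes are assumptions added here (teams are listed in group order A to H, so the list position encodes the group).

```python
from itertools import permutations

# (name, association) in group order A..H; association codes are assumptions.
WINNERS = [("Atletico Madrid", "ESP"), ("Real Madrid", "ESP"),
           ("AS Monaco", "FRA"), ("Borussia Dortmund", "GER"),
           ("FC Bayern München", "GER"), ("FC Barcelona", "ESP"),
           ("FC Chelsea", "ENG"), ("FC Porto", "POR")]
RUNNERS_UP = [("Juventus Turin", "ITA"), ("FC Basel", "SUI"),
              ("Bayer 04 Leverkusen", "GER"), ("FC Arsenal", "ENG"),
              ("Manchester City", "ENG"), ("Paris St. Germain", "FRA"),
              ("FC Schalke 04", "GER"), ("Shakhtar Donetsk", "UKR")]
W = {name: i for i, (name, _) in enumerate(WINNERS)}
R = {name: j for j, (name, _) in enumerate(RUNNERS_UP)}

def is_valid(draw):
    """draw[i] is the runner-up index for winner i; check rules 2 and 3."""
    return all(j != i and WINNERS[i][1] != RUNNERS_UP[j][1]
               for i, j in enumerate(draw))

ALL_DRAWS = [p for p in permutations(range(8)) if is_valid(p)]

def remaining(fixed):
    """All valid draws that contain every (winner, runner-up) pair in fixed."""
    return [d for d in ALL_DRAWS if all(d[W[w]] == R[r] for w, r in fixed)]

fixed = [("Atletico Madrid", "FC Basel"),
         ("Real Madrid", "Juventus Turin"),
         ("FC Barcelona", "Shakhtar Donetsk")]
after_three = remaining(fixed)
after_four = remaining(fixed + [("AS Monaco", "Manchester City")])

print(len(ALL_DRAWS), len(after_three), len(after_four))
# The single remaining draw, as winner -> runner-up.
final = {WINNERS[i][0]: RUNNERS_UP[j][0] for i, j in enumerate(after_four[0])}
for winner, runner_up in final.items():
    print(f"{winner} vs. {runner_up}")
```

Swapping the fourth pair for FC Bayern München vs. FC Arsenal in the call to `remaining` shows how many completions stay open in that case.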

I hope this post clarifies at least a bit that the drawing mechanism induces a lot of bias through all its restrictions. Germany vs. England might remain the most likely matches in the round of 16 for the near future. The draw might be over after just four draws, but this does not depend on the chronological order.

I encourage you to use the Shiny App to explore the drawing!

How the calculation is implemented:

In principle, all possible and allowed drawings are calculated. This is the fastest way I found:

1) calculate all possible group winner vs. group runner-up matches without respecting the latter two restrictions. This leads to 8×8 = 64 matches. Then employ all restrictions, i.e. remove all matches where two teams from the same group are drawn against each other and also remove all matches where two teams from the same league (association) are drawn against each other. This leads to 49 (notice there are 15 probabilities of zero in the plot) remaining possible matches.
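Step 1 can be sketched in Python as follows (the original code is in R and is not reproduced here). The group letters and association codes are assumptions added for this sketch; they are chosen to be consistent with the restrictions described in this post.

```python
from itertools import product

# (name, group, association). Group letters and association codes are
# assumptions added for this sketch; they are not listed in the text.
WINNERS = [
    ("Atletico Madrid", "A", "ESP"), ("Real Madrid", "B", "ESP"),
    ("AS Monaco", "C", "FRA"), ("Borussia Dortmund", "D", "GER"),
    ("FC Bayern München", "E", "GER"), ("FC Barcelona", "F", "ESP"),
    ("FC Chelsea", "G", "ENG"), ("FC Porto", "H", "POR"),
]
RUNNERS_UP = [
    ("Juventus Turin", "A", "ITA"), ("FC Basel", "B", "SUI"),
    ("Bayer 04 Leverkusen", "C", "GER"), ("FC Arsenal", "D", "ENG"),
    ("Manchester City", "E", "ENG"), ("Paris St. Germain", "F", "FRA"),
    ("FC Schalke 04", "G", "GER"), ("Shakhtar Donetsk", "H", "UKR"),
]

def allowed(winner, runner_up):
    """Rule 2 (different group) and rule 3 (different association)."""
    return winner[1] != runner_up[1] and winner[2] != runner_up[2]

# Rule 1 only: every group winner against every runner-up.
all_matches = list(product(WINNERS, RUNNERS_UP))
# Apply rules 2 and 3.
allowed_matches = [(w[0], r[0]) for w, r in all_matches if allowed(w, r)]
# Possible opponents per group winner.
opponents = {w[0]: [r[0] for r in RUNNERS_UP if allowed(w, r)]
             for w in WINNERS}

print(len(all_matches), len(allowed_matches))
for team, opps in opponents.items():
    print(f"{team}: {len(opps)} possible opponents")
```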

2) calculate all possible draws (tuples of eight matches). At first ignore the restriction that each team can only be drawn once. Proceed as follows: select one group winner and all matches that are possible for this team from the table created in step 1.

Example: There are 7 possible opponents for Atletico Madrid (all except Juventus Turin, since this team is from the same group). This leads to 7 possible matches in step 1. For Borussia Dortmund there are 5 possible opponents (FC Schalke 04 and Bayer Leverkusen are not possible because they are also from the German Bundesliga, FC Arsenal is not possible since this team is from the same group). This leads to 5 possible matches in step 1. Proceed like this for all group winners and calculate the cross product of all matches.

You end up with 7 (Atletico Madrid) x 7 (Real Madrid) x 6 (AS Monaco) x 5 (Borussia Dortmund) x 5 (FC Bayern München) x 7 (FC Barcelona) x 5 (FC Chelsea) x 7 (FC Porto) = 1,800,750 possibilities (a 1,800,750 x 16 (number of teams) matrix).

3) Remove all draws in which a group runner-up is selected more than once. This leads to 4516 possible draws.

4) To calculate the probabilities simply count all draws where the desired match occurs and divide by the number of all possible draws.
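The four steps can be put together in a short Python sketch. It follows the same idea (cross product first, then filter), but it is not the R code linked above; the association codes are assumptions, and teams are listed in group order A to H so that the list position stands for the group. In plain Python the filter over 1.8 million tuples may take a few seconds.

```python
from collections import Counter
from itertools import product
from math import prod

# (name, association) in group order A..H; association codes are assumptions.
WINNERS = [("Atletico Madrid", "ESP"), ("Real Madrid", "ESP"),
           ("AS Monaco", "FRA"), ("Borussia Dortmund", "GER"),
           ("FC Bayern München", "GER"), ("FC Barcelona", "ESP"),
           ("FC Chelsea", "ENG"), ("FC Porto", "POR")]
RUNNERS_UP = [("Juventus Turin", "ITA"), ("FC Basel", "SUI"),
              ("Bayer 04 Leverkusen", "GER"), ("FC Arsenal", "ENG"),
              ("Manchester City", "ENG"), ("Paris St. Germain", "FRA"),
              ("FC Schalke 04", "GER"), ("Shakhtar Donetsk", "UKR")]

# Step 1: indices of the allowed runners-up for each group winner
# (different group, i.e. different position, and different association).
options = [
    [j for j, (_, assoc_r) in enumerate(RUNNERS_UP)
     if j != i and assoc_r != assoc_w]
    for i, (_, assoc_w) in enumerate(WINNERS)
]

# Step 2: size of the cross product when each winner picks independently.
n_cross = prod(len(o) for o in options)

# Step 3: keep only tuples in which every runner-up appears exactly once.
draws = [t for t in product(*options) if len(set(t)) == len(t)]

# Step 4: count how often each (winner, runner-up) pairing occurs.
match_counts = Counter((i, j) for t in draws for i, j in enumerate(t))

def probability(winner, runner_up):
    """Share of all valid draws that contain the given match."""
    i = [name for name, _ in WINNERS].index(winner)
    j = [name for name, _ in RUNNERS_UP].index(runner_up)
    return match_counts[(i, j)] / len(draws)

print(n_cross, len(draws))
print(round(100 * probability("FC Bayern München", "FC Arsenal"), 1))
```

Building the list of draws once and only counting afterwards mirrors the way the app works on a pre-calculated table.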

It takes about 2 seconds to compute on my standard desktop machine and uses 300MB of memory. It is also possible to simulate the drawing, but this is way slower (see last year's post for (messy) code). It of course leads to the same result.
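For comparison, a simulation could look roughly like the sketch below. It uses rejection sampling (shuffle the runners-up, discard any draw that violates rule 2 or 3), which is an assumption about one simple way to simulate and not necessarily what last year's R function does. The association codes, the seed and the sample size are likewise arbitrary choices for illustration.

```python
import random
from collections import Counter

# (name, association) in group order A..H; association codes are assumptions.
WINNERS = [("Atletico Madrid", "ESP"), ("Real Madrid", "ESP"),
           ("AS Monaco", "FRA"), ("Borussia Dortmund", "GER"),
           ("FC Bayern München", "GER"), ("FC Barcelona", "ESP"),
           ("FC Chelsea", "ENG"), ("FC Porto", "POR")]
RUNNERS_UP = [("Juventus Turin", "ITA"), ("FC Basel", "SUI"),
              ("Bayer 04 Leverkusen", "GER"), ("FC Arsenal", "ENG"),
              ("Manchester City", "ENG"), ("Paris St. Germain", "FRA"),
              ("FC Schalke 04", "GER"), ("Shakhtar Donetsk", "UKR")]

def is_valid(draw):
    """draw[i] is the runner-up index for winner i; check rules 2 and 3."""
    return all(j != i and WINNERS[i][1] != RUNNERS_UP[j][1]
               for i, j in enumerate(draw))

def random_valid_draw(rng):
    """Shuffle the runners-up until the resulting draw is valid."""
    order = list(range(len(RUNNERS_UP)))
    while True:
        rng.shuffle(order)
        if is_valid(order):
            return tuple(order)

rng = random.Random(2014)   # fixed seed so the run is repeatable
n_samples = 20_000          # arbitrary sample size for this sketch
seen = Counter(random_valid_draw(rng) for _ in range(n_samples))

# Estimated probability of FC Bayern München (index 4) vs. FC Arsenal (index 3).
bayern_arsenal = sum(c for d, c in seen.items() if d[4] == 3) / n_samples
print(len(seen), round(bayern_arsenal, 3))
```

Because every valid draw is equally likely to be accepted, the estimated frequencies approach the counting-based probabilities, but many samples are needed before every one of the possible draws has been seen at least once.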

The App takes the pre-calculated table of drawings and just selects parts of it.

One Draw Can Change Everything – UEFA CL Round of 16 Drawings

Today the UEFA Champions League Round of 16 matchups were drawn. The rather simple rules for drawing the matchups lead to quite interesting results. There are three rules (see http://en.wikipedia.org/wiki/2013%E2%80%9314_UEFA_Champions_League):

  1. The group winners will be drawn against the group runners-up, with the group winners hosting the second leg.
  2. Teams from the same group cannot be drawn against each other.
  3. Teams from the same association cannot be drawn against each other.

group winners: Manchester United (MAN), Real Madrid (REA), Paris St. Germain (PSG), FC Bayern (BAY), FC Chelsea (CHL), Borussia Dortmund (BVB), Atletico Madrid (ATL) and FC Barcelona (BAR). (from group A to H)

group runners-up: Bayer 04 Leverkusen (B04), Galatasaray (GAL), Olympiacos (PIR), Manchester City (MAC), Schalke 04 (S04), FC Arsenal (FCA), Zenit St. Petersburg (ZEN), AC Milan (ACM). (from group A to H)

The question that arises here is: how do these restrictions influence the “random” drawing? To answer that question I wrote a function in R that simulates 1 million random draws to make sure that in the end I get all possible drawings. (Ex ante I did not know how many simulations had to be done. Now I know that far fewer than 1 million are actually needed.) This took about 40 minutes. In the end one gets the result that only 3497 different drawings (complete sets of eight matchups) are actually possible:

UEFA0

One can clearly see that matchups of Bundesliga vs. Barclays Premier League are much more likely than any other matches. That is because there are fewer restrictions for the other matches if one of these matches is drawn (restrictions 2 and 3). The probability for FC Bayern vs. FC Arsenal (30.8%) is nearly twice as high as that of any other single matchup. The same is true for the other possible Bundesliga vs. Barclays Premier League matchups. To get a little more behind the logic of these probabilities we can now have a closer look at the conditional probabilities given the first draw, second draw etc. In the first draw of today's lottery the matchup FC Barcelona vs. Manchester City was drawn by Luis Figo. The probabilities changed in the following way:

UEFA1

FC Bayern vs. FC Arsenal got even more likely. With the first matchup now fixed, there are only 605 possibilities left. After the next draw, Manchester United vs. Olympiacos F.C., the probabilities changed to the following:

UEFA2

At this stage there are no restrictions in the drawing process except the three restrictions from above (this will change in the next draws). The third matchup drawn is Atletico Madrid vs. AC Milan.

UEFA3

Now only 11(!) draws are possible. Real Madrid vs. Schalke 04 gets a probability of more than 50%, and so does FC Bayern vs. FC Arsenal. Borussia Dortmund basically has a 50:50 chance of getting Galatasaray. In the next draw the less likely Paris Saint-Germain vs. Bayer 04 Leverkusen is drawn (18.2%). This leads to the following conditional probabilities:

UEFA4

This leaves us with just 2(!) possibilities (remember, next is the fifth draw out of eight). FC Bayern vs. FC Arsenal is fixed now without even being drawn, and so is Real Madrid vs. Schalke 04! Galatasaray is now selected to get its opponent. Why can't its opponent be FC Bayern? Let's check two examples:

  1. GAL-BAY gets drawn here, and in the next draw ZEN-BVB is selected. This would lead to a dead end, since only two matches are left to be decided: either FCA-CHL and S04-REA, which is not possible because FCA and CHL are from the same association (restriction 3), or FCA-REA and S04-CHL, which is also not possible because S04 and CHL are from the same group (restriction 2).
  2. GAL-BAY gets drawn here, and in the next draw FCA-REA is selected. This would also lead to a dead end, since there is no possible opponent left for S04: CHL is not possible (same group, restriction 2), and BAY and BVB are not possible (same association, restriction 3).

If one checked all other possibilities, all but two final drawings would lead to a dead end. That is the reason why in the drawing the above two teams were not allowed to be selected by Luis Figo (green dot = allowed, grey dot = not allowed; source: http://www.youtube.com/watch?v=moDwj6pHSPg at 5:35). The chance of Galatasaray getting Chelsea or Borussia Dortmund is now 50:50. That's the only decision left here. Chelsea is selected. From now on everything else is clear.
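The dead-end check can be sketched in Python: an opponent is only admissible if at least one complete valid drawing still contains all matchups fixed so far plus the new one. This is an illustration only, not the procedure UEFA's software uses. The team abbreviations are the ones introduced above, the association codes are assumptions, and pairs are written runner-up first, as in GAL-BAY.

```python
from itertools import permutations

# (abbreviation, association) in group order A..H; association codes assumed.
WINNERS = [("MAN", "ENG"), ("REA", "ESP"), ("PSG", "FRA"), ("BAY", "GER"),
           ("CHL", "ENG"), ("BVB", "GER"), ("ATL", "ESP"), ("BAR", "ESP")]
RUNNERS_UP = [("B04", "GER"), ("GAL", "TUR"), ("PIR", "GRE"), ("MAC", "ENG"),
              ("S04", "GER"), ("FCA", "ENG"), ("ZEN", "RUS"), ("ACM", "ITA")]
W = {name: i for i, (name, _) in enumerate(WINNERS)}
R = {name: j for j, (name, _) in enumerate(RUNNERS_UP)}

def rules_allow(i, j):
    """Rules 2 and 3 for winner index i and runner-up index j."""
    return i != j and WINNERS[i][1] != RUNNERS_UP[j][1]

# All complete drawings; d[i] is the runner-up index assigned to winner i.
ALL_DRAWS = [p for p in permutations(range(8))
             if all(rules_allow(i, j) for i, j in enumerate(p))]

def can_complete(fixed):
    """True if some valid drawing contains all (runner-up, winner) pairs."""
    return any(all(d[W[w]] == R[r] for r, w in fixed) for d in ALL_DRAWS)

# The first four matchups as drawn, runner-up first.
drawn = [("MAC", "BAR"), ("PIR", "MAN"), ("ACM", "ATL"), ("B04", "PSG")]
taken = {w for _, w in drawn}

# Winners still in the pot that rules 2 and 3 alone would allow for GAL.
candidates = [w for w, _ in WINNERS
              if w not in taken and rules_allow(W[w], R["GAL"])]
# Of those, the ones that still leave a complete valid drawing.
admissible = [w for w in candidates if can_complete(drawn + [("GAL", w)])]
dead_ends = [w for w in candidates if w not in admissible]

print("allowed by rules 2 and 3:", candidates)
print("admissible:", admissible)
print("dead ends:", dead_ends)
```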

UEFA5

Now Gianni Infantino selected Schalke 04 to get its opponent. The opponent is clear now: it can't be Borussia Dortmund because they are both from the Bundesliga (restriction 3), and it can't be FC Bayern for the same reason, so it must be Real Madrid. So now we are only left with FC Bayern, Borussia Dortmund, Zenit St. Petersburg and FC Arsenal. FC Bayern cannot face Borussia Dortmund, since both are group winners (and from the same association, restrictions 1 and 3). FC Bayern cannot face Zenit St. Petersburg because then Borussia Dortmund would have to face FC Arsenal, which is not possible due to restriction 2 (same group). So now we are done 😉 The last draws are just pro forma. Nothing changes, as one can see here (just for completeness):

UEFA6

Schalke 04 vs. Real Madrid has been selected before. Now Zenit St. Petersburg is drawn. We already know its opponent: Borussia Dortmund.

UEFA7

Now we are only left with FC Bayern and FC Arsenal. Probability is of course 100%:

UEFA8

So this is the final result.

So in the end these rather simple rules reduced the number of possible matchups quite substantially. After only three draws we were left with just 11 possibilities, after four draws only two and after five draws everything was clear.
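The shrinking number of possibilities can be reproduced with a short Python sketch. It enumerates all permutations directly instead of simulating, so it is not the R code used for this post; the association codes are assumptions, and pairs are written runner-up first.

```python
from collections import Counter
from itertools import permutations

# (abbreviation, association) in group order A..H; association codes assumed.
WINNERS = [("MAN", "ENG"), ("REA", "ESP"), ("PSG", "FRA"), ("BAY", "GER"),
           ("CHL", "ENG"), ("BVB", "GER"), ("ATL", "ESP"), ("BAR", "ESP")]
RUNNERS_UP = [("B04", "GER"), ("GAL", "TUR"), ("PIR", "GRE"), ("MAC", "ENG"),
              ("S04", "GER"), ("FCA", "ENG"), ("ZEN", "RUS"), ("ACM", "ITA")]
W = {name: i for i, (name, _) in enumerate(WINNERS)}
R = {name: j for j, (name, _) in enumerate(RUNNERS_UP)}

def is_valid(draw):
    """draw[i] is the runner-up index for winner i; check rules 2 and 3."""
    return all(j != i and WINNERS[i][1] != RUNNERS_UP[j][1]
               for i, j in enumerate(draw))

ALL_DRAWS = [p for p in permutations(range(8)) if is_valid(p)]

def remaining(fixed):
    """Valid drawings that contain every (runner-up, winner) pair in fixed."""
    return [d for d in ALL_DRAWS if all(d[W[w]] == R[r] for r, w in fixed)]

# The first five matchups in the order they were drawn.
drawn = [("MAC", "BAR"), ("PIR", "MAN"), ("ACM", "ATL"),
         ("B04", "PSG"), ("GAL", "CHL")]

# Number of possible drawings before the first draw and after each draw.
counts = [len(remaining(drawn[:k])) for k in range(len(drawn) + 1)]
print(counts)

# Conditional match counts after the third draw.
after_three = remaining(drawn[:3])
cond = Counter((RUNNERS_UP[j][0], WINNERS[i][0])
               for d in after_three for i, j in enumerate(d))
for (runner_up, winner), n in sorted(cond.items()):
    print(f"{runner_up}-{winner}: {n}/{len(after_three)}")
```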

In the end it is rather hard to grasp the complete logic behind all the probabilities, since there are many possibilities and also a lot of asymmetric restrictions. The only easy rule of thumb is probably this: if two teams that are subject to more than restrictions 1 and 2 (i.e. that also have opponents excluded by restriction 3) get drawn against each other, this leads to fewer restrictions for the other matches, allowing more possibilities in the end, which in turn makes this matchup (ex ante) more likely. A draw of MAC-BVB in the first draw, for example, leaves 1077 possible drawings, whereas the draw of MAC-BAR left only 605. Or the other way round, as seen in draws 4 and 5: if none of these likely matchups (first picture) happen at the beginning, they must happen in the end (FCA-BAY).
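The rule of thumb can be illustrated by counting, for every opponent Manchester City is allowed to get, how many complete drawings contain that matchup. Again this is a Python sketch with assumed association codes, not the original R code.

```python
from itertools import permutations

# (abbreviation, association) in group order A..H; association codes assumed.
WINNERS = [("MAN", "ENG"), ("REA", "ESP"), ("PSG", "FRA"), ("BAY", "GER"),
           ("CHL", "ENG"), ("BVB", "GER"), ("ATL", "ESP"), ("BAR", "ESP")]
RUNNERS_UP = [("B04", "GER"), ("GAL", "TUR"), ("PIR", "GRE"), ("MAC", "ENG"),
              ("S04", "GER"), ("FCA", "ENG"), ("ZEN", "RUS"), ("ACM", "ITA")]
W = {name: i for i, (name, _) in enumerate(WINNERS)}
R = {name: j for j, (name, _) in enumerate(RUNNERS_UP)}

def rules_allow(i, j):
    """Rules 2 and 3 for winner index i and runner-up index j."""
    return i != j and WINNERS[i][1] != RUNNERS_UP[j][1]

# All complete drawings; d[i] is the runner-up index assigned to winner i.
ALL_DRAWS = [p for p in permutations(range(8))
             if all(rules_allow(i, j) for i, j in enumerate(p))]

# For each allowed opponent of MAC: drawings that contain this matchup.
first_draw_counts = {
    w: sum(1 for d in ALL_DRAWS if d[W[w]] == R["MAC"])
    for w, _ in WINNERS if rules_allow(W[w], R["MAC"])
}

for winner, n in first_draw_counts.items():
    share = 100 * n / len(ALL_DRAWS)
    print(f"MAC-{winner}: {n} drawings ({share:.1f}%)")
```

The matchup against the Bundesliga side leaves clearly more drawings open than any of the others, which is exactly why it is the most likely one ex ante.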

Code for the simulations and plots can be found here: http://pastebin.com/wngGBkTZ. R 3.0.2 with the packages data.table, plyr and ggplot2 was used. A .csv file with all possible drawings can be found here: http://pastebin.com/DGVcj0E2.