Statistics in Football

This page details the following:

(1) – Collecting Football Data
(2) – Analysing Football Data
(3) – Conveying Football Data

Collecting Football Data

At present, football data is not collected by one unified body. Instead, multiple companies collect and produce their own data, selling it to football clubs, organisations, independent agencies, journalists, freelance writers and more.

Immediately, there is a problem of consistency, because these companies define actions in different ways. For example, the definition of a dribble according to WyScout is different to the definition proposed by Opta, so the two inevitably produce different numbers for that metric. Likewise, InStat views pressing as an action that involves a minimum of two players, whereas other companies do not. InStat will therefore report fewer presses than companies that begin counting as soon as one player begins to press.
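To make the pressing example concrete, here is a minimal sketch of how the same match can yield two different press counts. The event list and the function name count_presses are hypothetical and invented for illustration; they do not come from InStat or any other provider.

```python
# Hypothetical data: each number is how many players took part in one pressing action.
press_events = [1, 2, 1, 3, 1, 2]


def count_presses(events, min_players):
    """Count the actions that meet a provider's minimum number of pressing players."""
    return sum(1 for players in events if players >= min_players)


# A provider that counts a press as soon as one player presses.
one_player_rule = count_presses(press_events, min_players=1)

# A provider that requires at least two players (the definition attributed to InStat above).
two_player_rule = count_presses(press_events, min_players=2)

print(one_player_rule, two_player_rule)  # 6 3
```

The match is identical in both cases; only the definition changes, and the reported number halves.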

As a result, analysts constantly debate statistics, while much of that debate actually revolves around the companies' definitions and not the players under discussion. This only creates further discord amongst the football community.

Lastly, many companies sell their data at extortionate rates. For example, Opta sells one league’s worth of data for a total of £10,000 and this only covers one season. Consequently, up-and-coming analysts frequently struggle to obtain data due to their lack of funds and, thus, have to rely on public, though basic, sites such as WhoScored and Understat.

In conclusion, data collection will remain a problem as long as football does not turn to its main body, FIFA, which, in reality, should take full responsibility for producing footballing data. This would greatly enhance the sport as we know it.

Analysing Football Data

Another issue arises in analytics when one discusses not just what to analyse but how to analyse it. For example, if a player creates five chances in one game this would be viewed as a considerable number. Another player may only create two chances and will not be viewed as favourably. However, the player who created two chances may have created two high-quality chances, while the first player's five may all have been of low quality.

For example, the second player may have an xA of 0.35 per chance created, which would give them a total of 0.70 xA in a game. The first player may have an xA of 0.05 per chance and thus ends with 0.25 xA from five chances. So the first player created more chances but the second player created better chances. Now, the question is simple: are you more likely to score from a chance which has a 35% probability of being scored or a chance that has a 5% probability of being scored? Obviously, the answer is the former. Therefore, the second player, who only created two chances, is more valuable in their creativity.
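The arithmetic above can be checked with a short sketch. The per-chance xA values are the ones used in the example; the variable and function names are hypothetical, and treating every chance from one player as having the same xA is a simplifying assumption.

```python
# Per-chance xA values from the worked example (assumed identical for each chance).
first_player_chances = [0.05] * 5   # five low-quality chances
second_player_chances = [0.35] * 2  # two high-quality chances


def total_xa(chances):
    """Sum the xA of every chance created, rounded to two decimals to hide float noise."""
    return round(sum(chances), 2)


def average_xa(chances):
    """Average xA per chance created, i.e. the typical quality of a chance."""
    return round(sum(chances) / len(chances), 2)


for name, chances in [("First player", first_player_chances),
                      ("Second player", second_player_chances)]:
    print(name, len(chances), "chances,", total_xa(chances), "total xA,",
          average_xa(chances), "xA per chance")
```

Counting chances alone ranks the first player higher (five against two), whereas total xA ranks the second player higher (0.70 against 0.25).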

Despite this, a lot of analysis in the football community is either too advanced and, thus, cannot be understood by the normal fan, or is so simplified that it loses its meaning. It is incumbent upon analysts to develop their craft and not be hasty in entering the world of analytics. Honing one’s craft is done through watching multiple games, being perceptive, focusing on the more hidden aspects of what you see (e.g. a player’s movement as opposed to their through pass or shot) and analysing how important one thinks that was to the event that unfolded (e.g. a cross, shot, tackle, etc). Then, one should consult those whom they trust and seek further opinion. This is how we develop as people.

Ultimately, football data and football analysis are in their teething phase and it will take years before they reach any real maturity.

Conveying Football Data

This is the part most people are familiar with, as it is the part they see most. The graph. The chart. The table. The list. How should data be conveyed and portrayed? The answer comes down to three simple questions:

(1) – Does one’s audience understand the fundamental point?
(2) – Will they need an explanation?
(3) – Is the information easy to read?

These are the principal points when conveying any type of information. A person should speak to the audience in a way they understand. They should not seek to display their perceived knowledge to the detriment of the receiver. If the one to whom you are speaking cannot understand you then you have failed in your communication.

Thus, I always try to explain posts, graphs, tables, etc., and if someone struggles to understand them I will endeavour to clarify further. This is what a good educator does. Far too often, football analysts have produced graphs and charts that are too difficult to grasp, even for other analysts. Rather, one should read the audience’s level before communicating with them.