Machine Learning and Artificial Intelligence Models for the Automation of Data Collection in AFL
Location: Melbourne, VIC
Duration: 5 months
Champion Data has been the official data collection partner of the AFL since 1999. In the 20 years since, the depth and breadth of data available from an AFL match have grown steadily. The demands on the accuracy, consistency, and timeliness of captured data have increased with the importance that media and industry partners place on analysing this data. With the breadth of data collected increasing significantly, maintaining this accuracy and consistency is now a major resource cost to the business. With the available data set, automating much of this process should be achievable.
Research to be Conducted
By combining player location data with event data collected manually during an AFL match (kicks, tackles, hitouts, etc.), Champion Data seeks to identify potential errors in the identification of players or the classification of recorded actions. Some examples of data that can be used to detect potential errors are:
- The distance between two rucks at a ruck contest
- The distance between a tackler and the player in possession
- The recorded location of events relative to the location of the tagged player
- The recorded pressure level on a disposal relative to location and trajectories of opposition players
- The recorded intent of a kick relative to the location and trajectories of teammates
- The recorded direction of a kick relative to the start and end point of the kick
As the observed error rate of these data points approaches zero, the same classification methods used to detect errors could also be used to automate the collection of these data points.
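As an illustration of the distance-based checks listed above, the following is a minimal sketch in Python of how a recorded tackle might be flagged for review when the tracked positions of the tackler and the player in possession are implausibly far apart. The class names, field names, and the 3-metre threshold are assumptions made for this example only; they do not describe Champion Data's actual data model or tolerances.

```python
import math
from dataclasses import dataclass


@dataclass
class Position:
    """Player location on the ground in metres (hypothetical coordinate system)."""
    x: float
    y: float


@dataclass
class TackleEvent:
    """A manually recorded tackle joined to tracked player positions (hypothetical schema)."""
    event_id: int
    tackler: Position
    ball_carrier: Position


# Assumed tolerance for illustration; a real value would be tuned on labelled data.
MAX_TACKLE_DISTANCE_M = 3.0


def distance(a: Position, b: Position) -> float:
    """Euclidean distance between two positions, in metres."""
    return math.hypot(a.x - b.x, a.y - b.y)


def flag_suspect_tackles(events, max_distance=MAX_TACKLE_DISTANCE_M):
    """Return the IDs of tackles whose two players were further apart than max_distance."""
    return [
        e.event_id
        for e in events
        if distance(e.tackler, e.ball_carrier) > max_distance
    ]


# Made-up example data: event 1 is plausible, event 2 has the players 50 m apart.
events = [
    TackleEvent(1, Position(10.0, 20.0), Position(11.0, 20.5)),
    TackleEvent(2, Position(10.0, 20.0), Position(40.0, 60.0)),
]
suspect = flag_suspect_tackles(events)
```

A fixed threshold is the simplest possible rule. The classification methods the project envisages would more likely learn such tolerances from the data, and a flagged event may indicate either a misidentified player or a misclassified action.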
We strongly encourage women, Indigenous and disadvantaged candidates to apply. If you’re a PhD student and have some or all of the skills below, we want to hear from you:
- Programming (Python preferred)
- Machine learning
- Data science literacy
- Knowledge of AFL preferred
Champion Data expects that the research will culminate in a report outlining the process of automated quality assurance on collected data. It expects a supporting codebase that is fully developed and ready to be used during the capture of a live sporting event. The codebase will fit into the Champion Data ecosystem of data capture, storage and delivery. Further, it expects guidance for extensions into future research, including but not limited to the automated collection of the data sets.
The intern will receive $3,000 per month of the internship, usually in the form of stipend payments.
It is expected that the intern will primarily undertake this research project during regular business hours, spending at least 80% of their time on-site with the industry partner. The intern will be expected to maintain contact with their academic mentor throughout the internship through either face-to-face or phone meetings, as appropriate.
The intern and their academic mentor will have the opportunity to negotiate the project’s scope, milestones and timeline during the project planning stage.
5 February 2020
APR – 1100