Identifying Anomalous Data Patterns in Victorian Government Linked Data using Machine Learning

Location: Melbourne CBD, Victoria

Duration: 5 months

Start date: ASAP

Keywords: Improving Victorian Government Client Outcomes, Predictive Analytics, Deep Machine Learning, Linked Data, Health and Human Services

Eligibility: Open to all PhD students, both domestic students and those on international student visas

Project Background

Under the auspices of the Centre of Victorian Data Linkages (CVDL) in the Victorian Department of Health and Human Services, the Victorian Government is establishing a centralised linked client database with the aim of monitoring and improving outcomes for people using its funded services. Machine learning techniques need to be developed and applied to quality-assure the linked data, as it is too large to review manually.

The data contains information at the client level (e.g. time of hospital admission, patient characteristics), which the Department of Health and Human Services has aggregated and transformed into a multivariate time-series structure. Anomaly detection should therefore be conducted at the aggregate level to uncover systematic data issues (e.g. data entry errors, missing data, external influences, administrative processes, possible policy impacts on service delivery).
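As an illustration of the aggregation step described above, the following minimal sketch turns a handful of client-level records into monthly counts per service type, giving one row per period and one column per variable. The record layout, service names and monthly granularity are assumptions made for this example only; they do not reflect the actual CVDL schema or the Department's transformation.

```python
from collections import Counter
from datetime import date

# Hypothetical client-level records: (admission date, service type).
# Field names and values are illustrative only, not the CVDL schema.
records = [
    (date(2018, 1, 3), "hospital"),
    (date(2018, 1, 17), "hospital"),
    (date(2018, 1, 20), "housing"),
    (date(2018, 2, 2), "hospital"),
    (date(2018, 2, 9), "housing"),
    (date(2018, 2, 25), "housing"),
]


def aggregate_monthly(rows):
    """Count records per (year, month) and per service type."""
    counts = Counter(((d.year, d.month), service) for d, service in rows)
    months = sorted({key[0] for key in counts})
    services = sorted({key[1] for key in counts})
    # One row per month, one column per service: a multivariate time series.
    series = {m: [counts.get((m, s), 0) for s in services] for m in months}
    return services, series


services, series = aggregate_monthly(records)
for month, row in series.items():
    print(month, dict(zip(services, row)))
```

Anomaly detection would then operate on the rows of `series` rather than on individual client records.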

Research to be Conducted

  1. Conduct an academic literature review exploring how machine learning has been used to identify anomalous aggregate time series patterns in health and human services data.
  2. Design machine learning techniques to identify anomalous aggregate time series patterns (e.g. additive outliers, transitory changes, level shifts, irregularities at the multivariate level), and implement them for operational use in Microsoft Azure.
  3. Experimentally evaluate the anomaly detection methods using synthetic and real CVDL data, and assess their accuracy (e.g. AUC, F1, recall and precision).
  4. Write an internal confidential client report.
  5. Present the key research findings to the Director, SIA Branch, and recommend future research directions.
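To make items 2 and 3 concrete, the sketch below shows one simple statistical baseline for the additive-outlier case only, scored with precision, recall and F1 on a synthetic series with known injected anomalies. It uses the modified z-score (median and median absolute deviation) with the commonly cited 3.5 cut-off. This is an assumed baseline for illustration, not the method the project will adopt; the series, the injected points and the threshold are all hypothetical.

```python
from statistics import median


def flag_additive_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread, so the score is undefined
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


def precision_recall_f1(predicted, actual):
    """Score predicted anomaly indices against the known (injected) ones."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Synthetic aggregate series: a flat level of about 100 with a small cycle.
values = [100 + (i % 5) - 2 for i in range(24)]

# Inject two labelled anomalies (assumed for illustration):
values[12] += 40  # a large spike
values[5] += 4    # a small bump that stays within the normal range
injected = [5, 12]

flagged = flag_additive_outliers(values)
precision, recall, f1 = precision_recall_f1(flagged, injected)
print(flagged, precision, recall, round(f1, 3))
```

In this synthetic run the large spike is flagged and the small bump is missed, so precision is perfect while recall is only one half. This is the kind of trade-off the evaluation in item 3 is meant to expose. Level shifts, transitory changes and multivariate irregularities need different detectors and are not covered by this sketch.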

Skills Required

We are looking for a PhD student with the following skills:

  • Keen interest in health and human service applications of machine learning.
  • Good background knowledge of time series data, and machine learning techniques.
  • Good programming skills (preferably R, Python, and SQL).
  • Able to work onsite at 50 Lonsdale St, Melbourne. The Victorian Government linked data is highly protected and can only be accessed from restricted computers at this address; remote access is not possible. Interstate candidates may therefore need to live temporarily in Victoria at their own expense.
  • Willing to be supervised by an academic team selected by the Victorian Department of Health and Human Services for the duration of the internship (co-supervision with the student’s preferred supervisor can also be discussed).
  • In the third year of their PhD or about to finish their degree, with an interest in working with industry or government to start a career.

Expected Outcomes

The PhD student will gain experience working in a major Victorian Government department on an exciting high-profile project, positioning them well in their career.

Additional Details

The intern will receive $3,000 per month of the internship, usually in the form of stipend payments.

It is expected that the intern will primarily undertake this research project during regular business hours, spending at least 80% of their time on-site with the industry partner. The intern will be expected to maintain contact with their academic mentor throughout the internship through either face-to-face or phone meetings, as appropriate.

The intern and their academic mentor will have the opportunity to negotiate the project’s scope, milestones and timeline during the project planning stage.

Applications Close

5 September 2018


INT – 0414


03 8344 1785