Biomarker Discovery Pipeline for High-Throughput Data Analysis

Engineering, IT, Mathematics and Statistics

ABOUT THE INDUSTRY PARTNER

CSL is a global plasma biotechnology company with a diverse drug portfolio, headquartered in Melbourne. The Biostatistics team within CSL Research develops innovative solutions to diverse research problems in the company’s drug development pipeline.

WHAT’S IN IT FOR YOU?

  • Gain real-world experience in Research Biostatistics in the biopharmaceutical industry.
  • Gain exposure to statistical concepts, methods, and strategies for preprocessing and analysis of high-throughput data (e.g., protein expression data).
  • Be part of a team that has real impact across a wide range of human diseases and therapeutic areas.
  • Access networking and mentorship opportunities with CSL biostatisticians and collaborators.

RESEARCH TO BE CONDUCTED

Longitudinal studies that track clinical and proteomic measurements over time provide powerful opportunities for biomarker discovery, yet analysing these complex datasets remains challenging because the available tools are few and highly technical. We’re seeking a talented intern to address this gap by developing a robust, well-documented, R-based pipeline that supports a wide range of outcomes (binary, time-to-event, continuous, and count data) while accommodating both longitudinal predictors and longitudinal outcomes.

The project will begin with an updated literature review that builds on previous work, summarising best-practice statistical methods for modelling high-dimensional and longitudinal high-throughput data, identifying gaps, and evaluating available analytical tools. Insights from this review will guide the design of an integrated workflow for data preprocessing, feature selection, model development, validation, and interpretation across diverse outcome types. The intern will implement this workflow as a clear, script-based pipeline, emphasising accessibility, modularity, and reproducibility. Finally, the pipeline will be applied to CSL case studies and/or publicly available datasets to demonstrate practical utility, document limitations, and highlight opportunities for future enhancement, ultimately enabling researchers to perform sophisticated analyses and extract meaningful biological insights more efficiently.

SKILLS WISH LIST

If you’re a postgraduate research student and meet some or all of the criteria below, we want to hear from you. We strongly encourage women, Indigenous and disadvantaged candidates to apply:

  • Bachelor’s, Honours or Master’s degree in Statistics, Biostatistics or Bioinformatics, or a PhD candidate in Statistics, Biostatistics or Bioinformatics.
  • Solid understanding of statistical principles and methods for analysing longitudinal data (e.g. longitudinal high-throughput data).
  • Extensive experience in R programming.
  • High level of verbal and written communication skills.

RESEARCH OUTCOMES

  • R-based analysis pipeline for biomarker discovery supporting:
    • Binary, time-to-event, continuous, and count outcomes
    • Cross-sectional and longitudinal predictors and outcomes
  • Modular scripts covering:
    • Data preprocessing and QC
    • Feature selection
    • Model development and validation
    • Interpretation and reporting
  • Vignette-style documentation explaining workflow, methods, and usage examples
  • Updated literature review summarising best-practice methods for high-dimensional and longitudinal biomarker analysis, with identified gaps and recommendations
  • Case study applications using CSL datasets and/or public data, providing reproducible examples that showcase performance and limitations
  • Summary of limitations and future opportunities to guide subsequent methodological or pipeline enhancements

ADDITIONAL DETAILS

The intern will receive $3,300 per month of the internship, usually in the form of scholarship payments.

It is expected that the intern will primarily undertake this research project during regular business hours and maintain contact with their academic mentor throughout the internship through either face-to-face or phone meetings, as appropriate.

The intern and their academic mentor will have the opportunity to negotiate the project’s scope, milestones and timeline during the project planning stage.

Please note, applications are reviewed regularly and this internship may be filled prior to the advertised closing date if a suitable applicant is identified. Early submissions are encouraged.

LOCATION: Melbourne, VIC
DURATION: 6 months
CLOSING DATE: 04/02/2026
ELIGIBILITY: PhD students only, both domestic & international
REF NO: APR - 2945
