
Hi! 👋

My name is Sean

I'm a PhD student in Natural Language Processing (NLP) at Durham University specialising in AI for public health research.


Natural Language Processing

Expert in transformer models, Retrieval Augmented Generation (RAG) systems, topic modelling, explainability, and multi-modal architectures using PyTorch and Hugging Face. Specialised in low-resource training and domain adaptation.

Data Analysis and Statistical Modelling

Proficient in Python, R, and SQL. Experienced in scalable pipeline design for low-resource environments, analysing datasets of 10M+ records, Git-based version control, and advanced statistical modelling.

Data Visualisation

Adept at creating clear, informative, and publication-quality visualisations using Matplotlib, Seaborn, ggplot2, and Plotly. Skilled in website creation for serving models and presenting visualisations in an interactive manner.

Teaching and Mentoring

Experienced in teaching NLP to master's-level students, adapting methods for varied skill levels. Currently mentoring a master's student on an epidemiology-based project. Pursuing an Associate Fellowship in Higher Education to refine my teaching practice.

Communication and Presentations

Experienced in delivering technical presentations to audiences with limited prior knowledge. Capable of tailoring complex concepts to both expert and lay audiences, ensuring clarity and engagement. Winner of best talk awards.

Ethical and Scientific Writing

Published in peer-reviewed journals. Skilled at structuring clear and concise papers, emphasising reproducibility and scientific rigour. Experienced in writing ethics proposals.

Education

PhD Computer Science (Natural Language Processing)

October 2021 - Present

Durham University

Funded by the Biotechnology and Biological Sciences Research Council (BBSRC) to explore Natural Language Processing and deep learning methods for analysing over 10 million first-opinion veterinary electronic health records (EHRs) from across the UK, in order to understand critical public health matters. Key achievements include:

  • PetBERT Development: Designed a novel domain-specific transformer language model.
  • Public Health Research: Conducted studies on critical public health issues such as disease outbreak detection, socioeconomic disparities in premature mortality, and patterns of antimicrobial usage in alignment with stewardship guidelines.
  • Open Benchmark Dataset: Established the first benchmark dataset for veterinary free-text EHRs, setting a new standard for research in this field.
  • Multimodal and Explainable AI: Integrated free text and structured data with explainability tools to provide actionable insights into model decision-making processes.
  • Multinational Collaboration: Coordinated a multinational study within the EU ENOVAT project, identifying barriers for antimicrobial stewardship guideline adoption.
  • International Protocol Leadership: Leading the creation of global protocols for anonymisation, sharing of veterinary free-text EHRs and language models to promote ethical data use and collaboration.
  • Impact: My research bridges computational innovation and public health applications, advocating for open science principles and setting new standards for veterinary data and model sharing.

    Thesis: Natural Language Processing for Early Detection and Mitigation of Critical Public Health Threats

    BSc (Hons) Biomedical Sciences, 2:1

    October 2018 - July 2021

    University of Kent

    Proficient in applying interdisciplinary knowledge across biology and healthcare to analyse and address health-related challenges. Experienced in laboratory methods in genetics, microbiology, biochemistry, and immunology.

    Thesis: Antimicrobial usage in hospitalised SARS-CoV-2 patients and the impact on the Gut Microbiome

    Publications

    First Authorships

    | Title | Authors | Venue | Date |
    |---|---|---|---|
    | Premature mortality analysis of 52,000 deceased cats and dogs exposes socioeconomic disparities. | Farrell, S., Anderson, K., Noble, P.-J.M. and Al Moubayed, N. | Scientific Reports | 20/09/2024 |
    | Explainable text-tabular models for predicting mortality risk in companion animals. | Farrell, S.*, Burton, J.*, Noble, P.-J.M. and Al Moubayed, N. | Scientific Reports | 20/06/2024 |
    | PetBERT: automated ICD-11 syndromic disease coding for outbreak detection in first opinion veterinary electronic health records. | Farrell, S., Appleton, C., Noble, P.-J.M. and Al Moubayed, N. | Scientific Reports | 21/10/2023 |
    | A multinational survey of companion animal veterinary clinicians: How can antimicrobial stewardship guidelines be optimised for the target stakeholder? | Farrell, S., Bagcigil, A.F., Chaintoutis, S.C., Firth, C., Aydin, F.G., Hare, C., Maaland, M., Mateus, A., Vale, A.P., Windahl, U., Damborg, P., Timofte, D., Singleton, D.A. and Allerton, F. | The Veterinary Journal | 23/09/2023 |
    | Seasonality and other risk factors for fleas infestations in domestic dogs and cats. | Farrell, S., McGarry, J., Noble, P.-J.M., Pinchbeck, G.J., Cantwell, S., Radford, A.D. and Singleton, D.A. | Medical and Veterinary Entomology | 09/01/2023 |
    | Seasonality and risk factors for myxomatosis in pet rabbits in Great Britain. | Farrell, S., Noble, P.-J.M., Pinchbeck, G.L., Brant, B., Caravaggi, A., Singleton, D.A. and Radford, A.D. | Preventive Veterinary Medicine | 08/02/2020 |

    Co-authorships

    | Title | Authors | Venue | Date |
    |---|---|---|---|
    | Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text. | Davies, H., Nenadic, G., Alfattni, G., Arguello Casteleiro, M., Al Moubayed, N., Farrell, S., Radford, A.D. and Noble, P.-J.M. | Frontiers in Veterinary Science | 22/08/2024 |
    | Text mining for disease surveillance in veterinary clinical data: part one, the language of veterinary clinical records and searching for words. | Davies, H., Nenadic, G., Alfattni, G., Arguello Casteleiro, M., Al Moubayed, N., Farrell, S., Radford, A.D. and Noble, P.-J.M. | Frontiers in Veterinary Science | 23/01/2024 |
    | Evaluating ChatGPT text mining of clinical records for companion animal obesity monitoring. | Fins, I.S., Davies, H., Farrell, S., Torres, J.R., Pinchbeck, G., Radford, A.D. and Noble, P.-J.M. | The Veterinary Record | 06/12/2023 |
    | SARS-CoV-2 neutralising antibodies in dogs and cats in the United Kingdom. | Smith, S.L., Anderson, E.R., Cansado-Utrilla, C., Prince, T., Farrell, S., Brant, B., Smyth, S., Noble, P.-J.M., Pinchbeck, G.L., Marshall, N., Roberts, L., Hughes, G.L., Radford, A.D. and Patterson, E.I. | Current Research in Virological Science | 04/08/2021 |

    In Review

    | Title | Authors | Status |
    |---|---|---|
    | PetEVAL: A veterinary free text electronic health records benchmark for the anonymisation and identification of disease patterns. | Farrell, S., Radford, A.D., Noble, P.-J.M. and Al Moubayed, N. | In Review |
    | Automated Disease Classification of Veterinary Clinical Narratives for Antimicrobial Stewardship Guideline Monitoring. | Farrell, S., Singleton, D.A., Radford, A.D., Pinchbeck, G., Noble, P.-J.M. and Al Moubayed, N. | In Review |
    | Comprehensive representation of health-related phenotypes in one million dogs using topic modelling of electronic health records. | Noble, P.-J.M., Farrell, S., Al Moubayed, N. and Radford, A.D. | In Review |

    *Equal Contribution

    Conferences and Presentations

    | Title | Venue | Location | Date | Type |
    |---|---|---|---|---|
    | Syndromic Disease Surveillance and Multi-label Classifiers for Antimicrobial Usage Assessment | Association for Veterinary Informatics Talbot Veterinary Informatics Symposium | Virginia Tech, USA | 1/06/2024 | Oral |
    | PetBERT: Applications in Veterinary Syndromic Disease Surveillance | Symposium on Artificial Intelligence in Veterinary Medicine | Cornell University, USA | 1/06/2024 | Oral |
    | Where are all the antimicrobials being used? LLMs for monitoring adherence to antimicrobial stewardship guidelines in veterinary practices | HealTAC Annual Conference | Lancaster University, UK | 1/06/2024 | Poster |
    | Disease Outbreak Detection Using Large Language Models | Annual BBSRC NLD DTP Conference | Durham University, UK | 1/06/2024 | Oral |
    | Survey Results on what do clinicians want from their antimicrobial stewardship guidelines | European Network for the Optimisation of Veterinary Antimicrobial Therapy (ENOVAT) Meeting | University of Copenhagen, Denmark | 1/06/2024 | Oral |
    | Syndromic Disease Classification of Veterinary EHR Notes for Disease Outbreak Detection | HealTAC Annual Conference | University of Manchester, UK | 1/06/2024 | Poster |
    | Predictive Power of Large Language Models in Determining Mortality Risks in Companion Animals | Annual BBSRC NLD DTP Conference | Durham University, UK | 1/06/2024 | Oral |
    | Syndromic Surveillance for Understanding Antimicrobial Usage in the veterinary community | Medical Research Foundation National PhD Conference | University of Bristol, UK | 1/06/2024 | Poster |
    | Current Status of multinational survey of companion animal veterinary clinicians: How can antimicrobial stewardship guidelines be optimised for the target stakeholder? | European Network for the Optimisation of Veterinary Antimicrobial Therapy (ENOVAT) Meeting | Aristotle University of Thessaloniki, Greece | 1/06/2024 | Oral |

    Employment

    Data Science Intern @ Evergreen Life

    June 2024 - September 2024

  • Developed a Retrieval Augmented Generation pipeline using a generative large language model (LLM) to deliver personalised healthcare advice, ensuring information safety and accuracy by aligning outputs with the Evergreen Life article repository.
  • Designed an algorithm-driven content recommendation system, tailoring advice to individual user needs.
  • Impact: My work is being integrated into an app serving over 1 million NHS patients, enhancing its capability to provide personalised, reliable healthcare guidance.

    Natural Language Processing Demonstrator @ Durham University

    January 2022 - Present

  • Delivered a comprehensive NLP curriculum, from foundational machine learning to advanced deep learning and transformers, adapting technical content for diverse audiences ranging from MSc Computer Science to MBA Business Analytics students.

    Undergraduate Researcher @ University of Liverpool

    July 2020 – September 2021
  • Conducted research on geospatial and other risk factors for flea infestations as a second-year undergraduate.
  • Utilised electronic health records (EHRs) from over 34,000 animals across UK first-opinion veterinary practices.
  • Impact: Published findings in Medical and Veterinary Entomology.

    Undergraduate Researcher @ University of Liverpool

    June 2019 – January 2020
  • Conducted research on geospatial and other risk factors for myxomatosis as a first-year undergraduate.
  • First exposure to large-scale data analytics, applying statistical methodologies such as multivariate logistic modelling for risk factor analysis.
  • Secured funding to produce and distribute educational posters summarising research findings to veterinary practices across the UK.
  • Impact: Published findings in Preventive Veterinary Medicine.

    Customer Experience Supervisor @ Sainsbury's

    October 2016 – September 2021

    Teaching

    Natural Language Processing @ Durham University

    January 2021 – Present

  • Leading workshops ranging from statistical methods for natural language processing to large language model training.
  • Designed workshop materials, including guides and code.
  • Delivered to students on the taught master's Computer Science programme.

    Natural Language Analysis @ Durham University

    January 2021 – Present

  • Leading workshops ranging from statistical methods for natural language processing to large language model training.
  • Delivered to students on the Master of Business Administration (MBA) programme.
  • Students have limited prior knowledge of Python, so classes focus on understanding the code and obtaining results before grappling with the technical detail of deep learning methodologies.

    Computational Thinking @ Durham University

    December 2024 - January 2025

  • Supported the marking of essay-based undergraduate coursework.
  • Supported the development of marking rubrics.

    Introduction to Natural Language Processing @ NGSchool Machine Learning in Computational Biology

    June 2021

  • Invited to deliver a lecture on the basics of natural language processing to a group of bioscience students
  • Designed a workshop to introduce the students to the basics of NLP and how it can be applied to biological data
  • Students had limited prior knowledge of Python and machine learning practices.

    Contact Me

    Feel free to reach out for collaborations, job opportunities, or just a chat—I’d be happy to connect!

    Email Me Linkedin Me