Research Seminar: Challenges in Annotating Datasets to Quantify Bias

Gill delivered a presentation at the ARC Training Centre in Data Analytics for Resources and Environments (DARE), The University of Sydney, where she shared insights from her research on quantifying bias in language models, with the ultimate goal of debiasing them. This work is a significant part of our ongoing ethical computing project.

Link to the recording: https://youtu.be/nvKOC6jShxw

Abstract:

Recent advances in artificial intelligence, including the development of highly sophisticated large language models (LLMs), have proven beneficial in many real-world applications. However, evidence of inherent bias encoded in these LLMs has raised concerns about equity. In response, there has been an increase in research dealing with bias, including studies focusing on quantifying bias and developing debiasing techniques. Benchmark bias datasets have also been developed for binary gender classification and ethical/racial considerations, focusing predominantly on American demographics. However, there is minimal research in understanding and quantifying bias related to under-represented societies. Motivated by the lack of annotated datasets for quantifying bias in under-represented societies, we endeavoured to create benchmark datasets for the New Zealand (NZ) population. We faced many challenges in this process, despite the availability of three annotators. This research outlines the manual annotation process, provides an overview of the challenges we encountered and lessons learnt, and presents recommendations for future research.

Professor Gillian Dobbie: https://profiles.auckland.ac.nz/g-dobbie
