Please note: This PhD defence will take place in DC 3317 and online.
Yiwei Lu, PhD candidate
David R. Cheriton School of Computer Science
Supervisors: Professors Yaoliang Yu and Sun Sun
Recent advances in machine learning (ML) have been driven by models trained on vast datasets collected from the internet. While this approach has led to remarkable capabilities, it also introduces critical vulnerabilities: training data collected in the wild can be untrustworthy, as it may contain harmful content or be manipulated through data poisoning attacks.
This talk examines the impact of untrusted training data through two crucial lenses: the ML developer’s perspective, focusing on model integrity, and the data owner’s perspective, addressing privacy and copyright concerns. I will present theoretical frameworks for analyzing these challenges and empirical evidence drawn from various ML pipelines. I will also introduce possible countermeasures against data poisoning attacks, offering practical solutions for building more robust ML systems.
To attend this PhD defence in person, please go to DC 3317. You can also attend virtually on Zoom.