Cassidy Laidlaw

I’m a fifth-year PhD student studying computer science at the University of California, Berkeley. I’m interested in human-AI cooperation, machine learning safety and robustness, and bridging the theory-practice gap in reinforcement learning. I received my BS in computer science and mathematics from the University of Maryland in 2018. My PhD is currently funded by an Open Phil AI Fellowship, and I previously received a National Defense Science and Engineering Graduate (NDSEG) Fellowship.


Publications and Preprints

More information is also available in my Google Scholar profile.

Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking

Cassidy Laidlaw, Shivam Singhal, and Anca Dragan. arXiv Preprint 2024.

The Effective Horizon Explains Deep RL Performance in Stochastic Environments

Cassidy Laidlaw, Banghua Zhu, Stuart Russell, and Anca Dragan. ICLR 2024.

Spotlight (given to ~16% of accepted papers)

Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF

Anand Siththaranjan*, Cassidy Laidlaw*, and Dylan Hadfield-Menell. ICLR 2024.

Best paper honorable mention at the 2023 NeurIPS Workshop on Instruction Tuning and Instruction Following

Bridging RL Theory and Practice with the Effective Horizon

Cassidy Laidlaw, Stuart Russell, and Anca Dragan. NeurIPS 2023.

Oral (given to ~2% of accepted papers)

Best paper award at the 2023 ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems

The Boltzmann Policy Distribution: Accounting for Systematic Suboptimality in Human Models

Cassidy Laidlaw and Anca Dragan. ICLR 2022.

Uncertain Decisions Facilitate Better Preference Learning

Cassidy Laidlaw and Stuart Russell. NeurIPS 2021.

Spotlight (given to ~12% of accepted papers)

Perceptual Adversarial Robustness: Defense Against Unseen Threat Models

Cassidy Laidlaw, Sahil Singla, and Soheil Feizi. ICLR 2021.

Functional Adversarial Attacks

Cassidy Laidlaw and Soheil Feizi. NeurIPS 2019.

Capture, Learning, and Synthesis of 3D Speaking Styles

Daniel Cudeiro*, Timo Bolkart*, Cassidy Laidlaw, Anurag Ranjan, and Michael Black. CVPR 2019.

Playing it Safe: Adversarial Robustness with an Abstain Option

Cassidy Laidlaw and Soheil Feizi. arXiv Preprint 2019.