Niloofar Mireshghallah

I am a Research Scientist at Meta AI’s FAIR Alignment group in San Francisco. Beginning Fall 2026, I will join Carnegie Mellon University’s Engineering & Public Policy (EPP) Department and Language Technologies Institute (LTI) as an Assistant Professor.

My research interests are privacy, natural language processing, and the societal implications of ML. I explore the interplay between data, its influence on models, and the expectations of the people who regulate and use these models. My work has been recognized by the NCWIT Collegiate Award and the Rising Star in Adversarial ML Award.

Previously, I was a postdoctoral scholar at the University of Washington, advised by Yejin Choi and Yulia Tsvetkov. I received my PhD from UC San Diego, advised by Taylor Berg-Kirkpatrick. During that time I was also a part-time researcher and intern at Microsoft Research, working with the Privacy in AI, Algorithms, and Semantic Machines teams on differential privacy, model compression, and data synthesis.

Recruiting & collaborations: If you are interested in working with me, please fill out this brief form.

✦ Explanation about my name: I used to publish under Fatemeh, which is my legal name, but I now go by Niloofar, the Lily flower in Farsi!

✦ My academic job-market material (Fall 2024): Research statement · Teaching statement · DEI statement · CV · Job-talk slides

News Highlights

🗺️

I will be giving a keynote at the Memorization workshop at ACL titled Emergent Misalignment Through the Lens of Non-verbatim Memorization on August 1st! View talk slides.

🗺️

I will be giving a keynote at the LLM Security workshop at ACL titled What does it mean for an AI agent to preserve privacy? on August 1st! View talk slides.

🗺️

I will be giving an in-person talk at the Stanford NLP Seminar on January 16th! View talk slides, the privacy and memorization in LLMs reading list and some of my thoughts on the future directions for privacy in LLMs.

🗺️

I will be giving an invited talk at the TrustNLP Workshop @NAACL 2025!

🗺️

I am attending NeurIPS 2024 and giving an invited keynote at the Red Teaming GenAI workshop on A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs. View talk slides and recording (jump to 04:50:00).

🗺️

I will be visiting Johns Hopkins University to give a talk on December 9th!

🎙️

I appeared on a panel at the Future of Privacy Forum - Technologist Roundtable for Policymakers: Key Issues in Privacy and AI (write-up coming soon!)

🎙️

I appeared on the Thesis Review podcast with Sean Welleck where I talked about my work on Auditing and Mitigating Safety Risks in Large Language Models.

🎙️

I wrote a blog post on "Should I do a postdoc?" based on my experience - check out the blog post and video with Sasha Rush!

🎙️

I gave an invited keynote talk at the SRI International C3E workshop hosted by SRI and NSA. View talk slides.

📰

I was interviewed by UW News about OpenAI's o1 update and advances in math and reasoning. Read the interview.

📰

I was interviewed by the Washington Post on Google's AI image generator controversy and disclosure of personal information in conversations with ChatGPT.

Selected Publications

For the full list, please refer to my Google Scholar page.

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

ICLR 2025

X. Lu, M. Sclar, S. Hallinan, N. Mireshghallah, J. Liu, S. Han, A. Ettinger, L. Jiang, K. Chandu, N. Dziri, Y. Choi
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

NAACL 2025

A. Kassem*, O. Mahmoud*, N. Mireshghallah*, H. Kim, Y. Tsvetkov, Y. Choi, S. Saad, S. Rana
Differentially Private Learning Needs Better Model Initialization and Self-Distillation

NAACL 2025

I. Ngong, J. Near, N. Mireshghallah
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

NeurIPS Safe Generative AI Workshop 2024

R. Xin*, N. Mireshghallah*, S. S. Li, M. Duan, H. Kim, Y. Choi, Y. Tsvetkov, S. Oh, P. W. Koh
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

NeurIPS 2024

L. Jiang, K. Rao, S. Han, A. Ettinger, F. Brahman, S. Kumar, N. Mireshghallah, X. Lu, M. Sap, Y. Choi, N. Dziri
CopyBench: Measuring literal and non-literal reproduction of copyright-protected text in language model generation

EMNLP 2024

T. Chen, N. Mireshghallah*, A. Asai*, S. Min, J. Grimmelmann, Y. Choi, H. Hajishirzi, L. Zettlemoyer, P. W. Koh
Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

COLM 2024

N. Mireshghallah*, M. Antoniak*, Y. More*, Y. Choi, G. Farnadi
Do membership inference attacks work on large language models?

COLM 2024

M. Duan, A. Suri, N. Mireshghallah, S. Min, W. Shi, L. Zettlemoyer, Y. Tsvetkov, Y. Choi, D. Evans, H. Hajishirzi
Machine Unlearning Doesn't Do What You Think

Extended Abstract at GenLaw 2024

K. Lee, A. F. Cooper, C. A. Choquette-Choo, K. Liu, M. Jagielski, N. Mireshghallah, L. Ahmed, J. Grimmelmann, D. Bau, C. De Sa, et al.
A Roadmap to Pluralistic Alignment

ICML 2024

T. Sorensen, J. Moore, J. Fisher, M. Gordon, N. Mireshghallah, C. M. Rytting, A. Ye, L. Jiang, X. Lu, N. Dziri, T. Althoff, Y. Choi
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

ICLR 2024

N. Mireshghallah*, H. Kim*, X. Zhou, Y. Tsvetkov, M. Sap, R. Shokri, Y. Choi
Privacy-preserving in-context learning with differentially private few-shot generation

ICLR 2024

X. Tang, R. Shin, H. A. Inan, A. Manoel, N. Mireshghallah, Z. Lin, S. Gopi, J. Kulkarni, R. Sim
Smaller Language Models are Better Black-box Machine-Generated Text Detectors

EACL 2024

N. Mireshghallah, J. Mattern, S. Gao, R. Shokri, T. Berg-Kirkpatrick
Privacy-Preserving Domain Adaptation of Semantic Parsers

ACL 2023

N. Mireshghallah, R. Shin, Y. Su, T. Hashimoto, J. Eisner
Non-Parametric Temporal Adaptation for Social Media Topic Classification

EMNLP 2023

N. Mireshghallah*, N. Vogler*, J. He, O. Florez, A. El-Kishky, T. Berg-Kirkpatrick
A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation

CoNLL 2023

J. Forristal, N. Mireshghallah, G. Durrett, T. Berg-Kirkpatrick
Membership Inference Attacks against Language Models via Neighbourhood Comparison

ACL 2023

J. Mattern, N. Mireshghallah, Z. Jin, B. Schölkopf, M. Sachan, T. Berg-Kirkpatrick
Differentially Private Model Compression

NeurIPS 2022

N. Mireshghallah, A. Backurs, H. A. Inan, L. Wutschitz, J. Kulkarni
Memorization in NLP Fine-tuning Methods

EMNLP 2022

N. Mireshghallah, A. Uniyal, T. Wang, D. Evans, T. Berg-Kirkpatrick
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks

EMNLP 2022

N. Mireshghallah, K. Goyal, A. Uniyal, T. Berg-Kirkpatrick, R. Shokri
UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis

NAACL 2022

N. Mireshghallah, V. Shrivastava, M. Shokouhi, T. Berg-Kirkpatrick, R. Sim, D. Dimitriadis
What Does it Mean for a Language Model to Preserve Privacy?

FAccT 2022

H. Brown, K. Lee, N. Mireshghallah, R. Shokri, F. Tramèr
Mix and Match: Learning-free Controllable Text Generation

ACL 2022

N. Mireshghallah, K. Goyal, T. Berg-Kirkpatrick
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness

EMNLP 2021

N. Mireshghallah, T. Berg-Kirkpatrick
Privacy Regularization: Joint Privacy-Utility Optimization in Language Models

NAACL 2021

N. Mireshghallah, H. Inan, M. Hasegawa, V. Rühle, T. Berg-Kirkpatrick, R. Sim
Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks

ICML 2020

A. Elthakeb, P. Pilligundla, N. Mireshghallah, A. Cloninger, H. Esmaeilzadeh
Not All Features Are Equal: Discovering Essential Features for Preserving Prediction Privacy

WWW 2021

N. Mireshghallah, M. Taram, A. Jalali, A. T. Elthakeb, D. Tullsen, H. Esmaeilzadeh
Shredder: Learning Noise Distributions to Protect Inference Privacy

ASPLOS 2020

N. Mireshghallah, M. Taram, A. Jalali, D. Tullsen, H. Esmaeilzadeh

Invited Talks

Stanford University (NLP Seminar)

NLP Seminar, Jan. 2025

Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI

Slides | Reading List
Fifth Workshop on Trustworthy Natural Language Processing @NAACL 2025 (TrustNLP)

Workshop, May 2025
University of California, Los Angeles

Guest lecture for CS 269 - Computational Ethics, LLMs and the Future of NLP, Jan. 2025

Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI

Slides
NeurIPS Conference (Red Teaming GenAI workshop)

Red Teaming GenAI workshop, Dec. 2024

A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs

Slides | Recording (jump to 04:50:00)
NeurIPS Conference (PrivacyML Tutorial)

Panelist, Dec. 2024

PrivacyML: Meaningful Privacy-Preserving Machine Learning tutorial

Recording (jump to 01:52:00)
Johns Hopkins University

CS Department Seminar, Dec. 2024

Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI

Slides
Future of Privacy Forum

Panelist, Nov. 2024

Technologist Roundtable for Policymakers: Key Issues in Privacy and AI
University of Utah

Guest lecture for the School of Computing CS 6340/5340 NLP course, Nov. 2024

Can LLMs Keep a Secret?

Slides | Recording
UMass Amherst

NLP Seminar, Oct. 2024

Membership Inference Attacks and Contextual Integrity for Language

Slides
Northeastern University

Khoury College of Computer Sciences Security Seminar, Oct. 2024

Membership Inference Attacks and Contextual Integrity for Language

Slides
Stanford Research Institute (SRI) International

Computational Cybersecurity in Compromised Environments (C3E) workshop, Sep. 2024

Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity

Slides
LinkedIn Research

Privacy Tech Talk, Sep. 2024

Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity
National Academies (NASEM)

Forum on Cyber Resilience, Aug. 2024

Oversharing with LLMs is underrated: the curious case of personal disclosures in human-LLM conversations

Slides
ML Collective

DLCT reading group, Aug. 2024

Privacy in LLMs: Understanding what data is imprinted in LMs and how it might surface!

Slides | Recording
Carnegie Mellon University

Invited Talk, Jun. 2024

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

Slides
Generative AI and Law workshop, Washington DC

Invited Talk, Apr. 2024

What is differential privacy? And what is it not?

Slides
Meta AI Research

Invited Talk, Apr. 2024

Membership Inference Attacks and Contextual Integrity for Language
Georgia Institute of Technology

Guest lecture for the School of Interactive Computing, Apr. 2024

Safety in LLMs: Privacy and Memorization
University of Washington

Guest lecture for CSE 484 and 582 courses on Computer Security and Ethics in AI, Apr. 2024

Safety in LLMs: Privacy and Memorization
Carnegie Mellon University

Guest lecture for LTI 11-830 course on Computational Ethics in NLP, Mar. 2024

Safety in LLMs: Privacy and Memorization
Simons Collaboration

TOC4Fairness Seminar, Mar. 2024

Membership Inference Attacks and Contextual Integrity for Language

Slides | Recording
University of California, Santa Barbara

NLP Seminar Invited Talk, Mar. 2024

Can LLMs Keep a Secret? Testing Privacy Implications of LLMs

Slides
University of California, Los Angeles

NLP Seminar Invited Talk, Mar. 2024

Can LLMs Keep a Secret? Testing Privacy Implications of LLMs

Slides
University of Texas at Austin

Guest lecture for LIN 393 course on Social Applications and Impact of NLP, Feb. 2024

Can LLMs Keep a Secret? Testing Privacy Implications of LLMs

Slides
Google Brain

Google Tech Talk, Feb. 2024

Can LLMs Keep a Secret? Testing Privacy Implications of LLMs

Slides | Recording
University of Washington

Allen School Colloquium, Jan. 2024

Can LLMs Keep a Secret? Testing Privacy Implications of LLMs

Slides | Recording
University of Washington

eScience Institute Seminars, Nov. 2023

Privacy Auditing and Protection in Large Language Models

Slides
CISPA Helmholtz Center for Security

Invited Talk, Sep. 2023

What does privacy-preserving NLP entail?
Max Planck Institute for Software Systems

Next 10 in AI Series, Sep. 2023

Auditing and Mitigating Safety Risks in LLMs

Slides
Mila / McGill University

Invited Talk, May 2023

Privacy Auditing and Protection in Large Language Models
EACL 2023

Tutorial co-instruction, May 2023

Private NLP: Federated Learning and Privacy Regularization

Slides | Recording
LLM Interfaces Workshop and Hackathon

Invited Talk, Apr. 2023

Learning-free Controllable Text Generation

Slides | Recording
University of Washington

Invited Talk, Apr. 2023

Auditing and Mitigating Safety Risks in Large Language Models

Slides
NDSS Conference

Keynote talk for EthiCS workshop, Feb. 2023

How much can we trust large language models?
Google

Federated Learning Seminar, Feb. 2023

Privacy Auditing and Protection in Large Language Models

Slides
University of Texas at Austin

Invited Talk, Oct. 2022

How much can we trust large language models?

Slides
Johns Hopkins University

Guest lecture for CS 601.670 course on Artificial Agents, Sep. 2022

Mix and Match: Learning-free Controllable Text Generation

Slides
KDD Conference

Adversarial ML workshop, Aug. 2022

How much can we trust large language models?

Slides | Recording
Microsoft Research Cambridge

Invited Talk, Mar. 2022

What Does it Mean for a Language Model to Preserve Privacy?

Slides
University of Maine

Guest lecture for COS435/535 course on Information Privacy Engineering, Dec. 2021

Improving Attribute Privacy and Fairness for Natural Language Processing

Slides
National University of Singapore

Invited Talk, Nov. 2021

Style Pooling: Automatic Text Style Obfuscation for Fairness

Slides
Big Science for Large Language Models

Invited Panelist, Oct. 2021

Privacy-Preserving Natural Language Processing

Recording
Research Society MIT Manipal

Cognizance Event Invited Talk, Jul. 2021

Privacy and Interpretability of DNN Inference

Slides | Recording
Alan Turing Institute

Privacy and Security in ML Seminars, Jun. 2021

Low-overhead Techniques for Privacy and Fairness of DNNs

Slides | Recording
Split Learning Workshop

Invited Talk, Mar. 2021

Shredder: Learning Noise Distributions to Protect Inference Privacy

Slides | Recording
University of Massachusetts Amherst

Machine Learning and Friends Lunch, Oct. 2020

Privacy and Fairness in DNN Inference
OpenMined Privacy Conference

Invited Talk, Sep. 2020

Privacy-Preserving Natural Language Processing

Slides | Recording
Microsoft Research AI

Breakthroughs Workshop, Sep. 2020

Private Text Generation through Regularization

Awards and Honors

🏆

Momental Foundation Mistletoe Research Fellowship (MRF) Finalist, 2023

🌟

Rising Star in Adversarial Machine Learning (AdvML) Award Winner, 2022. AdvML Workshop

🌟

Rising Stars in EECS, 2022. Event Page

🎓

UCSD CSE Excellence in Leadership and Service Award Winner, 2022

🌟

FAccT Doctoral Consortium, 2022. FAccT 2022

👩‍💻

Qualcomm Innovation Fellowship Finalist, 2021. Fellowship Page

👩‍💻

NCWIT (National Center for Women & IT) Collegiate Award Winner, 2020. NCWIT Awards

🎓

National University Entrance Exam in Math, 2014. Ranked 249th of 223,000

🎓

National University Entrance Exam in Foreign Languages, 2014. Ranked 57th of 119,000

🎓

National Organization for Exceptional Talents (NODET), 2008. Admitted, ~2% Acceptance Rate

Featured Press & Media

🎙️

Thesis Review podcast episode about my work on Auditing and Mitigating Safety Risks in Large Language Models

🎙️

"Should I do a postdoc?" guest video on Sasha Rush's channel, along with the blog post

📄

Science - AI writing is improving, but it still can't match human creativity

📄

UW News - AI researcher discusses the new version of ChatGPT's advances in math and reasoning

📄

Washington Post - How to opt out of having your data 'train' ChatGPT and other AI chatbots

📄

Washington Post - Google's weird AI answers hint at a fundamental problem

📄

Washington Post - What do people really ask chatbots?

📄

WIRED - How to Stop Your Data From Being Used to Train AI

Recent Co-organized Workshops

[For the full list, check my CV]

◆

Privacy Regulation and Protection in Machine Learning (PML @ICLR2024)

◆

Privacy-Preserving Artificial Intelligence (PPAI @AAAI2024)

◆

Generative AI + Law (GenLaw @ICML2023)

Industry Research Experience

Microsoft Semantic Machines

Fall 2022-Fall 2023 (Part-time), Summer 2022 (Intern)

Mentors: Richard Shin, Yu Su, Tatsunori Hashimoto, Jason Eisner
Microsoft Research, Algorithms Group, Redmond Lab

Winter 2022 (Intern)

Mentors: Sergey Yekhanin, Arturs Backurs
Microsoft Research, Language, Learning and Privacy Group, Redmond Lab

Summer 2021 (Intern), Summer 2020 (Intern)

Mentors: Dimitrios Dimitriadis, Robert Sim
Western Digital Co. Research and Development

Summer 2019 (Intern)

Mentor: Anand Kulkarni

Diversity, Inclusion & Mentorship

🔹

Mentor on the 'How to broadcast your research to a wider audience?' panel at the ACL Mentorship Program, 2025

🔹

Mentor for the mentorship program at the WiML event at NeurIPS 2024

🔹

D&I chair at NAACL 2025

🔹

Widening NLP (WiNLP) co-chair

🔹

Socio-cultural D&I chair at NAACL 2022

🔹

Mentor for the Graduate Women in Computing (GradWIC) at UCSD

🔹

Mentor for the UC San Diego Women Organization for Research Mentoring (WORM) in STEM

🔹

Co-leader for the "Feminist Perspectives for Machine Learning & Computer Vision" break-out session at the Women in Machine Learning (WiML) 2020 Un-workshop held at ICML 2020

🔹

Mentor for the USENIX Security 2020 Undergraduate Mentorship Program

🔹

Volunteer at the Women in Machine Learning 2019 Workshop held at NeurIPS 2019

🔹

Invited Speaker at the Women in Machine Learning and Data Science (WiMLDS) NeurIPS 2019 Meetup

🔹

Mentor for the UCSD CSE Early Research Scholars Program (CSE-ERSP) in 2018

Professional Services

[Outdated, for an updated version check my CV]

◆

Reviewer for ICLR 2022

◆

Reviewer for NeurIPS 2021

◆

Reviewer for ICML 2021

◆

Shadow PC member for IEEE Security and Privacy Conference Winter 2021

◆

Artifact Evaluation Program Committee Member for USENIX Security 2021

◆

Reviewer for ICLR 2021 Conference

◆

Program Committee member for the LatinX in AI Research Workshop at ICML 2020 (LXAI)

◆

Reviewer for the 2020 Workshop on Human Interpretability in Machine Learning (WHI) at ICML 2020

◆

Program Committee member for the MLArchSys workshop at ISCA 2020

◆

Security & Privacy Committee Member and Session Chair for Grace Hopper Celebration (GHC) 2020

◆

Reviewer for ICML 2020 Conference

◆

Artifact Evaluation Program Committee Member for ASPLOS 2020

◆

Reviewer for IEEE TC Journal

◆

Reviewer for ACM TACO Journal

Books I Like!

📚

Range: Why Generalists Triumph in a Specialized World by D. Epstein

📚

Messy: The Power of Disorder to Transform Our Lives by T. Harford

📚

Small Is Beautiful: Economics As If People Mattered by E. F. Schumacher

📚

Quarter-life by Satya Doyle Byock

📚

The Body Keeps the Score by Bessel van der Kolk

📚

36 Views of Mount Fuji by Cathy Davidson

📚

Indistractable by Nir Eyal

📚

Sapiens: A Brief History of Humankind by Yuval Noah Harari

📚

The Martian by Andy Weir

📚

The Solitaire Mystery by Jostein Gaarder

📚

The Orange Girl by Jostein Gaarder

📚

Life is Short: A Letter to St Augustine by Jostein Gaarder

📚

The Alchemist by Paulo Coelho