AI Safety & Alignment
- Statistical Estimation of Adversarial Risk in LLMs under Best-of-N Sampling (2026)
- SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks, ICLR 2026
- FlowRL: Matching Reward Distributions for LLM Reasoning, ICLR 2026
- The Illusion of Readiness in Health AI, Nature Medicine (under review, 2026)
- Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities, NAACL Oral 2025
- Adversarial Training for Large Neural Language Models (2020)
Efficient & Scalable AI
- Training Large Reasoning Models Efficiently via Progressive Thought Encoding, ICLR 2026
- Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts, ACL 2025
- Efficient Long Sequence Modeling via State Space Augmented Transformer, COLM 2024
- Bridging Discrete and Backpropagation: Straight-Through and Beyond, NeurIPS Oral 2023
Foundation Model Innovation
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (2024)
- Fast-ELECTRA for Efficient Pre-training, ICLR 2024
- METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models (2022)
- DeBERTa: Decoding-Enhanced BERT with Disentangled Attention, ICLR 2021
- Domain-Specific Language Model Pretraining for Biomedical NLP, ACM Transactions on Computing for Healthcare 2021 — Best Paper Award
- UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training, ICML 2020
- Unified Language Model Pre-training for Natural Language Understanding and Generation, NeurIPS 2019
- Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding (2019)
- Multi-Task Deep Neural Networks for Natural Language Understanding, ACL 2019 — First to outperform BERT on GLUE