AI Safety & Alignment
- Statistical Estimation of Adversarial Risk in LLMs under Best-of-N Sampling (2026)
- SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks, ICLR 2026
- FlowRL: Matching Reward Distributions for LLM Reasoning, ICLR 2026
- The Illusion of Readiness in Health AI, Nature Medicine (under review, 2026)
- Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities, NAACL Oral 2025
- Adversarial Training for Large Neural Language Models (2020)
Efficient & Scalable AI
- Training Large Reasoning Models Efficiently via Progressive Thought Encoding, ICLR 2026
- Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts, ACL 2025
- Efficient Long Sequence Modeling via State Space Augmented Transformer, COLM 2024
- Bridging Discrete and Backpropagation: Straight-Through and Beyond, NeurIPS Oral 2023
Foundation Model Innovation
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (2024)
- Fast-ELECTRA for Efficient Pre-training, ICLR 2024
- METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models (2022)
- DeBERTa: Decoding-Enhanced BERT with Disentangled Attention, ICLR 2021
- Domain-Specific Language Model Pretraining for Biomedical NLP, ACM Transactions on Computing for Healthcare 2021 — Best Paper Award
- UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training, ICML 2020
- Unified Language Model Pre-training for Natural Language Understanding and Generation, NeurIPS 2019
- Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding (2019)
- Multi-Task Deep Neural Networks for Natural Language Understanding, ACL 2019 — First to outperform BERT on GLUE