AI Safety & Alignment

  • Statistical Estimation of Adversarial Risk in LLMs under Best-of-N Sampling (2026)
  • SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks, ICLR 2026
  • FlowRL: Matching Reward Distributions for LLM Reasoning, ICLR 2026
  • The Illusion of Readiness in Health AI, Nature Medicine (under review, 2026)
  • Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities, NAACL Oral 2025
  • Adversarial Training for Large Neural Language Models (2020)

Efficient & Scalable AI

  • Training Large Reasoning Models Efficiently via Progressive Thought Encoding, ICLR 2026
  • Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts, ACL 2025
  • Efficient Long Sequence Modeling via State Space Augmented Transformer, COLM 2024
  • Bridging Discrete and Backpropagation: Straight-Through and Beyond, NeurIPS Oral 2023

Foundation Model Innovation

  • Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (2024)
  • Fast-ELECTRA for Efficient Pre-training, ICLR 2024
  • METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models (2022)
  • DeBERTa: Decoding-Enhanced BERT with Disentangled Attention, ICLR 2021
  • Domain-Specific Language Model Pretraining for Biomedical NLP, ACM TCHC 2021 — Best Paper Award
  • UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training, ICML 2020
  • Unified Language Model Pre-training for Natural Language Understanding and Generation, NeurIPS 2019
  • Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding (2019)
  • Multi-Task Deep Neural Networks for Natural Language Understanding, ACL 2019 — First to outperform BERT on GLUE