About Me
I joined the Computer Science (CS) department at the University of Virginia (UVA) in 2024 as a tenure-track assistant professor. Previously, I earned my Ph.D. from the University of Illinois Urbana-Champaign (UIUC), where I worked with Jiawei Han, and spent time as a visiting researcher at the Princeton NLP Group, working with Danqi Chen.
I am looking for self-motivated Ph.D. students and interns! Please fill out this form if you are interested in working with me. After completing the form, you are also welcome to reach out via email. I read all submitted forms and emails, but I apologize in advance for not being able to respond to each one!
Research
I am broadly interested in natural language processing (NLP), machine learning (ML), and data mining. Currently, I am especially passionate about advances in large language models (LLMs). Here are some papers that reflect my interests:
- Training Language Models for Better Alignment and Reliability:
- [NeurIPS’24 Meng et al.] SimPO: Simple Preference Optimization with a Reference-Free Reward
- [arXiv’24 Wei et al.] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
- Large Language Models for Synthetic Data Generation:
- [ICML’23 Meng et al.] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
- [NeurIPS’22 Meng et al.] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
- Learning from Weak Supervision:
- [EMNLP’21 Meng et al.] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
- [EMNLP’20 Meng et al.] Text Classification Using Label Names Only: A Language Model Self-Training Approach
- [CIKM’18 Meng et al.] Weakly-Supervised Neural Text Classification
- In the past, I have also worked on pretraining and representation learning for NLP:
- [ICLR’24 Meng et al.] Representation Deficiency in Masked Language Modeling
- [ICLR’22 Meng et al.] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
- [NeurIPS’21 Meng et al.] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
- [NeurIPS’19 Meng et al.] Spherical Text Embedding
News
[2024-2025 Service] ICML 2024 (Area Chair), COLM 2024 (Area Chair), NeurIPS 2024 (Area Chair), ICLR 2025 (Area Chair), ACL Rolling Review (Area Chair), TMLR (Action Editor).
[2024.09] Two papers on Preference Optimization and Contrastive Decoding in MoE accepted to NeurIPS 2024!
[2024.09] Two papers on Zero-Shot Relation Extraction and LLM Persona Survey accepted to EMNLP 2024 Main Conference/Findings!
[2024.08] My Ph.D. thesis won the ACM SIGKDD 2024 Dissertation Award!
[2024.05] One paper on Language Model Reasoning on Graphs accepted to ACL 2024 Findings!
[2024.04] Received a Superalignment Fast Grant from OpenAI (UVA Press Release)!
[2024.01] Two papers on Masked Language Modeling and Language Model Evaluation accepted to ICLR 2024!
[2023.10] One paper on Weakly Supervised Text Classification accepted to EMNLP 2023!
[2023.09] One paper on Language Models as Training Data Generators accepted to NeurIPS 2023 Datasets and Benchmarks Track!
Education
Ph.D. (2023) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Efficient and Effective Learning of Text Representations (ACM SIGKDD 2024 Dissertation Award)
M.S. (2019) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Weakly-Supervised Text Classification
B.S. (2017) in Computer Engineering, University of Illinois Urbana-Champaign
Graduated with Highest Honors & Bronze Tablet
Contact
Email: yumeng5[at]virginia[dot]edu
Office: Rice Hall 408, 85 Engineer’s Way, Charlottesville, Virginia 22903