About Me
I joined the Computer Science (CS) department at the University of Virginia (UVA) in 2024 as an assistant professor (tenure-track). Previously, I earned my Ph.D. from the University of Illinois Urbana-Champaign (UIUC), where I worked with Jiawei Han. I also spent time as a visiting researcher at the Princeton NLP Group, working with Danqi Chen.
I am looking for self-motivated Ph.D. students and interns! Please fill out this form if you are interested in working with me. After completing the form, you are also welcome to reach out via email. I will read all submitted forms and emails, but I apologize for not being able to respond to each of them!
Research
I am broadly interested in the fields of natural language processing (NLP), machine learning (ML), and data mining. Nowadays, I am especially passionate about the developments in large language models (LLMs). Some of my past papers are as follows:
- Pretraining and Representation Learning for NLP:
  - [ICLR’24 Meng et al.] Representation Deficiency in Masked Language Modeling
  - [ICLR’22 Meng et al.] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
  - [NeurIPS’21 Meng et al.] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
  - [NeurIPS’19 Meng et al.] Spherical Text Embedding
- Large Language Models for Few-Shot and Zero-Shot Learning:
  - [ICML’23 Meng et al.] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
  - [NeurIPS’22 Meng et al.] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
- Learning from Weak Supervision:
  - [EMNLP’21 Meng et al.] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
  - [EMNLP’20 Meng et al.] Text Classification Using Label Names Only: A Language Model Self-Training Approach
  - [CIKM’18 Meng et al.] Weakly-Supervised Neural Text Classification
News
[2024 Service] ICML 2024 (Area Chair), COLM 2024 (Area Chair), NeurIPS 2024 (Area Chair), TMLR (Action Editor).
[2024.01] Two papers on Masked Language Modeling and Language Model Evaluation have been accepted by ICLR 2024!
[2023.10] One paper on Weakly Supervised Text Classification has been accepted by EMNLP 2023!
[2023.09] One paper on Language Models as Training Data Generators has been accepted by NeurIPS 2023 Datasets and Benchmarks Track!
[2023.05] One paper on Weakly Supervised Scientific Text Classification has been accepted by KDD 2023!
[2023.05] Two papers on Language Model Pretraining on Text-Rich Network and Retrieval-Enhanced Weakly-Supervised Text Classification have been accepted by ACL 2023 Main Conference/Findings!
[2023.04] Our tutorial on Pretrained Language Representations for Text Understanding has been accepted by KDD 2023!
[2023.04] One paper on Few-Shot Learning has been accepted by ICML 2023!
Education
Ph.D. (2023) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Efficient and Effective Learning of Text Representations
M.S. (2019) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Weakly-Supervised Text Classification
B.S. (2017) in Computer Engineering, University of Illinois Urbana-Champaign
Graduated with Highest Honor & Bronze Tablet
Contact
Email: yumeng5[at]virginia[dot]edu
Office: Rice Hall 408, 85 Engineer’s Way, Charlottesville, Virginia 22093