About Me
I obtained my Ph.D. from the University of Illinois Urbana-Champaign (UIUC), where I worked with Jiawei Han. Currently, I am visiting the Princeton NLP Group, working with Danqi Chen. I am grateful to have been supported by a Google PhD Fellowship since 2021.
I will join the Computer Science Department at the University of Virginia (UVA) as a tenure-track assistant professor in January 2024. I am looking for PhD students and interns! Please fill out this form if you are interested in working with me. After completing the form, you are also welcome to reach out via email. I will read all submitted forms and emails, but I apologize in advance for not being able to respond to each one!
Research
I am broadly interested in natural language processing (NLP), machine learning (ML), and data mining. I am currently especially passionate about recent developments in large language models (LLMs). Some of my past papers are listed below:
- Pretraining and Representation Learning for NLP:
  - [arXiv’23 Meng et al.] Representation Deficiency in Masked Language Modeling
  - [ICLR’22 Meng et al.] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
  - [NeurIPS’21 Meng et al.] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
  - [NeurIPS’19 Meng et al.] Spherical Text Embedding
- Large Language Models for Few-Shot and Zero-Shot Learning:
  - [ICML’23 Meng et al.] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
  - [NeurIPS’22 Meng et al.] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
- Text Mining Paradigms and Applications:
  - [WWW’22 Meng et al.] Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
  - [EMNLP’20 Meng et al.] Text Classification Using Label Names Only: A Language Model Self-Training Approach
  - [CIKM’18 Meng et al.] Weakly-Supervised Neural Text Classification
News
[2023.10] One paper on Weakly Supervised Text Classification has been accepted by EMNLP 2023!
[2023.09] One paper on Language Models as Training Data Generators has been accepted by NeurIPS 2023 Datasets and Benchmarks Track!
[2023.05] One paper on Weakly Supervised Scientific Text Classification has been accepted by KDD 2023!
[2023.05] Two papers on Language Model Pretraining on Text-Rich Network and Retrieval-Enhanced Weakly-Supervised Text Classification have been accepted by ACL 2023 Main Conference/Findings!
[2023.04] Our tutorial on Pretrained Language Representations for Text Understanding has been accepted by KDD 2023!
[2023.04] One paper on Few-Shot Learning has been accepted by ICML 2023!
[2023.01] Two papers on Metadata-Enhanced Scientific Text Classification and Unsupervised Online Story Discovery have been accepted by WWW 2023!
[2023.01] One paper on Learning Text-Rich Network Representations has been accepted by ICLR 2023!
[2022.12] Our tutorial on Turning Web-Scale Texts to Knowledge: Transferring Pretrained Representations to Text Mining Applications has been accepted by WWW 2023!
Education
Ph.D. (2023) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Efficient and Effective Learning of Text Representations
M.S. (2019) in Computer Science, University of Illinois Urbana-Champaign
Thesis: Weakly-Supervised Text Classification
B.S. (2017) in Computer Engineering, University of Illinois Urbana-Champaign
Graduated with Highest Honor & Bronze Tablet
Contact
Email: yumeng5[at]illinois[dot]edu
Office:
Room 1113, Thomas M. Siebel Center, 201 N. Goodwin Avenue, Urbana, IL 61801