Publications
(*=Equal Contribution)
2024
SimPO: Simple Preference Optimization with a Reference-Free Reward [PDF] [Code]
Yu Meng*, Mengzhou Xia*, Danqi Chen.
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast [PDF] [Code]
Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng.
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction [PDF]
Sizhe Zhou, Yu Meng, Bowen Jin, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization [PDF]
Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, Wei-Lin Chen, Chao-Wei Huang, Yu Meng, Yun-Nung Chen.
Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [PDF] [Code]
Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Suhang Wang, Yu Meng, Jiawei Han.
Findings of Annual Meeting of the Association for Computational Linguistics (ACL-Findings), 2024
Representation Deficiency in Masked Language Modeling [PDF] [Code]
Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer.
International Conference on Learning Representations (ICLR), 2024
Evaluating Large Language Models at Evaluating Instruction Following [PDF] [Code]
Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen.
International Conference on Learning Representations (ICLR), 2024
2023
PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training [PDF] [Code]
Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias [PDF] [Code]
Yue Yu, Yuchen Zhuang, Jieyu Zhang, Yu Meng, Alexander Ratner, Ranjay Krishna, Jiaming Shen, Chao Zhang.
Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS Benchmark), 2023
Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers [PDF] [Code]
Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
Pretrained Language Representations for Text Understanding: A Weakly-Supervised Perspective [Tutorial Page]
Yu Meng, Jiaxin Huang, Yu Zhang, Yunyi Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023 (Tutorial)
Patton: Language Model Pretraining on Text-Rich Networks [PDF] [Code]
Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han.
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval [PDF] [Code]
Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, Chao Zhang.
Findings of Annual Meeting of the Association for Computational Linguistics (ACL-Findings), 2023
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning [PDF] [Code]
Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han.
International Conference on Machine Learning (ICML), 2023
SCStory: Self-supervised and Continual Online Story Discovery [PDF] [Code]
Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han.
The Web Conference (WWW), 2023
The Effect of Metadata on Scientific Literature Tagging: A Cross-Field Cross-Model Study [PDF] [Code]
Yu Zhang, Bowen Jin, Qi Zhu, Yu Meng, Jiawei Han.
The Web Conference (WWW), 2023
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks [PDF] [Code]
Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han.
International Conference on Learning Representations (ICLR), 2023
Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts [PDF] [Code]
Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han.
ACM International Conference on Web Search and Data Mining (WSDM), 2023
FineSum: Target-Oriented, Fine-Grained Opinion Summarization [PDF] [Code]
Suyu Ge, Jiaxin Huang, Yu Meng, Jiawei Han.
ACM International Conference on Web Search and Data Mining (WSDM), 2023
Turning Web-Scale Texts to Knowledge: Transferring Pretrained Representations to Text Mining Applications [Tutorial Page]
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han.
The Web Conference (WWW), 2023 (Tutorial)
2022
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding [PDF] [Code]
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han.
Annual Conference on Neural Information Processing Systems (NeurIPS), 2022
Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation [PDF] [Code]
Jiaxin Huang, Yu Meng, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
Adapting Pretrained Text Representations to Text Mining [Tutorial Page]
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022 (Tutorial)
Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds [PDF] [Code]
Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators [PDF] [Code]
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song.
International Conference on Learning Representations (ICLR), 2022
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations [PDF] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han.
The Web Conference (WWW), 2022
Pre-Trained Language Representations for Text Mining [Tutorial Page]
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han.
AAAI Conference on Artificial Intelligence (AAAI), 2022 (Tutorial)
MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information [PDF] [Code]
Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han.
ACM International Conference on Web Search and Data Mining (WSDM), 2022
2021
Automated Taxonomy Discovery and Exploration [Tutorial Page]
Jiaming Shen, Xiaotao Gu, Yu Meng, Jiawei Han.
IEEE International Conference on Data Mining (ICDM), 2021 (Tutorial)
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining [PDF] [Code] [Blog Post]
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song.
Annual Conference on Neural Information Processing Systems (NeurIPS), 2021
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [PDF] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
UCPhrase: Unsupervised Context-aware Quality Phrase Tagging [PDF] [Code]
Xiaotao Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021
On the Power of Pre-Trained Text Representations: Models and Applications in Text Mining [Tutorial Page]
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021 (Tutorial)
TaxoClass: Hierarchical Multi-Label Text Classification Using Only Class Names [PDF] [Code]
Jiaming Shen, Wenda Qiu, Yu Meng, Jingbo Shang, Xiang Ren, Jiawei Han.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Hierarchical Metadata-Aware Document Categorization under Weak Supervision [PDF] [Code]
Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han.
ACM International Conference on Web Search and Data Mining (WSDM), 2021
2020
Text Classification Using Label Names Only: A Language Model Self-Training Approach [PDF] [Code]
Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [PDF] [Code]
Jiaxin Huang, Yu Meng, Fang Guo, Heng Ji, Jiawei Han.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Mining Text Outliers in Document Directories [PDF] [Code]
Edouard Fouché, Yu Meng, Fang Guo, Honglei Zhuang, Klemens Böhm, Jiawei Han.
IEEE International Conference on Data Mining (ICDM), 2020
Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding [PDF] [Slides] [Code]
Yu Meng*, Yunyi Zhang*, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring [PDF] [Code]
Jiaxin Huang, Yiqing Xie, Yu Meng, Yunyi Zhang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
Embedding-Driven Multi-Dimensional Topic Mining and Text Analysis [Tutorial Page]
Yu Meng, Jiaxin Huang, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020 (Tutorial)
Minimally Supervised Categorization of Text with Metadata [PDF] [Code]
Yu Zhang*, Yu Meng*, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han.
ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2020
Discriminative Topic Mining via Category-Name Guided Text Embedding [PDF] [Video] [Code]
Yu Meng*, Jiaxin Huang*, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang, Jiawei Han.
The Web Conference (WWW), 2020
Guiding Corpus-based Set Expansion by Auxiliary Sets Generation and Co-Expansion [PDF] [Code]
Jiaxin Huang, Yiqing Xie, Yu Meng, Jiaming Shen, Yunyi Zhang, Jiawei Han.
The Web Conference (WWW), 2020
Separate and Attend in Personal Email Search [PDF] [Google AI Page]
Yu Meng, Maryam Karimzadehgan, Honglei Zhuang, Donald Metzler.
ACM International Conference on Web Search and Data Mining (WSDM), 2020
Unsupervised Word Embedding Learning by Incorporating Local and Global Contexts [PDF]
Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Jiawei Han.
Frontiers in Big Data, 2020
2019 and Before
Spherical Text Embedding [PDF] [Slides] [Poster] [Code]
Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, Jiawei Han.
Annual Conference on Neural Information Processing Systems (NeurIPS), 2019
TextCube: Automated Construction and Multidimensional Exploration [PDF] [Abstract]
Yu Meng, Jiaxin Huang, Jingbo Shang, Jiawei Han.
International Conference on Very Large Data Bases (VLDB), 2019 (Tutorial)
TopicMine: User-Guided Topic Mining by Category-Oriented Embedding [PDF]
Yu Meng*, Jiaxin Huang*, Zihan Wang, Chenyu Fan, Guangyuan Wang, Chao Zhang, Jingbo Shang, Lance Kaplan, Jiawei Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019 (Demo)
HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories [PDF] [Code]
Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han.
IEEE International Conference on Data Mining (ICDM), 2019
Weakly-Supervised Hierarchical Text Classification [PDF] [Code]
Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han.
AAAI Conference on Artificial Intelligence (AAAI), 2019
Weakly-Supervised Neural Text Classification [PDF] [Code]
Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han.
ACM International Conference on Information and Knowledge Management (CIKM), 2018
Verifying nonlinear analog and mixed-signal circuits with inputs [PDF]
Chuchu Fan, Yu Meng, Jürgen Maier, Ezio Bartocci, Sayan Mitra, Ulrich Schmid.
IFAC Conference on Analysis and Design of Hybrid Systems (ADHS), 2018