Publications
(*=Equal Contribution)
2025
- [ICML 2025] AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism. Zhepei Wei, Wei-Lin Chen, Xinyu Zhu, Yu Meng. International Conference on Machine Learning (ICML), 2025. [Code]
- [ICML 2025] LLM Alignment as Retriever Optimization: An Information Retrieval Perspective. Bowen Jin, Jinsung Yoon, Zhen Qin, Ziqi Wang, Wei Xiong, Yu Meng, Jiawei Han, Sercan O. Arik. International Conference on Machine Learning (ICML), 2025.
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales. Zhepei Wei, Wei-Lin Chen, Yu Meng. International Conference on Learning Representations (ICLR), 2025. [Code]
2024
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward. Yu Meng*, Mengzhou Xia*, Danqi Chen. Conference on Neural Information Processing Systems (NeurIPS), 2024. [Code]
- [NeurIPS 2024] Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast. Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng. Conference on Neural Information Processing Systems (NeurIPS), 2024. [Code]
- [EMNLP 2024] Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction. Sizhe Zhou, Yu Meng, Bowen Jin, Jiawei Han. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
- [EMNLP-Findings 2024] Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization. Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, Wei-Lin Chen, Chao-Wei Huang, Yu Meng, Yun-Nung Chen. Findings of EMNLP, 2024.
- [ACL-Findings 2024] Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs. Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Suhang Wang, Yu Meng, Jiawei Han. Findings of ACL, 2024. [Code]
- [ICLR 2024] Representation Deficiency in Masked Language Modeling. Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer. International Conference on Learning Representations (ICLR), 2024. [Code]
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following. Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen. International Conference on Learning Representations (ICLR), 2024. [Code]
2023
- [EMNLP 2023] PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training. Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. [Code]
- [NeurIPS Benchmark 2023] Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias. Yue Yu, Yuchen Zhuang, Jieyu Zhang, Yu Meng, Alexander Ratner, Ranjay Krishna, Jiaming Shen, Chao Zhang. Conference on Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2023. [Code]
- [KDD 2023] Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers. Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023. [Code]
- [KDD 2023 Tutorial] Pretrained Language Representations for Text Understanding: A Weakly-Supervised Perspective. Yu Meng, Jiaxin Huang, Yu Zhang, Yunyi Zhang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023.
- [ACL 2023] Patton: Language Model Pretraining on Text-Rich Networks. Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han. Association for Computational Linguistics (ACL), 2023. [Code]
- [ACL-Findings 2023] ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval. Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, Chao Zhang. Findings of ACL, 2023. [Code]
- [ICML 2023] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning. Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han. International Conference on Machine Learning (ICML), 2023. [Code]
- [WWW 2023] SCStory: Self-supervised and Continual Online Story Discovery. Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han. The Web Conference (WWW), 2023. [Code]
- [WWW 2023 Tutorial] Turning Web-Scale Texts to Knowledge: Transferring Pretrained Representations to Text Mining Applications. Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han. The Web Conference (WWW), 2023.
- [ICLR 2023] Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks. Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han. International Conference on Learning Representations (ICLR), 2023. [Code]
- [WSDM 2023] Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts. Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han. ACM International Conference on Web Search and Data Mining (WSDM), 2023. [Code]
- [WSDM 2023] FineSum: Target-Oriented, Fine-Grained Opinion Summarization. Suyu Ge, Jiaxin Huang, Yu Meng, Jiawei Han. ACM International Conference on Web Search and Data Mining (WSDM), 2023. [Code]
2022
- [NeurIPS 2022] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han. Conference on Neural Information Processing Systems (NeurIPS), 2022. [Code]
- [KDD 2022] Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation. Jiaxin Huang, Yu Meng, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022. [Code]
- [KDD 2022 Tutorial] Adapting Pretrained Text Representations to Text Mining. Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022.
- [NAACL 2022] Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds. Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022. [Code]
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators. Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song. International Conference on Learning Representations (ICLR), 2022. [Code]
- [WWW 2022] Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations. Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han. The Web Conference (WWW), 2022. [Code]
- [AAAI 2022 Tutorial] Pre-Trained Language Representations for Text Mining. Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han. AAAI Conference on Artificial Intelligence (AAAI), 2022.
- [WSDM 2022] MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information. Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han. ACM International Conference on Web Search and Data Mining (WSDM), 2022. [Code]
2021
- [ICDM 2021 Tutorial] Automated Taxonomy Discovery and Exploration. Jiaming Shen, Xiaotao Gu, Yu Meng, Jiawei Han. IEEE International Conference on Data Mining (ICDM), 2021.
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining. Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song. Conference on Neural Information Processing Systems (NeurIPS), 2021. [Code]
- [EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training. Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. [Code]
- [KDD 2021] UCPhrase: Unsupervised Context-aware Quality Phrase Tagging. Xiaotao Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021. [Code]
- [KDD 2021 Tutorial] On the Power of Pre-Trained Text Representations: Models and Applications in Text Mining. Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021.
- [NAACL 2021] TaxoClass: Hierarchical Multi-Label Text Classification Using Only Class Names. Jiaming Shen, Wenda Qiu, Yu Meng, Jingbo Shang, Xiang Ren, Jiawei Han. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021. [Code]
- [WSDM 2021] Hierarchical Metadata-Aware Document Categorization under Weak Supervision. Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han. ACM International Conference on Web Search and Data Mining (WSDM), 2021. [Code]
2020
- [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach. Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020. [Code]
- [EMNLP 2020] Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding. Jiaxin Huang, Yu Meng, Fang Guo, Heng Ji, Jiawei Han. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020. [Code]
- [ICDM 2020] Mining Text Outliers in Document Directories. Edouard Fouché, Yu Meng, Fang Guo, Honglei Zhuang, Klemens Böhm, Jiawei Han. IEEE International Conference on Data Mining (ICDM), 2020. [Code]
- [KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding. Yu Meng*, Yunyi Zhang*, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020. [Code]
- [KDD 2020] CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring. Jiaxin Huang, Yiqing Xie, Yu Meng, Yunyi Zhang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020. [Code]
- [KDD 2020 Tutorial] Embedding-Driven Multi-Dimensional Topic Mining and Text Analysis. Yu Meng, Jiaxin Huang, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020.
- [SIGIR 2020] Minimally Supervised Categorization of Text with Metadata. Yu Zhang*, Yu Meng*, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2020. [Code]
- [WWW 2020] Discriminative Topic Mining via Category-Name Guided Text Embedding. Yu Meng*, Jiaxin Huang*, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang, Jiawei Han. The Web Conference (WWW), 2020. [Code]
- [WWW 2020] Guiding Corpus-based Set Expansion by Auxiliary Sets Generation and Co-Expansion. Jiaxin Huang, Yiqing Xie, Yu Meng, Jiaming Shen, Yunyi Zhang, Jiawei Han. The Web Conference (WWW), 2020. [Code]
- [WSDM 2020] Separate and Attend in Personal Email Search. Yu Meng, Maryam Karimzadehgan, Honglei Zhuang, Donald Metzler. ACM International Conference on Web Search and Data Mining (WSDM), 2020.
- [Frontiers in Big Data 2020] Unsupervised Word Embedding Learning by Incorporating Local and Global Contexts. Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Jiawei Han. Frontiers in Big Data, 2020.
2019 and Before
- [NeurIPS 2019] Spherical Text Embedding. Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, Jiawei Han. Conference on Neural Information Processing Systems (NeurIPS), 2019. [Code]
- [VLDB 2019 Tutorial] TextCube: Automated Construction and Multidimensional Exploration. Yu Meng, Jiaxin Huang, Jingbo Shang, Jiawei Han. International Conference on Very Large Data Bases (VLDB), 2019.
- [KDD 2019 Demo] TopicMine: User-Guided Topic Mining by Category-Oriented Embedding. Yu Meng*, Jiaxin Huang*, Zihan Wang, Chenyu Fan, Guangyuan Wang, Chao Zhang, Jingbo Shang, Lance Kaplan, Jiawei Han. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019.
- [ICDM 2019] HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories. Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han. IEEE International Conference on Data Mining (ICDM), 2019. [Code]
- [AAAI 2019] Weakly-Supervised Hierarchical Text Classification. Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han. AAAI Conference on Artificial Intelligence (AAAI), 2019. [Code]
- [CIKM 2018] Weakly-Supervised Neural Text Classification. Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han. ACM International Conference on Information and Knowledge Management (CIKM), 2018. [Code]
- [ADHS 2018] Verifying nonlinear analog and mixed-signal circuits with inputs. Chuchu Fan, Yu Meng, Jürgen Maier, Ezio Bartocci, Sayan Mitra, Ulrich Schmid. IFAC Conference on Analysis and Design of Hybrid Systems (ADHS), 2018.