publications

2024

  1. SciCode: A Research Coding Benchmark Curated by Scientists
    Minyang Tian, Luyu Gao, Shizhuo Dylan Zhang, Xinan Chen, Cunwei Fan, Xuefei Guo, Roland Haas, Pan Ji, Kittithat Krongchon, Yao Li, and 20 more authors
    arXiv preprint, 2024
  2. A Single Transformer for Scalable Vision-Language Modeling
    Yangyi Chen, Xingyao Wang, Hao Peng, and Heng Ji
    arXiv preprint, 2024
  3. PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models
    Dylan Zhang, Shizhe Diao, Xueyan Zou, and Hao Peng
    arXiv preprint, 2024
  4. Eliminating Position Bias of Language Models: A Mechanistic Approach
    Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, and Heng Ji
    arXiv preprint, 2024
  5. Advancing LLM Reasoning Generalists with Preference Trees
    Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, and 5 more authors
    arXiv preprint, 2024
  6. Source-Aware Training Enables Knowledge Attribution in Language Models
    Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, and Hao Peng
    In Proceedings of the Conference on Language Modeling (COLM), 2024
  7. Language Models Hallucinate, but May Excel at Fact Verification
    Jian Guan, Jesse Dodge, David Wadden, Minlie Huang, and Hao Peng
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024
  8. LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
    Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji, and Sinong Wang
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024
  9. Executable Code Actions Elicit Better LLM Agents
    Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, and Heng Ji
    In Proceedings of the International Conference on Machine Learning (ICML), 2024
  10. Data Engineering for Scaling Language Models to 128K Context
    Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, and Hao Peng
    In Proceedings of the International Conference on Machine Learning (ICML), 2024
  11. Examining LLMs’ Uncertainty Expression Towards Questions Outside Parametric Knowledge
    Genglin Liu, Xingyao Wang, Lifan Yuan, Yangyi Chen, and Hao Peng
    arXiv preprint, 2024
  12. TRAM: Bridging Trust Regions and Sharpness Aware Minimization (spotlight)
    Tom Sherborne, Naomi Saphra, Pradeep Dasigi, and Hao Peng
    In Proceedings of the International Conference on Learning Representations (ICLR), 2024
  13. MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
    Xingyao Wang, Zihan Wang, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng, and Heng Ji
    In Proceedings of the International Conference on Learning Representations (ICLR), 2024
  14. CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
    Lifan Yuan, Yangyi Chen, Xingyao Wang, Yi R. Fung, Hao Peng, and Heng Ji
    In Proceedings of the International Conference on Learning Representations (ICLR), 2024
  15. LeTI: Learning to Generate from Textual Interactions
    Xingyao Wang, Hao Peng, Reyhaneh Jabbarvand, and Heng Ji
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024

2023

  1. FiLM: Fill-in Language Models for Any-Order Generation
    Tianxiao Shen, Hao Peng, Ruoqi Shen, Yao Fu, Zaid Harchaoui, and Yejin Choi
    arXiv preprint, 2023
  2. Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
    Yao Fu, Hao Peng, Tushar Khot, and Mirella Lapata
    arXiv preprint, 2023
  3. Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation
    Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, and 4 more authors
    arXiv preprint, 2023
  4. Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models’ Reasoning Performance
    Yao Fu, Litu Ou, Mingyu Chen, Yuhao Wan, Hao Peng, and Tushar Khot
    arXiv preprint, 2023
  5. Specializing Smaller Language Models towards Multi-Step Reasoning (oral)
    Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, and Tushar Khot
    In Proceedings of the International Conference on Machine Learning (ICML), 2023
  6. Complexity-Based Prompting for Multi-step Reasoning
    Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, and Tushar Khot
    In Proceedings of the International Conference on Learning Representations (ICLR), 2023
  7. Transparency Helps Reveal When Language Models Learn Meaning
    Zhaofeng Wu, William Merrill, Hao Peng, Iz Beltagy, and Noah A. Smith
    Transactions of the Association for Computational Linguistics (TACL), 2023

2022

  1. How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
    Michael Hassid, Hao Peng, Daniel Rotem, Jungo Kasai, Ivan Montero, Noah A. Smith, and Roy Schwartz
    In Findings of the Association for Computational Linguistics: EMNLP, 2022
  2. Modeling Context With Linear Attention for Scalable Document-Level Translation
    Zhaofeng Wu, Hao Peng, Nikolaos Pappas, and Noah A. Smith
    In Findings of the Association for Computational Linguistics: EMNLP, 2022
  3. Twist Decoding: Diverse Generators Guide Each Other
    Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, and Noah A. Smith
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
  4. ABC: Attention with Bounded-memory Control
    Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, and Noah A. Smith
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2022
  5. Tailor: Generating and Perturbing Text with Semantic Controls
    Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, and Matt Gardner
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2022

2021

  1. Finetuning Pretrained Transformers into RNNs
    Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, and Noah A. Smith
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
  2. Random Feature Attention (spotlight)
    Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, and Lingpeng Kong
    In Proceedings of the International Conference on Learning Representations (ICLR), 2021
  3. Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
    Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, and Noah A. Smith
    In Proceedings of the International Conference on Learning Representations (ICLR), 2021
  4. Contextualized Perturbation for Textual Adversarial Attack
    Dianqi Li, Yizhe Zhang, Hao Peng, Liqun Chen, Chris Brockett, Ming-Ting Sun, and Bill Dolan
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021
  5. Infusing Finetuning with Semantic Dependencies
    Zhaofeng Wu, Hao Peng, and Noah A. Smith
    Transactions of the Association for Computational Linguistics (TACL), 2021

2020

  1. A Mixture of h - 1 Heads is Better than h Heads
    Hao Peng, Roy Schwartz, Dianqi Li, and Noah A. Smith
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2020

2019

  1. PaLM: A Hybrid Parser and Language Model
    Hao Peng, Roy Schwartz, and Noah A. Smith
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
  2. RNN Architecture Learning with Sparse Regularization
    Jesse Dodge, Roy Schwartz, Hao Peng, and Noah A. Smith
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
  3. Text Generation with Exemplar-based Adaptive Decoding
    Hao Peng, Ankur Parikh, Manaal Faruqui, Bhuwan Dhingra, and Dipanjan Das
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019

2018

  1. Rational Recurrences
    Hao Peng, Roy Schwartz, Sam Thomson, and Noah A. Smith
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018
  2. Backpropagating through Structured Argmax using a SPIGOT (best paper honorable mention)
    Hao Peng, Sam Thomson, and Noah A. Smith
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2018
  3. Learning Joint Semantic Parsers from Disjoint Data
    Hao Peng, Sam Thomson, Swabha Swayamdipta, and Noah A. Smith
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2018
  4. "You Are No Jack Kennedy": On Media Selection of Highlights from Presidential Debates
    Chenhao Tan, Hao Peng, and Noah A. Smith
    In Proceedings of The Web Conference (WWW), 2018

2017

  1. Deep Multitask Learning for Semantic Dependency Parsing
    Hao Peng, Sam Thomson, and Noah A. Smith
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2017

2016

  1. A Convolutional Attention Network for Extreme Summarization of Source Code
    Miltiadis Allamanis, Hao Peng, and Charles Sutton
    In Proceedings of the International Conference on Machine Learning (ICML), 2016

2015

  1. Discriminative Neural Sentence Modeling by Tree-Based Convolution
    Lili Mou, Hao Peng, Ge Li, Yan Xu, Lu Zhang, and Zhi Jin
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015
  2. Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths
    Yan Xu, Lili Mou, Ge Li, Yunchuan Chen, Hao Peng, and Zhi Jin
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015