Zhenqiao Song (宋珍巧)

Hi! I am a Ph.D. candidate at the Language Technologies Institute (LTI) at Carnegie Mellon University, advised by Prof. Lei Li. Before that, I was a full-time research scientist at ByteDance AI Lab for two years, advised by Prof. Hao Zhou. I received my master's degree from Fudan University (FDU), where I was advised by Prof. Xiaoqing Zheng in the Fudan University Natural Language Processing Group.

Email  /  CV (Jun 2026)  /  Google Scholar  /  GitHub

Research

My primary research goal is to build foundation models to accelerate therapeutic discovery and advance our understanding of life sciences. Here are some research highlights:

  • AI for Science: Generative modeling for functional protein design and biomolecular surface learning.
  • Generative AI & Foundation Models: Genomic foundation models, and multi-modal modeling bridging natural language and life sciences.
  • Natural Language Processing: LLM evaluation, multilingual NLP, machine translation, and text generation.

News

  • 2025/11 - DNALongBench has been featured in the Nature Communications Editors’ Highlights collection – Computational and Theoretical Biology!
  • 2025/09 - Happy to share that our DNALongBench has been accepted by Nature Communications.
  • 2025/09 - Honored to be selected for MIT EECS Rising Stars 2025.
  • 2025/05 - I will intern with the Google DeepMind Protein Function Team, working with Ankur Parikh and David Belanger.
  • 2025/05 - We are organizing the second GenBio workshop at ICML 2025. See 2nd GenBio for details.
  • 2024/07 - I will attend ICML 2024 to present two of my papers, "EnzyGen" and "SurfPro". I'm happy to talk if you are interested in my work.
  • 2024/05 - I will intern at NEC Lab, working with Martin Renqiang Min.
  • 2023/10 - We are organizing the first Workshop on New Frontiers of Generative AI and Biology (GenBio) at NeurIPS 2023 in New Orleans in Dec. 2023! Check more details here (1st GenBio).
  • 2023/06 - Internship at the Broad Institute of MIT and Harvard, working with Wengong Jin.
Publications

    A full list of publications is here. (* indicates equal contribution and # indicates project lead.)

    AI for Science

    DNALONGBENCH: A Benchmark Suite for Long-Range DNA Prediction Tasks
    Wenduo Cheng*, Zhenqiao Song*, Yang Zhang*, Shike Wang, Danqing Wang, Muyu Yang, Lei Li, Jian Ma
    Nature Communications 2025.
    [paper] [code]

    JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model
    Qihao Duan, Bingding Huang, Zhenqiao Song, Irina Lehmann, Lei Gu, Roland Eils, Benjamin Wild
    NeurIPS 2025.
    [paper] [code]

    PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design
    Zhenqiao Song, Tianxiao Li, Lei Li, Martin Renqiang Min
    ICML 2025.
    [paper] [code]

    Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
    Zhenqiao Song, Yunlong Zhao, Wenxian Shi, Wengong Jin, Yang Yang, Lei Li
    ICML 2024.
    [paper] [code]

    SurfPro: Functional Protein Design Based on Continuous Surface
    Zhenqiao Song, Tinglin Huang, Lei Li, Wengong Jin
    ICML 2024.
    [paper] [code]

    Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer
    Tinglin Huang, Zhenqiao Song, Rex Ying, Wengong Jin
    NeurIPS 2024.
    [paper] [code]

    Importance Weighted Expectation-Maximization for Protein Sequence Design
    Zhenqiao Song, Lei Li
    ICML 2023.
    [paper] [code]

    Large Language Model

    Hire a Linguist!: Learning Endangered Languages with In-Context Linguistic Descriptions
    Kexun Zhang, Yee Man Choi, Zhenqiao Song, Tianqi He, William Yang Wang, Lei Li
    ACL Findings 2024.
    [paper] [code]

    INSTRUCTSCORE: Explainable Text Generation Evaluation with Fine-grained Feedback
    Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei Li
    EMNLP (Oral) 2023.
    [paper] [code]

    Natural Language Processing

    switch-GLAT: Multilingual Parallel Machine Translation Via Code-Switch Decoder
    Zhenqiao Song, Hao Zhou, Lihua Qian, Jingjing Xu, Mingxuan Wang, Lei Li
    ICLR 2022.
    [paper] [code]

    MTG: A Benchmarking Suite for Multilingual Text Generation
    Yiran Chen, Zhenqiao Song#, Xianze Wu, Danqing Wang, Jingjing Xu, Jiaze Chen, Hao Zhou, Lei Li
    NAACL Findings 2022.
    [paper] [code]

    Triangular Bidword Generation for Sponsored Search Auction
    Zhenqiao Song, Jiaze Chen, Hao Zhou, Lei Li
    WSDM 2021.
    [paper]

    Jointly Learning Bilingual Word Embeddings and Alignments
    Zhenqiao Song, Xiaoqing Zheng, Xuanjing Huang
    Machine Translation 2021.
    [paper]

    Generating Responses with a Specific Emotion in Dialog
    Zhenqiao Song, Xiaoqing Zheng, Lu Liu, Mu Xu, Xuanjing Huang
    ACL 2019.
    [paper] [code]

    Invited Talk

    Awards

    • EECS Rising Stars, 2025
    • National Scholarship of China, 2020
    • Shanghai Outstanding Graduate Student, 2020

    Academic Service

    Reviewers / PC Members

    • Nature Communications, Medicine, Nature Machine Intelligence
    • ICML: 2023 - 2026
    • NeurIPS: 2023 - 2025
    • ICLR: 2023 - 2026
    • AISTATS: 2025 - 2026
    • ACL: 2023 - 2024
    • EMNLP: 2020 - 2023
    • NLPCC: 2022 - 2023
    • IJCAI: 2023 - 2024
    • TMLR, ARR


    Organizers

    Mentoring

    Graduate Students


    Undergraduate Students


    Interns at ByteDance AI Lab





    Website template from Jon Barron.