Zhenqiao Song (宋珍巧)

Hi! I am a Ph.D. candidate at the Language Technologies Institute (LTI) at Carnegie Mellon University, advised by Prof. Lei Li. Before that, I was a full-time research scientist at ByteDance AI Lab for two years, advised by Prof. Hao Zhou. I received my master's degree from Fudan University (FDU), where I was advised by Prof. Xiaoqing Zheng in the Fudan University Natural Language Processing Group.

Email  /  CV (Jun 2026)  /  Google Scholar  /  GitHub



I am currently on the job market. If you find my research background a good fit, please feel free to reach out to me.


Research

My primary research goal is to build foundation models to accelerate therapeutic discovery and advance our understanding of life sciences. Here are some research highlights:

  • AI for Science: Generative modeling for functional protein design and biomolecular surface learning.
  • Generative AI & Foundation Models: Genomic foundation models, and multi-modal modeling bridging natural language and life sciences.
  • Natural Language Processing: LLM evaluation, multilingual NLP, machine translation, and text generation.

News

  • 2025/11 - DNALongBench has been featured in the Nature Communications Editors’ Highlights collection – Computational and Theoretical Biology!
  • 2025/09 - Happy to share that our DNALongBench has been accepted by Nature Communications.
  • 2025/09 - Honored to be selected for MIT EECS Rising Stars 2025.
  • 2025/05 - I will intern at the Google DeepMind Protein Function Team, working with Ankur Parikh and David Belanger.
  • 2025/05 - We are organizing our second GenBio at ICML 2025. Check more details at 2nd GenBio.
  • 2024/07 - I will attend ICML 2024 to present two papers, "EnzyGen" and "SurfPro". I am happy to chat if you are interested in my work.
  • 2024/05 - I will be an intern at NEC Lab, working with Martin Renqiang Min.
  • 2023/10 - We are organizing the first Workshop on New Frontiers of Generative AI and Biology (GenBio) at NeurIPS 2023 in New Orleans in Dec. 2023! Check more details here (1st GenBio).
  • 2023/06 - Internship at the Broad Institute of MIT and Harvard, working with Wengong Jin.
Publications

    A full list of publications is here. (* indicates equal contribution and # indicates project lead.)

    AI for Science

    DNALONGBENCH: A Benchmark Suite for Long-Range DNA Prediction Tasks
    Wenduo Cheng*, Zhenqiao Song*, Yang Zhang*, Shike Wang, Danqing Wang, Muyu Yang, Lei Li, Jian Ma
    Nature Communications 2025.
    [paper] [code]

    JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model
    Qihao Duan, Bingding Huang, Zhenqiao Song, Irina Lehmann, Lei Gu, Roland Eils, Benjamin Wild
    NeurIPS 2025.
    [paper] [code]

    PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design
    Zhenqiao Song, Tianxiao Li, Lei Li, Martin Renqiang Min
    ICML 2025.
    [paper] [code]

    Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
    Zhenqiao Song, Yunlong Zhao, Wenxian Shi, Wengong Jin, Yang Yang, Lei Li
    ICML 2024.
    [paper] [code]

    SurfPro: Functional Protein Design Based on Continuous Surface
    Zhenqiao Song, Tinglin Huang, Lei Li, Wengong Jin
    ICML 2024.
    [paper] [code]

    Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer
    Tinglin Huang, Zhenqiao Song, Rex Ying, Wengong Jin
    NeurIPS 2024.
    [paper] [code]

    Importance Weighted Expectation-Maximization for Protein Sequence Design
    Zhenqiao Song, Lei Li
    ICML 2023.
    [paper] [code]

    Large Language Models

    Hire a Linguist!: Learning Endangered Languages with In-Context Linguistic Descriptions
    Kexun Zhang, Yee Man Choi, Zhenqiao Song, Tianqi He, William Yang Wang, Lei Li
    ACL Findings 2024.
    [paper] [code]

    INSTRUCTSCORE: Explainable Text Generation Evaluation with Fine-grained Feedback
    Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei Li
    EMNLP (Oral) 2023.
    [paper] [code]

    Natural Language Processing

    switch-GLAT: Multilingual Parallel Machine Translation Via Code-Switch Decoder
    Zhenqiao Song, Hao Zhou, Lihua Qian, Jingjing Xu, Mingxuan Wang, Lei Li
    ICLR 2022.
    [paper] [code]

    MTG: A Benchmarking Suite for Multilingual Text Generation
    Yiran Chen, Zhenqiao Song#, Xianze Wu, Danqing Wang, Jingjing Xu, Jiaze Chen, Hao Zhou, Lei Li
    NAACL Findings 2022.
    [paper] [code]

    Triangular Bidword Generation for Sponsored Search Auction
    Zhenqiao Song, Jiaze Chen, Hao Zhou, Lei Li
    WSDM 2021.
    [paper]

    Jointly Learning Bilingual Word Embeddings and Alignments
    Zhenqiao Song, Xiaoqing Zheng, Xuanjing Huang
    Machine Translation 2021.
    [paper]

    Generating responses with a specific emotion in dialog
    Zhenqiao Song, Xiaoqing Zheng, Lu Liu, Mu Xu, Xuanjing Huang
    ACL 2019.
    [paper] [code]

    Invited Talks

    • Sep. 2025: Invited talk at AstraZeneca about "Generative AI for Functional Protein Design"
    • Jun. 2025: Invited talk at Google DeepMind Protein Design/Function Team!
    • Apr. 2025: Invited talk at Tsinghua University AIR Institute GenSI open day!
    • Aug. 2024: Invited talk at Tsinghua University AIR Institute!
    • Jul. 2024: Invited talk at Fudan University NLP Group!
    • Oct. 2023: Invited talk at Jiangmen (将门)!
    • May 2023: Invited talk at ByteDance Research!

    Awards

    • EECS Rising Stars, 2025
    • National Scholarship of China, 2020
    • Shanghai Outstanding Graduate Student, 2020

    Academic Service

    Reviewers / PC Members

    • Nature Communications, Medicine, Nature Machine Intelligence
    • ICML: 2023 - 2026
    • NeurIPS: 2023 - 2025
    • ICLR: 2023 - 2026
    • AISTATS: 2025 - 2026
    • ACL: 2023 - 2024
    • EMNLP: 2020 - 2023
    • NLPCC: 2022 - 2023
    • IJCAI: 2023 - 2024
    • TMLR, ARR


    Organizers

    Mentoring

    Graduate Students

    • Jacky Chen (2024.9-2025.1, University of Pittsburgh PhD)
    • Ramith Hettiarachchi (2024.9-2025.1, CMU PhD)
    • Xiwei Cheng (2024.6-2025.4, UCSD master -> Northeastern PhD)
    • Yujia Gao (2024.1-2024.5, CMU master)

    Undergraduate Students

    • Charles Novak (2025.6-Now, CMU undergrad -> CMU master)
    • Jingyu Zhu (2024.6-2024.9, Peking University)
    • Yufei Song (2023.1-2023.12, UCSB -> UCLA master)

    Interns at ByteDance AI Lab

    • Lu Liu (2021.6 - 2021.9, Fudan University Master -> ByteDance NLP researcher)
    • Yiran Chen (2021.6 - 2021.9, Fudan University Master -> ByteDance NLP researcher)
