About Me

I am a final-year Ph.D. student in Computer Science at Shanghai Jiao Tong University, under the supervision of Professor Weijia Jia. I am also honored to be guided by Dr. Bo Zhang from the Shanghai Artificial Intelligence Laboratory. My research focuses on large language models, multimodal learning, and AIGC. I have worked on document understanding and chart analysis using multimodal large language models, as well as mathematical reasoning with formal languages. I am also exploring how multi-agent systems can drive automated scientific research.

I am delighted to join the School of Artificial Intelligence at Shanghai Jiao Tong University as a tenure-track Assistant Professor, focusing on multimodal reasoning models with applications in plane geometry reasoning, formal mathematical reasoning, and physics problem modeling. I am now recruiting Ph.D. and Master's students (starting from 2026), research assistants, and interns who have a solid background in machine learning, mathematics, or physics, strong programming skills, and a passion for tackling AI-driven logical reasoning challenges. If you are self-motivated to explore cutting-edge research in symbolic AI and multimodal reasoning models, please email your CV and research interests to xiarenqiu@sjtu.edu.cn. Let's collaborate to push the boundaries of AI reasoning!

🔥 News

  • 2025.06:  🎉🎉 One paper (Chimera) is accepted by ICCV 2025.

  • 2025.06:  🎉🎉 One paper (SurveyForge) is accepted by ACL 2025.

  • 2025.02:  🎉🎉 One paper (CDM) is accepted by CVPR 2025.

  • 2024.12:  🎉🎉 One paper (GeoX) is accepted by ICLR 2025.

  • 2024.12:  🎉🎉 One paper (LaTeXNet) is accepted by ICASSP 2025.

  • 2024.09:  🎉🎉 One paper (AdaptiveDiffusion) is accepted by NeurIPS 2024.

  • 2024.07:  🎉🎉 One paper (Once-for-Both) is accepted by CVPR 2024.

  • 2024.01:  🎉🎉 One paper (ReSimAD) is accepted by ICLR 2024.

  • 2023.12:  🎉🎉 One paper (EASInst) is accepted by ICASSP 2024.

📝 Selected Publications & Preprints

ICCV 2025

Chimera: Improving Generalist Model with Domain-Specific Experts

Tianshuo Peng*, Mingsheng Li*, Jiakang Yuan, Hongbin Zhou, Renqiu Xia, Renrui Zhang, Lei Bai, Song Mao, Bin Wang, Aojun Zhou, Botian Shi, Tao Chen, Bo Zhang, Xiangyu Yue

[Project][Paper]

  • Propose Chimera, a scalable and low-cost multi-modal pipeline designed to boost the ability of existing LMMs with domain-specific experts.
ACL 2025

SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing

Xiangchao Yan*, Shiyang Feng*, Jiakang Yuan, Renqiu Xia, Bin Wang, Bo Zhang, Lei Bai

[Project][Paper]

  • Propose SurveyForge, which automatically generates and refines the content of surveys.
CVPR 2025

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

Bin Wang*, Fan Wu*, Linke Ouyang*, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Botian Shi, Bo Zhang, Conghui He

[Project][Paper]

  • Propose a Character Detection Matching (CDM) metric, which makes evaluation more objective by computing the score at the image level rather than the LaTeX level.
ICLR 2025

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Renqiu Xia*, Mingsheng Li*, Hancheng Ye, Wenjie Wu, Hongbin Zhou, Jiakang Yuan, Tianshuo Peng, Xinyu Cai, Xiangchao Yan, Bin Wang, Conghui He, Botian Shi, Tao Chen, Junchi Yan, Bo Zhang

[Project][Paper]

  • Propose GeoX, a multi-modal large model focusing on geometric understanding and reasoning tasks, which reveals the great potential of formalized vision-language pre-training for enhancing geometric problem-solving abilities.
ICASSP 2025

LaTeXNet: A Specialized Model for Converting Visual Tables and Equations to LaTeX Code

Renqiu Xia, Hongbin Zhou, Ziming Feng, Huanxi Liu, Boan Chen, Bo Zhang, Junchi Yan

[Project][Paper]

  • Propose LaTeXNet, a specialized model designed to automate the conversion of visual tables and equations into LaTeX code.
NeurIPS 2024

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Hancheng Ye*, Jiakang Yuan*, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang

[Project][Paper]

  • Propose AdaptiveDiffusion to adaptively reduce the number of noise prediction steps during the denoising process, guided by the third-order latent difference.
CVPR 2024

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

Hancheng Ye, Chong Yu, Peng Ye, Renqiu Xia, Yansong Tang, Jiwen Lu, Tao Chen, Bo Zhang

[Project][Paper]

  • Propose OFB, a cost-efficient approach that simultaneously evaluates both importance and sparsity scores for Vision Transformer compression (VTC).
ICLR 2024

ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

Bo Zhang*, Xinyu Cai*, Jiakang Yuan, Donglin Yang, Jianfei Guo, Xiangchao Yan, Renqiu Xia, Botian Shi, Min Dou, Tao Chen, Si Liu, Junchi Yan, Yu Qiao

[Project][Paper]

  • Provide a new perspective on and approach to alleviating domain shifts by proposing a Reconstruction-Simulation-Perception scheme.
ICASSP 2024

Efficient Architecture Search for Real-Time Instance Segmentation

Renqiu Xia, Dongyuan Zhang, Yixin Dong, Juanping Zhao, Wenlong Liao, Tao He, Junchi Yan

[Project][Paper]

  • Propose EASInst, an efficient framework that discovers practical backbone and encoder architectures for an improved sparse-activation instance segmentation model.

📖 Education

  • 2021.04 - Now, Ph.D. in Computer Science, Shanghai Jiao Tong University.
  • 2019.02 - 2020.12, Master's in Electrical Engineering (Distinction), University of Melbourne.
  • 2014.09 - 2018.06, Bachelor's in Instrument for Measurement & Control (Pilot Program), Jilin University.

💻 Internships

📝 Academic Services

  • Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, KDD, and TKDE.