About Me
I am delighted to join the School of Artificial Intelligence at Shanghai Jiao Tong University as a Tenure-track Assistant Professor. My research focuses on multimodal reasoning models, with applications in plane geometry reasoning, formal mathematical reasoning, and physics problem modeling. I am now recruiting PhD/Master's students (starting in 2026), research assistants, and interns who have a solid background in machine learning, mathematics, or physics, strong programming skills, and a passion for tackling AI-driven logical reasoning challenges. If you are self-motivated to explore cutting-edge research in symbolic AI and multimodal reasoning models, please email your CV and research interests to xiarenqiu@sjtu.edu.cn. Let's collaborate to push the boundaries of AI reasoning!
🔥 News
- 2025.06: 🎉🎉 One paper (Chimera) is accepted by ICCV 2025.
- 2025.06: 🎉🎉 One paper (SurveyForge) is accepted by ACL 2025.
- 2025.02: 🎉🎉 One paper (CDM) is accepted by CVPR 2025.
- 2024.12: 🎉🎉 One paper (GeoX) is accepted by ICLR 2025.
- 2024.12: 🎉🎉 One paper (LaTeXNet) is accepted by ICASSP 2025.
- 2024.09: 🎉🎉 One paper (AdaptiveDiffusion) is accepted by NeurIPS 2024.
- 2024.07: 🎉🎉 One paper (Once-for-Both) is accepted by CVPR 2024.
- 2024.01: 🎉🎉 One paper (ReSimAD) is accepted by ICLR 2024.
- 2023.12: 🎉🎉 One paper (EASInst) is accepted by ICASSP 2024.
📝 Selected Publications & Preprints

Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng*, Mingsheng Li*, Jiakang Yuan, Hongbin Zhou, Renqiu Xia, Renrui Zhang, Lei Bai, Song Mao, Bin Wang, Aojun Zhou, Botian Shi, Tao Chen, Bo Zhang, Xiangyu Yue
- Propose Chimera, a scalable and low-cost multi-modal pipeline designed to boost the ability of existing LMMs with domain-specific experts.

Xiangchao Yan*, Shiyang Feng*, Jiakang Yuan, Renqiu Xia, Bin Wang, Bo Zhang, Lei Bai
- Propose SurveyForge, which automatically generates and refines the content of survey papers.

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
Bin Wang*, Fan Wu*, Linke Ouyang*, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Botian Shi, Bo Zhang, Conghui He
- Propose the Character Detection Matching (CDM) metric, which makes evaluation more objective by computing the score at the image level rather than the LaTeX level.

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
Renqiu Xia*, Mingsheng Li*, Hancheng Ye, Wenjie Wu, Hongbin Zhou, Jiakang Yuan, Tianshuo Peng, Xinyu Cai, Xiangchao Yan, Bin Wang, Conghui He, Botian Shi, Tao Chen, Junchi Yan, Bo Zhang
- Propose GeoX, a multi-modal large model focusing on geometric understanding and reasoning tasks, which reveals the large potential of formalized vision-language pre-training for enhancing geometric problem-solving abilities.

LaTeXNet: A Specialized Model for Converting Visual Tables and Equations to LaTeX Code
Renqiu Xia, Hongbin Zhou, Ziming Feng, Huanxi Liu, Boan Chen, Bo Zhang, Junchi Yan
- Propose LaTeXNet, a specialized model designed to automate the conversion of visual tables and equations into LaTeX code.

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
Hancheng Ye*, Jiakang Yuan*, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang
- Propose AdaptiveDiffusion to adaptively reduce the noise prediction steps during the denoising process, guided by the third-order latent difference.

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Hancheng Ye, Chong Yu, Peng Ye, Renqiu Xia, Yansong Tang, Jiwen Lu, Tao Chen, Bo Zhang
- Propose OFB, a cost-efficient approach that simultaneously evaluates both importance and sparsity scores for vision transformer compression (VTC).

Bo Zhang*, Xinyu Cai*, Jiakang Yuan, Donglin Yang, Jianfei Guo, Xiangchao Yan, Renqiu Xia, Botian Shi, Min Dou, Tao Chen, Si Liu, Junchi Yan, Yu Qiao
- Provide a new perspective on alleviating domain shifts by proposing a Reconstruction-Simulation-Perception scheme.

Efficient Architecture Search for Real-Time Instance Segmentation
Renqiu Xia, Dongyuan Zhang, Yixin Dong, Juanping Zhao, Wenlong Liao, Tao He, Junchi Yan
- Propose EASInst, an efficient framework that discovers practical backbone and encoder architectures for an improved sparse-activation instance segmentation model.
📖 Education
- 2021.04 - Now, Ph.D. in Computer Science, Shanghai Jiao Tong University.
- 2019.02 - 2020.12, Master in Electrical Engineering (Distinction), University of Melbourne.
- 2014.09 - 2018.06, Bachelor in Instrument for Measurement & Control (Pilot Program), Jilin University.
💻 Internships
- 2023.03 - Now, Shanghai AI Laboratory, China.
📝 Academic Services
- Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, KDD, TKDE.