Wenhao Lu

I am a PhD researcher in robot learning and AI. My work focuses on reinforcement learning for robotic manipulation and on interpretability methods that make RL agents easier to analyze, explain, and trust.

research themes

RL-Driven Robot Learning

I am interested in robots that learn purposeful behavior through interaction, especially manipulation systems that must act robustly in structured physical environments.

Interpretable RL Agents

I study how to make learned decision-making systems understandable to humans, so agent behavior can be inspected, questioned, and improved rather than treated as a black box.

Grounded Agentic Intelligence

I use robot learning and reinforcement learning as grounded settings for asking broader questions about autonomy, reliability, and explanation in learning-based agents.

featured publications

ACL Findings 2026

Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback

Jiaye Lin, Mengdi Li, Xufeng Zhao, and 4 more authors

2026

ACL 2026 Findings

@misc{Lin26CurriculumRLAIF,
  title = {Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback},
  author = {Lin, Jiaye and Li, Mengdi and Zhao, Xufeng and Lu, Wenhao and Zhao, Peilin and Wermter, Stefan and Wang, Di},
  year = {2026},
  url = {https://arxiv.org/abs/2505.20075},
  note = {ACL 2026 Findings},
}

TMLR 2025

Mental Modeling of Reinforcement Learning Agents by Language Models

Wenhao Lu, Xufeng Zhao, Josua Spisak, and 2 more authors

Transactions on Machine Learning Research, 2025

Also presented at the 18th European Workshop on Reinforcement Learning

@article{Lu25MentalModeling,
  title = {Mental Modeling of Reinforcement Learning Agents by Language Models},
  author = {Lu, Wenhao and Zhao, Xufeng and Spisak, Josua and Lee, Jae Hee and Wermter, Stefan},
  journal = {Transactions on Machine Learning Research},
  year = {2025},
  url = {https://openreview.net/forum?id=JN7iNWaPTe},
  note = {Also presented at the 18th European Workshop on Reinforcement Learning},
}

Humanoids 2024

Large Language Models for Orchestrating Bimanual Robots

Kun Chu, Xufeng Zhao, Cornelius Weber, and 3 more authors

In 23rd IEEE-RAS International Conference on Humanoid Robots (Humanoids 2024), 2024

@inproceedings{Chu24Labor,
  title = {Large Language Models for Orchestrating Bimanual Robots},
  author = {Chu, Kun and Zhao, Xufeng and Weber, Cornelius and Li, Mengdi and Lu, Wenhao and Wermter, Stefan},
  booktitle = {23rd {{IEEE-RAS}} International Conference on Humanoid Robots (Humanoids 2024)},
  pages = {328--334},
  publisher = {IEEE},
  year = {2024},
  doi = {10.1109/HUMANOIDS58906.2024.10769891},
  url = {https://doi.org/10.1109/Humanoids58906.2024.10769891},
}

ICANN 2024

Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, and 3 more authors

In Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

@inproceedings{Sun24ObjectStateSensitive,
  title = {Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning},
  author = {Sun, Xiaowen and Zhao, Xufeng and Lee, Jae Hee and Lu, Wenhao and Kerzel, Matthias and Wermter, Stefan},
  booktitle = {Artificial Neural Networks and Machine Learning - {{ICANN}} 2024},
  series = {Lecture Notes in Computer Science},
  pages = {261--275},
  publisher = {Springer},
  year = {2024},
  doi = {10.1007/978-3-031-72341-4_18},
  url = {https://doi.org/10.1007/978-3-031-72341-4_18},
}

COLING 2024

Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic

Xufeng Zhao, Mengdi Li, Wenhao Lu, and 4 more authors

In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024

Oral presentation

@inproceedings{Zhao24LogicThoughts,
  title = {Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic},
  author = {Zhao, Xufeng and Li, Mengdi and Lu, Wenhao and Weber, Cornelius and Lee, Jae Hee and Chu, Kun and Wermter, Stefan},
  booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
  year = {2024},
  month = may,
  pages = {6144--6166},
  publisher = {ELRA and ICCL},
  url = {https://aclanthology.org/2024.lrec-main.543/},
  note = {Oral presentation},
}

CLeaR 2024

Causal State Distillation for Explainable Reinforcement Learning

Wenhao Lu, Xufeng Zhao, Thilo Fryen, and 4 more authors

In 3rd Conference on Causal Learning and Reasoning (CLeaR 2024), Apr 2024

Oral presentation

@inproceedings{Lu24CausalState,
  title = {Causal {{State Distillation}} for {{Explainable Reinforcement Learning}}},
  author = {Lu, Wenhao and Zhao, Xufeng and Fryen, Thilo and Lee, Jae Hee and Li, Mengdi and Magg, Sven and Wermter, Stefan},
  booktitle = {3rd Conference on Causal Learning and Reasoning (CLeaR 2024)},
  year = {2024},
  month = apr,
  url = {https://proceedings.mlr.press/v236/lu24a.html},
  note = {Oral presentation},
}

ICDL 2023

A Closer Look at Reward Decomposition for High-Level Robotic Explanations

Wenhao Lu, Xufeng Zhao, Sven Magg, and 3 more authors

In IEEE International Conference on Development and Learning (ICDL 2023), Nov 2023

Oral presentation

@inproceedings{Lu23CloserLook,
  title = {A {{Closer Look}} at {{Reward Decomposition}} for {{High-Level Robotic Explanations}}},
  author = {Lu, Wenhao and Zhao, Xufeng and Magg, Sven and Gromniak, Martin and Li, Mengdi and Wermter, Stefan},
  booktitle = {IEEE International Conference on Development and Learning (ICDL 2023)},
  year = {2023},
  month = nov,
  doi = {10.48550/arXiv.2304.12958},
  url = {https://arxiv.org/abs/2304.12958},
  note = {Oral presentation},
}

selected projects

RL-Driven Robotic Manipulation

Reward Decomposition for Robotic Explanations

Causal and Language-Based RL Interpretability