Selected Publications
See all articles on Google Scholar.
SWE-Bench Mobile: Can Large Language Model Agents Develop Industry-Level Mobile Applications?
arXiv preprint, 2026 Under Review
Where LLM Agents Fail and How They can Learn From Failures
arXiv preprint, 2025 Under Review
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset
LREC, 2026
* denotes equal contribution
† denotes project leader
