We propose ALOE, an action-level off-policy evaluation framework for VLA post-training that enables fine-grained credit assignment and stable policy improvement in real-world robotic manipulation.
HOLO: Holistic Lightweight Optimization for Scene Understanding with Auto-Annotation and Multimodal Learning
Xiaoyun Hu*, Xiaohan Yan*, Nan Wang, Xiaowei Song, Gang Wei, Zhicheng Wang
WACV 2026
We propose HOLO, which includes a large-scale scene description dataset and a lightweight 3D-LLM.
RE0: Recognize Everything with 3D Zero-shot Instance Segmentation
Xiaohan Yan*, Zijian Jiang*, Yinghao Shuai*, Nan Wang, Xiaowei Song, Wenbo Ji, Ge Wu, Jinyu He, Gang Wei, Zhicheng Wang
Given 3D point clouds and posed multi-view RGB-D images, RE0 leverages 3D geometric information, projection relationships, and CLIP semantic features for 3D zero-shot instance segmentation.
Semantic-Guided Gaussian Splatting with Deferred Rendering
Nan Wang, Xiaohan Yan, Xiaowei Song, Zhicheng Wang