Publications      (* denotes equal contribution)
|
|
UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
Ruihai Wu*,
Haoran Lu*,
Yiyan Wang,
Yubo Wang,
Hao Dong
CVPR 2024
Spotlight Presentation at ICRA 2024 Workshop on Deformable Object Manipulation
project page
/
paper
/
code
/
video
We propose to learn dense visual correspondence for diverse garment manipulation tasks with category-level generalization using only one- or few-shot human demonstrations.
|
|
Broadcasting Support Relations Recursively from Local Dynamics for Object Retrieval in Clutters
Yitong Li*,
Ruihai Wu*,
Haoran Lu,
Chuanruo Ning,
Yan Shen,
Guanqi Zhan,
Hao Dong
RSS 2024
Best Poster Award at PKU AI Tech Day 2024
project page
/
paper
/
code
/
video
We study retrieving objects from complex clutter via a novel method that recursively broadcasts accurate local dynamics to build a support relation graph of the whole scene, which greatly reduces the complexity of support relation inference and improves accuracy.
|
|
GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation
Haoran Lu*,
Ruihai Wu*,
Yitong Li*,
Sijie Li,
Ziyu Zhu,
Chuanruo Ning,
Yan Shen,
Longzan Luo,
Yuanpei Chen,
Hao Dong
NeurIPS 2024
Spotlight Presentation at ICRA 2024 Workshop on Deformable Object Manipulation
project page
/
paper
/
code
/
video
We present GarmentLab, a unified simulation and benchmark for garment manipulation within realistic 3D indoor scenes. Our benchmark encompasses a diverse range of garment types, robotic systems, and manipulators, including dexterous hands. The multitude of tasks in the benchmark enables further exploration of interactions between garments, deformable objects, rigid bodies, fluids, and avatars.
|
|
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
Hanxiao Jiang,
Binghao Huang,
Ruihai Wu,
Zhuoran Li,
Shubham Garg,
Hooshang Nayyeri,
Shenlong Wang,
Yunzhu Li
CoRL 2024
Best Paper Nomination at ICRA 2024 Workshop on Vision-Language Models for Manipulation
project page
/
paper
/
code
/
video
We formulate interactive exploration as an action-conditioned 3D scene graph (ACSG) construction and traversal problem. Our ACSG is an actionable, spatial-topological representation that models objects and their interactive and spatial relations in a scene, capturing both the high-level graph and corresponding low-level memory.
|
|
NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation
Ran Xu*,
Yan Shen*,
Xiaoqi Li,
Ruihai Wu,
Hao Dong
RA-L 2024
project page
/
paper
/
video
We introduce NaturalVLM, a comprehensive benchmark comprising 15 distinct manipulation tasks and over 4,500 episodes meticulously annotated with fine-grained language instructions. In addition, we propose a novel learning framework that completes each manipulation task step by step according to the fine-grained instructions.
|
|
PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments
Kairui Ding,
Boyuan Chen,
Ruihai Wu,
Yuyang Li,
Zongzheng Zhang,
Huan-ang Gao,
Siqi Li,
Yixin Zhu,
Guyue Zhou,
Hao Dong,
Hao Zhao
IROS 2024
Best Poster Finalist at IROS 2024 Workshop on Embodied Navigation to Movable Objects
project page
/
paper
/
code
/
video
PreAfford is a novel pre-grasping planning framework that improves adaptability across diverse environments and objects by utilizing point-level affordance representation and relay training. Validated on the ShapeNet-v2 dataset and in real-world experiments, PreAfford offers a robust solution for manipulating ungraspable objects with two-finger grippers.
|
|
Learning Environment-Aware Affordance for 3D Articulated Object Manipulation under Occlusions
Ruihai Wu*,
Kai Cheng*,
Yan Shen,
Chuanruo Ning,
Guanqi Zhan,
Hao Dong
NeurIPS 2023
project page
/
paper
/
code
/
video
We study manipulating 3D articulated objects under environment constraints and formulate the task of environment-aware affordance learning, incorporating both object-centric per-point priors and environment constraints.
|
|
Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories of Articulated Objects
Chuanruo Ning,
Ruihai Wu,
Haoran Lu,
Kaichun Mo,
Hao Dong
NeurIPS 2023
project page
/
paper
/
code
/
video
We introduce an affordance learning framework that effectively explores novel categories with minimal interactions on a limited number of instances. Our framework explicitly estimates the geometric similarity across different categories, identifying local areas that differ from shapes in the training categories for efficient exploration while concurrently transferring affordance knowledge to similar parts of the objects.
|
|
Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly
Ruihai Wu*,
Chenrui Tie*,
Yushi Du,
Yan Shen,
Hao Dong
ICCV 2023
project page
/
paper
/
code
/
video
We study geometric shape assembly by leveraging SE(3) Equivariance, which disentangles poses and shapes of fractured parts.
|
|
Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation
Ruihai Wu*,
Chuanruo Ning*,
Hao Dong
ICCV 2023
project page
/
paper
/
code
/
video
/
video (real world)
We study deformable object manipulation using dense visual affordance with generalization to diverse states, and propose a novel kind of foresightful dense affordance that avoids local optima by estimating states’ values for long-term manipulation.
|
|
DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Object Manipulation
Yan Zhao*,
Ruihai Wu*,
Zhehuan Chen,
Yourong Zhang,
Qingnan Fan,
Kaichun Mo,
Hao Dong
ICLR 2023
project page
/
paper
/
code
/
video
We study collaborative affordance for dual-gripper manipulation. The core is to reduce the quadratic problem for two grippers into two disentangled yet interconnected subtasks.
|
|
Learning Part Motion of Articulated Objects Using Spatially Continuous Neural Implicit Representations
Yushi Du*,
Ruihai Wu*,
Yan Shen,
Hao Dong
BMVC 2023
project page
/
paper
/
code
/
video
We introduce a novel framework that explicitly disentangles the part motion of articulated objects by predicting the movements of articulated parts using spatially continuous neural implicit representations.
|
|
AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions
Yian Wang*,
Ruihai Wu*,
Kaichun Mo*,
Jiaqi Ke,
Qingnan Fan,
Leonidas J. Guibas,
Hao Dong
ECCV 2022
project page
/
paper
/
code
/
video
We study performing very few test-time interactions to quickly adapt affordance priors into more accurate instance-specific posteriors.
|
|
VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects
Ruihai Wu*,
Yan Zhao*,
Kaichun Mo*,
Zizheng Guo,
Yian Wang,
Tianhao Wu,
Qingnan Fan,
Xuelin Chen,
Leonidas J. Guibas,
Hao Dong
ICLR 2022
project page
/
paper
/
code
/
video
We study dense geometry-aware, interaction-aware, and task-aware visual action affordance and trajectory proposals for manipulating articulated objects.
|
|
DMotion: Robotic Visuomotor Control with Unsupervised Forward Model Learned from Videos
Haoqi Yuan*,
Ruihai Wu*,
Andrew Zhao*,
Haipeng Zhang,
Zihan Ding,
Hao Dong
IROS 2021
project page
/
paper
/
code
We train a forward model from video data only, disentangling the motion of the controllable agent to model the transition dynamics.
|
|
Unpaired Image-to-Image Translation using Adversarial Consistency Loss
Yihao Zhao,
Ruihai Wu,
Hao Dong
ECCV 2020
project page
/
paper
/
code
/
video
We propose adversarial consistency loss for image-to-image translation that does not require the translated image to be translated back to the source image.
|
|
Localize, Assemble, and Predicate: Contextual Object Proposal Embedding for Visual Relation Detection
Ruihai Wu,
Kehan Xu,
Chenchen Liu,
Nan Zhuang,
Yadong Mu
AAAI 2020
(Oral presentation)
paper
We propose the Localize-Assemble-Predicate Network (LAP-Net), which decomposes visual relation detection (VRD) into three sub-tasks to tackle the long-tailed data distribution problem.
|
|
TDMPNet: Prototype Network with Recurrent Top-Down Modulation for Robust Object Classification under Partial Occlusion
Mingqing Xiao,
Adam Kortylewski,
Ruihai Wu,
Siyuan Qiao,
Wei Shen,
Alan Yuille
ECCV 2020 Visual Inductive Priors for Data-Efficient Deep Learning Workshop
paper
We introduce prototype learning, partial matching, and convolution layers with top-down modulation into feature extraction to purposefully reduce contamination by occlusion.
|
Reviewer: ICCV, ICLR, ICRA, RA-L, T-RO
Volunteer: Wine 2020
|
Teaching Assistant, Deep Generative Models, 2020, 2022
Guest Lecturer at Frontier Computing Research Practice, 2024
|
Unified Simulation, Benchmark and Manipulation for Garments,     AnySyn3D, 2024
      --- If you are interested in 3D vision research, you are welcome to follow AnySyn3D, which covers a variety of topics.
Visual Representations for Embodied Agent,     Chinese University of Hong Kong, Shenzhen, 2024
Visual Representations for Embodied Agent,     China3DV, 2024
|
Merit Student with National Scholarship (top 1 of 50, CFCS),     Peking University, 2024
Outstanding Student Workshop Speaker (one of 8 Ph.D. students), China3DV,     China, 2024
Nominee (~9 in China), Apple Scholarship,     Worldwide, 2024
Finalist, ByteDance Scholarship,     China, 2023, 2024
Jiukun Scholarship (10 in School of CS),     Peking University, 2023
Research Excellence Award,     Peking University, 2019, 2022, 2023
Excellent Graduate,     Peking University, 2020
Peking University’s Third-class Scholarship,     Peking University, 2019
Star of Tomorrow Excellent Intern,     Microsoft Research Asia, 2019
May Fourth Scholarship and Academic Excellence Award,     Peking University, 2018
Bronze medal in National Olympiad in Informatics (NOI),     China Computer Federation, 2015
First prize in National Olympiad in Informatics in Provinces (NOIP),     China Computer Federation, 2013, 2014, 2015
|
Website template comes from Jon Barron
Last update: October 2024