Xiaohan Yan

Xiaohan Yan 颜小涵

I am currently a Master's student in Computer Science at the CAD Research Center, Tongji University.

I did my B.Sc. in Computing Science at Hohai University.

My research interests are computer vision, multimodal learning, and reinforcement learning, specifically learning-based methods for 3D point cloud segmentation, multimodal pre-training methods, etc.

If you find any research interests that we might share, feel free to drop me an email. I am always open to potential collaborations.

I am a former ACMer and a former OIer.

Email / CV / GitHub / LinkedIn

中文 / English / 日本語

Short Bio

I am an M.Sc. student in Computer Science at the CAD Research Center, Tongji University, where I am honored to be advised by Assoc. Prof. Gang Wei.

Before that, I received my B.Sc. degree in Computing Science from Hohai University in 2022, where I was honored as a Charming Graduate of Hohai University.

I am the former captain of the ACM team at Hohai University, and I chaired the 10th and 11th Hohai University ACM Programming Competitions. I also ran the Hohai Online Judge website for a year.

I am a former OIer at JiangSu DaFeng Senior High School. It was during that time that I became interested in computer science.

I was born on May 21st, 2000 in Yancheng, China. My hometown lies on the shores of the Yellow Sea and has a national nature reserve; it is also known as the home of the milu (Père David's deer).

Research interests

I work at the intersection of computer vision and multimodal learning, developing new deep learning methods for challenging problems in 3D vision and text-image alignment, with a particular focus on segmentation, pre-trained models, and scene understanding.

My long-term goal is to improve the application of 3D Vision, benefiting society directly by improving people's living environment.

News

[2024/5/23] Our paper "RE0: Recognize Everything with 3D Zero-shot Open-Vocabulary Instance Segmentation" has been submitted to NeurIPS2024.

[2024/4/29] I started a research internship at NIO, Shanghai.

[2024/4/28] Our paper "AttenPoint: Exploring Point Cloud Segmentation through Attention-Based Modules" has been submitted to PRCV2024.

[2024/3/8] Our paper "Anatomical Structure-Guided Medical Vision-Language Pre-training" has been submitted to MICCAI2024.

[2024/1/8] I started a research internship at the Institute for AI Industry Research (AIR), Tsinghua University.

Internship Experiences

More details are available in my CV.

• Research Internship at NIO, Shanghai.	April 2024 - Present
• Research Internship at the Institute for AI Industry Research (AIR), Tsinghua University.	January 2024 - March 2024

Research

Much of my research is about inferring the physical world (shape, motion, color, light, etc.) from images and raw 3D data. Representative work is highlighted.

RE0: Recognize Everything with 3D Zero-shot Open-Vocabulary Instance Segmentation
Xiaohan Yan, Zijian Jiang, Yinghao Shuai, Nana Wang, Xiaowei Song
NeurIPS2024, 2024-5, Code coming soon, Paper coming soon

We leverage the 3D geometric information in the point cloud, the projection relationship between the 3D point cloud and multi-view posed 2D RGB-D frames, and the semantic features extracted by CLIP from those frames to address the challenge of 3D instance segmentation.

AttenPoint: Exploring Point Cloud Segmentation through Attention-Based Modules
Xiaohan Yan, Nana Wang, Xiaowei Song
PRCV2024, 2024-4, Code coming soon, Paper coming soon

Similar to how humans perceive 3D objects, neural networks discern the class labels of point clouds by combining local and global features of the structures and performance. Based on this, we reviewed the pipeline of few-shot point cloud semantic segmentation and identified three issues.

GreedyAgent: A Simple yet Efficient Approach for Meta-learning from Learning Curves
Jinyu He, Xiaowei Song, Xiaohan Yan, Nana Wang
ICIC2024 oral, 2024-4, Code, Paper coming soon

Meta-learning plays an increasingly important role in AutoML. A key sub-problem, meta-learning from learning curves, is an immature area that is gradually attracting attention within the field of meta-learning.

Anatomical Structure-Guided Medical Vision-Language Pre-training
Qingqiu Li, Xiaohan Yan
MICCAI2024, 2024-3, Code, Paper

Learning medical visual representations through vision-language pre-training faces several challenges: local alignment lacks interpretability and clinical relevance, and the internal and external representations of image-report pairs are insufficiently learned. To address these issues, we propose an Anatomical Structure-Guided (ASG) framework.

Projects

Many of my projects are about inferring the physical world (shape, motion, color, light, etc.) from images and raw 3D data. Representative projects are highlighted.

End-to-end-SegmentAnything3D
Xiaohan Yan, Nan Wang
Kaggle, 2023-10, Code

This project aims to use Segment Anything 3D to segment a .ply point cloud without 2D labels. We use a pcd2rgb method to generate 2D RGB and depth images. We then align the inputs and generate the .ply output.

LLM Science Exam - Use LLMs to answer difficult science questions
Xiaohan Yan, Nan Wang, Xiaowei Song, Jinyu He
Kaggle, 2023-10, Code

We scored 0.905 on the leaderboard, placing in the top 4%. We gathered Wikipedia knowledge related to the science questions and used a bag-of-words model to clean the data. We then used a sentence transformer to measure the similarity between each problem and the cleaned dataset. Finally, we trained three large DeBERTa models on different datasets and combined their features to infer the correct answer.

Stable Diffusion - Image to Prompts
Xiaohan Yan, Nan Wang, Xiaowei Song
Kaggle, 2023-05, Code

For images generated from text using Stable Diffusion, we use three models: BLIP+CLIP, OFA, and ViT. We then combine their features to predict the prompt text for a given generated image.

HUAWEI Robot Path Planning for CodeCraft
Xiaohan Yan, Nan Wang, Xiaowei Song
CodeCraft, 2023-03, Code

This project targets a HUAWEI robot application. It requires us to assign policies, control scheduling, and plan paths for multiple robots on a single map.

Selected awards

• The 2019 ICPC Asia-East Continent Final - Bronze Medal (2019)

• CCF Collegiate Computer Systems & Programming Contest - Silver Medal (2019)

• Jiangsu Collegiate Programming Contest - Silver Medal, 2nd place (2020)

• CCF Certified Software Professional - 320 (Top 0.88%) (2020)

• Hohai University Academic & Science and Technology Scholarship (2019 - 2021)

• Hohai University Charming Graduates (2022)

• The rest of the awards

What do I do in my spare time?

InsideOut
Origami-hui, Xiaohan Yan
GameJam, 2023-04, Project Page

As an incarnation of matter inhaled by the deep breath, you should try your best to avoid sinking into the human body.

"WASD" move; "R" recover "Oxygen"; "E" interact with environment; "left Shift" sprint; "left mouse button" attack; "right mouse button" parry.

eScape
Origami-hui, Xiaohan Yan
GameJam, 2023-12, Project Page

Scale your device and escape from this geometry storm.

This game ranked 1st in "Innovation" and 2nd in "Theme interpretation" at Game Off 2023.

Misc

Japanese🇯🇵:
        I am currently learning Japanese, and I plan to take the JLPT N2 exam in summer 2024.

Sports🏃‍♂️:
        Swimming🏊: swimming was my hobby when I was a kid, and I hit 39.22s in the 50m backstroke.
        Go, Badminton🏸️, Flying Disc🥏.

Games🎮:
        I love to play Pokémon-related games such as PTCG, Pokémon Legends: Arceus, etc.
        I am a fan of Nintendo. I think The Legend of Zelda is the best game.

Last updated on 2024/5/29