WonJun Moon

Hello. I’m WonJun Moon, a postdoctoral researcher at KAIST, South Korea. I received my Ph.D. from Sungkyunkwan University under the supervision of Prof. Jae-Pil Heo. Currently, I am a member of the Computer Vision Lab advised by Prof. Seungryong Kim.

My goal is to develop scalable multimodal video understanding systems that can be deployed in real-world environments. I focus on video and image representation learning under multimodal ambiguity, temporal complexity, and limited supervision, with applications spanning retrieval, grounding, and segmentation. Most recently, I have been working on uncovering and enhancing the visual reasoning processes of Multimodal Large Language Models.

Publications

(^ : equal contribution)

International

Video Object-Centric Learning (Compact visual representation / Efficiency); Text-Video Retrieval & Grounding; Semantic Segmentation; Vision-Language Models & Multimodal Robustness (Few-Shot & OOD & Long-tailed Recognition)

ECCV 2026	WonJun Moon, Jae-Pil Heo	Selective Synergistic Learning for Video Object-Centric Learning [Arxiv] [Code] [Project]
CVPR 2026	WonJun Moon, Hyun Seok Seong, Jae-Pil Heo	Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning [Arxiv] [Code]
CVPR 2026	Yerim Jeon, Miso Lee, WonJun Moon, Jae-Pil Heo	Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding [Arxiv] [Code]
CVPR 2026	ByeongCheol Lee, Hyun Seok Seong, Sangeek Hyun, Gilhan Park, WonJun Moon, Jae-Pil Heo	Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation [Arxiv] [Code]
ICLR 2026	Hyun Seok Seong^, WonJun Moon^, Jae-Pil Heo	From Vicious to Virtuous Cycles: Synergistic Representation Learning for Unsupervised Video Object-Centric Learning [Arxiv] [Paper] [Code]
NeurIPS 2025	WonJun Moon^, MinSeok Jung^, Gilhan Park, Tae-Young Kim, Cheol-Ho Cho, Woojin Jun, Jae-Pil Heo	Mitigating Semantic Collapse in Partially Relevant Video Retrieval [Arxiv] [Paper] [Code]
ICCV 2025	WonJun Moon^, Hyun Seok Seong^, Jae-Pil Heo	Selective Contrastive Learning for Weakly Supervised Affordance Grounding [Arxiv] [Code]
ICCV 2025	WonJun Moon, Cheol-Ho Cho, Woojin Jun, Minho Shim, Taeoh Kim, Inwoong Lee, Dongyoon Wee, Jae-Pil Heo	Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval [Arxiv]
CVPR 2025 (oral)	SuBeen Lee, WonJun Moon, Hyun Seok Seong, Jae-Pil Heo	Temporal Alignment-Free Video Matching for Few-shot Action Recognition [Arxiv] [Paper] [Code]
AAAI 2025	Cheol-Ho Cho, WonJun Moon, Woojin Jun, MinSeok Jung, Jae-Pil Heo	Ambiguity-Restrained Text-Video Representation Learning for Partially Relevant Video Retrieval [Arxiv]
AAAI 2025	Woojin Jun, WonJun Moon, Cheol-Ho Cho, MinSeok Jung, Jae-Pil Heo	Bridging the Semantic Granularity Gap Between Text and Frame Representations for Partially Relevant Video Retrieval [Paper]
TPAMI 2024	SuBeen Lee, WonJun Moon, Hyun Seok Seong, Jae-Pil Heo	Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification [Arxiv] [Paper]
ECCV 2024	Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo	Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation [Arxiv] [Code]
ECCV 2024	Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, Jae-Pil Heo	Mitigating Background Shift in Class-Incremental Semantic Segmentation [Arxiv] [Code]
Pattern Recognition 2025	WonJun Moon, Sangeek Hyun, SuBeen Lee, Jae-Pil Heo	Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding [Arxiv] [Code]
AAAI 2024	Seunggu Kang, WonJun Moon, Euiyeon Kim, Jae-Pil Heo	VLCounter: Text-aware Visual Representation for Zero-Shot Object Counting [Arxiv] [Paper] [Code]
CVPR 2023	WonJun Moon^, Sangeek Hyun^, Sanguk Park, Dongchan Park, Jae-Pil Heo	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection [Arxiv] [Paper] [Code] [Video]
CVPR 2023	Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo	Leveraging Hidden Positives for Unsupervised Semantic Segmentation [Arxiv] [Paper] [Code]
AAAI 2023 (oral)	WonJun Moon, Hyun Seok Seong, Jae-Pil Heo	Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition [Arxiv] [Paper] [Code] [Video]
ECCV 2022	WonJun Moon, Ji-Hwan Kim, Jae-Pil Heo	Tailoring Self-Supervision for Supervised Learning [Arxiv] [Paper] [Code] [Video]
ECCV 2022	WonJun Moon, Junho Park, Hyun Seok Seong, Cheol-Ho Cho, Jae-Pil Heo	Difficulty-Aware Simulator for Open Set Recognition [Arxiv] [Paper] [Code] [Video]
CVPR 2022 (oral)	SuBeen Lee, WonJun Moon, Jae-Pil Heo	Task Discrepancy Maximization for Fine-grained Few-Shot Classification [Arxiv] [Paper] [Code]

Domestic

KIISE 2022	WonJun Moon, Jae-Pil Heo	Learning from Data Imbalance with Class Grouping Loss
KIISE 2020	WonJun Moon, Jae-Pil Heo	Mix-Contrastive Match

Experience & Achievements

Research Intern at Naver Cloud, hosted by Minho Shim (Oct. 2023 - Mar. 2024)
2023 President’s List, Sungkyunkwan University
18th Scholarship student of the Kwanjeong Educational Foundation

Reviewer

Conference : CVPR, ICCV, ECCV, NeurIPS
Journal : TPAMI

Education

Ph.D., Dept. of Artificial Intelligence, Sungkyunkwan University
M.S., Dept. of Artificial Intelligence, Sungkyunkwan University
GPA : 4.38 / 4.5
B.S., Dept. of Software, Sungkyunkwan University
GPA : 4.42 / 4.5
Language : Korean, English
