WonJun Moon

Hello. I’m WonJun Moon, a postdoctoral researcher at KAIST, South Korea. I received my Ph.D. from Sungkyunkwan University under the supervision of Prof. Jae-Pil Heo. Currently, I am a member of the Computer Vision Lab advised by Prof. Seungryong Kim.

My goal is to develop scalable multimodal video understanding systems that can be deployed in real-world environments. My research focuses on video and image representation learning under multimodal ambiguity, temporal complexity, and limited supervision, with applications spanning retrieval, grounding, and segmentation. Most recently, I have been working on uncovering and enhancing the visual reasoning processes of Multimodal Large Language Models.

Publications

(* : equal contribution)

International

Video Object-Centric Learning (Compact visual representation / Efficiency)   Text-Video Retrieval & Grounding   Semantic Segmentation   Vision-Language Models & Multimodal   Robustness (Few-Shot & OOD & Long-tailed Recognition)

ECCV 2026 WonJun Moon, Jae-Pil Heo Selective Synergistic Learning for Video Object-Centric Learning
[Arxiv] [Code] [Project]
CVPR 2026 WonJun Moon, Hyun Seok Seong, Jae-Pil Heo Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning
[Arxiv] [Code]
CVPR 2026 Yerim Jeon, Miso Lee, WonJun Moon, Jae-Pil Heo Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding
[Arxiv] [Code]
CVPR 2026 ByeongCheol Lee, Hyun Seok Seong, Sangeek Hyun, Gilhan Park, WonJun Moon, Jae-Pil Heo Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
[Arxiv] [Code]
ICLR 2026 Hyun Seok Seong*, WonJun Moon*, Jae-Pil Heo From Vicious to Virtuous Cycles: Synergistic Representation Learning for Unsupervised Video Object-Centric Learning
[Arxiv] [Paper] [Code]
NeurIPS 2025 WonJun Moon*, MinSeok Jung*, Gilhan Park, Tae-Young Kim, Cheol-Ho Cho, Woojin Jun, Jae-Pil Heo Mitigating Semantic Collapse in Partially Relevant Video Retrieval
[Arxiv] [Paper] [Code]
ICCV 2025 WonJun Moon*, Hyun Seok Seong*, Jae-Pil Heo Selective Contrastive Learning for Weakly Supervised Affordance Grounding
[Arxiv] [Code]
ICCV 2025 WonJun Moon, Cheol-Ho Cho, Woojin Jun, Minho Shim, Taeoh Kim, Inwoong Lee, Dongyoon Wee, Jae-Pil Heo Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval
[Arxiv]
CVPR 2025 (oral) SuBeen Lee, WonJun Moon, Hyun Seok Seong, Jae-Pil Heo Temporal Alignment-Free Video Matching for Few-shot Action Recognition
[Arxiv] [Paper] [Code]
AAAI 2025 Cheol-Ho Cho, WonJun Moon, Woojin Jun, MinSeok Jung, Jae-Pil Heo Ambiguity-Restrained Text-Video Representation Learning for Partially Relevant Video Retrieval
[Arxiv]
AAAI 2025 Woojin Jun, WonJun Moon, Cheol-Ho Cho, MinSeok Jung, Jae-Pil Heo Bridging the Semantic Granularity Gap Between Text and Frame Representations for Partially Relevant Video Retrieval
[Paper]
TPAMI 2024 SuBeen Lee, WonJun Moon, Hyun Seok Seong, Jae-Pil Heo Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification
[Arxiv] [Paper]
ECCV 2024 Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
[Arxiv] [Code]
ECCV 2024 Gilhan Park, WonJun Moon, SuBeen Lee, Tae-Young Kim, Jae-Pil Heo Mitigating Background Shift in Class-Incremental Semantic Segmentation
[Arxiv] [Code]
Pattern Recognition 2025 WonJun Moon, Sangeek Hyun, SuBeen Lee, Jae-Pil Heo Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding
[Arxiv] [Code]
AAAI 2024 Seunggu Kang, WonJun Moon, Euiyeon Kim, Jae-Pil Heo VLCounter: Text-aware Visual Representation for Zero-Shot Object Counting
[Arxiv] [Paper] [Code]
CVPR 2023 WonJun Moon*, Sangeek Hyun*, Sanguk Park, Dongchan Park, Jae-Pil Heo Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
[Arxiv] [Paper] [Code] [Video]
CVPR 2023 Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo Leveraging Hidden Positives for Unsupervised Semantic Segmentation
[Arxiv] [Paper] [Code]
AAAI 2023 (oral) WonJun Moon, Hyun Seok Seong, Jae-Pil Heo Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition
[Arxiv] [Paper] [Code] [Video]
ECCV 2022 WonJun Moon, Ji-Hwan Kim, Jae-Pil Heo Tailoring Self-Supervision for Supervised Learning
[Arxiv] [Paper] [Code] [Video]
ECCV 2022 WonJun Moon, Junho Park, Hyun Seok Seong, Cheol-Ho Cho, Jae-Pil Heo Difficulty-Aware Simulator for Open Set Recognition
[Arxiv] [Paper] [Code] [Video]
CVPR 2022 (oral) SuBeen Lee, WonJun Moon, Jae-Pil Heo Task Discrepancy Maximization for Fine-grained Few-Shot Classification
[Arxiv] [Paper] [Code]

Domestic

KIISE 2022 WonJun Moon, Jae-Pil Heo Learning from Data Imbalance with Class Grouping Loss
KIISE 2020 WonJun Moon, Jae-Pil Heo Mix-Contrastive Match

Experience & Achievements

Reviewer

  • Conference : CVPR, ICCV, ECCV, NeurIPS
  • Journal : TPAMI

Education

  • Ph.D. Dept of Artificial Intelligence, Sungkyunkwan University

  • M.S. Dept of Artificial Intelligence, Sungkyunkwan University
    GPA : 4.38 / 4.5

  • B.S. Dept of Software, Sungkyunkwan University
    GPA : 4.42 / 4.5

  • Language : Korean, English
