Two IEEE TCSVT Papers Accepted

Jun 1

The following two papers have been accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (Impact Factor 11.1, JCR Q1). Congratulations!

Title: C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation

Authors: Jeonghyeok Do, Jaehyup Lee, Seungchul Lee, and Munchurl Kim

Abstract:

Synthetic Aperture Radar (SAR) imagery provides robust environmental and temporal coverage (e.g., during clouds, seasons, day-night cycles), yet its noise and unique structural patterns pose interpretation challenges, especially for non-experts. SAR-to-EO (Electro-Optical) image translation (SET) has emerged to make SAR images more perceptually interpretable. However, traditional approaches, trained from scratch on limited SAR-EO datasets, are prone to overfitting and struggle with dataset inconsistencies. To address these challenges, we introduce Confidence Diffusion for SAR-to-EO Translation (C-DiffSET), a framework leveraging a pretrained Latent Diffusion Model (LDM) to effectively adapt its extensive generative priors from natural images to the EO domain. Our investigation reveals that the pretrained VAE encoder effectively aligns SAR and EO images within a shared latent space, demonstrating robustness even to varying noise levels in SAR inputs. To further improve pixel-wise fidelity for SET and mitigate artifacts from temporal discrepancies, such as appearing or disappearing objects, we propose a novel confidence-guided diffusion (C-Diff) loss. This loss dynamically guides the diffusion process to down-weight penalties in uncertain regions, thereby enhancing structural accuracy. C-DiffSET achieves state-of-the-art (SOTA) results on multiple benchmark datasets (QXS-SAROPT, SAR2Opt, SpaceNet6, Stellar-Vision), significantly outperforming recent image-to-image translation methods and specialized SET methods across all standard metrics. The source code and trained models are available at https://github.com/KAIST-VICLab/C-DiffSET.

Title: Diffusion-based Data Augmentation and Knowledge Distillation with Generated Soft Labels for Oil Spill Segmentation Learning on Satellite SAR Images

Authors: Jaeho Moon, Jeonghwan Yun, Jaehyun Kim, Jaehyup Lee, and Munchurl Kim

Abstract:

Semantic segmentation often suffers from a compounded scarcity of data, where both input observations and ground-truth labels are limited. This problem is particularly severe in remote sensing tasks such as oil spill segmentation using Synthetic Aperture Radar (SAR), where rare event occurrence and the difficulty of SAR image acquisition result in limited training data. To address this limitation, we propose a diffusion-based Data Augmentation with Knowledge Transfer (DAKTer) strategy. Our DAKTer strategy enables a diffusion model to generate SAR oil spill images along with soft label pairs, which offer richer class probability distributions than segmentation masks (i.e. hard labels). Also, for reliable joint generation of high-quality SAR images and soft labels with high correspondence, we introduce an SNR-based balancing factor aligning the noise corruption process of both modalities in diffusion models. By leveraging the generated SAR images and soft labels, a student segmentation model can learn robust feature representations without teacher models trained for the same segmentation task, improving its ability to segment oil spill regions. Extensive experiments demonstrate that our DAKTer strategy effectively transfers the knowledge of per-pixel class probabilities to the student segmentation model to distinguish the oil spill regions from other look-alike regions in the SAR images. Our DAKTer strategy boosts various segmentation models to achieve superior performance with large margins compared to other generative data augmentation methods.
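
As a rough illustration of the soft-label idea mentioned in the abstract above, the sketch below contrasts a per-pixel soft label (a full class probability distribution) with the hard label obtained by keeping only the most likely class. It is not the DAKTer implementation: the class names, the logit values, and the helper functions `softmax` and `cross_entropy` are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))

# Hypothetical scores for one pixel over (sea, oil spill, look-alike).
pixel_logits = [0.2, 1.5, 1.1]

# Soft label: keeps how plausible each class is for this pixel.
soft_label = softmax(pixel_logits)

# Hard label: one-hot vector that keeps only the most likely class.
hard_index = max(range(len(soft_label)), key=soft_label.__getitem__)
hard_label = [1.0 if i == hard_index else 0.0 for i in range(len(soft_label))]

# Hypothetical student prediction that cannot tell oil from look-alike.
student_prediction = softmax([0.0, 1.0, 1.0])

loss_soft = cross_entropy(soft_label, student_prediction)
loss_hard = cross_entropy(hard_label, student_prediction)

print("soft label:", [round(p, 3) for p in soft_label])
print("hard label:", hard_label)
print("loss vs soft label:", round(loss_soft, 4))
print("loss vs hard label:", round(loss_hard, 4))
```

In this toy setup the soft label still assigns noticeable probability to the look-alike class, which is the kind of per-pixel class probability information a hard mask discards.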
