LabOR [Kor]


An English version of this article is available.

1. Problem definition

  • Domain Adaptation (DA)

    • Domain adaptation is an important area of computer vision.

    • The core goal of domain adaptation is to make a neural network trained on a source domain also perform well on a target dataset. The target dataset usually differs substantially in style from the source, so a network trained on the source can suffer a severe performance drop on the target. Solving this problem is what DA is about.

  • Unsupervised Domain Adaptation (UDA)

    • The aim is to make a network trained on a fully labeled source dataset also work well on the target domain. The target dataset has no labels, so the model must be trained further with unsupervised learning techniques.

    • Despite extensive research on UDA, its performance remains markedly lower than that of models trained with supervision.

  • Domain Adaptation with few target labels

    • Because of this weakness of UDA, some researchers have started to consider using a very small amount of label information from the target dataset.

    • This line of work reflects the idea that collecting a very small number of labels does not require much time or money.

  • Semantic segmentation

    • The task of separating the objects in an image precisely, down to their boundaries. Every pixel is assigned a label.

2. Motivation

  • The paper asks how to get the most model performance out of the least target label information, that is, with the annotator (the person doing the labeling) investing minimal effort and time.

  • Its motivating question is: which pixels of an image should be labeled so that the segmentation model trains to the best possible performance?

  • In other words, the main task of the paper is finding the points that need labeling, which can also be described as efficient pixel-level sampling.

Related work

  1. Unsupervised Domain Adaptation

    • Adversarial learning aims to minimize the gap between the feature distributions that the model produces for source and target inputs.

    • Despite many unsupervised DA studies, there is still a pronounced performance gap between models developed this way and models developed with supervised learning.

  2. Domain Adaptation with few target labels

    • These papers usually work at the image level, asking "which images should be labeled to maximize model performance?"

    • In contrast, this paper asks "which pixels should be labeled to maximize model performance?"

Idea

  • The paper adds a new prediction model and uses it to find "uncertain regions". The idea is that if the annotator labels only these uncertain regions, a high-performing model can be built with few resources.

  • Put differently, an "uncertain region" can be read as the points or regions with the greatest potential to improve performance.

3. Method

3.1 Method Summary

  • Please read the steps below together with the figure.

  • The order of the steps matches the green numbers in the figure.

  1. The pixel selector model consists of one shared backbone and two classifiers.

  2. A mini-batch of target images is passed through the backbone and both classifiers, producing two segmentation predictions, one per classifier.

  3. The two segmentation predictions are used to find the inconsistent mask (the region where the two predictions disagree).

  4. Guided by the inconsistent mask, the annotator labels a small part of the target image. These labels are used to train the semantic segmentation model.
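Step 3 can be illustrated with a minimal sketch, assuming each classifier's output has already been reduced to a per-pixel class id (argmax). The names `inconsistency_mask` and `mask_ratio` and the toy label maps are hypothetical and are not taken from the paper's code.

```python
from typing import List

Grid = List[List[int]]  # H x W grid of predicted class ids


def inconsistency_mask(pred_a: Grid, pred_b: Grid) -> List[List[bool]]:
    """Mark pixels where the two classifiers predict different classes."""
    same_shape = len(pred_a) == len(pred_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(pred_a, pred_b)
    )
    if not same_shape:
        raise ValueError("prediction maps must have the same shape")
    return [
        [a != b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pred_a, pred_b)
    ]


def mask_ratio(mask: List[List[bool]]) -> float:
    """Fraction of pixels the annotator would be asked to label."""
    total = sum(len(row) for row in mask)
    return sum(cell for row in mask for cell in row) / total if total else 0.0


# Toy 3x4 label maps from two hypothetical classifier heads.
pred_1 = [[0, 0, 1, 1],
          [0, 2, 1, 1],
          [2, 2, 2, 1]]
pred_2 = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [2, 2, 1, 1]]
mask = inconsistency_mask(pred_1, pred_2)
ratio = mask_ratio(mask)  # share of disagreeing pixels
```

In this toy case the two heads disagree on two of twelve pixels, so only those two would be sent to the annotator.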

3.2 Details of methods

  • Loss1,2: Cross-entropy losses that train the UDA model (the segmentation model) on the original source labels and on the sparse target labels produced by the annotator.

  • Equ 4: The formula for finding the inconsistent mask.
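As a rough illustration of how Loss1,2 can work when only a few target pixels carry labels, the sketch below averages cross-entropy over labeled pixels only. It assumes per-pixel softmax outputs flattened into a list and uses `None` to mark unlabeled pixels. Both conventions and the name `sparse_cross_entropy` are assumptions of this sketch, not the paper's implementation.

```python
import math
from typing import List, Optional


def sparse_cross_entropy(probs: List[List[float]],
                         labels: List[Optional[int]]) -> float:
    """Mean cross-entropy over labeled pixels only.

    probs[i] is the softmax output for pixel i.
    labels[i] is the class id for pixel i, or None if it was not labeled.
    """
    losses = [
        -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        for p, y in zip(probs, labels)
        if y is not None
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Three pixels, two classes; the middle pixel has no annotation.
probs = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
labels = [0, None, 1]
loss = sparse_cross_entropy(probs, labels)
```

Unlabeled pixels contribute nothing to the loss, so the same function covers both fully labeled source images and sparsely labeled target images.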

3.3 Segment-based and Point-based

  • The paper finds regions to label in two ways: "Segment based Pixel-Labeling (SPL)" and "Point based Pixel-Labeling (PPL)".

  • SPL uses the region where the predictions of the two classifiers differ (the inconsistent mask above) as is.

  • As the figure below shows, this can produce regions that are very hard to label. For better efficiency, PPL picks only 20 to 40 points from that region. The point selection proceeds as follows.

    1. Define the set of uncertain pixels D^(k).

    2. Compute the class prototype vector (the class center) for each class as the mean over that set.

    3. Find the element most similar to the class prototype and take it as the point to label for PPL.
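The three PPL steps can be sketched as follows. This is a minimal illustration under stated assumptions: each uncertain pixel is represented by its softmax vector, D^(k) groups the uncertain pixels by predicted class k, and similarity is measured with cosine similarity. The names `cosine` and `select_points` and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int]  # (row, col)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_points(uncertain: Dict[Pixel, List[float]]) -> Dict[int, Pixel]:
    """Pick one representative uncertain pixel per predicted class.

    uncertain maps each pixel inside the inconsistent mask to its softmax vector.
    """
    # Step 1: group uncertain pixels by predicted class k, giving the sets D^(k).
    groups: Dict[int, List[Pixel]] = {}
    for pixel, prob in uncertain.items():
        k = max(range(len(prob)), key=prob.__getitem__)
        groups.setdefault(k, []).append(pixel)

    selected: Dict[int, Pixel] = {}
    for k, pixels in groups.items():
        # Step 2: class prototype = mean probability vector over D^(k).
        dim = len(uncertain[pixels[0]])
        proto = [sum(uncertain[p][c] for p in pixels) / len(pixels)
                 for c in range(dim)]
        # Step 3: keep the pixel whose vector is most similar to the prototype.
        selected[k] = max(pixels, key=lambda p: cosine(uncertain[p], proto))
    return selected


# Four uncertain pixels over two classes (toy values).
uncertain = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.6, 0.4],
    (1, 0): [0.7, 0.3],
    (2, 2): [0.2, 0.8],
}
points = select_points(uncertain)
```

This sketch returns one point per class. Reaching the 20 to 40 points per image mentioned above would need a further rule for how many points to take, which is not covered here.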

4. Experiment & Result

Experimental setup

  • Models used in the experiments: (1) ResNet101 (2) Deeplab-V2

Result

  • Figure 1

    1. It is striking that SPL performs almost on par with fully supervised learning.

  • Table 1

    1. This table compares the method numerically against recent strong papers.

    2. Here too, PPL, a semi-supervised approach that uses very few labels, clearly performs close to supervised learning with full labels.

  • Figure 2

    1. Shows the qualitative differences.

    2. The proposed SPL produces more accurate segmentation results than the other methods.

5. Conclusion

  • The paper proposes a domain-adaptive semantic segmentation method that needs little annotator effort.

  • It proposes two pixel selection methods: segment-based (SPL) and point-based (PPL).

  • Limitations (the post author's opinion)

    1. The paper reports that SPL and PPL label 2.2% of the area and 40 points per image, respectively. That sounds like a very efficient method that uses very few labels. But when I imagine being the annotator: (1) SPL labeling looks harder than labeling the whole image (see the figure in Section 3.3). (2) The 40 points of PPL are drawn from the SPL region, so most of them are probably labels on object boundaries, which also looks like very hard work.

    2. Whether this is really an efficient, low-cost task needs to be checked against more example images and annotators' hands-on experience.

Take home message

Rather than using a very complex unsupervised DA method to gain a very small amount of performance, it may be more efficient to obtain a supervision signal at low cost: a small amount of cheap labeling can improve a model very effectively.

Author / Reviewer information

Author

  1. 신인규 (Inkyu Shin)

    • KAIST / RCV Lab

    • https://dlsrbgg33.github.io/

  2. 김동진 (DongJin Kim)

    • KAIST / RCV Lab

    • https://sites.google.com/site/djkimcv/

  3. 조재원 (JaeWon Cho)

    • KAIST / RCV Lab

    • https://chojw.github.io/

Reference & Additional materials

  1. Citation of this paper

  2. Reference for this post

Representative unsupervised DA papers include [AdaptSeg, ADVENT].

To address these shortcomings of unsupervised learning, many researchers have tried using a very small amount of labeled data, as long as it does not demand too many resources. Representative papers include [Alleviating semantic-level shift, Active Adversarial Domain Adaptation, Playing for Data, DA_weak_labels].

The sparse target labels obtained from the annotator and the original source labels are used to train the segmentation model with supervision. At the same time, adversarial learning [AdaptSeg] is used so that images from either domain yield similar feature distributions.

Finally, maximum classifier discrepancy [Maximum classifier discrepancy] is used to train the pixel selector model. It pushes the parameters of the two classifiers away from each other.

Loss3: The adversarial learning loss. See the AdaptSeg paper for details.

Loss5: The pseudo-label loss for training the pixel selector model. The pseudo labels are the predictions of the segmentation model, used as is. The method follows the IAST paper.

Loss6: The classifier discrepancy maximization loss. The method follows the MCDDA paper.

Dataset: The source dataset is GTA5 (synthetic dataset) and the target dataset is Cityscape (real-world data).

LabOR (PPL and SPL) outperforms existing unsupervised DA methods (IAST).

PPL outperforms a recent paper that uses a semi-supervised approach (WDA).
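For Loss6, one common way to measure the disagreement between two classifiers, used in the Maximum Classifier Discrepancy line of work, is the mean absolute difference between their softmax outputs. The sketch below shows that measure on toy values. Whether LabOR uses exactly this form is an assumption, and the name `classifier_discrepancy` is hypothetical.

```python
from typing import List


def classifier_discrepancy(probs_a: List[List[float]],
                           probs_b: List[List[float]]) -> float:
    """Mean absolute difference between two classifiers' class probabilities.

    probs_a[i] and probs_b[i] are the softmax outputs of the two classifiers
    for pixel i.
    """
    total, count = 0.0, 0
    for pa, pb in zip(probs_a, probs_b):
        for a, b in zip(pa, pb):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


# Two pixels, two classes: the heads disagree on the first pixel only.
head_1 = [[0.8, 0.2], [0.5, 0.5]]
head_2 = [[0.6, 0.4], [0.5, 0.5]]
gap = classifier_discrepancy(head_1, head_2)
```

Maximizing this quantity with respect to the classifier parameters is what drives the two heads apart, so that their disagreement highlights uncertain pixels.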
Towards Fewer Annotations: Active Learning via Region Impurity and Prediction Uncertainty for Domain Adaptive Semantic Segmentation
D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation
Unsupervised Domain Adaptation for Semantic Image Segmentation: a Comprehensive Survey
ADeADA: Adaptive Density-aware Active Domain Adaptation for Semantic Segmentation
MCDAL: Maximum Classifier Discrepancy for Active Learning
AdaptSeg
ADVENT
IAST
Alleviating semantic-level shift
Active Adversarial Domain Adaptation
Playing for Data
DA_weak_labels
Maximum classifier discrepancy
WDA