LabOR [Kor]

An English version of this article is available.

1. Problem definition

  • Domain Adaptation (DA)

    • Domain adaptation is an important subfield of computer vision.

    • The core goal of domain adaptation is to make a neural network trained on a source domain also perform well on a target dataset. The target dataset usually differs considerably in style from the source, so a network trained on the source can suffer a severe performance drop on the target. Solving this problem is what DA is about.

  • Unsupervised Domain Adaptation (UDA)

    • UDA aims to make a network trained on a fully labeled source dataset also work well on the target domain. The target dataset has no labels at all, so the model must be further trained with unsupervised learning techniques.

    • Despite much research on UDA, its performance remains markedly lower than that of models trained with supervised learning.

  • Domain Adaptation with few target labels

    • Because of this weakness of UDA, some researchers began to consider using a very small amount of label information from the target dataset.

    • The idea behind this line of work is that collecting a very small number of labels does not require much effort or cost.

  • Semantic segmentation

    • The task of delineating the objects in an image precisely, down to their boundaries. Every pixel is assigned a label.

2. Motivation

  • The paper asks how to get the most performance out of a model while using as few target labels as possible, that is, while the annotator (the person doing the labeling) invests minimal time and effort.

  • The motivating question is: "Which pixels of an image should be labeled so that the segmentation model trains to the best performance?"

  • In other words, the paper's main task is finding the points that need labeling, which can also be described as efficient pixel-level sampling.

  1. Unsupervised Domain Adaptation

    • Adversarial learning aims to minimize the difference between the feature distributions the model produces for source and target inputs.

    • Representative papers include [AdaptSeg, ADVENT].

    • Despite much research on unsupervised DA, there is still a large performance gap between models built with these techniques and models trained with supervised learning.

  2. Domain Adaptation with few target labels

    • To address this problem of unsupervised learning, many researchers have tried using a small amount of labeled data, within a budget that does not demand too many resources. Representative papers include [Alleviating semantic-level shift, Active Adversarial Domain Adaptation, Playing for Data, DA_weak_labels].

    • These papers usually work at the image level, asking "Which images should be labeled to maximize model performance?"

    • In contrast, this paper asks "Which pixels should be labeled to maximize model performance?"

Idea

  • The paper adds a new predictor and uses it to find "uncertain regions". The idea is that if the annotator labels only these uncertain regions, a high-performing model can be built with few resources.

  • Put differently, the "uncertain regions" can be interpreted as the points and regions with the greatest potential for performance gain.

3. Method

3.1 Method Summary

  • Please read the steps below together with the figure.

  • The step numbers match the green numbers in the figure.

  1. The pixel selector model consists of one shared backbone and two classifiers.

  2. A mini-batch of target images is passed through the backbone and both classifiers, which yields two segmentation predictions, one per classifier.

  3. The two segmentation predictions are used to find the inconsistent mask, i.e. the region where the two predictions differ.

  4. Guided by the inconsistent mask, the annotator labels a small number of target pixels. These labels are used to train the semantic segmentation model.

  5. The sparse target labels obtained above and the original source labels are used to train the segmentation model in a supervised manner. At the same time, adversarial learning [AdaptSeg] is applied so that features follow a similar distribution regardless of which domain an image comes from.

  6. Finally, the pixel selector model is trained with maximum classifier discrepancy [Maximum classifier discrepancy], which pushes the parameters of the two classifiers away from each other.

(Figure: overview of the method, with green numbers marking the steps above)
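Steps 2 and 3 can be illustrated with a small sketch. It assumes that "the predictions differ" means the per-pixel argmax classes of the two classifiers disagree; the helper names `argmax_map` and `inconsistency_mask` and the toy scores are made up for illustration and are not taken from the paper's code.

```python
def argmax_map(scores):
    """Per-pixel predicted class from an H x W x C nested list of class scores."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


def inconsistency_mask(pred_a, pred_b):
    """1 where the two classifiers predict different classes, else 0."""
    return [[int(a != b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_a, pred_b)]


# Toy 2 x 3 image with 3 classes; the scores are invented for this example.
scores_a = [[[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]],
            [[0.6, 0.3, 0.1], [0.4, 0.5, 0.1], [0.3, 0.3, 0.4]]]
scores_b = [[[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]],
            [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6], [0.2, 0.2, 0.6]]]

pred_a = argmax_map(scores_a)
pred_b = argmax_map(scores_b)
mask = inconsistency_mask(pred_a, pred_b)

# Fraction of pixels the annotator would be asked to label.
labeled_ratio = sum(map(sum, mask)) / (len(mask) * len(mask[0]))
```

In this toy case only the middle column is flagged, so the annotator would label two of the six pixels.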

3.2 Details of methods

  • Loss1,2: Cross-entropy losses that train the UDA model (the segmentation model) on the original source labels and on the sparse target labels produced by the annotator.

  • Loss3: The adversarial loss. See the AdaptSeg paper for details.

  • Equ 4: The formula for computing the inconsistent mask.

  • Loss5: The pseudo-label loss used to train the pixel selector model. The pseudo labels are the predictions of the segmentation model, used as-is. The detailed procedure follows the IAST paper.

  • Loss6: The classifier discrepancy maximization loss. The procedure follows the MCDDA paper.
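To make the cross-entropy on sparse target labels more concrete, here is a minimal sketch. It assumes the loss is averaged only over pixels the annotator actually labeled; the `IGNORE` marker and the function name `sparse_cross_entropy` are hypothetical and not from the paper.

```python
import math

IGNORE = -1  # hypothetical marker for pixels the annotator did not label


def sparse_cross_entropy(probs, labels, ignore=IGNORE):
    """Mean cross-entropy over labeled pixels only.

    probs:  H x W x C nested list of per-pixel class probabilities.
    labels: H x W nested list of class ids, or `ignore` for unlabeled pixels.
    """
    losses = [-math.log(px[y])
              for prob_row, label_row in zip(probs, labels)
              for px, y in zip(prob_row, label_row)
              if y != ignore]
    # With no labeled pixel there is nothing to supervise, so return 0.
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2 x 2 image with 2 classes; only two pixels carry a label.
probs = [[[0.5, 0.5], [0.9, 0.1]],
         [[0.2, 0.8], [0.25, 0.75]]]
labels = [[IGNORE, 0],
          [1, IGNORE]]
loss = sparse_cross_entropy(probs, labels)
```

Unlabeled pixels contribute nothing to the gradient here, which is one simple way to train on labels that cover only a small part of each target image.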

(Figure: the losses and equation referenced above)
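The classifier discrepancy can be sketched as follows, assuming the usual maximum classifier discrepancy formulation: the mean absolute difference between the two classifiers' softmax outputs. The function names and toy logits are illustrative only, and the exact form used in LabOR may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discrepancy(logits_a, logits_b):
    """Mean absolute difference between the two classifiers' softmax outputs,
    averaged over all pixels and classes (H x W x C nested lists)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(logits_a, logits_b):
        for px_a, px_b in zip(row_a, row_b):
            for p, q in zip(softmax(px_a), softmax(px_b)):
                total += abs(p - q)
                count += 1
    return total / count


# Toy 1 x 2 image with 2 classes.
same = [[[2.0, 0.0], [0.0, 1.0]]]
other = [[[0.0, 2.0], [0.0, 1.0]]]
d_same = discrepancy(same, same)    # identical classifiers: no discrepancy
d_diff = discrepancy(same, other)   # the classifiers disagree on the first pixel
```

Training the pixel selector to increase this quantity on target images drives the two classifiers apart, so they disagree where the prediction is uncertain.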

3.3 Segment-based and Point-based

  • The paper finds the regions to label in two ways: the segment-based "Segment based Pixel-Labeling (SPL)" and the point-based "Point based Pixel-Labeling (PPL)".

  • SPL uses the difference between the two classifiers' predictions (the inconsistency mask above) directly as the region to label.

  • As the figure below shows, this region can be very hard to label. For better efficiency, PPL picks only 20 to 40 points from this region. The selection process is as follows.

    1. Define the set of uncertain pixels D^(k).

    2. Compute the class prototype vector for each class as the mean over that class.

    3. Find the element most similar to the class prototype and select it as a PPL point.

(Figure: example of the regions selected for labeling)
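The three PPL steps can be sketched as below. This is only an illustration under stated assumptions: each uncertain pixel carries a vector (for example its class probabilities), similarity is measured with cosine similarity, and one point is chosen per class. The function names and the toy data are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (the class prototype)."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_points(uncertain):
    """uncertain: list of (pixel_coord, class_id, vector) for pixels in the mask.
    Returns {class_id: pixel_coord}, one representative point per class."""
    by_class = {}
    for coord, cls, vec in uncertain:            # step 1: group into D^(k)
        by_class.setdefault(cls, []).append((coord, vec))
    chosen = {}
    for cls, items in by_class.items():
        proto = mean_vector([vec for _, vec in items])                   # step 2
        coord, _ = max(items, key=lambda item: cosine(item[1], proto))   # step 3
        chosen[cls] = coord
    return chosen


uncertain = [
    ((0, 1), 0, [1.0, 0.0]),
    ((0, 2), 0, [0.8, 0.6]),
    ((1, 1), 0, [0.0, 1.0]),
    ((3, 4), 1, [0.0, 1.0]),
]
points = select_points(uncertain)
```

For class 0 the pixel at (0, 2) is chosen because its vector lies closest in direction to the mean of the three class-0 vectors.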

4. Experiment & Result

Experimental setup

  • Datasets: the source dataset is GTA5 (synthetic) and the target dataset is Cityscapes (real-world).

  • Models used in the experiments: (1) ResNet101 (2) Deeplab-V2

Result

(Figure: the results discussed below as Figure 1, Table 1, and Figure 2)

  • Figure 1

    1. LabOR (both PPL and SPL) outperforms previous unsupervised DA methods such as IAST.

    2. Remarkably, SPL reaches performance almost on par with fully supervised learning.

    3. PPL outperforms WDA, a recent semi-supervised approach.

  • Table 1

    1. This table compares performance numerically against recent state-of-the-art papers.

    2. Here too, SPL, a semi-supervised method that uses very few labels, performs on par with supervised learning that uses full labels.

  • Figure 2

    1. Shows the qualitative difference in results.

    2. The proposed SPL produces more accurate segmentation results than the other methods.

5. Conclusion

  • The paper proposes a domain adaptive semantic segmentation method that keeps the annotator's effort low.

  • It proposes two pixel selection methods: segment-based (SPL) and point-based (PPL).

  • Limitations (the post author's opinion)

    1. The paper reports that SPL and PPL label 2.2% of the area and 40 points per image, respectively. That sounds like a very efficient use of few labels. But when I ask myself "What if I were the annotator?": (1) labeling the SPL regions looks harder than labeling the whole image (see the figure in Section 3.3); (2) the 40 points of PPL are also drawn from the SPL region, so most of them probably lie on object boundaries, which also looks very hard to label.

    2. Whether this is really an efficient, low-cost task needs to be checked with more example images and hands-on annotation experience.

Take home message

It may be more efficient to obtain a supervision signal at a low cost than to use complex unsupervised methods to achieve very small performance gains.

Author / Reviewer information

Author

  1. 신인규 (Inkyu Shin)

    • KAIST / RCV Lab

    • https://dlsrbgg33.github.io/

  2. 김동진 (DongJin Kim)

    • KAIST / RCV Lab

    • https://sites.google.com/site/djkimcv/

  3. 조재원 (JaeWon Cho)

    • KAIST / RCV Lab

    • https://chojw.github.io/
