MZSR [Kor]

Soh et al. / Meta-Transfer Learning for Zero-shot Super Resolution / CVPR 2020

Preface

What is Transfer Learning?

Transfer learning means taking the weights of a model trained on a very large dataset and re-calibrating them for the task we actually want to solve. As a result, a deep learning model for the target task can be trained even with a relatively small amount of data.

What is Meta Learning?

Meta learning is learning about learning, and it can be viewed at two levels. The first is ordinary learning: finding patterns or features in the given training data and recognizing those features when new data arrives. The second sits one level above model training: learning suitable values for the hyper-parameters.

1. Problem definition

MZSR (Meta-Transfer Learning for Zero-shot Super Resolution), the method proposed in this paper, achieves strong super-resolution performance on a single image with only a few updates. Its distinguishing feature is that it brings transfer learning and meta-learning to zero-shot super-resolution. First, transfer learning supplies a model pre-trained on a large number of external images, which can then be fine-tuned further. That fine-tuning is driven by meta-learning, so that the model can adapt quickly to a variety of kernels. Once meta-learning is finished, the model trains in a zero-shot manner on a given test image, using the internal data repetition within that image, and a few updates are enough to find weights suited to that image's specific kernel.

Figure 1: Super-resolved results of "image050" in Urban100.

2. Motivation

ZSSR is the well-known zero-shot super-resolution method. Because it has to be trained repeatedly on the test image itself, it needs about 3,000 updates. With the MZSR proposed in this paper, transfer learning and meta-learning are done in advance, so at meta-test time a single update, or at most about 10, is enough for the weights to fit the image and its specific kernel. Finding suitable weights with so few updates means a model for zero-shot super-resolution can be obtained quickly, which is the main advantage.

1) CNN-based approach

CNN-based approaches have recently become popular because of their high performance. A low-resolution image is fed into the network, which returns a high-resolution image. Depending on the network, the low-resolution image may first be enlarged, for example with bicubic interpolation, before it is passed to the network. The training data are made from high-resolution images: each is blurred with a specific kernel, downsampled, and corrupted with noise to obtain its low-resolution counterpart. However, if only a well-known kernel such as bicubic is used for downsampling, performance drops on non-bicubic cases. This is the domain gap problem.
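For reference, the degradation pipeline described above (blur, then downsample, then add noise) is commonly written as the model below. The symbols are notation assumed here for illustration: k is the blur kernel, * is convolution, the arrow with subscript s is downsampling by scale factor s, and n is additive noise.

```latex
I_{LR} = (I_{HR} * k)\downarrow_{s} + n
```

Under this reading, a network trained only on bicubic pairs has seen a single choice of k, which is one way to picture the domain gap on non-bicubic inputs.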

Figure 2: CNN-based approach.
Figure 3: Pipeline of the CNN-based approach.

2) SISR(Single Image Super-Resolution)

To start from the basics, let us look at SISR (Single Image Super-Resolution). SISR deals with converting a single low-resolution (LR) image, given at test time, into a high-resolution (HR) image. Increasing the number of pixels, that is, moving to a higher resolution, requires generating new pixel values, and various interpolation methods exist for this (in 1D: nearest-neighbor, linear, cubic; in 2D: nearest-neighbor, bilinear, bicubic). Here, cubic means that a third-degree polynomial is used. Determining the value of an intermediate pixel by referring to the existing sample values is the most traditional and most widely used approach.
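A minimal sketch of two of the interpolation schemes named above, in 1D and in plain Python. The function names and the sample signal are hypothetical and chosen only for illustration; cubic interpolation follows the same pattern but fits a third-degree polynomial through more neighbors.

```python
def nearest_upsample(samples, factor):
    """Nearest-neighbor: repeat each sample `factor` times."""
    return [value for value in samples for _ in range(factor)]


def linear_upsample(samples, factor):
    """Linear: fill the gap between neighbors with evenly spaced blends."""
    out = []
    for left, right in zip(samples, samples[1:]):
        for step in range(factor):
            t = step / factor  # 0 at the left sample, approaching 1 at the right
            out.append((1 - t) * left + t * right)
    out.append(samples[-1])  # keep the final original sample
    return out


signal = [0.0, 10.0, 20.0]
nearest = nearest_upsample(signal, 2)
linear = linear_upsample(signal, 2)
print(nearest)  # [0.0, 0.0, 10.0, 10.0, 20.0, 20.0]
print(linear)   # [0.0, 5.0, 10.0, 15.0, 20.0]
```

Both functions only reuse the values that are already present, which is why purely interpolation-based upscaling cannot recover detail that was lost in the LR image.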

Figure 4: SISR(Single Image Super-Resolution).

3) ZSSR(Zero Shot Super-Resolution)

Next is ZSSR (Zero-Shot Super-Resolution), which MZSR uses in its meta-test stage. Unlike the SISR approaches described above, ZSSR learns from the image itself, that is, from internal information. It builds HR-LR pairs extracted from the test image, trains on them, and then uses what it has learned to predict an enlarged result, treating the original image as the LR input. As the paper points out, however, training on a single image takes a long time, and the trained model is hard to reuse on other images.

Idea

To address these limitations of ZSSR (long training time on a single image and poor reuse on other images), the paper proposes applying MAML (Model-Agnostic Meta-Learning). MAML is a method for finding a good set of initial weights: weights from which the model can adapt quickly to a variety of tasks. This also helps fine-tuning.
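A minimal sketch of the MAML idea, under a toy setting that is not from the paper: each task is a 1D quadratic loss 0.5 * (theta - center)^2 with its own optimum, so the gradients can be written by hand. The function names, task centers, and step sizes are all hypothetical.

```python
def task_grad(theta, center):
    """Gradient of the toy task loss 0.5 * (theta - center) ** 2."""
    return theta - center


def inner_update(theta, center, alpha):
    """One task-specific gradient step starting from the shared weights."""
    return theta - alpha * task_grad(theta, center)


def meta_gradient(theta, center, alpha):
    """Gradient of the post-adaptation loss with respect to the shared weights."""
    adapted = inner_update(theta, center, alpha)
    # Chain rule through the inner step: d(adapted)/d(theta) = 1 - alpha.
    return (1.0 - alpha) * task_grad(adapted, center)


def maml(centers, alpha=0.1, beta=0.05, steps=500, theta=0.0):
    """Find an initialization that does well after one inner step on every task."""
    for _ in range(steps):
        total = sum(meta_gradient(theta, center, alpha) for center in centers)
        theta -= beta * total
    return theta


centers = [-2.0, 1.0, 4.0]  # hypothetical task optima
theta_star = maml(centers)
print(round(theta_star, 6))
```

In this toy case the learned initialization settles midway between the task optima, so a single inner step moves it the same relative distance toward whichever task it is given.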

Figure 5: Overview of MAML (Model-Agnostic Meta-Learning).
Figure 6: The MAML (Model-Agnostic Meta-Learning) algorithm.

3. Method

Figure 6: MZSR concept diagram.

To overcome the limitations of CNN-based methods and of ZSSR, the paper proposes MZSR. In the overall flow, large-scale training and meta-transfer learning are carried out on external data, and the meta-test stage then uses the zero-shot super-resolution approach.

Figure 7: MZSR concept diagram.

The large-scale training stage learns representations that are common across diverse images. Drawing on features learned from natural images helps the model reach high performance. In the formula, LR images are created by bicubic downsampling to form HR-LR pairs, and the network is trained to minimize an L1 loss.
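The training objective described above can be written as follows. The notation is assumed here for illustration: f_θ is the network with weights θ, D is the external dataset of HR images paired with their bicubically downsampled LR versions, and the norm is the L1 norm.

```latex
\mathcal{L}^{D}(\theta) = \mathbb{E}_{(I_{HR},\, I_{LR}^{bic}) \sim D}
\left[ \left\| I_{HR} - f_{\theta}\!\left(I_{LR}^{bic}\right) \right\|_{1} \right]
```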

Figure 8: MZSR concept diagram.

Next comes the meta-transfer learning stage. Meta learning is also called learning to learn: the model is prepared so that it can later be trained quickly on each specific task. To find the initial point that is most sensitive to different kernel conditions, the paper uses transfer learning together with an optimization-based meta-learning method, namely MAML. For example, with three tasks, each task has its own optimal weights θ1, θ2, and θ3, and the arrows in the figure give, for each task, the direction of its loss toward the corresponding weights. The kernel distribution is defined through a covariance matrix. The first factor is a rotation matrix that rotates by the angle Θ, the λ parameters then set the amount of blur along each axis, and the last factor rotates by Θ in the opposite direction to return to the original orientation.
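The covariance construction described above can be sketched in code, assuming a zero-mean Gaussian blur kernel whose covariance is R(Θ) diag(λ1, λ2) R(Θ)^T, with Θ the rotation angle (written with a capital letter here to keep it apart from the model weights θ). The function name, the kernel size, and the sample parameter values are hypothetical, and the ranges the paper samples from are not reproduced.

```python
import math


def anisotropic_gaussian_kernel(size, angle, lam1, lam2):
    """Build a normalized `size` x `size` Gaussian kernel (odd `size` assumed).

    The covariance is R(angle) @ diag(lam1, lam2) @ R(angle)^T, where lam1 and
    lam2 are treated as variances along the two rotated axes.
    """
    c, s = math.cos(angle), math.sin(angle)
    # Entries of the symmetric 2x2 covariance matrix [[a, b], [b, d]].
    a = c * c * lam1 + s * s * lam2
    b = c * s * (lam1 - lam2)
    d = s * s * lam1 + c * c * lam2
    det = a * d - b * b
    # Inverse of a symmetric 2x2 matrix.
    inv_a, inv_b, inv_d = d / det, -b / det, a / det

    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            quad = inv_a * x * x + 2.0 * inv_b * x * y + inv_d * y * y
            row.append(math.exp(-0.5 * quad))
        kernel.append(row)

    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


kernel = anisotropic_gaussian_kernel(size=7, angle=math.pi / 4, lam1=4.0, lam2=1.0)
print(round(sum(sum(row) for row in kernel), 6))
```

Drawing a new angle and a new pair of λ values for every task is what would give the meta-learner a spread of blur conditions to adapt to.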

Figure 9: MZSR concept diagram.

The meta-learner is then trained. The model parameters θ are updated through the task-level loss, and optimization proceeds in the direction that minimizes the test error.
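In MAML-style notation, the two updates described above can be written as below. The symbols are assumptions made for this sketch: T_i is a task (one kernel condition) drawn from the task distribution p(T), the superscripts tr and te mark the task-level training loss and the test loss, and α and β are read as the task-level and meta-level learning rates.

```latex
\theta_i = \theta - \alpha \nabla_{\theta} \mathcal{L}^{tr}_{\mathcal{T}_i}(\theta)

\theta \leftarrow \theta - \beta \nabla_{\theta}
\sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}^{te}_{\mathcal{T}_i}(\theta_i)
```

The second line differentiates through the first, so θ is pushed toward a point from which one task-level step already gives a low test loss.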

Figure 10: Super-resolved results of "image050" in Urban100.

Next is the meta-test stage. As the figure above shows, it learns internal information within a single image, in the same way as the zero-shot super-resolution approach described earlier.

Figure 11: MZSR concept diagram.

These are the algorithms for the meta-transfer learning and meta-test stages described above. In the meta-transfer learning algorithm, LR and HR batches are drawn from the dataset D, and the network is first trained with an L1 loss. Then, for each task in the task distribution, meta-learning is carried out so that later training on that task converges quickly. Finally, the meta-learner is optimized. In the meta-test stage, when a single image comes in, the meta-learned weights are quickly updated to fit its kernel, and the SR image is returned.
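The meta-test flow can be sketched as a short loop. This is an illustration under toy assumptions, not the authors' implementation: the "image" is a single number, the degradation halves it, the "network" multiplies its input by a scalar weight, and a squared error stands in for the L1 loss so that the gradient stays simple. All names and values are hypothetical.

```python
def meta_test(lr_image, degrade, grad_fn, predict, theta, alpha, n_steps):
    """Adapt `theta` on the (degraded LR, LR) pair, then super-resolve the LR image."""
    lr_son = degrade(lr_image)  # further-degraded copy of the test image
    for _ in range(n_steps):    # only a handful of gradient updates
        theta = theta - alpha * grad_fn(theta, lr_son, lr_image)
    return predict(theta, lr_image), theta


# Toy stand-ins for the real components (hypothetical).
def degrade(image):
    return 0.5 * image


def predict(theta, image):
    return theta * image


def grad_fn(theta, source, target):
    # Gradient of 0.5 * (theta * source - target) ** 2 with respect to theta.
    return (theta * source - target) * source


sr_image, adapted_theta = meta_test(
    lr_image=4.0, degrade=degrade, grad_fn=grad_fn, predict=predict,
    theta=1.5, alpha=0.1, n_steps=10,
)
print(round(adapted_theta, 3), round(sr_image, 3))
```

The starting value of `theta` plays the role of the meta-learned initialization: the closer it already is to a good solution for the given degradation, the fewer updates the loop needs.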

4. Experiment & Result

Figure 12: Results with bicubic downsampling.

These are the results on datasets downsampled with bicubic. Since bicubic downsampling is the setting the other models assume, there are datasets where MZSR scores somewhat lower than they do, but it reaches comparable performance with only 1 to 10 updates.

Figure 13: Results with various kernels.

This table shows the results with various kernels. Red marks the best result and blue the second best. The unsupervised methods perform well in most cases, and MZSR with only 10 updates takes first or second place in most of them.

Figure 14: Experimental results and visualization of the scores (1).

This visualizes the scores above. Even with only 10 updates, MZSR shows excellent restoration quality.

Figure 15: Experimental results and visualization of the scores (2).

Likewise, under this kernel condition MZSR shows excellent restoration quality with only 10 updates.

Experimental setup

  • Datasets: Set5, BSD100, and Urban100, using both the images converted to low resolution and the originals.

  • Baselines: Bicubic, CARN, RCAN, and ZSSR.

  • Training settings: α = 0.01 and β = 0.0001.

  • Results are evaluated as the average PSNR (dB) and SSIM on the Y channel of the YCbCr color space. Red indicates the best result and blue the second best. The number in parentheses is the number of gradient updates used by the proposed MZSR.
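The PSNR part of the evaluation metric can be sketched as below. This is an assumption-laden illustration: it uses the BT.601 luma conversion that is common in super-resolution benchmarks and a peak value of 255, neither of which is stated in this review, and the pixel values are made up. SSIM is omitted.

```python
import math


def rgb_to_y(r, g, b):
    """BT.601 luma (Y of YCbCr) for 8-bit RGB values in [0, 255]."""
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical ground-truth and super-resolved pixels (R, G, B).
hr_rgb = [(255, 255, 255), (0, 0, 0), (120, 60, 30)]
sr_rgb = [(250, 250, 250), (5, 5, 5), (118, 63, 28)]

hr_y = [rgb_to_y(*pixel) for pixel in hr_rgb]
sr_y = [rgb_to_y(*pixel) for pixel in sr_rgb]
score = psnr(hr_y, sr_y)
print(f"PSNR on Y: {score:.2f} dB")
```

Averaging this score over all images of a dataset would give one table entry of the kind reported in the figures.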

Result

Figure 16: Performance comparison between MZSR and other baselines.

As mentioned earlier, MZSR reaches high performance with a single gradient update. In the figure, the result at the initial point is the worst of all. After just one update, however, it is better than the images restored by the other pre-trained networks, which shows how quickly MZSR adapts.

Figure 17: Performance comparison between MZSR and bicubic interpolation.

In addition, because MZSR learns from the image itself, it also performs well on images with multi-scale recurrent patterns, as in the figure on the right.

5. Conclusion

This paper presents a fast, flexible, and lightweight self-supervised super-resolution method that exploits both external and internal samples. Specifically, it uses optimization-based meta-learning together with transfer learning to find initial weights that are sensitive to different blur-kernel conditions. As a result, the method can adapt quickly to a specific image condition within a few gradient updates. Extensive experiments show that MZSR outperforms other methods, including ZSSR, which needs thousands of gradient-descent iterations. There still appears to be considerable room for improvement, for example in the network architecture, the training strategy, and multi-scale models. In short, MZSR is a fast and flexible method that uses both internal and external samples to perform super-resolution with only a few updates.

Take home message (today's lesson)

Let us become people who can think a little more creatively.

The combination of transfer learning and meta learning is influential enough to carry over to other fields.

Author / Reviewer information

Author

백정엽 (Jeongyeop Baek)

  • M.S. student, Civil & Engineering Department, KAIST (Advisor: Seongju Chang)

  • Interested in occupant-centric HVAC control based on individual thermal comfort

  • jungyubaik@kaist.ac.kr

  • https://baekkkkk96.tistory.com/

Reviewer

  1. Korean name (English name): Affiliation / Contact information

  2. Korean name (English name): Affiliation / Contact information

  3. ...

Reference & Additional materials

  1. Jae Woong Soh, Sunwoo Cho, Namik Cho. Meta-Transfer Learning for Zero-shot Super Resolution. In CVPR, 2020.

  2. Official GitHub repository : https://www.github.com/JWSoh/MZSR.

  3. Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 126–135, 2017.

  4. Namhyuk Ahn, Byungkon Kang, and Kyung-Ah Sohn. Fast, accurate, and lightweight super-resolution with cascading residual network. In Proceedings of the European Conference on Computer Vision (ECCV), pages 252–268, 2018.

  5. Antreas Antoniou, Harrison Edwards, and Amos Storkey. How to train your maml. In ICLR, 2019.

Last updated