RCAN [Kor]

Yulun Zhang et al. / Image Super-Resolution Using Very Deep Residual Channel Attention Networks / ECCV 2018

English version of this article is available.

1. Problem definition

๋‹จ์ผ ์ด๋ฏธ์ง€ ์ดˆํ•ด์ƒํ™” (Single Image Super-Resolution, SISR) ๊ธฐ๋ฒ•์€ ์ด๋ฏธ์ง€ ๋‚ด์˜ ๋ธ”๋Ÿฌ์™€ ๋‹ค์–‘ํ•œ ๋…ธ์ด์ฆˆ๋ฅผ ์ œ๊ฑฐํ•˜๋ฉด์„œ, ๋™์‹œ์— ์ €ํ•ด์ƒ๋„ (Low Resolution, LR) ์ด๋ฏธ์ง€๋ฅผ ๊ณ ํ•ด์ƒ๋„ (High Resolution, HR)๋กœ ๋ณต์›ํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. x์™€ y๋ฅผ ๊ฐ๊ฐ LR๊ณผ HR ์ด๋ฏธ์ง€๋ผ๊ณ  ํ•  ๋•Œ, SR์„ ์ˆ˜์‹์œผ๋กœ ํ‘œํ˜„ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

y=(xโŠ—k)โ†“s+n\textbf{y}=(\textbf{x} \otimes \textbf{k} )\downarrow_s + \textbf{n}

์—ฌ๊ธฐ์„œ y์™€ x๋Š” ๊ฐ๊ฐ ๊ณ ํ•ด์ƒ๋„์™€ ์ €ํ•ด์ƒ๋„ ์ด๋ฏธ์ง€๋ฅผ ์˜๋ฏธํ•˜๋ฉฐ, k์™€ n์€ ๊ฐ๊ฐ ๋ธ”๋Ÿฌ ํ–‰๋ ฌ๊ณผ ๋…ธ์ด์ฆˆ ํ–‰๋ ฌ์„ ๋‚˜ํƒ€๋‚ธ๋‹ค. ์ตœ๊ทผ์—๋Š” CNN์ด SR์— ํšจ๊ณผ์ ์œผ๋กœ ์ž‘์šฉํ•œ๋‹ค๋Š” ์‚ฌ์‹ค์— ๋”ฐ๋ผ, CNN-based SR์ด ํ™œ๋ฐœํžˆ ์—ฐ๊ตฌ๋˜๊ณ  ์žˆ๋‹ค. ํ•˜์ง€๋งŒ CNN-based SR์€ ๋‹ค์Œ ๋‘๊ฐ€์ง€ ํ•œ๊ณ„์ ์„ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค.

  • ์ธต์ด ๊นŠ์–ด์งˆ์ˆ˜๋ก Gradient Vanishing [Note i]์ด ๋ฐœ์ƒํ•˜์—ฌ ํ•™์Šต์ด ์–ด๋ ค์›Œ์ง

  • LR ์ด๋ฏธ์ง€์— ํฌํ•จ๋œ ์ €์ฃผํŒŒ(low-frequency) ์ •๋ณด๊ฐ€ ๋ชจ๋“  ์ฑ„๋„์—์„œ ๋™๋“ฑํ•˜๊ฒŒ ๋‹ค๋ฃจ์–ด์ง์œผ๋กœ์จ ๊ฐ feature map์˜ ๋Œ€ํ‘œ์„ฑ์ด ์•ฝํ™”๋จ

์•ž์„œ ์–ธ๊ธ‰ํ•œ SR์˜ ๋ชฉํ‘œ์™€ ์œ„ 2๊ฐ€์ง€ ํ•œ๊ณ„์ ์„ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด, ํ•ด๋‹น ๋…ผ๋ฌธ์—์„œ๋Š” Deep-RCAN (Residual Channel Attention Networks)์„ ์ œ์•ˆํ•œ๋‹ค.

[Note i] Gradient vanishing: as inputs pass through an activation function they are squeezed into a small output range, so after many stacked activation layers the initial input values have almost no influence on the output. As a result, the gradients of the early layers' parameters with respect to the output become tiny, and those layers effectively stop learning.
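A toy numerical illustration of this note (not from the paper): the sigmoid's derivative is at most 0.25, so the chain-rule product across many sigmoid layers shrinks geometrically toward zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The chain rule multiplies one sigmoid'(z) = s * (1 - s) <= 0.25 factor
# per layer, so the gradient reaching early layers decays geometrically.
z, grad = 0.5, 1.0
for _ in range(20):                    # 20 stacked sigmoid activations
    s = sigmoid(z)
    grad *= s * (1.0 - s)              # one chain-rule factor per layer
    z = s
print(grad)                            # vanishingly small after 20 layers
```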

2. Motivation

๋ณธ ๋…ผ๋ฌธ์˜ baseline์ธ deep-CNN๊ณผ attention ๊ธฐ๋ฒ•๊ณผ ๊ด€๋ จ๋œ paper๋“ค์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

1. CNN ๊ธฐ๋ฐ˜ SR

  • [SRCNN & FSRCNN]: SRCNN was the first method to apply a CNN to SR; with only a three-layer CNN it greatly outperformed earlier non-CNN SR methods. FSRCNN simplifies the SRCNN architecture to speed up both inference and training.

  • [VDSR & DRCN]: By stacking layers much deeper than SRCNN (20 layers), these methods improved performance significantly.

  • [SRResNet & SRGAN]: SRResNet first introduced ResNet to SR. SRGAN then added a GAN on top of SRResNet, reducing blur and producing photo-realistic SR results, although it sometimes generates unintended artifacts.

  • [EDSR & MDSR]: These remove unnecessary modules from the original ResNet, greatly increasing speed. However, they cannot realize the very deep networks that are crucial in image processing, and they treat low-frequency information identically in every channel, which wastes computation and prevents diverse features from being represented.

2. Attention mechanisms

Attention์€ ์ธํ’‹ ๋ฐ์ดํ„ฐ์—์„œ ๊ด€์‹ฌ ์žˆ๋Š” ํŠน์ • ๋ถ€๋ถ„์— ์ฒ˜๋ฆฌ ๋ฆฌ์†Œ์Šค๋ฅผ ํŽธํ–ฅ์‹œํ‚ค๋Š” ๊ธฐ๋ฒ•์œผ๋กœ์„œ, ํ•ด๋‹น ๋ถ€๋ถ„์— ๋Œ€ํ•œ ์ฒ˜๋ฆฌ ์„ฑ๋Šฅ์„ ์ฆ๊ฐ€์‹œํ‚จ๋‹ค. ํ˜„์žฌ๊นŒ์ง€ attention์€ ๊ฐ์ฒด์ธ์‹์ด๋‚˜ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋“ฑ high-level vision task์— ์ผ๋ฐ˜์ ์œผ๋กœ ์‚ฌ์šฉ๋˜์—ˆ๊ณ , ์ด๋ฏธ์ง€ SR ๋“ฑ์˜ low-level vision task์—์„œ๋Š” ๊ฑฐ์˜ ๋‹ค๋ฃจ์–ด์ง€์ง€ ์•Š์•˜๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๊ณ ํ•ด์ƒ๋„(High-Resolution, HR) ์ด๋ฏธ์ง€๋ฅผ ๊ตฌ์„ฑํ•˜๋Š” ๊ณ ์ฃผํŒŒ(High-Frequency)๋ฅผ ๊ฐ•ํ™”ํ•˜๊ธฐ ์œ„ํ•ด, LR ์ด๋ฏธ์ง€์—์„œ ๊ณ ์ฃผํŒŒ ์˜์—ญ์— attention์„ ์ ์šฉํ•œ๋‹ค.

2.2. Idea

ํ•ด๋‹น ๋…ผ๋ฌธ์˜ idea์™€ ์ด์— ๋”ฐ๋ฅธ contribution์€ ์•„๋ž˜ ์„ธ๊ฐ€์ง€๋กœ ์š”์•ฝํ•  ์ˆ˜ ์žˆ๋‹ค.

1. Residual Channel Attention Network (RCAN)

Residual Channel Attention Network (RCAN) ์„ ํ†ตํ•ด ๊ธฐ์กด์˜ CNN ๊ธฐ๋ฐ˜ SR๋ณด๋‹ค ๋”์šฑ ์ธต์„ ๊นŠ๊ฒŒ ์Œ“์Œ์œผ๋กœ์จ, ๋” ์ •ํ™•ํ•œ SR ์ด๋ฏธ์ง€๋ฅผ ํš๋“ํ•œ๋‹ค.

2. Residual in Residual (RIR)

Residual in Residual (RIR)์„ ํ†ตํ•ด i) ํ•™์Šต๊ฐ€๋Šฅํ•œ(trainable) ๋”์šฑ ๊นŠ์€ ์ธต์„ ์Œ“์œผ๋ฉฐ, ii) RIR ๋ธ”๋ก ๋‚ด๋ถ€์˜ long and short skip connection์œผ๋กœ ์ €ํ•ด์ƒ๋„ ์ด๋ฏธ์ง€์˜ low-frequency ์ •๋ณด๋ฅผ ์šฐํšŒ์‹œํ‚ด์œผ๋กœ์จ ๋” ํšจ์œจ์ ์ธ ์‹ ๊ฒฝ๋ง์„ ์„ค๊ณ„ํ•  ์ˆ˜ ์žˆ๋‹ค.

3. Channel Attention (CA)

Channel Attention (CA)์„ ํ†ตํ•ด Feature ์ฑ„๋„ ๊ฐ„ ์ƒํ˜ธ์ข…์†์„ฑ์„ ๊ณ ๋ คํ•จ์œผ๋กœ์จ, ์ ์‘์‹ feature rescaling์„ ๊ฐ€๋Šฅ์ผ€ ํ•œ๋‹ค.

3. Residual Channel Attention Network (RCAN)

3.1. Network Architecture

RCAN์˜ ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ๋Š” ํฌ๊ฒŒ 4 ๋ถ€๋ถ„์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค: i) Shallow feature extraction, ii) RIR deep feature extraction, iii) Upscale module, iv) Reconstruction part. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” i), iii), iv)์— ๋Œ€ํ•ด์„œ๋Š” ๊ธฐ์กด ๊ธฐ๋ฒ•์ธ EDSR๊ณผ ์œ ์‚ฌํ•˜๊ฒŒ ๊ฐ๊ฐ one convolutional layer, deconvolutional layer, L1 loss๊ฐ€ ์‚ฌ์šฉ๋˜์—ˆ๋‹ค. ii) RIR deep feature extraction์„ ํฌํ•จํ•˜์—ฌ, CA์™€ RCAB์— ๋Œ€ํ•œ contribution์€ ๋‹ค์Œ ์ ˆ์—์„œ ์†Œ๊ฐœํ•œ๋‹ค.

L(ฮ˜)=1Nโˆ‘Ni=1โˆฅHRCAN(ILRi)โˆ’IHRiโˆฅ1L(\Theta )=\frac{1}{N}\sum_{N}^{i=1}\left \| H_{RCAN}(I_{LR}^i)-I_{HR}^i \right \|_1

3.2. Residual in Residual (RIR)

RIR์—์„œ๋Š” residual group (RG)๊ณผ long skip connection (LSC)์œผ๋กœ ๊ตฌ์„ฑ๋œ G๊ฐœ์˜ ๋ธ”๋ก์œผ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค. ํŠนํžˆ, 1๊ฐœ์˜ RG๋Š” residual channel attention block(RCAB)์™€ short skip connection (SSC)์„ ๋‹จ์œ„๋กœ ํ•˜๋Š” B๊ฐœ์˜ ์—ฐ์‚ฐ์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ๊ตฌ์กฐ๋กœ 400๊ฐœ ์ด์ƒ์˜ CNN ์ธต์„ ํ˜•์„ฑํ•˜๋Š” ๊ฒƒ์ด ๊ฐ€๋Šฅํ•˜๋‹ค. RG๋งŒ์„ ๊นŠ๊ฒŒ ์Œ“๋Š” ๊ฒƒ์€ ์„ฑ๋Šฅ ์ธก๋ฉด์—์„œ ํ•œ๊ณ„๊ฐ€ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— LSC๋ฅผ RIR ๋งˆ์ง€๋ง‰ ๋ถ€์— ๋„์ž…ํ•˜์—ฌ ์‹ ๊ฒฝ๋ง์„ ์•ˆ์ •ํ™”์‹œํ‚จ๋‹ค. ๋˜ํ•œ LSC์™€ SSC๋ฅผ ํ•จ๊ป˜ ๋„์ž…ํ•จ์œผ๋กœ์จ LR์ด๋ฏธ์ง€์˜ ๋ถˆํ•„์š”ํ•œ ์ €์ฃผํŒŒ ์ •๋ณด๋ฅผ ๋”์šฑ ํšจ์œจ์ ์œผ๋กœ ์šฐํšŒ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค.

3.3. Residual Channel Attention Block (RCAB)

๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” Channel Attention (CA)๋ฅผ Residual Block (RB)์— ๋ณ‘ํ•ฉ์‹œํ‚ด์œผ๋กœ์จ, Residual Channel Attention Block (RCAB)๋ฅผ ์ œ์•ˆํ•˜์˜€๋‹ค. ํŠนํžˆ, CNN์ด local receptive field๋งŒ ๊ณ ๋ คํ•จ์œผ๋กœ์จ local region ์ด์™ธ์˜ ์ „์ฒด์ ์ธ ์ •๋ณด๋ฅผ ์ด์šฉํ•˜์ง€ ๋ชปํ•œ๋‹ค๋Š” ์ ์„ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด CA์—์„œ๋Š” global average pooling์œผ๋กœ ๊ณต๊ฐ„์  ์ •๋ณด๋ฅผ ํ‘œํ˜„ํ•˜์˜€๋‹ค.

ํ•œํŽธ, ์ฑ„๋„๊ฐ„ ์—ฐ๊ด€์„ฑ์„ ๋‚˜ํƒ€๋‚ด๊ธฐ ์œ„ํ•ด, gating ๋งค์ปค๋‹ˆ์ฆ˜์„ [Note ii] ์ถ”๊ฐ€๋กœ ๋„์ž…ํ•˜์˜€๋‹ค. gating ๋งค์ปค๋‹ˆ์ฆ˜์€ ์ผ๋ฐ˜์ ์œผ๋กœ ์ฑ„๋„๊ฐ„ ๋น„์„ ํ˜•์„ฑ์„ ๋‚˜ํƒ€๋‚ด์•ผ ํ•˜๋ฉฐ, one-hot ํ™œ์„ฑํ™”์— ๋น„ํ•ด ๋‹ค์ˆ˜ ์ฑ„๋„์˜ feature๊ฐ€ ๊ฐ•์กฐ๋˜๋ฉด์„œ ์ƒํ˜ธ ๋ฐฐํƒ€์ ์ธ ๊ด€๊ณ„๋ฅผ ํ•™์Šตํ•ด์•ผ ํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ๊ธฐ์ค€์„ ์ถฉ์กฑํ•˜๊ธฐ ์œ„ํ•ด, sigmoid gating๊ณผ ReLU๊ฐ€ ์„ ์ •๋˜์—ˆ๋‹ค.

[Note ii] Gating mechanisms: gating mechanisms were introduced to address the vanishing-gradient problem and are applied effectively in RNNs. They have the effect of smoothing updates. [Gu, Albert, et al. "Improving the gating mechanism of recurrent neural networks." International Conference on Machine Learning. PMLR, 2020.]

4. Experiment & Result

4.1. Experimental setup

1. Datasets and degradation models

ํ•™์Šต์šฉ ์ด๋ฏธ์ง€๋Š” DIV2K ๋ฐ์ดํ„ฐ์…‹์˜ ์ผ๋ถ€ 800๊ฐœ ์ด๋ฏธ์ง€๋ฅผ ์ด์šฉํ•˜์˜€์œผ๋ฉฐ, ํ…Œ์ŠคํŠธ ์ด๋ฏธ์ง€๋กœ๋Š” Set5, B100, Urban 100๊ณผ Manga109๋ฅผ ์‚ฌ์šฉํ•˜์˜€๋‹ค. Degradation ๋ชจ๋ธ๋กœ๋Š” bicubic (BI)์™€ blur-downscale (BD)๊ฐ€ ์‚ฌ์šฉ๋˜์—ˆ๋‹ค.

2. Evaluation metrics

PSNR๊ณผ SSIM์œผ๋กœ ์ฒ˜๋ฆฌ๋œ ์ด๋ฏธ์ง€์˜ YCbCr color space [Note iii]์˜ Y ์ฑ„๋„์„ ํ‰๊ฐ€ํ•˜์˜€์Œ. ๋˜ํ•œ recognition error์—์„œ 1~5์œ„์˜ ํƒ€ SR ๊ธฐ๋ฒ•๊ณผ ๋น„๊ตํ•˜์—ฌ, ์„ฑ๋Šฅ ์šฐ์œ„๋ฅผ ํ™•์ธํ•˜์˜€์Œ.

[Note iii] YCbCr: YCbCr, also written YCBCR, Y'CBCR, or Y'CbCr, is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y' is the luma component, and CB and CR are the blue-difference and red-difference chroma components. Y' (with prime) is distinguished from Y, the luminance, indicating that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries. [Wikipedia]

3. Training settings

์•ž์„œ ์–ธ๊ธ‰ํ•œ DIV2K ๋ฐ์ดํ„ฐ์…‹์— ์žˆ๋Š” 800๊ฐœ์˜ ์ด๋ฏธ์ง€์— ํšŒ์ „, ์ƒํ•˜๋ฐ˜์ „ ๋“ฑ data augmentation์„ ์ ์šฉํ•˜๊ณ , ๊ฐ training batch์—์„œ๋Š” 48x48 ์‚ฌ์ด์ฆˆ์˜ 16๊ฐœ์˜ LR ํŒจ์น˜๊ฐ€ ์ธํ’‹์œผ๋กœ ์ถ”์ถœ๋˜์—ˆ๋‹ค. ๋˜ํ•œ ์ตœ์ ํ™” ๊ธฐ๋ฒ•์œผ๋กœ๋Š” ADAM์ด ์‚ฌ์šฉ๋˜์—ˆ๋‹ค.

4.2. Result

1. Effects of RIR and CA

๊ธฐ์กด๊ธฐ๋ฒ•์ด 37.45dB์˜ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ค€๋ฐ ๋ฐ˜ํ•ด, long skip connection (LSC)๊ณผ short skip connection (SSC)๊ฐ€ ํฌํ•จ๋œ RIR๊ณผ CA๋ฅผ ์ด์šฉํ•จ์œผ๋กœ์จ, 37.90dB๊นŒ์ง€ ์„ฑ๋Šฅ์„ ๋†’์˜€๋‹ค. (LSC)์œผ๋กœ ๊ตฌ์„ฑ๋œ G๊ฐœ์˜ ๋ธ”๋ก์œผ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค.

2. Model Size Analyses

RCAN์€ ํƒ€ ๊ธฐ๋ฒ•๋“ค (DRCN, FSRCNN, PSyCo, ENet-E)๊ณผ ๋น„๊ตํ•˜์—ฌ ๊ฐ€์žฅ ๊นŠ์€ ์‹ ๊ฒฝ๋ง์„ ์ด๋ฃจ๋ฉด์„œ๋„, ์ „์ฒด ํŒŒ๋ผ๋ฏธํ„ฐ ์ˆ˜๋Š” ๊ฐ€์žฅ ์ ์ง€๋งŒ, ๊ฐ€์žฅ ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ์—ˆ๋‹ค.

5. Conclusion

๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋†’์€ ์ •ํ™•๋„์˜ SR ์ด๋ฏธ์ง€๋ฅผ ํš๋“ํ•˜๊ธฐ ์œ„ํ•ด RCAN์ด ์ ์šฉ๋˜์—ˆ๋‹ค. ํŠนํžˆ, RIR ๊ตฌ์กฐ์™€ LSC ๋ฐ SSC๋ฅผ ํ•จ๊ป˜ ํ™œ์šฉํ•จ์œผ๋กœ์จ, ๊นŠ์€ ์ธต์„ ํ˜•์„ฑํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๋˜ํ•œ RIR์€ LR ์ด๋ฏธ์ง€์˜ ๋ถˆํ•„์š”ํ•œ ์ •๋ณด์ธ ์ €์ฃผํŒŒ ์ •๋ณด๋ฅผ ์šฐํšŒ์‹œํ‚ด์œผ๋กœ์จ, ์‹ ๊ฒฝ๋ง์ด ๊ณ ์ฃผํŒŒ ์ •๋ณด๋ฅผ ํ•™์Šตํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜์˜€๋‹ค. ๋” ๋‚˜์•„๊ฐ€, CA๋ฅผ ๋„์ž…ํ•˜์—ฌ ์ฑ„๋„๊ฐ„์˜ ์ƒํ˜ธ์ข…์†์„ฑ์„ ๊ณ ๋ คํ•จ์œผ๋กœ์จ channel-wise feature๋ฅผ ์ ์‘์‹์œผ๋กœ rescalingํ•˜์˜€๋‹ค. ์ œ์•ˆํ•œ ๊ธฐ๋ฒ•์€ BI, DB degradation ๋ชจ๋ธ์„ ์ด์šฉํ•˜์—ฌ SR ์„ฑ๋Šฅ์„ ๊ฒ€์ฆํ•˜์˜€์œผ๋ฉฐ, ์ถ”๊ฐ€๋กœ ๊ฐ์ฒด ์ธ์‹์—์„œ๋„ ์šฐ์ˆ˜ํ•œ ์„ฑ๋Šฅ์„ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค.

Take home message

์ด๋ฏธ์ง€ ๋‚ด์—์„œ ๊ด€์‹ฌ ์žˆ๋Š” ์˜์—ญ์˜ ์ •๋ณด๋ฅผ ๋ถ„ํ• ํ•ด๋‚ด๊ณ , ํ•ด๋‹น ์ •๋ณด์— attention์„ ์ ์šฉํ•จ์œผ๋กœ์จ ํ•™์Šต๊ณผ์ •์—์„œ ๋น„์ค‘์„ ๋” ๋†’์ผ ์ˆ˜ ์žˆ๋‹ค.

์ „์ฒด ํŒŒ๋งˆ๋ฆฌํ„ฐ ๊ฐœ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๋Š” ๊ฒƒ๋ณด๋‹ค ์‹ ๊ฒฝ๋ง์„ ๋” ๊นŠ๊ฒŒ ์Œ“๋Š” ๊ฒƒ์ด ์„ฑ๋Šฅ์„ ๋†’์ด๋Š”๋ฐ ๋” ํšจ๊ณผ์ ์ด๋‹ค.

Author / Reviewer information

1. Author

ํ•œ์Šนํ˜ธ (Seungho Han)

  • KAIST ME

  • Research Topics: Formation Control, Vehicle Autonomous Driving, Image Super Resolution

  • https://www.linkedin.com/in/seung-ho-han-8a54a4205/


Reference & Additional materials

  1. [Original Paper] Zhang, Yulun, et al. "Image super-resolution using very deep residual channel attention networks." Proceedings of the European conference on computer vision (ECCV). 2018.

  2. [Github] https://github.com/yulunzhang/RCAN

  3. [Github] https://github.com/dongheehand/RCAN-tf

  4. [Github] https://github.com/yjn870/RCAN-pytorch

  5. [Attention] https://wikidocs.net/22893

  6. [Dataset] Xu, Qianxiong, and Yu Zheng. "A Survey of Image Super Resolution Based on CNN." Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. Springer, Cham, 2019. 184-199.

  7. [BSRGAN] Zhang, Kai, et al. "Designing a practical degradation model for deep blind image super-resolution." arXiv preprint arXiv:2103.14006 (2021).

  8. [Google's SR3] https://80.lv/articles/google-s-new-approach-to-image-super-resolution/

  9. [SRCNN] Dong, Chao, et al. "Image super-resolution using deep convolutional networks." IEEE Transactions on Pattern Analysis and Machine Intelligence 38.2 (2015): 295-307.

  10. [FSRCNN] Dong, Chao, Chen Change Loy, and Xiaoou Tang. "Accelerating the super-resolution convolutional neural network." European Conference on Computer Vision (ECCV). Springer, Cham, 2016.

  11. [VDSR] Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Accurate image super-resolution using very deep convolutional networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.

  12. [SRResNet] Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.

  13. [SRGAN] Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.
