RCAN [Kor]
Yulun Zhang et al. / Image Super-Resolution Using Very Deep Residual Channel Attention Networks / ECCV 2018
English version of this article is available.
1. Problem definition
Single Image Super-Resolution (SISR) aims to restore a low-resolution (LR) image to a high-resolution (HR) image while removing blur and various kinds of noise. Denoting the LR and HR images by x and y respectively, SR can be expressed as follows.
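A commonly used form of this degradation model, consistent with the symbols defined here, is sketched below. The convolution operator $\otimes$ and the downscaling operator $\downarrow_s$ with scale factor $s$ are assumptions; the text itself only names x, y, k, and n.

```latex
x = (y \otimes k)\downarrow_{s} + n
```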
Here, y and x denote the high-resolution and low-resolution images, and k and n denote the blur matrix and the noise matrix, respectively. Because CNNs have proven effective for SR, CNN-based SR has recently been studied actively. However, CNN-based SR has the following two limitations.
As the network gets deeper, gradient vanishing [Note i] occurs and training becomes difficult.
The low-frequency information contained in the LR image is treated equally across all channels, which weakens the representational power of each feature map.
To achieve the SR goal stated above and overcome these two limitations, the paper proposes the deep RCAN (Residual Channel Attention Networks).
[Note i] Gradient Vanishing: As an input passes through an activation function, it is squeezed into a small output range, so the more activation layers the initial input passes through, the less influence it has on the output. As a result, the gradient of the output with respect to the parameters of the early layers becomes very small, and those layers can no longer be trained effectively.
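The effect described in Note i can be illustrated with a small numerical sketch. It assumes a chain of plain sigmoid activations with no weights in between (a simplification chosen only for illustration, not the setup of any network in the paper), and the function name `input_gradient_through_sigmoids` is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient_through_sigmoids(x, depth):
    """Return d(output)/d(input) after applying sigmoid `depth` times to x."""
    grad = 1.0
    a = x
    for _ in range(depth):
        a = sigmoid(a)
        # Chain rule: the sigmoid derivative is a * (1 - a), which never exceeds 0.25.
        grad *= a * (1.0 - a)
    return grad

# The gradient reaching the input shrinks rapidly as depth grows.
for depth in (1, 5, 20):
    print(depth, input_gradient_through_sigmoids(0.5, depth))
```

Because every factor is at most 0.25, the product decays at least geometrically with depth, which is why early layers receive almost no training signal in this toy setting.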
2. Motivation
2.1. Related work
Papers related to deep CNNs, the baseline of this paper, and to attention are as follows.
1. CNN-based SR
[SRCNN & FSRCNN]: SRCNN was the first method to apply a CNN to SR; with a three-layer CNN it greatly improved performance over earlier non-CNN SR methods. FSRCNN simplified the SRCNN network structure to speed up inference and training.
[VDSR & DRCN]: These stack more layers than SRCNN (20 layers), greatly improving performance.
[SRResNet & SRGAN]: SRResNet was the first to introduce ResNet into SR. SRGAN added a GAN to SRResNet, which alleviates blurring and yields photo-realistic SR. However, it sometimes generates unintended artifacts.
[EDSR & MDSR]: These remove unnecessary modules from the conventional ResNet, greatly increasing speed. However, they do not reach the very deep architectures that matter in image processing, and they treat low-frequency information identically in every channel, which adds unnecessary computation and limits the diversity of the features they can represent.
2. Attention
Attention biases processing resources toward the parts of the input data that are of interest, which improves processing performance on those parts. So far, attention has mostly been used in high-level vision tasks such as object recognition and image classification, and has rarely been explored in low-level vision tasks such as image SR. In this paper, attention is applied to the high-frequency regions of the LR image in order to strengthen the high-frequency components that make up a high-resolution (HR) image.
2.2. Idea
The idea of the paper and the resulting contributions can be summarized in the following three points.
1. Residual Channel Attention Network (RCAN)
With the Residual Channel Attention Network (RCAN), layers are stacked deeper than in previous CNN-based SR methods, yielding more accurate SR images.
2. Residual in Residual (RIR)
With Residual in Residual (RIR), i) deeper layers that remain trainable can be stacked, and ii) the long and short skip connections inside the RIR block let the low-frequency information of the LR image be bypassed, so a more efficient network can be designed.
3. Channel Attention (CA)
With Channel Attention (CA), interdependencies among feature channels are taken into account, which enables adaptive feature rescaling.
3. Residual Channel Attention Network (RCAN)
3.1. Network Architecture
The RCAN architecture consists of four main parts: i) shallow feature extraction, ii) RIR deep feature extraction, iii) an upscale module, and iv) a reconstruction part. For i), iii), and iv), the paper follows earlier methods such as EDSR, using one convolutional layer for shallow feature extraction, an upscale module such as a deconvolutional layer, and a convolutional reconstruction layer, with the network trained using the L1 loss. The contributions concerning ii) RIR deep feature extraction, along with CA and RCAB, are introduced in the following sections.
3.2. Residual in Residual (RIR)
RIR consists of G residual groups (RG) and a long skip connection (LSC). Each RG in turn consists of B residual channel attention blocks (RCAB) and a short skip connection (SSC). This structure makes it possible to build networks with more than 400 CNN layers. Because simply stacking RGs deeply runs into a performance limit, an LSC is introduced at the end of RIR to stabilize the network. In addition, using the LSC and SSC together lets the unnecessary low-frequency information of the LR image be bypassed more efficiently.
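The nesting of skip connections described above can be sketched in a few lines. This is a structural illustration only: features are plain lists of floats, and `block_fn` and `conv_fn` are hypothetical placeholders standing in for an RCAB and a convolution, not the paper's actual layers.

```python
def add(a, b):
    """Element-wise sum of two equally long feature lists."""
    return [x + y for x, y in zip(a, b)]

def make_rir(num_groups, blocks_per_group, block_fn, conv_fn):
    """Build an RIR-style function from placeholder `block_fn` and `conv_fn`."""
    def residual_group(feat):
        out = feat
        for _ in range(blocks_per_group):
            out = block_fn(out)  # each block is assumed to carry its own residual path
        return add(feat, conv_fn(out))  # short skip connection (SSC) around the group

    def rir(shallow_feat):
        out = shallow_feat
        for _ in range(num_groups):
            out = residual_group(out)
        return add(shallow_feat, conv_fn(out))  # long skip connection (LSC)

    return rir

def toy_block(feat):
    # Placeholder for an RCAB: identity plus a small residual.
    return add(feat, [0.1 * v for v in feat])

def toy_conv(feat):
    # Placeholder for a convolution: a fixed scaling.
    return [0.5 * v for v in feat]

rir = make_rir(num_groups=2, blocks_per_group=3, block_fn=toy_block, conv_fn=toy_conv)
print(rir([1.0, -2.0]))
```

If the placeholder convolution outputs zeros, the whole structure reduces to the identity, which shows how the skip connections let the input information pass straight through while the stacked blocks only need to model the residual.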
3.3. Residual Channel Attention Block (RCAB)
The paper proposes the Residual Channel Attention Block (RCAB) by integrating Channel Attention (CA) into the Residual Block (RB). In particular, because a CNN only considers its local receptive field and therefore cannot exploit contextual information outside the local region, CA uses global average pooling to summarize the spatial information of each channel.
To capture inter-channel dependencies, a gating mechanism [Note ii] is additionally introduced. The gating mechanism should be able to learn nonlinear interactions between channels, and it should learn a non-mutually-exclusive relationship, since multiple channel-wise features can be emphasized at once, as opposed to a one-hot activation. To meet these criteria, sigmoid gating together with ReLU is chosen.
[Note ii] Gating Mechanisms: Gating mechanisms were introduced to address the vanishing gradient problem and are applied effectively in RNNs. They have the effect of smoothing updates. [Gu, Albert, et al. "Improving the gating mechanism of recurrent neural networks." International Conference on Machine Learning. PMLR, 2020.]
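The channel attention steps described in this section (global average pooling, a ReLU stage, sigmoid gating, and channel-wise rescaling) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the weight names `w_down` and `w_up` are hypothetical, biases are omitted, and the two projections are written as small matrices rather than real convolution layers.

```python
import math

def channel_attention(features, w_down, w_up):
    """Rescale each channel of `features` (C x H x W nested lists) by a gate in (0, 1).

    w_down has shape (C // r) x C and w_up has shape C x (C // r); biases are omitted.
    """
    # Global average pooling: one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Channel reduction followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w_down]
    # Channel expansion followed by sigmoid gating.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_up]
    # Multiply every value of channel c by gates[c].
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, features)]
    return scaled, gates

# Toy input: C = 4 channels of size 2 x 2, reduction ratio r = 2 (arbitrary values).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[-1.0, 1.0], [1.0, -1.0]],
]
w_down = [[0.5, -0.2, 0.1, 0.0], [0.0, 0.3, -0.4, 0.2]]
w_up = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.3, 0.3]]
scaled, gates = channel_attention(features, w_down, w_up)
print(gates)
```

Because the sigmoid is applied to each channel independently, several channels can receive a high gate at the same time, which is the non-one-hot behaviour the gating criteria ask for.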
4. Experiment & Result
4.1. Experimental setup
1. Datasets and degradation models
The training set consists of 800 images from the DIV2K dataset, and Set5, B100, Urban100, and Manga109 are used as test sets. Bicubic (BI) and blur-downscale (BD) degradation models are used.
2. Evaluation metrics
PSNR and SSIM are computed on the Y channel of the output image in YCbCr color space [Note iii]. In addition, top-1 and top-5 recognition errors are compared against other SR methods, confirming the advantage of the proposed method.
[Note iii] YCbCr: YCbCr, also written Y'CbCr, Y'CBCR, or YCBCR, is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y' is the luma component, and CB and CR are the blue-difference and red-difference chroma components. Y' (with prime) is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries. [Wikipedia]
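As a rough illustration of Y-channel evaluation, the sketch below converts 8-bit RGB pixels to luma and computes PSNR on that channel only. It assumes the ITU-R BT.601 conversion with the 16 to 235 range (the same constants MATLAB's `rgb2ycbcr` uses); whether the paper's evaluation code uses exactly this convention, or crops image borders first, is an assumption here. The function names are hypothetical.

```python
import math

def rgb_to_y(pixel):
    """Luma (Y) of an 8-bit RGB pixel under the BT.601 convention (range 16 to 235)."""
    r, g, b = pixel
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0

def psnr_y(sr_image, hr_image, peak=255.0):
    """PSNR in dB between the Y channels of two same-sized images (rows of RGB tuples)."""
    squared_errors = [
        (rgb_to_y(p) - rgb_to_y(q)) ** 2
        for sr_row, hr_row in zip(sr_image, hr_image)
        for p, q in zip(sr_row, hr_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

# Toy 1 x 2 images: the SR output is slightly brighter than the ground truth.
hr = [[(100, 100, 100), (50, 60, 70)]]
sr = [[(110, 110, 110), (52, 61, 69)]]
print(psnr_y(sr, hr))
```

Evaluating on Y only means that errors in the chroma channels do not affect the score, which is the usual reason SR results are reported this way.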
3. Training settings
Data augmentation such as rotation and flipping is applied to the 800 DIV2K images mentioned above, and in each training batch 16 LR patches of size 48x48 are extracted as inputs. ADAM is used as the optimizer.
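The batch construction described above can be sketched as follows, assuming images are stored as nested lists of pixel values. The helper names are hypothetical, the flip direction is an arbitrary choice for illustration, and a real pipeline would also crop the matching HR patch and apply the same augmentation to it.

```python
import random

def random_lr_patch(image, patch_size=48, rng=random):
    """Crop a random patch_size x patch_size patch from `image` (a list of rows)."""
    top = rng.randrange(len(image) - patch_size + 1)
    left = rng.randrange(len(image[0]) - patch_size + 1)
    return [row[left:left + patch_size] for row in image[top:top + patch_size]]

def augment(patch, rng=random):
    """Apply a random multiple of 90-degree rotation and an optional horizontal flip."""
    for _ in range(rng.randrange(4)):
        patch = [list(row) for row in zip(*patch[::-1])]  # rotate 90 degrees clockwise
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]  # mirror left to right
    return patch

def sample_batch(images, batch_size=16, patch_size=48, rng=random):
    """Draw `batch_size` augmented LR patches from randomly chosen training images."""
    return [augment(random_lr_patch(rng.choice(images), patch_size, rng), rng)
            for _ in range(batch_size)]

# Toy "dataset": two 60 x 64 single-channel images with distinct pixel values.
images = [[[r * 100 + c for c in range(64)] for r in range(60)] for _ in range(2)]
batch = sample_batch(images, rng=random.Random(0))
print(len(batch), len(batch[0]), len(batch[0][0]))
```

Square patches keep their 48x48 shape under any 90-degree rotation, so every element of the batch has the same size regardless of which augmentation was drawn.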
4.2. Result
1. Effects of RIR and CA
Whereas the baseline achieves 37.45 dB, using RIR, which includes the long skip connection (LSC) and short skip connection (SSC), together with CA raises performance to 37.90 dB.
2. Model Size Analyses
Compared with other methods (DRCN, FSRCNN, PSyCo, ENet-E), RCAN forms the deepest network and yet, with the smallest total number of parameters, shows the highest performance.
5. Conclusion
In this paper, RCAN is applied to obtain highly accurate SR images. In particular, using the RIR structure together with the LSC and SSC makes it possible to build very deep networks. RIR also bypasses the low-frequency information of the LR image, which is unnecessary, so that the network can focus on learning high-frequency information. Furthermore, CA is introduced to account for interdependencies among channels, adaptively rescaling channel-wise features. The SR performance of the proposed method is verified with the BI and BD degradation models, and it is additionally shown to perform well in object recognition.
Take home message
By separating out the information in the regions of interest within an image and applying attention to it, that information can be given more weight during training.
Stacking the network deeper is more effective for improving performance than increasing the total number of parameters.
Author / Reviewer information
1. Author
한승호 (Seungho Han)
KAIST ME
Research Topics: Formation Control, Vehicle Autonomous Driving, Image Super Resolution
https://www.linkedin.com/in/seung-ho-han-8a54a4205/
2. Reviewer
Korean name (English name): Affiliation / Contact information
Korean name (English name): Affiliation / Contact information
...
Reference & Additional materials
[Original Paper] Zhang, Yulun, et al. "Image super-resolution using very deep residual channel attention networks." Proceedings of the European conference on computer vision (ECCV). 2018.
[Github] https://github.com/yulunzhang/RCAN
[Github] https://github.com/dongheehand/RCAN-tf
[Github] https://github.com/yjn870/RCAN-pytorch
[Attention] https://wikidocs.net/22893
[Dataset] Xu, Qianxiong, and Yu Zheng. "A Survey of Image Super Resolution Based on CNN." Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. Springer, Cham, 2019. 184-199.
[BSRGAN] Zhang, Kai, et al. "Designing a practical degradation model for deep blind image super-resolution." arXiv preprint arXiv:2103.14006 (2021).
[Google's SR3] https://80.lv/articles/google-s-new-approach-to-image-super-resolution/
[SRCNN] Dai, Yongpeng, et al. "SRCNN-based enhanced imaging for low frequency radar." 2018 Progress in Electromagnetics Research Symposium (PIERS-Toyama). IEEE, 2018.
[FSRCNN] Zhang, Jian, and Detian Huang. "Image Super-Resolution Reconstruction Algorithm Based on FSRCNN and Residual Network." 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC). IEEE, 2019.
[VDSR] Hitawala, Saifuddin, et al. "Image super-resolution using VDSR-ResNeXt and SRCGAN." arXiv preprint arXiv:1810.05731 (2018).
[SRResNet] Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
[SRGAN] Nagano, Yudai, and Yohei Kikuta. "SRGAN for super-resolving low-resolution food images." Proceedings of the Joint Workshop on Multimedia for Cooking and Eating Activities and Multimedia Assisted Dietary Management. 2018.