Chaining a U-Net With a Residual U-Net for Retinal Blood Vessels Segmentation [Eng]

CURU-kor

1. Problem definition

  • ์˜ค์ง ๋ง๋ง‰์—์„œ๋งŒ ์‹ฌํ˜ˆ๊ด€๊ณ„(cardiovascular system)๋ฅผ ๋น„์นจ์Šต์ ์œผ๋กœ ๊ด€์ฐฐ ๊ฐ€๋Šฅํ•œ๋ฐ

  • ์ด๋ฅผ ํ†ตํ•ด ์‹ฌํ˜ˆ๊ด€ ์งˆํ™˜์˜ ๋ฐœ๋‹ฌ๊ณผ ๋ฏธ์„ธํ˜ˆ๊ด€์˜ ํ˜•ํƒœ๋ณ€ํ™”์™€ ๊ฐ™์€ ์ •๋ณด๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

  • ๋˜ ์ด๋ฅผ ์•ˆ๊ณผ ์ง„๋‹จ์— ์ค‘์š”ํ•œ ์ง€ํ‘œ์ธ๋ฐ

  • ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š”, U-Net(+ Residual U-net) ๋ชจ๋ธ์„ ํ™œ์šฉํ•ด ๋ง๋ง‰ ์ด๋ฏธ์ง€๋กœ๋ถ€ํ„ฐ ํ˜ˆ๊ด€์— ํ•ด๋‹นํ•˜๋Š” ํ”ฝ์…€๋“ค์„ ๊ตฌ๋ถ„ํ•œ๋‹ค.(Image Segmentation)

  • ์ด๋กœ ๋ถ€ํ„ฐ ํ˜ˆ๊ด€๋“ค์˜ ํ˜•ํƒœํ•™์  ๋ฐ์ดํ„ฐ๋ฅผ ํš๋“ํ•œ๋‹ค.

    โž” ๊ธฐ์กด ๋ฐฉ๋ฒ•๋“ค์— ๋น„ํ•ด ํ›ˆ๋ จ์‹œ๊ฐ„๊ณผ ์„ฑ๋Šฅ ์‚ฌ์ด์˜ ํŠธ๋ ˆ์ด๋“œ ์˜คํ”„๋ฅผ ์ตœ์†Œํ™” ์‹œํ‚ค๋Š” ๋ฐฉํ–ฅ์œผ๋กœ ์ง„ํ–‰ํ–ˆ์Œ

U-Net: an end-to-end model based on a fully convolutional network, proposed for image segmentation in the biomedical field.

Link: U-net


2. Motivation

๋ง๋ง‰ ํ˜ˆ๊ด€ ๋ถ„ํ• ์„ ์œ„ํ•œ ๋งŽ์€ ๋ฐฉ๋ฒ•๋“ค์ด ์ œ์•ˆ๋˜์—ˆ๊ณ , ํ˜„์žฌ๋Š” ๋Œ€๋ถ€๋ถ„ CNN์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.

  • [Cai et al., 2016] ์šฐ๋ฆฌ๊ฐ€ ์ž˜ ์•Œ๊ณ ์žˆ๋Š” VGG net๋˜ํ•œ CNN์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๊ณ ์žˆ๋‹ค.

  • [Dasgupta et al., 2017] CNN๊ณผ ๊ตฌ์กฐํ™”๋œ ์˜ˆ์ธก(structured prediction)์„ ๊ฒฐํ•ฉํ•˜์—ฌ ๋‹ค์ค‘ ๋ ˆ์ด๋ธ” ์ถ”๋ก  ์ž‘์—…(multi-label inference task)์„ ์ˆ˜ํ–‰ํ•จ.

  • [Alom et al., 2018] ์ž”๋ฅ˜ ๋ธ”๋ก(residual blocks)์„ ๋„์ž…ํ•˜๊ณ  recurrent residual Convolution Layer๋กœ ๋ณด์™„ํ•˜์˜€๋‹ค.

  • [Zhuang et al., 2019] ๋‘ ๊ฐœ์˜ U-net์„ ์Œ“์•„ ์ž”๋ฅ˜ ๋ธ”๋ก(residual blocks)์˜ ๊ฒฝ๋กœ๋ฅผ ์ฆ๊ฐ€์‹œ์ผฐ๋‹ค.

  • [Khanal et al., 2019] ๋ชจํ˜ธํ•œ ํ”ฝ์…€์— ๋Œ€ํ•ด ํ•œ๋ฒˆ ๋” ์ถ•์†Œ๋œ ๋„คํŠธ์›Œํฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ฐฐ๊ฒฝ ํ”ฝ์…€๊ณผ ํ˜ˆ๊ด€ ์‚ฌ์ด์˜ ๊ท ํ˜•์„ ์ž˜ ์žก๊ธฐ ์œ„ํ•ด ํ™•๋ฅ ์  ๊ฐ€์ค‘์น˜(stochastic weights)๋ฅผ ์‚ฌ์šฉํ–ˆ๋‹ค.

Idea

This work proposes an architecture that chains a U-Net (U-Net1) to a second U-Net with residual blocks (U-Net2).

  • The first part (U-Net1) performs feature extraction, and

  • the second part (U-Net2 with residual blocks) detects new features from the residual blocks and resolves ambiguous pixels.


3. Method

๋ณธ ์—ฐ๊ตฌ์˜ ์›Œํฌํ”Œ๋กœ์šฐ(work flow)๋Š” ์•„๋ž˜์™€ ๊ฐ™๋‹ค.

  1. ์ด๋ฏธ์ง€ํš๋“

    • ๋ง๋ง‰ ์ด๋ฏธ์ง€๋ฅผ ์ˆ˜์ง‘

  2. ์ „์ฒ˜๋ฆฌ(pre-processing)

    • ํŠน์ง• ์ถ”์ถœ(feature extraction), ํŠน์ • ํŒจํ„ด highliting, ์ •๊ทœํ™” ๋“ฑ์„ ์ง„ํ–‰

    • ์ด ์ค‘์—์„œ CNN architecture์— ์ ์šฉํ•  ํŠน์„ฑ(characteristics)๋“ค์„ ์„ ํƒ

  3. ์„ฑ๋Šฅ ํ‰๊ฐ€ ๋ฐ ๊ฐ€์ค‘์น˜ ์กฐ์ •

    • ์ตœ์ƒ์˜ ๊ฒฐ๊ณผ๋ฅผ ์œ„ํ•ด ํ•ด๋‹น ๊ณผ์ •์€ ์ง€์†์ ์œผ๋กœ ์ง„ํ–‰

  4. ๊ฒฐ๊ณผํ•ด์„

1. Pre-Processing

Pre-processing can improve the quality of the images, a step that is crucial for the CNN to detect specific characteristics.

Step 1

The RGB image is converted to grayscale. This increases the contrast between the vessels and the background, making them easier to distinguish.

The relevant equation is shown below.

image Here, R, G, and B are the channels of the image. In the equation above, G (green) is given the largest weight; the green channel is reported to contain the least noise and to preserve the finest details of the image.
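As an illustration, the conversion can be sketched with the standard luminance weights, which weight the green channel most heavily as described above (an assumption here; the paper's exact coefficients are given in the equation image):

```python
import numpy as np

# Standard ITU-R BT.601 luminance weights, used as an assumption;
# green gets the largest weight, as described above.
WEIGHTS = np.array([0.299, 0.587, 0.114])  # R, G, B

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB image to an (H, W) grayscale image."""
    return rgb @ WEIGHTS

rgb = np.zeros((2, 2, 3))
rgb[..., 1] = 255.0  # a pure-green image
gray = to_grayscale(rgb)
```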

Step 2

This is the data normalization step. It is very useful for classification algorithms, and in particular for neural networks trained with backpropagation. Normalizing the values extracted from each training sample can be expected to speed up training.

Two normalization methods were used in this study; the two methods below are reported to be the most commonly used.

  1. Min-Max normalization

  • This is the most common way to normalize data. For every feature, the minimum value is mapped to 0, the maximum to 1, and every other value to a number between 0 and 1. For example, if a feature's minimum is 20 and maximum is 40, then 30, being exactly in the middle, maps to 0.5. This transformation is linear and preserves the relationships among the original values.

๋งŒ์•ฝ v๋ผ๋Š” ๊ฐ’์— ๋Œ€ํ•ด ์ตœ์†Œ-์ตœ๋Œ€ ์ •๊ทœํ™”๋ฅผ ํ•œ๋‹ค๋ฉด ์•„๋ž˜์™€ ๊ฐ™์€ ์ˆ˜์‹์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

image
  • v′: the normalized value

  • v: the original value

  • A: the attribute value (here, the brightness of each channel; 0 is darkest, 255 is brightest)

  • MAXA: the largest brightness value in the input data (image)

  • MINA: the smallest brightness value in the input data (image)

  1. Z-์ ์ˆ˜ ์ •๊ทœํ™”(Z-Score Normalization)

  • Z-์ ์ˆ˜ ์ •๊ทœํ™”๋Š” ์ด์ƒ์น˜(outlier) ๋ฌธ์ œ๋ฅผ ํ”ผํ•˜๋Š” ๋ฐ์ดํ„ฐ ์ •๊ทœํ™” ์ „๋žต์ด๋‹ค. ๋งŒ์•ฝ feature์˜ ๊ฐ’์ด ํ‰๊ท ๊ณผ ์ผ์น˜ํ•˜๋ฉด 0์œผ๋กœ ์ •๊ทœํ™”๋˜๊ฒ ์ง€๋งŒ, ํ‰๊ท ๋ณด๋‹ค ์ž‘์œผ๋ฉด ์Œ์ˆ˜, ํ‰๊ท ๋ณด๋‹ค ํฌ๋ฉด ์–‘์ˆ˜๋กœ ๋‚˜ํƒ€๋‚œ๋‹ค. ์ด ๋•Œ ๊ณ„์‚ฐ๋˜๋Š” ์Œ์ˆ˜์™€ ์–‘์ˆ˜์˜ ํฌ๊ธฐ๋Š” ๊ทธ feature์˜ ํ‘œ์ค€ํŽธ์ฐจ์— ์˜ํ•ด ๊ฒฐ์ •๋˜๋Š” ๊ฒƒ์ด๋‹ค. ๊ทธ๋ž˜์„œ ๋งŒ์•ฝ ๋ฐ์ดํ„ฐ์˜ ํ‘œ์ค€ํŽธ์ฐจ๊ฐ€ ํฌ๋ฉด(๊ฐ’์ด ๋„“๊ฒŒ ํผ์ ธ์žˆ์œผ๋ฉด) ์ •๊ทœํ™”๋˜๋Š” ๊ฐ’์ด 0์— ๊ฐ€๊นŒ์›Œ์ง„๋‹ค. ์ตœ๋Œ€-์ตœ์†Œ ์ •๊ทœํ™”์— ๋น„ํ•ด ์ด์ƒ์น˜(outlier)์„ ํšจ๊ณผ์ ์œผ๋กœ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ๋‹ค.

image
  • ฯƒA: ํ‘œ์ค€ํŽธ์ฐจ

  • Aโ€ฒ: A์˜ ํ‰๊ท ๊ฐ’

Step 3

The third step applies "Contrast Limited Adaptive Histogram Equalization (CLAHE)", an effective method for uniformly enhancing the details of the grayscale retinal image.

  • If an image's histogram is concentrated in a narrow region, the contrast is low and the image cannot be considered good.

  • An image can be considered good when its intensities are distributed evenly over the whole range; spreading a concentrated distribution out evenly is called histogram equalization.

  • Conventional histogram equalization operates on all pixels at once, which makes the desired result hard to achieve, whereas CLAHE divides the image into small blocks of fixed size and equalizes each one, yielding a better-quality image.

Link: CLAHE
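For illustration, global histogram equalization — the operation CLAHE applies block by block, with an added contrast (clip) limit — can be sketched as follows. This is a simplified sketch, not the paper's implementation:

```python
import numpy as np

def histogram_equalize(img):
    """Globally equalize an 8-bit grayscale image.

    CLAHE performs this same remapping per block, after clipping the
    histogram so that contrast amplification is limited.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Stretch the cumulative distribution onto the full 0..255 range.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast image whose values sit in the narrow band 100..110
# spreads out over the full 0..255 range after equalization.
img = np.tile(np.arange(100, 111, dtype=np.uint8), 10).reshape(10, 11)
eq = histogram_equalize(img)
```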

Step 4

The final step adjusts brightness through the gamma value. This prevents brightness from concentrating in one place and hindering feature extraction.
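A small sketch of gamma correction on intensities normalized to [0, 1] (the gamma value here is illustrative; the paper's value is not given in this review):

```python
import numpy as np

def gamma_correct(img, gamma):
    """Apply gamma correction to intensities in [0, 1].

    gamma < 1 brightens dark regions; gamma > 1 darkens bright regions,
    spreading out brightness that is concentrated in one band.
    """
    return img ** gamma

img = np.array([0.04, 0.25, 1.0])
brightened = gamma_correct(img, 0.5)  # square root: [0.2, 0.5, 1.0]
```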

The image obtained after pre-processing is shown below.

์ „ ์ฒ˜๋ฆฌํ•œ ์ด๋ฏธ์ง€๋กœ ๋ถ€ํ„ฐ ํŒจ์น˜(patches)๋ฅผ ์ถ”์ถœํ•˜์—ฌ ๋” ํฐ ๊ทœ๋ชจ์˜ ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ํš๋“ํ•˜๊ณ  ๊ตฌ์„ฑ๋œ ์‹ ๊ฒฝ๋ง ํ›ˆ๋ จ์— ์ด์šฉํ•œ๋‹ค. ๋˜ ์ด ํŒจ์น˜(patches)์— ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๋ณ€ํ˜•(flipping)์„ ์ฃผ์–ด ๊ฐ€์šฉ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ”๊ฐ€ ํ™•๋ณดํ•œ๋‹ค.

2. Architecture

๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ์ด์ค‘ ์—ฐ๊ฒฐ๋œ U-Net์„ ์‚ฌ์šฉ๋˜์—ˆ๊ณ , ๋‘ ๋ฒˆ์งธ ๋ถ€๋ถ„์€ ์ž”๋ฅ˜ ๋„คํŠธ์›Œํฌ(residual network)๊ฐ€ ์‚ฌ์šฉ๋˜์—ˆ๋‹ค.

U-Net์€ ์ด๋ฏธ์ง€์˜ ์ „๋ฐ˜์ ์ธ ์ปจํ…์ŠคํŠธ ์ •๋ณด๋ฅผ ์–ป๊ธฐ ์œ„ํ•œ ๋„คํŠธ์›Œํฌ์™€ ์ •ํ™•ํ•œ ์ง€์—ญํ™”(Localization)๋ฅผ ์œ„ํ•œ ๋„คํŠธ์›Œํฌ๊ฐ€ ๋Œ€์นญ ํ˜•ํƒœ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.

U-Net:

The Contracting Path

  • Two repeated 3x3 convolutions (no padding)

  • ReLU activation

  • 2x2 max-pooling (stride: 2)

  • The number of channels doubles at each down-sampling step

The expanding path enlarges the feature maps with operations opposite to those of the contracting path.

The Expanding Path

  • 2x2 convolution (“up-convolution”)

  • Two repeated 3x3 convolutions (no padding)

  • The number of channels is halved at each up-sampling (up-convolution) step

  • ReLU activation

  • Each up-convolved feature map is concatenated with the correspondingly cropped feature map from the contracting path

  • A 1x1 convolution in the final layer

With the configuration above, the model is a 23-layer fully convolutional network in total. One point to note is that the final output, the segmentation map, is smaller than the input image, because the convolutions use no padding.

์ž”๋ฅ˜ ๋ธ”๋ก(Residual block):

Residual blocks were also proposed, to solve the degradation problem. image Here, FM(x) is the feature map obtained by applying two convolutional layers, expressed as F(x), to the input, with the original input x added to this transformation: FM(x) = F(x) + x. Adding the original feature map back mitigates the degradation problem that appears in the model. Below is the process used in this work.

image
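A toy sketch of the skip connection FM(x) = F(x) + x, with a simple linear transform standing in for the two convolutional layers:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, f):
    """FM(x) = F(x) + x: the input skips over the transform and is added back."""
    return relu(f(x) + x)

rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(8, 8))
f = lambda t: relu(t @ W) @ W   # stand-in for two convolutional layers

x = rng.normal(size=(4, 8))
y = residual_block(x, f)

# If F collapses to zero, the block reduces to the identity (plus ReLU),
# which is what makes very deep networks easier to optimize.
identity_out = residual_block(x, lambda t: np.zeros_like(t))
```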
  • U-Net2 with Residual blocks:

U-Net ๋„คํŠธ์›Œํฌ์˜ ์ถœ๋ ฅ๊ณผ ๋‘ ๋ฒˆ์งธ ๋„คํŠธ์›Œํฌ์˜ ์ž…๋ ฅ์„ ๊ตฌ์„ฑํ•œ๋‹ค. ๊ฐ ์ˆ˜์ค€์˜ ์ฑ„๋„ ์ˆ˜์™€ ์ด๋ฏธ์ง€ ํฌ๊ธฐ๋Š” ์•ž ์ ˆ๋ฐ˜์˜ ๋””์ฝ”๋”ฉ ๋ถ€๋ถ„๊ณผ ๋™์ผํ•˜๊ฒŒ ์œ ์ง€๋˜์—ˆ๋‹ค. ํ•˜์ง€๋งŒ Contracting๊ณผ Expanding ๋ชจ๋‘ ์ƒˆ๋กœ์šด ์ˆ˜์ค€์—์„œ ์ž”๋ฅ˜ ๋ธ”๋Ÿญ์ด ์ถ”๊ฐ€๋˜์—ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๋งˆ์ง€๋ง‰ Expanding์—์„œ ์ด์ง„ ๋ถ„๋ฅ˜ ์ž‘์—…์ด ์ˆ˜ํ–‰๋˜๋ฏ€๋กœ, 1x1 ์ปจ๋ณผ๋ฃจ์…˜์„ ์ ์šฉํ•˜์˜€๋‹ค.

image

ํ•ด๋‹น ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€์€ ๋Œ€๋ถ€๋ถ„ ๋ฐฐ๊ฒฝ์ด๊ณ  ์†Œ์ˆ˜๋งŒ์ด ํ˜ˆ๊ด€ ๊ตฌ์กฐ๋ฅผ ๋‚˜ํƒ€๋‚ธ๋‹ค(ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•). ์ด ๋•Œ๋ฌธ์— ์†์‹คํ•จ์ˆ˜๊ฐ€ ์‚ฌ์šฉ๋˜๊ณ  ๋ฐฉ์ •์‹์€ ์•„๋ž˜์™€ ๊ฐ™๋‹ค.

image

This function assigns a high loss value when a classification is wrong or uncertain and a low loss value when the prediction matches the model's expectation, thereby maximizing the overall likelihood of the data. The logarithm acts as the penalty: the lower the probability, the larger the log term. The probabilities take values between 0 and 1. In addition, each class is given a weight. image

Here, the weight w varies randomly between 1 and α, and s is the step. This dynamic weight variation prevents the network from falling into local minima. To obtain the log probabilities, a LogSoftmax function is applied to the last layer of the network.
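A minimal numpy sketch of a class-weighted negative log-likelihood with LogSoftmax (the fixed weights here are illustrative; the paper varies w randomly between 1 and α):

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def weighted_nll(logits, targets, class_weights):
    """Class-weighted negative log-likelihood: confident correct
    predictions get a low loss, wrong/uncertain ones a high loss."""
    logp = log_softmax(logits)
    picked = logp[np.arange(len(targets)), targets]
    w = class_weights[targets]
    return -(w * picked).sum() / w.sum()

# Two classes: background (0) and vessel (1); vessels are rare,
# so the vessel class gets a larger weight.
logits = np.array([[4.0, 0.0],   # confident background
                   [0.0, 4.0],   # confident vessel
                   [0.0, 4.0]])  # wrong: the true class is background
targets = np.array([0, 1, 0])
weights = np.array([1.0, 5.0])
loss = weighted_nll(logits, targets, weights)
```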


4. Experiment & Result

Dataset

  1. DRIVE

  • Each image resolution is 584*565 pixels with eight bits per color channel (3 channels).

  • 20 images for training set

  • 20 images for testing set

  2. CHASEDB

  • Each image resolution is 999*960 pixels with eight bits per color channel (3 channels).

Evaluation metric

๋ง๋ง‰ ์ด๋ฏธ์ง€๋Š” ํด๋ž˜์Šค์˜ ๋ถˆ๊ท ํ˜•์„ ๋ณด์—ฌ์ฃผ๋ฏ€๋กœ ์ ์ ˆํ•œ metric์„ ์„ ํƒํ•ด์•ผ ํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” Recall, precision, F1-score, accurarcy๋ฅผ ์ฑ„ํƒํ•˜์˜€๋‹ค.

  • Recall: tells us how many relevant samples are selected. image

  • Precision: tells us how many predicted samples are relevant. image

  • F1-Score: is the harmonic mean between recall and precision. image

  • Accuracy: measures how many observations, both positive and negative, were correctly classified. image
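The four metrics can be computed directly from the confusion-matrix counts; the toy numbers below illustrate why accuracy alone is misleading under class imbalance:

```python
def metrics(tp, fp, fn, tn):
    """Compute the four evaluation metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # share of true vessels that were found
    precision = tp / (tp + fp)         # share of predictions that are vessels
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"recall": recall, "precision": precision,
            "f1": f1, "accuracy": accuracy}

# Imbalanced toy case: 100 vessel pixels among 1000 total.
m = metrics(tp=80, fp=20, fn=20, tn=880)
# accuracy is 0.96 even though recall, precision, and f1 are all 0.8
```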

Results

1. ์ „๋ฐ˜์  ์„ฑ๋Šฅ

  • ์ƒ๊ธฐ๋œ ์ธก์ •์ง€ํ‘œ๋“ค์„ ๋ฐ”ํƒ•์œผ๋กœ, ์„ ํ–‰ ์—ฐ๊ตฌ๋“ค๊ณผ ์„ฑ๋Šฅ์„ ๋น„๊ตํ•จ

  • F1-Score์˜ ๋†’์€ ์ˆ˜์น˜๋•์— Precision ๊ณผ Recall ๋ชจ๋‘ ๊ณจ๊ณ ๋ฃจ ๋†’์€ ๊ฐ’์„ ๊ฐ€์ง -ํ˜ˆ๊ด€ ๋ถ„๋ฅ˜์— ์ ํ•ฉํ•จ

  • Accuracy์—์„œ๋Š” ๊ฐ€์žฅ ๋†’์€ ์ˆ˜์น˜๋ฅผ ๋ณด์—ฌ์ฃผ์—ˆ๊ณ , F1-Score์— ๋Œ€ํ•ด์„œ 2๋ฒˆ์งธ๋กœ ๋†’์€ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์—ฌ์คŒ

  • ๋ณธ ์—ฐ๊ตฌ๋Š” ๋Œ€๋ถ€๋ถ„์˜ ๊ฒฝ์šฐ ground truth์™€ ์ผ์น˜ํ•˜์˜€๊ณ , FP, FN ๋˜ํ•œ ์ ๋‹ค๊ณ  ๋ณผ์ˆ˜ ์žˆ๋‹ค.

2. Training time

  • This architecture shortens training time considerably compared with Khanal et al.:

    • about 1 hour for the DRIVE data set

    • about 10 hours for the CHASEDB data set

3. Segmentation and the structural similarity index (SSIM)

Drive ๋ฐ์ดํ„ฐ์…‹๊ณผ CHASEDB ๋ฐ์ดํ„ฐ์…‹์˜ ๋ถ„ํ• (segmentation)๊ฒฐ๊ณผ

๊ตฌ์กฐ ์œ ์‚ฌ๋„ ์ง€์ˆ˜(The structural similarity index, SSIM) ์€ ๋ถ„ํ• (segmentation) ํ”„๋กœ์„ธ์Šค๋ฅผ ํ‰๊ฐ€ํ•˜๊ธฐ์œ„ํ•ด ๋„์ž…ํ•จ, U-Net1 ๋งŒ ์žˆ๋Š” ์ฒซ ๋ฒˆ์งธ ๋‹จ๊ณ„์™€ ์ž”๋ฅ˜ ๋ธ”๋ก์ด ์ถ”๊ฐ€๋œ ๋‘ ๋ฒˆ์งธ ๋‹จ๊ณ„(U-Net2 with residual block)๋ฅผ ๋น„๊ตํ•˜๊ธฐ ์œ„ํ•จ.

๊ตฌ์กฐ ์œ ์‚ฌ๋„ ์ง€์ˆ˜๋Š” gtound truth์™€ ํ…Œ์ŠคํŠธ ์ด๋ฏธ์ง€๋“ค ๊ฐ„์˜ viewing distance์™€ edge information๋ฅผ ๋ถ„์„ํ•œ๋‹ค. ์ด๋Š” ์ด๋ฏธ์ง€ ํ’ˆ์งˆ ์ €ํ•˜๋ฅผ ์ˆ˜์น˜ํ™”ํ•˜์—ฌ ์ธก์ •ํ•œ๋‹ค.(์ด๋ฏธ์ง€ ์••์ถ• ๊ฐ™์€ ๊ณณ์—์„œ ์‚ฌ์šฉ) ์ด๋Š” 0 ~ 1 ์˜ ๊ฐ’์„ ๊ฐ€์ง€๊ณ , ๋†’์„์ˆ˜๋ก ์ข‹๋‹ค. ๊ทธ๋ฆผ 6์€ U-Net1๊ณผ ground truth๋ฅผ ๋น„๊ตํ•œ ๊ฒƒ์ด๊ณ , ๊ทธ๋ฆผ 7์€ ์ „์ฒด ์•„ํ‚คํ…์ณ(U-Net1 + U-Net2 with residual block)๊ณผ ground truth์™€ ๋น„๊ตํ•œ๊ฒƒ์ด๋‹ค. ํ›„์ž๊ฐ€ ๋” ๋†’์€ ์ˆ˜์น˜๋ฅผ ๊ฐ€์ง„๋‹ค.

4. Factors affecting segmentation performance

  • Chunks (clumped vessels)

Looking at the regions circled in blue, the vessels are relatively clumped together. This is an important problem in image segmentation, and the results above separate them well.

  • Whether lesions were avoided

The DRIVE data set contains seven images with lesions, which could be mistaken for vessels and segmented as such. Looking at the image above, the segmentation appears to have been performed well, avoiding the lesion area (c).

➔ A quantitative metric for this would have been welcome.

5. Conclusion

  1. ๋ณธ ์—ฐ๊ตฌ์˜ ๋…ธ๋ฒจํ‹ฐ๋Š” ํฌ๊ฒŒ 2๊ฐ€์ง€๋กœ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

  • ์ฒซ ๋ฒˆ์งธ, ๊ธฐ์กด U-Net ๋„คํŠธ์›Œํฌ์— ์ž”๋ฅ˜ ๋ธ”๋Ÿญ์„ ์ถ”๊ฐ€ํ•œ ๊ฒƒ์ด๋‹ค. ์ด๋Š” ์ด๋ฏธ์ง€์˜ ์—ดํ™”(degradation)์„ ์™„ํ™”ํ•˜๋Š”๋ฐ ํฐ ๊ธฐ์—ฌ๋ฅผ ํ–ˆ๋‹ค.

  • ๋‘ ๋ฒˆ์งธ, ์•ž์˜ U-Net์—์„œ ์–ป์€ ์ •๋ณด๋ฅผ ๋’ค์˜ U-Net(U-Net with residual blocks)์˜ ์ž”๋ฅ˜ ๋ธ”๋Ÿญ๊ณผ ์—ฐ๊ฒฐ์‹œ์ผœ ์ •๋ณด์†์‹ค์„ ์ตœ์†Œํ™” ํ•˜์˜€๋‹ค.

  2. This work captures both performance and training time.

  • It shows a level of performance similar to prior studies,

  • while its significance lies in greatly shortening the training time.

  3. Image pre-processing

  • Grayscale conversion, normalization, CLAHE, and gamma adjustment produce high-quality input images, and

  • patch extraction from the original images augments the previously insufficient data.


Take home message

์ •ํ™•๋„ ๋†’์€ ์ด๋ฏธ์ง€ ๋ถ„ํ• ์„ ์œ„ํ•ด์„œ๋Š” ๋งŽ์€ ์‹œ๊ฐ„๊ณผ ๋…ธ๋ ฅ์ด ํ•„์š”ํ•˜๋‹ค. ๋ณธ ๋…ผ๋ฌธ ์—์„œ๋Š” ๊ธฐ์กด์— ์ œ์•ˆ๋œ architecture๋“ค์„ ์ž˜ ํ™œ์šฉํ•˜์˜€๊ณ , ์ด๋ฅผ ํ†ตํ•ด ์งง์€ ํ›ˆ๋ จ์‹œ๊ฐ„์œผ๋กœ ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๋˜ํ•œ ์ด๋ฏธ์ง€ ์ „ ์ฒ˜๋ฆฌ๋ฅผ ํ†ตํ•ด ๋†’์€ ํ’ˆ์งˆ์˜ in-put ์ด๋ฏธ์ง€๋ฅผ ํš๋“ํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ๊ฒฐ๊ณผ์ ์œผ๋กœ ํ›ˆ๋ จ์‹œ๊ฐ„๊ณผ ์„ฑ๋Šฅ ์‚ฌ์ด์˜ ํŠธ๋ ˆ์ด๋“œ ์˜คํ”„๋ฅผ ์ตœ์†Œํ™”ํ•  ์ˆ˜ ์žˆ์—ˆ๋˜ ๊ฒƒ์œผ๋กœ ๋ณด์ธ๋‹ค.

Author

Korean Name (English name)

Reviewer

TBD

Reference & Additional materials

  1. [Original Paper] G. Alfonso Francia, C. Pedraza, M. Aceves and S. Tovar-Arriaga, "Chaining a U-Net With a Residual U-Net for Retinal Blood Vessels Segmentation," in IEEE Access, vol. 8, pp. 38493-38500, 2020

  2. [Blog] https://medium.com/@msmapark2/u-net-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0-u-net-convolutional-networks-for-biomedical-image-segmentation-456d6901b28a
