RCNN [Kor]

Girshick et al. / Rich feature hierarchies for accurate object detection and semantic segmentation / CVPR 2014

1. Problem definition

Progress in object detection had stagnated for some time, while CNNs rose to prominence at the 2012 ILSVRC (ImageNet Large Scale Visual Recognition Challenge). This paper shows that, on the PASCAL VOC Challenge, CNNs can deliver not only classification but also strong object detection performance.

2. Motivation

RCNN addresses object detection by combining region proposals with CNN-based classification. Later papers in the RCNN family (Fast RCNN, Faster RCNN, Mask RCNN, and others) built on RCNN and steadily improved both accuracy and speed. The overall pipeline runs as follows.

  1. Take an input image.

  2. Extract up to 2000 region proposals and crop each one from the image.

  3. Resize each cropped region to the input size the CNN expects (227x227 pixels).

  4. Run each resized region through a CNN pre-trained on ImageNet.

  5. Classify each region with an SVM, using the features the CNN produced for that region.

  6. Refine each box with a bounding box regressor.

This process performs two tasks: region proposal, which locates candidate object regions, and classification, which labels the cropped regions. Running the two in sequence solves the object detection problem.
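
The control flow of steps 2 to 5 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `warp` and `score_regions` are hypothetical names, the region proposer and the CNN are passed in as callables (`propose`, `extract_features`) rather than implemented, the image is a toy grayscale grid, and the nearest-neighbour resize is an assumption made to keep the example dependency-free. Step 6 (bounding box regression) is left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2), x2/y2 exclusive
Image = List[List[float]]         # toy grayscale image: rows of pixel values


def warp(image: Image, box: Box, size: int = 227) -> Image:
    """Crop `box` and stretch it to size x size (nearest-neighbour, aspect ratio ignored)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return [
        [image[y1 + (r * h) // size][x1 + (c * w) // size] for c in range(size)]
        for r in range(size)
    ]


def score_regions(
    image: Image,
    propose: Callable[[Image], Sequence[Box]],            # assumed: e.g. selective search
    extract_features: Callable[[Image], Sequence[float]],  # assumed: the pre-trained CNN
    svms: Dict[str, Tuple[Sequence[float], float]],        # class name -> (weights, bias)
    max_regions: int = 2000,
) -> List[Tuple[Box, Dict[str, float]]]:
    """Propose regions, warp each one, extract features, score with one linear SVM per class."""
    results = []
    for box in list(propose(image))[:max_regions]:
        features = extract_features(warp(image, box))
        # A linear SVM score is the dot product w . x plus a bias term.
        scores = {
            label: sum(wi * fi for wi, fi in zip(weights, features)) + bias
            for label, (weights, bias) in svms.items()
        }
        results.append((box, scores))
    return results
```

Each proposal is scored independently by every class SVM, which is why the CNN forward pass over roughly 2000 warped crops dominates the cost of this design.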

3. Method

The object detection system consists of three modules (region proposal, a CNN feature extractor, and linear SVMs), followed by a bounding box regression step.

  1. Region proposal

    Selective search splits the image into regions. The similarity between two regions is a weighted sum of four measures (color, texture, size, and fill), each normalized to [0,1]. Starting from the initial regions, the most similar pair is merged, and the similarities between the merged region and the remaining regions are then recomputed. Repeating this process groups similar regions together and yields the candidate regions.

  2. Pre-trained CNN (Convolutional Neural Network)

    Each region produced by region proposal is warped to a fixed size of 227x227 pixels and fed into the CNN to extract features. The CNN keeps the AlexNet architecture unchanged, except that for object detection the 1000-way ImageNet classification layer is replaced with an (N+1)-way layer for fine-tuning: N object classes (200 for ILSVRC2013, 20 for PASCAL VOC) plus one background class.

  3. SVM (Support Vector Machine)

    The CNN extracts a feature vector for each region, and linear SVMs classify the region from those features.

  4. Bounding Box Regression

    The goal of bounding box regression is to learn a mapping that moves a bounding box P, obtained from region proposal, onto its ground truth bounding box.
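
The mapping from P to the ground truth box G is usually expressed as four numbers: a scale-invariant shift of the box center and a log-space change of width and height. The sketch below follows that parameterization as given in the R-CNN paper; the function names (`to_center`, `regression_targets`, `apply_deltas`) and the corner-format boxes are assumptions made for illustration. In the paper the deltas are predicted by a learned regressor; here they are simply computed from a known G to show that the transform is invertible.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def to_center(box: Box) -> Tuple[float, float, float, float]:
    """Convert corner format to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2, y1 + h / 2, w, h


def regression_targets(P: Box, G: Box) -> Tuple[float, float, float, float]:
    """Targets (t_x, t_y, t_w, t_h) the regressor is trained to predict for proposal P."""
    px, py, pw, ph = to_center(P)
    gx, gy, gw, gh = to_center(G)
    return (
        (gx - px) / pw,      # center shift in x, relative to proposal width
        (gy - py) / ph,      # center shift in y, relative to proposal height
        math.log(gw / pw),   # log-scale change in width
        math.log(gh / ph),   # log-scale change in height
    )


def apply_deltas(P: Box, d: Tuple[float, float, float, float]) -> Box:
    """Apply predicted deltas d = (d_x, d_y, d_w, d_h) to proposal P."""
    px, py, pw, ph = to_center(P)
    dx, dy, dw, dh = d
    cx, cy = pw * dx + px, ph * dy + py
    w, h = pw * math.exp(dw), ph * math.exp(dh)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

For example, with P = (10, 10, 50, 30) and G = (20, 15, 60, 55), `regression_targets` gives a center shift of (0.25, 0.75) and a height delta of log 2, and feeding those deltas back into `apply_deltas` recovers G up to floating-point error.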

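
The greedy merging loop described under Region proposal can be sketched as follows. This is a simplified illustration under stated assumptions, not the selective search implementation: the `Region` fields, the function names, and the equal default weights are hypothetical, the four measures follow the formulas from the selective search paper as I understand them, and every pair of regions is compared (the real algorithm only considers neighbouring regions and starts from an initial over-segmentation that is not shown here).

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), x2/y2 exclusive


@dataclass
class Region:
    size: int              # number of pixels in the region
    box: Box               # tight bounding box
    colour: List[float]    # L1-normalised colour histogram
    texture: List[float]   # L1-normalised texture histogram


def _intersect(h1: Sequence[float], h2: Sequence[float]) -> float:
    return sum(min(a, b) for a, b in zip(h1, h2))


def _union_box(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def similarity(r1: Region, r2: Region, image_size: int,
               weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """Weighted sum of colour, texture, size and fill similarity, each in [0, 1]."""
    ub = _union_box(r1.box, r2.box)
    union_area = (ub[2] - ub[0]) * (ub[3] - ub[1])
    parts = (
        _intersect(r1.colour, r2.colour),                        # colour
        _intersect(r1.texture, r2.texture),                      # texture
        1.0 - (r1.size + r2.size) / image_size,                  # size: favour small regions
        1.0 - (union_area - r1.size - r2.size) / image_size,     # fill: favour tight fits
    )
    return sum(w * s for w, s in zip(weights, parts))


def merge(r1: Region, r2: Region) -> Region:
    """Merge two regions; histograms are averaged weighted by region size."""
    total = r1.size + r2.size
    def mix(h1, h2):
        return [(r1.size * a + r2.size * b) / total for a, b in zip(h1, h2)]
    return Region(total, _union_box(r1.box, r2.box),
                  mix(r1.colour, r2.colour), mix(r1.texture, r2.texture))


def hierarchical_grouping(regions: Sequence[Region], image_size: int) -> List[Box]:
    """Repeatedly merge the most similar pair; every region ever seen becomes a proposal."""
    regions = list(regions)
    proposals = [r.box for r in regions]
    while len(regions) > 1:
        # Similarities are recomputed on every pass, including those of the new region.
        i, j = max(combinations(range(len(regions)), 2),
                   key=lambda p: similarity(regions[p[0]], regions[p[1]], image_size))
        merged = merge(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged.box)
    return proposals
```

Because every intermediate merge is kept, the output contains boxes at all scales, from the initial small regions up to the whole image.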

4. Experiment & Result

TBD

5. Conclusion

RCNN achieves a 30% relative improvement over the previous best result on PASCAL VOC 2012. Its contribution is twofold. First, it solves object detection by combining region proposals with a CNN. Second, it shows that when labeled data is scarce, fine-tuning a large pre-trained CNN for the target task is an effective way to boost performance.

Author / Reviewer information

Author

  • 권문범 (NAVER)

  • https://github.com/MBKwon
