Reference: Selective Search (https://donghwa-kim.github.io/SelectiveSearch.html)
Reference: EdgeBoxes (https://donghwa-kim.github.io/EdgeBoxes.html)
Reference: https://www.youtube.com/watch?v=nDPWywWRIRo

  1. Abstract
    • SPPNet [1] and Fast R-CNN [2] have reduced the running time of detection networks, exposing region proposal computation as a bottleneck
    • We introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network (predicts object bounds and objectness scores at each position)
    • Uses Fast R-CNN for detection, but replaces the external region proposal algorithm with the RPN, merging the RPN and Fast R-CNN into a single network (only ~300 proposals, versus ~2000 proposals in R-CNN)
  2. Introduction
    • Now, proposals are the test-time computational bottleneck in state-of-the-art detection systems
      • Selective Search [4], one of the most popular methods, greedily merges superpixels based on engineered low-level features
      • EdgeBoxes [6] currently provides the best tradeoff between proposal quality and speed
    • We introduce novel Region Proposal Networks that share convolutional layers with state-of-the-art object detection networks
    • Our observation is that the convolutional feature maps used by region-based detectors, like Fast R-CNN, can also be used for generating region proposals
    • RPNs are designed to efficiently predict region proposals with a wide range of scales and aspect ratios (Figure 1-a, 1-b, 1-c)
    • RPNs completely learn to propose regions from data, and thus can easily benefit from deeper and more expressive features
  3. Related Work
    • Object Proposals
      • Grouping super-pixels: Selective Search [4], CPMC [22], MCG [23]
      • Sliding windows: objectness in windows [24], EdgeBoxes [6]
    • Deep Networks for Object Detection
      • R-CNN [5]
      • Predicting object bounding boxes: [25], OverFeat method [9], MultiBox methods [26, 27], DeepMask method [28]
      • Shared Computation of convolutions: [9, 1, 29, 7, 2]
  4. Faster R-CNN
    • Region Proposal Networks (Reference: ZFNet, https://oi.readthedocs.io/en/latest/computer_vision/cnn/zfnet.html)
      • The network takes as input an n × n spatial window of the shared convolutional feature map
      • An intermediate layer maps each window into a lower-dimensional feature
      • This feature is fed into a box-regression layer (reg) and a box-classification layer (cls)
      • Each anchor is centered at the sliding window in question and is associated with a scale and aspect ratio (by default 3 scales [128, 256, 512] × 3 aspect ratios [1:1, 1:2, 2:1], i.e. k = 9 anchors at each sliding position)
      • Translation-Invariant Anchors: both the anchors and the functions that compute proposals relative to the anchors are translation invariant
      • Multi-Scale Anchors as Regression References
      • Loss Function (Multi-task Loss)
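The multi-task loss mentioned above combines a classification term and a regression term over the sampled anchors; from the paper it can be written as:

```latex
L\big(\{p_i\},\{t_i\}\big)
  = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)
```

Here $p_i$ is the predicted objectness probability of anchor $i$, $p_i^*$ is its ground-truth label (1 for positive anchors, 0 otherwise), $t_i$ and $t_i^*$ are the predicted and ground-truth parameterized box coordinates, and $L_{reg}$ is the smooth L1 loss; the $p_i^*$ factor means the regression loss is activated only for positive anchors.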
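The anchor scheme above can be sketched in a few lines. This is a minimal illustration (not the paper's implementation), assuming the three scales are given as box side lengths and the aspect ratios as height:width, so that each anchor keeps an area of roughly scale²:

```python
import numpy as np

def generate_anchors(cx, cy, scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Generate len(scales) * len(ratios) anchors (x1, y1, x2, y2)
    centered at (cx, cy).

    Each anchor keeps an area of roughly scale**2 while its
    height:width ratio varies, as in the Faster R-CNN anchor scheme.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep area ~= scale**2:  w * h = scale**2,  h / w = ratio
            w = scale / np.sqrt(ratio)
            h = scale * np.sqrt(ratio)
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

# k = 9 anchors at one sliding-window position
anchors = generate_anchors(cx=400, cy=300)
print(anchors.shape)  # (9, 4)
```

In the full RPN, this is repeated at every position of the sliding window over the feature map, so a W × H feature map yields W × H × k anchors.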
  5. Experiments
  6. Conclusion

