
Global Context-Aware Progressive Aggregation Network for Salient Object Detection Notes

2021-11-09 04:31:07



Global Context-Aware Progressive Aggregation Network for Salient Object Detection

Facts

  1. Due to the pyramid-like structure of CNNs, high-level features help locate salient objects roughly, while low-level features help refine their boundaries.
  2. Traditional FCN-based methods simply combine semantic information with appearance information, which is insufficient and ignores the different contributions of different features.
  3. Most previous works ignore global context information, which captures the relationships among multiple salient regions. Take the figure of the girl playing ping-pong as an example: most other methods attend to the ping-pong paddle while ignoring the ball, even though the ball is related to the paddle.

Structure

The GCPANet consists of four parts

  1. FIA (Feature Interweaved Aggregation)
  2. SR (Self Refinement)
  3. HA(Head Attention)
  4. GCF (Global Context Flow)

Feature Interweaved Aggregation

Benefits

Combine low-level and high-level features so that each compensates for the other's weaknesses.

Additionally, use global context information to help the network understand the relationship between different objects (the ping-pong ball, for example), which is beneficial for generating a more complete and accurate saliency map.

What's more, global context information helps alleviate the effect of feature dilution.

Function

To fully integrate the three kinds of features mentioned above.

Implementation

High & Low Level Features

To better fuse up-sampled high-level features with low-level features, the paper suggests using multiplication instead of concatenation, which strengthens the response of salient objects and suppresses background noise.

To be specific, here is what the paper tells us

\[\mathbf W^t_h = upsample(conv_2(\mathbf f^t_h)) \]

\[\mathbf f^t_{hl}=\delta(\mathbf W^t_h\odot \mathbf{ \widetilde f_l^t }) \]

\[\mathbf W_l^t = conv_3(\mathbf{\widetilde f_l^t}) \]

\[\mathbf f_{lh}^t = \delta(\mathbf W_l^t\odot upsample(\mathbf f_h^t)) \]
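As a toy illustration of the element-wise fusion \(\delta(\mathbf W\odot \mathbf f)\) above (the 2×2 maps are made-up values, and the convolutions and upsampling are omitted), a plain-Python sketch:

```python
# Toy illustration of the multiplication-based fusion used in FIA. Shapes and
# values are assumptions; the real model applies convolutions and bilinear
# upsampling to 4-D tensors before this element-wise step.

def relu(x):
    return max(0.0, x)

def fuse(weight_map, feature_map):
    """delta(W ⊙ f): element-wise product followed by ReLU."""
    return [[relu(w * f) for w, f in zip(w_row, f_row)]
            for w_row, f_row in zip(weight_map, feature_map)]

# High-level weight map W_h: strong response only on the salient object.
W_h = [[0.0, 0.9],
       [0.1, 1.0]]
# Low-level feature f~_l: fine detail everywhere, including background noise.
f_l = [[0.8, 0.7],
       [0.6, 0.9]]

f_hl = fuse(W_h, f_l)  # background positions are pushed toward zero
```

Multiplying by the high-level map acts as a soft gate: wherever \(\mathbf W_h^t\) is near zero, the noisy low-level detail is suppressed, which concatenation alone would not do.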

Global Context Features

Introduce the global context features \(\mathbf f_{g}^t\) at each stage.

\[\mathbf W_g^t=upsample(conv_4(\mathbf f_g^t)) \]

\[\mathbf f_{gl}^t=\delta(\mathbf W_g^t \odot \mathbf{\widetilde f_l^t}) \]

Output

Concatenate the three features and pass them through a \(3\times 3\) convolution layer to obtain the output.

\[\mathbf f_a^t = conv_5(concat(\mathbf f_{hl}^t,\mathbf f_{lh}^t,\mathbf f_{gl}^t)) \]
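The aggregation step itself is channel-wise concatenation followed by a convolution. A minimal sketch of the concatenation half (toy one-channel features; the \(3\times 3\) convolution is omitted):

```python
# Channel-wise concatenation of the three fused FIA features, the input to
# conv_5. Feature values are illustrative assumptions, not the paper's.

def concat_channels(*feature_maps):
    """Stack feature maps along the channel axis (each is a list of HxW maps)."""
    channels = []
    for fm in feature_maps:
        channels.extend(fm)
    return channels

f_hl = [[[0.0, 0.6], [0.1, 0.9]]]   # one 2x2 channel each, toy values
f_lh = [[[0.2, 0.5], [0.0, 0.8]]]
f_gl = [[[0.1, 0.7], [0.2, 0.9]]]

stacked = concat_channels(f_hl, f_lh, f_gl)   # 3 channels, ready for conv_5
```

In the actual network this would be `torch.cat([...], dim=1)` followed by a `nn.Conv2d`.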

Self Refinement

Function

To reduce the contradictory response of different layers.

Implementation

\[\mathbf{\widetilde f} = conv_6(\mathbf f_{in}) \]

\[\mathbf f_{out} = \delta(\mathbf W\odot \mathbf{\widetilde f}+b) \]
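A minimal sketch of these two SR equations, assuming toy fixed maps for \(\mathbf W\) and \(b\) (in the paper both are produced by convolutions):

```python
# Self Refinement: f_out = ReLU(W ⊙ f~ + b). W and b below are toy values
# standing in for the conv-generated weight and bias maps.

def relu(x):
    return max(0.0, x)

def self_refine(W, f, b):
    return [[relu(w * v + bias) for w, v, bias in zip(wr, fr, br)]
            for wr, fr, br in zip(W, f, b)]

W = [[1.5, 0.2], [0.2, 1.5]]     # amplifies consistent responses
f = [[0.4, 0.6], [0.5, 0.8]]     # refined feature f~ (after conv_6)
b = [[0.0, -0.3], [-0.3, 0.0]]   # negative bias damps contradictory responses

out = self_refine(W, f, b)
```

Positions with a small weight and negative bias are driven below zero and clipped by the ReLU, which is how contradictory responses between layers get suppressed.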

Head Attention (HA)

Function

To select important and representative features from the output of the top layers, which usually contains much redundant information.

Location

As mentioned above, it is located right after the topmost layer and processes that layer's output.

Implementation

  • Apply a convolution layer to the input feature maps \(\mathbf F\) to obtain a compressed feature representation \(\mathbf{\widetilde F}\) with 256 channels.

  • Generate a mask \(\mathbf W\) and a bias \(\mathbf{b}\), then we get

    \[\mathbf {F_1 = \delta(W\oplus \widetilde F+b)} \]

    where \(\delta\) denotes the ReLU activation function

  • Use average pooling to down-sample \(\mathbf F\) into a channel-wise feature vector \(\mathbf f\)

  • Apply two successive fully connected layers to \(\mathbf f\) to get an output vector \(\mathbf y\)

  • Get the final output \(\mathbf F_{out} = \mathbf F_1 \odot \mathbf y\)
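The channel-attention half of these steps (pooling, two fully connected layers, channel-wise rescaling) can be sketched as follows. The FC weight matrices, the ReLU between them, and the sigmoid at the end are toy assumptions to make the sketch concrete:

```python
# Sketch of HA's channel attention: GAP -> FC -> FC -> gate, then F1 ⊙ y.
# All weights are illustrative; the real layers are learned.
import math

def gap(feature_maps):
    """Average-pool each channel's HxW map down to one scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]

def fc(vec, weights):
    """A fully connected layer as a plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

F1 = [[[0.2, 0.4], [0.6, 0.8]],   # channel 0 (2x2)
      [[0.9, 0.7], [0.5, 0.3]]]   # channel 1 (2x2)

f = gap(F1)                                                  # [0.5, 0.6]
h = [max(0.0, v) for v in fc(f, [[1.0, 0.0], [0.0, 1.0]])]   # first FC + ReLU
y = [sigmoid(v) for v in fc(h, [[2.0, 0.0], [0.0, 2.0]])]    # second FC + gate
F_out = [[[v * y[c] for v in row] for row in F1[c]]          # F1 ⊙ y
         for c in range(len(F1))]
```

Each channel of \(\mathbf F_1\) is rescaled by one scalar in \(\mathbf y\), so uninformative channels can be turned down wholesale.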

Global Context Flow (GCF)

Function

To better understand the relationship between different salient objects, and to alleviate the effect of feature dilution.

Implementation

\[\mathbf y^t = \sigma \circ fc_4 \circ \delta \circ fc_3 (\mathbf f_{gap}) \]

\[\mathbf{\widetilde f}^t = conv_{10}(\mathbf f_{top}) \]

\[\mathbf f_g^t = \mathbf{\widetilde f}^t \odot \mathbf y^t \]
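The three GCF equations can be sketched together: a stage-specific gate \(\mathbf y^t\) is computed from the globally pooled top feature and applied to a projection of \(\mathbf f_{top}\). The weight matrices and feature values below are toy assumptions, and the convolutions are stubbed out:

```python
# Sketch of Global Context Flow: y^t = sigmoid(fc4(ReLU(fc3(f_gap)))),
# f_g^t = conv_10(f_top) ⊙ y^t. Weights/features are illustrative only.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def stage_gate(f, w3, w4):
    """Per-stage gate: two FC layers with a ReLU in between, then sigmoid."""
    hidden = [max(0.0, sum(a * b for a, b in zip(row, f))) for row in w3]
    return [sigmoid(sum(a * b for a, b in zip(row, hidden))) for row in w4]

f_gap = [0.5, 0.2]              # globally average-pooled top feature

# Two stages reuse the SAME global feature but learn DIFFERENT gates.
y1 = stage_gate(f_gap, [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
y2 = stage_gate(f_gap, [[0.0, 1.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])

f_top_proj = [0.8, 0.4]         # conv_10(f_top), stubbed as a fixed vector
f_g1 = [a * b for a, b in zip(f_top_proj, y1)]   # f_g^1 = f~^1 ⊙ y^1
```

The point of the per-stage gates is that each decoder stage receives its own reweighted view of the same global context, which is what counteracts feature dilution at shallower stages.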

Results

Outperforms 12 other state-of-the-art methods on 6 benchmark datasets.

An ablation study confirms the effectiveness of each of the four main components of GCPANet.

My Experiments

I used the BJTU HPC platform to run the code.

So many troubles :<

  • Had trouble SSHing to the server.

    sol: The platform supports WinSCP, which can pass the password to PuTTY, so I could SSH to the server indirectly.

  • Failed to pass a parameter to test.py due to the restrictions of the BJTU HPC platform.

    sol: Replace sys.argv[1] with the parameter I was trying to pass. A better solution would be to write a start.py.

  • Failed to locate files, since the working path is redirected to "jobs/xxx".

    sol: Add os.chdir('/data/home/u20281202/SOD/GCPANet-master/') at the beginning of test.py.

  • Failed to load the ResNet-50 weights.

    sol: Upload resnet50-19c8e357.pth to the model folder and modify the initialize function:

    def initialize(self):
        self.load_state_dict(torch.load('./model/resnet50-19c8e357.pth'), strict=False)
    

    (I missed the leading dot in the path when I thought I was about to run it successfully, only to find that the MAE was way too large.)
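The start.py idea mentioned above could look something like this. This is a hypothetical sketch: the project path and the 'GCPANet' argument value are placeholders, not values confirmed by the repository.

```python
# Hypothetical start.py: fix the working directory and inject the argument
# test.py expects, instead of hard-coding sys.argv[1] inside test.py.
import os
import sys

# Placeholder for the real project location on the cluster; guarded so the
# sketch also runs where that path does not exist.
PROJECT_DIR = '/data/home/u20281202/SOD/GCPANet-master/'
if os.path.isdir(PROJECT_DIR):
    os.chdir(PROJECT_DIR)

# 'GCPANet' is a stand-in value for whatever parameter test.py actually takes.
sys.argv = ['test.py', 'GCPANet']

# runpy.run_path('test.py', run_name='__main__') would then launch the script.
```

This keeps the repository's test.py unmodified, so pulling upstream updates stays painless.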

Observations

Source: https://www.cnblogs.com/ghostcai/p/15527092.html
