Convolutional Neural Networks with Caffe and NEGATIVE or FALSE IMAGES


When you train a set of classes (let's say the number of classes, #classes, is n) with Caffe (or any other CNN deep learning framework) and then query the resulting caffemodel, you get a probability (%) per class for the input image. So far so good.

So, let's say I take a picture similar to class 1; the result is:

1.- 96%

2.- 4%

rest... 0%

The problem is: when I take a random picture (for example, of my environment), I keep getting the same kind of result: one class predominates (>90% probability) even though the image doesn't belong to any class.
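For reference, the query looks roughly like this through pycaffe (a minimal sketch; deploy.prototxt, model.caffemodel, mean.npy and test.jpg are placeholder names for my own files):

import caffe
import numpy as np

caffe.set_mode_cpu()

# Placeholder paths: deploy definition, trained weights and mean image.
net = caffe.Classifier('deploy.prototxt', 'model.caffemodel',
                       mean=np.load('mean.npy').mean(1).mean(1),
                       channel_swap=(2, 1, 0),  # RGB -> BGR, as in the reference models
                       raw_scale=255,
                       image_dims=(256, 256))

image = caffe.io.load_image('test.jpg')
probs = net.predict([image], oversample=False)[0]

# The softmax forces the probabilities to sum to 1 over the known classes,
# so even a nonsense image gets all of its mass assigned somewhere.
for label, p in sorted(enumerate(probs), key=lambda t: -t[1])[:3]:
    print(label, '%.2f%%' % (100 * p))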

So I'd like to hear opinions/answers from people who have experienced this and have solved it: how do you deal with nonsense inputs to a neural network?

My ideas are:

  • Train one more class with negative images (as in train_cascade); see the sketch after this list.
  • Train one more class with positive images in the train set and negative images in the val set.

I don't have a scientific basis for either of these ideas; that's why I'm asking this question.
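For the first idea, appending a "background" label to the usual Caffe image list would look something like this (a minimal sketch; train.txt, the negatives/ folder and n = 5 are hypothetical):

import glob

n = 5  # hypothetical number of real classes; label n becomes the negative class

# Caffe image lists are plain text lines of the form: <path> <integer label>
with open('train.txt', 'a') as f:
    for path in glob.glob('negatives/*.jpg'):
        f.write('%s %d\n' % (path, n))

The num_output of the last inner-product layer would then have to be n + 1 to make room for the extra label.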

What would you do?

Thanks in advance.

Rafael.


Edit:

After 2 months, a colleague of mine threw me a clue: the activation function.

I've seen that I use ReLU in every layer, which means the value is x when x > 0 and 0 otherwise. These are the layers:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers { name: "relu1" type: RELU bottom: "conv1" top: "conv1" }
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm1"
  type: LRN
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "norm1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu2" type: RELU bottom: "conv2" top: "conv2" }
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm2"
  type: LRN
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layers {
  name: "conv3"
  type: CONVOLUTION
  bottom: "norm2"
  top: "conv3"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers { name: "relu3" type: RELU bottom: "conv3" top: "conv3" }
layers {
  name: "conv4"
  type: CONVOLUTION
  bottom: "conv3"
  top: "conv4"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu4" type: RELU bottom: "conv4" top: "conv4" }
layers {
  name: "conv5"
  type: CONVOLUTION
  bottom: "conv4"
  top: "conv5"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu5" type: RELU bottom: "conv5" top: "conv5" }
layers {
  name: "pool5"
  type: POOLING
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu6" type: RELU bottom: "fc6" top: "fc6" }
layers {
  name: "drop6"
  type: DROPOUT
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layers {
  name: "fc7"
  type: INNER_PRODUCT
  bottom: "fc6"
  top: "fc7"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers {
  name: "relu7"
  type: RELU
  relu_param { negative_slope: -1 }
  bottom: "fc7"
  top: "fc7"
}
layers {
  name: "drop7"
  type: DROPOUT
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 1000
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"
  bottom: "label"
}

If I make the ReLU return x as-is (so negative values when x < 0), the network converges to accuracy = 0...
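For reference, Caffe's ReLU layer computes x for x > 0 and negative_slope * x otherwise (negative_slope defaults to 0). A minimal numpy sketch to check what a given slope actually does:

import numpy as np

def caffe_relu(x, negative_slope=0.0):
    # Caffe's ReLU: x for x > 0, negative_slope * x otherwise.
    return np.where(x > 0, x, negative_slope * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(caffe_relu(x))                     # standard ReLU: [0.  0.  0.  0.5 2. ]
print(caffe_relu(x, negative_slope=-1))  # slope -1 mirrors negatives: [2.  0.5 0.  0.5 2. ]

Note that negative_slope: -1, as in relu7 above, mirrors negative inputs to positive values (an absolute value) rather than letting them pass through unchanged; a slope of 1 would make the layer the identity.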

Is there a better way to do this?

Train a class with negative examples.

Or - this works - use a pre-trained network and its weights, if the network definition satisfies you (for example, one trained on ImageNet), and add your classes as additional labels. That way you have a higher chance of not overfitting the additional (negative) class. If your network is different, you can train it from scratch on a larger dataset instead of using pre-trained weights.
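A minimal sketch of the fine-tuning route in pycaffe (assuming a solver.prototxt that points at your adapted train/val definition; the weights file here is the BVLC reference model, but any compatible pre-trained weights work):

import caffe

caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
# Weights are copied layer by layer, matched by name. A renamed last layer
# (e.g. fc8 resized to your number of classes plus the negative label)
# keeps its random initialization and is trained from scratch.
solver.net.copy_from('bvlc_reference_caffenet.caffemodel')
solver.solve()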

