Convolutional Neural Networks with Caffe and NEGATIVE or FALSE IMAGES
When I train a set of classes (let the number of classes = n) on Caffe (or any CNN deep learning framework) and then query the resulting caffemodel, the probability it returns for an image of a known class looks fine.
So, if I take a picture similar to class 1, the result is:
1.- 96%
2.- 4%
rest... 0%
The problem is: when I take a random picture (for example, of my environment), I keep getting the same kind of result, with one predominant class (>90% probability) even though the image doesn't belong to any class.
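To make the failure mode concrete, here is a minimal numpy sketch (the logit values are made up for illustration): the softmax at the end of the network normalizes whatever raw scores fc8 produces so that they sum to 1, so even a nonsense input usually ends up with one dominant class - there is no built-in "none of the above" output.

import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability, then normalize.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# Hypothetical raw fc8 scores for a nonsense input: the values mean
# nothing, but softmax still forces the probabilities to sum to 1.
logits = np.array([3.2, 0.1, -1.0, 0.5])
probs = softmax(logits)
print(probs)        # approx. [0.89 0.04 0.01 0.06] -> one class dominates
print(probs.sum())  # 1.0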
So I'd like to hear opinions/answers from people who have experienced this and have solved how to deal with nonsense inputs to a neural network.
My ideas so far are:
- Train one more class with negative images (like train_cascade does).
- Train one more class with positive images in the train set and negative images in the val set.
These ideas don't have a scientific basis behind them, which is why I'm asking this question.
What would you do?
Thanks in advance.
Rafael.
EDIT:
After 2 months, a colleague of mine gave me a clue: the activation function.
I've seen that I use ReLU in every layer, which means the output is x when x > 0 and 0 otherwise. These are my layers:
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers { name: "relu1" type: RELU bottom: "conv1" top: "conv1" }
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm1"
  type: LRN
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "norm1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu2" type: RELU bottom: "conv2" top: "conv2" }
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm2"
  type: LRN
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layers {
  name: "conv3"
  type: CONVOLUTION
  bottom: "norm2"
  top: "conv3"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers { name: "relu3" type: RELU bottom: "conv3" top: "conv3" }
layers {
  name: "conv4"
  type: CONVOLUTION
  bottom: "conv3"
  top: "conv4"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu4" type: RELU bottom: "conv4" top: "conv4" }
layers {
  name: "conv5"
  type: CONVOLUTION
  bottom: "conv4"
  top: "conv5"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu5" type: RELU bottom: "conv5" top: "conv5" }
layers {
  name: "pool5"
  type: POOLING
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers { name: "relu6" type: RELU bottom: "fc6" top: "fc6" }
layers {
  name: "drop6"
  type: DROPOUT
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layers {
  name: "fc7"
  type: INNER_PRODUCT
  bottom: "fc6"
  top: "fc7"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers {
  name: "relu7"
  type: RELU
  relu_param { negative_slope: -1 }
  bottom: "fc7"
  top: "fc7"
}
layers {
  name: "drop7"
  type: DROPOUT
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 1000
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"
  bottom: "label"
}
If I make the ReLU pass negative values through instead of zeroing them (so the output is also x when x < 0), the network converges to accuracy = 0...
Is there a better way?
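For reference, Caffe's ReLU layer does support a leaky variant via relu_param, but the conventional leaky ReLU uses a small positive slope rather than a slope of 1 (or the -1 set on relu7 above); a sketch of relu7 with that setting:

layers {
  name: "relu7"
  type: RELU
  relu_param { negative_slope: 0.01 }  # f(x) = x if x > 0, else 0.01 * x
  bottom: "fc7"
  top: "fc7"
}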
Train a class with negative examples.
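In Caffe, this option only requires giving the negative/background images a label of their own in the image list consumed by an IMAGE_DATA layer; a minimal sketch with hypothetical paths, assuming n = 2 real classes so the negatives become label 2:

/data/train/class_a/001.jpg 0
/data/train/class_b/001.jpg 1
/data/train/background/001.jpg 2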
Or - this will work - use a pre-trained network and its weights if the network definition satisfies you, for example one trained on ImageNet, and add your classes as additional labels. This way you have a higher chance of not overfitting on the additional (negative) class. If your network is different, you can train from scratch on a larger dataset instead of using pre-trained weights.
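A minimal sketch of that fine-tuning route (file names are hypothetical): Caffe copies weights between networks by layer name, so renaming the last inner-product layer forces a freshly initialized classifier sized for your labels, while every other layer starts from the pre-trained weights.

# In the train prototxt: rename fc8 and size it for your labels,
# e.g. n = 2 real classes + 1 negative class = 3 outputs.
layers {
  name: "fc8-mytask"   # new name => weights NOT copied from the snapshot
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8-mytask"
  blobs_lr: 10         # let the new classifier learn faster than the rest
  blobs_lr: 20
  inner_product_param {
    num_output: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}

# Then start training from the pre-trained weights:
caffe train -solver solver.prototxt -weights bvlc_reference_caffenet.caffemodel

(Older Caffe versions shipped a separate finetune_net tool that does the same job.)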