Batch normalization (and Nesterov momentum) seem to help. After only 11 epochs, a ~50% smaller network is able to reach equivalent validation performance.
Epoch 11
Train Accuracy 0.874000
Valid Accuracy 0.802000
Loss 0.364031
The code for the batch normalization layer is here:
https://github.com/kastnerkyle/ift6266h15/blob/master/normalized_convnet.py#L46
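For reference, the core of batch normalization is just a per-feature standardization of each minibatch followed by a learned scale and shift. Here is a minimal numpy sketch of the forward pass; the function name and shapes are illustrative, not taken from the linked code:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: minibatch activations, shape (batch_size, n_features)
    # gamma, beta: learned scale and shift, shape (n_features,)
    mean = x.mean(axis=0)                    # per-feature minibatch mean
    var = x.var(axis=0)                      # per-feature minibatch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalize to zero mean, unit variance
    return gamma * x_hat + beta              # restore representational power
```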
With the same-sized network as before, validation accuracy stays pretty consistently around 80% but the model begins to massively overfit. The best validation scores, with 0.95 Nesterov momentum (a sketch of the update is shown after the results below), are:
Epoch 10
Train Accuracy 0.875050
Valid Accuracy 0.813800
Loss 0.351992
Epoch 36
Train Accuracy 0.967650
Valid Accuracy 0.815800
Epoch 96
Train Accuracy 0.992100
Valid Accuracy 0.822000
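For completeness, this is roughly the Nesterov momentum update I have in mind: the gradient is evaluated at a "looked-ahead" position rather than at the current parameters. A small sketch, where the learning rate, function names, and structure are my own illustration rather than the actual training script:

```python
import numpy as np

def nesterov_update(param, grad_fn, velocity, lr=0.01, momentum=0.95):
    # Evaluate the gradient at the look-ahead point param + momentum * velocity,
    # then update the velocity and take a step.
    lookahead = param + momentum * velocity
    grad = grad_fn(lookahead)
    velocity = momentum * velocity - lr * grad
    return param + velocity, velocity
```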
I next plan to try batch normalization on fully connected and convolutional VAEs, first on MNIST, then LFW, then probably cats and dogs. It would be nice to either a) mess with batch normalization and do it properly *or* simplify the equations somehow, or b) do some reinforcement learning like Julian is doing, but on the minibatch selection process. However, time is short!