Wednesday, March 11, 2015

IFT6266 Week 6

The network has been working this entire time; there was just a bug in the code that prints the prediction accuracy percentage! See this snippet for an example:


In [3]: import numpy as np
In [4]: a = np.array([1., 2., 3., 4., 5.])
In [5]: b = np.array([[1.], [2.], [3.], [4.], [5.]])

In [6]: a
Out[6]: array([ 1., 2., 3., 4., 5.])

In [7]: b
Out[7]:
array([[ 1.],
       [ 2.],
       [ 3.],
       [ 4.],
       [ 5.]])

In [8]: (a == b).mean()
Out[8]: 0.20000000000000001
The comparison a == b broadcasts the (5,) array against the (5, 1) array into a (5, 5) matrix where only the diagonal entries are True, so the mean is 5/25 = 0.2. The correct way to compute the accuracy is (a.flatten() == b.flatten()).astype('float32').mean().

For a task with 2 classes, the broadcast version will always give ~50% accuracy, since roughly half of all prediction/target pairs agree by chance, but the number still drifts slightly as the predictions change!
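Here is a minimal self-contained check of both versions:

import numpy as np

a = np.array([1., 2., 3., 4., 5.])            # predictions, shape (5,)
b = np.array([[1.], [2.], [3.], [4.], [5.]])  # targets, shape (5, 1)

# Broadcasting compares every prediction against every target:
# entry (i, j) is b[i] == a[j], so only the diagonal is True.
print((a == b).shape)   # (5, 5)
print((a == b).mean())  # 0.2 -- 5 matches out of 25

# Flattening both arrays first gives the intended elementwise comparison.
print((a.flatten() == b.flatten()).astype('float32').mean())  # 1.0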

The bug is obvious in hindsight, but it took a long time to find. Now, after 100 epochs, the results are as follows:

Epoch 99
Train Accuracy  0.982200
Valid Accuracy  0.751000
Loss 0.021493

The general network architecture is:
Resize all images to 64x64, then crop the center 48x48
No flipping, random subcrops, or any other augmentation

32 kernels, 4x4 with 2x2 pooling and 2x2 strides
64 kernels, 2x2 with 2x2 pooling
128 kernels, 2x2 with 2x2 pooling
512x128 fully connected
128x64 fully connected
64x2 fully connected
All weights initialized uniformly in [-0.1, 0.1]
ReLU activations at every layer
Softmax cost
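
A quick sanity check on the layer sizes (my reading: the 2x2 stride belongs to the first convolution, all convolutions are 'valid', and pooling is non-overlapping with floor). Under that reading, the conv stack hands exactly 512 values to the first fully connected layer:

def conv_out(size, kernel, stride=1):
    # 'valid' convolution output size
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    # non-overlapping max pooling (floor)
    return size // pool

s = 48                                     # center crop of the 64x64 resize
s = pool_out(conv_out(s, 4, stride=2), 2)  # 32 kernels:  48 -> 23 -> 11
s = pool_out(conv_out(s, 2), 2)            # 64 kernels:  11 -> 10 -> 5
s = pool_out(conv_out(s, 2), 2)            # 128 kernels:  5 -> 4 -> 2
print(s, 128 * s * s)                      # 2 512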

I plan to try:
Adding random horizontal flipping (a rough sketch of this and ZCA follows the list)
ZCA whitening with a 0.1 bias
Dropout or batch normalization
PReLU
Dark knowledge / knowledge distillation
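
For the first two items, here is roughly what I have in mind (a minimal numpy sketch, assuming batches stored as (N, C, H, W) arrays and a flattened (N, D) design matrix; the function names are just placeholders, not code I have run on the dataset yet):

import numpy as np

def random_flip_horizontal(batch, rng=np.random):
    # Flip each image in an (N, C, H, W) batch left-right with probability 0.5.
    out = batch.copy()
    mask = rng.rand(batch.shape[0]) < 0.5
    out[mask] = out[mask][..., ::-1]  # reverse the width axis
    return out

def zca_whiten(X, bias=0.1):
    # ZCA whitening of an (N, D) design matrix; the bias is added to the
    # eigenvalues so small components don't blow up the transform.
    X = X - X.mean(axis=0)
    cov = X.T.dot(X) / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs.dot(np.diag(1.0 / np.sqrt(eigvals + bias))).dot(eigvecs.T)
    return X.dot(W)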
