In [3]: import numpy as np
In [4]: a = np.array([1., 2., 3., 4., 5.])
In [5]: b = np.array([[1.], [2.], [3.], [4.], [5.]])
In [6]: a
Out[6]: array([ 1., 2., 3., 4., 5.])
In [7]: b
Out[7]:
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 5.]])
In [8]: (a == b).mean()
Out[8]: 0.20000000000000001
The correct way to compute the accuracy is (a.flatten() == b.flatten()).astype('float32').mean()
Without the flatten, the (N,) array and the (N, 1) array broadcast to an N x N matrix of pairwise comparisons, so for a task with 2 classes the reported accuracy will always hover around 50% and only change slightly as the predictions change!
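Here is the same failure reproduced end to end, with made-up prediction and label values purely for illustration:

import numpy as np

preds  = np.array([0., 1., 1., 0., 1.])            # shape (5,)  -- hypothetical predictions
labels = np.array([[0.], [1.], [0.], [0.], [1.]])  # shape (5, 1) -- hypothetical labels

# Bug: comparing (5,) against (5, 1) broadcasts to a (5, 5) matrix of
# pairwise comparisons, so the mean runs over 25 entries instead of 5.
buggy = (preds == labels).mean()

# Fix: flatten both arrays so the comparison is elementwise.
correct = (preds.flatten() == labels.flatten()).astype('float32').mean()

print(buggy, correct)  # 0.48 vs 0.8 for these made-up values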
It is an obvious bug in hindsight, but it took a long time to find. Now after 100 epochs, the results are as follows:
Epoch 99
Train Accuracy 0.982200
Valid Accuracy 0.751000
Loss 0.021493
The general network architecture is (a rough code sketch follows the list):
Resize all images to 64x64, then crop the center 48x48
No flipping, random subcrops, or any other augmentation
32 kernels, 4x4 with 2x2 pooling and 2x2 strides
64 kernels, 2x2 with 2x2 pooling
128 kernels, 2x2 with 2x2 pooling
512x128 fully connected
128x64 fully connected
64x2 fully connected
All initialized uniformly [-0.1, 0.1]
ReLU activations at every layer
Softmax cost
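The original framework isn't stated, so here is a rough PyTorch sketch of the stack above, just to make the shapes concrete. The 3 input channels and putting the 2x2 stride on the first convolution are my assumptions, but that stride is what makes the flattened features come out to 512 for the first fully connected layer.

import torch.nn as nn

# Rough sketch of the architecture above; not the original code.
model = nn.Sequential(
    # 48x48 input; 4x4 conv, stride 2 -> 23x23; 2x2 max pool -> 11x11
    nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
    nn.MaxPool2d(2),
    # 2x2 conv -> 10x10; 2x2 max pool -> 5x5
    nn.Conv2d(32, 64, kernel_size=2), nn.ReLU(),
    nn.MaxPool2d(2),
    # 2x2 conv -> 4x4; 2x2 max pool -> 2x2, so 128 * 2 * 2 = 512 features
    nn.Conv2d(64, 128, kernel_size=2), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 2),  # softmax cost = cross-entropy loss over these 2 logits
)

# All parameters initialized uniformly in [-0.1, 0.1].
for p in model.parameters():
    nn.init.uniform_(p, -0.1, 0.1)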
I plan to try:
Adding random horizontal flipping (see the sketch after this list)
ZCA with 0.1 bias
Dropout or batch normalization
PReLU
Dark knowledge / knowledge distillation
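For concreteness, the current preprocessing plus the planned horizontal flip would look roughly like this with torchvision transforms (again a sketch, not the original pipeline):

from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((64, 64)),        # resize every image to 64x64
    transforms.CenterCrop(48),          # crop the center 48x48
    transforms.RandomHorizontalFlip(),  # planned augmentation, not used in the run above
    transforms.ToTensor(),
])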