This week was largely spent on presentations and getting ready for the last push before April 20th.
A semi-supervised (feedforward) VAE will probably be my last topic. The model I hope to use will take the label and concatenate it after the code layer, which should allow the decoder to mix this information in during reconstruction. This means that it should be possible to sample the code layer, clamp the label to the "ground truth" or a chosen label, and get examples of the generated class. It should also be possible to feed in unlabeled X and generate Y'. The cost would then be nll + KL + indicator{labeled, not labeled} * softmax error, where the indicator is 1 for labeled examples and 0 otherwise.
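As a rough sketch of that cost (not a final implementation), here is how the three terms could be combined in NumPy. It assumes a Bernoulli reconstruction likelihood, a diagonal Gaussian code layer with a standard normal prior, and a 0/1 `is_labeled` mask per example. All function and argument names are made up for illustration.

```python
import numpy as np

def bernoulli_nll(x, x_hat, eps=1e-7):
    # Reconstruction term: negative log-likelihood of x under Bernoulli(x_hat).
    x_hat = np.clip(x_hat, eps, 1.0 - eps)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat), axis=1)

def gaussian_kl(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over code dimensions.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

def softmax_error(logits, y_onehot):
    # Cross-entropy between softmax(logits) and the one-hot label.
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.sum(y_onehot * log_p, axis=1)

def semi_supervised_cost(x, x_hat, mu, log_var, logits, y_onehot, is_labeled):
    # nll + KL + indicator * softmax error, averaged over the minibatch.
    # is_labeled is 1.0 where a label exists and 0.0 otherwise, so the
    # rows of y_onehot for unlabeled examples are ignored.
    per_example = (bernoulli_nll(x, x_hat)
                   + gaussian_kl(mu, log_var)
                   + is_labeled * softmax_error(logits, y_onehot))
    return per_example.mean()
```

With the mask set to all zeros this reduces to the ordinary VAE cost, which is the "no labels" limit.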
This can be seen as two separate models that share parameters - a standard classifier from X to Y, predicting Y', and a VAE from X to X' where the sampled code layer is partially clamped. This may require adding another KL term, but I hope it will be sufficient to train the softmax penalty using the available labeled data. In the limit of no labels, this should devolve back into a standard VAE with KL evaluated on only *part* of the code layer, which may not be ideal. The softmax parameters of the white box may be more of a problem than I am anticipating.
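To make the "two models sharing parameters" view concrete, here is a minimal NumPy sketch of one possible forward pass. It is only a guess at the wiring: it assumes a single shared hidden layer feeding both the softmax and the Gaussian code, omits biases, and uses the predicted Y' in place of the label when none is given. The parameter names (`W_enc`, `W_cls`, `W_dec`, ...) are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def init_params(x_dim, hid_dim, code_dim, n_classes, rng):
    # Small random weights; biases omitted to keep the sketch short.
    def w(*shape):
        return 0.1 * rng.standard_normal(shape)
    return {"W_enc": w(x_dim, hid_dim),      # shared by both paths (assumption)
            "W_cls": w(hid_dim, n_classes),  # classifier head: X -> Y'
            "W_mu": w(hid_dim, code_dim),
            "W_lv": w(hid_dim, code_dim),
            "W_dec": w(code_dim + n_classes, x_dim)}

def forward(params, x, rng, y_onehot=None):
    h = np.tanh(x @ params["W_enc"])
    y_prob = softmax(h @ params["W_cls"])    # classifier path
    mu, log_var = h @ params["W_mu"], h @ params["W_lv"]
    # Reparameterized sample of the Gaussian part of the code layer.
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    # Clamp the label part to ground truth when available, else use Y'.
    y_in = y_prob if y_onehot is None else y_onehot
    x_hat = sigmoid(np.concatenate([z, y_in], axis=1) @ params["W_dec"])
    return x_hat, y_prob, mu, log_var

def generate(params, label, n_samples, rng):
    # Sample the code from the prior and clamp the label to a chosen class.
    code_dim = params["W_mu"].shape[1]
    n_classes = params["W_cls"].shape[1]
    z = rng.standard_normal((n_samples, code_dim))
    y = np.eye(n_classes)[np.full(n_samples, label)]
    return sigmoid(np.concatenate([z, y], axis=1) @ params["W_dec"])
```

Calling `forward` without `y_onehot` corresponds to feeding in unlabeled X, while `generate` corresponds to sampling examples of a chosen class.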
This model departs somewhat from others in the literature (to my knowledge), so there may be a flaw in this plan.
Diagram: