Collaborative Filtering as AutoEncoding
After doing some reading
it seems Collaborative Filtering, or the process of minimizing the distance
$Y_{ij}-\mathbf{\Theta}_{jl}\mathbf{X}_{il}$ by alternately fitting for
$\mathbf{\Theta}_{jl}$ and $\mathbf{X}_{il}$ is the same as sparse
autoencoding, or using a restricted Boltzman machine. Will have more to say
about this, in particular regarding Lasso -- L1 regularization -- regression
and non-negative matrix factorization