Streaming k-PCA and concentration for products of random matrices
Abstract: We analyze a non-convex stochastic descent algorithm for streaming PCA due to Erkki Oja (1982). Despite its simplicity, this algorithm has resisted optimal analysis outside of the rank-one case. We show that given access to a sequence of i.i.d. d x d symmetric matrices, Oja's algorithm obtains an accurate approximation to the subspace of the top k eigenvalues of their expectation using a number of samples scaling polylogarithmically in d. This matches the performance of an optimal offline algorithm for the same problem.
Our analysis is based on new concentration results for products of random matrices, which allow us to obtain strong bounds on the tails of the random matrices which arise in the course of the algorithm's execution. The argument relies on the optimal uniform smoothness properties of the Schatten trace class obtained by Ball, Carlen, and Lieb.
This is based on joint work with Huang, Tropp, and Ward.