Statistical inference for temporal data has been successfully approached via state-space modeling, with Kalman and Sequential Monte Carlo methods providing powerful approximations. While the latter enjoy theoretical guarantees in arbitrarily complex nonlinear models, their performance does not scale with the dimension of the state space, which is increasingly large in modern machine learning tasks. In contrast, variational methods coupled with stochastic gradient optimization provide powerful tools for high-dimensional i.i.d. data, but their extensions to dependent data are still nascent. Bridging results from the literature on hidden Markov models with efficient neural parameterizations from approximate structured inference, we’ll show how efficient sequential variational methods can be built with an error that grows at most linearly with time. If time allows, we’ll see how these methods can be extended to the online case and present some possible applications.
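
To make the idea of a sequential variational method concrete, the following is a minimal sketch, not the construction presented in the talk, of an amortized variational filter for a toy nonlinear Gaussian state-space model, trained by stochastic gradient ascent on a per-step ELBO. The model, network architecture, dimensions, and hyper-parameters are all illustrative assumptions.

```python
# Minimal illustrative sketch (not the talk's exact method): amortized
# sequential variational filtering on a toy nonlinear Gaussian state-space
# model, trained by stochastic gradients on a per-step ELBO.
# All dimensions, networks and hyper-parameters are assumptions.
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
dx, dy, T = 2, 2, 50              # state dim, observation dim, sequence length

# Generative model: x_t = tanh(A x_{t-1}) + v_t,   y_t = C x_t + w_t
A = 0.5 * torch.randn(dx, dx)
C = 0.5 * torch.randn(dy, dx)
q_std, r_std = 0.1, 0.1           # transition / observation noise scales

def simulate():
    xs, ys, x = [], [], torch.zeros(dx)
    for _ in range(T):
        x = torch.tanh(A @ x) + q_std * torch.randn(dx)
        ys.append(C @ x + r_std * torch.randn(dy))
        xs.append(x)
    return torch.stack(xs), torch.stack(ys)

_, ys = simulate()

# Variational filter: q(x_t | y_{1:t}) = N(mu_t, diag(sig_t^2)), with
# (mu_t, log sig_t) produced by a small network from (mu_{t-1}, y_t).
net = nn.Sequential(nn.Linear(dx + dy, 64), nn.Tanh(), nn.Linear(64, 2 * dx))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

def log_normal(x, mean, std):
    # Log-density of a diagonal Gaussian.
    return -0.5 * (((x - mean) / std) ** 2
                   + 2 * torch.log(std) + math.log(2 * math.pi)).sum(-1)

for step in range(500):
    mu_prev, elbo = torch.zeros(dx), 0.0
    for t in range(T):
        out = net(torch.cat([mu_prev, ys[t]]))
        mu, sig = out[:dx], torch.exp(out[dx:])
        x = mu + sig * torch.randn(dx)          # reparameterized sample
        # Per-step ELBO term; the previous filtering mean is plugged in for
        # x_{t-1}, a simplification kept here for brevity.
        elbo = (elbo
                + log_normal(ys[t], C @ x, torch.full((dy,), r_std))
                + log_normal(x, torch.tanh(A @ mu_prev), torch.full((dx,), q_std))
                - log_normal(x, mu, sig))
        mu_prev = mu.detach()                   # recursive, online-style update
    loss = -elbo / T
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 100 == 0:
        print(f"step {step:4d}   ELBO per step: {-loss.item():.3f}")
```

Because each filtering update depends only on the previous variational estimate and the newest observation, the cost per time step is constant, which is what makes online extensions of this kind of scheme plausible.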