In this talk we survey recent advances on the statistical efficiency and regret of reinforcement learning (RL) when good state representations are available. Motivated by this theory, we discuss what makes a good state representation for RL and how to find compact state embeddings from high-dimensional Markov state trajectories. In the spirit of diffusion maps for dynamical systems, we propose an efficient method for learning a low-dimensional state embedding that captures the process's dynamics. The embedding can be used to cluster states into metastable sets, thereby identifying the slow dynamics of a black-box system, as well as to recover interpretable latent states and soft aggregation structures through nonnegative functional factorization and convex hull approximation. We demonstrate these approaches on a variety of data sets, including Atari games.
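To make the embedding-then-clustering idea concrete, here is a minimal sketch, not the talk's actual method: for a finite-state Markov chain, the empirical transition matrix is estimated from a trajectory, its leading non-trivial eigenvectors serve as a diffusion-map-style state embedding, and a plain k-means pass groups embedded states into metastable sets. All function names, the toy two-block chain, and the choice of one embedding dimension are illustrative assumptions.

```python
import numpy as np

def embed_states(trajectory, n_states, dim):
    """Diffusion-map-style embedding: leading non-trivial eigenvectors
    of the empirical transition matrix (a sketch, not the talk's code)."""
    counts = np.zeros((n_states, n_states))
    for s, t in zip(trajectory[:-1], trajectory[1:]):
        counts[s, t] += 1.0
    # Row-normalize with light smoothing to get a stochastic matrix P.
    P = (counts + 1e-6) / (counts + 1e-6).sum(axis=1, keepdims=True)
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-np.abs(vals))
    # Skip the trivial constant eigenvector (eigenvalue 1); the next
    # eigenvectors encode the chain's slow dynamics.
    return np.real(vecs[:, order[1:dim + 1]])

def metastable_sets(embedding, k, seed=0):
    """Cluster embedded states into k metastable sets with plain k-means."""
    rng = np.random.default_rng(seed)
    centers = embedding[rng.choice(len(embedding), k, replace=False)]
    for _ in range(100):
        dists = np.linalg.norm(embedding[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = embedding[labels == j].mean(axis=0)
    return labels

# Toy chain: two 3-state blocks with rare cross-block jumps, so each
# block is a metastable set (hypothetical example data).
rng = np.random.default_rng(1)
P_true = np.full((6, 6), 0.004)
P_true[:3, :3] = 0.33
P_true[3:, 3:] = 0.33
P_true /= P_true.sum(axis=1, keepdims=True)
traj = [0]
for _ in range(20000):
    traj.append(rng.choice(6, p=P_true[traj[-1]]))

emb = embed_states(traj, n_states=6, dim=1)
labels = metastable_sets(emb, k=2)
print(labels)  # the two blocks should land in different clusters
```

The one-dimensional embedding here is the second eigenvector of the transition matrix, which is roughly constant within each block and changes sign across blocks, so k-means recovers the block structure; with more metastable sets one would keep correspondingly more eigenvectors.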