Understanding Deep Learning via Analyzing Dynamics of Gradient Descent