We study the dynamics of congestion formation in large-scale urban networks using a unique dataset of taxi movements in a megacity. We develop a dynamic model, built on a reaction term and a diffusion term, that reproduces the cascade phenomena of traffic. The interaction of these two terms drives the speeds on the road network into self-organized patterns and reveals an elegant physical law that reproduces the dynamics of congestion with very few parameters. The results show a promising match with real link speeds estimated from more than 40 million GPS coordinates per day, collected from about 20,000 taxis in Shenzhen, China.
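
The abstract does not state the model's equations. The snippet below is only a generic sketch of a reaction-diffusion update for link speeds on a network. It assumes diffusion through the graph Laplacian and a logistic reaction term that relaxes each speed toward a free-flow value. The function names, the reaction form, and the parameters `d_coef`, `r_coef`, `v_free`, and `dt` are illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj


def step(v, lap, d_coef, r_coef, v_free, dt):
    """One explicit Euler step of the assumed dynamics

        dv/dt = -d_coef * L v + r_coef * v * (1 - v / v_free)

    where v holds one speed per road link (hypothetical model form).
    """
    diffusion = -d_coef * (lap @ v)             # speeds equalize across adjacent links
    reaction = r_coef * v * (1.0 - v / v_free)  # local recovery toward free-flow speed
    return v + dt * (diffusion + reaction)


# Toy example: four links in a chain, with one congested link (index 2).
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
lap = graph_laplacian(adj)
v0 = np.array([50.0, 50.0, 10.0, 50.0])  # speeds in km/h (made-up values)
v1 = step(v0, lap, d_coef=0.1, r_coef=0.05, v_free=50.0, dt=0.1)
```

In this toy setting, a single step lowers the speeds of the links adjacent to the congested one, while the congested link itself starts to recover. This is the qualitative interplay of spreading and local recovery that a reaction-diffusion description is meant to capture.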