Future architectures will feature hundreds to thousands of simple processors and on-chip memories connected through a network-on-chip. Architectural simulators will remain primary tools for design space exploration, performance (and power) evaluation of these massively parallel architectures. However, architectural simulation performance is a serious concern, as virtual platforms and simulation technology are not able to tackle the complexity of 1,000-core future scenarios. The main contribution of this paper is the development of a simulator for 1,000-core processors which exploits the enormous parallel processing capability of low-cost and widely available General Purpose Graphic Processing Units (GPGPU). We demonstrate our GPGPU simulator on a target architecture composed by several cores (i.e. ARM ISA based), with instruction and data caches, connected trough a Networkon- Chip (NoC). Our experiments confirm the feasibility of our approach. Currently, our ongoing work is focused on developing the power models within the simulation engine