All computing platforms, from mobile to supercomputers, are becoming more and more heterogeneous and massively parallel. While they can provide higher power efficiency and computation throughput, effective and confident use of these systems always requires knowledge about low-level programming. The average time necessary to develop and optimize a design on heterogeneous platforms is higher and higher compared to typical homogeneous systems. Dataflow models of computation (MoC) are quickly becoming the common practice in heterogeneous systems development. In domains such as signal processing and multimedia communication, dataflow MoCs have become accepted as standard. However, the shift from a sequential and architecture-specific MoC to a dataflow MoC still uncovers several programming and development challenges. The Cal Actor Language (CAL) is a recently-specified dataflow and actor-based language capable of concisely expressing complex and general purpose parallel applications. However, design tools supporting this language are generally not adequate to fully exploit its features and expressiveness power. In fact, they generally restrict its MoC in order to reduce the design space exploration (DSE) effort. The objective of this thesis is to provide a DSE methodology where all the features of CAL and dynamic dataflow MoCs can be exploited in a more general and effective manner. This dissertation illustrates a novel profiling, analysis and performance estimation methodology for the DSE of dynamic dataflow programs. The main research contributions of this thesis are: the formalization of a graph-based representation of the program execution called an execution trace graph (ETG); the formalization of a systematic methodology for profiling generic dynamic dataflow programs through their code interpretation; the formalization of a complete DSE methodology for dynamic dataflow programs in order to efficiently identify close-to-optimal design points according to various and tailored performance merit functions. In particular, the following design space optimization problems for dynamic dataflow programs are addressed: the analysis of the hotspots and the algorithmic bottlenecks of a parallel program; the bounding and optimization of the buffer size configuration for complex designs; the dynamic power dissipation minimization of programs implemented in multi-clock domain architecture. Furthermore, theoretical concepts like the design space critical path and the potential speedup of a dataflow application have been defined and revisited, respectively. The thesis also presents a DSE framework developed in order to demonstrate the effectiveness of this design methodology.