This paper describes a methodology for optimizing portable parallel signal processing applications specified as dataflow programs. Using dataflow as the programming model for signal processing applications that target parallel platforms provides an important advantage over the traditional sequential programming paradigm: the portability of parallelism. The paper introduces a design space exploration methodology for evaluating alternative implementations, in which abstract traces of a program, representing the actual data dependencies among its parts, are first constructed and then analyzed to guide the refactoring and mapping of the signal processing application so that it best matches its intended parallel target. The methodology is demonstrated and evaluated in an at-size case study of an MPEG-4 video decoder.