The ever-increasing complexity of multimedia specifications, together with the emergence of a new generation of processing platforms built on multicore and multicomponent parallel architectures, makes design space exploration a necessary preliminary step for any efficient implementation of multimedia algorithms. This paper describes a new platform that supports algorithm and architecture co-exploration, starting from a pure software specification that is gradually transformed into a possibly mixed SW/HW implementation. The process relies on the profiling capabilities of the platform, which was specifically conceived to study and optimize data flows and data transfers between SW and HW modules. Different explicit or implicit (i.e., virtual memory extension) data transfer modes can be profiled during co-exploration with minimal SW reconfiguration, thus minimizing the SW/HW rewriting effort at this stage. These capabilities can serve different optimization objectives, such as optimizing the memory architecture or achieving low-power designs through appropriate minimization of data transfers. Experimental results and an example of the usage of the platform are provided for the design case of a motion estimation module for video encoding.
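To make the notion of profiling data transfers in a motion estimation module concrete, the following is a minimal sketch in pure software. It assumes a full-search block-matching scheme with a sum-of-absolute-differences (SAD) cost, which is a common choice but is not stated here to be the one used in the design case. The function names, the `counter` dictionary, and the `pixel_reads` metric are hypothetical illustrations, not the platform's API; the counter merely stands in for the kind of transfer statistics a profiling platform could collect.

```python
def sad(cur, ref, bx, by, dx, dy, n, counter):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Every pixel fetched is counted."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
            # Assumption: one read from each frame counts as one transfer.
            counter["pixel_reads"] += 2
    return total


def full_search(cur, ref, bx, by, n, search_range, counter):
    """Exhaustive search over displacements in [-search_range, search_range].
    Returns (best_cost, dx, dy), or None if no candidate fits in the frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n, counter)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8 x 8 reference frame and a current frame shifted by one pixel
# in each direction (with wrap-around at the borders).
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[(y + 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]

counter = {"pixel_reads": 0}
result = full_search(cur, ref, bx=2, by=2, n=2, search_range=2, counter=counter)
print("best (cost, dx, dy):", result)
print("pixel reads:", counter["pixel_reads"])
```

In such a sketch, the read count grows with the search range and block size, which is the kind of quantity one would want to observe before deciding which data to keep local to a HW module and which to transfer on demand.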