In this article, we show how a bio-inspired dynamic task replication algorithm, applied in the context of stream processing, can significantly improve the performance of embedded programs. We also show that this programming methodology, which is not tied to a particular implementation, can serve as a heuristic for task mapping in embedded multiprocessor systems. The technique was applied to a 36-processor system implemented on a scalable mesh of FPGAs in two case studies: for AES encryption, it yielded a tenfold speedup over a static implementation, while for MJPEG compression, throughput increased by a factor of 11.