Application performance on these processor array platforms is highly sensitive to how functionality is physically placed on the device, as this choice crucially determines communication latencies and congestion patterns of the on-chip inter-core communication. The problem of identifying the best, or just a good enough, partitioning and placement does not, in general, admit to an analytic solution, and its combinatorial nature makes solving it by pure experimentation impractical. This paper presents an approach that maps stream programs onto processor arrays using trace analysis as a technique for evaluating candidate solutions and for suggesting alternatives.