Our vectorized Helmholtz solver runs at 85% efficiency on an NEC SX-5. The most time-consuming parts have been ported to SMP, NUMA, and cluster architectures. An OpenMP version is shown to deliver similar performance when run on a 16-processor SGI Altix. A partial parallelisation using MPI was implemented to validate the Gamma model, which predicts application behaviour on parallel machines. This model is then applied to simulate the behaviour of a hypothetical full MPI version on different distributed-memory machines. It is found that only the Cray XT3, with its very fast internode communication network, would be able to deliver the performance of an NEC SX-8, with the added advantage that larger cases could be handled.