Parallel FPGA Routing with On-the-Fly Net Decomposition
A high-quality routing algorithm is crucial to achieving high-speed FPGA designs, and routing is one of the most time-consuming steps in the FPGA CAD flow. Using multiple CPUs is one way to reduce route time. However, exploiting parallelism in the most performant algorithms, which incorporate negotiated congestion, directed searches, and incremental approaches, has been challenging. We introduce two parallel routers extending the state-of-the-art PathFinder-based AIR router in VPR 8. The first is the baseline parallel router, based on the widely applied technique of recursively bi-partitioning the physical FPGA so that non-overlapping nets can be routed in parallel; however, its scalability is limited by nets (often high-fanout) that span large chip areas. The second router enhances the baseline by applying a new net decomposition method that enables fragments of nets to be routed in parallel for better scalability. For intra-cluster routing on the Titan benchmarks with eight threads, we obtain a speedup of 2.14× with the baseline and 2.38× with the net-decomposing router, compared to the latest VPR 8+ sequential router. On flat (single-step) routing, the net-decomposing router achieves a speedup of 2.15× with eight threads. The routers are deterministic and serially equivalent, achieving wire length and critical path delay comparable to the sequential algorithm. The routers are being integrated into the open-source VTR framework, enabling the research community to build on this work.
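The recursive bi-partitioning idea behind the baseline router can be illustrated with a minimal sketch. This is not the VPR implementation: the names `Net`, `PartitionNode`, and `build_partition_tree`, the midpoint cutline choice, and the use of net bounding boxes as the overlap test are all assumptions made for illustration. Nets whose bounding box crosses a cutline stay at that tree node, and the remaining nets are pushed down to the two sub-regions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Bounding box as (xmin, ymin, xmax, ymax) in inclusive grid coordinates.
BBox = Tuple[int, int, int, int]


@dataclass
class Net:
    name: str
    bbox: BBox  # assumed to lie inside the region passed to the builder


@dataclass
class PartitionNode:
    region: BBox
    nets: List[Net] = field(default_factory=list)  # nets kept at this node
    left: Optional["PartitionNode"] = None
    right: Optional["PartitionNode"] = None


def build_partition_tree(nets: List[Net], region: BBox,
                         min_nets: int = 1) -> PartitionNode:
    """Recursively split `region` in half along its longer side (hypothetical)."""
    node = PartitionNode(region)
    xmin, ymin, xmax, ymax = region
    width, height = xmax - xmin, ymax - ymin

    # Stop when there is little left to split or the region is a single tile.
    if len(nets) <= min_nets or (width == 0 and height == 0):
        node.nets = list(nets)
        return node

    vertical = width >= height  # cut across the longer dimension
    if vertical:
        cut = (xmin + xmax) // 2
        lo_region, hi_region = (xmin, ymin, cut, ymax), (cut + 1, ymin, xmax, ymax)
    else:
        cut = (ymin + ymax) // 2
        lo_region, hi_region = (xmin, ymin, xmax, cut), (xmin, cut + 1, xmax, ymax)

    lo_nets: List[Net] = []
    hi_nets: List[Net] = []
    for net in nets:
        n_lo, n_hi = (net.bbox[0], net.bbox[2]) if vertical else (net.bbox[1], net.bbox[3])
        if n_hi <= cut:
            lo_nets.append(net)      # entirely on the low side of the cutline
        elif n_lo > cut:
            hi_nets.append(net)      # entirely on the high side of the cutline
        else:
            node.nets.append(net)    # crosses the cutline: stays at this node

    if lo_nets:
        node.left = build_partition_tree(lo_nets, lo_region, min_nets)
    if hi_nets:
        node.right = build_partition_tree(hi_nets, hi_region, min_nets)
    return node


# Small example on an 8x8 grid.
example_nets = [
    Net("a", (0, 0, 1, 1)),
    Net("b", (6, 6, 7, 7)),
    Net("c", (0, 0, 7, 7)),  # spans the whole chip, e.g. a high-fanout net
    Net("d", (0, 5, 2, 7)),
]
root = build_partition_tree(example_nets, (0, 0, 7, 7))
```

In a scheme like this, the nets held at a node would be routed first, after which the left and right subtrees could be handed to separate threads, since their nets' bounding boxes do not overlap. A net such as `c` that spans a large area stays at the root and is handled before any parallel work begins, which is the scalability limit the abstract attributes to the baseline router.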
Kosar24 Parallel FPGA Routing with On-the-Fly Net Decomposition.pdf
openaccess
CC BY