CiMBA: Accelerating Genome Sequencing through On-Device Basecalling via Compute-in-Memory
As genome sequencing finds utility in a wide variety of domains beyond traditional medical settings, its computational pipeline faces two significant challenges. First, the creation of up to 0.5 GB of data per minute imposes substantial communication and storage overheads. Second, the sequencing pipeline is bottlenecked at the basecalling step, which consumes more than 40% of genome analysis time. A range of proposals have attempted to address these challenges, with limited success. We propose to address them with a Compute-in-Memory Basecalling Accelerator (CiMBA), the first embedded (~25 mm²) accelerator capable of real-time, on-device basecalling, coupled with AnaLog (AL)-Dorado, a new family of analog-focused basecalling DNNs. The resulting hardware/software co-design greatly reduces data communication overhead, achieves a throughput of 4.77 million bases per second (24× that required for real-time operation), and delivers 17×/27× better power/area efficiency than the best prior embedded basecalling accelerator, while maintaining accuracy comparable to state-of-the-art software basecallers.
2-s2.0-105000459957
International Business Machines
International Business Machines
École Polytechnique Fédérale de Lausanne
International Business Machines
Advanced Micro Devices, Inc.
Georgia State University
International Business Machines
International Business Machines
International Business Machines
ETH Zürich
2025
REVIEWED
EPFL