In this study, the authors explore sequential and parallel processing architectures, utilising a custom ultra-low-power (ULP) processing core, to extend the lifetime of health monitoring systems, where slow biosignal events and highly parallel computations exist. To this end, a single- and a multi-core architecture are proposed and compared. The single-core architecture is composed of one ULP processing core, an instruction memory (IM) and a data memory (DM), while the multi-core architecture consists of several ULP processing cores, individual IMs for each core, a shared DM and an interconnection crossbar between the cores and the DM. These architectures are compared with respect to power/ performance trade-offs for different target workloads of online biomedical signal analysis, while exploiting near threshold computing. The results show that with respect to the single-core architecture, the multi-core solution consumes 62% less power for high computation requirements (167 MOps/ s), while consuming 46% more power for extremely low computation needs when the power consumption is dominated by leakage. Additionally, the authors show that the proposed ULP processing core, using a simplified instruction set architecture (ISA), achieves energy savings of 54% compared to a reference microcontroller ISA (PIC24).