
Publication Structured pruning for efficient systolic array accelerated cascade Speech-to-Text Translation
(2025) We present in this paper a simple method for pruning tiles of weights in sparse matrices that does not require fine-tuning or retraining. This method is applied here to the feed-forward layers of transformers. In a first experiment, we assess the impact of such pruning on the performance of speech recognition, machine translation, and cascaded speech-to-text translation on the MuST-C database, for the English-to-French direction. Depending on the size of the pruned tiles (from 4x4 to 32x32), we observe that pruning rates from 15 to 40% for speech recognition and from 40 to 70% for machine translation are feasible for a performance degradation of 10%. Applying this pruning method to the systolic-array-accelerated version of the cascade speech-to-text translation system results in speedups of up to 74x compared to the non-accelerated system. Energy consumption also benefits from structured pruning, with a maximum reduction of 35%.
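As a rough illustration of what pruning tiles of weights without retraining can look like, the sketch below zeroes whole square tiles of a weight matrix. The abstract does not state how tiles are selected, so the ranking criterion used here (smallest sum of absolute values) is an assumption, and the function name `prune_tiles` and the toy matrix are hypothetical.

```python
def prune_tiles(weights, tile, rate):
    """Zero the `rate` fraction of tile x tile blocks with the smallest L1 norm.

    Assumption: tiles are ranked by L1 norm; the paper may use another criterion.
    """
    rows, cols = len(weights), len(weights[0])
    assert rows % tile == 0 and cols % tile == 0, "matrix must divide evenly into tiles"

    # Score every tile by the sum of absolute values of its entries.
    scores = []
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            norm = sum(abs(weights[r][c])
                       for r in range(r0, r0 + tile)
                       for c in range(c0, c0 + tile))
            scores.append((norm, r0, c0))

    # Zero the lowest-scoring tiles on a copy; no fine-tuning step follows.
    n_prune = int(rate * len(scores))
    pruned = [row[:] for row in weights]
    for _, r0, c0 in sorted(scores)[:n_prune]:
        for r in range(r0, r0 + tile):
            for c in range(c0, c0 + tile):
                pruned[r][c] = 0.0
    return pruned


# Toy 4x4 matrix split into four 2x2 tiles; prune half of the tiles.
W = [[0.1, 0.2, 5.0, 4.0],
     [0.0, 0.1, 3.0, 6.0],
     [2.0, 1.0, 0.3, 0.1],
     [1.5, 2.5, 0.2, 0.0]]
P = prune_tiles(W, tile=2, rate=0.5)
```

Because entire tiles become zero, a tiled accelerator could in principle skip them as a unit, which is the intuition behind pairing structured pruning with a systolic array.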
Publication Cough-E: A multimodal, privacy-preserving cough detection algorithm for the edge
(2025) Continuous cough monitors can greatly benefit doctors in home monitoring and treatment of respiratory diseases. Although many works propose algorithms to automate this task, they suffer from poor data privacy and are limited to short-term monitoring. Edge AI is a promising paradigm to overcome these limitations by processing privacy-sensitive data close to their source. However, it presents challenges for the deployment of resource-demanding algorithms on constrained devices. In this work, we propose a hardware-aware methodology for developing a cough detection algorithm, analyzing design-time trade-offs for performance and energy. From audio and kinematic signals, our methodology selects optimal features via Recursive Feature Elimination with Cross-Validation (RFECV), exploiting the explainability of the selected XGB model. Additionally, it analyzes the use of Mel spectrogram features instead of the common MFCC. Moreover, a set of hyperparameters for a multimodal implementation of the classifier is explored. Finally, it evaluates the performance based on clinically relevant event-based metrics. The methodology proposes a novel structured approach to efficiently deploy AI on the edge, preserving data privacy. We apply our methodology to develop Cough-E, an energy-efficient, multimodal, edge-AI cough detector. It exploits audio and kinematic data in two distinct models, cooperating for a balanced energy and performance trade-off. We demonstrate that our algorithm can be executed in real time on an ARM Cortex M33 microcontroller. Cough-E achieves a 70.56% energy saving compared to the audio-only approach, for a 1.26% relative performance drop, resulting in a 0.78 F1-score. Both Cough-E and the edge-aware model optimization methodology are available as open-source code. This approach demonstrates the benefits of the proposed hardware-aware methodology to enable privacy-preserving cough monitors on the edge, paving the way to efficient cough monitoring.
Publication Towards Accurate RISC-V Full System Simulation via Component-level Calibration
(Association for Computing Machinery (ACM), 2025-06-04) Full-System (FS) simulation is essential for performance evaluation of complete systems that execute complex applications on a complete software stack consisting of an operating system and user applications. Nevertheless, FS simulators require careful fine-tuning against real hardware to obtain reliable performance statistics, which can become tedious, error-prone, and time-consuming with typical trial-and-error approaches. To address these shortcomings, we propose a novel, streamlined, component-level calibration methodology for validating FS simulation models. Our methodology greatly accelerates the validation process without sacrificing accuracy. It is Instruction Set Architecture (ISA)-agnostic and can tackle hardware specifications at different levels of detail. We demonstrate its effectiveness by validating FS models against both open-hardware and IP-protected (closed-hardware) RISC-V silicon, achieving a mean error of 19-23% for the SPEC CPU2017 suite in the two cases. We introduce the first open-source RISC-V-based FS-validated simulation models with a complete and replicable methodology.
Publication Digital SiPMs and SPAD arrays: available technologies and implementation challenges for large arrays
(2024-10-27) CMOS-based individual SPAD detectors and arrays have seen a host of applications being explored and industrialised in the past years, relying on their single-photon detection capability, combined with excellent photon-timing precision and noiseless read-out (in the digital flavour). SPADs are available in most standard CMOS technologies, with sensitivities spanning the entire spectrum, from NUV to NIR. Digital SiPMs are one of the array flavours, where the intrinsically digital nature of the SPAD response is preserved and exploited as close as possible to the SPAD itself, without necessarily reaching the granularity of true imagers. We will have a look at some of the recent commercial developments, technology trends and general manufacturing options, with emphasis on manufacturing bottlenecks and opportunities. This will be complemented by a closer look at tiling and tiling-enabling technologies, such as TSVs, which are of particular importance to the high-energy physics and nuclear science community in the broad sense (e.g. including Positron Emission Tomography). The corresponding applications do indeed often emphasize large-area implementations coupled to high detection efficiency and fill factor.
Publication DIGILOG: A digital-analog SiPM towards 10 ps prompt-photon tagging in TOF-PET
(2023-11-07) Functional imaging techniques like positron emission tomography (PET) are an essential tool in an aging society. Despite impressive advances in microelectronics, photodetectors and scintillation materials, PET is still awaiting a breakthrough in terms of reduced cost and increased performance. Large potential is seen in ultraprecise time-of-flight (TOF), aiming at coincidence time resolutions (CTRs) better than 30 ps. However, state-of-the-art TOF-PET systems are still far away from this goal, achieving typical CTRs of 214 ps (FWHM). Several proposals have been put forth, the most promising of which is to use prompt photon emission, e.g. Cherenkov radiation in BGO crystals, which are cheap to produce, thus contributing to drastic cost cutting. However, Cherenkov detection is challenging due to its limited photon yield, which in turn requires a very high photon detection efficiency (PDE), a low dark count rate (DCR) and extremely fast and innovative electronic readout schemes. Recent analog silicon photomultipliers (aSiPMs) meet the first two targets, but not the last. In the Digilog project we aim to unite the best of the analog and digital worlds, combining high PDE, low DCR and an exceptional single-photon time resolution (SPTR). To reach this goal, we will segment state-of-the-art aSiPMs into smaller clusters, called μSiPMs. A balanced segmentation of the electronic readout will make it possible to efficiently detect the first scintillation and Cherenkov photons, with a manageable granularity at system level. The μSiPM signals will feature photon-density time walk correction and photon counting. We envision creating 3D-stacked sensors where the electronics will be housed in a CMOS bottom-tier chip and the μSiPMs in the top-tier chip. Preliminary measurements on first μSiPM test structures already reached PDE and DCR close to their commercial counterparts, while an SPTR of 25 ps FWHM, close to our sub-20 ps goal, has been achieved.
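To give a sense of scale for the timing figures quoted above, the sketch below converts a coincidence time resolution into the corresponding localisation uncertainty along the line of response, using the standard time-of-flight relation Δx = c·Δt/2. The function name is hypothetical, and the conversion ignores every other contribution to image resolution.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_position_resolution_mm(ctr_ps):
    """Localisation uncertainty (mm) along the line of response for a CTR in ps.

    Uses delta_x = c * delta_t / 2; the factor 2 arises because the two
    annihilation photons travel in opposite directions.
    """
    return C_M_PER_S * ctr_ps * 1e-12 / 2.0 * 1e3


# CTR values mentioned in the abstract and title: 214 ps, 30 ps, 10 ps.
for ctr in (214, 30, 10):
    print(f"CTR {ctr:>3} ps -> {tof_position_resolution_mm(ctr):.1f} mm")
```

Under this relation, a CTR of 214 ps corresponds to roughly 32 mm, 30 ps to about 4.5 mm, and 10 ps to about 1.5 mm.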
Publication High-throughput isolation of anaerobic arsenic-transforming microorganisms
(EPFL, 2025) Arsenic (As) is a toxic metalloid that occurs naturally and is widely distributed in the environment. The inorganic compounds arsenite (As(III)) and arsenate (As(V)) are the most prevalent As species in the biosphere, the former being predominant in reducing environments. Some microbes can methylate As(III) to produce methylated As compounds. This biotransformation alters the fate and toxicity of As, and is a key component of its biogeochemical cycle. Arsenic methylation is often observed in rice paddy fields, where it is enhanced upon soil flooding (i.e., under anoxic conditions). The methylated products can be absorbed by the rice plant through the roots and accumulate in the grains, and can also induce the sterility of the plant. This microbial transformation may thus pose a threat to both food security and safety. However, very few anaerobic As-methylating microbes have been isolated, which precludes further understanding of the controls on this transformation and its biological function. Their isolation is laborious because traditional techniques for the isolation of environmental anaerobes are time-consuming, and the As-methylating phenotype cannot be easily screened. To tackle these challenges, we developed an alternative isolation approach which consists of trapping and growing individual soil microbes within permeable compartments (hydrogel capsules), and subsequent sorting using fluorescence-activated cell sorting (FACS) to distribute compartmentalized isolates in microwell plates for further growth. This approach enabled the cultivation of anaerobic taxa which fail to grow on agar-based medium. Further, we employed a bacterial biosensor that fluorescently responds to methylated As to functionally screen for the isolates showing methylation capacity. This approach allowed us to rapidly isolate anaerobic As-methylating strains from a paddy soil.
Publication Integrating AI across the Chemistry Discovery Cycle: Advancing Sustainable Chemistry through Digital Methods
(EPFL, 2025) The chemical industry faces mounting pressure to develop more sustainable processes while accelerating innovation to address climate challenges. Traditional discovery approaches in chemistry follow an iterative cycle: forming hypotheses about new molecules or reactions, testing these hypotheses experimentally, analyzing the results, and using these insights to refine future hypotheses. Each stage presents unique challenges that can create bottlenecks in the discovery process. At the hypothesis stage, researchers must navigate vast chemical spaces to identify promising candidates. During testing, optimal reaction conditions must be determined, focusing on reactivity while also considering sustainability. Analysis requires the interpretation of complex spectroscopic data, and the refinement stage demands efficient integration of all gathered information to guide future experiments.
Digital chemistry methods offer promising approaches to accelerate this discovery cycle. By leveraging artificial intelligence, machine learning, and optimization techniques, these methods can enhance each stage of the process while promoting more sustainable practices. This work investigates how these computational tools can be effectively deployed across the entire discovery workflow, demonstrating their potential through several case studies.
Following the cycle, we first showcase the potential of generative AI to develop new hypotheses by creating new molecules with desired properties. In our case, we propose new catalyst candidates for the Suzuki cross-coupling with desired binding energies. Subsequently, to support the testing phase with digital methods, transformer-based models for solvent recommendation were developed to predict suitable solvents for chemical reactions while suggesting greener alternatives. The effectiveness of these recommendations was validated experimentally. The next step is automating the analysis, showcased by a case study on assisting in the interpretation of 1H-NMR spectra. The attention mechanisms in a transformer model were leveraged to establish a mapping between 1H-NMR spectral peaks and molecular substructures, achieving high accuracy in assigning the experimental spectra to the correct molecule. Lastly, the iterative formation of new hypotheses based on previous experiments was accelerated using Bayesian optimization in combination with automated synthesis hardware. This combination enabled the efficient optimization of iodoalkyne synthesis across multiple starting materials while exploring only a small part of the potential parameter space.
Recognizing that the impact of digital chemistry tools depends heavily on their accessibility, we highlight a potential method to increase accessibility by packaging existing chemistry AI tools into an app that requires no coding knowledge and is focused on local execution. Finally, an examination of the environmental footprint of computational chemistry methods themselves emphasizes the importance of balancing their benefits for sustainable chemistry against their own resource consumption.
Through these case studies, we demonstrate how digital chemistry methods can significantly accelerate the discovery cycle while promoting sustainable practices, establishing a framework where computational tools complement and enhance experimental approaches rather than replace them.
Publication A local structure perspective on metal-organic frameworks
(EPFL, 2025) Metal-organic frameworks (MOFs), or coordination polymers (CPs), are networks of organic ligands and metal nodes that form ordered or disordered structures. By selecting appropriate building blocks, CPs with tailored pore shapes, dimensions, and surface chemistry can be designed for applications such as separations and catalysis. Traditionally, MOF research has focused on crystalline structures due to their well-defined atomic arrangements. However, stability, often determined by how crystalline a material remains under operating conditions, remains a major challenge for commercial applications. This work explores the potential of amorphous MOFs, showing that a focus on the local structure, which extends only a few ångströms, offers new insights for material engineering. The first part introduces the motivation for the local structure perspective on MOFs. We present a strategy to post-synthetically modify polymer chains within MOF pores for selective precious metal adsorption and reduction. While promising, stability concerns arose under application-relevant conditions. However, despite significant crystallinity loss, the composite retained functionality, challenging the necessity of highly crystalline starting materials. In the following part, we examine amorphous and disordered CPs as alternatives to crystalline MOFs. We apply the local-structure perspective to Zr-based MOFs. Zr-MOFs are ideal because their discrete Zr oxo clusters have been researched extensively, independently of MOF research. We demonstrate that amorphous Zr-MOFs can be engineered to induce oxygen vacancies, generating coordinatively unsaturated Zr (Zr-cus) sites. These lower-symmetry structures, identifiable only through local structure analysis, enhance catalytic and water purification performance by increasing Zr-cus site density. Next, we extended our focus to the direct synthesis of amorphous CPs.
We found that highly stable, porous, Zr organophosphate structures can be synthesized using environmentally friendly conditions and bio-available building blocks. The resulting CP shows excellent performance metrics for selectively removing toxic lead from water. Here, too, the local structure perspective is a useful tool for characterizing the material and assessing its stability in various application-relevant conditions. Finally, we assess method protocols to characterize the local structure using widely available laboratory instruments. This study highlights the advantages of amorphous MOFs, demonstrating that their properties can be tuned via local structure engineering. They can be synthesized with low energy input, characterized effectively, and optimized for applications such as catalysis and water remediation.
Publication Absolute calibration of standard candles in the Gaia era
(EPFL, 2025) Measuring the distance to stars in relation to us is fundamental for understanding the universe we live in; however, it is also one of the most challenging tasks in astronomy. Accurately estimating distances would allow us to understand the distribution of stars within our galaxy, the distribution of galaxies within the universe and how the universe has evolved over time, allowing us to set constraints on the theory governing gravitational interactions. The objective of this thesis is to enhance our understanding of the methods used to measure distances and the physics of the stars involved.
Classical Cepheids are the primary standard candles used for determining galactic and extragalactic distances. They are a fundamental part of one of the most precise methods known for measuring the rate of expansion of the universe, H0. Cepheids calibrate the absolute luminosity of Type Ia supernovae, which, in turn, are used as standard candles for more distant galaxies where Cepheids can no longer be detected. By combining this calibration with redshift measurements of galaxies that host these supernovae, it is possible to estimate H0. Alternatively, H0 can be derived from fitting the equations of the standard cosmological model, ΛCDM, to observations of the Cosmic Microwave Background. If ΛCDM is correct, both measurements should agree, but they differ by more than five standard deviations, possibly indicating measurement errors or new physics.
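The last rung of the distance ladder described above can be sketched in a few lines: a calibrated absolute magnitude turns an observed apparent magnitude into a distance through the distance modulus, and the low-redshift Hubble law then gives H0 = cz/d. All numbers below are hypothetical and chosen only for illustration; the sketch ignores extinction, peculiar velocities and cosmological corrections, and it is not the analysis performed in the thesis.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s


def distance_mpc(apparent_mag, absolute_mag):
    """Distance from the distance modulus m - M = 5 * log10(d / 10 pc)."""
    d_pc = 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)
    return d_pc / 1e6


def hubble_constant(redshift, d_mpc):
    """Low-redshift approximation H0 = c * z / d, in km/s/Mpc."""
    return C_KM_PER_S * redshift / d_mpc


# Hypothetical supernova: calibrated M = -19.3, observed m = 15.7, host z = 0.024.
d = distance_mpc(apparent_mag=15.7, absolute_mag=-19.3)
h0 = hubble_constant(redshift=0.024, d_mpc=d)
print(f"d = {d:.1f} Mpc, H0 = {h0:.2f} km/s/Mpc")
```

Because d scales as 10^(-M/5), any error in the calibrated absolute magnitude M propagates directly into H0, which is why the Cepheid calibration matters for the comparison with the Cosmic Microwave Background value.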
Improving our understanding of distance estimation could help identify the source of the discrepancy. Chapter 1 introduces the basic concepts used in this thesis, including the definition of standard candles, and highlights the role of clusters in studying the population of variable stars.
Chapter 2 examines Cepheids. We show that by using cluster parallaxes, it is possible to reduce the uncertainties associated with the parallax of Cepheids by a factor of 3. This allowed us to achieve one of the most precise calibrations of their absolute luminosity, reaching uncertainties of ~1% for Cepheids with pulsation periods of 10 days. As a result, the uncertainty in the measurement of H0 was reduced, increasing the statistical significance of the Hubble tension.
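One generic reason a cluster parallax can be more precise than a single-star parallax is that averaging many member stars shrinks the statistical uncertainty. The sketch below shows an inverse-variance weighted mean under the assumption of independent Gaussian errors. It is not the procedure used in the thesis, the member values are hypothetical, and real Gaia parallaxes carry correlated systematics that set a floor this simple average ignores.

```python
import math


def weighted_mean_parallax(parallaxes_mas, sigmas_mas):
    """Inverse-variance weighted mean and its uncertainty (independent errors assumed)."""
    weights = [1.0 / s ** 2 for s in sigmas_mas]
    total = sum(weights)
    mean = sum(w * p for w, p in zip(weights, parallaxes_mas)) / total
    sigma = math.sqrt(1.0 / total)
    return mean, sigma


# Hypothetical cluster: nine members, each with a 0.03 mas parallax uncertainty.
members = [0.52, 0.48, 0.50, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50]
sigmas = [0.03] * len(members)
mean, sigma = weighted_mean_parallax(members, sigmas)
print(f"cluster parallax = {mean:.3f} +/- {sigma:.3f} mas")
```

With N equally precise members the uncertainty drops by a factor of sqrt(N), so nine members in this toy case reduce 0.03 mas to 0.01 mas.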
Chapters 3 and 4 focus on identifying RR Lyrae (RRL) and Population II Cepheids in globular clusters (GCs). In the future, these samples could help verify the distances obtained with Cepheids to galaxies in the Local Group. We found that around 25% of horizontal branch stars showed no signs of variability in the Gaia photometry. This observation could have major implications for RRL models, as they predict that all stars in the instability strip should be photometrically variable.
Chapter 5 examines the geometry of GCs. We found that the on-sky geometry of these clusters is elliptical. Our results show that rotation is present in multiple GCs, and the projected rotation axis aligns with the minor axis of the ellipse. This provides evidence that rotation influences their geometry.
Chapter 6 presents unpublished results on long-period variable stars in GCs. We observe that these stars appear to be split into two types, with some indications that blending may have affected the photometry of one type, but we cannot conclude that this has impacted their classification. The thesis concludes with Chapter 7, which summarizes the main contributions in greater detail and explores how future data releases from Gaia could enhance our understanding of standard candles and clusters.
Publication Spaces of Interface: Proximity in the Diffusion of Energy Innovations
(EPFL, 2025) Context. Technical and social innovations are central to the transition to renewable energy systems. Despite the availability of innovative technologies for renewable energy production, management, and e-mobility, their widespread societal adoption remains limited. Proximity between actors is key for energy technologies' diffusion. Yet, it is challenging to bring into a cohesive interdisciplinary conceptualisation.
Goal. This thesis examines proximity in the diffusion of energy innovations in Switzerland. I conceptualise proximity related to innovation-diffusion as relational and circumstantial. Relational proximity is studied by drawing on information networks to examine connectivity and diversity within five proximity dimensions: geographical, social, cognitive, organisational and institutional. I analyse circumstantial proximity by researching which socio-spatial contexts bring actors into situations where they exchange information and what characterises these situations.
Methods. Using data from 36 interviews with Swiss energy actors and surveys with 157 professionals and 4,000 adopters of photovoltaic panels, electric vehicles, and energy management systems, I analysed information exchanges that led to the diffusion of these technologies. I included exchanges among professionals, between professionals and adopters, and between adopters and their personal contacts.
Results. Proximity relates to the diffusion of energy innovations through linking infrastructures that create the potential for interactions and circumstances that generate spaces of interface where information can be exchanged. The relational proximity results show linking infrastructures are geographically and socially connected and introduce a balance of actors with similar and diverse cognitive, institutional and organisational characteristics. Circumstantial proximity reveals face-to-face interactions to be key for the diffusion of innovations, allowing the flow of tacit knowledge and the development of trust. Specific contexts foster spaces of interface: a) diverse urban environments that increase interactions with people outside of the closest circle, b) socio-economic conditions that allow actors to have these interactions and, c) professionals and convinced adopters that support less urban and innovative ones.
Discussion. The interrelations between relational and circumstantial proximity highlight three avenues to nurture spaces of interface for innovation-diffusion. First, an appropriate configuration of the space, through densification, urban planning, design or event organisation, can be key to increasing connecting dimensions and opportunities for interaction. Second, establishing trustful relationships is important to transform these interactions into diffusion. Third, the integration of people from different backgrounds in social circles supports opportunities to discuss with "others" bringing novel information, and encourages diffusion from early adopters to the people from different sociodemographic backgrounds.
Conclusion. By complementing relational with circumstantial proximity, this research enriches the theory of the diffusion of innovations and offers practical recommendations for advancing the diffusion of energy technologies in Switzerland. It highlights the importance of socio-spatially mixed contexts that also foster trust and openness for innovation to flow in society.