Infoscience

doctoral thesis

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Fritsch, Julian David  
2023

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detecting and analyzing these aspects, e.g. to monitor the temporary state of a speaker, diseases that manifest in speech, or people who have trouble producing speech. To overcome data scarcity, most methods in this thesis rely on auxiliary resources; to meet clinicians' requirements, prior knowledge and explainability are taken into account.

In the first part of this thesis, we augment methods that aim to directly assess atypical speech with convolutional neural networks (CNNs). With the goal of injecting prior knowledge about atypical speech into CNNs, we present findings in the context of Alzheimer's disease detection and severity estimation: we demonstrate that filtering the waveforms to focus on voice-source-related frequencies and increasing the input segment length to capture prosody are both beneficial. Additionally, we explore incorporating phonetic knowledge into CNNs by taking CNN-based models trained for articulation prediction and fine-tuning them on continuous sleepiness estimation. Furthermore, we propose methods for detecting and estimating breathing impairments in people with Parkinson's disease. We compare hand-crafted features that model voice-source information with embeddings extracted from CNNs and find them well suited to this task.
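The two input-side ideas mentioned above (restricting the waveform to low, voice-source-related frequencies and feeding longer segments) can be illustrated with a minimal sketch. The abstract does not state the filter type, cut-off frequency, or segment length, so everything below is an assumption for illustration: a windowed-sinc low-pass filter, and the hypothetical helper names `lowpass_fir` and `segment`. It is not the thesis's implementation.

```python
import math

def lowpass_fir(signal, sample_rate, cutoff_hz, num_taps=101):
    """Windowed-sinc (Hamming) low-pass FIR filter; num_taps should be odd.

    Illustrative only: the filter design and cut-off are assumptions.
    """
    fc = cutoff_hz / sample_rate          # cut-off in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    gain = sum(taps)
    taps = [t / gain for t in taps]       # normalize to unit gain at 0 Hz
    half = num_taps // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half              # centered filter, zero-padded edges
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

def segment(signal, sample_rate, segment_seconds, hop_seconds):
    """Cut a waveform into fixed-length, possibly overlapping segments."""
    seg = int(segment_seconds * sample_rate)
    hop = int(hop_seconds * sample_rate)
    return [signal[s:s + seg] for s in range(0, len(signal) - seg + 1, hop)]
```

In such a setup, a longer `segment_seconds` gives the network access to slower variations such as prosody, at the cost of fewer training segments per recording.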

The second part of this thesis presents a novel method for intelligibility assessment of people with dysarthria. Intelligibility is a clinical measure of the severity of dysarthria and is typically assessed as an aggregate over a set of utterances by a speaker. We emulate the subjective listening tests by performing utterance verification using phonetic features on all of a speaker's utterances and aggregating the results into the speaker's intelligibility score, and we demonstrate the scheme's robustness through several variations. The same scheme was applied to emulate a human listening test in which listeners had to differentiate between speech produced before and after lip filler surgery. The intelligibility assessment scheme is also extended to pronunciation feedback: expected pronunciation is modeled by training one hidden Markov model per phoneme on healthy speech. Given a prompt and its corresponding dysarthric utterance, we can estimate how much each phoneme deviates from its expected pronunciation and give a phoneme-level assessment.
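The verify-then-aggregate idea described above can be sketched as follows. The abstract does not say how an utterance is compared with its prompt or how decisions are aggregated, so this sketch makes its own assumptions: sequences of phonetic feature vectors are compared with dynamic time warping (DTW) and a Euclidean local distance, an utterance is accepted when the normalized cost falls below a hypothetical `threshold`, and the speaker score is the fraction of accepted utterances. All function names are illustrative.

```python
import math

def dtw_distance(ref, hyp):
    """Length-normalized DTW cost between two sequences of feature vectors."""
    n, m = len(ref), len(hyp)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], hyp[j - 1])   # local Euclidean distance
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame repeated
                                 cost[i][j - 1],      # hyp frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)

def verify_utterance(ref, hyp, threshold):
    """Accept the utterance if it is close enough to the expected sequence."""
    return dtw_distance(ref, hyp) <= threshold

def speaker_intelligibility(pairs, threshold=0.2):
    """Fraction of (reference, utterance) pairs that pass verification."""
    if not pairs:
        raise ValueError("need at least one utterance")
    accepted = sum(verify_utterance(r, h, threshold) for r, h in pairs)
    return accepted / len(pairs)
```

A usage example with toy two-dimensional "phonetic features": a time-stretched copy of the reference is accepted, while a sequence with the two sounds swapped is rejected.

```python
ref = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
score = speaker_intelligibility([(ref, stretched), (ref, swapped)])
```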

Type
doctoral thesis
DOI
10.5075/epfl-thesis-9785
Author(s)
Fritsch, Julian David  
Advisors
Odobez, Jean-Marc • Magimai Doss, Mathew
Jury

Prof. Jean-Philippe Thiran (president); Dr Jean-Marc Odobez, Dr Mathew Magimai Doss (thesis directors); Dr Dorina Thanou, Prof. Juan Rafael Orozco Arroyave, Dr Fabien Ringeval (examiners)

Date Issued

2023

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2023-05-26

Thesis number

9785

Total number of pages

101

Subjects

convolutional neural networks • articulatory features • Alzheimer's disease • degree of sleepiness • Parkinson's disease • speech intelligibility • dysarthria

EPFL units
LIDIAP  
Faculty
STI  
School
IEM  
Doctoral School
EDEE  
Available on Infoscience
May 15, 2023
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/197675

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved