
Comparison of Noise Reduction Techniques for Dysarthric Speech Recognition

Mulfari, D.
2022-01-01

Abstract

The paper investigates the impact of denoising techniques on a deep learning recognition system for speakers with dysarthria, a neuromotor speech disorder that compromises speech intelligibility and affects approximately 46 million people worldwide. In particular, we compare a manual noise reduction technique with automatic approaches based on classical signal processing, i.e., filtering and spectral analysis, as well as on more recent deep learning techniques based on recurrent neural network models. The comparison results reported in this paper are based on a dataset of more than 21K audio files collected with the collaboration of 156 native Italian speakers with different disabilities that cause dysarthric speech impairment; accordingly, different diseases and dysarthria severity levels have been taken into account. Moreover, unlike several other studies on automatic recognition systems, the audio files considered in our analysis were collected in real environments, with very limited supervision and simply using the users' smartphones. Our analysis shows that, in this context, the effectiveness of automatic denoising tools is quite limited, particularly for dysarthric speakers with severe grades of the disorder. However, comparisons with the proposed manual denoising intervention provide new and interesting insights that can be easily and effectively exploited to empower current automatic dysarthric speech recognition systems and that could drive future research in this field.
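As a minimal illustration of the classical signal-processing baselines mentioned in the abstract (filtering and spectral analysis), the sketch below implements basic spectral subtraction in Python. It is not the pipeline used in the paper; the noise_frames and floor parameters, and the assumption that the leading frames contain noise only, are simplifications introduced here for illustration.

    # Minimal spectral-subtraction denoising sketch (illustrative only; not the
    # paper's pipeline). Assumes the first `noise_frames` STFT frames contain
    # noise only, which is a simplifying assumption made for this example.
    import numpy as np
    from scipy.signal import stft, istft


    def spectral_subtraction(signal, sr, noise_frames=10, floor=0.02):
        """Suppress stationary noise by subtracting an estimated noise spectrum."""
        f, t, S = stft(signal, fs=sr, nperseg=512)
        magnitude, phase = np.abs(S), np.angle(S)

        # Estimate the noise magnitude spectrum from the leading frames.
        noise_profile = magnitude[:, :noise_frames].mean(axis=1, keepdims=True)

        # Subtract the noise estimate and apply a spectral floor to limit
        # the "musical noise" artifacts caused by negative values.
        cleaned = np.maximum(magnitude - noise_profile, floor * noise_profile)

        # Rebuild the waveform using the original phase.
        _, denoised = istft(cleaned * np.exp(1j * phase), fs=sr, nperseg=512)
        return denoised


    if __name__ == "__main__":
        sr = 16000
        rng = np.random.default_rng(0)
        clean = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)   # 1 s test tone
        noisy = clean + 0.3 * rng.standard_normal(sr)           # add white noise
        denoised = spectral_subtraction(noisy, sr)

Deep-learning denoisers of the kind compared in the paper (recurrent neural network models) instead learn a mapping from noisy to clean speech from data, rather than relying on a fixed noise estimate as above.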
2022
978-1-6654-8299-8
automatic speech recognition; dysarthria; dysarthric speech; noise suppression

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12610/72243
Citations
  • Scopus: 4
  • Web of Science (ISI): 1