Publication year
2005
Author(s)
Annotation
Working visit to CSTR
Edinburgh
Publication type
Conference lecture
Organization
Taalwetenschap
Subject
Speech Technology and Information Processing; Technologie en informatieverwerking
Abstract
The acoustic environment in which speech is recorded has a strong
influence on the statistical distributions of observed acoustic
features. To make automatic speech recognition (ASR) insensitive to noise, it is crucial
that these distributions are similar in the training and testing
conditions. Most approaches attempt to compensate for the impact of
noise by estimating the noise characteristics from the signal. We
explored the feasibility of a new method to increase noise
robustness: exploiting the a priori knowledge that is stored
in clean speech models. Using Mel bank log-energy features,
recognition accuracy was monitored while an increasing number of
model components (chosen differently for each state) were ignored.
The aim is for recognition results to be determined more strongly
by the match in the high-energy model components than by the mismatch
in the low-energy ones. Application of the new method to
clean speech data confirms that discarding components below a
certain energy threshold does not degrade recognition
performance. Experiments with noisy data, however, show that
performance gains are relatively small. An analysis of the poor
results is presented and used to distill future research questions.
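
The idea of ignoring low-energy model components can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance Gaussian per state, and the function names, the four-band feature vectors, and all numeric values are hypothetical.

```python
import math


def top_energy_indices(mean, n_drop):
    """Return the indices kept after dropping the n_drop lowest-energy components.

    The selection is based on the clean-model mean of one state, so a
    different set of components can be kept for each state.
    """
    order = sorted(range(len(mean)), key=lambda i: mean[i])  # ascending energy
    return sorted(order[n_drop:])


def masked_log_likelihood(x, mean, var, keep):
    """Diagonal-Gaussian log-likelihood summed over the kept components only."""
    ll = 0.0
    for i in keep:
        ll += -0.5 * (math.log(2 * math.pi * var[i])
                      + (x[i] - mean[i]) ** 2 / var[i])
    return ll


# Hypothetical clean-speech state model over four Mel bank log-energies.
clean_mean = [2.0, 9.0, 8.0, 1.0]
var = [1.0, 1.0, 1.0, 1.0]

# Hypothetical noisy observation: noise raises the low-energy bands,
# while the high-energy bands still match the clean model.
noisy_obs = [5.0, 9.0, 8.0, 5.0]

keep = top_energy_indices(clean_mean, n_drop=2)
full = masked_log_likelihood(noisy_obs, clean_mean, var, range(len(clean_mean)))
masked = masked_log_likelihood(noisy_obs, clean_mean, var, keep)
```

In this toy case the masked score is no longer penalised by the mismatch in the two low-energy bands. The sketch drops the same number of components for every state, on the assumption that this keeps scores comparable across states; how the original work handled this is not stated in the abstract.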
This item appears in the following Collection(s)
- Academic publications [244084]
- Faculty of Arts [29769]