Work: [柘植 覚]/[福見 稔]/[獅々堀 正幹]/[任 福継]/[北 研二]/[黒岩 眞吾]/Study of Relationships Between Intra-Speaker's Speech Variability and Speech Recognition Performance/2006 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2006)
(English) Study of Relationships Between Intra-Speaker's Speech Variability and Speech Recognition Performance
(English) Speech recognition performance varies even when a speaker uses a speaker-dependent speech recognition system. This is because speech quality changes with factors such as emotion and background noise, even when the speaker and the utterance remain the same. However, the relationship between intra-speaker speech variability and speech recognition performance is not clear. We therefore focus on the intra-speaker speech variability that affects speech recognition performance. To investigate this relationship, we have been collecting speech data since November 2002, and we conducted speech recognition experiments on part of this corpus. In this paper, we use correlation analysis to examine the relationship between intra-speaker speech variability and phoneme accuracy. The factors in the correlation analysis are the number of errors, the speaking rate, and the likelihood. The results show a strong correlation between the number of substitution errors and phoneme accuracy, whereas the correlations for the numbers of deletion and insertion errors are low. This suggests that there is overlap between phonemes, because the feature parameters vary with the speaking rate. To improve phoneme accuracy, a method that discriminates between phonemes needs to be studied. On the other hand, although the correlation between phoneme accuracy and the speaking rate appears to be low, the speaking rate correlates strongly with the numbers of deletion and insertion errors. Because the numbers of insertion and deletion errors counterbalanced each other, the correlation between the speaking rate and phoneme accuracy was low. Nevertheless, we consider it necessary to normalize the speaking rate, because it influences the numbers of deletion and insertion errors.
(English) 2006 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2006)
|City||Required||(Japanese) Tottori, Japan|
|Date||Required||December 12, 2006|
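
The counterbalance effect described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of phoneme accuracy, (N - S - D - I) / N, which the record itself does not state, and it uses Pearson's correlation coefficient as the correlation measure. The session values, the reference count `N_REF`, and the speaking-rate units are hypothetical numbers chosen only for illustration; they are not data from the paper.

```python
from math import sqrt


def phoneme_accuracy(n_ref, n_sub, n_del, n_ins):
    """Accuracy in percent, assuming the common (N - S - D - I) / N definition."""
    return 100.0 * (n_ref - n_sub - n_del - n_ins) / n_ref


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL recording sessions of one speaker (not data from the paper):
# (speaking rate, substitutions, deletions, insertions)
N_REF = 500  # assumed number of reference phonemes per session
sessions = [
    (6.0, 30, 4, 16),
    (7.0, 22, 8, 12),
    (8.0, 35, 12, 8),
    (9.0, 26, 16, 4),
]

rates = [s[0] for s in sessions]
subs = [s[1] for s in sessions]
dels = [s[2] for s in sessions]
ins = [s[3] for s in sessions]
accs = [phoneme_accuracy(N_REF, s, d, i) for _, s, d, i in sessions]

print("rate vs deletions :", round(pearson(rates, dels), 3))
print("rate vs insertions:", round(pearson(rates, ins), 3))
print("rate vs accuracy  :", round(pearson(rates, accs), 3))
print("subs vs accuracy  :", round(pearson(subs, accs), 3))
```

In this made-up data, deletions rise and insertions fall as the speaking rate increases, so their sum stays constant. The rate then correlates strongly with each error type but only weakly with accuracy, while substitutions alone drive the accuracy.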