Application of Latent Semantic Analysis in Continuous Speech Recognition
Computer Engineering and Applications, 2009, 45(32): 111
OU Jian-lin (欧建林), LIN Qian (林茜), SHI Xiao-dong (史晓东)
Department of Computer Science, Xiamen University, Xiamen, Fujian 361005, China
E-mail: mandel@
OU Jian-lin, LIN Qian, SHI Xiao-dong. Application of latent semantic analysis in continuous speech recognition. Computer Engineering and Applications, 2009, 45(32): 111-113.
Abstract: The theory of Latent Semantic Analysis (LSA) for speech recognition is described, and the related techniques for implementing LSA-based language modeling in speech recognition systems are presented. An LSA-based semantic model is constructed on the WSJ0 text corpus. This paper uses interpolation to combine this semantic model with a conventional 3-gram model to form a hybrid language model (i.e., LSA+3-gram). To optimize the performance of the hybrid model, the k-means algorithm is applied to cluster vectors in the LSA vector space, with a density function used to initialize the centroids. The constructed hybrid language model outperforms the corresponding 3-gram baseline: continuous speech recognition experiments conducted on the WSJ0 test corpus show a relative reduction in word error rate of about 13.3%.
Key words: latent semantic analysis; N-gram; k-means clustering; continuous speech recognition
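To illustrate the interpolation step named in the abstract, the following is a minimal sketch that assumes plain linear interpolation with a single weight. The exact combination formula used in the paper is not given in this excerpt, so the function names `interpolate` and `interpolate_dist`, the weight `lam`, and the toy probabilities are all hypothetical.

```python
def interpolate(p_ngram: float, p_lsa: float, lam: float = 0.5) -> float:
    """Linearly interpolate two probabilities for the same word and history.

    lam is the (assumed) weight on the LSA model; 1 - lam goes to the 3-gram.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_lsa + (1.0 - lam) * p_ngram


def interpolate_dist(ngram_dist: dict, lsa_dist: dict, lam: float) -> dict:
    """Interpolate two next-word distributions over the union of their vocabularies."""
    vocab = set(ngram_dist) | set(lsa_dist)
    return {
        w: interpolate(ngram_dist.get(w, 0.0), lsa_dist.get(w, 0.0), lam)
        for w in vocab
    }


# Toy next-word distributions (invented values, for illustration only).
ngram_dist = {"stock": 0.5, "bank": 0.3, "river": 0.2}
lsa_dist = {"stock": 0.6, "bank": 0.3, "river": 0.1}

hybrid = interpolate_dist(ngram_dist, lsa_dist, lam=0.3)
```

Because both inputs are proper distributions over the same vocabulary, the interpolated result also sums to one, so it can be used directly as a language model score. In practice the weight would be tuned on held-out data.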
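Abstract (translated from the Chinese): The theory of latent semantic analysis (LSA) and its application in continuous speech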