Audio-Visual Bimodal Speaker Identification Using Dynamic Bayesian Networks - humancomputer.pdf

Journal of Computer Research and Development (计算机研究与发展), ISSN 1000-1239, CN 11-1777/TP, 43(3): 470-475, 2006

Audio-Visual Bimodal Speaker Identification Using Dynamic Bayesian Networks

Wu Zhiyong and Cai Lianhong
(Key Laboratory of Pervasive Computing, Ministry of Education, Department of Computer Science and Technology, Tsinghua University, Beijing 100084)
(wuzy99@mails.tsinghua.edu.cn)

Abstract  This paper studies the use of dynamic Bayesian networks (DBNs) for text-prompted audio-visual bimodal speaker identification. The task is to determine the identity of a speaker from a temporal sequence of audio and visual observations, obtained from the acoustic speech and the shape of the mouth respectively. Following the hierarchical structure of audio-visual bimodal modeling, a new DBN is constructed to describe the natural asynchrony between audio and visual states as well as their conditional dependency over time. The experimental results show that the dynamic Bayesian network is a powerful and flexible methodology for representing and modeling audio-visual correlations, and that the proposed DBN improves on the accuracy of audio-only speaker identification at all levels of acoustic signal-to-noise ratio (SNR) from 0 to 30 dB.

Key words  biometrics; speaker identification; audio-visual bimodal modeling; fusion; dynamic Bayesian network (DBN)

Abstract (translated from the Chinese)  Dynamic Bayesian networks perform very well at describing complex stochastic processes with multiple channels. This work on audio-visual bimodal speaker identification based on dynamic Bayesian networks analyzes the hierarchical structure of joint audio-visual modeling, uses dynamic Bayesian networks to model the audio-visual correlations at different levels, and runs audio-visual speaker identification experiments with these models. Analysis of the modeling process at each level and of the identification results shows that dynamic Bayesian networks provide an effective way to model both the temporal correlation and the feature correlation between audio and video, and that they improve speaker identification performance under all tested speech signal-to-noise ratios.

Key words (translated from the Chinese)  biometrics; speaker identification; joint audio-visual modeling; fusion; dynamic Bayesian network

Chinese Library Classification number  TP
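
The identification task described in the abstract (pick the speaker whose model best explains a paired sequence of audio and visual observations) can be illustrated with a minimal sketch. This is not the DBN proposed in the paper, whose structure is not given in this excerpt. It swaps in a generic two-chain coupled HMM, one of the simplest DBNs that lets the audio and visual state chains drift out of step while still depending on each other. The sketch assumes discrete observation symbols, frame-synchronous streams of equal length, and hypothetical names throughout (`coupled_forward_loglik`, `identify`, `toy_model`, and the model dictionary keys).

```python
import numpy as np


def coupled_forward_loglik(audio_obs, video_obs, model):
    """Log-likelihood of paired symbol sequences under a two-chain coupled HMM.

    The joint hidden state is (a, v). Each chain's next state depends on BOTH
    previous states, so the chains can be asynchronous yet remain coupled.
    Assumed (hypothetical) model layout:
      pi: (Na, Nv)      joint initial distribution
      Ta: (Na, Nv, Na)  P(a_t | a_{t-1}, v_{t-1})
      Tv: (Na, Nv, Nv)  P(v_t | a_{t-1}, v_{t-1})
      Ba: (Na, Ka)      P(audio symbol | a_t)
      Bv: (Nv, Kv)      P(visual symbol | v_t)
    """
    pi, Ta, Tv = model["pi"], model["Ta"], model["Tv"]
    Ba, Bv = model["Ba"], model["Bv"]
    if len(audio_obs) != len(video_obs) or len(audio_obs) == 0:
        raise ValueError("expected two non-empty sequences of equal length")

    loglik = 0.0
    alpha = None
    for t, (oa, ov) in enumerate(zip(audio_obs, video_obs)):
        if t == 0:
            prior = pi
        else:
            # Sum over the previous joint state (i, j).
            prior = np.einsum("ij,ija,ijv->av", alpha, Ta, Tv)
        # Weight by the emission probabilities of both streams.
        alpha = prior * np.outer(Ba[:, oa], Bv[:, ov])
        # Rescale at every step to avoid numerical underflow.
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return float(loglik)


def identify(audio_obs, video_obs, speaker_models):
    """Return the best-scoring speaker name and all per-speaker scores."""
    scores = {
        name: coupled_forward_loglik(audio_obs, video_obs, m)
        for name, m in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


def toy_model(p_audio, p_video, n_states=2):
    """Hypothetical toy speaker: uniform dynamics, binary symbols.

    Every state emits symbol 0 with probability p_audio (or p_video).
    """
    n = n_states
    return {
        "pi": np.full((n, n), 1.0 / (n * n)),
        "Ta": np.full((n, n, n), 1.0 / n),
        "Tv": np.full((n, n, n), 1.0 / n),
        "Ba": np.tile([p_audio, 1.0 - p_audio], (n, 1)),
        "Bv": np.tile([p_video, 1.0 - p_video], (n, 1)),
    }


if __name__ == "__main__":
    models = {"A": toy_model(0.9, 0.9), "B": toy_model(0.1, 0.1)}
    best, scores = identify([0, 0, 0], [0, 0, 0], models)
    print(best, scores)
```

In a real system the discrete emission tables would be replaced by continuous densities over acoustic and mouth-shape features, and one model would be trained per enrolled speaker. The decision rule, an argmax over per-speaker log-likelihoods, stays the same.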