ISSN 1000-9825, CODEN RUXUEW E-mail: jos@
Journal of Software, Vol.19, No.11, November 2008, pp.2853−2868
DOI: 10.3724/SP.J.1001.2008.02853 Tel/Fax: +86-10
© 2008 by Journal of Software. All rights reserved.
Transductive Multi-Modality Video Semantic Concept Detection with Tensor Representation∗
WU Fei+, LIU Ya-Nan, ZHUANG Yue-Ting
(Digital Media Computing and Design Lab., College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China)
+ Corresponding author: E-mail: wufei@
Wu F, Liu YN, Zhuang YT. Transductive multi-modality video semantic concept detection with tensor
representation. Journal of Software, 2008,19(11):2853−2868. /1000-9825/19/2853.htm
Abstract: This paper proposes a higher-order tensor framework for video analysis and understanding. In this
framework, the three modalities of a video shot (image frames, audio, and text) are represented jointly as a data
point by a 3rd-order tensor. A subspace embedding and dimension reduction method, called the TensorShot approach,
is then proposed; it explicitly considers the manifold structure of the tensor space formed by the temporally
sequenced, co-occurring multimodal media data in video. Transductive learning uses a large amount of unlabeled data
together with the labeled data to build better classifiers. A transductive support tensor machines algorithm is
proposed to train an effective classifier. This algorithm preserves the intrinsic structure of the