AI paper: ActBERT - Learning Global-Local Video-Text Representations.pdf

  • Published 2026-06-22 from Shandong


ActBERT: Learning Global-Local Video-Text Representations

Linchao Zhu1,2 and Yi Yang2∗

1Baidu Research  2ReLER, University of Technology Sydney

{linchao.zhu,yi.yang}@uts.edu.au

Abstract

Motivated by BERT’s success in self-supervised training, we aim to learn an analogous model for video and

In this p
