- About 205,700 characters
- About 43 pages
- Posted 2026-03-17 from Guangdong
Multimodal Chain-of-Thought Reasoning:
A Comprehensive Survey
Yaoting Wang¹, Shengqiong Wu¹, Yuecheng Zhang², William Wang³, Ziwei Liu⁴, Jiebo Luo⁵, Hao Fei¹*
¹NUS, ²CUHK, ³UCSB, ⁴NTU, ⁵UR
Survey Project: /yaotingwangofficial/Awesome-MCoT
arXiv [cs.CV], 16 Mar 2025

Abstract
By extending the advantage of chain-of-thought (CoT) reasoning in human-like step-by-step processes to multimodal contexts, multimodal CoT (MCoT) reasoning has recently garnered significant research attention, especially in its integration with multimodal large language models (MLLMs). Existing MCoT studies design various methodologies and innovative reasoning paradigms to address the unique challenges of image, video, speech, audio, 3D, and structured data across different modalities, achieving extensive success in applications such as robotics, healthcare, autonomous driving, and multimodal generation. However, MCoT still presents distinct challenges and opportunities that require further attention for the field to keep thriving, and an up-to-date review of this domain is lacking. To bridge this gap, we present the first systematic survey of MCoT reasoning, elucidating the relevant foundational