WARBENCH: A Comprehensive Benchmark for Evaluating LLMs in Military Decision-Making (PDF)

  • About 129,700 characters
  • About 14 pages
  • Published 2026-05-13 in Guangdong


WARBENCH: A Comprehensive Benchmark for Evaluating LLMs in Military Decision-Making

Zongjie Li    Chaozheng Wang    Yuchong Xie

zligo@connect.ust.hk    czwang23@.hk    yxiece@cse.ust.hk

Hong Kong University of Science and Technology    Chinese University of Hong Kong    Hong Kong University of Science and Technology
