OpenAI o3-mini System Card
OpenAI
January 31, 2025
1 Introduction
The OpenAI o model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment [1]. This brings OpenAI o3-mini to parity with state-of-the-art performance on certain benchmarks for risks such as generating illicit advice, choosing stereotyped responses, and succumbing to known jailbreaks. Training models to incorporate a chain of thought before answering has the potential to unlock substantial benefits, while also increasing potential risks that stem from heightened intelligence.
Under the Preparedness Framework, OpenAI’s Safety Advisory Group (SAG) recommended classifying the OpenAI o3-mini (Pre-Mitigation) model as Medium risk overall. It scores Medium risk for Persuasion, CBRN (chemical, biological, radiological, nuclear), and Model Autonomy, and Low risk for Cybersecurity. Only models with a post-mitigation score of Medium or below can be deployed, and only models with a post-mitigation score of High or below can be developed further.
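The deployment and development thresholds stated above can be written as a small check. The following Python is a minimal sketch for illustration only, not OpenAI's tooling: the names `RiskLevel`, `can_deploy`, and `can_develop_further` are hypothetical, and the `CRITICAL` level is an assumption taken from the Preparedness Framework's risk scale rather than from this section.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Ordered risk levels; CRITICAL is assumed, not named in this section."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def can_deploy(post_mitigation: RiskLevel) -> bool:
    # Deployment requires a post-mitigation score of Medium or below.
    return post_mitigation <= RiskLevel.MEDIUM


def can_develop_further(post_mitigation: RiskLevel) -> bool:
    # Further development requires a post-mitigation score of High or below.
    return post_mitigation <= RiskLevel.HIGH


if __name__ == "__main__":
    # Show the outcome of both checks for every level on the scale.
    for level in RiskLevel:
        print(level.name, can_deploy(level), can_develop_further(level))
```

Both checks take post-mitigation scores as input. The Medium and Low scores quoted above are for the Pre-Mitigation model, so they are not passed to these functions here.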
Due to improved coding and research engineering performance, OpenAI o3-mini is the first model to reach Medium risk on Model Autonomy (see Section 5, Preparedness Framework Evaluations). However, it still performs poorly on evaluations designed to test real-world ML research capabilities relevant for self-improvement, which is required for a High classification. Our results underscore the need for building robust alignment methods, extensively stress-testing their efficacy, and maintaining meticulous risk management protocols.
This report outlines the safety work carried out for the OpenAI o3-mini model, including safety evaluations, external red teaming, and Preparedness