
OpenAI o3 and o4-mini System Card

OpenAI

April 16, 2025

1 Introduction

OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities: web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory. These models excel at solving complex math, coding, and scientific challenges while demonstrating strong visual perception and analysis. The models use tools in their chains of thought to augment their capabilities; for example, cropping or transforming images, searching the web, or using Python to analyze data during their thought process.

The OpenAI o-series models are trained with large-scale reinforcement learning on chains of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment [1].

This is the first launch and system card to be released under Version 2 of our Preparedness Framework. OpenAI’s Safety Advisory Group (SAG) reviewed the results of our Preparedness evaluations and determined that OpenAI o3 and o4-mini do not reach the High threshold in any of our three Tracked Categories: Biological and Chemical Capability, Cybersecurity, and AI Self-improvement. We describe these evaluations below, and provide an update on our work to mitigate risks in these areas.

2 Model Data and Training

OpenAI reasoning models are trained to reason through reinforcement learning. Models in the o-series family are trained to think before they answer: they can produce a long internal chain of thought before responding to the user. Through training, these models learn to refine their thinking process, try different strategies, and recognize their mistakes. Reasoning allows these models to follow specific guidelines and model policies we’ve set, helping them act in line with our safety ex
