

OpenAI o3 and o4-mini System Card

OpenAI

April 16, 2025

1 Introduction

OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities: web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory. These models excel at solving complex math, coding, and scientific challenges while demonstrating strong visual perception and analysis. The models use tools in their chains of thought to augment their capabilities; for example, cropping or transforming images, searching the web, or using Python to analyze data during their thought process.

The OpenAI o-series models are trained with large-scale reinforcement learning on chains of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment [1].

This is the first launch and system card to be released under Version 2 of our Preparedness Framework. OpenAI’s Safety Advisory Group (SAG) reviewed the results of our Preparedness evaluations and determined that OpenAI o3 and o4-mini do not reach the High threshold in any of our three Tracked Categories: Biological and Chemical Capability, Cybersecurity, and AI Self-improvement. We describe these evaluations below, and provide an update on our work to mitigate risks in these areas.

2 Model Data and Training

OpenAI reasoning models are trained to reason through reinforcement learning. Models in the o-series family are trained to think before they answer: they can produce a long internal chain of thought before responding to the user. Through training, these models learn to refine their thinking process, try different strategies, and recognize their mistakes. Reasoning allows these models to follow specific guidelines and model policies we’ve set, helping them act in line with our safety ex
