- About 2.08 thousand characters
- About 14 pages
- Published 2016-09-07 in Tianjin
ReinforcementLearning-TexasA&MUniversity.ppt
Reinforcement Learning
- Mitchell, Ch. 13 (see also Barto & Sutton book on-line)

Rationale
- Learning from experience
- Adaptive control
- Examples not explicitly labeled; delayed feedback
- Problem of credit assignment: which action(s) led to payoff?
- Trade off short-term thinking (immediate reward) for long-term consequences

Agent Model
- Transition function T: S × A → S (environment)
- Reward function R: S × A → real (payoff)
- Stochastic but Markov
- Policy = decision function, p: S → A
- "Rationality": maximize long-term expected reward
- Discounted long-term reward (convergent series)
- Alternatives: finite time horizon, un
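The agent model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-state world, the names `T`, `R`, `discounted_return`, `patient` and `myopic`, and the discount factor 0.5 are all invented here and do not come from the slides. The transitions are also deterministic for brevity, whereas the slides allow a stochastic (but Markov) environment. The sketch shows how a discounted sum of rewards stays bounded: with rewards no larger than r_max and a discount factor gamma below 1, the return cannot exceed r_max / (1 - gamma).

```python
from typing import Dict, Tuple

State = str
Action = str

# Hypothetical two-state world, invented purely for illustration.
# T maps (state, action) to the next state (deterministic simplification).
T: Dict[Tuple[State, Action], State] = {
    ("s0", "stay"): "s0",
    ("s0", "go"): "s1",
    ("s1", "stay"): "s1",
    ("s1", "go"): "s0",
}

# R maps (state, action) to an immediate real-valued payoff.
R: Dict[Tuple[State, Action], float] = {
    ("s0", "stay"): 0.0,
    ("s0", "go"): 0.0,
    ("s1", "stay"): 1.0,
    ("s1", "go"): 0.0,
}


def discounted_return(policy: Dict[State, Action], start: State,
                      gamma: float, steps: int) -> float:
    """Sum gamma**t * reward over `steps` steps of following `policy`."""
    total = 0.0
    state = start
    for t in range(steps):
        action = policy[state]          # policy: state -> action
        total += (gamma ** t) * R[(state, action)]
        state = T[(state, action)]      # Markov: next state depends only on (s, a)
    return total


# A policy that accepts zero reward now in order to reach the rewarding state.
patient = {"s0": "go", "s1": "stay"}
# A policy that never leaves the unrewarding start state.
myopic = {"s0": "stay", "s1": "stay"}

patient_value = discounted_return(patient, "s0", gamma=0.5, steps=50)
myopic_value = discounted_return(myopic, "s0", gamma=0.5, steps=50)
```

In this toy setting the patient policy earns nothing on its first step but collects a discounted reward on every later step, so its return approaches gamma / (1 - gamma), while the myopic policy earns nothing at all. That is the trade-off between immediate reward and long-term consequences that the slides describe.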