id: 5e8f2f13c4cdbe86b5c72da5
title: 'Reinforcement Learning With Q-Learning: Example'
challengeType: 11
videoId: RBBSNta234s
bilibiliIds:
  aid: 848073871
  bvid: BV1uL4y187Eq
  cid: 409139471
dashedName: reinforcement-learning-with-q-learning-example

--question--

--text--

Fill in the blanks to complete the following Q-Learning equation:

Q[__A__, __B__] = Q[__A__, __B__] + LEARNING_RATE * (reward + GAMMA * np.max(Q[__C__, :]) - Q[__A__, __B__])

--answers--

A: state

B: action

C: next_state

---

A: state

B: action

C: prev_state

---

A: state

B: reaction

C: next_state

--video-solution--

1
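
For reference, the update rule from the question can be sketched as a single tabular update step. This is a minimal illustration rather than the course's own code: the table size, the hyperparameter values, the `q_update` helper, and the sample transition are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical sizes and hyperparameters, chosen only for illustration.
N_STATES, N_ACTIONS = 4, 2
LEARNING_RATE = 0.5
GAMMA = 0.9

# Q-table: one row per state, one column per action.
Q = np.zeros((N_STATES, N_ACTIONS))


def q_update(Q, state, action, reward, next_state):
    """Apply one tabular Q-Learning update in place and return the new value."""
    # Best value reachable from the state the agent landed in.
    best_next = np.max(Q[next_state, :])
    # Move the current estimate toward reward + discounted best next value.
    Q[state, action] = Q[state, action] + LEARNING_RATE * (
        reward + GAMMA * best_next - Q[state, action]
    )
    return Q[state, action]


# Made-up transition: in state 0, action 1 gives reward 1.0 and leads to state 2.
Q[2, 0] = 2.0  # pretend state 2 already has a known good action
new_value = q_update(Q, state=0, action=1, reward=1.0, next_state=2)
print(new_value)  # approximately 1.4, i.e. 0.5 * (1.0 + 0.9 * 2.0 - 0.0)
```

Only the entry for the visited state and action changes. The row for `next_state` is read to find its best value and is not modified.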