---
id: 5e8f2f13c4cdbe86b5c72da5
title: 'Reinforcement Learning With Q-Learning: Example'
challengeType: 11
videoId: RBBSNta234s
---

## Description
<section id='description'>
</section>

## Tests
<section id='tests'>

```yml
question:
  text: |
    Fill in the blanks to complete the following Q-Learning equation:

    ```py
    Q[__A__, __B__] = Q[__A__, __B__] + LEARNING_RATE * (reward + GAMMA * np.max(Q[__C__, :]) - Q[__A__, __B__])
    ```

  answers:
    - |
      A: `state`

      B: `action`

      C: `next_state`
    - |
      A: `state`

      B: `action`

      C: `prev_state`
    - |
      A: `state`

      B: `reaction`

      C: `next_state`
  solution: 1
```

</section>