freeCodeCamp/curriculum/challenges/chinese-traditional/11-machine-learning-with-python/tensorflow/reinforcement-learning-with-q-learning-part-2.md

---
id: 5e8f2f13c4cdbe86b5c72da4
title: 'Reinforcement Learning With Q-Learning: Part 2'
challengeType: 11
videoId: DX7hJuaUZ7o
bilibiliIds:
  aid: 420570359
  bvid: BV1G341127zr
  cid: 409139190
dashedName: reinforcement-learning-with-q-learning-part-2
---

# --question--

## --text--

What can happen if the agent does not have a good balance between taking random actions and using learned actions?
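
For reference, the balance in question is often implemented with an epsilon-greedy rule. The following is a minimal sketch under that assumption; `choose_action`, `q_row`, and `epsilon` are illustrative names chosen here, not identifiers taken from the video.

```python
import random

def choose_action(q_row, epsilon):
    """Pick an action index for one state with an epsilon-greedy rule.

    q_row   -- list of learned Q-values, one per action (hypothetical data)
    epsilon -- probability of taking a random action instead of a learned one
    """
    if random.random() < epsilon:
        # Explore: ignore the learned values and pick any action at random.
        return random.randrange(len(q_row))
    # Exploit: pick the action with the highest learned Q-value.
    return max(range(len(q_row)), key=lambda a: q_row[a])

q_row = [0.1, 0.5, 0.2]  # made-up Q-values for a single state

# epsilon = 0.0 means the agent only ever uses learned actions.
greedy_choices = [choose_action(q_row, 0.0) for _ in range(100)]

# epsilon = 1.0 means the agent only ever takes random actions.
random_choices = [choose_action(q_row, 1.0) for _ in range(100)]
```

With `epsilon` at either extreme the agent never mixes the two behaviors; intermediate values, often decayed over training, are what provide the balance.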

## --answers--

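The agent will always try to minimize its reward for the current state/action, leading to local minima.

---

The agent will always try to maximize its reward for the current state/action, leading to local maxima.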

## --video-solution--

2