chore(i8n,learn): processed translations
This commit is contained in:
committed by
Mrugesh Mohapatra
parent
15047f2d90
commit
e5c44a3ae5
@ -1,36 +1,155 @@
|
||||
---
|
||||
id: 5956795bc9e2c415eb244de1
|
||||
title: 哈希加入
|
||||
title: Hash join
|
||||
challengeType: 5
|
||||
videoUrl: ''
|
||||
forumTopicId: 302284
|
||||
dashedName: hash-join
|
||||
---
|
||||
|
||||
# --description--
|
||||
|
||||
<p> <a href='https://en.wikipedia.org/wiki/Join_(SQL)#Inner_join' title='wp:Join_(SQL)#Inner_join'>内连接</a>是一种操作,它根据匹配的列值将两个数据表组合到一个表中。实现此操作的最简单方法是<a href='https://en.wikipedia.org/wiki/Nested loop join' title='wp:嵌套循环连接'>嵌套循环连接</a>算法,但更可扩展的替代方法是<a href='https://en.wikipedia.org/wiki/hash join' title='wp:哈希联接'>散列连接</a>算法。 </p><p>实现“散列连接”算法,并演示它通过下面列出的测试用例。 </p><p>您应该将表表示为在编程语言中感觉自然的数据结构。 </p><p> “散列连接”算法包含两个步骤: </p>哈希阶段:从两个表中的一个表创建一个<a href='https://en.wikipedia.org/wiki/Multimap' title='wp:Multimap'>多</a>图,从每个连接列值映射到包含它的所有行。多图必须支持基于散列的查找,它比简单的线性搜索更好地扩展,因为这是该算法的重点。理想情况下,我们应该为较小的表创建多图,从而最小化其创建时间和内存大小。加入阶段:扫描另一个表,通过查看之前创建的多图来查找匹配的行。 <p>在伪代码中,算法可以表示如下: </p><pre>让A =第一个输入表(或理想情况下,更大的输入表)
|
||||
设B =第二个输入表(或理想情况下,较小的输入表)
|
||||
令j <sub>A</sub> =表A的连接列ID
|
||||
令j <sub>B</sub> =表B的连接列ID
|
||||
让M <sub>B</sub> =一个多图,用于从单个值映射到表B的多行(从空白开始)
|
||||
让C =输出表(从空开始)
|
||||
对于表B中的每一行b:
|
||||
将b放在密钥b(j <sub>B</sub> )下的多映射M <sub>B中</sub>
|
||||
对于表A中的每一行a:
|
||||
对于a(j <sub>A</sub> )项下多图M <sub>B中的</sub>每一行b:
|
||||
设c =第a行和第b行的串联
|
||||
将行c放在表C中<p></p>
|
||||
</pre>测试用例<p>输入</p><table><tbody><tr><td style='padding: 4px; margin: 5px;'><table style='border:none; border-collapse:collapse;'><tbody><tr><td style='border:none'> <i>A =</i> </td><td style='border:none'><table><tbody><tr><th style='padding: 4px; margin: 5px;'>年龄</th><th style='padding: 4px; margin: 5px;'>名称</th></tr><tr><td style='padding: 4px; margin: 5px;'> 27 </td><td style='padding: 4px; margin: 5px;'>约拿</td></tr><tr><td style='padding: 4px; margin: 5px;'> 18 </td><td style='padding: 4px; margin: 5px;'>艾伦</td></tr><tr><td style='padding: 4px; margin: 5px;'> 28 </td><td style='padding: 4px; margin: 5px;'>荣耀</td></tr><tr><td style='padding: 4px; margin: 5px;'> 18 </td><td style='padding: 4px; margin: 5px;'>大力水手</td></tr><tr><td style='padding: 4px; margin: 5px;'> 28 </td><td style='padding: 4px; margin: 5px;'>艾伦</td></tr></tbody></table></td><td style='border:none; padding-left:1.5em;' rowspan='2'></td><td style='border:none'> <i>B =</i> </td><td style='border:none'><table><tbody><tr><th style='padding: 4px; margin: 5px;'>字符</th><th style='padding: 4px; margin: 5px;'>复仇者</th></tr><tr><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>鲸鱼</td></tr><tr><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>蜘蛛</td></tr><tr><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>鬼</td></tr><tr><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>植物大战僵尸</td></tr><tr><td style='padding: 4px; margin: 5px;'>荣耀</td><td style='padding: 4px; margin: 5px;'>巴菲</td></tr></tbody></table></td></tr><tr><td style='border:none'> <i>j <sub>A</sub> =</i> </td><td style='border:none'> <i><code>Name</code> (即第1栏)</i> </td><td style='border:none'> <i>j <sub>B</sub> =</i> </td><td style='border:none'> <i><code>Character</code> (即第0列)</i> </td></tr></tbody></table></td><td style='padding: 4px; margin: 5px;'></td></tr></tbody></table><p>产量</p><table><tbody><tr><th style='padding: 4px; margin: 5px;'> A.Age </th><th style='padding: 4px; margin: 5px;'>一个名字</th><th style='padding: 4px; margin: 5px;'> B.Character </th><th style='padding: 4px; margin: 5px;'> B.Nemesis </th></tr><tr><td style='padding: 4px; margin: 5px;'> 27 </td><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>鲸鱼</td></tr><tr><td style='padding: 4px; margin: 5px;'> 27 </td><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>约拿</td><td style='padding: 4px; margin: 5px;'>蜘蛛</td></tr><tr><td style='padding: 4px; margin: 5px;'> 18 </td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>鬼</td></tr><tr><td style='padding: 4px; margin: 5px;'> 18 </td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>植物大战僵尸</td></tr><tr><td style='padding: 4px; margin: 5px;'> 28 </td><td style='padding: 4px; margin: 5px;'>荣耀</td><td style='padding: 4px; margin: 5px;'>荣耀</td><td style='padding: 4px; margin: 5px;'>巴菲</td></tr><tr><td style='padding: 4px; margin: 5px;'> 28 </td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>鬼</td></tr><tr><td style='padding: 4px; margin: 5px;'> 28 </td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>艾伦</td><td style='padding: 4px; margin: 5px;'>植物大战僵尸</td></tr></tbody></table><p></p><p></p><p>输出表中行的顺序并不重要。 </p><p>如果你使用数字索引数组来表示表行(而不是按名称引用列),你可以用<code style='white-space:nowrap'>[[27, "Jonah"], ["Jonah", "Whales"]]</code>的形式表示输出行。 。 </p><hr>
|
||||
An [inner join](https://en.wikipedia.org/wiki/Join_(SQL)#Inner_join "wp: Join\_(SQL)#Inner_join") is an operation that combines two data tables into one table, based on matching column values. The simplest way of implementing this operation is the [nested loop join](https://en.wikipedia.org/wiki/Nested loop join "wp: Nested loop join") algorithm, but a more scalable alternative is the [hash join](https://en.wikipedia.org/wiki/hash join "wp: hash join") algorithm.
|
||||
|
||||
The "hash join" algorithm consists of two steps:
|
||||
|
||||
<ol>
|
||||
<li><strong>Hash phase:</strong> Create a <a href='https://en.wikipedia.org/wiki/Multimap' title='wp: Multimap' target='_blank'>multimap</a> from one of the two tables, mapping from each join column value to all the rows that contain it.</li>
|
||||
<ul>
|
||||
<li>The multimap must support hash-based lookup which scales better than a simple linear search, because that's the whole point of this algorithm.</li>
|
||||
<li>Ideally we should create the multimap for the smaller table, thus minimizing its creation time and memory size.</li>
|
||||
</ul>
|
||||
<li><strong>Join phase:</strong> Scan the other table, and find matching rows by looking in the multimap created before.</li>
|
||||
</ol>
|
||||
|
||||
In pseudo-code, the algorithm could be expressed as follows:
|
||||
|
||||
<pre><strong>let</strong> <i>A</i> = the first input table (or ideally, the larger one)
|
||||
<strong>let</strong> <i>B</i> = the second input table (or ideally, the smaller one)
|
||||
<strong>let</strong> <i>j<sub>A</sub></i> = the join column ID of table <i>A</i>
|
||||
<strong>let</strong> <i>j<sub>B</sub></i> = the join column ID of table <i>B</i>
|
||||
<strong>let</strong> <i>M<sub>B</sub></i> = a multimap for mapping from single values to multiple rows of table <i>B</i> (starts out empty)
|
||||
<strong>let</strong> <i>C</i> = the output table (starts out empty)
|
||||
<strong>for each</strong> row <i>b</i> in table <i>B</i>:
|
||||
<strong>place</strong> <i>b</i> in multimap <i>M<sub>B</sub></i> under key <i>b(j<sub>B</sub>)</i>
|
||||
<strong>for each</strong> row <i>a</i> in table <i>A</i>:
|
||||
<strong>for each</strong> row <i>b</i> in multimap <i>M<sub>B</sub></i> under key <i>a(j<sub>A</sub>)</i>:
|
||||
<strong>let</strong> <i>c</i> = the concatenation of row <i>a</i> and row <i>b</i>
|
||||
<strong>place</strong> row <i>c</i> in table <i>C</i>
|
||||
</pre>
|
||||
|
||||
# --instructions--
|
||||
|
||||
Implement the "hash join" algorithm as a function and demonstrate that it passes the test-case listed below. The function should accept two arrays of objects and return an array of combined objects.
|
||||
|
||||
**Input**
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">
|
||||
<table style="border:none; border-collapse:collapse;">
|
||||
<tr>
|
||||
<td style="border:none"><i>A =</i></td>
|
||||
<td style="border:none">
|
||||
<table>
|
||||
<tr>
|
||||
<th style="padding: 4px; margin: 5px;">Age</th>
|
||||
<th style="padding: 4px; margin: 5px;">Name</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">27</td>
|
||||
<td style="padding: 4px; margin: 5px;">Jonah</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">18</td>
|
||||
<td style="padding: 4px; margin: 5px;">Alan</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">28</td>
|
||||
<td style="padding: 4px; margin: 5px;">Glory</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">18</td>
|
||||
<td style="padding: 4px; margin: 5px;">Popeye</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">28</td>
|
||||
<td style="padding: 4px; margin: 5px;">Alan</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
<td style="border:none; padding-left:1.5em;" rowspan="2"></td>
|
||||
<td style="border:none"><i>B =</i></td>
|
||||
<td style="border:none">
|
||||
<table>
|
||||
<tr>
|
||||
<th style="padding: 4px; margin: 5px;">Character</th>
|
||||
<th style="padding: 4px; margin: 5px;">Nemesis</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">Jonah</td>
|
||||
<td style="padding: 4px; margin: 5px;">Whales</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">Jonah</td>
|
||||
<td style="padding: 4px; margin: 5px;">Spiders</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">Alan</td>
|
||||
<td style="padding: 4px; margin: 5px;">Ghosts</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">Alan</td>
|
||||
<td style="padding: 4px; margin: 5px;">Zombies</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="padding: 4px; margin: 5px;">Glory</td>
|
||||
<td style="padding: 4px; margin: 5px;">Buffy</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td style="border:none">
|
||||
<i>j<sub>A</sub> =</i>
|
||||
</td>
|
||||
<td style="border:none">
|
||||
<i><code>Name</code> (i.e. column 1)</i>
|
||||
</td>
|
||||
<td style="border:none">
|
||||
<i>j<sub>B</sub> =</i>
|
||||
</td>
|
||||
<td style="border:none">
|
||||
<i><code>Character</code> (i.e. column 0)</i>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
**Output**
|
||||
|
||||
| A_age | A_name | B_character | B_nemesis |
|
||||
| ----- | ------ | ----------- | --------- |
|
||||
| 27 | Jonah | Jonah | Whales |
|
||||
| 27 | Jonah | Jonah | Spiders |
|
||||
| 18 | Alan | Alan | Ghosts |
|
||||
| 18 | Alan | Alan | Zombies |
|
||||
| 28 | Glory | Glory | Buffy |
|
||||
| 28 | Alan | Alan | Ghosts |
|
||||
| 28 | Alan | Alan | Zombies |
|
||||
|
||||
The order of the rows in the output table is not significant.
|
||||
|
||||
# --hints--
|
||||
|
||||
`hashJoin`是一个函数。
|
||||
`hashJoin` should be a function.
|
||||
|
||||
```js
|
||||
assert(typeof hashJoin === 'function');
|
||||
```
|
||||
|
||||
`hashJoin([{ age: 27, name: "Jonah" }, { age: 18, name: "Alan" }, { age: 28, name: "Glory" }, { age: 18, name: "Popeye" }, { age: 28, name: "Alan" }], [{ character: "Jonah", nemesis: "Whales" }, { character: "Jonah", nemesis: "Spiders" }, { character: "Alan", nemesis: "Ghosts" }, { character:"Alan", nemesis: "Zombies" }, { character: "Glory", nemesis: "Buffy" }, { character: "Bob", nemesis: "foo" }])`应该返回`[{"A_age": 27,"A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Whales"}, {"A_age": 27,"A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Spiders"}, {"A_age": 18,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 18,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}, {"A_age": 28,"A_name": "Glory", "B_character": "Glory", "B_nemesis": "Buffy"}, {"A_age": 28,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 28,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}]`
|
||||
`hashJoin([{ age: 27, name: "Jonah" }, { age: 18, name: "Alan" }, { age: 28, name: "Glory" }, { age: 18, name: "Popeye" }, { age: 28, name: "Alan" }], [{ character: "Jonah", nemesis: "Whales" }, { character: "Jonah", nemesis: "Spiders" }, { character: "Alan", nemesis: "Ghosts" }, { character:"Alan", nemesis: "Zombies" }, { character: "Glory", nemesis: "Buffy" }, { character: "Bob", nemesis: "foo" }])` should return `[{"A_age": 27,"A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Whales"}, {"A_age": 27,"A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Spiders"}, {"A_age": 18,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 18,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}, {"A_age": 28,"A_name": "Glory", "B_character": "Glory", "B_nemesis": "Buffy"}, {"A_age": 28,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 28,"A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}]`
|
||||
|
||||
```js
|
||||
assert.deepEqual(hashJoin(hash1, hash2), res);
|
||||
|
Reference in New Issue
Block a user