---
id: 5e9a093a74c4063ca6f7c15f
title: Data Cleaning Duplicates
challengeType: 11
isHidden: false
videoId: kj7QqjXhH6A
---

## Description
<section id='description'>
More resources:
- <a href="https://notebooks.ai/rmotr-curriculum/data-cleaning-rmotr-freecodecamp-fd76fa59" target='_blank'>Notebook</a>
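
As a minimal sketch of how `.duplicated()` marks rows, assuming pandas is installed: the `players` Series and its values are made up for illustration and do not come from the lesson notebook.

```python
import pandas as pd

# Hypothetical sample data; the names are made up for illustration.
players = pd.Series(["Ana", "Ben", "Ana", "Cy", "Ana"])

# Default keep="first": the first occurrence is left unflagged,
# every later repeat of the same value is flagged True.
flags_default = players.duplicated()

# keep="last": every occurrence except the last one is flagged.
flags_last = players.duplicated(keep="last")

# keep=False: every value that appears more than once is flagged.
flags_all = players.duplicated(keep=False)

# drop_duplicates() keeps exactly the rows that duplicated() leaves unflagged.
unique_players = players.drop_duplicates()

print(flags_default.tolist())
print(unique_players.tolist())
```

The same `keep` argument applies to `DataFrame.duplicated()`, where whole rows (or the columns named in `subset`) are compared instead of single values.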
</section>

## Tests
<section id='tests'>

```yml
question:
  text: |
    The pandas method `.duplicated()` returns a boolean Series for your DataFrame. `True` is the return value for rows that:
  answers:
    - contain a duplicate, where the row is the first occurrence of that value.
    - contain a duplicate, where the row is at least the second occurrence of that value.
    - contain a duplicate, where the row is either the first or the second occurrence of that value.
  solution: 2
```

</section>