---
title: Python defaultdict
---
## Python defaultdict

The dictionary is one of the most used data structures in Python.
A dictionary is a collection of key-value pairs; since Python 3.7 it preserves the order in which keys were inserted.
Let us look at a few examples of how a dictionary is usually used.

```python
# dictionary declaration 1
dict1 = dict()

# dictionary declaration 2
dict2 = {}

# Add items to the dictionary
# The syntax to add and retrieve items is the same for either of the two objects we defined above.
key = "X"
value = "Y"
dict1[key] = value

# Dictionary values are not restricted to a single data type,
# so they can be pretty diverse. Here the value is another dictionary.
dict1[key] = dict2
```

Now let's look at how to retrieve items.

```python
# Since "X" exists in our dictionary, this will retrieve the value
value = dict1[key]

# This key doesn't exist in the dictionary.
# So, we will get a `KeyError`
value = dict1["random"]
```

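To see what that failure looks like in practice, here is a minimal sketch that catches the exception instead of letting it stop the program. The `lookup_or_report` helper is a hypothetical name used only for illustration.

```python
# Hypothetical helper for illustration: look up a key and report a miss
# instead of letting the KeyError propagate.
def lookup_or_report(mapping, key):
    try:
        return mapping[key]
    except KeyError as err:
        # A KeyError carries the missing key as its first argument.
        print("Missing key:", err.args[0])
        return None


dict1 = {"X": "Y"}
found = lookup_or_report(dict1, "X")         # existing key
missing = lookup_or_report(dict1, "random")  # missing key, prints a message
```

Wrapping every lookup like this gets verbose quickly, which is why the approaches below are usually preferred.
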
### Avoiding KeyError: Use the .get method

If the given key does not exist in the dictionary, Python raises a `KeyError`.
There is a simple workaround for this. Let's look at how we can avoid `KeyError`, first with a plain
membership check and then with the built-in `.get` method for dictionaries.

```python
dict_ = {}

# Some random key
random_key = "random"

# The most basic way of doing this is to check whether the key
# exists in the dictionary and only retrieve the value if it does.
if random_key in dict_:
  print(dict_[random_key])
else:
  print("Key = {} doesn't exist in the dictionary".format(random_key))
```

Often we are fine with getting a default value when the key doesn't exist, for example when
building a counter. There is a better way to get default values for missing keys
than relying on a standard `if-else`.

```python
# Let's say we want to build a frequency counter for items in the following array
arr = [1,2,3,1,2,3,4,1,2,1,4,1,2,3,1]

freq = {}

for item in arr:
  # Fetch a value of 0 in case the key doesn't exist. Otherwise, fetch the stored value
  freq[item] = freq.get(item, 0) + 1
```

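One detail of the loop above is worth spelling out: `get` only returns the fallback value, it never stores it in the dictionary. The assignment `freq[item] = ...` is what actually creates the key. A small sketch, with illustrative variable names:

```python
freq = {}

# get() returns the supplied default for a missing key...
count = freq.get("a", 0)

# ...but it does not add the key to the dictionary.
key_was_added = "a" in freq

# Without an explicit default, get() returns None for a missing key.
no_default = freq.get("a")

print(count, key_was_added, no_default)  # 0 False None
```
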
So, `get(<key>, <defaultval>)` is a handy way to retrieve a value from the dictionary, falling back to a default when the key is missing.
The problem with this method shows up when the values are mutable data structures, e.g. a `list` or a `set`.

```python
dict_ = {}

# Some random key
random_key = "random"

dict_[random_key] = dict_.get(random_key, []).append("Hello World!")
print(dict_) # {'random': None}

dict_ = {}
dict_[random_key] = dict_.get(random_key, set()).add("Hello World!")
print(dict_) # {'random': None}
```

Did you see the problem?

`list.append` and `set.add` modify the collection in place and return `None`, so `None` is what ends up stored under the key, and the new `list` or `set` is lost.
Instead, we should first assign a new `list` or `set` to the key if it is missing, and then call `append` or `add` on the stored object. Let's look at an example of this.

```python
dict_ = {}
random_key = "random"
# Step 1: store a new set under the key if it is missing
dict_[random_key] = dict_.get(random_key, set())
# Step 2: modify the stored set in place
dict_[random_key].add("Hello World!")
print(dict_) # {'random': {'Hello World!'}}. Yay!
```

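As an aside, the built-in `dict.setdefault` method performs the same two steps in one call: it stores the default if the key is missing and returns whatever is stored under the key. This is not part of the walkthrough above, only a sketch of an alternative that works on a plain `dict`:

```python
dict_ = {}
random_key = "random"

# setdefault() inserts set() under the key if it is missing, then returns
# the stored set, so add() operates on the object held by the dictionary.
dict_.setdefault(random_key, set()).add("Hello World!")
dict_.setdefault(random_key, set()).add("Hello again!")

print(dict_[random_key] == {"Hello World!", "Hello again!"})  # True
```

Note that the default argument (`set()` here) is constructed on every call, even when the key already exists.
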
### Avoiding KeyError: Use defaultdict

The two-step approach works most of the time. However, there is a better, more Pythonic way to do this. `defaultdict` is a subclass of the built-in `dict` class.
When a missing key is accessed, `defaultdict` calls the factory we specify (such as `list`, `set`, or `int`) and stores the result as that key's value. So, the two steps:

```python
dict_[random_key] = dict_.get(random_key, set())
dict_[random_key].add("Hello World!")
```

can now be combined into a single step. For example:

```python
from collections import defaultdict

# Yet another random key
random_key = "random_key"

# defaultdict whose missing keys default to an empty list
list_dict_ = defaultdict(list)

# defaultdict whose missing keys default to an empty set
set_dict_ = defaultdict(set)

# defaultdict whose missing keys default to 0
int_dict_ = defaultdict(int)

list_dict_[random_key].append("Hello World!")
set_dict_[random_key].add("Hello World!")
int_dict_[random_key] += 1

# Prints:
#   defaultdict(<class 'list'>, {'random_key': ['Hello World!']})
#   defaultdict(<class 'set'>, {'random_key': {'Hello World!'}})
#   defaultdict(<class 'int'>, {'random_key': 1})
print(list_dict_, set_dict_, int_dict_, sep="\n")
```

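To tie this back to the frequency counter from earlier, here is a sketch of the same loop written with `defaultdict(int)`. It also illustrates one behavior to keep in mind: unlike `get`, merely reading a missing key from a `defaultdict` inserts that key.

```python
from collections import defaultdict

arr = [1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 4, 1, 2, 3, 1]

# int() returns 0, so every missing key starts counting from 0.
freq = defaultdict(int)
for item in arr:
    freq[item] += 1

print(dict(freq))  # {1: 6, 2: 4, 3: 3, 4: 2}

# Reading a missing key calls the factory and stores the result.
size_before = len(freq)
_ = freq[99]
size_after = len(freq)
print(size_before, size_after)  # 4 5
```
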
---
<a href='https://docs.python.org/2/library/collections.html' target='_blank' rel='nofollow'>Official Docs</a>