---
title: Python defaultdict
---
## Python defaultdict
								
The dictionary is one of the most widely used data structures in Python.
A dictionary is a collection of items stored as key-value pairs (as of Python 3.7, it preserves insertion order).
Let us look at a few examples of how a dictionary is typically used.
								
```python
# Dictionary declaration 1: the dict() constructor
dict1 = dict()

# Dictionary declaration 2: the literal syntax
dict2 = {}

# Add items to the dictionary.
# The syntax to add and retrieve items is the same for either of the two objects defined above.
key = "X"
value = "Y"
dict1[key] = value

# Dictionary values are not restricted to a single data type,
# so they can be quite diverse. Here the value is another dictionary.
dict1[key] = dict2
```
								
Let's now look at some ways to retrieve items.
								
```python
# Since "X" exists in our dictionary, this retrieves its value
value = dict1[key]

# This key doesn't exist in the dictionary,
# so this line raises a `KeyError`
value = dict1["random"]
```
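
To make the failure mode above concrete, here is a minimal sketch that catches the `KeyError` instead of letting it stop the program. The dictionary contents and the `missing` variable are illustrative assumptions, not part of the original example.

```python
dict1 = {"X": "Y"}

try:
  value = dict1["random"]
except KeyError as err:
  # The exception's first argument is the key that was not found
  missing = err.args[0]
  print("Missing key:", missing)
```

Catching the exception works, but it gets verbose when missing keys are expected. The sections below cover lighter-weight options.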
							
								
### Avoiding KeyError: Use .get function
								
If the given key does not exist in the dictionary, Python raises a `KeyError`.
There is a simple workaround for this. Let's look at how we can avoid a `KeyError` using the
built-in `.get` method of dictionaries.
								
```python
dict_ = {}

# Some random key
random_key = "random"

# The most basic approach is to check whether the key
# exists in the dictionary and retrieve the value only
# if it does.
if random_key in dict_:
  print(dict_[random_key])
else:
  print("Key = {} doesn't exist in the dictionary".format(random_key))
```
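
For comparison with the explicit membership check, the following is a small sketch of `.get` itself. The sample dictionary and the `"fallback"` value are assumptions chosen for illustration.

```python
dict_ = {"X": "Y"}

# An existing key returns its stored value
found = dict_.get("X")

# A missing key returns None instead of raising a KeyError
missing = dict_.get("random")

# An optional second argument supplies the value to return for a missing key
fallback = dict_.get("random", "fallback")

print(found, missing, fallback)  # Y None fallback

# .get only reads; it never inserts the missing key
print(dict_)  # {'X': 'Y'}
```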
							
								
Often we are fine with getting a default value when the key doesn't exist, for example when
building a counter. There is a better way to get default values for missing keys
than relying on a standard `if-else`.
								
```python
# Let's say we want to build a frequency counter for the items in the following list
arr = [1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 4, 1, 2, 3, 1]

freq = {}

for item in arr:
  # Fetch 0 if the key doesn't exist; otherwise, fetch the stored value
  freq[item] = freq.get(item, 0) + 1
```
								
So, `get(<key>, <defaultval>)` is a handy operation for retrieving the stored value, or a default, for any given key from the dictionary.
The problem with this method arises when we want to use mutable data structures as values, e.g. `list` or `set`.
								
```python
dict_ = {}

# Some random key
random_key = "random"

# list.append returns None, so None is what gets stored
dict_[random_key] = dict_.get(random_key, []).append("Hello World!")
print(dict_) # {'random': None}

dict_ = {}
# set.add returns None as well
dict_[random_key] = dict_.get(random_key, set()).add("Hello World!")
print(dict_) # {'random': None}
```
								
Did you see the problem?
								
The new `set` or `list` doesn't get assigned to the dictionary's key, because `append` and `add` modify the object in place and return `None`. We should first assign a new `list` or `set`
to the key when it is missing and then call `append` or `add` respectively. Let's look at an example of this.
								
```python
dict_ = {}
dict_[random_key] = dict_.get(random_key, set())
dict_[random_key].add("Hello World!")
print(dict_) # {'random': {'Hello World!'}}. Yay!
```
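
As an aside, the built-in `dict.setdefault` method can express the same "assign if missing, then mutate" pattern in one line. The sketch below is an alternative to the two-step `.get` version, not something the original example uses; the second string is an illustrative assumption.

```python
dict_ = {}
random_key = "random"

# setdefault stores the given default under the key if the key is missing,
# then returns whatever value is stored under the key
dict_.setdefault(random_key, set()).add("Hello World!")
dict_.setdefault(random_key, set()).add("Hello again!")

print(dict_[random_key] == {"Hello World!", "Hello again!"})  # True
```

One trade-off: the default argument (`set()` here) is constructed on every call, even when the key already exists.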
							
								
### Avoiding KeyError: Use defaultdict
								
The two-step approach with `.get` works most of the time. However, there is a better, more Pythonic way to do this. `defaultdict` is a subclass of the built-in `dict` class.
When a missing key is accessed, a `defaultdict` calls the factory we specify (e.g. `list`, `set` or `int`) and stores the result as the value for that key. So, the two steps:
								
```python
dict_[random_key] = dict_.get(random_key, set())
dict_[random_key].add("Hello World!")
```
								
can now be combined into a single step. For example:
								
```python
from collections import defaultdict

# Yet another random key
random_key = "random_key"

# defaultdict whose missing keys start as an empty list
list_dict_ = defaultdict(list)

# defaultdict whose missing keys start as an empty set
set_dict_ = defaultdict(set)

# defaultdict whose missing keys start as int(), i.e. 0
int_dict_ = defaultdict(int)

list_dict_[random_key].append("Hello World!")
set_dict_[random_key].add("Hello World!")
int_dict_[random_key] += 1

# Prints:
#   defaultdict(<class 'list'>, {'random_key': ['Hello World!']})
#   defaultdict(<class 'set'>, {'random_key': {'Hello World!'}})
#   defaultdict(<class 'int'>, {'random_key': 1})
print(list_dict_, set_dict_, int_dict_, sep="\n")
```
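
To tie this back to the frequency counter from the `.get` section, here is a sketch of the same counter written with `defaultdict(int)`, followed by a caveat worth knowing. The probe keys `99` and `100` and the helper variable names are assumptions added for illustration.

```python
from collections import defaultdict

arr = [1, 2, 3, 1, 2, 3, 4, 1, 2, 1, 4, 1, 2, 3, 1]

# Missing keys start at int(), i.e. 0, so no .get call is needed
freq = defaultdict(int)
for item in arr:
  freq[item] += 1

counts = dict(freq)
print(counts)  # {1: 6, 2: 4, 3: 3, 4: 2}

# Caveat: indexing a missing key inserts it with the default value
probe = freq[99]
has_99 = 99 in freq  # True

# .get does not call the factory, so it leaves the dictionary unchanged
probe_get = freq.get(100)
has_100 = 100 in freq  # False
```

Because a plain lookup can silently grow a `defaultdict`, `in` or `.get` is the safer choice when you only want to test for a key.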
							
								
---
<a href='https://docs.python.org/2/library/collections.html' target='_blank' rel='nofollow'>Official Docs</a>