147 lines
		
	
	
		
			4.1 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			147 lines
		
	
	
		
			4.1 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| ---
 | |
| title: Python defaultdict
 | |
| ---
 | |
| ## Python defaultdict
 | |
| 
 | |
| Dictionary is one of the most used data structures in Python. 
 | |
| A dictionary is an unordered collection of items and we usually have keys and values stored in a dictionary.
 | |
| Let us look at a few examples for how the dictionary is usually used. 
 | |
| 
 | |
| ```python
 | |
| # dictionary declaration 1
 | |
| dict1 = dict()
 | |
| 
 | |
| # dictionary declaration 2
 | |
| dict2 = {}
 | |
| 
 | |
| # Add items to the dictionary
 | |
| # The syntax to add and retrieve items is same for either of the two objects we defined above. 
 | |
| key = "X"
 | |
| value = "Y"
 | |
| dict1[key] = value
 | |
| 
 | |
| # The dictionary doesn't have any specific data-type. 
 | |
| # So, the values can be pretty diverse. 
 | |
| dict1[key] = dict2
 | |
| ```
 | |
| 
 | |
| Let's now look at some retrieval ways. 
 | |
| 
 | |
| ```python
 | |
| # Since "X" exists in our dictionary, this will retrieve the value
 | |
| value = dict1[key]
 | |
| 
 | |
| # This key doesn't exist in the dictionary. 
 | |
| # So, we will get a `KeyError`
 | |
| value = dict1["random"]
 | |
| ```
 | |
| 
 | |
| ### Avoiding KeyError: Use .get function
 | |
| 
 | |
| In case the given key does not exist in the dictionary, Python will throw a `KeyError`. 
 | |
| There is a simple workaround for this. Let's look at how we can avoid `KeyError` using the 
 | |
| in-built `.get` function for dictionaries. 
 | |
| 
 | |
| ```python
 | |
| dict_ = {}
 | |
| 
 | |
| # Some random key
 | |
| random_key = "random"
 | |
| 
 | |
| # The most basic way of doing this is to check if the key 
 | |
| # exists in the dictionary or not and only retrieve if the 
 | |
| # key exists. Otherwise not. 
 | |
| if random_key in dict_:
 | |
|   print(dict_[random_key])
 | |
| else:
 | |
|   print("Key = {} doesn't exist in the dictionary".format(dict_))
 | |
| ```
 | |
| 
 | |
| A lot of times we are ok getting a default value when the key doesn't exist. For e.g. when 
 | |
| building a counter. There is a better way to get default values from the dictionary in case of 
 | |
| missing keys rather than relying on standard `if-else`. 
 | |
| 
 | |
| ```python
 | |
| 
 | |
| # Let's say we want to build a frequency counter for items in the following array
 | |
| arr = [1,2,3,1,2,3,4,1,2,1,4,1,2,3,1]
 | |
| 
 | |
| freq = {}
 | |
| 
 | |
| for item in arr:
 | |
|   # Fetch a value of 0 in case the key doesn't exist. Otherwise, fetch the stored value
 | |
|   freq[item] = freq.get(item, 0) + 1
 | |
| ```
 | |
| 
 | |
| So, the `get(<key>, <defaultval>)` is a handy operation for retrieving the default value for any given key from the dictionary.
 | |
| The problem with this method comes when we want to deal with mutable data structures as values e.g. `list` or `set`. 
 | |
| 
 | |
| ```python
 | |
| dict_ = {}
 | |
| 
 | |
| # Some random key
 | |
| random_key = "random"
 | |
| 
 | |
| dict_[random_key] = dict_.get(random_key, []).append("Hello World!")
 | |
| print(dict_) # {'random': None}
 | |
| 
 | |
| dict_ = {}
 | |
| dict_[random_key] = dict_.get(random_key, set()).add("Hello World!")
 | |
| print(dict_) # {'random': None}
 | |
| ```
 | |
| 
 | |
| Did you see the problem?
 | |
| 
 | |
| The new `set` or the `list` doesn't get assigned to the dictionary's key. We should assign a new `list` or a `set`
 | |
| to the key in case of missing value and then `append` or `add` respectively. Ley's look at an example for this. 
 | |
| 
 | |
| ```python
 | |
| dict_ = {}
 | |
| dict_[random_key] = dict_.get(random_key, set())
 | |
| dict_[random_key].add("Hello World!")
 | |
| print(dict_) # {'random': set(['Hello World!'])}. Yay!
 | |
| ```
 | |
| 
 | |
| ### Avoiding KeyError: Use defaultdict
 | |
| 
 | |
| This works most of the times. However, there is a better way to do this. A more `pythonic` way.  The `defaultdict` is a subclass of the built-in dict class.
 | |
| The `defaultdict` simply assigns the default value that we specify in case of a missing key. So, the two steps:
 | |
| 
 | |
| ```python
 | |
| dict_[random_key] = dict_.get(random_key, set())
 | |
| dict_[random_key].add("Hello World!")
 | |
| ```
 | |
| 
 | |
| can now be combined into one single step. For e.g.
 | |
| 
 | |
| ```python
 | |
| 
 | |
| from collections import defaultdict
 | |
| 
 | |
| # Yet another random key
 | |
| random_key = "random_key"
 | |
| 
 | |
| # list defaultdict
 | |
| list_dict_ = defaultdict(list)
 | |
| 
 | |
| # set defaultdict
 | |
| set_dict_ = defaultdict(set)
 | |
| 
 | |
| # integer defaultdict
 | |
| int_dict_ = defaultdict(int)
 | |
| 
 | |
| list_dict_[random_key].append("Hello World!")
 | |
| set_dict_[random_key].add("Hello World!")
 | |
| int_dict_[random_key] += 1
 | |
| 
 | |
| """
 | |
|   defaultdict(<class 'list'>, {'random_key': ['Hello World!']}) 
 | |
|   defaultdict(<class 'set'>, {'random_key': {'Hello World!'}}) 
 | |
|   defaultdict(<class 'int'>, {'random_key': 1})
 | |
| """
 | |
| print(list_dict_, set_dict_, int_dict_)
 | |
| ```
 | |
| 
 | |
| ---
 | |
| <a href='https://docs.python.org/2/library/collections.html' target='_blank' rel='nofollow'>Official Docs</a>
 |