---
title: Data Structure Trie
---
## Introduction to Trie

The word trie is an infix of the word "re**trie**val", because a trie can find a word in a dictionary given only a prefix of that word.
A trie is an efficient data structure for retrieval: searching for a key takes time proportional to the length of the key, regardless of how many keys are stored.
It is a multi-way tree structure useful for storing strings over an alphabet.
It has been used to store large dictionaries, for example the English words in spell-checking programs.
However, the price of a trie is its storage requirement.

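
To make the storage cost concrete, here is a rough sketch that counts how many child slots a small trie reserves. It assumes each node holds a fixed array of 26 slots (the layout this article uses in its implementation section). The names `Node`, `insert` and `count_nodes` are made up for this illustration.

```python
ALPHABET_SIZE = 26  # assumption: lowercase English letters only


class Node:
    """Hypothetical minimal trie node: one child slot per letter."""

    def __init__(self):
        self.children = [None] * ALPHABET_SIZE


def insert(root, word):
    """Walk down from the root, creating nodes where none exist yet."""
    node = root
    for ch in word:
        index = ord(ch) - ord('a')
        if node.children[index] is None:
            node.children[index] = Node()
        node = node.children[index]


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(c) for c in node.children if c is not None)


root = Node()
words = ["assoc", "algo", "all", "also", "tree", "trie"]
for w in words:
    insert(root, w)

nodes = count_nodes(root)            # root plus one node per distinct prefix
slots = nodes * ALPHABET_SIZE        # child slots reserved across all nodes
letters = sum(len(w) for w in words) # characters in the raw word list
print(nodes, slots, letters)         # 18 468 24
```

For these six words the trie reserves 468 child slots to hold 24 letters, and only 17 of those slots point at a child. Denser dictionaries share more prefixes, so the ratio improves, but the fixed per-node array is where the storage goes.
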
## What is a trie?

A trie is a tree-like data structure that stores strings and helps you find the data associated with a string using a prefix of that string.
For example, say you plan on building a dictionary to store strings along with their meanings. You may be wondering why you can't simply use a hash table to get the information.
You certainly can get the information using a hash table, but a hash table can only find data where the string exactly matches a key we've added. A trie also lets us find strings with a common prefix, or with a missing character, in less time than a hash table would need.
A trie typically looks something like this:

[Image: a trie that stores the words {assoc, algo, all, also, tree, trie}.]

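
As a small illustration of the hash table limitation, the sketch below uses a plain Python `dict`. The meanings are made-up placeholders. An exact key is found directly, a prefix is not a key at all, and answering a prefix query means scanning every key.

```python
# Placeholder meanings, invented for this example.
dictionary = {
    "assoc": "short for association",
    "algo": "short for algorithm",
    "all": "the whole amount",
    "also": "in addition",
    "tree": "a woody plant",
    "trie": "a prefix tree",
}

exact = dictionary.get("all")   # exact match: found directly
partial = dictionary.get("al")  # a prefix is not a key, so this is None

# A prefix query on a hash table has to look at every stored key.
matches = sorted(w for w in dictionary if w.startswith("al"))

print(exact)    # the whole amount
print(partial)  # None
print(matches)  # ['algo', 'all', 'also']
```

A trie avoids the full scan: it walks down the letters of the prefix once and then only visits the words stored below that point.
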
## How to implement a trie?

Let's implement a trie in Python for storing words with their meanings from an English dictionary.

    ALPHABET_SIZE = 26  # Lowercase English letters 'a' to 'z'

    class TrieNode:
        def __init__(self):
            self.edges = [None] * ALPHABET_SIZE  # One slot per letter; None means no child.
            self.meaning = None  # Meaning of the word that ends at this node, if any.
            self.ends_here = False  # True if a stored word ends at this node.

As you can see, `edges` has 26 slots, one for each letter of the alphabet: 'a' corresponds to index 0, 'b' to 1, 'c' to 2, and so on up to 'z' at index 25. If the slot for the character you are looking for holds `None`, no stored word continues with that character, so the word is not in the trie.

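
The letter-to-index mapping can be sketched as a small helper. `char_to_index` is a hypothetical name, and lowercasing the input and rejecting characters outside 'a' to 'z' are assumptions of this sketch rather than something the article's code does.

```python
ALPHABET_SIZE = 26  # lowercase English letters


def char_to_index(ch):
    """Map a single letter to its slot in a 26-element edges array."""
    ch = ch.lower()  # assumption: treat 'A' and 'a' as the same letter
    if len(ch) != 1 or not 'a' <= ch <= 'z':
        raise ValueError(f"unsupported character: {ch!r}")
    return ord(ch) - ord('a')


print(char_to_index('a'))  # 0
print(char_to_index('c'))  # 2
print(char_to_index('z'))  # 25
```

Without a check like this, a character such as `'-'` or an uppercase letter would produce a negative or out-of-range index and either read the wrong slot or raise an `IndexError`.
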
A typical trie should implement at least these three functions:

*   `add_word(word, meaning)`
*   `search_word(word)`
*   `delete_word(word)`

Additionally, one can also add something like

*   `get_all_words()`
*   `get_all_words_with_prefix(prefix)`

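
The two optional functions are not implemented in this article, so the following is only one possible sketch. It assumes the same node layout (`edges`, `meaning`, `ends_here`), and it gives `get_all_words` an optional `prefix` parameter, which is an addition made here so that both functions can share one traversal.

```python
ALPHABET_SIZE = 26


class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        if not word:
            self.ends_here = True
            self.meaning = meaning
            return
        index = ord(word[0]) - ord('a')
        if self.edges[index] is None:
            self.edges[index] = TrieNode()
        self.edges[index].add_word(word[1:], meaning)

    def get_all_words(self, prefix=""):
        """Return every word stored at or below this node.

        `prefix` is the string spelled out on the way to this node.
        """
        words = [prefix] if self.ends_here else []
        for index, child in enumerate(self.edges):
            if child is not None:
                letter = chr(ord('a') + index)
                words.extend(child.get_all_words(prefix + letter))
        return words

    def get_all_words_with_prefix(self, prefix):
        """Walk down the prefix, then collect everything below it."""
        node = self
        for ch in prefix:
            node = node.edges[ord(ch) - ord('a')]
            if node is None:
                return []  # no stored word starts with this prefix
        return node.get_all_words(prefix)


root = TrieNode()
for word in ["assoc", "algo", "all", "also", "tree", "trie"]:
    root.add_word(word, None)  # meanings omitted in this sketch

print(root.get_all_words_with_prefix("al"))  # ['algo', 'all', 'also']
print(root.get_all_words_with_prefix("x"))   # []
```

Because the edges are visited in index order, the words come back in alphabetical order without any sorting.
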
#### Adding a word to the trie

        def add_word(self, word, meaning):
            if len(word) == 0:
                self.ends_here = True  # We have reached the end of the word.
                self.meaning = meaning  # Store the meaning at this node.
                return
            ch = word[0]  # First character
            # ord(ch) minus ord('a'), the first letter of our alphabet,
            # gives the index of the edge we have to look up.
            index = ord(ch) - ord('a')
            if self.edges[index] is None:
                # No stored word continues with this character yet.
                self.edges[index] = TrieNode()

            self.edges[index].add_word(word[1:], meaning)  # Add the rest of the word.

#### Retrieving data

        def search_word(self, word):
            if len(word) == 0:
                # The word is present only if a stored word ends at this node.
                return self.ends_here
            ch = word[0]
            index = ord(ch) - ord('a')
            if self.edges[index] is None:
                return False
            return self.edges[index].search_word(word[1:])

The `search_word` function tells us whether the word exists in the trie. Since ours is a dictionary, we also need to fetch the meaning, so let's declare a function to do that.

        def get_meaning(self, word):
            if len(word) == 0:
                if self.ends_here:
                    return self.meaning
                else:
                    return "Word doesn't exist in the Trie"
            ch = word[0]
            index = ord(ch) - ord('a')
            if self.edges[index] is None:
                return "Word doesn't exist in the Trie"
            else:
                return self.edges[index].get_meaning(word[1:])

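
Here is a short usage sketch that puts the node class and the lookup methods together in one runnable piece. The meanings are made-up placeholders. Note that a prefix such as "al" has a node in the trie but is not reported as a word, because `ends_here` is `False` there.

```python
ALPHABET_SIZE = 26
NOT_FOUND = "Word doesn't exist in the Trie"


class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        if not word:
            self.ends_here = True
            self.meaning = meaning
            return
        index = ord(word[0]) - ord('a')
        if self.edges[index] is None:
            self.edges[index] = TrieNode()
        self.edges[index].add_word(word[1:], meaning)

    def search_word(self, word):
        if not word:
            return self.ends_here
        child = self.edges[ord(word[0]) - ord('a')]
        return child is not None and child.search_word(word[1:])

    def get_meaning(self, word):
        if not word:
            return self.meaning if self.ends_here else NOT_FOUND
        child = self.edges[ord(word[0]) - ord('a')]
        return NOT_FOUND if child is None else child.get_meaning(word[1:])


root = TrieNode()
root.add_word("all", "the whole amount")
root.add_word("also", "in addition")

print(root.search_word("all"))   # True
print(root.search_word("al"))    # False: only a prefix, no word ends here
print(root.search_word("tree"))  # False: never added
print(root.get_meaning("also"))  # in addition
print(root.get_meaning("al"))    # Word doesn't exist in the Trie
```
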
#### Deleting data

To delete a word, you just need to set `ends_here` to `False` on the node where the word ends. Doing that doesn't alter the prefixes, but still removes the meaning and the existence of the word from the trie.

        def delete_word(self, word):
            if len(word) == 0:
                if self.ends_here:
                    self.ends_here = False
                    self.meaning = None
                    return "Deleted"
                else:
                    return "Word doesn't exist in the Trie"
            ch = word[0]
            index = ord(ch) - ord('a')
            if self.edges[index] is None:
                return "Word doesn't exist in the Trie"
            else:
                return self.edges[index].delete_word(word[1:])

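
The effect of this kind of deletion can be checked with a small self-contained sketch. The meanings are placeholders and `find_node` is a hypothetical helper added only to inspect the tree. After deleting "all", the longer word "also" that shares its prefix is still found, and the node for "all" is still allocated.

```python
ALPHABET_SIZE = 26
NOT_FOUND = "Word doesn't exist in the Trie"


class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        if not word:
            self.ends_here = True
            self.meaning = meaning
            return
        index = ord(word[0]) - ord('a')
        if self.edges[index] is None:
            self.edges[index] = TrieNode()
        self.edges[index].add_word(word[1:], meaning)

    def find_node(self, word):
        """Hypothetical helper: return the node spelling `word`, or None."""
        node = self
        for ch in word:
            node = node.edges[ord(ch) - ord('a')]
            if node is None:
                return None
        return node

    def delete_word(self, word):
        node = self.find_node(word)
        if node is None or not node.ends_here:
            return NOT_FOUND
        node.ends_here = False  # the node stays, only the marker is cleared
        node.meaning = None
        return "Deleted"


root = TrieNode()
root.add_word("all", "the whole amount")
root.add_word("also", "in addition")

first = root.delete_word("all")
second = root.delete_word("all")
print(first)   # Deleted
print(second)  # Word doesn't exist in the Trie
print(root.find_node("also").ends_here)   # True: "also" is untouched
print(root.find_node("all") is not None)  # True: the node is still there
```

Leaving the nodes in place keeps deletion simple. Reclaiming nodes that no longer lead to any word would need an extra pruning pass, which this article does not cover.
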
<a href='https://repl.it/CWbr' target='_blank' rel='nofollow'>Run Code</a>

## Resources

*   For further reading, you can try this <a href='https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/' target='_blank' rel='nofollow'>Topcoder</a> tutorial.
*   Also, a tutorial from <a href='http://www.geeksforgeeks.org/trie-insert-and-search/' target='_blank' rel='nofollow'>GeeksforGeeks</a>.