										
									 
								 
							 
							
								
							 
							
								 
							
							
---
title: Natural Language Processing
---
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
## Natural Language Processing (NLP)
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
According to Wikipedia, "Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence, concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data."

In simpler terms, NLP is the process by which computers make sense of natural language produced by humans.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
### Challenges in NLP

#### 1. Easy or mostly solved

* Spam detection
* Part-of-speech tagging
* Named entity recognition
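
Spam detection is usually placed in the "mostly solved" tier because even simple word-count models separate the two classes reasonably well. As a rough illustration, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing in plain Python. The training messages, the `train` and `classify` helper names, and the whitespace tokenization are all assumptions made for this example; they do not come from any particular library.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical data, for illustration only).
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch with the team tomorrow", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Log prior: fraction of training messages carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, label_counts = train(TRAINING)
print(classify("claim your free money", word_counts, label_counts))
print(classify("team meeting tomorrow", word_counts, label_counts))
```

A real filter would train on thousands of labeled messages and use a proper tokenizer, but the scoring logic would look much the same.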
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
#### 2. Intermediate or making good progress

* Sentiment analysis
* Coreference resolution
* Word sense disambiguation
* Parsing
* Machine translation
* Information extraction
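
Sentiment analysis is a useful example of why these tasks are harder: counting positive and negative words is easy, but context such as negation changes what those words mean. The sketch below is a toy lexicon-based scorer that flips the sign of the word immediately following a negator. The word lists and the `sentiment_score` name are hypothetical and chosen only for this example.

```python
import re

# Hypothetical mini-lexicon; real systems use far larger resources or learned models.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}
NEGATORS = {"not", "never", "no"}

def sentiment_score(text):
    """Sum +1/-1 per lexicon word, flipping the sign right after a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        value = (token in POSITIVE) - (token in NEGATIVE)
        score += -value if negate else value
        negate = False  # a negator only affects the next token
    return score

print(sentiment_score("The plot was great"))
print(sentiment_score("The plot was not good"))
```

This handles "not good" but would still misread sarcasm or a negator that is several words away from its target, which is part of why the task is not considered solved.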
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
#### 3. Hard or still needs a lot of work

* Text summarization
* Machine dialog systems
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
### Common Techniques

* Structure extraction
* Identify and mark sentence, phrase, and paragraph boundaries
* Language identification
* Tokenization
* Acronym normalization and tagging
* Lemmatization / Stemming
* Entity extraction
* Phrase extraction
* Text summarization
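
Three of the techniques above (sentence boundary detection, tokenization, and stemming) can be sketched with nothing more than regular expressions. The function names `split_sentences`, `tokenize`, and `crude_stem` are made up for this example, and the rules are deliberately simplistic; they only show the shape of each step.

```python
import re

def split_sentences(text):
    """Naive boundary detection: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Split into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def crude_stem(word):
    """Strip a few common English suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

text = "Computers parsed the documents. They are still learning!"
for sentence in split_sentences(text):
    tokens = tokenize(sentence)
    print(tokens, [crude_stem(t.lower()) for t in tokens])
```

Rules this simple break quickly: the splitter would cut a sentence after an abbreviation such as "Dr.", and the stemmer turns "parsed" into "pars". Handling such cases well is a large part of what dedicated NLP libraries provide.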
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
										
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
### Popular Libraries

* NLTK, one of the most widely used NLP libraries for Python.
* spaCy, an industrial-strength NLP library built for performance.
* Gensim, a library for topic modeling and document similarity analysis.
* TextBlob, a user-friendly and intuitive interface built on top of NLTK.
* CoreNLP, a Java-based NLP toolkit from the Stanford NLP Group.
* Polyglot, a natural language pipeline that supports massive multilingual applications.
* Pattern, used for web crawling, NLP tasks, and machine learning.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
## Installing the NLTK package
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
    pip install --upgrade pip
    pip install --upgrade nltk
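
Once the commands above have finished, a short script can confirm that the package is importable and show a first taste of what it does. The sketch below only uses `wordpunct_tokenize` and `PorterStemmer`, which should work without downloading any extra NLTK data; the `nltk_is_installed` helper and the sample sentence are assumptions made for this example.

```python
import importlib.util

def nltk_is_installed():
    """Return True if the nltk package can be found on the import path."""
    return importlib.util.find_spec("nltk") is not None

if nltk_is_installed():
    from nltk.stem import PorterStemmer
    from nltk.tokenize import wordpunct_tokenize

    # Regex-based tokenizer: splits text into word and punctuation tokens.
    tokens = wordpunct_tokenize("Computers are processing natural languages.")
    stemmer = PorterStemmer()
    # Reduce each token to its stem with the Porter algorithm.
    stems = [stemmer.stem(token) for token in tokens]
    print(tokens)
    print(stems)
else:
    print("NLTK not found; run the pip commands above first.")
```

Other parts of NLTK, such as its sentence tokenizer and corpora, need additional data that is fetched separately with `nltk.download()`.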
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
											 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
#### More Information
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
For further reading:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
- Click <a href="https://medium.com/@gon.esbuyo/get-started-with-nlp-part-i-d67ca26cc828" target='_blank' rel='nofollow'>here</a> for an introductory article on NLP.
- Click <a href="https://en.wikipedia.org/wiki/Natural_language_processing" target='_blank' rel='nofollow'>here</a> for the Wikipedia reference.