Theoretical aspects of Natural Language Processing

August 20, 2022

Theoretical aspects of Natural Language Processing, as a prelude to Python programming.

This course is an introduction to several basic theoretical aspects of natural language processing (NLP). Text mining will be discussed, and it will be shown how this technique relates to NLP. An introduction to NLP will explain why this field is crucial to our current technological world.

Three Python libraries that cover NLP will be discussed:

1. Natural language toolkit (NLTK)

2. spaCy

3. scikit-learn (sklearn)

NLTK has many functions that are relevant to NLP, including:

1. Processing text data

2. Removing frequently used words

3. Sentence tokenisation

4. Word tokenisation

5. Blank line tokenisation

6. Frequency distribution

7. Stop words

8. Unigrams, bigrams, trigrams, and n-grams

9. Stemming

10. Lemmatisation

11. Part of speech tagging

12. Named entity recognition

13. Chunking

14. Chinking
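
Several of the NLTK steps listed above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the sample text and the tiny STOP_WORDS set are made up for this sketch (NLTK's own stop word list requires a separate corpus download), and a regular-expression tokeniser is used so that no extra NLTK data is needed.

```python
from nltk.probability import FreqDist
from nltk.stem import PorterStemmer
from nltk.tokenize import BlanklineTokenizer, RegexpTokenizer
from nltk.util import ngrams

# Made-up sample text with one blank line between two paragraphs.
text = "The cats are chasing the mice.\n\nThe mice are running."

# Hypothetical, hand-picked stop words (an assumption for this sketch;
# NLTK's built-in list needs the "stopwords" corpus to be downloaded).
STOP_WORDS = {"the", "are"}

# Blank line tokenisation: split the text wherever a blank line occurs.
paragraphs = BlanklineTokenizer().tokenize(text)

# Word tokenisation with a simple regular expression, then lower-casing.
tokens = [t.lower() for t in RegexpTokenizer(r"\w+").tokenize(text)]

# Stop word removal.
content_tokens = [t for t in tokens if t not in STOP_WORDS]

# Frequency distribution of the remaining words.
freq = FreqDist(content_tokens)

# Bigrams (n-grams with n = 2) over the remaining words.
bigrams = list(ngrams(content_tokens, 2))

# Stemming with the Porter stemmer.
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in content_tokens]

print(paragraphs)
print(freq.most_common(1))
print(bigrams)
print(stems)
```

Lemmatisation, part of speech tagging, named entity recognition, chunking, and chinking are left out of this sketch because they depend on additional NLTK data packages.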

spaCy is a more recent NLP library than NLTK and has several features for this field, including:

1. Lemmatisation

2. Part of speech tagging

3. Named entity recognition

4. displaCy (spaCy's built-in visualiser)

5. Pattern matching
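
As a rough illustration of the pattern matching item above, the sketch below uses spaCy's rule-based Matcher. It assumes a blank English pipeline (tokeniser only), so no trained model has to be downloaded. The pattern name "NLP_PHRASE" and the sample sentence are made up for this example. Lemmatisation, part of speech tagging, and named entity recognition would need a trained pipeline instead.

```python
import spacy
from spacy.matcher import Matcher

# Blank English pipeline: tokenisation only, no trained components.
nlp = spacy.blank("en")

# The Matcher compares token attributes against a list of per-token rules.
matcher = Matcher(nlp.vocab)

# Match the three-token phrase regardless of capitalisation.
pattern = [
    {"LOWER": "natural"},
    {"LOWER": "language"},
    {"LOWER": "processing"},
]
matcher.add("NLP_PHRASE", [pattern])  # "NLP_PHRASE" is a hypothetical label

doc = nlp("Natural language processing is fun. I study natural language processing.")

# Each match is a (match_id, start, end) triple of token offsets.
matches = [doc[start:end].text for _, start, end in matcher(doc)]
print(matches)
```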

Machine learning, deep learning, and neural networks are crucial to NLP because they are used to make predictions from the text data that has been mined.

scikit-learn (sklearn) is a Python machine learning library, and it has several tools that are useful for NLP, namely:

1. CountVectorizer

2. TfidfTransformer

3. Cosine similarity

4. TfidfVectorizer

5. HashingVectorizer

6. DictVectorizer
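
To make the vectorisers above more concrete, here is a minimal sketch that turns three made-up sentences into word counts and TF-IDF vectors, then compares them with cosine similarity. The sentences are assumptions chosen for illustration, and default settings are used throughout.

```python
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)
from sklearn.metrics.pairwise import cosine_similarity

# Made-up documents: the first two are near-duplicates, the third is unrelated.
docs = [
    "the cat sat on the mat",
    "the cat lay on the mat",
    "stock prices fell sharply today",
]

# CountVectorizer: one row per document, one column per vocabulary word.
count_vec = CountVectorizer()
counts = count_vec.fit_transform(docs)
the_column = count_vec.vocabulary_["the"]

# TfidfTransformer re-weights raw counts; TfidfVectorizer does both steps at once.
tfidf_two_step = TfidfTransformer().fit_transform(counts)
tfidf = TfidfVectorizer().fit_transform(docs)

# Cosine similarity between every pair of documents.
sims = cosine_similarity(tfidf)

print(counts.shape)
print(sims.round(2))
```

With these documents, the first two rows should be far more similar to each other than either is to the third, which shares no words with them.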

Classifiers will be discussed because they are necessary to carry out sentiment analysis. Although there is a wide range of classifiers that can be used in NLP, the ones that will be discussed in this course are:

1. Sklearn’s LinearSVC

2. Naive Bayes
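
A toy sentiment analysis sketch with both classifiers might look like the following. It is only an illustration: the eight training reviews and their labels are invented, TF-IDF features are one possible choice of input, and MultinomialNB is assumed as the Naive Bayes variant because the course outline does not name one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented training data: four positive and four negative reviews.
train_texts = [
    "I loved this film, it was great",
    "great acting and a wonderful story",
    "what a wonderful, enjoyable movie",
    "enjoyable and great fun",
    "I hated this film, it was terrible",
    "terrible acting and a boring story",
    "what a boring, awful movie",
    "awful and terrible mess",
]
train_labels = ["pos"] * 4 + ["neg"] * 4

# Each pipeline converts text to TF-IDF vectors, then fits a classifier.
svm_model = make_pipeline(TfidfVectorizer(), LinearSVC())
nb_model = make_pipeline(TfidfVectorizer(), MultinomialNB())  # assumed variant

svm_model.fit(train_texts, train_labels)
nb_model.fit(train_texts, train_labels)

# Unseen reviews to classify.
new_reviews = ["a great and wonderful film", "a boring and terrible film"]
svm_pred = list(svm_model.predict(new_reviews))
nb_pred = list(nb_model.predict(new_reviews))

print(svm_pred)
print(nb_pred)
```

A real sentiment model would need far more training data and a held-out test set to measure accuracy.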
