# Supervised Learning – Traditional Methods


Supervised learning falls under the Model Building step of the CRISP-ML(Q) methodology. It is a form of predictive modeling that encompasses classification models, shallow machine learning models, ensemble models, regression models, and black-box techniques, each with numerous variants.

We thoroughly discuss probability, joint probability, Bayes' rule, and Naive Bayes using a use case. Naive Bayes is not ideal for data with many numeric features, because numeric features must first be converted into categorical ones through discretization (binning). The algorithm can handle missing values by simply omitting them from the probability calculations, and it assumes class-conditional independence of the features. A word never seen in the training data gets an estimated probability of zero, which drives the entire product of probabilities to zero. To counter this problem, we use the Laplace estimator, named after the French mathematician Pierre-Simon Laplace. Its default value is 1 (add-one smoothing), although any positive value can be used.
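To see why the Laplace estimator matters, here is a minimal sketch with made-up word counts (the class, words, and vocabulary size are illustrative, not from the course material):

```python
from collections import Counter

def smoothed_prob(word, counts, vocab_size, laplace=1):
    # P(word | class) with add-laplace smoothing:
    # (count + laplace) / (total + laplace * vocab_size)
    total = sum(counts.values())
    return (counts.get(word, 0) + laplace) / (total + laplace * vocab_size)

# Hypothetical word counts for a "spam" class (10 words total)
spam_counts = Counter({"free": 4, "win": 3, "money": 3})
vocab_size = 5  # assumed vocabulary size

# Without smoothing an unseen word would get probability 0/10 = 0,
# zeroing out the whole product; with laplace=1 it gets 1/15 instead.
p_unseen = smoothed_prob("meeting", spam_counts, vocab_size)
```

Raising the `laplace` value smooths the estimates more aggressively; 1 is the conventional default.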

The k-Nearest Neighbors (kNN) classifier is also called a lazy learner, memory-based reasoning, example-based reasoning, instance-based learning, case-based reasoning, rote learning, etc. We examine the differences between the k-means algorithm and kNN, then work through 1, 2, 3, and 7 nearest neighbors. The minimum value of k is 1 and the maximum equals the number of observations; k is a hyperparameter. We then look at what a baseline model is: its accuracy equals the proportion of the majority class, and a useful prediction model should beat it, with accuracy above 80% as a common rule of thumb. We further examine the bias-variance trade-off, and close with the applications and importance of kNN.
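The kNN idea described above can be sketched in a few lines; the toy two-cluster dataset below is illustrative, not from the course material:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    # train: list of (feature_vector, label) pairs.
    # Sort training points by Euclidean distance to the query,
    # then take a majority vote among the k nearest labels.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy dataset: two well-separated clusters
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]

label = knn_predict(train, (0.5, 0.5), k=3)  # query near the "a" cluster
```

Note the "lazy" aspect: there is no training step at all; every prediction scans the stored examples, which is why kNN is also called instance-based or memory-based learning.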

The Decision Tree algorithm is a rules-based algorithm. We cover what a decision tree is, how to build one, the greedy algorithm, building the best decision tree, and attribute selection. A decision tree is a tree-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class label. There are three types of nodes: a root node, branch nodes, and leaf nodes. So how do we build a decision tree? First, we use training data to build a model, and then the tree generator determines:

- which variable to split at each node, and the value of the split;
- whether to stop or split again;
- which label to assign to each terminal node.

The basic (greedy) algorithm constructs the tree in a top-down, recursive, divide-and-conquer manner. We then analyze the greedy approach, entropy, and information gain.
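The entropy and information-gain calculations used for attribute selection can be sketched as follows (a minimal illustration; the labels and split are hypothetical):

```python
import math
from collections import Counter

def entropy(labels):
    # H(S) = -sum_i p_i * log2(p_i), over the class proportions p_i
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, subsets):
    # Gain = H(parent) - weighted average entropy of the child subsets
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in subsets)

labels = ["yes", "yes", "no", "no"]
# A perfect split separates the classes entirely, so the gain
# equals the parent entropy (1 bit for a balanced binary problem).
gain = information_gain(labels, [["yes", "yes"], ["no", "no"]])
```

At each node, the greedy algorithm evaluates this gain for every candidate attribute and splits on the one with the highest value, then recurses on the resulting subsets.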
