Naive Bayes Classifier
  • A supervised learning model used for classification.

  • Probabilistic classification model.

  • The model uses Bayes' theorem to calculate probabilities.

  • The model classifies new data using the maximum a posteriori (posterior) decision rule.

Bayes Rule

P ( A | B ) = ( P ( B | A ) * P ( A ) ) / P ( B ) 

 

Where:

  • P ( A ∣ B) is the posterior probability: the probability of event A (e.g., Hire/Not Hire) given evidence B (the features like age, gender, etc.).

  • P ( B ∣ A ) is the likelihood: the probability of evidence B given that event A is true.

  • P( A ) is the prior probability: the initial probability of event A occurring.

  • P( B ) is the evidence: the total probability of the evidence (features), regardless of the class label.
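Plugging illustrative numbers into the rule (these numbers are made up for the example): suppose 20% of all emails are spam, the word "free" appears in 50% of spam emails, and in 5% of non-spam emails. Then:

P ( Spam | "free" ) = ( 0.5 * 0.2 ) / ( 0.5 * 0.2 + 0.05 * 0.8 ) = 0.10 / 0.14 ≈ 0.71

so an email containing "free" has roughly a 71% chance of being spam under these assumed numbers.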

What are prior and posterior probabilities?

Prior (a priori) Probability:

It represents what is believed before any new evidence is observed.

It is based only on the information already known (for example, the known probabilities for a coin, a die, or a deck of cards).

Posterior Probability:
The probability of an event that is calculated from the observed data (evidence) is the posterior probability.

How does Bayes classification work?

Training Data:
 

Assume the new data point below is taken for testing/prediction:

We need to predict whether this person should be hired or not.

The prediction is made by applying Bayes' rule to the available training data.

 

Y = Hire / Not Hire (Y = 1 means Hire, Y = 0 means Not Hire)

Y = 0 ( Not Hire )

Probability of Age = 26, given Y = 0 ( Not Hire ) *
Probability of Gender = Male, given Y = 0 ( Not Hire ) *
Probability of Occupation = Self, given Y = 0 ( Not Hire ) *
Probability of Skill = ML, given Y = 0 ( Not Hire ),
all multiplied by the prior probability P ( Y = 0 ).

Formula:

P ( Y = 0 | X ) ∝ P ( Y = 0 ) * P ( Age = 26 | Y = 0 ) * P ( Gender = Male | Y = 0 ) * P ( Occupation = Self | Y = 0 ) * P ( Skill = ML | Y = 0 )

Similarly, the classifier calculates the posterior score for the other class, where

Y = 1 ( Hire )

Probability of Age = 26, given Y = 1 ( Hire ) *
Probability of Gender = Male, given Y = 1 ( Hire ) *
Probability of Occupation = Self, given Y = 1 ( Hire ) *
Probability of Skill = ML, given Y = 1 ( Hire ),
all multiplied by the prior probability P ( Y = 1 ).

Formula:

P ( Y = 1 | X ) ∝ P ( Y = 1 ) * P ( Age = 26 | Y = 1 ) * P ( Gender = Male | Y = 1 ) * P ( Occupation = Self | Y = 1 ) * P ( Skill = ML | Y = 1 )

 

 

Decision:
If the score for Y = 0 is, for example, 0.75 and the score for Y = 1 is 0.25, then the new data point will be classified as '0' (Not Hire), since 0.75 > 0.25.

If the score for Y = 1 is, for example, 0.75 and the score for Y = 0 is 0.25, then the new data point will be classified as '1' (Hire), since 0.75 > 0.25.
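
The calculation above can be sketched in Python. This is a minimal illustration with a made-up training table, since the original training data is not reproduced here; the values in the table are assumptions.

import pandas as pd

# hypothetical training data, for illustration only
train = pd.DataFrame({
    'Age':        [26, 30, 26, 40, 26, 35],
    'Gender':     ['Male', 'Female', 'Male', 'Male', 'Female', 'Male'],
    'Occupation': ['Self', 'Corporate', 'Self', 'Self', 'Corporate', 'Self'],
    'Skill':      ['ML', 'Java', 'ML', 'ML', 'ML', 'Java'],
    'Hire':       [1, 0, 1, 0, 1, 0],   # 1 = Hire, 0 = Not Hire
})

new_point = {'Age': 26, 'Gender': 'Male', 'Occupation': 'Self', 'Skill': 'ML'}

def naive_bayes_score(df, label, point):
    subset = df[df['Hire'] == label]
    score = len(subset) / len(df)                    # prior P(Y = label)
    for feature, value in point.items():
        score *= (subset[feature] == value).mean()   # likelihood P(feature = value | Y = label)
    return score

score_0 = naive_bayes_score(train, 0, new_point)
score_1 = naive_bayes_score(train, 1, new_point)
print('Score for Y = 0 (Not Hire):', score_0)
print('Score for Y = 1 (Hire):', score_1)
print('Prediction:', 1 if score_1 > score_0 else 0)

In practice, a library implementation such as scikit-learn's CategoricalNB also applies smoothing, so that a feature value never seen for a class does not force the whole product to zero.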

Why is it called "Naive Bayes"?
It is called "naive" because the model assumes that all the features (age, gender, occupation, skill, etc.) are conditionally independent of each other given the class label, which is why their individual probabilities can simply be multiplied together.
Case Study

- We will be using the SMS Spam dataset

- First we will import all the libraries as below:
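The original import cell is not shown; a typical set of libraries for this case study (pandas for the data, string and NLTK for the text preprocessing) might look like this:

import pandas as pd
import string
import nltk
from nltk.corpus import stopwords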

- Load the data
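Assuming the dataset is stored as a tab-separated file (the file name below is an assumption), it can be loaded into a DataFrame with a label column and a text column:

# file name is an assumption for illustration
data = pd.read_csv('SMSSpamCollection', sep='\t', names=['type', 'text'])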

- Check the shape
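The shape attribute gives the number of rows and columns, matching the output below:

# (rows, columns)
data.shape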

(5574, 2)

Check the first 5 rows of the dataset
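For example:

data.head()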

Check the text of the first message
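The first message can be read off by its index, matching the text shown below:

data['text'][0]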

Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...

Describe the data
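A quick statistical summary of the dataset, for example:

data.describe()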

Group the data based on the type
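Grouping by the type column gives a per-class summary, for example:

data.groupby('type').describe()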

Add the text length column in the data set
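The length of each message can be computed with apply(len) and stored in a new column (the name text_length matches the column used below):

data['text_length'] = data['text'].apply(len)
data.head()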

Check for outliers

# stats description of the text_length column

data['text_length'].describe()
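The shortest and longest messages reported below can then be pulled out by their length; the original code is not shown, but a sketch would be:

# messages with the minimum and maximum length
shortest = data.loc[data['text_length'].idxmin(), 'text']
longest = data.loc[data['text_length'].idxmax(), 'text']
print('Shortest Message:', shortest)
print('Longest Message:', longest)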

Shortest Message: Ok
Longest Message: For me the love should start with attraction.i should feel that I need her every time around me.she should be the first thing which comes in my thoughts ...

Text Preprocessing

Remove Punctuations
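A simple approach strips every character found in Python's string.punctuation set; the sample sentence below is an assumption chosen so that it reduces to the output line that follows:

# hypothetical input string
sample = 'This is a sample message!!! to remove... punctuations.'
no_punct = ''.join(ch for ch in sample if ch not in string.punctuation)
print(no_punct)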

This is a sample message to remove punctuations

Remove Stop Words
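The stop word list comes from NLTK, so the stopwords corpus is downloaded first (this produces the log line below):

nltk.download('stopwords')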

[nltk_data] Downloading package stopwords to /root/nltk_data... [nltk_data] Package stopwords is already up-to-date!

List all the stop words in English
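The full English stop word list can then be printed with:

stopwords.words('english')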

['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", ...

Remove punctuations and stop words
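Both steps are usually wrapped into one helper that strips punctuation and then filters out stop words; this is a sketch, and the function name text_process is an assumption:

def text_process(text):
    # remove punctuation characters
    no_punct = ''.join(ch for ch in text if ch not in string.punctuation)
    # keep only the words that are not English stop words
    return [word for word in no_punct.split() if word.lower() not in stopwords.words('english')]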

- Apply the text processing function to the text in the text column
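For example, applied to the first few messages:

data['text'].apply(text_process).head()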
