Running Naive Bayes on the UCI Adult Data Set with R

Another simple and widely used supervised machine learning algorithm is Naive Bayes.

Naive Bayes assumes that all explanatory variables are independent of each other given the class. The assumption may seem naive, but it can still give good results at times.

Naive Bayes can be a little difficult to explain because it involves a small amount of math.
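The math comes down to multiplying a class prior by one conditional probability per variable, and then normalising. Here is a minimal sketch in base R. The toy data frame, its column names, and the `score` helper are made up for illustration and are not taken from the project code.

```r
# Hypothetical toy data: two categorical features and the income class.
toy <- data.frame(
  education = c("Bachelors", "HS-grad", "Bachelors", "HS-grad", "Bachelors", "HS-grad"),
  sex       = c("Male", "Male", "Female", "Female", "Male", "Female"),
  income    = c(">50K", "<=50K", ">50K", "<=50K", "<=50K", "<=50K"),
  stringsAsFactors = TRUE
)

# Score one class for one new observation:
# prior P(class) times the product of P(feature value | class).
# The product is where the independence assumption enters.
score <- function(cls, education, sex) {
  rows  <- toy[toy$income == cls, ]
  prior <- nrow(rows) / nrow(toy)
  prior * mean(rows$education == education) * mean(rows$sex == sex)
}

classes <- levels(toy$income)
scores  <- sapply(classes, score, education = "Bachelors", sex = "Male")

# Normalise so the scores sum to 1 and read as posterior probabilities.
posterior <- scores / sum(scores)
print(posterior)
```

On this toy data, a male with a Bachelors degree gets a posterior of 2/3 for ">50K" and 1/3 for "<=50K".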

In the case of the UCI Adult data set, we want to predict whether an individual has an income above or below 50K, which in R is simply a factor variable with two levels.
Naive Bayes works very well when all the explanatory variables are categorical. Numerical variables are harder to fit in: the naiveBayes function in e1071 treats them as normally distributed within each class, which may not hold for this data. Hence I converted the entire data set into categorical variables.
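One common way to turn a numerical column into a categorical one in R is to bin it with `cut()`. The sketch below is only an illustration: the toy values, the column names, and the break points are assumptions, and the actual conversion used in the project may differ.

```r
# Hypothetical toy data; the column names are assumed, not taken from the project.
adult <- data.frame(
  age            = c(25, 38, 52, 61, 19),
  hours_per_week = c(40, 50, 60, 20, 15)
)

# Bin age into groups. The breaks are illustrative choices;
# intervals are closed on the right, e.g. (30, 45].
adult$age <- cut(adult$age,
                 breaks = c(0, 30, 45, 60, Inf),
                 labels = c("<=30", "31-45", "46-60", ">60"))

# Bin weekly working hours the same way.
adult$hours_per_week <- cut(adult$hours_per_week,
                            breaks = c(0, 34, 40, Inf),
                            labels = c("part-time", "full-time", "overtime"))

# Both columns are now factors.
str(adult)
```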

To run the model, I used the e1071 package in R, which provides the naiveBayes function, and achieved an accuracy of about 80%.
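For readers who want to see the shape of such a run, here is a minimal sketch using `naiveBayes` and `predict` from e1071. It assumes the package is installed. The data is randomly generated as a stand-in for the discretised Adult data, and the column names and the 70/30 split are assumptions, so the accuracy it prints says nothing about the real data set.

```r
library(e1071)

# Hypothetical stand-in for the discretised Adult data (random labels).
set.seed(42)
n <- 200
adult <- data.frame(
  education = factor(sample(c("HS-grad", "Bachelors", "Masters"), n, replace = TRUE)),
  sex       = factor(sample(c("Male", "Female"), n, replace = TRUE)),
  income    = factor(sample(c("<=50K", ">50K"), n, replace = TRUE))
)

# Assumed 70/30 train/test split.
train_idx <- sample(seq_len(n), size = 0.7 * n)
train <- adult[train_idx, ]
test  <- adult[-train_idx, ]

# Fit the model on all predictors and predict the class for the test rows.
model <- naiveBayes(income ~ ., data = train)
pred  <- predict(model, newdata = test)

# Accuracy is the share of test rows on the diagonal of the confusion matrix.
confusion <- table(predicted = pred, actual = test$income)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

With the real data, the same steps apply once the income column and the predictors are factors.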

For the code and method, please visit my GitHub link below:

https://github.com/mmd52/UCI_ADULT_DATSET_PROJECT
