We saw that logistic regression was a poor model for our telecom churn analysis, which leaves us with the decision tree.
Again we have two data sets: the original data and the oversampled data. We run the decision tree model on both of them and compare the results.
Running the decision tree on the original data set yielded better results than running it on the oversampled data set.
|Over Sampled Data||0.8894||0.5274||0.5965||0.5862||0.7656|
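For readers who want to reproduce the oversampled data set, here is a minimal sketch of random oversampling in plain Python. It assumes simple duplication of minority-class rows with replacement; this post does not say which oversampling method was actually used, and the function name `random_oversample` and the toy data are made up for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Append random copies until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (assumption): 8 non-churners (0) and 2 churners (1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

If you try this yourself, oversample only the training split. Duplicating rows before the train/test split puts copies of the same customer on both sides and inflates the test scores.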
Unfortunately, the decision tree plot was too big to include in this post.
Since the decision tree gives the highest accuracy, we select it as the clear winner for our telecom churn analysis problem.
Another major advantage of a decision tree is that it can easily be shown graphically to the end business user to explain why a particular prediction is made.
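To make the explainability point concrete, here is a small sketch of how a tree's prediction can be turned into a readable chain of rules. The tree below is hand-written for illustration only: the feature names `customer_service_calls` and `monthly_charges`, the thresholds, and the `explain` helper are all hypothetical and are not taken from the model fitted in this post.

```python
# Hypothetical hand-written tree; features and thresholds are made up.
TREE = {
    "feature": "customer_service_calls", "threshold": 3,
    "left": {"leaf": "stay"},
    "right": {
        "feature": "monthly_charges", "threshold": 70.0,
        "left": {"leaf": "stay"},
        "right": {"leaf": "churn"},
    },
}


def explain(tree, customer):
    """Walk from the root to a leaf and return (prediction, rules followed)."""
    path = []
    node = tree
    while "leaf" not in node:
        name, thr = node["feature"], node["threshold"]
        if customer[name] <= thr:
            path.append(f"{name} <= {thr}")
            node = node["left"]
        else:
            path.append(f"{name} > {thr}")
            node = node["right"]
    return node["leaf"], path


prediction, rules = explain(
    TREE, {"customer_service_calls": 5, "monthly_charges": 82.5}
)
print(prediction, "because", " and ".join(rules))
```

Each prediction comes with the exact conditions that led to it, which is the kind of explanation a business user can follow without knowing how the model was trained.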
You can find the code for decision tree here->
This was a dummy database and may not have yielded the best results, but it is a good exercise for practice.