Logistic Regression

If you have followed my previous post, you will have seen a few common things we need to define before running any kind of model in TensorFlow:

  1. Number of iterations
  2. Learning rate
  3. Cost Function

Just like with simple linear regression, we first want to understand how logistic regression works in TensorFlow, so we will use a very simple data set: two independent variables and one dependent variable (1 or 0).
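For concreteness, here is a hypothetical toy version of such a data set; the values below are made up purely for illustration:

import numpy as np

# Two independent variables per row; the label is 0 or 1
toy_features = np.array([[13.0, 14.7],
                         [13.4, 13.8],
                         [22.1,  5.9]])   # made-up values
toy_labels = np.array([0, 0, 1])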

Now let's tackle one complication: some variables can take much larger values than others, and those large-valued features can dominate training. We handle this head on by normalising the entire data set.

Now let's look at the problem systematically and define a few helper functions to get everything up and running.

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

def read_dataset(filePath, delimiter=','):
    # Last column of the CSV is the label, the rest are features
    data = np.genfromtxt(filePath, delimiter=delimiter)
    features = np.array(data[:, 0:-1], dtype=float)
    labels = np.array(data[:, -1], dtype=int)
    return features, labels

def feature_normalize(features):
    # Standardize each feature to zero mean and unit variance
    mu = np.mean(features, axis=0)
    sigma = np.std(features, axis=0)
    return (features - mu) / sigma

def append_bias_reshape(features):
    # Prepend a column of ones so the bias term can live inside W
    n_training_samples, n_dim = features.shape
    features = np.reshape(np.c_[np.ones(n_training_samples), features],
                          [n_training_samples, n_dim + 1])
    return features

def one_hot_encode(labels):
    # Turn integer labels into one-hot rows, e.g. 1 -> [0, 1]
    n_labels = len(labels)
    n_unique_labels = len(np.unique(labels))
    one_hot = np.zeros((n_labels, n_unique_labels))
    one_hot[np.arange(n_labels), labels] = 1
    return one_hot

def plot_points(features, labels):
    # Scatter plot: class 0 as blue crosses, class 1 as red dots
    normal = np.where(labels == 0)
    outliers = np.where(labels == 1)
    plt.figure(figsize=(10, 8))
    plt.plot(features[normal, 0], features[normal, 1], 'bx')
    plt.plot(features[outliers, 0], features[outliers, 1], 'ro')
    plt.xlabel('Latency (ms)')
    plt.ylabel('Throughput (mb/s)')
    plt.show()
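With the helpers in place, wiring them together might look like the sketch below. The file name data.csv is just a placeholder for wherever your data lives; this step also defines the n_dim used in the TensorFlow graph further down.

features, raw_labels = read_dataset('data.csv')  # 'data.csv' is a placeholder path
plot_points(features, raw_labels)                # eyeball the two classes first
features = append_bias_reshape(feature_normalize(features))
labels = one_hot_encode(raw_labels)              # e.g. 1 -> [0, 1]
n_dim = features.shape[1]                        # feature count incl. the bias column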

Our basic framework to build a model is now set; next we define the learning rate, the number of iterations, and the cost function.

learning_rate = 0.00001
training_epochs = 100

X = tf.placeholder(tf.float32, [None, n_dim])  # input features (bias column included)
Y = tf.placeholder(tf.float32, [None, 2])      # one-hot labels
W = tf.Variable(tf.ones([n_dim, 2]))           # one weight column per class
init = tf.global_variables_initializer()       # initialize_all_variables() is deprecated

y_ = tf.nn.sigmoid(tf.matmul(X, W))
# Squared-error cost; the cross-entropy cost below is the more common alternative:
# tf.reduce_mean(tf.reduce_sum((-Y * tf.log(y_)) - ((1 - Y) * tf.log(1 - y_)), reduction_indices=[1]))
cost_function = tf.nn.l2_loss(y_ - Y, name="squared_error_cost")
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost_function)

After this, we just need to run the model, as sketched below.
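Running the model is just a session that repeatedly feeds the data to the optimizer. Here is a minimal sketch, assuming features and labels are the processed arrays from the pipeline above; the print interval and the final accuracy check are illustrative choices, not fixed requirements:

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        # One full-batch gradient descent step
        _, cost = sess.run([optimizer, cost_function],
                           feed_dict={X: features, Y: labels})
        if epoch % 10 == 0:
            print("epoch:", epoch, "cost:", cost)

    # Predicted class = column with the larger sigmoid output
    correct = tf.equal(tf.argmax(y_, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    print("training accuracy:", sess.run(accuracy, feed_dict={X: features, Y: labels}))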

You can view my complete code for simple logistic regression here, the data set for the above code here, and logistic regression using the Iris data set here.