Linear Regression with TensorFlow

This section provides a tutorial example on how to create a linear regression learning model with the TensorFlow Python API. An introduction to the basic concepts of linear regression is provided. The Python script by Nikhil Kumar is used as a test example.

From previous tutorials, we have learned how to create a tensor flow graph to represent a tensor expression of multiple connected tensor operations. Using a very simple tensor expression as an example, we have also learned how to create a TensorFlow session to evaluate the tensor flow graph and generate the output of the tensor expression.

In this tutorial, let's try to create a more realistic tensor flow graph to solve a machine learning problem using the linear regression model.

First, we need to refresh our memory on the linear regression model with the following concepts:

1. Task - Using a supervised learning technique to construct a learning model that approximates a real-world relation between a set of features and their related target. Some samples with feature sets and their known targets are provided to help train the model.

2. Features - Also called independent variables, predictor variables, or input values. Features of a single sample are usually represented as X = (x1, x2, ..., xn).

3. Target - Also called the dependent variable, or output variable. The target is usually represented as y.

4. Prediction - The output value generated from the learning model. The prediction can be represented as y'.

5. Linear regression model - A linear function that calculates the prediction y' from the features X. The linear regression model can be expressed as below using the vector dot product:

Linear regression model:
  y' = b0 + B·X

  y' = b0 + b1*x1 + b2*x2 + ... + bn*xn

6. Intercept, also called bias - A parameter in the linear regression model that moves the prediction up or down by a constant. Intercept is the first parameter, b0, in the above formula.

7. Coefficients, also called weights or scale factors - The parameters that multiply the features. Coefficients are B = (b1, b2, ..., bn) in the above formula.

9. Error, or Residual - The difference between the target and the prediction of a given single sample:

  e = y - y'

10. Loss function - A function that measures how far off the prediction y' is from the target y:

  l = L(y, y')

11. Squared error - Half of the error squared (to the power of 2) of a given single sample, (y - y')^2/2. Squared error is the commonly used loss function in linear regression models.

  l = e^2/2
  l = (y - y')^2/2

12. Cost function - A function of the model's parameters (b0, B) that measures how far off the model's predictions are on a given set of samples.

  c = C(b0, B) on (X1, X2, ..., Xm)

13. MSE (Mean Squared Error) - The mean value of squared errors on a given set of samples. MSE is a commonly used cost function in linear regression models. Because each squared error l is defined as e^2/2, the mean carries a factor of 2 in the denominator.

MSE as cost function:
  c = (l1 + l2 + ... + lm)/m
  c = (e1^2 + e2^2 + ... + em^2)/(2m)

15. Cost optimization - A process to find the model's parameters that result in the lowest cost on a given sample set. For a linear regression model, there are two types of cost optimization processes:

14. Gradient Descent - An iterative algorithm for cost optimization. Gradient descent updates the model's parameters (b0, B) in the steepest descending direction on the cost function surface. The descending distance is controlled by a factor called the learning rate.

Gradient descent is commonly used in linear regression models. The formulas for calculating parameter updates can be found in any linear regression textbook.

15. Learning rate - A factor used to scale down the update quantities on the model's parameters in a gradient descent step. A smaller learning rate like 0.01 lets us rerun the gradient descent step many times on the same training set and approach the lowest cost gradually, which avoids the overshooting problem.

16. Initialization - Providing initial values for the model's parameters, intercept and coefficients. Random values are usually used for initialization.

17. Training - Using a set of samples to train the model by using the gradient descent method to find the model's parameters that result in the lowest cost on the sample set.

18. Epoch - A cycle of training that uses each and every sample in the training set once. If a smaller learning rate is used, you need to run more epochs to reach the lowest cost.

19. Testing - Using a set of samples to test the model by evaluating the cost on the test set.
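The prediction, squared error, and MSE cost defined above can be sketched in plain Python. This is only an illustration: the function names (predict, squared_error, mse_cost), the sample values, and the parameter values are all hypothetical and do not come from the tutorial's script.

```python
# Hypothetical sample set: two features per sample, values chosen for illustration.
samples = [(1.0, 2.0), (2.0, 0.0), (3.0, 1.0)]
targets = [9.0, 5.0, 9.0]

def predict(b0, B, X):
    """Linear regression model: y' = b0 + b1*x1 + ... + bn*xn."""
    return b0 + sum(b * x for b, x in zip(B, X))

def squared_error(y, y_pred):
    """Loss of a single sample: l = (y - y')^2 / 2."""
    e = y - y_pred
    return e * e / 2

def mse_cost(b0, B, samples, targets):
    """Cost on a sample set: mean of the per-sample squared errors."""
    losses = [squared_error(y, predict(b0, B, X))
              for X, y in zip(samples, targets)]
    return sum(losses) / len(losses)

# Hypothetical parameter values: intercept b0 and coefficients B.
b0, B = 1.0, (2.0, 3.0)

print("Prediction for first sample:", predict(b0, B, samples[0]))
print("MSE cost on the sample set:", mse_cost(b0, B, samples, targets))
```

With these values, the first two samples are predicted exactly and the third is off by 1, so only the third sample contributes to the cost.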

Okay, enough on linear regression concepts. Let's use TensorFlow to build a linear regression model by following the example provided in "Introduction to TensorFlow" by Nikhil Kumar.

Here is the Python script provided by Nikhil Kumar with some updates.

#- Source:
#- Updates:
#-   Removed graphical plots
#-   Identified variables explicitly
import tensorflow as tf
import numpy as np

# Model Parameters
learning_rate = 0.01
training_epochs = 2000
display_step = 200

# Training Data
train_X = np.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
train_y = np.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
n_samples = train_X.shape[0]

# Test Data
test_X = np.asarray([6.83, 4.668, 8.9, 7.91, 5.7, 8.7, 3.1, 2.1])
test_y = np.asarray([1.84, 2.273, 3.2, 2.831, 2.92, 3.24, 1.35, 1.03])

# Set placeholders for feature and target vectors
X = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

# Set model weights and bias
W = tf.Variable(np.random.randn())
b = tf.Variable(np.random.randn())

# Construct a linear model
linear_model = W*X + b

# Mean squared error
cost = tf.reduce_sum(tf.square(linear_model - y)) / (2*n_samples)

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
minimize = optimizer.minimize(cost, var_list=(W, b))

# Initializing the variables
W_init = tf.variables_initializer([W])
b_init = tf.variables_initializer([b])

# Launch the graph
with tf.Session() as sess:
    # Load initialized variables in current session
    sess.run(W_init)
    sess.run(b_init)

    # Fit all training data
    for epoch in range(training_epochs):

        # Perform one gradient descent step on the full training set
        sess.run(minimize, feed_dict={X: train_X, y: train_y})

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, y: train_y})
            print("Epoch:{0:6} \t Cost:{1:10.4} \t W:{2:6.4} \t b:{3:6.4}".
                  format(epoch+1, c, sess.run(W), sess.run(b)))

    # Print final parameter values
    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, y: train_y})
    print("Final training cost:", training_cost, "W:", sess.run(W), "b:",
          sess.run(b), '\n')

    # Testing the model: same cost formula, evaluated on the test set
    testing_cost = sess.run(tf.reduce_sum(
                    tf.square(linear_model - y)) / (2 * test_X.shape[0]),
                    feed_dict={X: test_X, y: test_y})

    print("Final testing cost:", testing_cost)
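The minimize operation in the script hides the actual parameter updates. As a rough sketch of what one gradient descent step does for a single-feature model with the same cost formula, here is a plain Python version. The data is hypothetical (generated from y = 2*x + 1, not the tutorial's training set), the helper names cost and gradient_step are made up for illustration, and the parameters start at 0 instead of random values so that the run is repeatable.

```python
# Hypothetical data generated from y = 2*x + 1 (not the tutorial's training set).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def cost(W, b):
    """Same form as the script's cost: sum of squared errors / (2 * m)."""
    m = len(xs)
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_step(W, b, learning_rate):
    """Move W and b one step against the gradient of the cost."""
    m = len(xs)
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    grad_W = sum(e * x for e, x in zip(errors, xs)) / m  # d(cost)/dW
    grad_b = sum(errors) / m                             # d(cost)/db
    return W - learning_rate * grad_W, b - learning_rate * grad_b

# Fixed starting point (an assumption made here for repeatability).
W, b = 0.0, 0.0
for epoch in range(2000):
    W, b = gradient_step(W, b, learning_rate=0.05)

print("W:", W, "b:", b, "cost:", cost(W, b))
```

On this noise-free data, W and b should approach 2 and 1. The learning rate of 0.05 is also an assumption; a value that is too large for the data makes the updates overshoot and diverge.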

If you run the script, you should get something similar to the following:

herong$ python3

Epoch:   200    Cost:   0.08787    W:0.1923    b: 1.219
Epoch:   400    Cost:   0.08366    W:0.2051    b: 1.129
Epoch:   600    Cost:   0.08107    W:0.2151    b: 1.057
Epoch:   800    Cost:   0.07948    W:0.2230    b: 1.002
Epoch:  1000    Cost:   0.07850    W:0.2292    b: 0.9579
Epoch:  1200    Cost:   0.07789    W:0.2340    b: 0.9236
Epoch:  1400    Cost:   0.07752    W:0.2378    b: 0.8967
Epoch:  1600    Cost:   0.07729    W:0.2408    b: 0.8756
Epoch:  1800    Cost:   0.07715    W:0.2431    b: 0.859
Epoch:  2000    Cost:   0.07707    W:0.2450    b: 0.846
Optimization Finished!
Final training cost: 0.07706697 W: 0.24497162 b: 0.84604114

Final testing cost: 0.079794206
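As a sanity check, the testing cost can be recomputed by hand from the final W and b printed above and the test data in the script. The sketch below uses plain Python instead of TensorFlow. Small differences in the last digits are expected, since TensorFlow computes in 32-bit floats, and your own W and b will differ because the script starts from random values.

```python
# Test data copied from the script.
test_X = [6.83, 4.668, 8.9, 7.91, 5.7, 8.7, 3.1, 2.1]
test_y = [1.84, 2.273, 3.2, 2.831, 2.92, 3.24, 1.35, 1.03]

# Final parameter values copied from the sample output above.
W, b = 0.24497162, 0.84604114

# Same cost formula as the script: sum of squared errors / (2 * number of samples).
squared_errors = [(W * x + b - y) ** 2 for x, y in zip(test_X, test_y)]
testing_cost = sum(squared_errors) / (2 * len(test_X))

print("Recomputed testing cost:", round(testing_cost, 4))  # 0.0798
```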

Notes on Nikhil Kumar's sample script:
