TensorFlow is an open source software library created by Google that is used to implement machine learning and deep learning systems. These two names contain a series of powerful algorithms that share a common challenge: to allow a computer to learn how to automatically spot complex patterns and/or to make the best possible decisions.
TensorFlow, open sourced to the public by Google in November 2015, is the result of years of lessons learned from creating and using its predecessor, DistBelief. It was made to be flexible, efficient, extensible, and portable.
Computers of any shape and size can run it, from smartphones all the way up to huge computing clusters. It comes with lightweight software that can instantly productionize your trained model, effectively eliminating the need to reimplement models.
TensorFlow embraces the innovation and community engagement of open source, but has the support, guidance, and stability of a large corporation. Because of its multitude of strengths, TensorFlow is appropriate for individuals and businesses ranging from startups to companies as large as, well, Google.
Since its open source release in November 2015, TensorFlow has become one of the most exciting machine learning libraries available.
It is being used more and more in research, production, and education. The library has seen continual improvements, additions, and optimizations, and the TensorFlow community has grown dramatically.
Tensors are the standard way of representing data in deep learning. Simply put, tensors are just multidimensional arrays, an extension of two-dimensional tables (matrices) to data with higher dimensionality. In other words, a tensor is an n-dimensional matrix.
In general, you can think about tensors the same way you would matrices, if you are more comfortable with matrix math!
In this article, we will discuss the following TensorFlow basics:
- Sessions
- Computation Graphs
- Fetches
- Rank
- TensorFlow Operations
- Data Types
- Constants
- Tensor Shape
- Names
- Name Scopes
- Feed dictionary
- Variables
- Placeholders
Also, we will see how to install the TensorFlow library.
#Installing TensorFlow
There are several ways to install the TensorFlow library:
- Pip install: Install TensorFlow on your machine, possibly upgrading previously installed Python packages. May impact existing Python programs on your machine.
- Virtualenv install: Install TensorFlow in its own directory, not impacting any existing Python programs on your machine.
- Anaconda install: Install TensorFlow in its own environment for those running the Anaconda Python distribution. Does not impact existing Python programs on your machine.
- Docker install: Run TensorFlow in a Docker container isolated from all other programs on your machine.
- Installing from sources: Install TensorFlow by building a pip wheel that you then install using pip.
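TensorFlow can be installed using the following command:
pip install tensorflow
- Note: At the time of writing, TensorFlow supports Python v2.7 / v3.5. The examples in this article use the TensorFlow 1.x API (for example, tf.Session) and were run with version 1.4.0.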
#Running the codes (Using Terminal or Jupyter Notebook)
The code in this tutorial's Git repo can be run using the terminal command format
python <example>.py
or using Jupyter Notebook. Let's go through how to install Jupyter Notebook:
- Make sure the TensorFlow library is already installed on your system.
- Run the following command in your terminal:
pip install jupyter
- To start the notebook server, run the command:
jupyter notebook
To run an example in Jupyter Notebook:
- Clone the repo:
git clone https://github.com/philipszdavido/tensorflow-basics
- Move into the tensorflow-basics directory:
cd tensorflow-basics
- Start the Jupyter server:
jupyter notebook
- If the server does not automatically navigate to http://localhost:8888/tree#, open your favorite browser and navigate to the URL http://localhost:8888/tree#.
- You will see a list of the directory's files. Clicking on one of the files opens a separate tab.
- Click on the Cell menu, and select Run All Below to run the code.
For more information on how to use Jupyter Notebook, you can visit their docs.
Now, we are done with installing and setting up our TensorFlow environment. Let’s make a simple TensorFlow program that will combine the words “Hello” and “ World!” and display the phrase “Hello World!”.
This example introduces many of the core elements of TensorFlow and the ways in which it is different from a regular Python program.
First, we run a simple install and version check:
# inst_check.py
import tensorflow as tf
print(tf.__version__)
The script prints the installed TensorFlow version in the terminal. Run the following command to execute it:
python inst_check.py # displays 1.4.0, the version of TensorFlow on my system
If the installation is correct, the output will be the version of TensorFlow you have installed on your system. Version mismatches are the most probable cause of issues down the line.
We are done with verifying the TensorFlow version. Let’s implement the HelloWorld example. Below is the full code:
# helloworld.py
import tensorflow as tf
hello = tf.constant("Hello")
world = tf.constant(" World!")
helloworld = hello + world
with tf.Session() as sess:
    ans = sess.run(helloworld)
print(ans)
We assume you are familiar with Python and imports, in which case the first line:
import tensorflow as tf
imports the TensorFlow library and gives it an alias of tf.
Next, we define the constants “Hello” and “ World!”, and combine them:
import tensorflow as tf
hello = tf.constant("Hello")
world = tf.constant(" World!")
helloworld = hello + world
At this point, you might wonder how (if at all) this is different from the simple Python code for doing this:
phello = "Hello"
pworld = " World!"
phelloworld = phello + pworld
The key point here is what the variable helloworld (or phelloworld) contains in each case. We can check this using the print command. In the pure Python case we get this:
> print(phelloworld)
Hello World!
In the TensorFlow case, however, the output is completely different:
> print(helloworld)
Tensor("add:0", shape=(), dtype=string)
Probably not what you expected!
The TensorFlow line of code
helloworld = hello + world
does not compute the sum of hello and world, but rather adds the summation operation to a graph of computations to be done later.
Next, the Session object acts as an interface to the external TensorFlow computation mechanism, and allows us to run parts of the computation graph we have already defined. The line:
ans = sess.run(helloworld)
actually computes helloworld (as the sum of hello and world, the way it was defined previously), following which the printing of ans displays the expected “Hello World!” message.
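The difference between building a computation and running it can be mimicked in a few lines of plain Python. The sketch below is not how TensorFlow is implemented; the `Node` class and the `run` function are hypothetical names used only for illustration. Adding two nodes creates a third node that remembers its inputs, and nothing is computed until `run` is called.

```python
# A toy illustration of deferred execution (not TensorFlow code).
# `Node` and `run` are hypothetical names invented for this sketch.

class Node:
    def __init__(self, name, value=None, inputs=()):
        self.name = name        # label, similar in spirit to a tensor name
        self.value = value      # set only for constant nodes
        self.inputs = inputs    # nodes this node depends on

    def __add__(self, other):
        # Build a new node instead of computing a result right away.
        return Node("add", inputs=(self, other))

    def __repr__(self):
        return "Node({!r})".format(self.name)


def run(node):
    # Constants return their stored value; "add" nodes evaluate inputs first.
    if not node.inputs:
        return node.value
    left, right = (run(n) for n in node.inputs)
    return left + right


hello = Node("hello", value="Hello")
world = Node("world", value=" World!")
helloworld = hello + world

print(helloworld)       # a description of the node, not the string
print(run(helloworld))  # the actual computation happens here
```

Printing `helloworld` only describes the node, much like the `Tensor(...)` output shown earlier, while `run(helloworld)` plays the role of `sess.run`.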
This completes the first TensorFlow example. Run the following command to execute the script:
python helloworld.py
#Session
In the code above, you may have noticed the repeated use of Session. It is time we get to know what a Session is.
A Session object is the part of the TensorFlow API that communicates between Python objects and data on our end, and the actual computational system where memory is allocated for the objects we define, intermediate variables are stored, and finally results are fetched for us.
# session.py
import tensorflow as tf
f = tf.constant(5)
sess = tf.Session()
outs = sess.run(f)
print(outs)
The execution itself is then done with the .run() method of the Session object. When called, this method completes one set of computations in our graph in the following manner: it starts at the requested output(s) and then works backward, computing nodes that must be executed according to the set of dependencies.
Therefore, the part of the graph that will be computed depends on our output query.
In our example, we requested that node f be computed and got its value, 5, as output:
outs = sess.run(f)
When our computation task is completed, it is good practice to close the session using the sess.close() command, making sure the resources used by our session are freed up.
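To make the idea of working backward from the requested output more concrete, here is a minimal pure-Python sketch. It is only an analogy under stated assumptions: the dictionary layout of `graph` and the `evaluate` helper are invented for this illustration and are not part of TensorFlow. The point is that asking for one node computes only the nodes it depends on.

```python
# Toy sketch of "start at the requested output and work backward".
# The graph layout and the `evaluate` helper are assumptions for illustration.

graph = {
    "a": ("const", 3),
    "b": ("const", 4),
    "c": ("add", "a", "b"),      # c depends on a and b
    "d": ("mul", "a", "a"),      # d depends only on a
    "unused": ("mul", "c", "d"),
}

def evaluate(name, computed):
    """Compute `name`, recording every node visited in `computed`."""
    if name in computed:
        return computed[name]
    op, *args = graph[name]
    if op == "const":
        result = args[0]
    else:
        left, right = (evaluate(arg, computed) for arg in args)
        result = left + right if op == "add" else left * right
    computed[name] = result
    return result

visited = {}
print(evaluate("d", visited))   # 9
print(sorted(visited))          # ['a', 'd']
```

Nodes `b`, `c`, and `unused` are never touched when only `d` is requested, which mirrors how the part of the graph that gets computed depends on the output query.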
#Computation Graphs
TensorFlow allows us to implement machine learning algorithms by creating and computing operations that interact with one another. These interactions form what we call a computation graph.
A graph refers to a set of interconnected entities, commonly called nodes or vertices. These nodes are connected to each other via edges. In a dataflow graph, the edges allow data to “flow” from one node to another in a directed manner.
In TensorFlow, each of the graph’s nodes represents an operation, possibly applied to some input, and can generate an output that is passed on to other nodes.
By analogy, graph computation can be thought of as an assembly line where each machine (node) either gets or creates its raw material (input), processes it, and then passes the output to other machines in an orderly flow, producing sub components and eventually a final product when the assembly process comes to an end.
Operations in the graph include all kinds of functions, from simple arithmetic ones such as subtraction and multiplication to more complex ones.
Let’s take a look at a bare-bones example: the graph for basic addition. The function, represented by a circle, takes in two inputs, represented as arrows pointing into the function. It outputs the result of adding 1 and 4 together: 5, which is shown as an arrow pointing out of the function. The result could then be passed along to another function, or it might simply be returned to the client. We can also look at this graph as a simple equation:
1 + 4 = 5
The above illustrates how the two fundamental building blocks of graphs, nodes and edges, are used when constructing a computation graph. Let’s go over their properties:
- Nodes, drawn as circles, ovals, or boxes, represent some sort of computation or action being done on or with data in the graph’s context. In the above example, the operation add is the sole node.
- Edges are the actual values that get passed to and from Operations, and are typically drawn as arrows. In the add example, the inputs 1 and 4 are both edges leading into the node, while the output 5 is an edge leading out of the node.
Now, here’s a slightly more interesting example:
There’s a bit more going on in this graph! The data is traveling from left to right (as indicated by the direction of the arrows), so let’s break down the graph, starting from the left.
- At the very beginning, we can see two values flowing into the graph, 9 and 5.
- Each of these initial values is passed to one of two explicit input nodes, labeled a and b in the graphic. The input nodes simply pass on values given to them: node a receives the value 9 and outputs that same number to nodes c and d, while node b performs the same action with the value 5.
- Node c is a multiplication operation. It takes in the values 9 and 5 from nodes a and b, respectively, and outputs its result of 45 to node e. Meanwhile, node d performs addition with the same input values and passes the computed value of 14 along to node e.
- Finally, node e, the final node in our graph, is another add node. It receives the values of 45 and 14, adds them together, and spits out 59 as the final result of our graph.
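The arithmetic in this walkthrough can be checked with ordinary Python before any TensorFlow code is written. The snippet below is just a sanity check that follows the node labels a to e; it does not build a graph.

```python
# Plain-Python check of the walkthrough above (no TensorFlow involved).
a = 9            # input node a
b = 5            # input node b
c = a * b        # node c: multiplication
d = a + b        # node d: addition
e = c + d        # node e: final addition
print(c, d, e)   # 45 14 59
```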
#Building your first TensorFlow graph
In the last section, we became familiar with computation graphs. In this section, we are going to convert the graph model into TensorFlow code.
Here’s what it looks like:
# tensor_graph.py
import tensorflow as tf
a = tf.constant(9, name="input_a")
b = tf.constant(5, name="input_b")
c = tf.multiply(a,b, name="mul_c")
d = tf.add(a,b, name="add_d")
e = tf.add(c,d, name="add_e")
sess = tf.Session()
print("Tensor Graph Result: {}".format(sess.run(e)))
Let’s break this code down line by line. First, you’ll notice this import statement:
import tensorflow as tf
This, unsurprisingly, imports the TensorFlow library and gives it an alias of tf. This is by convention, as it’s much easier to type tf rather than tensorflow over and over as we use its various functions!
Next, let’s focus on our first two variable assignments:
a = tf.constant(9, name="input_a")
b = tf.constant(5, name="input_b")
Here, we’re defining our “input” nodes, a and b. These lines use our first TensorFlow Operation: tf.constant().
In TensorFlow, any computation node in the graph is called an Operation, or Op for short. Ops take in zero or more Tensor objects as input and output zero or more Tensor objects.
To create an Operation, you call its associated Python constructor. In this case, tf.constant() creates a “constant” Op. It takes in a single tensor value, and outputs that same value to nodes that are directly connected to it. For convenience, the function automatically converts the scalar numbers 9 and 5 into Tensor objects for us.
We also pass in an optional string name parameter, which we can use to give an identifier to the nodes we create.
c = tf.multiply(a,b, name="mul_c")
d = tf.add(a,b, name="add_d")
Here, we are defining the next two nodes in our graph, and they both use the nodes we defined previously.
Node c uses the tf.multiply Op, which takes in two inputs and outputs the result of multiplying them together.
Similarly, node d uses tf.add, an Operation that outputs the result of adding two inputs together. We again pass in a name to both of these Ops (it’s something you’ll be seeing a lot of).
Notice that we don’t have to define the edges of the graph separately from the node: when you create a node in TensorFlow, you include all of the inputs that the Operation needs to compute, and the software draws the connections for you.
e = tf.add(c,d, name="add_e")
This last line defines the final node in our graph. e uses tf.add in a similar fashion to node d. However, this time it takes nodes c and d as input, exactly as it’s described in the graph above. With that, our first, albeit small, graph has been fully defined! If you were to execute the above in a Python script or shell, it would run, but it wouldn’t actually do anything.
Remember, this is just the definition part of the process. To get a brief taste of what running a graph looks like, we could add the following two lines at the end to get our graph to output the final node:
sess = tf.Session()
print("Tensor Graph Result: {}".format(sess.run(e)))
To see this example in action, run the following command in your terminal:
python tensor_graph.py
You can also use Jupyter/iPython Notebook to run the tensorflow examples.
#Fetches
This is TensorFlow's mechanism for retrieving tensors from a graph launched in a session. You retrieve fetches when you trigger the execution of a graph, not when you build the graph. To fetch the tensor value of a node or nodes, execute the graph with a run() call on the Session object and pass a list of names of nodes to retrieve.
We can also ask sess.run() for multiple nodes’ outputs simply by inputting a list of requested nodes:
# fetches.py
import tensorflow as tf
a = tf.add(7,9)
b = tf.multiply(a,9)
with tf.Session() as sess:
    fetches = [a,b]
    outs = sess.run(fetches)
print("outs = {}".format(outs))
print(type(outs[0]))
We get back a list containing the outputs of the nodes according to how they were ordered in the input list.
#Rank
Rank is defined as the number of dimensions a tensor has. Tensor shape represents the size of each dimension.
- A tensor with rank 0 is a zero dimensional array, that is, a single scalar value.
v = 7
sess = tf.Session()
print(sess.run(tf.rank(v))) # prints 0
- A tensor with rank 1 is a one dimensional array.
v = [4, 3, 4, 5]
sess = tf.Session()
print(sess.run(tf.rank(v))) # prints 1
- A tensor with rank 2 is a two dimensional array.
b = [ [2, 1, 7], [7, 9, 8] ]
sess = tf.Session()
print(sess.run(tf.rank(b))) # prints 2
We can keep adding dimensions in the same way to get 3-D, 4-D, ..., n-D tensors.
# rank.py
import tensorflow as tf
# tensor object
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [2, 3, 4]]
# initialize session here
sess = tf.Session()
print("Tensor Object: {}".format(t))
print("Rank: {}".format(sess.run(tf.rank(t)))) # prints the rank 2
print("Shape: {}".format(sess.run(tf.shape(t)))) # prints the shape [4, 3]
print("Size: {}".format(sess.run(tf.size(t)))) # prints the size 12
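For intuition, rank, shape, and size can also be worked out by hand for a nested Python list. The following sketch is an illustration only: `shape_of` is a hypothetical helper, it assumes the nesting is regular (every row has the same length), and it is not how TensorFlow computes these values.

```python
# Pure-Python sketch of rank, shape and size for nested lists.
# `shape_of` is a hypothetical helper; it assumes the nesting is regular.

def shape_of(obj):
    """Return the length of each dimension of a regularly nested list."""
    shape = []
    while isinstance(obj, list):
        shape.append(len(obj))
        if not obj:          # nothing deeper to inspect in an empty list
            break
        obj = obj[0]
    return shape

t = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [2, 3, 4]]
shape = shape_of(t)
rank = len(shape)            # rank is the number of dimensions
size = 1
for dim in shape:            # size is the product of all dimension lengths
    size *= dim

print(shape, rank, size)     # [4, 3] 2 12
print(shape_of(7))           # [] -> a scalar has rank 0
```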
#TensorFlow Operations
TensorFlow Operations are nodes that perform computations on or with Tensor objects.
- In TensorFlow, Constants, Variables, and Operations are collectively called Ops. Ops are more than just mathematical computations, and are used for tasks such as initializing state.
Nodes in a TensorFlow graph are called Ops (short for operations). An Op takes zero or more Tensors, performs some computation, and produces zero or more Tensors.
#Data Types
Tensors have a data type. The basic units of data that pass through a graph are numerical, Boolean, or string elements. If we don’t specify the type of data when creating a Tensor object, TensorFlow infers it automatically.
For example, 5 is regarded as an integer, while anything with a decimal point, like 5.1, is regarded as a floating-point number.
We can explicitly choose what data type we want to work with by specifying it when we create the Tensor object. We can see what type of data was set for a given Tensor object by using the attribute dtype:
# data_types.py
import tensorflow as tf
c = tf.constant(5.0, dtype=tf.float64)
print(c)
print(c.dtype)
#Constants
In TensorFlow, constants are created using the function constant, which has the signature
constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
where value is an actual constant value which will be used in further computation, dtype is the data type parameter (e.g., float32/64, int8/16, etc.), shape is optional dimensions, name is an optional name for the tensor, and the last parameter is a boolean which indicates verification of the shape of values.
If you need constants with specific values inside your training model, then the constant object can be used as in the following example:
z = tf.constant(5.2, name="x", dtype=tf.float32)
#Tensor shape
The shape of a tensor describes both the number of dimensions in the tensor and the length of each dimension. TensorFlow automatically infers shapes during graph construction.
Tensor shapes can either be Python lists or tuples containing an ordered set of integers: there are as many numbers in the list as there are dimensions, and each number describes the length of its corresponding dimension. For example, the list [3, 4] describes the shape of a 2-D tensor of length 3 in its first dimension and length 4 in its second dimension. Note that either tuples (()) or lists ([]) can be used to define shapes.
Let’s take a look at more examples to illustrate this further:
# tensor_shapes.py
import tensorflow as tf
# Shapes that specify a 0-D Tensor (scalar)
# e.g. any single number: 7, 1, 3, 4, etc.
s_0_list = []
s_0_tuple = ()
list_0_D = tf.random_uniform(s_0_list, 1, 4)
tuple_0_D = tf.random_uniform(s_0_tuple, 1, 5)
list_0_D_var = tf.Variable(list_0_D, name='list_0')
tuple_0_D_var = tf.Variable(tuple_0_D, name='tuple_0')
# Shape that describes a vector of length 3
# e.g. [1, 2, 3]
s_1 = [3]
s_1_val = tf.random_uniform(s_1, 1, 4)
s_1_var = tf.Variable(s_1_val, name='s_1')
# Shape that describes a 3-by-2 matrix
# e.g [[1 ,2],
# [3, 4],
# [5, 6]]
s_2 = (3, 2)
s_2_val = tf.random_uniform(s_2, 1, 4)
s_2_var = tf.Variable(s_2_val, name='s_2')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print("\nList 0-D Tensor: \n{}".format(sess.run(list_0_D_var)))
    print("\nTuple 0-D Tensor: \n{}".format(sess.run(tuple_0_D_var)))
    print("\n1-D Tensor: \n{}".format(sess.run(s_1_var)))
    print("\nMatrix 3-by-2: \n{}".format(sess.run(s_2_var)))
We can assign a flexible length by passing in None as a dimension’s value. Passing None as the whole shape will tell TensorFlow to allow a tensor of any shape, that is, a tensor with any number of dimensions and any length for each dimension:
# Shape for a vector of any length:
t_1 = [None]
# Shape that could be any Tensor
t_any = None
# Shape for a matrix that is any amount of rows tall, and 4 columns wide:
t_2 = (None, 4)
# Shape of a 3-D Tensor with length 3 in its first dimension, and variable-
# length in its second and third dimensions:
t_3 = [3, None, None]
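One way to read these specifications is as patterns that a concrete shape either fits or does not fit. The sketch below illustrates that reading in plain Python; `matches` is a hypothetical helper written for this article and is not a TensorFlow function.

```python
# Sketch of how a shape specification with None entries could be checked.
# `matches` is a hypothetical helper, not a TensorFlow function.

def matches(shape, spec):
    """Return True if a concrete `shape` is allowed by `spec`."""
    if spec is None:                 # None as a whole: any shape is accepted
        return True
    if len(shape) != len(spec):      # the number of dimensions must agree
        return False
    # A None entry accepts any length in that dimension.
    return all(s is None or s == d for d, s in zip(shape, spec))

print(matches((7,), [None]))                 # True
print(matches((10, 4), (None, 4)))           # True
print(matches((10, 5), (None, 4)))           # False
print(matches((3, 2, 8), [3, None, None]))   # True
print(matches((2, 2), None))                 # True
```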
The tf.shape Op can be used to find the shape of a tensor if you need to in your graph. It simply takes in the Tensor object you’d like to find the shape for, and returns it as an int32 vector:
import tensorflow as tf
# create 3-by-2 tensor
s_2 = (3, 2)
s_2_val = tf.random_uniform(s_2, 1, 4)
# initialize and run session
sess = tf.Session()
# Find the shape of the mystery
print(sess.run(tf.shape(s_2_val, name="mystery_shape"))) # prints [3, 2]
Tensors are just a superset of matrices! tf.shape, like any other Operation, doesn’t run until it is executed inside of a Session.
#Names
Tensor objects can be identified by a name. This name is an intrinsic string name. As with dtype, we can use the .name attribute to see the name of the object:
# names.py
import tensorflow as tf
with tf.Graph().as_default():
    five1 = tf.constant(5, dtype=tf.float64, name='five')
    five2 = tf.constant(5, dtype=tf.int32, name='five')
    six = tf.constant(6, dtype=tf.int32, name='six')
print(five1.name)
print(five2.name)
print(six.name)
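The .name attribute is used to print the name of each Tensor object in the example above.
The name of the Tensor object is simply the name of its corresponding operation (“five”, concatenated with a colon), followed by the index of that tensor in the outputs of the operation that produced it, since an operation can have more than one output. Because two objects in the same graph cannot share a name, TensorFlow automatically appends an underscore and a number to the second “five”, so the script prints five:0, five_1:0, and six:0.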
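When two operations in the same graph ask for the same name, TensorFlow 1.x keeps the names unique by appending an underscore and a counter. The following pure-Python sketch imitates that naming pattern; `unique_name` and the `used` dictionary are hypothetical and only approximate the behaviour, since the real logic lives inside TensorFlow's graph object.

```python
# Sketch of unique name generation, mimicking the naming pattern above.
# `unique_name` and `used` are hypothetical names for this illustration.

used = {}

def unique_name(name):
    """Return `name` the first time, then `name_1`, `name_2`, ..."""
    count = used.get(name, 0)
    used[name] = count + 1
    return name if count == 0 else "{}_{}".format(name, count)

for requested in ["five", "five", "six"]:
    # ":0" marks the first output of the operation
    print(unique_name(requested) + ":0")
```

This prints `five:0`, `five_1:0`, and `six:0`, the same pattern the names.py script produces.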
#Name scopes
In TensorFlow, the nodes of a large, complex graph can be grouped together to make the graph easier to manage. Nodes can be grouped by name. This is done by using the tf.name_scope("prefix") Op together with the useful with clause.
# name_scopes.py
import tensorflow as tf
with tf.Graph().as_default():
    ns1 = tf.constant(9,dtype=tf.float64,name='ns')
    with tf.name_scope("prefix_name"):
        ns2 = tf.constant(9,dtype=tf.int32,name='ns')
        ns3 = tf.constant(9,dtype=tf.float64,name='ns')
print(ns1.name)
print(ns2.name)
print(ns3.name)
In this example we’ve grouped objects contained in variables ns2 and ns3 under the scope prefix_name, which shows up as a prefix in their names.
Prefixes are especially useful when we would like to divide a graph into subgraphs with some semantic meaning.
#Feed dictionary
A feed is used to temporarily replace the output of an operation with a tensor value. The parameter feed_dict is used to override Tensor values in the graph, and it expects a Python dictionary object as input. The keys in the dictionary are handles to Tensor objects that should be overridden, while the values can be numbers, strings, lists, or NumPy arrays. feed_dict is also useful for specifying input values.
- Note: The values must be of the same type as the Tensor key.
Let’s show how we can use feed_dict to overwrite the value of a:
# feed_dict.py
import tensorflow as tf
# Create Operations, Tensors, etc (using the default graph)
a = tf.add(5, 9)
b = tf.multiply(a, 7)
# Start up a `Session` using the default graph
sess = tf.Session()
# Define a dictionary that says to replace the value of `a` with 45
replace_dict = {a: 45}
# Run the session, passing in `replace_dict` as the value to `feed_dict`
print(sess.run(b, feed_dict=replace_dict)) # returns 315
# Close the graph, release its resources
sess.close()
Even though a would normally evaluate to 14, the dictionary we passed into feed_dict replaced that value with 45.
feed_dict can be extremely useful in a number of situations. Because the value of a tensor is provided up front, the graph no longer needs to compute any of the tensor’s normal dependencies.
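The effect of feeding a value can be imitated in plain Python. In the sketch below, which is only an analogy and not TensorFlow's implementation, `graph`, `run`, and `calls` are hypothetical names; `calls` records which nodes were actually computed, so we can see that a fed node is skipped.

```python
# Toy sketch of the feed mechanism (not TensorFlow code).
# `graph`, `run` and `calls` are hypothetical names for this illustration.

graph = {
    "a": ("add", 5, 9),
    "b": ("mul", "a", 7),
}
calls = []  # records which nodes were actually computed

def run(name, feed=None):
    feed = feed or {}
    if name in feed:             # a fed value short-circuits the computation
        return feed[name]
    op, left, right = graph[name]
    # String arguments refer to other nodes; numbers are used directly.
    left = run(left, feed) if isinstance(left, str) else left
    right = run(right, feed) if isinstance(right, str) else right
    calls.append(name)
    return left + right if op == "add" else left * right

print(run("b"))                   # 98, since a evaluates to 14
print(calls)                      # ['a', 'b']
calls.clear()
print(run("b", feed={"a": 45}))   # 315
print(calls)                      # ['b']: a was never computed
```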
#Variables
TensorFlow uses special objects called Variables. Unlike other Tensor objects, which are “refilled” with data each time we run a session, Variables can maintain a fixed state in a graph.
Variables, like other Tensors, can be used as input for other operations in the graph.
Using Variables is done in two stages.
- First, the tf.Variable() function is called in order to create a Variable and define what value it will be initialized with.
- Then, an initialization operation is performed by running the session with the tf.global_variables_initializer() method, which allocates the memory for the Variable and sets its initial values.
Like other Tensor objects, Variables are computed only when the model runs, as we can see in the following example:
# variable.py
import tensorflow as tf
init_val = tf.random_normal((2,4),1,4)
var = tf.Variable(init_val, name='var')
print("pre run: \n{}".format(var))
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    post_var = sess.run(var)
print("\npost run: \n{}".format(post_var))
We see that the variable var is not initialized until sess.run(init) is called. As we said earlier, the values of a Variable in TensorFlow are not computed until the Session runs.
- Note: To reuse the same variable, we can use the tf.get_variable() function instead of tf.Variable().
#Placeholders
Placeholders are structures designated by TensorFlow for feeding input values. They can also be thought of as empty Variables that will be filled with data later on. We use them by first constructing our graph and feeding them with input data only when the graph is executed.
Placeholders have an optional shape argument. If a shape is not fed or is passed as None, then the placeholder can be fed with data of any size:
ph = tf.placeholder(tf.float32,shape=(None,10))
- Whenever a placeholder is defined, it must be fed with some input values when the graph is run, or else an exception will be thrown.
# placeholder.py
import tensorflow as tf
x = tf.placeholder("float", None)
y = x * 5
with tf.Session() as sess:
    ans = sess.run(y, feed_dict={x: [2, 4, 6]})
    print(ans)
First, we import tensorflow as normal. Then we create a placeholder called x, i.e. a place in memory where we will store a value later on.
Then, we create a Tensor called y, which is the operation of multiplying x by 5. Note that we haven’t defined any initial values for x yet.
We now have an operation (y) defined, and can now run it in a session. We create a session object, and then run just the y variable. Note that this means that if we defined a much larger graph of operations, we could run just a small segment of the graph. This subgraph evaluation is actually a big selling point of TensorFlow, and one that isn’t present in many other libraries that do similar things.
Running y requires knowledge about the values of x. We define these inside the feed_dict argument to run. We state here that the values of x are [2, 4, 6]. We run y, giving us the result of [10, 20, 30].
#Training to predict
Now that we have gone through the basics of TensorFlow, we turn to optimization. We first describe the basics of training a model, giving a short description of each component in the process, and show how it is performed in TensorFlow. We then demonstrate a full working example of an optimization process of a simple regression model.
In this article we will focus on supervised learning problems, where we train an inference model with an input dataset, along with the real or expected output for each example. The model will cover a dataset and then be able to predict the output for new inputs that don’t exist in the original training dataset.
Linear regression is the simplest form of modeling for a supervised learning problem. Given a set of data points as training data, we are going to find the linear function that best fits them. In a 2-dimensional dataset, this type of function represents a straight line.
X value | Y value |
---|---|
1 | 1 |
2 | 3 |
4 | 3 |
3 | 2 |
5 | 5 |
The attribute x is the input variable and y is the output variable that we are trying to predict. If we got more data, we would only have x values and we would be interested in predicting y values.
Below is a simple scatter plot of x versus y:
We can see the relationship between x and y looks kind of linear. As in, we could probably draw a line somewhere diagonally from the bottom left of the plot to the top right to generally describe the relationship between the data.
This is a good indication that using linear regression might be appropriate for this little dataset. When we have a single input attribute (x) and we want to use linear regression, this is called simple linear regression.
If we had multiple input attributes (e.g. x1, x2, x3, etc.), this would be called multiple linear regression. The procedure for simple linear regression is different and simpler than that for multiple linear regression, so it is a good place to start.
In this section we are going to create a simple linear regression model from our training data, then make predictions for our training data to get an idea of how well the model learned the relationship in the data.
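For a dataset this small, the best-fitting line can also be computed directly with the standard least-squares formulas, which gives a reference point for any trained model. The sketch below uses the five (x, y) pairs from the table; the names `b0` (intercept) and `b1` (slope) are choices made for this illustration, and the model assumed is y = b0 + b1 * x.

```python
# Closed-form simple linear regression on the table above (pure Python).
# The names b0 (intercept) and b1 (slope) are choices made for this sketch.

xs = [1, 2, 4, 3, 5]
ys = [1, 3, 3, 2, 5]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# slope = covariance(x, y) / variance(x)
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
variance = sum((x - mean_x) ** 2 for x in xs)
b1 = covariance / variance
b0 = mean_y - b1 * mean_x

predictions = [b0 + b1 * x for x in xs]
print("slope: {:.2f}, intercept: {:.2f}".format(b1, b0))
print([round(p, 2) for p in predictions])
```

On this data the slope comes out at about 0.8 and the intercept at about 0.4, so the fitted line rises from the bottom left to the top right of the plot.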
The general formula of a linear function is:
y = w_1*x_1 + w_2*x_2 + ... + w_k*x_k + b
As our X values are one dimensional, the general formula now becomes:
y = W*x + b
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W*x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to incorrect defaults
for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
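To see what the optimizer is doing at each step, the same training loop can be written out by hand in plain Python. This is a sketch under stated assumptions: it reuses the data, starting values, learning rate, and step count from the script above, and it replaces the optimizer with gradients derived manually from the sum-of-squares loss. It is an illustration of gradient descent, not a reproduction of TensorFlow's internals.

```python
# Pure-Python gradient descent mirroring the TensorFlow script above.
# Same data, starting values, learning rate and step count; the gradients
# are written out by hand instead of being derived by an optimizer.

x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

W, b = 0.3, -0.3
learning_rate = 0.01

for _ in range(1000):
    errors = [W * x + b - y for x, y in zip(x_train, y_train)]
    # loss = sum(error ** 2), so d(loss)/dW = 2 * sum(error * x)
    grad_W = 2 * sum(e * x for e, x in zip(errors, x_train))
    # and d(loss)/db = 2 * sum(error)
    grad_b = 2 * sum(errors)
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

loss = sum((W * x + b - y) ** 2 for x, y in zip(x_train, y_train))
print("W: {:.4f} b: {:.4f} loss: {:.2e}".format(W, b, loss))
```

The training data lies exactly on the line y = -x + 1, so W should approach -1 and b should approach 1 while the loss shrinks toward zero.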
#Conclusion
TensorFlow is a powerful framework that makes working with mathematical expressions and multi-dimensional arrays a breeze, something fundamentally necessary in machine learning.
We have covered the basics of TensorFlow. I think this will get us started on our journey into the TensorFlow realm. In my subsequent tutorials, we will see how to leverage the TensorFlow library to solve optimization problems using the Gradient Descent model. We will also train a model to solve the XOR problem using Linear Regression and Logistic regression.