Created at 2017-07-08 Updated at 2018-12-03 Category TensorFlow Tag TensorFlow

TensorFlow is such a powerful tool that you can easily compute any value you need. Today I am going to introduce a function called `tf.gradients()`, with which you can compute gradients. Let’s go.

Before using it, let’s have a look at the docs:

```python
tf.gradients(
    ys,
    xs,
    grad_ys=None,
    name='gradients',
    colocate_gradients_with_ops=False,
    gate_gradients=False,
    aggregation_method=None
)
```

And its description:

Constructs symbolic partial derivatives of the sum of `ys` w.r.t. each `x` in `xs`.

`ys` and `xs` are each a `Tensor` or a list of tensors. `grad_ys` is a list of `Tensor`, holding the gradients received by the `ys`. The list must be the same length as `ys`.

`gradients()` adds ops to the graph to output the partial derivatives of `ys` with respect to `xs`. It returns a list of `Tensor` of length `len(xs)`, where each tensor is the `sum(dy/dx)` for `y` in `ys`.
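
A minimal sketch of this summing behaviour (the values and names are my own; written for the TF 1.x graph API, run here through `tf.compat.v1` so it also works on TF 2):

```python
import tensorflow.compat.v1 as tf  # TF 1.x graph API (shim on TF 2)
tf.disable_v2_behavior()

x = tf.constant(3.0)
y1 = x * x       # dy1/dx = 2x = 6 at x = 3
y2 = 2.0 * x     # dy2/dx = 2

# tf.gradients returns the gradient of sum(ys) w.r.t. each x in xs,
# so here it yields dy1/dx + dy2/dx = 8
grads = tf.gradients([y1, y2], x)

with tf.Session() as sess:
    result = sess.run(grads[0])
print(result)  # 8.0
```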

`grad_ys` is a list of tensors of the same length as `ys` that holds the initial gradients for each `y` in `ys`. When `grad_ys` is `None`, we fill in a tensor of 1s of the shape of `y` for each `y` in `ys`. A user can provide their own initial `grad_ys` to compute the derivatives using a different initial gradient for each `y` (e.g., if one wanted to weight the gradient differently for each value in each `y`).

Usually, we calculate gradients in order to update the parameters, and in practice one can detect whether gradients are vanishing or exploding by summarising them in TensorBoard — a visualization tool developed by the TensorFlow team.
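
One way to do that is to attach a `tf.summary.histogram` to each gradient tensor (the tiny model below is a made-up example for this sketch, TF 1.x API):

```python
import tensorflow.compat.v1 as tf  # TF 1.x graph API (shim on TF 2)
tf.disable_v2_behavior()

W = tf.Variable(2.0, name="W")
b = tf.Variable(0.5, name="b")
x = tf.constant(1.0)
loss = 0.5 * tf.square(W * x + b - 3.0)

grads = tf.gradients(loss, [W, b])
for g, v in zip(grads, [W, b]):
    # histogram summaries let TensorBoard show whether gradients
    # are shrinking toward 0 (vanishing) or blowing up (exploding)
    tf.summary.histogram(v.op.name + "/gradient", g)

merged = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # normally you would also write `summary` with a tf.summary.FileWriter
    summary, gvals = sess.run([merged, grads])
print(gvals)
```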

Here is a simple example to clarify its usage.

Suppose you have a simple linear function
$$\widehat Y = W \times X + b$$
We want to fit this linear function so that we can predict unseen data, and we normally use a quadratic cost function:

$$cost = \frac{1}{2} \times (\widehat Y-Y)^2$$
We can get our fit by minimising the cost below some threshold, for example 0.01. Here is the code to calculate the gradients.
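
A minimal sketch of that code (the sample point `X = 2, Y = 5` and the initial values of `W` and `b` are my own choices; TF 1.x graph API):

```python
import tensorflow.compat.v1 as tf  # TF 1.x graph API (shim on TF 2)
tf.disable_v2_behavior()

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W = tf.Variable(1.0)
b = tf.Variable(0.0)

Y_hat = W * X + b
cost = 0.5 * tf.square(Y_hat - Y)

# partial derivatives of cost w.r.t. W and b
grad_W, grad_b = tf.gradients(cost, [W, b])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    gW, gb = sess.run([grad_W, grad_b], feed_dict={X: 2.0, Y: 5.0})
print(gW, gb)  # (W*X + b - Y)*X = -6.0, (W*X + b - Y) = -3.0
```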

If you have learnt calculus, you can easily derive the result
$$\frac {\partial cost}{\partial W} = (W \times X + b -Y) \times X$$
and
$$\frac {\partial cost}{\partial b} = (W \times X +b - Y)$$
Since the returned gradients are the accumulated gradients w.r.t. each `x` in `xs`, to get the result you just add up all the corresponding gradients.
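
This accumulation can be checked directly: feeding a batch of examples yields the sum of the per-example gradients (the two sample points below are made up for this sketch; TF 1.x API):

```python
import tensorflow.compat.v1 as tf  # TF 1.x graph API (shim on TF 2)
tf.disable_v2_behavior()

X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
W = tf.Variable(1.0)
b = tf.Variable(0.0)
cost = 0.5 * tf.square(W * X + b - Y)  # one cost term per example

# gradient of sum(cost) w.r.t. W and b: the per-example
# gradients are accumulated into a single scalar each
grad_W, grad_b = tf.gradients(cost, [W, b])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    gW, gb = sess.run([grad_W, grad_b],
                      feed_dict={X: [1.0, 2.0], Y: [2.0, 5.0]})
# per-example grad_W: (1-2)*1 = -1 and (2-5)*2 = -6, summed to -7
# per-example grad_b: (1-2)   = -1 and (2-5)   = -3, summed to -4
print(gW, gb)
```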

Hope you can understand this post.
