Least squares allows us to find the line that best approximates our data. The methods that allow us to find the minimum (or maximum) of a function are called optimization methods. There are numerous types of optimization methods, but in this post we are going to focus on one of the most widely used today in the world of machine learning: gradient descent.
What is gradient descent?
Gradient descent is an iterative optimization method: with each iteration we execute, we get a little closer to the minimum of the function.
What we want to do is find the parameters of the following function:

yi ≈ f(xi; w) = w0 + w1x1 + w2x2 + … + wnxn

The values that must be found are those of w0 and w1, w2, w3… wn.
We can write it in matrix form:

y ≈ f(X; W) = XW
And we want to find:
The arguments w that minimize the function: arg min f(w).
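As a quick sketch of what this looks like in code (the data, the design matrix X, and the choice of f(w) as a sum of squared errors are illustrative assumptions, not fixed by the post):

import numpy as np

# Illustrative data: 5 samples, with a column of ones so that w0 acts as the intercept
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])

# The matrix form y ≈ XW
w = np.array([1.0, 2.0])
y_hat = X @ w

# One common choice of f(w): the sum of squared errors (an assumption here)
def f(w):
    return np.sum((y - X @ w) ** 2)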
Note: min f (x) and arg min f (x) are different things. For example:
Consider the function f(x) = 100 + (x − 3)²
min f(x) = f(arg min f(x))
So:
df(x)/dx = 2(x − 3)

Setting this derivative to zero: 2(x − 3) = 0, so x = 3. Therefore:

arg min f(x) = 3
On the other hand:
min f(x) = f(arg min f(x)) = 100 + (3 − 3)²
min f(x) = 100
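We can check both quantities numerically with a minimal sketch (a simple grid search over x, just to illustrate the difference between them):

import numpy as np

x = np.linspace(-5, 5, 1001)
fx = 100 + (x - 3) ** 2

print(np.min(fx))         # min f(x)     -> 100.0
print(x[np.argmin(fx)])   # arg min f(x) -> 3.0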
Gradient descent is not the same as least squares, where we simply apply a closed-form formula and obtain the result in one go.
This is an iterative method: each time we execute an iteration of gradient descent, we get a little closer to the solution. If after the first iteration we are still at a distant point, we execute another iteration, which brings us a little closer to the result. Iteration after iteration, we approach the result we expect, that is, the final solution. For this reason we say that it is an iterative optimization method.
We can graph this optimization process. As always, we will use matplotlib and numpy.
# gradient descent
import matplotlib.pyplot as plt
import numpy as np

fig, axs = plt.subplots()
x = np.linspace(-5, 5, 100)
# Calculate the y value for each element of the x vector
y = 100 + (x - 3) ** 2
axs.annotate('arg min f(x)', xy=(0.05, 0), xycoords='axes fraction',
             xytext=(0, 0), textcoords='offset pixels',
             horizontalalignment='right', verticalalignment='bottom', color='r')
axs.annotate('min f(x)', xy=(0, 0.1), xycoords='axes fraction',
             xytext=(0, 0), textcoords='offset pixels',
             horizontalalignment='right', verticalalignment='bottom', color='r')
axs.plot(x, y)
plt.show()
The arg min
To understand how gradient descent works, it is necessary to understand the role that arg min plays within this entire process.
We then have a function that depends on w, f(w), and we want to find the parameters that make this function minimum. As we can see in the graph generated by the code above, we have a curve. The value that makes the function minimum, which is what we want to find, is at the bottom, marked in red:
The minimum of the function is on the y-axis, marked in green. So, it is worth asking: what is the value of x that allows us to obtain the minimum of the function?
We know that the minimum of this function is 100. Likewise, the value of x at which we have to evaluate the function so that it gives us 100 is 3.
Since we want to find the value of x that minimizes f(x), that is, arg min f(x), the steps to follow are:
Start at a random point on the curve. Then, iteratively, move against the gradient ∇f(x) (the direction of steepest descent), taking steps of size η:
x := x − η · ∇f(x)

xi = xi−1 − η · ∇f(xi−1)
The parameter η is called the learning rate, and it is the parameter that controls the step size at each iteration as we move towards the minimum of the function.
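Putting the pieces together, here is a minimal sketch of the gradient descent loop for the example function f(x) = 100 + (x − 3)², whose derivative is 2(x − 3); the starting point and learning rate below are arbitrary choices:

def grad_f(x):
    # Derivative of f(x) = 100 + (x - 3) ** 2
    return 2 * (x - 3)

eta = 0.1        # learning rate: the step size of each iteration
x = -4.0         # arbitrary starting point
for _ in range(50):
    x = x - eta * grad_f(x)   # the update rule x := x - eta * grad f(x)

print(x)   # approaches arg min f(x) = 3

Note that the learning rate matters: if η is too large, the iterates can overshoot the minimum and diverge; if it is too small, convergence becomes very slow.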
Do you want to continue advancing in your professional training?
Now that you’ve seen how gradient descent works, you can continue advancing in your training process. Big Data is one of the areas in which the most jobs are offered. To access this type of job option, among the most prolific and best paid, we have for you the Big Data, Artificial Intelligence & Machine Learning Full Stack Bootcamp, an intensive and comprehensive training in which you will acquire all the theoretical and practical knowledge that will allow you to enter the job market. Don’t wait any longer to boost your career and request information now!