Ever wanted to be a mathematical superhero? Here's your chance! In this post I'm going to give you a one-line algorithm for solving any equation and any optimization problem. Here it comes:

iterate x := x - f(x)/f'(x)

This is Newton’s algorithm, and it will find you an x such that f(x) = 0. Let’s try it out.

Linear equations

We want to solve 3*x - 5 = 0 so we take f(x) = 3*x - 5. The derivative of f is f'(x) = 3, so Newton’s algorithm becomes:

iterate x := x - (3*x - 5)/3

We simplify this to:

iterate x := 5/3

This is not a terribly exciting iteration, but it did give us the right answer: f(5/3) = 0. The reason Newton's method solves linear equations in one step is that it works by making a linear approximation to your function and then solving the equation for that linear approximation. So if your function was linear to begin with, the "approximation" is exact, and you get the right answer in one step. Here's an excellent animation from Wikipedia that shows this in action for a nonlinear function:

Newton's algorithm animation
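In code, the whole algorithm is tiny. Here's a minimal sketch in Python (the name `newton` and the fixed step count are my choices, not anything standard):

```python
def newton(f, fprime, x0, steps=10):
    """Iterate x := x - f(x)/f'(x) a fixed number of times."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# The linear equation 3*x - 5 = 0 is solved in a single step,
# no matter how bad the initial guess is (up to rounding):
root = newton(lambda x: 3*x - 5, lambda x: 3.0, x0=100.0, steps=1)
print(root)  # ~ 5/3
```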

Function inversion

Suppose we have some function g and a number a and we want to solve g(x) = a. We rewrite this as g(x) - a = 0 so we get f(x) = g(x) - a and apply Newton’s algorithm. The derivative f'(x) = g'(x). The algorithm becomes:

iterate x := x - (g(x) - a) / g'(x)

This lets us numerically invert any differentiable function, as long as we can evaluate its derivative and start reasonably close to the answer!
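As a sketch, the recipe fits in a few lines (the name `invert` is mine; this assumes we can supply g's derivative and a decent starting guess):

```python
def invert(g, gprime, a, x0, steps=20):
    """Solve g(x) = a by Newton's method on f(x) = g(x) - a."""
    x = x0
    for _ in range(steps):
        x = x - (g(x) - a) / gprime(x)
    return x

# Invert g(x) = x**3, i.e. compute a cube root:
print(invert(lambda x: x**3, lambda x: 3*x**2, a=8.0, x0=1.0))  # ~ 2
```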

Square roots

We want to invert the function g(x) = x^2, in other words we want to compute square roots. Since g'(x) = 2*x we get:

iterate x := x - (x^2 - a)/(2*x)

We can further simplify this if we want:

iterate x := (x + a/x)/2

Now we have a real iteration that will take more than one step to converge, but what x do we start with? You want to start with a number that’s reasonably close to the answer. How many iterations? That doesn’t really matter! Newton’s method is so fast, it usually converges in a handful of iterations. Let’s use this to solve the equation x^2 = 2 in Python:

a = 2.0   # equation we want to solve is x^2 = a
x = 10.0  # terrible initial guess
for i in range(10):
    x = (x + a/x)/2
    print(x)

This gives us the following sequence (rounded):

5.1
2.746078
1.737195
1.444238
1.414526
1.414214

after which it settles at 1.4142135623730951, which is sqrt(2) to full double precision. Notice how fast that converged? This is a general property of Newton's method: on each iteration we get about twice as many digits of accuracy.
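If you want to check that claim yourself, here's a quick sketch comparing the iteration against math.sqrt (the name `newton_sqrt` and the 20-step cap are my choices):

```python
import math

def newton_sqrt(a, x0=1.0, steps=20):
    """Square root via the Newton iteration x := (x + a/x)/2."""
    x = x0
    for _ in range(steps):
        x = (x + a / x) / 2
    return x

# Agrees with math.sqrt for a range of inputs:
for a in [2.0, 3.0, 10.0, 12345.0]:
    print(a, newton_sqrt(a), math.sqrt(a))
```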


Now that we've used Newton's algorithm to compute square roots, let's try something else: can we get Newton's algorithm to compute the reciprocal 1/a? (Then we can do arbitrary division too, since b/a = b * (1/a).) Let's make a function out of that: g(x) = 1/x, and apply Newton's method to invert g. "But the inverse of g is g itself," you say, "how could that possibly help us?" Let's just march on anyway: g'(x) = -1/x^2, so the algorithm becomes:

iterate x := x - (1/x - a)/(-1/x^2)

So how did that help us? To compute a division we now have to do three divisions per iteration! But watch what happens when we simplify. Dividing by -1/x^2 is the same as multiplying by -x^2, so the update is x + (1/x - a)*x^2 = x + x - a*x^2 = x*(2 - a*x), which gives:

iterate x := x*(2 - a*x)

Whoa! Where did the divisions go? Let’s try it out:

a = 7.0     # we compute 1/a
x = 0.1     # initial guess
for i in range(10):
    x = x*(2 - a*x)
    print(x)

This gives (rounded):

0.13
0.1417
0.142848
0.14285714
0.14285714285714285

Great, we've computed 1/7 without doing a single division. This is actually the method that some CPUs use to divide! It also brings us to one of the limits of our mathematical superpowers. What if we had used 0.5 as our initial guess instead of 0.1? Let's try it:

-0.75
-5.4375
-217.83984375
(and on toward -infinity)

Hmm, that doesn't look too good. If our initial guess is too far off, Newton's method can diverge; whether it does depends on the equation we're trying to solve. For this particular iteration one can show it converges exactly when the initial guess lies strictly between 0 and 2/a, which is why real hardware puts effort into a good first estimate.
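Here's a sketch that runs the same iteration from both starting guesses side by side (the name `newton_reciprocal` and the 10-step count are my choices):

```python
def newton_reciprocal(a, x0, steps=10):
    """Approximate 1/a with the division-free iteration x := x*(2 - a*x)."""
    x = x0
    for _ in range(steps):
        x = x * (2 - a * x)
    return x

print(newton_reciprocal(7.0, 0.1))  # close to 1/7: the guess was in range
print(newton_reciprocal(7.0, 0.5))  # runs away: the guess was too far off
```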

Complicated equations

What about an equation like sin(x) + log(x) = 2, where log is the natural logarithm? We get Newton's algorithm:

iterate x := x - (sin(x) + log(x) - 2)/(cos(x) + 1/x)
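In Python this might look as follows (the starting guess of 10.0 is my choice; any guess reasonably near the root should do, and it must be positive so that log(x) is defined):

```python
import math

def f(x):
    return math.sin(x) + math.log(x) - 2

def fprime(x):
    return math.cos(x) + 1 / x

x = 10.0  # initial guess (assumed)
for _ in range(10):
    x = x - f(x) / fprime(x)
print(x)  # ~ 9.70042510714
```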

Starting from a guess in the right neighborhood, the iterates converge in a handful of steps to:

x = 9.70042510714

And sure enough, if we compute sin(9.70042510714) + log(9.70042510714) we get a number that's extremely close to 2: 2.0000000000000004.

Isn't it amazing that you can do all this with such a small amount of code? You need tens of thousands of lines of code to make an enterprise web application, and then Newton comes along and hands you a way to solve any equation with a one-liner.

Stay tuned for the next episode(s), where we use Newton’s method to find the maximum of a function, to solve whole systems of equations with multiple variables, to solve differential equations, to do variational optimization problems, and to solve optimization problems with equality and inequality constraints, such as linear programming and convex optimization.