When we are looking at two-dimensional optimization algorithms, there are a variety of options. The first is the bisection method. I like the bisection method, because it makes a delightful use of the intermediate value theorem. We can use the fact that continuous functions take on all of the values between two points. They may take on a value more than once, but that is someone else’s problem. Formally stated, the [intermediate value theorem says]():

If f is continuous on a closed interval [a,b], and c is any number between f(a) and f(b) inclusive, then there is at least one number x in the closed interval such that f(x)=c.

As a root-finding algorithm, bisection takes advantage of the fact we *know* the value crosses 0. Take, for example, f(x) = x^2 - 1:

We can *see* that the function crosses zero. Of course, what we see is irrelevant, what is important is the image tells us a story. Going back to our theorem, we can say let a = 0 and b = 2. Then we know f(a) = -1 and f(b) = 3. It is a continuous function, so we know there has to be some value of x such that f(x) = 0. We short cut this process by just narrowing the space between a and b until we are satisfied our solution is close enough.

But there’s a problem here. What if there are no negative values of the function? Take, for example, f(x) = (x - 1)^2:

We can solve this problem with Newton’s method. Of course, the problem with Newton’s method is that you need to know the derivative of $f(x)$, *a priori*. Finding the derivative of f(x) = (x - 1)^2 is easy. Finding the derivative of f(x) = x^x is hard. I mean, I know the answer, but don’t ask me how to figure it out. That’s what undergrads are for. So we can solve this problem, with the secant method. It just estimates the derivative along the way. What we learn here is that no method works perfectly in every scenario. That’s the hardest part of numerical analysis: understanding the computer can do all the work.