Let’s get a feel for how to use automatic differentiation beyond a trivial example. To do so, we’ll need to understand a few things about how it works.
So now we understand how to optimize a function with Gradient Descent, as long as we can compute the function’s derivative. Great! If every function had an obvious derivative, we could optimize everything. Unfortunately, many functions don’t.
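To make that concrete, here is a minimal Gradient Descent sketch in Python. It assumes a toy objective f(x) = x² whose derivative we can write down by hand; the starting point, learning rate, and step count are illustrative choices, not values from this series.

```python
# Gradient Descent on the toy objective f(x) = x**2, whose derivative
# f'(x) = 2x we can write down by hand. All constants here are
# illustrative choices.

def f_prime(x):
    return 2 * x  # derivative of f(x) = x**2

x = 5.0             # arbitrary starting point
learning_rate = 0.1

for _ in range(50):
    x -= learning_rate * f_prime(x)  # step downhill along the gradient

print(x)  # approaches the minimum at x = 0
```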
In Part 0 of this series, we introduced the usefulness of automatic differentiation.
What if code could run, not just forwards, but backwards too? While automatic differentiation isn’t exactly reversible computing, it’s a close proxy for solving the many computing problems of the form: “I have the answer… what was the question?”
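As a taste of what “running backwards” can look like, here is a toy reverse-mode sketch in Python: it records each operation during the forward run, then walks the recording in reverse to accumulate derivatives. The Var class and its methods are hypothetical illustrations for this post, not the machinery this series builds later.

```python
# A toy reverse-mode automatic differentiation sketch: the forward pass
# records each operation, and backward() replays the recording in
# reverse to accumulate derivatives. Var is a hypothetical class made
# up for this illustration.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, grad=1.0):
        # Push the output's derivative back through the recorded graph.
        self.grad += grad
        for parent, local_derivative in self.parents:
            parent.backward(grad * local_derivative)

x = Var(3.0)
y = x * x + x   # f(x) = x**2 + x, computed forwards
y.backward()    # ...and differentiated backwards
print(x.grad)   # f'(x) = 2x + 1, so f'(3) = 7.0
```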