Recall from Calculus 1 that if f(x) has a local maximum/minimum at a point x=c, and if f′(c) exists, then f′(c)=0.
[Figure: Calculus 1 — local extrema correspond to horizontal tangent lines]
But what about multivariable functions, whose graphs are surfaces in 3D space?
Extrema in 3D
Suppose a local extremum occurs at a point $P$ on the surface $z=f(x,y)$.
This means that $P$ must simultaneously be an extremum on every trace curve passing through $P$.
In Other Words:
The directional derivative at $P$ must equal zero in every possible direction, as the tangent plane must be horizontal!
$D_{\mathbf{u}}f(a,b)=0 \implies \nabla f(a,b)\cdot \mathbf{u}=0$
For all unit vectors $\mathbf{u}$, resulting in:
$\nabla f(a,b)=\mathbf{0}$
What this means:
This says that if $f(a,b)$ is a local maximum or minimum, and if the first-order partial derivatives exist at $(a,b)$, then the gradient vector at $(a,b)$ must equal zero.
If $f(a,b)$ is a local maximum,
this means that $f(a,b)\geq f(x,y)$ for all $(x,y)$ in some neighborhood of $(a,b)$.
So at $(a,b)$, $D_{\mathbf{u}}f(a,b)=0$ for every unit vector $\mathbf{u}$.
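As a quick numerical illustration of "zero directional derivative in every direction" (a minimal sketch assuming numpy is available; the function $f(x,y)=x^2+y^2$ here is just an illustrative choice, with critical point at the origin):

```python
import numpy as np

# For f(x, y) = x**2 + y**2, the gradient is <2x, 2y>, which vanishes at (0, 0).
# The directional derivative in direction u is D_u f = grad f . u.
grad = lambda x, y: np.array([2.0 * x, 2.0 * y])

u = np.array([3.0, 4.0]) / 5.0        # an arbitrary unit vector

d_at_critical = grad(0, 0) @ u        # at the critical point (0, 0)
d_elsewhere = grad(1, 2) @ u          # at a non-critical point

print(d_at_critical)   # 0.0 -- zero, and this holds for every direction u
print(d_elsewhere)     # nonzero away from the critical point
```

The same dot product is zero for any unit vector $\mathbf{u}$ at the critical point, since the gradient itself is the zero vector there.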
Remarks
This does not say that if $\nabla f(a,b)=\mathbf{0}$, then $(a,b)$ corresponds to a local extremum. However, it does suggest that we can find the local extrema by looking for where $\nabla f=\mathbf{0}$.
More specifically, "$\nabla f=\mathbf{0}$" means $\langle f_x, f_y\rangle=\langle 0,0\rangle$, i.e. $f_x=0$ and $f_y=0$.
The points where $\nabla f=\mathbf{0}$ are called Critical Points. So, just like in Calculus 1, the local extrema "candidates" occur at Critical Points.
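Finding Critical Points amounts to solving the system $f_x=0$, $f_y=0$. A minimal sketch of doing this symbolically, assuming sympy is available (the function here is a made-up example, not one from these notes):

```python
import sympy as sp

# Find Critical Points by solving f_x = 0 and f_y = 0 simultaneously.
x, y = sp.symbols("x y")
f = x**2 + 2*y**2 - 4*x + 4*y          # illustrative example function

grad = [sp.diff(f, x), sp.diff(f, y)]  # <f_x, f_y> = <2x - 4, 4y + 4>
critical_points = sp.solve(grad, [x, y], dict=True)
print(critical_points)                  # [{x: 2, y: -1}]
```

Each solution of the system is a candidate for a local extremum; classifying it requires a further test, discussed below.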
Problems
Example
Find the local extrema of $f(x,y)=x^2+y^2-2x-6y+14$.
Solution
To approach this problem, we need to start by finding the Critical Points:
$\nabla f=\mathbf{0} \implies \langle 2x-2,\; 2y-6\rangle=\langle 0,0\rangle$
$\implies 2x-2=0 \quad\text{and}\quad 2y-6=0$
$\implies x=1,\; y=3$
Observe that $f(1,3)=1+9-2-18+14=4$, and by completing the square, we get $f(x,y)=[(x-1)^2-1]+[(y-3)^2-9]+14$, resulting in:
$f(x,y)=(x-1)^2+(y-3)^2+4$
Because $(x-1)^2$ and $(y-3)^2$ are always non-negative, $f(x,y)\geq 4$ for all $(x,y)$. Therefore, $f(1,3)=4$ is a local minimum (in fact, a global minimum).
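A numeric sanity check of this conclusion, assuming numpy is available: sample $f$ on a grid around $(1,3)$ and confirm that no sampled value falls below $f(1,3)=4$.

```python
import numpy as np

# f(x, y) = x**2 + y**2 - 2x - 6y + 14, whose minimum should be f(1, 3) = 4.
f = lambda x, y: x**2 + y**2 - 2*x - 6*y + 14

xs = np.linspace(-4.0, 6.0, 201)   # grid spans well around x = 1
ys = np.linspace(-2.0, 8.0, 201)   # grid spans well around y = 3
X, Y = np.meshgrid(xs, ys)

print(f(1, 3))        # 4
print(f(X, Y).min())  # approximately 4.0, attained near (1, 3)
```

This is only evidence, not proof; the completing-the-square argument above is what actually establishes the minimum.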
Saddle Points
Saddle points are points where the gradient is zero but the point is neither a maximum nor a minimum: the function increases in some directions and decreases in others. They present a unique challenge in optimization precisely because the first-derivative condition $\nabla f=\mathbf{0}$ cannot distinguish them from true extrema.
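The classic example is $f(x,y)=x^2-y^2$, which has a saddle at the origin. A minimal check in plain Python:

```python
# f(x, y) = x**2 - y**2 has gradient <2x, -2y>, which vanishes at (0, 0),
# yet f rises along the x-axis and falls along the y-axis from that point.
f = lambda x, y: x**2 - y**2

print(f(0, 0))      # 0: the critical value
print(f(0.1, 0))    # positive: uphill along the x-axis
print(f(0, 0.1))    # negative: downhill along the y-axis
```

So $(0,0)$ is a critical point, but it is neither a local maximum nor a local minimum.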
Second Derivative Test
Hessian Matrix
The Hessian Matrix is a square matrix of second-order partial derivatives of a scalar-valued function.
$H=\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$
We need to know the eigenvalues of the Hessian Matrix to determine the nature of the critical point.
$\det(H-\lambda I)=0$
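In practice the eigenvalues can be computed numerically. A sketch, assuming numpy is available, using the Hessian of the earlier example $f(x,y)=x^2+y^2-2x-6y+14$ (which is constant since $f$ is quadratic):

```python
import numpy as np

# For f(x, y) = x**2 + y**2 - 2x - 6y + 14:
# f_xx = 2, f_xy = f_yx = 0, f_yy = 2, so the Hessian is constant.
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])

eigenvalues = np.linalg.eigvalsh(H)   # eigvalsh: eigenvalues of a symmetric matrix
print(eigenvalues)                    # [2. 2.] -- both positive
```

Both eigenvalues positive means $H$ is positive definite, consistent with $(1,3)$ being a local minimum.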
Now that we know this, we can move on to the Second Partial Derivative Test.
Second Partial Derivative Test
Suppose the second-order partial derivatives of $f(x,y)$ are continuous, and that $(a,b)$ is a Critical Point. Let $\det(H)=f_{xx}(a,b)\,f_{yy}(a,b)-[f_{xy}(a,b)]^2$.
If $\det(H)>0$ and $f_{xx}(a,b)>0$, then $f(a,b)$ is a local minimum. (Equivalently, the Hessian Matrix is positive definite.)
If $\det(H)>0$ and $f_{xx}(a,b)<0$, then $f(a,b)$ is a local maximum. (Equivalently, the Hessian Matrix is negative definite.)
If $\det(H)<0$, then $(a,b)$ is a saddle point.
If $\det(H)=0$, the test is inconclusive.
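The whole test can be packaged as a small helper. A sketch assuming sympy is available (`classify` is a hypothetical name, not something from these notes):

```python
import sympy as sp

x, y = sp.symbols("x y")

def classify(f, a, b):
    """Second Partial Derivative Test at the critical point (a, b)."""
    fxx = sp.diff(f, x, 2).subs({x: a, y: b})
    fyy = sp.diff(f, y, 2).subs({x: a, y: b})
    fxy = sp.diff(f, x, y).subs({x: a, y: b})
    det_H = fxx * fyy - fxy**2          # det(H) = f_xx * f_yy - f_xy**2
    if det_H > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if det_H < 0:
        return "saddle point"
    return "inconclusive"

print(classify(x**2 + y**2 - 2*x - 6*y + 14, 1, 3))  # local minimum
print(classify(x**2 - y**2, 0, 0))                   # saddle point
```

On the worked example the test confirms the local minimum at $(1,3)$, and on $x^2-y^2$ it correctly flags the saddle at the origin.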