Lecture 5: Lagrangian Methods for Optimum Design

Reading time: 14 min

In our journey through Design Optimization, we’ve progressed from graphically understanding basic optimization concepts to systematically solving linear problems using Linear Programming. Today, we confront the more general and pervasive class of problems in engineering: Nonlinear Programming (NLP). Unlike LP, where everything is linear, NLP problems involve objective functions or constraints (or both) that are nonlinear. For these complex problems, we often cannot rely on simple graphical methods or direct algebraic solutions beyond very small cases.

This lecture introduces a fundamental analytical technique for solving constrained nonlinear optimization problems: Lagrangian Methods. The cornerstone of this approach is the Karush–Kuhn–Tucker (KKT) necessary conditions, which provide a set of equations that must be satisfied at any local optimum point of a constrained optimization problem.

By the end of this lecture, you will be able to define the Lagrangian function, state and interpret the Karush–Kuhn–Tucker (KKT) necessary conditions, and apply these conditions to find candidate optimum solutions for constrained nonlinear design problems.

General Nonlinear Programming Problem (NLP)#

Recall the general mathematical model for an optimum design problem we formulated in Lecture 2:

Find the design variable vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$

To minimize (or maximize) the objective function $f(\mathbf{x})$

Subject to:

  • Equality constraints: $h_k(\mathbf{x}) = 0$, for $k = 1, \ldots, p$
  • Inequality constraints: $g_j(\mathbf{x}) \le 0$, for $j = 1, \ldots, m$
  • Side constraints (bounds): $x_{iL} \le x_i \le x_{iU}$, for $i = 1, \ldots, n$

For Lagrangian methods, we often treat the side constraints as general inequality constraints. So, the problem can be viewed as: Minimize $f(\mathbf{x})$ subject to $h_k(\mathbf{x}) = 0$ and $g_j(\mathbf{x}) \le 0$. The functions $f(\mathbf{x})$, $h_k(\mathbf{x})$, and $g_j(\mathbf{x})$ can be nonlinear.

The Lagrangian Function#

The central idea behind Lagrangian methods is to transform a constrained optimization problem into an unconstrained one by introducing a new function called the Lagrangian function, $L$. This function incorporates all the constraints into the objective function using special coefficients called Lagrange multipliers.

The Lagrangian function for a general NLP problem is defined as: $L(\mathbf{x}, \mathbf{v}, \mathbf{u}) = f(\mathbf{x}) + \sum_{k=1}^{p} v_k h_k(\mathbf{x}) + \sum_{j=1}^{m} u_j g_j(\mathbf{x})$

Where:

  • $f(\mathbf{x})$ is the objective function.
  • $h_k(\mathbf{x})$ are the equality constraints.
  • $g_j(\mathbf{x})$ are the inequality constraints.
  • $v_k$ are the Lagrange multipliers associated with the equality constraints $h_k(\mathbf{x})$. These are unrestricted in sign.
  • $u_j$ are the Lagrange multipliers associated with the inequality constraints $g_j(\mathbf{x})$. These must be non-negative.

The Lagrange multipliers $v_k$ and $u_j$ are additional variables introduced into the problem. We are now looking for an optimum point $(\mathbf{x}^*, \mathbf{v}^*, \mathbf{u}^*)$ that satisfies certain conditions.
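The definition above translates directly into code. The following sketch (our own helper, with illustrative names, not from any particular library) evaluates $L(\mathbf{x}, \mathbf{v}, \mathbf{u})$ for arbitrary lists of constraint functions and multipliers:

```python
# Minimal sketch of evaluating the Lagrangian
# L(x, v, u) = f(x) + sum_k v_k * h_k(x) + sum_j u_j * g_j(x).
def lagrangian(f, hs, gs, x, v, u):
    """f: objective; hs: equality constraints h_k(x) = 0;
    gs: inequality constraints g_j(x) <= 0; v, u: multipliers."""
    return (f(x)
            + sum(vk * hk(x) for vk, hk in zip(v, hs))
            + sum(uj * gj(x) for uj, gj in zip(u, gs)))

# Example: f = x1^2 + x2^2 with one inequality g1 = x1 + x2 - 1 <= 0.
f  = lambda x: x[0]**2 + x[1]**2
g1 = lambda x: x[0] + x[1] - 1

# At (0.5, 0.5) the constraint is active (g1 = 0), so the penalty term
# vanishes and L equals f regardless of the multiplier value.
val = lagrangian(f, [], [g1], x=[0.5, 0.5], v=[], u=[2.0])
print(val)  # 0.5
```

Note that when a constraint is active ($g_j = 0$ or $h_k = 0$), its term contributes nothing to the value of $L$; the multipliers matter through the gradient conditions, not through the function value itself.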

Karush–Kuhn–Tucker (KKT) Necessary Conditions#

For a given NLP problem, if a design point $\mathbf{x}^*$ is a local minimum, and certain regularity conditions (like constraint qualifications, which are typically satisfied for well-behaved engineering problems) hold, then there must exist Lagrange multipliers $\mathbf{v}^*$ and $\mathbf{u}^*$ such that the Karush–Kuhn–Tucker (KKT) necessary conditions are satisfied. These conditions essentially generalize the concept of finding where the gradient is zero for unconstrained optimization to problems with constraints.

The KKT conditions are:

  1. Gradient of the Lagrangian with respect to design variables must be zero: $\frac{\partial L}{\partial x_i}(\mathbf{x}^*, \mathbf{v}^*, \mathbf{u}^*) = \frac{\partial f}{\partial x_i}(\mathbf{x}^*) + \sum_{k=1}^{p} v_k^* \frac{\partial h_k}{\partial x_i}(\mathbf{x}^*) + \sum_{j=1}^{m} u_j^* \frac{\partial g_j}{\partial x_i}(\mathbf{x}^*) = 0 \quad \text{for } i = 1, \ldots, n$ This condition ensures that at the optimum, the gradient of the objective function is a linear combination of the gradients of the active constraints. Geometrically, it means that the objective function’s gradient is “aligned” with the combined gradients of the active constraints, preventing further improvement without violating constraints.

  2. Feasibility of all constraints: $h_k(\mathbf{x}^*) = 0 \quad \text{for } k = 1, \ldots, p$ $g_j(\mathbf{x}^*) \le 0 \quad \text{for } j = 1, \ldots, m$ The optimal solution must satisfy all the original problem’s constraints.

  3. Complementary Slackness Conditions: $u_j^* g_j(\mathbf{x}^*) = 0 \quad \text{for } j = 1, \ldots, m$ This is a crucial condition for inequality constraints. It implies that for each inequality constraint $g_j(\mathbf{x}^*)$:

    • If $g_j(\mathbf{x}^*) < 0$ (the constraint is inactive, meaning the optimal solution is strictly within its allowable region), then its corresponding Lagrange multiplier $u_j^*$ must be zero.
    • If $u_j^* > 0$, then the constraint must be active (i.e., $g_j(\mathbf{x}^*) = 0$), meaning the optimal solution lies on the boundary defined by this constraint. This condition helps to determine which inequality constraints are active at the optimum.
  4. Non-negativity of Lagrange multipliers for inequality constraints: $u_j^* \ge 0 \quad \text{for } j = 1, \ldots, m$ Lagrange multipliers for inequality constraints cannot be negative. This is consistent with the physical interpretation that relaxing a constraint (making its limit less restrictive) should not worsen the objective for a minimization problem. A negative $u_j^*$ would imply that tightening the constraint improves the objective, which contradicts the point being a constrained minimum.

Solving these KKT conditions for $\mathbf{x}^*$, $\mathbf{v}^*$, and $\mathbf{u}^*$ gives us the candidate local optimum points. A superscript “*” on a variable indicates its optimum value.
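The four conditions can also be checked mechanically for any candidate $(\mathbf{x}^*, \mathbf{v}^*, \mathbf{u}^*)$. The sketch below is our own illustration (the function names and the use of central finite differences for the gradients are our assumptions, not part of the lecture material):

```python
# Hedged sketch: numerically test whether (x, v, u) satisfies the KKT
# conditions, using central finite differences for all gradients.
def grad(fun, x, eps=1e-6):
    """Central finite-difference gradient of fun at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((fun(xp) - fun(xm)) / (2 * eps))
    return g

def kkt_satisfied(f, hs, gs, x, v, u, tol=1e-4):
    # 1. Stationarity: grad f + sum v_k grad h_k + sum u_j grad g_j = 0
    stat = grad(f, x)
    for vk, hk in zip(v, hs):
        stat = [s + vk * d for s, d in zip(stat, grad(hk, x))]
    for uj, gj in zip(u, gs):
        stat = [s + uj * d for s, d in zip(stat, grad(gj, x))]
    if any(abs(s) > tol for s in stat):
        return False
    # 2. Feasibility of all constraints
    if any(abs(hk(x)) > tol for hk in hs):
        return False
    if any(gj(x) > tol for gj in gs):
        return False
    # 3. Complementary slackness and 4. non-negativity of u
    return all(uj >= -tol and abs(uj * gj(x)) <= tol
               for uj, gj in zip(u, gs))

f  = lambda x: x[0]**2 + x[1]**2
g1 = lambda x: x[0] + x[1] - 1
print(kkt_satisfied(f, [], [g1], [0.0, 0.0], [], [0.0]))   # True
print(kkt_satisfied(f, [], [g1], [0.5, 0.5], [], [-1.0]))  # False: u1 < 0
```

A checker like this verifies a given candidate; it does not find candidates. Finding them still requires solving the KKT system, typically by the case analysis demonstrated later in this lecture.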

Physical Meaning of Lagrange Multipliers#

Lagrange multipliers have an important physical meaning, particularly in engineering design: they represent the sensitivity of the optimal objective function value to a change in the constraint limit. This is also referred to as “shadow prices” in economic contexts.

  • For an active inequality constraint $g_j(\mathbf{x}^*) = 0$ with $u_j^* > 0$, a positive $u_j^*$ indicates that making the constraint slightly more restrictive (e.g., reducing the upper limit $b_j$ in $g_j(\mathbf{x}) \le b_j$) would increase the minimum objective function value (for a minimization problem). Conversely, relaxing the constraint would decrease the objective. The magnitude of $u_j^*$ quantifies this rate of change.
  • For an inactive inequality constraint $g_j(\mathbf{x}^*) < 0$, its corresponding $u_j^*$ is 0, meaning a small change in its limit has no effect on the optimum objective function value, because the design is not “hitting” that constraint.
  • For equality constraints $h_k(\mathbf{x}^*) = 0$, $v_k^*$ can be positive, negative, or zero. Its sign indicates whether relaxing or tightening the constraint would increase or decrease the objective.

This post-optimality or sensitivity analysis is extremely valuable for designers, as it tells them which constraints are critical and how much the design’s performance would change if those constraints were modified. In Excel Solver, Lagrange multiplier values are sometimes called Shadow Prices, and it’s important to note that the sign convention in Solver might be opposite to that used in academic texts, requiring a flip to match.
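This sensitivity interpretation is easy to verify on a tiny example of our own (not from the lecture): minimize $f(x) = x^2$ subject to $x \ge b$ with $b > 0$, i.e. $g(x) = b - x \le 0$. The optimum is $x^* = b$ with $f^* = b^2$, and stationarity $2x - u = 0$ gives $u^* = 2b$, which should equal $df^*/db$:

```python
# Shadow-price check on a one-variable problem (our own illustration):
# minimize x^2 subject to x >= b. Optimal objective as a function of the
# constraint limit b is f*(b) = b^2, and the KKT multiplier is u* = 2b.
def f_star(b):
    return b * b

b = 1.5
u_star = 2 * b  # multiplier from the KKT stationarity condition

# Numerical sensitivity df*/db by central differences.
eps = 1e-6
df_db = (f_star(b + eps) - f_star(b - eps)) / (2 * eps)
print(u_star, df_db)  # both approximately 3.0
```

The multiplier and the numerical derivative of the optimal objective with respect to the constraint limit agree, which is exactly the “shadow price” reading described above.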

Second-Order Conditions (Brief Mention)#

The KKT conditions are necessary conditions for a local minimum. This means that if a point is a local minimum, it must satisfy the KKT conditions. However, satisfying the KKT conditions does not guarantee that the point is a local minimum (it could be a local maximum or a saddle point). To confirm that a KKT point is indeed a local minimum, second-order sufficiency conditions are needed. These conditions involve the Hessian matrix of the Lagrangian function, $\nabla^2 L(\mathbf{x}^*, \mathbf{v}^*, \mathbf{u}^*)$. For a point to be an isolated local minimum, this Hessian (or a related quadratic form) must be positive definite under certain conditions related to feasible directions. These second-order conditions are more advanced and are typically covered in graduate-level courses.

Solved Example: Minimum Distance to Origin with a Linear Constraint#

Let’s apply the KKT conditions to a classic problem:

Problem Statement: Minimize the square of the distance from the origin to a point $(x_1, x_2)$ in a 2D plane, subject to a linear inequality constraint and non-negativity of variables.

Formulation: Minimize $f(x_1, x_2) = x_1^2 + x_2^2$

Subject to: $g_1: x_1 + x_2 - 1 \le 0$ $g_2: -x_1 \le 0 \quad (\text{i.e., } x_1 \ge 0)$ $g_3: -x_2 \le 0 \quad (\text{i.e., } x_2 \ge 0)$

Step 1: Formulate the Lagrangian Function#

$L(x_1, x_2, u_1, u_2, u_3) = (x_1^2 + x_2^2) + u_1(x_1 + x_2 - 1) + u_2(-x_1) + u_3(-x_2)$

Step 2: Write Down the KKT Necessary Conditions#

  1. Gradient conditions: $\frac{\partial L}{\partial x_1} = 2x_1 + u_1 - u_2 = 0 \quad \text{(a)}$ $\frac{\partial L}{\partial x_2} = 2x_2 + u_1 - u_3 = 0 \quad \text{(b)}$

  2. Feasibility conditions: $x_1 + x_2 - 1 \le 0 \quad \text{(c)}$ $-x_1 \le 0 \quad \Rightarrow x_1 \ge 0 \quad \text{(d)}$ $-x_2 \le 0 \quad \Rightarrow x_2 \ge 0 \quad \text{(e)}$

  3. Complementary slackness conditions: $u_1(x_1 + x_2 - 1) = 0 \quad \text{(f)}$ $u_2(-x_1) = 0 \quad \text{(g)}$ $u_3(-x_2) = 0 \quad \text{(h)}$

  4. Non-negativity of multipliers: $u_1 \ge 0, u_2 \ge 0, u_3 \ge 0 \quad \text{(i)}$

Step 3: Solve the KKT System by Considering Different Cases#

We need to consider all possible combinations of active/inactive constraints. The complementary slackness conditions (f), (g), (h) are key here.

Case 1: All inequality constraints are inactive. If $g_1 < 0$, $g_2 < 0$, $g_3 < 0$, then from complementary slackness (f, g, h), we must have $u_1 = 0, u_2 = 0, u_3 = 0$. Substitute these into gradient conditions (a) and (b): $2x_1 = 0 \Rightarrow x_1 = 0$ $2x_2 = 0 \Rightarrow x_2 = 0$ So, the candidate point is $(0,0)$. Now, check feasibility condition (c) at $(0,0)$: $0 + 0 - 1 = -1 \le 0$. This is satisfied, and it is consistent with the assumption $g_1 < 0$. Note, however, that $g_2(0,0) = 0$ and $g_3(0,0) = 0$, so $g_2$ and $g_3$ are in fact active at this point, contradicting the assumption that they are inactive; that combination is handled properly in Case 5, which arrives at the same point with the same (zero) multipliers. The objective function value is $f(0,0) = 0^2 + 0^2 = 0$.

Case 2: Constraint $g_1$ is active, and $g_2, g_3$ are inactive. If $g_1 = 0$, then $x_1 + x_2 - 1 = 0 \quad \text{(j)}$ If $g_2 < 0 \Rightarrow x_1 > 0$, then $u_2 = 0$ (from (g)) If $g_3 < 0 \Rightarrow x_2 > 0$, then $u_3 = 0$ (from (h))

From (a) and (b) with $u_2=0, u_3=0$: $2x_1 + u_1 = 0 \Rightarrow u_1 = -2x_1$ $2x_2 + u_1 = 0 \Rightarrow u_1 = -2x_2$ So, $-2x_1 = -2x_2 \Rightarrow x_1 = x_2$. Substitute $x_1 = x_2$ into (j): $x_1 + x_1 - 1 = 0 \Rightarrow 2x_1 = 1 \Rightarrow x_1 = 0.5$. So, $x_2 = 0.5$. The candidate point is $(0.5, 0.5)$. Now, calculate $u_1$: $u_1 = -2(0.5) = -1$. However, KKT condition (i) requires $u_1 \ge 0$. Since $u_1 = -1$, this case is not a valid KKT point: the optimum cannot occur with $g_1$ active and $x_1, x_2$ strictly positive.

Case 3: Constraints $g_1$ and $g_2$ are active, and $g_3$ is inactive. If $g_1 = 0$, then $x_1 + x_2 - 1 = 0 \quad \text{(j)}$ If $g_2 = 0$, then $-x_1 = 0 \Rightarrow x_1 = 0 \quad \text{(k)}$ If $g_3 < 0 \Rightarrow x_2 > 0$, then $u_3 = 0$ (from (h))

Substitute $x_1 = 0$ into (j): $0 + x_2 - 1 = 0 \Rightarrow x_2 = 1$. The candidate point is $(0, 1)$. Check condition $x_2 > 0$: $1 > 0$, so consistent. Now, find $u_1, u_2$ (since $u_3=0$): From (a) with $x_1=0$: $2(0) + u_1 - u_2 = 0 \Rightarrow u_1 = u_2$. From (b) with $x_2=1$: $2(1) + u_1 - u_3 = 0 \Rightarrow 2 + u_1 - 0 = 0 \Rightarrow u_1 = -2$. This requires $u_1 = -2$. Since $u_1 \ge 0$ (condition (i)), this is not a valid KKT point.

Case 4: Constraints $g_1$ and $g_3$ are active, and $g_2$ is inactive. This case is symmetric to Case 3. It would lead to $(1,0)$ and $u_1 = -2$, which is also not a valid KKT point.

Case 5: Constraints $g_2$ and $g_3$ are active, and $g_1$ is inactive. If $g_2 = 0 \Rightarrow x_1 = 0 \quad \text{(k)}$ If $g_3 = 0 \Rightarrow x_2 = 0 \quad \text{(l)}$ The candidate point is $(0,0)$. This is the same point as in Case 1. The difference is the active constraints. Here, $u_1=0$ (because $g_1$ is inactive). From (a) with $x_1=0, u_1=0$: $0 + 0 - u_2 = 0 \Rightarrow u_2 = 0$. From (b) with $x_2=0, u_1=0$: $0 + 0 - u_3 = 0 \Rightarrow u_3 = 0$. So, $(0,0)$ with $u_1 = 0, u_2 = 0, u_3 = 0$ is a KKT point. $f(0,0)=0$.

Case 6: Constraints $g_1, g_2, g_3$ are all active. $x_1 + x_2 - 1 = 0 \quad \text{(j)}$ $x_1 = 0 \quad \text{(k)}$ $x_2 = 0 \quad \text{(l)}$ Substitute (k) and (l) into (j): $0 + 0 - 1 = 0 \Rightarrow -1 = 0$, which is a contradiction. This means these three constraints cannot be simultaneously active. So, no solution from this case.

Taken together, Cases 2–4 (and the contradiction in Case 6) cover every combination in which $g_1$ is active, and each one forces $u_1 < 0$. So no KKT point can lie on the line $x_1 + x_2 = 1$. Geometrically this makes sense: the objective $f(x_1, x_2) = x_1^2 + x_2^2$ is the squared distance to the origin, and its unconstrained minimum $(0,0)$ lies inside the feasible region (the triangle defined by $x_1 + x_2 \le 1$, $x_1 \ge 0$, $x_2 \ge 0$). Moving out to the boundary line can only increase the objective, and a negative $u_1$ is the algebraic signature of a boundary point at which the constraint is not actually restraining the objective.

The only remaining valid KKT point is $(0,0)$ from Case 1 (or Case 5 which leads to the same point and multipliers). At $(0,0)$: $g_1 = 0+0-1 = -1 \le 0$ (inactive, so $u_1=0$) $g_2 = -0 = 0 \le 0$ (active, so $u_2 \ge 0$) $g_3 = -0 = 0 \le 0$ (active, so $u_3 \ge 0$)

With $u_1=0$: From (a): $2x_1 + 0 - u_2 = 0 \Rightarrow 2x_1 = u_2$. At $(0,0)$, $u_2 = 0$. From (b): $2x_2 + 0 - u_3 = 0 \Rightarrow 2x_2 = u_3$. At $(0,0)$, $u_3 = 0$. So, $u_1=0, u_2=0, u_3=0$. All non-negativity conditions for multipliers are met. Thus, $\mathbf{x}^* = (0,0)$ is a valid KKT point, with $f(\mathbf{x}^*) = 0$.

This result is consistent: the unconstrained minimum $(0,0)$ is within the feasible region ($0+0-1 \le 0$, $0 \ge 0$, $0 \ge 0$), so the constraints do not restrict the optimal solution, and thus the Lagrange multipliers are all zero. The constraints are “slack” or “non-binding” at the true optimum for this specific problem.
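The case-by-case elimination can be cross-checked numerically. The sketch below (our own helper written for this example, not a general-purpose algorithm) takes each candidate point together with the multipliers derived in the cases above and tests the four KKT conditions directly; only the origin passes.

```python
def kkt_ok(x, u, tol=1e-9):
    """Check the four KKT conditions for this example problem."""
    x1, x2 = x
    u1, u2, u3 = u
    g = (x1 + x2 - 1, -x1, -x2)                               # g1, g2, g3
    grad_ok = abs(2*x1 + u1 - u2) <= tol and abs(2*x2 + u1 - u3) <= tol
    feas_ok = all(gj <= tol for gj in g)                      # feasibility
    comp_ok = all(abs(uj * gj) <= tol for uj, gj in zip(u, g))  # slackness
    sign_ok = all(uj >= -tol for uj in u)                     # u_j >= 0
    return grad_ok and feas_ok and comp_ok and sign_ok

# Candidate points and the multipliers derived in the case analysis.
candidates = {
    "(0, 0)":     ((0.0, 0.0), (0.0, 0.0, 0.0)),   # Cases 1 and 5
    "(0.5, 0.5)": ((0.5, 0.5), (-1.0, 0.0, 0.0)),  # Case 2
    "(0, 1)":     ((0.0, 1.0), (-2.0, -2.0, 0.0)), # Case 3
    "(1, 0)":     ((1.0, 0.0), (-2.0, 0.0, -2.0)), # Case 4
}
for pt, (x, u) in candidates.items():
    print(pt, kkt_ok(x, u))  # only (0, 0) prints True
```

Each of the rejected candidates satisfies stationarity, feasibility, and complementary slackness; they fail only the non-negativity of $u_1$, which is exactly what the hand analysis found.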

Conclusion for the Example#

The only point that satisfies all KKT conditions is $\mathbf{x}^* = (0,0)$, with the objective function value $f(\mathbf{x}^*) = 0$. The Lagrange multipliers are $u_1^* = 0$, $u_2^* = 0$, and $u_3^* = 0$. This indicates that none of the constraints are actively “pushing” the solution away from the origin; the unconstrained minimum is already feasible.


This concludes our introduction to Lagrangian Methods and the KKT conditions. These analytical tools are foundational for understanding and developing more advanced numerical methods for constrained optimization, which we will explore in subsequent lectures. While solving KKT conditions algebraically can be complex for highly nonlinear problems, the principles provide profound insights into the nature of optimal design solutions.