The VMCON Algorithm

VMCON solves a nonlinearly constrained optimisation problem (the general nonlinear programming problem): minimising an objective function subject to a set of nonlinear constraints.

The Problem

In the general nonlinear programming problem, we want to minimise an objective function (the figure of merit), \(f(\vec{x})\), subject to a set of nonlinear constraints:

\[ \begin{align}\begin{aligned}c_i(\vec{x}) = 0, \quad i = 1,...,k\\c_i(\vec{x}) \geq 0, \quad i = k+1,...,m\end{aligned}\end{align} \]

where \(\vec{x}\) is an \(n\)-dimensional vector and the objective and constraint functions each map \(\vec{x}\) to a scalar.

Several facts are worth noting about problem formulation:

  1. To maximise some function \(g(\vec{x})\) we would minimise the objective \(f(\vec{x}) = -g(\vec{x})\).

  2. To constrain the solution such that \(h(\vec{x}) = a\) we would apply the constraint \(h(\vec{x}) - a = 0\).

  3. To constrain the solution such that \(h(\vec{x}) \geq a\) we would apply the constraint \(h(\vec{x}) - a \geq 0\).

  4. To constrain the solution such that \(h(\vec{x}) \leq 0\) we would apply the constraint \(-h(\vec{x}) \geq 0\).

  5. To constrain the solution such that \(h(\vec{x}) \leq a\) we would apply the constraint \(a-h(\vec{x}) \geq 0\).
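
As a concrete illustration, a small two-variable problem of this form might be written as plain Python callables. The specific functions below are purely illustrative, chosen to give one equality and one inequality constraint in the required \(c_i(\vec{x}) = 0\) / \(c_i(\vec{x}) \geq 0\) forms:

```python
def f(x):
    # Objective (figure of merit) to minimise.
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2


def c1(x):
    # Equality constraint: x1 - 2*x2 = -1, rewritten as c1(x) = 0.
    return x[0] - 2.0 * x[1] + 1.0


def c2(x):
    # Inequality constraint: x1^2/4 + x2^2 <= 1, rewritten as c2(x) >= 0.
    return -0.25 * x[0] ** 2 - x[1] ** 2 + 1.0
```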

The Lagrangian

VMCON is an augmented Lagrangian solver, meaning it uses the Lagrange multipliers, \(\vec{\lambda}\), of a solution to characterise the quality of that solution.

Of specific note is the Lagrangian function:

\[L(\vec{x}, \vec{\lambda}) = f(\vec{x}) - \sum_{i=1}^m \lambda_i c_i(\vec{x})\]

and the derivative of the Lagrangian function, with respect to \(\vec{x}\):

(1)\[\nabla_X L(\vec{x}, \vec{\lambda}) = \nabla f(\vec{x}) - \sum_{i=1}^m \lambda_i \nabla c_i(\vec{x})\]
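
In code, equation (1) is a single matrix-vector product once the constraint gradients are stacked row-wise into a Jacobian. The following is a minimal sketch; the names `grad_f`, `jacobian_c`, and `lam` are assumptions about how the caller supplies the derivatives:

```python
import numpy as np


def lagrangian_gradient(grad_f, jacobian_c, lam):
    """Evaluate equation (1): grad f(x) - sum_i lambda_i * grad c_i(x).

    grad_f     : (n,) gradient of the objective at x
    jacobian_c : (m, n) matrix whose i'th row is grad c_i(x)
    lam        : (m,) Lagrange multipliers
    """
    return grad_f - jacobian_c.T @ lam


# Tiny usage example with made-up values (n = 2, m = 2).
grad_L = lagrangian_gradient(
    np.array([1.0, -2.0]),        # grad f(x)
    np.array([[1.0, -2.0],
              [-1.0, -2.0]]),     # rows: grad c_1(x), grad c_2(x)
    np.array([0.5, 0.0]),         # lambda
)
```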

Initialisation of VMCON

VMCON is initialised with:

  • The objective function to minimise, \(f(\vec{x})\), as described above.

  • The constraints \(c_i(\vec{x}), i = 1,...,m\), as described above.

  • An initial sample point \(\vec{x}_0\).

  • \(\mathbf{B}\): the initial Hessian approximation matrix, usually the identity matrix.

  • \(\epsilon\): the “user-supplied error tolerance”.

Note that \(\mathbf{B}\) must be of dimension \(d \times d\), where \(d = \max(n, m)\).

We also set the iteration number to 1, \(j=1\).
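
A minimal initialisation sketch, with illustrative values for \(\vec{x}_0\) and \(\epsilon\), might look as follows:

```python
import numpy as np

n = 2                       # number of optimisation variables
m = 2                       # number of constraints
x = np.array([2.0, 2.0])    # initial sample point x_0 (illustrative)
epsilon = 1e-8              # user-supplied error tolerance (illustrative)

# B must be d x d, where d = max(n, m); the identity is the usual choice.
d = max(n, m)
B = np.identity(d)

j = 1  # iteration counter
```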

The Quadratic Programming Problem

The Quadratic Programming Problem (QPP) is also known as the Quadratic Sub-Problem (QSP) because it forms only one part of the VMCON algorithm; the other part is the augmented Lagrangian.

The QPP provides the search direction \(\delta_j\), the vector along which \(\vec{x}_j\) will lie. Solving the QPP also provides the Lagrange multipliers, \(\vec{\lambda}_j\).

The quadratic program to be minimised on iteration \(j\) is:

\[Q(\delta) = f(\vec{x}_{j-1}) + \delta^T\nabla f(\vec{x}_{j-1}) + \frac{1}{2}\delta^T\mathbf{B}\delta\]

subject to

\[ \begin{align}\begin{aligned}\nabla c_i(\vec{x}_{j-1})^T\delta + c_i(\vec{x}_{j-1}) = 0, \quad i=1,...,k\\\nabla c_i(\vec{x}_{j-1})^T\delta + c_i(\vec{x}_{j-1}) \ge 0, \quad i=k+1,...,m\end{aligned}\end{align} \]
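
To make the QSP concrete, the sketch below minimises \(Q(\delta)\) subject to the linearised constraints using `scipy.optimize.minimize` (which defaults to SLSQP when constraints are supplied). This is only for illustration: a real implementation would use a dedicated QP solver, which would also return the Lagrange multipliers \(\vec{\lambda}_j\) that this sketch does not expose.

```python
import numpy as np
from scipy.optimize import minimize


def solve_qsp(grad_f, B, c, jacobian_c, k):
    """Sketch of the QSP on iteration j.

    grad_f     : (n,) gradient of f at x_{j-1}
    B          : (d, d) current Hessian approximation
    c          : (m,) constraint values c_i(x_{j-1})
    jacobian_c : (m, n) constraint gradients, row i = grad c_i(x_{j-1})
    k          : number of equality constraints
    """
    n = grad_f.shape[0]

    def Q(delta):
        # The constant f(x_{j-1}) does not affect the minimiser, so it is
        # omitted. Only the leading n x n block of B acts on the
        # n-dimensional delta.
        return delta @ grad_f + 0.5 * delta @ B[:n, :n] @ delta

    # Linearised constraints: grad c_i(x)^T delta + c_i(x) = 0 or >= 0.
    constraints = [
        {"type": "eq", "fun": lambda delta, i=i: jacobian_c[i] @ delta + c[i]}
        for i in range(k)
    ] + [
        {"type": "ineq", "fun": lambda delta, i=i: jacobian_c[i] @ delta + c[i]}
        for i in range(k, c.shape[0])
    ]

    result = minimize(Q, np.zeros(n), constraints=constraints)
    return result.x  # the search direction delta_j
```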

The Convergence Test

The convergence test is performed on the \(j\)’th iteration, after solving the QSP. It compares the sum of two terms against the user-supplied tolerance:

  • The predicted change in magnitude of the objective function.

  • The complementary error; a complementary error of 0 means that each constraint is either at equality or has a Lagrange multiplier of 0.

This is encapsulated in the equation:

\[\lvert \nabla f(\vec{x}_{j-1})^T \cdot \delta_j \rvert + \sum^m_{i=1}\lvert \lambda_{j,i} c_i(\vec{x}_{j-1}) \rvert < \epsilon\]
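
Given the quantities produced by the QSP, the test is a few lines of code. A sketch, assuming NumPy arrays of the indicated shapes:

```python
import numpy as np


def converged(grad_f, delta, lam, c, epsilon):
    """Evaluate the convergence test against the tolerance epsilon.

    grad_f : (n,) gradient of f at x_{j-1}
    delta  : (n,) search direction from the QSP
    lam    : (m,) Lagrange multipliers from the QSP
    c      : (m,) constraint values at x_{j-1}
    """
    predicted_change = abs(grad_f @ delta)
    complementary_error = np.sum(np.abs(lam * c))
    return predicted_change + complementary_error < epsilon
```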

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Quasi-Newton Update

The final stage of an iteration of the VMCON optimiser is to update the Hessian approximation via a BFGS update.

For an unconstrained problem, we use the following differences to update \(\mathbf{B}\):

\[\vec{\xi} = \vec{x}_j - \vec{x}_{j-1}\]
\[\vec{\gamma} = \nabla_XL(\vec{x}_j, \vec{\lambda}_j) - \nabla_XL(\vec{x}_{j-1}, \vec{\lambda}_j)\]

where \(\vec{\gamma}\) is calculated using (1).

Since we have a constrained problem, we define a further quantity:

\[\vec{\eta} = \theta\vec{\gamma} + (1-\theta)\mathbf{B}\vec{\xi}\]

where

\[\begin{split}\theta = \begin{cases} 1 ,& \text{if } \vec{\xi}^T\vec{\gamma} \geq 0.2\vec{\xi}^T\mathbf{B}\vec{\xi}\\ \frac{0.8\vec{\xi}^T\mathbf{B}\vec{\xi}}{\vec{\xi}^T\mathbf{B}\vec{\xi} - \vec{\xi}^T\vec{\gamma}},& \text{otherwise} \end{cases}\end{split}\]

The definition of \(\vec{\eta}\) ensures \(\mathbf{B}\) remains positive definite, which is a prerequisite for solving the QSP.

We can then perform the BFGS update:

\[\mathbf{B_{NEW}} = \mathbf{B} - \frac{\mathbf{B}\vec{\xi}\vec{\xi}^T\mathbf{B}}{\vec{\xi}^T\mathbf{B}\vec{\xi}} + \frac{ \vec{\eta} \vec{\eta}^T}{\vec{\xi}^T\vec{\eta}}\]
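
The full damped update is then a short routine. The sketch below assumes \(\vec{\xi}\) and \(\vec{\gamma}\) have the same dimension as \(\mathbf{B}\):

```python
import numpy as np


def bfgs_update(B, xi, gamma):
    """Damped BFGS update of the Hessian approximation B.

    xi    : x_j - x_{j-1}
    gamma : difference of Lagrangian gradients, from equation (1)
    """
    xi_B_xi = xi @ B @ xi
    xi_gamma = xi @ gamma

    # Choose theta so that xi . eta stays positive, keeping B positive
    # definite.
    if xi_gamma >= 0.2 * xi_B_xi:
        theta = 1.0
    else:
        theta = 0.8 * xi_B_xi / (xi_B_xi - xi_gamma)

    eta = theta * gamma + (1.0 - theta) * (B @ xi)

    B_xi = B @ xi
    return B - np.outer(B_xi, B_xi) / xi_B_xi + np.outer(eta, eta) / (xi @ eta)
```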

Overview of the VMCON Algorithm

This page covers the mathematics and theory behind the VMCON algorithm. For completeness, the following flow diagram demonstrates how the algorithm is implemented at a high level.

```mermaid
flowchart
    setup("Initialisation of VMCON") --> j1("j = 1")
    j1 --> qsp("The Quadratic Programming Problem (Lagrange multipliers and search direction)")
    qsp --> convergence_test(["Convergence criterion met?"])
    convergence_test -- "Yes" --> exit[["Exit"]]
    convergence_test -- "No" --> linesearch("Line search (next evaluation point)")
    linesearch --> bfgs("BFGS update")
    bfgs --> incrementj("j = j + 1")
    incrementj --> qsp
```