Structured solvers

A stochastic program has a structure that can be exploited in solver algorithms through decomposition. This can heavily reduce the computation time required to optimize the stochastic program, compared to solving the extensive form directly. Moreover, a distributed stochastic program is by definition decomposed and a structured solver that can operate in parallel will be much more efficient.

Stochastic structure

StochasticPrograms provides multiple alternatives for how finite stochastic program instances are represented and stored in memory. We refer to these alternatives as the structure of the stochastic program. Certain operations are more efficient in certain structures. We summarize the available structures in the following. For code examples, see the Quick start.

Deterministic Equivalent

The DeterministicEquivalent, instantiated using Deterministic, is the default structure in StochasticPrograms. A stochastic program instance is represented by one large optimization problem that considers all scenarios at once. This structure is supported by any standard third-party MathOptInterface solver. Moreover, it is the most efficient choice for smaller problem sizes.

Stage-decomposition

The StageDecompositionStructure, instantiated using StageDecomposition or Vertical, decomposes the stochastic program into stages. It is the structure induced by the L-shaped and quasi-gradient algorithms and is more efficient for larger instances. It is especially efficient for decision evaluation problems, such as when calculating VSS. In a distributed environment, the subproblems in later stages can be distributed on worker nodes. This distributed stage-decomposition structure is instantiated using either DistributedStageDecomposition or DistributedVertical.

Scenario-decomposition

The ScenarioDecompositionStructure, instantiated using ScenarioDecomposition or Horizontal, decomposes the stochastic program by scenarios. It is the structure induced by the progressive-hedging algorithm and is more efficient for larger instances. It is especially efficient for solving wait-and-see type problems, such as when calculating EVPI. In a distributed environment, the scenario subproblems can be distributed on worker nodes. This distributed scenario-decomposition structure is instantiated using either DistributedScenarioDecomposition or DistributedHorizontal.
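As an illustrative sketch of choosing between these structures, the structure can be requested when instantiating the program. The model and scenarios are assumed to come from the Quick start, and the exact instantiate signature (in particular the positional structure argument) should be treated as an assumption:

```julia
using StochasticPrograms
using GLPK

# Assumes `simple_model` and scenarios `ξ₁`, `ξ₂` are defined as in the Quick start.

# Default: deterministic equivalent, solvable by any standard MOI solver.
sp = instantiate(simple_model, [ξ₁, ξ₂], optimizer = GLPK.Optimizer)

# Request the stage-decomposition (vertical) structure explicitly:
sp_stage = instantiate(simple_model, [ξ₁, ξ₂], StageDecomposition(),
                       optimizer = LShaped.Optimizer)

# Request the scenario-decomposition (horizontal) structure explicitly:
sp_scenario = instantiate(simple_model, [ξ₁, ξ₂], ScenarioDecomposition(),
                          optimizer = ProgressiveHedging.Optimizer)
```

In practice, the structure can usually be left unspecified: a structure-exploiting solver selects an appropriate default through default_structure.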

Solver interface

The structured solver interface mimics that of MathOptInterface, and any structured solver must implement it to be compatible with StochasticPrograms. We distinguish between structure-exploiting solvers for solving finite stochastic programs and sample-based solvers for approximately solving stochastic models, even though they can be based on the same algorithm.

Stochastic programs

To interface a new structure-exploiting solver, define an AbstractStructuredOptimizer object. To follow the style of MathOptInterface, name the object Optimizer so that users can call set_optimizer(sp, SOLVER_MODULE.Optimizer) to use the optimizer. Next, implement load_structure!, which loads any stochastic structure supported by the solver. Define supports_structure to inform StochasticPrograms which structures the solver supports, and define default_structure to ensure that an appropriate structure is used when instantiating a stochastic program with your solver.

After a call to load_structure!, optimize! should solve the stochastic program or otherwise throw UnloadedStructure. After a call to optimize!, calling restore_structure! should remove any changes made to the model by the solver. For example, calling this method after running an L-shaped procedure removes all cutting planes from the first stage.

The solver should at least be able to return a solver for solving subproblems through subproblem_optimizer and can optionally also support a master_optimizer. Finally, the solver can optionally support a custom solver name through optimizer_name.

In summary, the solver interface that a new AbstractStructuredOptimizer should adhere to is given by load_structure!, supports_structure, default_structure, optimize!, restore_structure!, subproblem_optimizer, master_optimizer, and optimizer_name.

In addition, the solver can include support for getting, setting, and modifying any MathOptInterface attributes. See also the subtypes of AbstractStructuredOptimizerAttribute for special attributes defined by the framework. For more thorough examples of implementing the structured solver interface, see the L-shaped or Progressive-hedging implementations.
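A minimal skeleton of such a solver might look as follows. Only the method names come from the interface description above; the module name, field layout, and exact method signatures (in particular the arguments of load_structure! and default_structure) are assumptions, so consult the L-shaped implementation for the authoritative signatures:

```julia
# Hypothetical sketch of a structure-exploiting solver. Internals are stubbed.
module MySolver

using StochasticPrograms

mutable struct Optimizer <: StochasticPrograms.AbstractStructuredOptimizer
    structure::Any  # the loaded stochastic structure (nothing until loaded)
    sub_optimizer::Any
    Optimizer() = new(nothing, nothing)
end

# Declare which structures the solver can operate on.
function StochasticPrograms.supports_structure(::Optimizer,
        ::StochasticPrograms.StageDecompositionStructure)
    return true
end

# Pick an appropriate structure when the user did not specify one.
function StochasticPrograms.default_structure(instantiation, ::Optimizer)
    return StageDecomposition()
end

function StochasticPrograms.load_structure!(optimizer::Optimizer, structure, x₀)
    optimizer.structure = structure
    return nothing
end

function StochasticPrograms.optimize!(optimizer::Optimizer)
    # Throw if no structure has been loaded yet.
    optimizer.structure === nothing && throw(StochasticPrograms.UnloadedStructure{Optimizer}())
    # ... run the structure-exploiting algorithm here ...
    return nothing
end

function StochasticPrograms.restore_structure!(optimizer::Optimizer)
    # Undo any model modifications, e.g., remove cutting planes from the master.
    return nothing
end

StochasticPrograms.subproblem_optimizer(optimizer::Optimizer) = optimizer.sub_optimizer
StochasticPrograms.optimizer_name(::Optimizer) = "MySolver"

end # module
```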

Stochastic models

To interface a new sample-based solver, define an AbstractSampledOptimizer object. Next, implement load_model!, which should load a provided StochasticModel object into the solver. A call to optimize! should then approximately solve the model. Afterwards, a call to optimal_instance can optionally return a sampled instance with an optimal value within the confidence interval of the solution. Again, a custom solver name can be provided through optimizer_name.

In summary, the solver interface that a new AbstractSampledOptimizer should adhere to is given by load_model!, optimize!, optimal_instance, and optimizer_name.

See also the subtypes of AbstractSampledOptimizerAttribute for special attributes defined by the framework. For a thorough example, consider the SAA implementation.
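A corresponding skeleton for a sample-based solver is sketched below. As above, only the method names are taken from the interface description; the type name and signatures are assumptions, and the SAA implementation should be consulted for the authoritative versions:

```julia
# Hypothetical sketch of a sample-based solver; internals are stubbed.
using StochasticPrograms

mutable struct MySampledOptimizer <: StochasticPrograms.AbstractSampledOptimizer
    model::Any  # the loaded StochasticModel (nothing until loaded)
    MySampledOptimizer() = new(nothing)
end

function StochasticPrograms.load_model!(optimizer::MySampledOptimizer,
                                        model::StochasticModel, sampler, x₀)
    optimizer.model = model
    return nothing
end

function StochasticPrograms.optimize!(optimizer::MySampledOptimizer)
    # ... approximately solve the model, e.g., by solving sampled instances
    # of increasing size until a confidence criterion is met ...
    return nothing
end

StochasticPrograms.optimizer_name(::MySampledOptimizer) = "MySampledSolver"
```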

Crash methods

Some structure-exploiting algorithms benefit from crash starting in various ways. For example, a good initial guess in combination with a regularization procedure can improve convergence.

The following Crash methods are available in StochasticPrograms:

To use a Crash procedure, set the crash keyword in the call to optimize!.
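For example, a solve can be warm-started from the solution of the expected value problem. The sketch below assumes the Crash.EVP method and a crash keyword on optimize!, as described above:

```julia
# Warm-start the structure-exploiting solve from the expected value
# problem solution (one of the available Crash methods).
optimize!(sp, crash = Crash.EVP())
```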

L-shaped solvers

StochasticPrograms includes a collection of L-shaped algorithms in the submodule LShaped. All algorithm variants are based on the L-shaped method by Van Slyke and Wets. LShaped interfaces with StochasticPrograms through the structured solver interface. Every algorithm variant is an instance of the functor object LShapedAlgorithm and is instantiated through the API object LShaped.Optimizer. Consider the subtypes of AbstractLShapedAttribute for a summary of available configurations.

As an example, we solve the simple problem introduced in the Quick start:

set_optimizer(sp, LShaped.Optimizer)
set_optimizer_attribute(sp, MasterOptimizer(), GLPK.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
optimize!(sp)
L-Shaped Gap  Time: 0:00:02 (6 iterations)
  Objective:       -855.8333333333358
  Gap:             0.0
  Number of cuts:  8
  Iterations:      6

Note that an LP-capable AbstractOptimizer is required to solve the emerging subproblems.

LShaped uses a policy-based design. This allows combinatorially many variants of the original algorithm to be instantiated by supplying linearly many policies to the factory function LShaped.Optimizer. We briefly describe the various policies in the following.

Feasibility cuts

If the stochastic program does not have complete, or relatively complete, recourse then subproblems may be infeasible for some master iterates. Convergence can be maintained through the use of feasibility cuts. To reduce overhead and memory usage, feasibility issues are ignored by default. If you know that your problem does not have complete recourse, or if the algorithm terminates due to infeasibility, set the FeasibilityStrategy attribute to FeasibilityCuts:

set_optimizer_attribute(sp, FeasibilityStrategy(), FeasibilityCuts())
optimize!(sp)

Integer strategies

If the stochastic program includes binary or integer decisions, especially in the second stage, special strategies are required for the L-shaped algorithm to stay effective. Integer restrictions are ignored by default, and the procedure will generally not converge if they are present. An integer strategy is enabled by setting the IntegerStrategy attribute:

set_optimizer_attribute(sp, IntegerStrategy(), CombinatorialCuts())
optimize!(sp)

The following L-shaped integer strategies are available:

Note that CombinatorialCuts requires a third-party subproblem optimizer with integer capabilities. Convexification solves linear subproblems through cutting-plane approximations, determined by a convexification strategy. The currently available strategies are:

  • Gomory
  • LiftAndProject
  • CuttingPlaneTree

The Gomory strategy is the cheapest and often the most effective. The latter strategies involve solving extra linear programs using a supplied optimizer.

Regularization

A Regularization procedure can improve algorithm performance. The idea is to limit the candidate search to a neighborhood of the current best iterate in the master problem. This can result in more effective cutting planes. Moreover, regularization enables warm-starting the L-shaped procedure with Crash decisions. Regularization is enabled by setting the Regularizer attribute.

The following L-shaped regularizations are available:

Note that RegularizedDecomposition and LevelSet require an AbstractOptimizer capable of solving QP problems. Alternatively, the quadratic proximal term in the objective can be approximated through various linear terms. This is achieved by supplying an AbstractPenaltyTerm object through penaltyterm in either RD or LV. The alternatives are given below:
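For instance, level-set regularization can be combined with a linear approximation of the proximal term, so that an LP-only master optimizer suffices. The InfNorm penalty-term name below is an assumption based on the framework's penalty-term alternatives:

```julia
set_optimizer(sp, LShaped.Optimizer)
set_optimizer_attribute(sp, MasterOptimizer(), GLPK.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
# Level-set regularization with the quadratic proximal term replaced by
# a linear ∞-norm approximation, avoiding the need for a QP solver.
set_optimizer_attribute(sp, Regularizer(), LV(penaltyterm = InfNorm()))
optimize!(sp)
```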

Aggregation

Cut aggregation can be applied to reduce communication latency and load imbalance. This can yield major performance improvements in distributed settings. Aggregation is enabled by setting the Aggregator attribute.

The following aggregation schemes are available:

Consolidation

If cut consolidation is enabled, cuts from previous iterations that are no longer active are aggregated to reduce the size of the master. Consolidation is enabled by setting the Consolidator attribute to Consolidate. See Consolidation for further details.

Execution

There are three currently available modes of execution:

Running a distributed L-shaped algorithm, either synchronously or asynchronously, requires adding Julia worker processes with addprocs. The execution policy can be changed by setting the Execution attribute.
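A distributed run can be set up as sketched below. The Execution attribute and the Asynchronous policy follow the description above; the worker count is arbitrary:

```julia
using Distributed
addprocs(4)                     # add four Julia worker processes
@everywhere using StochasticPrograms
using GLPK

set_optimizer(sp, LShaped.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
# Run the L-shaped procedure asynchronously across the workers.
set_optimizer_attribute(sp, Execution(), Asynchronous())
optimize!(sp)
```

Note that the stochastic program itself must be instantiated in a distributed structure (e.g., DistributedStageDecomposition) for the subproblems to reside on the workers.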

Solver examples

Below are a few examples of L-shaped algorithms with advanced policy configurations:

function tr_with_partial_aggregation()
    opt = LShaped.Optimizer()
    MOI.set(opt, MasterOptimizer(), GLPK.Optimizer)
    MOI.set(opt, SubProblemOptimizer(), GLPK.Optimizer)
    MOI.set(opt, Regularizer(), TR()) # Set regularization to trust-region
    MOI.set(opt, Aggregator(), PartialAggregate(36)) # Use partial aggregation in groups of 36 cuts
    return opt
end

function lv_with_kmedoids_aggregation_and_consolidation()
    opt = LShaped.Optimizer()
    MOI.set(opt, MasterOptimizer(), Gurobi.Optimizer)
    MOI.set(opt, SubProblemOptimizer(), Gurobi.Optimizer)
    MOI.set(opt, Regularizer(), LV()) # Use level-set regularization
    MOI.set(opt, Aggregator(), ClusterAggregate(Kmedoids(20, distance = angular_distance))) # Use K-medoids cluster aggregation
    MOI.set(opt, Consolidator(), Consolidate()) # Enable consolidation
    return opt
end

# Employ advanced solvers
set_optimizer(sp, tr_with_partial_aggregation)
optimize!(sp)

set_optimizer(sp, lv_with_kmedoids_aggregation_and_consolidation)
optimize!(sp)

References

  1. Van Slyke, R. and Wets, R. (1969), L-Shaped Linear Programs with Applications to Optimal Control and Stochastic Programming, SIAM Journal on Applied Mathematics, vol. 17, no. 4, pp. 638-663.

  2. Ruszczyński, A (1986), A regularized decomposition method for minimizing a sum of polyhedral functions, Mathematical Programming, vol. 35, no. 3, pp. 309-333.

  3. Linderoth, J. and Wright, S. (2003), Decomposition Algorithms for Stochastic Programming on a Computational Grid, Computational Optimization and Applications, vol. 24, no. 2-3, pp. 207-250.

  4. Fábián, C. and Szőke, Z. (2006), Solving two-stage stochastic programming problems with level decomposition, Computational Management Science, vol. 4, no. 4, pp. 313-353.

  5. Wolf, C. and Koberstein, A. (2013), Dynamic sequencing and cut consolidation for the parallel hybrid-cut nested L-shaped method, European Journal of Operational Research, vol. 230, no. 1, pp. 143-156.

  6. Biel, M. and Johansson, M. (2018), Distributed L-shaped Algorithms in Julia, 2018 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI (PAW-ATM).

  7. Biel, M. and Johansson, M. (2019), Dynamic cut aggregation in L-shaped algorithms, arXiv preprint arXiv:1910.13752.

Progressive-hedging solvers

StochasticPrograms also includes a collection of progressive-hedging algorithms in the submodule ProgressiveHedging. All algorithm variants are based on the original progressive-hedging algorithm by Rockafellar and Wets. ProgressiveHedging interfaces with StochasticPrograms through the structured solver interface. Every algorithm variant is an instance of the functor object ProgressiveHedgingAlgorithm and is instantiated through the API object ProgressiveHedging.Optimizer. Consider the subtypes of AbstractProgressiveHedgingAttribute for a summary of available configurations.

As an example, we solve the simple problem introduced in the Quick start:

set_optimizer(sp, ProgressiveHedging.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), Ipopt.Optimizer)
optimize!(sp)
Progressive Hedging Time: 0:00:05 (303 iterations)
  Objective:   -855.5842547490254
  Primal gap:  7.2622997706326046e-6
  Dual gap:    8.749063651111478e-6
  Iterations:  302

Note that a QP/LP-capable AbstractOptimizer is required to solve the emerging subproblems.

ProgressiveHedging also uses a policy-based design. See ProgressiveHedging.Optimizer for options. We briefly describe the various policies in the following.

Penalty

There are two options for the penalty parameter used in the progressive-hedging algorithm. The alternatives are
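As a sketch, a fixed penalty can be selected instead of an adaptive update. The Penalizer attribute and the Fixed policy name are assumptions based on the framework's naming conventions, and the penalty value is arbitrary:

```julia
set_optimizer(sp, ProgressiveHedging.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), Ipopt.Optimizer)
# Keep the penalty parameter r at a constant value throughout the run,
# rather than updating it adaptively between iterations.
set_optimizer_attribute(sp, Penalizer(), Fixed(10.0))
optimize!(sp)
```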

Execution

The same execution policies as for LShaped are available in ProgressiveHedging, i.e.

Penalty term

As with the L-shaped variants with quadratic 2-norm terms, the 2-norm term in the progressive-hedging subproblems can be approximated. This enables the use of an AbstractOptimizer that only supports linear problems. The alternatives are as before:
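For example, replacing the quadratic term with a linear 1-norm approximation allows an LP-only solver such as GLPK to handle the subproblems. The PenaltyTerm attribute and the ManhattanNorm alternative are assumptions based on the penalty-term alternatives described above:

```julia
set_optimizer(sp, ProgressiveHedging.Optimizer)
# An LP-only subproblem solver suffices once the 2-norm term is linearized.
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
set_optimizer_attribute(sp, PenaltyTerm(), ManhattanNorm())
optimize!(sp)
```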

References

  1. R. T. Rockafellar and Roger J.-B. Wets (1991), Scenarios and Policy Aggregation in Optimization Under Uncertainty, Mathematics of Operations Research, vol. 16, no. 1, pp. 119-147.

  2. Zehtabian, S. and Bastin, F. (2016), Penalty parameter update strategies in progressive hedging algorithm.

Quasi-gradient solvers

StochasticPrograms also includes a collection of quasi-gradient algorithms in the submodule QuasiGradient. All algorithm variants are based on projected subgradient methods. QuasiGradient interfaces with StochasticPrograms through the structured solver interface. Every algorithm variant is an instance of the functor object QuasiGradientAlgorithm and is instantiated through the API object QuasiGradient.Optimizer. Consider the subtypes of AbstractQuasiGradientAttribute for a summary of available configurations.

As an example, we solve the simple problem introduced in the Quick start:

set_optimizer(sp, QuasiGradient.Optimizer)
set_optimizer_attribute(sp, MasterOptimizer(), Ipopt.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
optimize!(sp)
Quasi-gradient Progress 100%|██████████████████████████████████████████████████████████████████| Time: 0:00:08
  Objective:   -854.9691513511461
  ||∇Q||:      34.64997546896679
  Iterations:  1000

Note that a QP/LP-capable AbstractOptimizer is required to solve the emerging subproblems.

QuasiGradient also uses a policy-based design. See QuasiGradient.Optimizer for options. We briefly describe the various policies in the following.

Step-size

The following step-size policies are available:
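As a sketch, a constant step length could be configured as follows. The StepSize attribute and the Constant policy names are assumptions based on the framework's naming conventions, and the step length is arbitrary:

```julia
set_optimizer(sp, QuasiGradient.Optimizer)
set_optimizer_attribute(sp, SubProblemOptimizer(), GLPK.Optimizer)
# Take a fixed step of length 0.1 in every iteration.
set_optimizer_attribute(sp, StepSize(), Constant(0.1))
optimize!(sp)
```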

Prox

A proximal step is taken each iteration in a projected (sub)gradient method. The following prox steps are currently available:

At the very least, a polyhedral projection onto the first-stage constraints is required when solving stochastic programs.

Termination

The following termination criteria are available:

Execution

The following execution policies are available in QuasiGradient:

Smoothing

A smooth approximation can be applied to the subproblems to enable gradient-based methods that require smoothness. The smoothing procedure is based on Moreau envelopes.