Stochastic data

Decoupling data design and model design is a fundamental principle in StochasticPrograms. This decoupling is achieved through data injection. By data we mean parameters in an optimization problem. In StochasticPrograms, this data is either deterministic and related to a specific stage, or uncertain and related to a specific scenario.

Stage data

Stage data is related to parameters that always appear in the first or second stage of a stochastic program. These parameters are deterministic and are the same across all scenarios. Such parameters are conveniently included in stochastic models using @parameters. To showcase, we consider a minimal stochastic program:

\[\begin{aligned} \operatorname*{maximize}_{x \in \mathbb{R}} & \quad x + \operatorname{\mathbb{E}}_{\omega} \left[Q(x, \xi(\omega))\right] \\ \text{s.t.} & \quad l_1 \leq x \leq u_1 \end{aligned}\]

where

\[\begin{aligned} Q(x, \xi(\omega)) = \max_{y \in \mathbb{R}} & \quad q_{\omega} y \\ \text{s.t.} & \quad y + x \leq U \\ & \quad l_2 \leq y \leq u_2 \end{aligned}\]

and the stochastic variable

\[ \xi(\omega) = q_{\omega}\]

takes on the value $1$ or $-1$ with equal probability. Here, the first stage contains the two parameters: $l_1$ and $u_1$. The second stage contains the three scenario-independent parameters: $U$, $l_2$, and $u_2$. The following defines this problem in StochasticPrograms, with some chosen deault parameter values:

using StochasticPrograms
using GLPK

sm = @stochastic_model begin
    @stage 1 begin
        @parameters begin
            l₁ = -1.
            u₁ = 1.
        end
        @decision(model, l₁ <= x <= u₁)
        @objective(model, Max, x)
    end
    @stage 2 begin
        @parameters begin
            U = 2.
            l₂ = -1.
            u₂ = 1.
        end
        @uncertain q
        @variable(model, l₂ <= y <= u₂)
        @objective(model, Max, q*y)
        @constraint(model, y + x <= U)
    end
end

ξ₁ = @scenario q = 1. probability = 0.5
ξ₂ = @scenario q = -1. probability = 0.5

sp = instantiate(sm, [ξ₁,ξ₂], optimizer = GLPK.Optimizer)

println(sp)

print("VRP = $(VRP(sp))")
Deterministic equivalent problem
Max x - 0.5 y₂ + 0.5 y₁
Subject to
 y₁ ≥ -1.0
 y₂ ≥ -1.0
 y₁ ≤ 1.0
 y₂ ≤ 1.0
 x ∈ Decisions
 x ≥ -1.0
 x ≤ 1.0
 x + y₁ ≤ 2.0
 x + y₂ ≤ 2.0
Solver name: GLPK
VRP = 2.0

Now, we can investigate the impact of the stage parameters by changing them slightly and reinstantiate the problem. This is achieved by supplying the new parameter values as keyword arguments to instantiate:

sp = instantiate(sm, [ξ₁,ξ₂], l₁ = -2., u₁ = 2., U = 2., l₂ = -0.5, u₂ = 0.5, optimizer = GLPK.Optimizer)

println(sp)

print("VRP = $(VRP(sp))")
Deterministic equivalent problem
Max x - 0.5 y₂ + 0.5 y₁
Subject to
 y₁ ≥ -0.5
 y₂ ≥ -0.5
 y₁ ≤ 0.5
 y₂ ≤ 0.5
 x ∈ Decisions
 x ≥ -2.0
 x ≤ 2.0
 x + y₁ ≤ 2.0
 x + y₂ ≤ 2.0
Solver name: GLPK
VRP = 2.25

Scenario data

Any uncertain parameter in the second stage of a stochastic program should be included in some predefined AbstractScenario type. Hence, all uncertain parameters in a stochastic program must be identified before defining the models. In brief, StochasticPrograms demands two functions from this abstraction. The discrete probability of a given AbstractScenario occurring should be returned from probability. Also, the expected scenario out of a collection of given AbstractScenarios should be returned by expected. The predefined Scenario type adheres to this abstraction and is the recommended option for most models, as exemplified in the Quick start.

Instances of Scenario that match an @uncertain declaration are conveniently created using the @scenario macro. The syntax of these macros match, as is shown in the examples below. The following is a declaration of four scalar uncertain values:

@uncertain q₁ q₂ d₁ d₂

which is paired with a matching instantiation of a scenario containing these scalars:

ξ₁ = @scenario q₁ = 24.0 q₂ = 28.0 d₁ = 500.0 d₂ = 100.0 probability = 0.4
Scenario with probability 0.4
  q₁: 24.0
  q₂: 28.0
  d₁: 500.0
  d₂: 100.0

Below, an equivalent formulation is given that instead defines a random vector.

@uncertain ξ[i in 1:4]

paired with

ξ₁ = @scenario ξ[i in 1:4] = [24.0, 28.0, 500.0, 100.0] probability = 0.4
Scenario with probability 0.4 and underlying data:

[24.0, 28.0, 500.0, 100.0]

Multidimensional random data is also supported. A simple example is given below.

@uncertain ξ[i in 1:2, j in 1:3]

paired with

ξ₁ = @scenario ξ[i in 1:2, j in 1:3] = rand(2, 3) probability = rand()
Scenario with probability 0.49238476039070767 and underlying data:

[0.6740473008048378 0.003089412799658131 0.34082198657957297; 0.3135166874394546 0.34434780697595757 0.7281116320787477]

The assignment syntax is used to directly create the random matrix. The dimensions of the RHS must match the index declaration, which in turn must match the @uncertain declaration. It is also possible to construct more complex examples using JuMP's container syntax. For example,

@uncertain ξ[i in 1:3, k in [:a, :b, :c]]

and

data = Dict((1, :a) => 1.0, (2, :a) => 2.0, (3, :a) => 3.0,
            (1, :b) => 4.0, (2, :b) => 5.0, (3, :b) => 6.0,
            (1, :c) => 7.0, (2, :c) => 8.0, (3, :c) => 9.0)
ξ₁ = @scenario ξ[i in 1:3, k in [:a, :b, :c]] data[i,k] probability = rand()
Scenario with probability 0.2845066602391808 and underlying data:

2-dimensional DenseAxisArray{Float64,2,...} with index sets:
    Dimension 1, Base.OneTo(3)
    Dimension 2, [:a, :b, :c]
And data, a 3×3 Matrix{Float64}:
 1.0  4.0  7.0
 2.0  5.0  8.0
 3.0  6.0  9.0

or the shorthand:

ξ₁ = @scenario ξ[i in 1:3, k in [:a, :b, :c]] = [1. 2. 3.;
                                                 4. 5. 6.;
                                                 7. 8. 9] probability = rand()
Scenario with probability 0.9601335381899163 and underlying data:

2-dimensional DenseAxisArray{Float64,2,...} with index sets:
    Dimension 1, Base.OneTo(3)
    Dimension 2, [:a, :b, :c]
And data, a 3×3 Matrix{Float64}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0

Triangular and conditional indexing work as well:

@uncertain ξ[i in 1:3, j in 1:3; i <= j]

and

ξ₁ = @scenario ξ[i in 1:3, j in 1:3; i <= j] i+j probability = rand()
Scenario with probability 0.3650521971495615 and underlying data:

  [1, 1]  =  2
  [1, 2]  =  3
  [1, 3]  =  4
  [2, 2]  =  4
  [2, 3]  =  5
  [3, 3]  =  6

Error checking is performed during model instantiation to ensure that all provided scenarios adhere to the @uncertain declaration.

In addition, StochasticPrograms provides a convenience macro, @define_scenario, for creating scenario types that also adhere to the scenario abstraction. The following is an alternative way to define a scenario structure for the simple problem introduced in the Quick start:

using StochasticPrograms

@define_scenario SimpleScenario = begin
    q₁::Float64
    q₂::Float64
    d₁::Float64
    d₂::Float64
end

Now, $\xi_1$ and $\xi_2$ can be created through:

ξ₁ = SimpleScenario(24.0, 28.0, 500.0, 100.0, probability = 0.4)
SimpleScenario with probability 0.4
  q₁: 24.0
  q₂: 28.0
  d₁: 500.0
  d₂: 100.0

and

ξ₂ = SimpleScenario(28.0, 32.0, 300.0, 300.0, probability = 0.6)
SimpleScenario with probability 0.6
  q₁: 28.0
  q₂: 32.0
  d₁: 300.0
  d₂: 300.0

The defined SimpleScenarios automatically have the [AbstractScenario] functionality. For example, we can check the discrete probability of a given scenario occuring:

probability(ξ₁)
0.4

Moreover, we can form the expected scenario out of a given set:

ξ̄ = expected([ξ₁, ξ₂])
Expected scenario of type SimpleScenario
  q₁: 26.400000000000002
  q₂: 30.4
  d₁: 380.0
  d₂: 220.0

To use the defined scenario in a model, the following @uncertain syntax is used:

@uncertain ξ from SimpleScenario

There are some caveats to note. First, the autogenerated requires an additive zero element of the introduced scenario type. For simple numeric types this is autogenerated as well. However, say that we want to extend the above scenario with some vector parameter of size 2:

using StochasticPrograms

@define_scenario ExampleScenario = begin
    X::Float64
    Y::Vector{Float64}
end
┌ Warning: Zero not defined for Vector{Float64}. Cannot generate zero function.
└ @ StochasticPrograms ~/work/StochasticPrograms.jl/StochasticPrograms.jl/src/methods/util.jl:95
┌ Warning: The scenario type ExampleScenario was not defined. A user-provided implementation
│
│     function zero(::Type{{ExampleScenario})
│         ...
│     end
│
│ is required.
└ @ Main ?:0

In this case, we must provide an implementation of zero using @zero:

using StochasticPrograms

@define_scenario ExampleScenario = begin
    X::Float64
    Y::Vector{Float64}

    @zero begin
        return ExampleScenario(0.0, [0.0, 0.0])
    end
end

s₁ = ExampleScenario(1., ones(2), probability = 0.5)
s₂ = ExampleScenario(5., -ones(2), probability = 0.5)

println("Probability of s₁: $(probability(s₁))")

s = expected([s₁, s₂])

println("Expectation over s₁ and s₂: $s")
println("Expectated X: $(s.scenario.X)")
println("Expectated Y: $(s.scenario.Y)")
Probability of s₁: 0.5
Expectation over s₁ and s₂: Expected scenario of type ExampleScenario
  X: 3.0
  Y: [0.0, 0.0]
Expectated X: 3.0
Expectated Y: [0.0, 0.0]

Another caveat is that the expected function can only be auto generated for fields that support addition and scalar multiplication with Float64. Consider:

using StochasticPrograms

@define_scenario ExampleScenario = begin
    X::Float64
    Y::Vector{Float64}
    Z::Int

    @zero begin
        return ExampleScenario(0.0, [0.0, 0.0], 0)
    end
end
┌ Warning: Scalar multiplication with Float64 not defined for Int64. Cannot generate expectation function.
└ @ StochasticPrograms ~/work/StochasticPrograms.jl/StochasticPrograms.jl/src/methods/util.jl:109
┌ Warning: The scenario type ExampleScenario was not defined. A user-provided implementation
│
│     function expected(scenarios::Vector{ExampleScenario})
│         ...
│     end
│
│ is required.
└ @ Main ?:0

Again, the solution is to provide an implementation of expected, this time using @expectation:

using StochasticPrograms

@define_scenario ExampleScenario = begin
    X::Float64
    Y::Vector{Float64}
    Z::Int

    @zero begin
        return ExampleScenario(0.0, [0.0, 0.0], 0)
    end

    @expectation begin
        X = sum([probability(s)*s.X for s in scenarios])
        Y = sum([probability(s)*s.Y for s in scenarios])
        Z = sum([round(Int, probability(s)*s.Z) for s in scenarios])
        return ExampleScenario(X, Y, Z)
    end
end

s₁ = ExampleScenario(1., ones(2), 1, probability = 0.5)
s₂ = ExampleScenario(5., -ones(2), -1, probability = 0.5)

println("Probability of s₁: $(probability(s₁))")

s = expected([s₁, s₂])

println("Expectation over s₁ and s₂: $s")
println("Expectated X: $(s.scenario.X)")
println("Expectated Y: $(s.scenario.Y)")
println("Expectated Z: $(s.scenario.Z)")
Probability of s₁: 0.5
Expectation over s₁ and s₂: Expected scenario of type ExampleScenario
  X: 3.0
  Y: [0.0, 0.0]
  Z: 0
Expectated X: 3.0
Expectated Y: [0.0, 0.0]
Expectated Z: 0

For most problems, @define_scenario will probably be adequate. Otherwise consider defining Custom scenarios.

Sampling

Typically, we do not have exact knowledge of all possible future scenarios. However, we often have access to some model of the uncertainty. For example, scenarios could originate from:

  • A stochastic variable with known distribution
  • A time series fitted to data
  • A nerual network prediction

Even if the exact scenario distribution is unknown, or not all possible scenarios are available, we can still formulate a stochastic program that approximates the model we wish to formulate. This is achieved through a technique called sampled average approximation, which is based on sampling. The idea is to sample a large number $n$ of scenarios with equal probability $\frac{1}{n}$ and then use them to generate and solve a stochastic program. By the law of large numbers, the result will converge with probability $1$ to the "true" solution with increasing $n$.

StochasticPrograms accepts AbstractSampler objects in place of AbstractScenario. However, an AbstractSampler is always linked to some underlying AbstractScenario type, which is reflected in the resulting stochastic program as well.

The most basic sampler is the included Sampler, which is used to sample basic Scenarios. Consider

using StochasticPrograms

sampler = Sampler() do
    return Scenario(q₁ = 24.0 + randn(), q₂ = 28.0 + randn(), d₁ = 500.0 + randn(), d₂ = 100 + randn(), probability = rand())
end

sampler()
Scenario with probability 0.4886128300795012
  q₁: 24.297287984535462
  q₂: 28.382395967790607
  d₁: 499.4023655232718
  d₂: 99.98955475536262

Samplers can also be conveniently created using @sampler. We can define a simple scenario type and a simple sampler as follows:

using StochasticPrograms

@define_scenario ExampleScenario = begin
    w::Float64
end

@sampler ExampleSampler = begin
    w::Float64

    ExampleSampler(w::AbstractFloat) = new(w)

    @sample ExampleScenario begin
        w = sampler.w
        return ExampleScenario(w*randn(), probability = rand())
    end
end

This creates a new AbstractSampler type called ExampleSampler, which samples ExampleScenarios. Now, we can create a sampler object and sample a scenario

sampler = ExampleSampler(2.)

ξ = sampler()

println(ξ)
println("ξ: $(ξ.w)")
ExampleScenario with probability 0.951916339835734
  w: 0.6222226769966677
ξ: 0.6222226769966677

Now, lets create a stochastic model using the ExampleScenario type:

sm = @stochastic_model begin
    @stage 1 begin
        @decision(model, x >= 0)
        @objective(model, Min, x)
    end
    @stage 2 begin
        @uncertain w from ExampleScenario
        @variable(model, y)
        @objective(model, Min, y)
        @constraint(model, y + x == w)
    end
end
Two-Stage Stochastic Model

minimize f₀(x) + 𝔼[f(x,ξ)]
  x∈𝒳

where

f(x,ξ) = min  f(y; x, ξ)
              y ∈ 𝒴 (x, ξ)

Now, we can sample $5$ scenarios using the first sampler to generate $5$ subproblems:

sp = instantiate(sm, sampler, 5)
Stochastic program with:
 * 1 decision variable
 * 0 recourse variables
 * 5 scenarios of type ExampleScenario
Structure: Deterministic equivalent
Solver name: No optimizer attached.

Printing yields:

print(sp)
Deterministic equivalent problem
Min x + 0.2 y₅ + 0.2 y₄ + 0.2 y₃ + 0.2 y₂ + 0.2 y₁
Subject to
 x ∈ Decisions
 x ≥ 0.0
 x + y₁ = -4.534172697601061
 x + y₂ = 0.8628430528458241
 x + y₃ = 1.9265432100763813
 x + y₄ = -1.0446735148430168
 x + y₅ = -0.10090245986733057
Solver name: No optimizer attached.

Sampled stochastic programs are solved as usual:

using GLPK

set_optimizer(sp, GLPK.Optimizer)

optimize!(sp)

println("optimal decision: $(optimal_decision(sp))")
println("optimal value: $(objective_value(sp))")
optimal decision: [0.0]
optimal value: -0.5780724818778408

Again, if the functionality offered by @sampler is not adequate, consider Custom scenarios.

Custom scenarios

More complex scenario designs are probably not implementable using @define_scenario. However, it is still possible to create a custom scenario type as long as:

The restriction on expected is there to support taking expectations in a distributed environment. We are also free to define custom sampler objects, as long as:

  • The sampler type is a subtype of AbstractSampler
  • The sampler type implements a functor call that performs the sampling

See the Continuous scenario distribution for an example of custom scenario/sampler implementations.