6. Policy Search

Multi-objective Pareto optimization

In the previous section, we used exploratory modeling to understand where candidate policies succeed or fail. Now we use multi-objective optimization to automatically search for policies that balance competing objectives.

In DMDU practice, optimization and exploration are complementary (Kasprzyk et al. 2013): exploration reveals where candidate policies are vulnerable, while optimization searches for policies that perform well across the ensemble.

When objectives conflict (e.g., upfront cost vs. long-term damage), no single solution is best. Instead, we get a Pareto front: the set of solutions where improving one objective necessarily worsens another.
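The dominance relation behind a Pareto front can be sketched in a few lines of plain Julia (a standalone illustration, independent of SimOptDecisions; minimization is assumed for both objectives):

```julia
# Pareto dominance for minimization: a dominates b if a is no worse in
# every objective and strictly better in at least one.
dominates(a, b) = all(a .<= b) && any(a .< b)

# The Pareto front is the set of points that nothing else dominates.
pareto_filter(points) = [p for p in points if !any(q -> dominates(q, p), points)]

points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
pareto_filter(points)  # drops (3.0, 4.0), which (2.0, 3.0) dominates
```

This brute-force filter is quadratic in the number of points; evolutionary algorithms like NSGA-II apply the same dominance relation when ranking their population each generation.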

Setup

Type definitions
using SimOptDecisions
using Distributions
using Random
using CairoMakie
using Statistics

# === All type definitions (self-contained) ===

Base.@kwdef struct HouseConfig{T<:AbstractFloat} <: AbstractConfig
    horizon::Int = 70
    house_height::T = 4.0
    house_area_ft2::T = 1500.0
    house_value::T = 200_000.0
end

@scenariodef HouseScenario begin
    @timeseries water_levels
    @continuous dd_threshold
    @continuous dd_saturation
    @continuous discount_rate
end

function sample_scenario(rng::AbstractRNG, horizon::Int)
    # Draw GEV parameters from prior distributions
    μ = rand(rng, Normal(2.8, 0.3))
    σ = rand(rng, truncated(Normal(1.0, 0.15); lower=0.3))
    ξ = rand(rng, truncated(Normal(0.15, 0.05); lower=-0.2, upper=0.5))

    # Generate water level time series from GEV
    surge_dist = GeneralizedExtremeValue(μ, σ, ξ)
    water_levels = [rand(rng, surge_dist) for _ in 1:horizon]

    HouseScenario(
        water_levels = water_levels,
        dd_threshold = rand(rng, Normal(0.0, 0.25)),
        dd_saturation = rand(rng, Normal(8.0, 0.5)),
        discount_rate = rand(rng, truncated(Normal(0.03, 0.015); lower=0.01, upper=0.07)),
    )
end

struct HouseState{T<:AbstractFloat} <: AbstractState
    elevation_ft::T
end

struct ElevationAction{T<:AbstractFloat} <: AbstractAction
    elevation_ft::T
end

@policydef ElevationPolicy begin
    @continuous elevation_ft 0.0 14.0
end

function SimOptDecisions.get_action(
    policy::ElevationPolicy, state::HouseState, t::TimeStep, scenario::HouseScenario
)
    return ElevationAction(value(policy.elevation_ft))
end

function depth_damage(depth::T, threshold::T, saturation::T) where {T<:AbstractFloat}
    depth <= threshold && return zero(T)
    depth >= saturation && return one(T)
    midpoint = (threshold + saturation) / 2
    steepness = T(6) / (saturation - threshold)
    return one(T) / (one(T) + exp(-steepness * (depth - midpoint)))
end

function elevation_cost(Δh::Real, area_ft2::Real, house_value::Real)
    Δh <= 0 && return 0.0
    base_cost = 20_745.0
    thresholds = [0.0, 5.0, 8.5, 12.0, 14.0]
    rates = [80.36, 82.5, 86.25, 103.75, 113.75]
    rate = rates[1]
    for i in 1:(length(thresholds) - 1)
        if Δh <= thresholds[i + 1]
            t = (Δh - thresholds[i]) / (thresholds[i + 1] - thresholds[i])
            rate = rates[i] + t * (rates[i + 1] - rates[i])
            break
        end
        rate = rates[i + 1]
    end
    return (base_cost + area_ft2 * rate) / house_value
end
elevation_cost(Δh::Real) = elevation_cost(Δh, 1500.0, 200_000.0)

function SimOptDecisions.initialize(::HouseConfig, ::HouseScenario, ::AbstractRNG)
    return HouseState(0.0)
end

function SimOptDecisions.time_axis(config::HouseConfig, ::HouseScenario)
    return 1:(config.horizon)
end

function SimOptDecisions.run_timestep(
    state::HouseState, action::ElevationAction, t::TimeStep,
    config::HouseConfig, scenario::HouseScenario, rng::AbstractRNG
)
    if is_first(t)
        new_elevation = state.elevation_ft + action.elevation_ft
        construction_cost = elevation_cost(action.elevation_ft, config.house_area_ft2, config.house_value)
    else
        new_elevation = state.elevation_ft
        construction_cost = 0.0
    end

    water_level = scenario.water_levels[t]
    floor_level = config.house_height + new_elevation
    flood_depth = water_level - floor_level
    damage = depth_damage(flood_depth, value(scenario.dd_threshold), value(scenario.dd_saturation))

    return (HouseState(new_elevation), (construction_cost=construction_cost, damage_fraction=damage))
end

function SimOptDecisions.compute_outcome(
    step_records::Vector,
    config::HouseConfig, scenario::HouseScenario
)
    construction_cost = step_records[1].construction_cost
    npv_damages = sum(
        step_records[t].damage_fraction * discount_factor(value(scenario.discount_rate), t)
        for t in eachindex(step_records)
    )
    return (construction_cost=construction_cost, npv_damages=npv_damages)
end
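Before optimizing, it is worth sanity-checking the physics helpers in isolation. Here is a standalone copy of the logistic depth-damage curve above (type parameters dropped for brevity), evaluated at a few concrete depths:

```julia
# Simplified copy of the depth_damage function from the listing above:
# a logistic curve clamped to 0 below `threshold` and 1 above `saturation`.
function depth_damage(depth, threshold, saturation)
    depth <= threshold && return 0.0
    depth >= saturation && return 1.0
    midpoint = (threshold + saturation) / 2
    steepness = 6 / (saturation - threshold)
    return 1 / (1 + exp(-steepness * (depth - midpoint)))
end

depth_damage(-1.0, 0.0, 8.0)  # below threshold: no damage (0.0)
depth_damage(4.0, 0.0, 8.0)   # at the midpoint: half damage (0.5)
depth_damage(9.0, 0.0, 8.0)   # beyond saturation: total loss (1.0)
```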
Note: What optimize() Needs

When all fields in a @policydef are bounded @continuous, everything needed for optimization is auto-derived:

  • params(policy) — extracts current values as a vector
  • param_bounds(::Type{<:MyPolicy}) — returns bounds from field definitions
  • MyPolicy(x::AbstractVector) — vector constructor for the optimizer

No manual method definitions are required. If some fields are not bounded @continuous, you must define param_bounds(::Type) and the vector constructor manually.

Unlike explore(), optimize() does not need parameter-wrapped outcomes. It works with simulate() directly and uses calculate_metrics to aggregate plain outcome values.

Outcomes vs Metrics

Before optimization, understand two key concepts:

  • Outcome: Result of a single simulation (one policy × one scenario)
  • Metric: Aggregated statistic across many scenarios for a given policy

The optimizer evaluates policies by computing metrics across the scenario ensemble.

"""
Compute metrics from outcomes across many scenarios.
Returns a NamedTuple that the optimizer can reference.
"""
function calculate_metrics(outcomes)
    damages = [o.npv_damages for o in outcomes]
    construction = outcomes[1].construction_cost

    return (
        construction_cost=construction,
        expected_damage=mean(damages),
        cvar_95=mean(sort(damages)[(end - max(1, length(damages) ÷ 20) + 1):end]),  # mean of worst ~5%
        worst_case=maximum(damages),
    )
end
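To make cvar_95 concrete: CVaR at the 95% level is the mean of the worst 5% of outcomes, a tail-risk metric that is more informative than the single worst case. A standalone sketch on a toy damage vector:

```julia
using Statistics

# CVaR at level α: the mean of the worst (1 - α) fraction of outcomes.
function cvar(damages, α=0.95)
    k = max(1, ceil(Int, (1 - α) * length(damages)))  # size of the tail
    return mean(sort(damages)[(end - k + 1):end])
end

damages = collect(1.0:100.0)  # 100 simulated damage values
cvar(damages, 0.95)           # mean of the 5 worst: (96 + ... + 100)/5 = 98.0
```

With fewer than 1/(1 - α) samples, the `max(1, …)` guard makes CVaR degrade gracefully to the sample maximum.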

Setting Up the Problem

config = HouseConfig()
n_scenarios = 1000
rng = Random.Xoshiro(42)
scenarios = [sample_scenario(rng, config.horizon) for _ in 1:n_scenarios]

println("Config: $(config.horizon)-year horizon, \$$(Int(config.house_value)) house")
println("Scenarios: $(n_scenarios) sampled futures")
Config: 70-year horizon, $200000 house
Scenarios: 1000 sampled futures

Multi-Objective Problem

We trade off construction cost against expected damage—conflicting objectives:

import Metaheuristics  # loads the optimization extension

# Define objectives for optimization
objectives = [minimize(:construction_cost), minimize(:expected_damage)]
2-element Vector{Objective}:
 Objective(:construction_cost, Minimize)
 Objective(:expected_damage, Minimize)
Tip: Handling Discrete Choices

Real elevation choices are discrete: you either don’t elevate (0 ft) or elevate to at least 3 ft. Rather than encoding this discontinuity in the policy (which confuses the optimizer), we:

  1. Optimize over the continuous [3, 14] range
  2. Add the 0 ft baseline using merge_into_pareto!

This gives the optimizer a smooth landscape while including discrete alternatives.
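The merge in step 2 can be illustrated with plain tuples (a sketch of the dominance logic only, not the actual merge_into_pareto! internals):

```julia
dominates(a, b) = all(a .<= b) && any(a .< b)

# Insert a candidate into a Pareto set: keep it only if nothing in the
# set dominates it, and drop any existing points it dominates.
function merge_candidate(front, candidate)
    any(p -> dominates(p, candidate), front) && return front
    return push!(filter(p -> !dominates(candidate, p), front), candidate)
end

front = [(0.72, 0.28), (0.96, 0.01)]  # (cost, damage) pairs from the optimizer
merge_candidate(front, (0.0, 1.49))   # the 0 ft baseline joins the front
```

The 0 ft baseline survives the merge because it has the lowest possible construction cost, so no elevated alternative can dominate it.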

Running Optimization

# NSGA-II is a popular multi-objective evolutionary algorithm
backend = MetaheuristicsBackend(;
    algorithm=:NSGA2,
    max_iterations=100,
    population_size=30,
    parallel=true
)
MetaheuristicsBackend(:NSGA2, 100, 30, true, Dict{Symbol, Any}())
# Use flat API: config, scenarios, policy_type, metric_calculator, objectives
opt_result = optimize(
    config, scenarios, ElevationPolicy, calculate_metrics, objectives;
    backend=backend,
    bounds=[(3.0, 14.0)]  # custom bounds for smooth optimization
)

# Add the "no elevation" baseline to the Pareto front
merge_into_pareto!(
    opt_result, config, scenarios, ElevationPolicy(elevation_ft=0.0),
    calculate_metrics, objectives
)

println("Found $(length(opt_result.pareto_params)) Pareto-optimal solutions")
Found 31 Pareto-optimal solutions

Understanding the Pareto Front

A solution is Pareto-optimal if you can’t improve one objective without worsening another. The Pareto front shows the efficient frontier of trade-offs.

Code
let
    fig = Figure(; size=(600, 600))
    ax = Axis(fig[1, 1];
        xlabel="Construction cost (fraction of house value)",
        ylabel="Expected NPV damages (fraction of house value)",
        title="Pareto Front: Cost vs Damage Trade-off",
        aspect=1)

    pareto_pts = collect(SimOptDecisions.pareto_front(opt_result))

    if !isempty(pareto_pts)
        construction = [obj[1] for (_, obj) in pareto_pts]
        expected = [obj[2] for (_, obj) in pareto_pts]
        elevations = [round(p[1]; digits=1) for (p, _) in pareto_pts]

        # Iso-cost lines (x + y = constant)
        all_costs = construction .+ expected
        cost_range = range(minimum(all_costs) * 0.8, maximum(all_costs) * 1.1; length=6)
        for total_cost in cost_range
            lines!(ax, [0.0, total_cost], [total_cost, 0.0];
                color=(:gray, 0.3), linewidth=0.5, linestyle=:dash)
        end

        scatter!(ax, construction, expected; markersize=12, color=:steelblue)

        # Label selected points
        labeled = Set{Float64}()
        for (i, elev) in enumerate(elevations)
            if all(abs(elev - l) > 1.5 for l in labeled)
                text!(ax, construction[i], expected[i];
                    text="$(round(Int, elev)) ft", align=(:left, :bottom), offset=(5, 5))
                push!(labeled, elev)
            end
        end
    end

    fig
end
Figure 1: Pareto front showing trade-off between construction cost and expected damage

Interpreting the Results

The diagonal lines show iso-cost contours (total cost = construction + damage). Moving along the Pareto front:

  • Lower left: Infeasible—you can’t get low cost and low damage
  • Upper left: Low construction, high damage (0 ft)
  • Lower right: High construction, low damage (14 ft)
  • The curve: Efficient trade-offs between extremes
println("Pareto-optimal solutions:")
for (params, objs) in pareto_front(opt_result)  # objs, to avoid shadowing `objectives`
    elev = round(params[1], digits=1)
    cost = round(objs[1], digits=3)
    damage = round(objs[2], digits=3)
    total = round(objs[1] + objs[2], digits=3)
    println("  $(elev) ft → cost=$cost, damage=$damage (total=$total)")
end
Pareto-optimal solutions:
  3.0 ft → cost=0.716, damage=0.275 (total=0.991)
  14.0 ft → cost=0.957, damage=0.01 (total=0.967)
  3.9 ft → cost=0.719, damage=0.182 (total=0.901)
  13.1 ft → cost=0.923, damage=0.013 (total=0.935)
  12.8 ft → cost=0.912, damage=0.014 (total=0.925)
  12.6 ft → cost=0.906, damage=0.014 (total=0.92)
  12.3 ft → cost=0.892, damage=0.015 (total=0.907)
  10.4 ft → cost=0.822, damage=0.024 (total=0.846)
  9.7 ft → cost=0.796, damage=0.029 (total=0.825)
  9.3 ft → cost=0.779, damage=0.032 (total=0.811)
  7.4 ft → cost=0.742, damage=0.054 (total=0.796)
  6.8 ft → cost=0.737, damage=0.065 (total=0.802)
  5.4 ft → cost=0.725, damage=0.105 (total=0.831)
  5.3 ft → cost=0.725, damage=0.107 (total=0.832)
  5.0 ft → cost=0.722, damage=0.122 (total=0.844)
  4.2 ft → cost=0.72, damage=0.164 (total=0.884)
  8.2 ft → cost=0.748, damage=0.043 (total=0.791)
  3.2 ft → cost=0.717, damage=0.255 (total=0.972)
  13.4 ft → cost=0.935, damage=0.011 (total=0.947)
  3.3 ft → cost=0.717, damage=0.24 (total=0.957)
  12.4 ft → cost=0.896, damage=0.015 (total=0.911)
  3.5 ft → cost=0.718, damage=0.219 (total=0.937)
  3.4 ft → cost=0.717, damage=0.229 (total=0.946)
  13.7 ft → cost=0.947, damage=0.011 (total=0.958)
  7.2 ft → cost=0.74, damage=0.057 (total=0.798)
  4.3 ft → cost=0.72, damage=0.154 (total=0.875)
  12.0 ft → cost=0.881, damage=0.017 (total=0.897)
  3.6 ft → cost=0.718, damage=0.205 (total=0.923)
  4.7 ft → cost=0.721, damage=0.136 (total=0.857)
  10.6 ft → cost=0.831, damage=0.023 (total=0.854)
  0.0 ft → cost=0.0, damage=1.486 (total=1.486)

Selecting from the Pareto Front

The optimizer finds the efficient frontier, but choosing a point on it is a decision, not an optimization. This is a fundamental insight in DMDU: the analyst’s job is to characterize the trade-offs, not to prescribe a single answer.

Different stakeholders, values, and institutional contexts lead to different choices:

Stakeholder                Preference                       Likely Choice
Risk-averse homeowner      Minimize worst-case outcome      Higher elevation
Budget-constrained owner   Minimize upfront cost            Lower elevation
Insurance company          Minimize expected total cost     Middle of the Pareto curve
Community planner          Meet a cost threshold reliably   Depends on the threshold

The Pareto front makes these trade-offs explicit, enabling informed deliberation rather than opaque optimization.

# Find the solution with minimum total expected cost
pareto_pts = collect(pareto_front(opt_result))
total_costs = [obj[1] + obj[2] for (_, obj) in pareto_pts]
best_idx = argmin(total_costs)
best_params, best_obj = pareto_pts[best_idx]

println("Minimum expected total cost:")
println("  Elevation: $(round(best_params[1], digits=1)) ft")
println("  Construction: $(round(best_obj[1], digits=3))")
println("  Expected damage: $(round(best_obj[2], digits=3))")
println("  Total: $(round(best_obj[1] + best_obj[2], digits=3))")
Minimum expected total cost:
  Elevation: 8.2 ft
  Construction: 0.748
  Expected damage: 0.043
  Total: 0.791

Summary

We’ve covered the complete SimOptDecisions workflow:

  1. Define the problem with types (Config, Scenario, State, Action, Policy)
  2. Implement physics (depth-damage, elevation cost)
  3. Write callbacks (initialize, time_axis, get_action, run_timestep, compute_outcome)
  4. Run simulations with simulate()
  5. Evaluate policies across many scenarios
  6. Explore systematically with grid search or explore()
  7. Optimize with multi-objective algorithms to find Pareto-optimal solutions

Key Takeaways

  1. Structure your uncertainty in the Scenario type
  2. Separate physics from framework in standalone functions
  3. Validate with exploration before optimizing
  4. Use multi-objective optimization when objectives conflict
  5. The Pareto front shows trade-offs; choosing a point requires judgment
Tip: Connecting Optimization Back to Exploration

In a full DMDU analysis, the Pareto-optimal policies found here would be stress-tested using explore() against a broader or different scenario ensemble. This closes the loop: optimization finds promising policies, and exploratory modeling reveals whether they are robust. If a Pareto-optimal policy has a region of vulnerability (visible in a scenario map), you might redesign the policy or add constraints before re-optimizing.

Discussion

This tutorial used a deliberately simplified model to illustrate the DMDU workflow. In a real analysis, you would want to critically examine these simplifications:

  • Stationary climate: We drew storm surges from a fixed distribution, but climate change is shifting flood risk over time. The scenario type supports time-varying parameters via @timeseries.
  • No sea-level rise: Rising seas will compound storm surge risk. The State type can track evolving baseline conditions.
  • Independent damages: We assumed each year’s damage is independent, but cumulative flooding can worsen structural vulnerability. The state-based simulation loop supports path-dependent effects.
  • Single, static decision: We chose one elevation at time zero. In practice, decision-makers can adapt over time. More complex Policy types can encode adaptive decision rules (e.g., “elevate further if damages exceed a threshold”).
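As a sketch of what such an adaptive decision rule might look like (hypothetical types, not part of the tutorial's model):

```julia
# Hypothetical adaptive rule: elevate a further step whenever cumulative
# damages since the last action exceed a trigger threshold.
struct AdaptiveRule
    trigger::Float64  # damage fraction (of house value) that triggers action
    step_ft::Float64  # how much additional elevation to add when triggered
end

decide(rule::AdaptiveRule, cumulative_damage) =
    cumulative_damage > rule.trigger ? rule.step_ft : 0.0

rule = AdaptiveRule(0.10, 3.0)
decide(rule, 0.05)  # below trigger: no action (0.0)
decide(rule, 0.25)  # above trigger: elevate a further 3.0 ft
```

Because run_timestep already threads state through the simulation, encoding such a rule mainly means adding the cumulative damage to the State type and branching on it in get_action.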

The value of the DMDU workflow is that it makes these modeling choices transparent. By exploring how results depend on assumptions, analysts and stakeholders can identify which simplifications matter most and where further modeling effort is needed.

References

Kasprzyk, Joseph R., Shanthi Nataraj, Patrick M. Reed, and Robert J. Lempert. 2013. “Many Objective Robust Decision Making for Complex Environmental Systems Undergoing Change.” Environmental Modelling & Software 42 (April): 55–71. https://doi.org/10.1016/j.envsoft.2012.12.007.