A single simulation gives us one outcome. But we need to evaluate a policy across many scenarios and summarize the results. This is where metrics come in.
A metric summarizes outcomes across many scenarios. Common metrics include:
| Metric | What it answers |
|---|---|
| Expected value | What's the average outcome? |
| Variance | How much does the outcome vary? |
| Probability | How likely is an event (e.g., damage > 50%)? |
| Quantile | What's the worst-case at a given confidence level? |
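All four reduce a vector of per-scenario values to a single number, and they map directly onto Julia's Statistics standard library. A minimal sketch, using a random `costs` vector as a stand-in for real simulation outcomes:

```julia
using Statistics

costs = rand(500)        # stand-in: one outcome value per scenario

mean(costs)              # expected value
var(costs)               # variance
mean(costs .> 0.5)       # probability of an event (here: cost > 0.5)
quantile(costs, 0.95)    # quantile: worst case at the 95% confidence level
```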
Evaluating a Policy Across Scenarios
Let’s run one policy against many scenarios:
```julia
n_scenarios = 500
rng = Random.Xoshiro(123)
scenarios = [sample_scenario(rng, config.horizon) for _ in 1:n_scenarios]

# Evaluate one policy across all scenarios
policy = ElevationPolicy(elevation_ft=5.0)
outcomes = [simulate(config, s, policy) for s in scenarios]

# Extract total costs
total_costs = [o.total_cost for o in outcomes]

println("Mean total cost: $(round(mean(total_costs), digits=3))")
println("Std deviation: $(round(std(total_costs), digits=3))")
println("5th percentile: $(round(quantile(total_costs, 0.05), digits=3))")
println("95th percentile: $(round(quantile(total_costs, 0.95), digits=3))")
```
```
Mean total cost: 0.819
Std deviation: 0.199
5th percentile: 0.722
95th percentile: 1.222
```
```julia
let
    fig = Figure(; size=(700, 400))
    ax = Axis(fig[1, 1];
        xlabel="Total cost (fraction of house value)",
        ylabel="Count",
        title="Outcome Distribution (5ft elevation, $(n_scenarios) scenarios)")
    hist!(ax, total_costs; bins=30, color=:steelblue)
    vlines!(ax, [mean(total_costs)]; color=:red, linewidth=2, label="Mean")
    vlines!(ax, [quantile(total_costs, 0.95)]; color=:orange, linewidth=2,
        linestyle=:dash, label="95th percentile")
    axislegend(ax; position=:rt)
    fig
end
```
Figure 1: Distribution of total costs across scenarios (5ft elevation)
The calculate_metrics Function
For optimization, we need a function that takes outcomes and returns metrics. This is calculate_metrics:
```julia
function calculate_metrics(outcomes)
    total_costs = [o.total_cost for o in outcomes]
    return (
        expected_cost = mean(total_costs),
        cost_variance = var(total_costs),
        worst_5pct = quantile(total_costs, 0.95),  # 95th percentile = worst 5%
    )
end
```
Now we can compare policies using consistent metrics:
```julia
let
    elevations = 0:14
    expected_costs = Float64[]
    worst_5pcts = Float64[]
    for elev in elevations
        policy = ElevationPolicy(elevation_ft=Float64(elev))
        outcomes = [simulate(config, s, policy) for s in scenarios]
        metrics = calculate_metrics(outcomes)
        push!(expected_costs, metrics.expected_cost)
        push!(worst_5pcts, metrics.worst_5pct)
    end

    fig = Figure(; size=(700, 500))
    ax = Axis(fig[1, 1];
        xlabel="Expected cost (fraction of house value)",
        ylabel="Worst 5% cost (fraction of house value)",
        title="Trade-off: Expected vs Worst-Case Cost")
    scatter!(ax, expected_costs, worst_5pcts; markersize=15)
    for (i, elev) in enumerate(elevations)
        text!(ax, expected_costs[i], worst_5pcts[i];
            text=string(elev), align=(:left, :bottom), offset=(5, 5))
    end
    fig
end
```
Figure 2: Expected cost vs worst-case cost for different elevations
This reveals a Pareto frontier: along it, no policy can improve on expected cost without worsening worst-case cost (or vice versa).
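To pick out the frontier programmatically, here is a minimal sketch; `dominated` is an illustrative helper, not part of SimOptDecisions, and the code assumes the `config`, `scenarios`, `ElevationPolicy`, `simulate`, and `calculate_metrics` definitions from above:

```julia
# A policy is dominated if some other policy is at least as good on both
# metrics and strictly better on at least one.
elevations = collect(0.0:14.0)
metrics = [calculate_metrics([simulate(config, s, ElevationPolicy(elevation_ft=e)) for s in scenarios])
           for e in elevations]
exp_cost = [m.expected_cost for m in metrics]
worst    = [m.worst_5pct for m in metrics]

dominated(i) = any(j != i &&
                   exp_cost[j] <= exp_cost[i] && worst[j] <= worst[i] &&
                   (exp_cost[j] < exp_cost[i] || worst[j] < worst[i])
                   for j in eachindex(elevations))

frontier = [elevations[i] for i in eachindex(elevations) if !dominated(i)]
println("Non-dominated elevations (ft): ", frontier)
```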
Risk Attitudes and Metric Choice
The “right” metric depends on the decision-maker’s values and the institutional context. In decision making under deep uncertainty (DMDU), the choice of metric is itself a modeling decision with real consequences (Herman et al. 2015):
| Risk Attitude | Metric | Favors |
|---|---|---|
| Risk-neutral | Expected cost | Policies that perform well on average across scenarios |
| Risk-averse | 95th percentile or CVaR | Policies that limit worst-case losses |
| Regret-averse | Maximum regret | Policies whose worst-case deviation from the best possible choice is small |
| Satisficing | Probability of cost < threshold | Policies most likely to meet a minimum performance standard |
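None of these requires new machinery: each is a different aggregation of the same per-scenario costs. A rough sketch of the risk-averse, regret-averse, and satisficing aggregations, with illustrative helper names (not part of SimOptDecisions) and an assumed `cost_by_policy` matrix holding total cost per policy (rows) and scenario (columns):

```julia
using Statistics

# Illustrative helpers, not part of SimOptDecisions.

# Conditional value at risk: average cost in the worst (1 - α) tail.
cvar(costs; α=0.95) = mean(c for c in costs if c >= quantile(costs, α))

# Maximum regret of policy p: largest gap to the best achievable cost in each scenario.
# cost_by_policy[p, s] = total cost of policy p in scenario s.
function max_regret(cost_by_policy, p)
    best = vec(minimum(cost_by_policy; dims=1))   # best cost per scenario
    return maximum(cost_by_policy[p, :] .- best)
end

# Satisficing: fraction of scenarios where cost stays below a threshold.
satisfice(costs, threshold) = mean(costs .< threshold)
```

Any of these can be returned from `calculate_metrics` alongside `expected_cost` and `worst_5pct`.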
Tip: Robustness vs. Optimality
In classical optimization, we seek the policy that minimizes (or maximizes) some objective. In DMDU, we often seek robust policies—ones that perform acceptably well across a wide range of futures, even if they aren’t optimal in any single scenario.
A policy that minimizes expected cost might perform terribly in the worst 5% of scenarios. A slightly more expensive policy might avoid catastrophic outcomes entirely. The Pareto frontier in the plot above captures exactly this trade-off.
The calculate_metrics function lets you define whatever aggregations are relevant for your problem. Different metrics lead to different “optimal” policies—which is why exploring the full outcome distribution matters.
Built-in Metric Types
SimOptDecisions provides declarative metric types for common cases:
```julia
# These are equivalent to what we computed manually:
ExpectedValue(:total_cost)        # mean(outcomes.total_cost)
Variance(:total_cost)             # var(outcomes.total_cost)
Quantile(:total_cost, 0.95)       # quantile(outcomes.total_cost, 0.95)
Probability(:total_cost, >, 0.5)  # fraction where total_cost > 0.5
```
These are used with compute_metrics(metric, outcomes) for convenience, but for optimization you’ll typically write a custom calculate_metrics function.
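As a sketch of what that looks like (the scalar return shape of `compute_metrics` is an assumption here; check the package docs):

```julia
# Uses the documented compute_metrics(metric, outcomes) call;
# treating the result as a single scalar is an assumption.
worst_5pct = compute_metrics(Quantile(:total_cost, 0.95), outcomes)
println("Worst-5% cost: ", worst_5pct)
```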
Summary
- Outcome: Result of one simulate() call
- Metric: Summary statistic across many outcomes
- calculate_metrics(outcomes): Your function that aggregates outcomes into named metrics
The choice of metrics reflects your values and risk attitude. Different metrics lead to different “optimal” policies.
Tip: Coming Up: explore()
Manually looping over scenarios works, but explore() does this automatically with:
- Parallel execution via Executors
- Automatic result organization as YAXArrays
- Common Random Numbers for variance reduction
In the next section, we’ll use explore() to systematically run all policy×scenario combinations and analyze the results.
References
Herman, Jonathan D., Patrick M. Reed, Harrison B. Zeff, and Gregory W. Characklis. 2015. “How Should Robustness Be Defined for Water Systems Planning Under Change?” Journal of Water Resources Planning and Management 141 (10): 04015012. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000509.