Using Discrete and Continuous Named Probability Distributions
When modeling an uncertain variable, you may sometimes want to represent the
probability distribution with a specific distribution type such as the
normal, lognormal, or exponential distribution. DPL allows you to
easily incorporate such "named" distributions into your analyses.
For example, suppose you are analyzing the profits of a production facility,
and one of the key uncertainties is the number of failures that will occur in
the machines used in the production process. You are currently representing
this uncertainty with the chance event "Machine failures".
Discrete Named Distributions
Based on statistical analysis of historical data, your engineers have recommended
that you use a Poisson distribution with mean 1.4 to represent the probability
distribution for machine failures. To enter this distribution into your DPL model,
double-click on the "Machine failures" node, and then select Poisson from the Distribution
type dropdown list. Enter 1.4 for the mean.
When using a named distribution, you may notice that in each "Value" field
in the node data window there is an asterisk instead of a number.
In a decision tree analysis, it is necessary to define a finite number of possible
values for each chance event. However, a named distribution may have an infinite
number of possible values, whether it is continuous (such as the normal) or
discrete (such as the Poisson). To use such a distribution in a decision tree,
DPL must approximate the distribution
with a finite number of probability/value pairs, which it computes. The asterisk
indicates that DPL will supply the values associated with the distribution.
Note that in this process, called discretization, DPL will generate a discrete
distribution (with up to six states) that has the same low-order moments (mean,
variance, etc.) as the original named distribution. (If n is the number of states,
the approximation will match the named distribution's first 2n-1 moments.)
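As a rough illustration of moment matching, the sketch below builds a two-state approximation of the Poisson distribution with mean 1.4 that reproduces its first 2n-1 = 3 raw moments. This is not DPL's algorithm, which is not described here; the helper name two_state_discretization and the closed-form construction are assumptions chosen for illustration.

```python
import math

def two_state_discretization(m1, m2, m3):
    """Return [(value, probability), ...] for a two-point distribution
    whose first three raw moments are m1, m2, m3.

    Hypothetical helper for illustration; not DPL's implementation.
    """
    # The two values are the roots of x^2 + c1*x + c0, the monic quadratic
    # that is orthogonal to 1 and x under the target distribution.
    c1 = (m1 * m2 - m3) / (m2 - m1 ** 2)
    c0 = -m2 - c1 * m1
    root = math.sqrt(c1 ** 2 - 4 * c0)
    low, high = (-c1 - root) / 2, (-c1 + root) / 2
    # The probabilities follow from matching the mean.
    p_high = (m1 - low) / (high - low)
    return [(low, 1 - p_high), (high, p_high)]

# Raw moments of a Poisson distribution with mean lam:
# E[X] = lam, E[X^2] = lam + lam^2, E[X^3] = lam + 3*lam^2 + lam^3
lam = 1.4
poisson_moments = (lam, lam + lam ** 2, lam + 3 * lam ** 2 + lam ** 3)
states = two_state_discretization(*poisson_moments)

for k, target in enumerate(poisson_moments, start=1):
    approx = sum(p * v ** k for v, p in states)
    print(f"moment {k}: target={target:.4f} approx={approx:.4f}")
```

The two values produced this way are not whole numbers, which is expected: the approximation is chosen to preserve moments, not to reproduce the individual outcomes of the original distribution.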
The cumulative probability graph below displays DPL's three-state discretization
of the standard normal distribution. As the graph indicates, DPL uses three
values (-1.732, 0, and 1.732) to approximate the standard normal distribution.
In DPL's discrete approximation, the outer values of -1.732 and 1.732 are each
assigned probability 1/6, while the value 0 is assigned probability 2/3. We
can quickly see that both distributions have mean = 0. As a matter of fact,
we could also verify that the first five moments of these distributions are
equal.
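The moment check mentioned above can be carried out in a few lines. The sketch below assumes that the quoted value 1.732 is the square root of 3 rounded to three decimals; the variable and function names are illustrative only.

```python
import math

# Three-state approximation quoted in the text
# (assumption: 1.732 is sqrt(3) rounded to three decimals).
values = [-math.sqrt(3), 0.0, math.sqrt(3)]
probs = [1 / 6, 2 / 3, 1 / 6]

# Raw moments of the standard normal distribution, orders 1 through 6.
normal_moments = [0, 1, 0, 3, 0, 15]

def discrete_moment(k):
    """k-th raw moment of the three-state approximation."""
    return sum(p * v ** k for v, p in zip(values, probs))

for k, target in enumerate(normal_moments, start=1):
    print(f"moment {k}: normal={target}, three-state={discrete_moment(k):.4f}")
```

The first five moments agree, consistent with the 2n-1 rule for n = 3. The sixth does not (9 for the three-state approximation versus 15 for the normal), which shows where the approximation stops being exact.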
Continuous Named Distributions
In some cases, matching the exact moments of the distribution may be less important
than representing its percentiles and overall shape. For these situations, DPL 6.0 and
later allow you to include "continuous" chance nodes in your models. Models with
continuous chance nodes are evaluated by Monte Carlo simulation, rather than by decision
tree rollback.
Switching between discrete and continuous chance node types is easy. Just right-click
on the node in the influence diagram and choose Change Node Type. In the Node Type dialog,
choose Continuous Chance. You'll notice that the influence diagram node will become a
darker shade of green.
The continuous version of "Machine failures" is still modeled by a Poisson distribution
with mean 1.4. The difference is in how DPL treats the node when it runs an analysis.
Instead of determining the probability to assign to a fixed number of states, DPL will
take a random draw from the distribution for each sample.
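To make the idea of one random draw per sample concrete, here is a minimal sketch that samples the number of machine failures from a Poisson distribution with mean 1.4. The sampling method (Knuth's multiplication method), the function name poisson_draw, the seed, and the sample count are all assumptions for illustration; DPL's own sampler is not described here.

```python
import math
import random

def poisson_draw(rng, mean):
    """One random draw from a Poisson distribution (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    # Keep multiplying uniform draws until the product falls to the threshold;
    # the number of extra multiplications is the Poisson-distributed count.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [poisson_draw(rng, 1.4) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean of machine failures: {sample_mean:.3f}")
```

Each entry in samples plays the role of the "Machine failures" value for one simulated scenario. The sample mean approaches 1.4 as the number of samples grows, but it is not exact for any finite run.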
Discrete or Continuous?
There are two issues to consider when deciding how to model an uncertainty.
Precision of results
Discrete chance nodes exactly represent the first few moments of the distribution,
so they are usually the better choice when you want the mean and variance of the
results to be very precise.
Values for continuous chance nodes are generated by random draws from the named
distribution, so for reasonable sample sizes they represent the shape and percentiles of
the distribution very well.
In either case, keep in mind that the input data may be based on subjective assessments
and/or extrapolation from limited data sets, so reading too many decimal places
from the final results may not make sense.
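The trade-off can be seen in a small experiment on the standard normal distribution. The sketch below compares a three-state, moment-matched approximation (values of plus and minus the square root of 3, and 0) with 10,000 random draws. The seed, the sample size, and the choice of the 90th percentile are arbitrary assumptions for illustration.

```python
import math
import random
import statistics

exact = statistics.NormalDist()  # standard normal: mean 0, standard deviation 1

# Three-state, moment-matched approximation of the standard normal.
values = [-math.sqrt(3), 0.0, math.sqrt(3)]
probs = [1 / 6, 2 / 3, 1 / 6]
discrete_mean = sum(p * v for v, p in zip(values, probs))

# Monte Carlo draws, in the spirit of a continuous chance node.
rng = random.Random(7)
draws = [rng.gauss(0.0, 1.0) for _ in range(10000)]
sample_mean = statistics.fmean(draws)
# quantiles(n=10) returns the 9 decile cut points; the last is the 90th percentile.
sample_p90 = statistics.quantiles(draws, n=10)[-1]

print(f"mean: exact=0, discrete={discrete_mean:.4f}, sampled={sample_mean:.4f}")
print(f"90th percentile: exact={exact.inv_cdf(0.9):.4f}, sampled={sample_p90:.4f}")
print(f"the discrete approximation can only take the values {values}")
```

The discrete approximation reproduces the mean exactly but has only three possible values, so it cannot resolve a percentile such as the 90th. The sampled values track the percentile closely but carry sampling error in the mean.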
Runtime performance
Discrete chance nodes produce models that run faster when there are downstream
decisions (read the real options paper for more discussion of downstream decisions).
Simulation (Monte Carlo or discrete tree) is usually faster for models that have
many uncertainties and no decisions
or only an up-front decision. This is true of both Monte Carlo simulation with continuous
chance nodes, and discrete tree simulation with discrete chance nodes, so you never need
to use continuous chance nodes simply for performance reasons.