Expect Value™
TECHNICAL NOTES

Using Discrete and Continuous Named Probability Distributions

When modeling an uncertain variable, you may sometimes want to represent the probability distribution with a specific distribution type such as the normal, lognormal, or exponential distribution. DPL allows you to easily incorporate such "named" distributions into your analyses.

For example, suppose you are analyzing the profits of a production facility, and one of the key uncertainties is the number of failures that will occur in the machines used in the production process. You are currently representing this uncertainty with the chance event "Machine failures".

Discrete Named Distributions

Based on statistical analysis of historical data, your engineers have recommended that you use a Poisson distribution with mean 1.4 to represent the probability distribution for machine failures. To enter this distribution into your DPL model, double-click on the "Machine failures" node, and then select Poisson from the Distribution type dropdown list. Enter 1.4 for the mean.

Figure 3

When using a named distribution, you may notice that each "Value" field in the node data window contains an asterisk instead of a number. A decision tree analysis requires a finite number of possible values for each chance event. However, a named distribution may take an infinite number of values: the Poisson distribution, for example, assigns probability to every non-negative integer, and continuous distributions such as the normal cover a continuum of values. To use such a distribution in a decision tree, DPL must approximate it with a finite number of probability/value pairs, which it computes. The asterisk indicates that DPL will supply the values associated with the distribution. In this process, called discretization, DPL generates a discrete distribution (with up to six states) that has the same low-order moments (mean, variance, etc.) as the original named distribution. (If n is the number of states, the approximation matches the named distribution's first 2n-1 moments.)
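To make the moment-matching idea concrete, here is a minimal Python sketch of a two-state discretization (n = 2, so 2n-1 = 3 matched moments) of the Poisson distribution with mean 1.4. It uses standard two-point Gaussian quadrature built from the raw moments. This note does not describe DPL's internal algorithm, so treat the method and the function names (poisson_raw_moments, two_state_discretization) as illustrative assumptions rather than what DPL actually does.

```python
import math


def poisson_raw_moments(lam):
    """First three raw moments E[X], E[X^2], E[X^3] of a Poisson(lam) variable."""
    m1 = lam
    m2 = lam + lam**2
    m3 = lam + 3 * lam**2 + lam**3
    return m1, m2, m3


def two_state_discretization(m1, m2, m3):
    """Return [(value, probability), (value, probability)] for a two-state
    distribution whose first three raw moments equal m1, m2 and m3.

    Assumption: this is textbook two-point Gaussian quadrature, not
    necessarily the procedure DPL uses internally.
    """
    variance = m2 - m1**2
    # The two values are the roots of x^2 + a*x + b, where a and b solve
    #   b + m1*a = -m2   and   m1*b + m2*a = -m3.
    a = (m1 * m2 - m3) / variance
    b = -m2 - m1 * a
    root = math.sqrt(a * a - 4 * b)
    low = (-a - root) / 2
    high = (-a + root) / 2
    # Probabilities follow from "sum to 1" and "match the mean".
    p_high = (m1 - low) / (high - low)
    return [(low, 1 - p_high), (high, p_high)]


target = poisson_raw_moments(1.4)
states = two_state_discretization(*target)
for value, prob in states:
    print(f"value {value:.4f}  probability {prob:.4f}")
for k, moment in enumerate(target, start=1):
    approx = sum(p * v**k for v, p in states)
    print(f"moment {k}: named {moment:.4f}  discretized {approx:.4f}")
```

For mean 1.4 this gives values of roughly 0.62 and 3.18 with probabilities of about 0.69 and 0.31. Note that the values are not whole numbers of failures: a moment-matching discretization chooses values to reproduce the moments, not to coincide with individual outcomes of the original distribution.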

The cumulative probability graph below displays DPL's three-state discretization of the standard normal distribution. As the graph indicates, DPL uses three values (-1.732, 0, and 1.732) to approximate the standard normal distribution. In DPL's discrete approximation, the outer values of -1.732 and 1.732 are each assigned probability 1/6, while the value 0 is assigned probability 2/3. We can quickly see that both distributions have mean = 0. As a matter of fact, we could also verify that the first five moments of these distributions are equal.
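The moment claim for the three-state approximation is easy to check numerically. The short Python sketch below assumes that the rounded values -1.732 and 1.732 quoted above are the square roots of 3 (negative and positive), and compares the first six raw moments of the discrete distribution with those of the standard normal distribution. The helper names are illustrative only.

```python
import math

# Three-state discretization quoted in the text, as (value, probability) pairs.
# Assumption: the rounded value 1.732 stands for sqrt(3).
STATES = [(-math.sqrt(3), 1 / 6), (0.0, 2 / 3), (math.sqrt(3), 1 / 6)]


def discrete_moment(states, k):
    """k-th raw moment of a discrete distribution given as (value, probability) pairs."""
    return sum(prob * value**k for value, prob in states)


def standard_normal_moment(k):
    """k-th raw moment of the standard normal: 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    result = 1.0
    for factor in range(k - 1, 0, -2):
        result *= factor
    return result


for k in range(1, 7):
    print(k, round(discrete_moment(STATES, k), 6), standard_normal_moment(k))
```

Moments one through five agree (0, 1, 0, 3, 0). The sixth does not: the discrete approximation gives 9 while the standard normal gives 15, which is consistent with a three-state approximation matching only the first 2n-1 = 5 moments.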

Figure 4

Continuous Named Distributions

In some cases, matching the exact moments of the distribution may be less important than representing its percentiles and overall shape. For these situations, DPL 6.0 and later allow you to include "continuous" chance nodes in your models. Models with continuous chance nodes are evaluated by Monte Carlo simulation, rather than by decision tree rollback.

Figure 5

Switching between discrete and continuous chance node types is easy. Just right-click on the node in the influence diagram and choose Change Node Type. In the Node Type dialog, choose Continuous Chance. You'll notice that the influence diagram node will become a darker shade of green.

The continuous version of "Machine failures" is still modeled by a Poisson distribution with mean 1.4. The difference is in how DPL treats the node when it runs an analysis. Instead of determining the probability to assign to a fixed number of states, DPL will take a random draw from the distribution for each sample.
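The idea of taking one random draw per sample can be sketched in a few lines of Python. This is not DPL code: the sampler below uses Knuth's multiplication method for Poisson draws, and the function names, sample count, and seed are assumptions chosen for illustration.

```python
import math
import random


def draw_poisson(mean, rng):
    """One random draw from a Poisson distribution (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    # Multiply uniform draws until the running product falls to the threshold;
    # the number of extra draws needed is the Poisson variate.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_failures(mean, n_samples, seed=0):
    """Draw n_samples values of 'Machine failures', one per simulation sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [draw_poisson(mean, rng) for _ in range(n_samples)]


samples = simulate_failures(1.4, 10_000)
print("sample mean:", sum(samples) / len(samples))
print("share of samples with no failures:", samples.count(0) / len(samples))
```

With enough samples, the sample mean settles near 1.4 and the share of samples with no failures settles near exp(-1.4), about 0.25. Each sample sees an actual whole number of failures rather than one of a handful of representative values.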

Figure 6

Discrete or Continuous?

There are two issues to consider when deciding how to model an uncertainty.

Precision of results

Discrete chance nodes exactly represent the first few moments of the distribution, so they are usually the better choice when you want the mean and variance of the results to be very precise.

Continuous chance nodes are evaluated by taking random draws from the distribution, so for reasonable sample sizes they represent the shape and percentiles of the distribution very well.

In either case, keep in mind that the input data may be based on subjective assessments and/or extrapolation from limited data sets, so reading too many decimal places from the final results may not make sense.
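The trade-off between moments and percentiles can be illustrated with the standard normal distribution. The Python sketch below compares the 10th, 50th, and 90th percentiles of the exact distribution, of the three-state discretization described earlier (values of plus and minus the square root of 3 with probability 1/6 each, and 0 with probability 2/3), and of 10,000 random draws. The sample size, seed, and helper names are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist

# Three-state discretization of the standard normal, as (value, probability) pairs.
STATES = [(-math.sqrt(3), 1 / 6), (0.0, 2 / 3), (math.sqrt(3), 1 / 6)]


def discrete_percentile(states, q):
    """Smallest value whose cumulative probability reaches q."""
    ordered = sorted(states)
    cumulative = 0.0
    for value, prob in ordered:
        cumulative += prob
        if cumulative >= q - 1e-12:  # small tolerance for rounding error
            return value
    return ordered[-1][0]


def sample_percentile(samples, q):
    """Simple empirical percentile: the value at rank q in the sorted sample."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

for q in (0.10, 0.50, 0.90):
    exact = NormalDist().inv_cdf(q)
    print(f"q={q:.2f}  exact {exact:+.3f}  "
          f"discrete {discrete_percentile(STATES, q):+.3f}  "
          f"sampled {sample_percentile(samples, q):+.3f}")
```

The exact 90th percentile is about 1.28. The three-state approximation can only report one of its three values, so it returns 1.732, while the sampled estimate lands close to the exact figure. The reverse holds for the mean and variance, which the discretization reproduces exactly and the sample only approximately.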

Runtime performance

Discrete chance nodes produce models that run faster when there are downstream decisions (read the real options paper for more discussion of downstream decisions).

Simulation (Monte Carlo or discrete tree) is usually faster for models that have many uncertainties and either no decisions or only an up-front decision. This is true of both Monte Carlo simulation with continuous chance nodes and discrete tree simulation with discrete chance nodes, so you never need to use continuous chance nodes simply for performance reasons.
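A rough count shows why simulation tends to win when there are many uncertainties. The Python sketch below assumes independent three-state chance nodes, a full enumeration that visits every path through the tree, and a hypothetical simulation run of 10,000 samples. DPL may apply optimizations that this note does not describe, so the numbers indicate the growth pattern only, not actual DPL run times.

```python
def tree_endpoints(n_uncertainties, states_per_node=3):
    """Number of paths through a tree of independent chance nodes,
    assuming every combination of states is enumerated."""
    return states_per_node ** n_uncertainties


N_SAMPLES = 10_000  # hypothetical sample count for a simulation run

for n in (5, 10, 15, 20):
    print(f"{n:2d} uncertainties: {tree_endpoints(n):>13,d} tree endpoints "
          f"vs {N_SAMPLES:,d} simulation samples")
```

Under these assumptions, the endpoint count grows exponentially with the number of uncertainties (59,049 endpoints for ten three-state nodes), while the cost of a simulation run depends on the chosen sample count rather than on the size of the full tree.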

Copyright © 2003-2007 Syncopation Software