A method estimating complex probabilities with simulation.
Named after the famous Casino de Monte-Carlo in Monaco.
Involves repeatedly generating random outcomes for potentially complex combinations of events and then measuring statistics about them.
“Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.”
—John von Neumann
Linear congruential pseudorandom number generator, as defined by D. H. Lehmer and described by Donald E. Knuth in The Art of Computer Programming, Volume 2, Third edition: Seminumerical Algorithms, section 3.2.1.
private static final long multiplier = 0x5DEECE66DL;
private static final long addend = 0xBL;
private static final long mask = (1L << 48) - 1;
nextseed = (oldseed * multiplier + addend) & mask;
java.lang
and java.util
Old school
Math.random()
java.util.Random
java.util.ThreadLocalRandom
java.util.SplittableRandom
java.util.random
New hotness
RandomGenerator
Several nested interfaces in RandomGenerator
Pluggable algorithms.
RandomGenerator
import java.util.random.RandomGenerator;
RandomGenerator r = RandomGenerator.of("L128X1024MixRandom");
"L128X1024MixRandom"
is the name of a specific algorithm.
See java.util.random Javadocs for a deep summary of different algorithms.
Math.random()
has a period of 248
"L128X1024MixRandom"
has a period of approximately 21,152.
The “shape” of the collection of values generated by some random process.
Some distributions have easily described shapes.
Real distributions do not.
RandomGenerator.nextDouble()
- generates a uniformly distributed random double
between 0.0 and 1.0
Can shift and scale to cover a different range like you learned in CSA.
nextGaussian()
- next normally distributed random number from distribution with mean of 0 and standard deviation of 1
Can shift and scale to some particular mean and standard deviation.
Log normal is a big one. A distribution of values whose natural logs are normally distributed.
Typically we parameterize the underlying normal distribution (to a given mean and standard deviation).
When we try to predict (or estimate) values we always want to specify a range.
Typically we are aiming for a 90% prediction which you can think of as a prediction that will be right 90% of the time.
The width of your prediction interval is related to your level of uncertainty. The more uncertain you are, the wider a range your prediction should cover.
A Monte Carlo simulation gives us a way to compute a prediction interval over a complex distribution made up of interactions between other distributions.
The foundation of the simulation is a set of input intervals that we can estimate.
Then we model how those base intervals interact.
We can estimate individual tasks that make up a project.
We know how the time taken be each task affects to total.
And we can estimate other things, like how much time we’ll lose for various reasons.
What is the 90% prediction interval for our total cost?
What is the probability of it costing more than a given amount.
What's the 90% prediction interval on when all the tasks will be done.
What's the probability that a given task will be done by a given date.