"The project will be finished on September 30." Statements like this sound certain, but they never are. A Monte Carlo simulation replaces false precision with an honest answer: "80 percent likely to be finished by October 15." This guide explains how the simulation works and why it beats the critical path.
An estimate is a claim about the future. "Five days" sounds precise, but it hides the fact that the same task takes eight days on a bad day and three on a good one. Plan with a fixed value and you plan with an illusion. The Monte Carlo simulation takes the spread seriously and carries it through the calculation instead of averaging it away.
What Is a Monte Carlo Simulation?
A Monte Carlo simulation models uncertainty through repeated random sampling. Instead of computing with a fixed value, such as "the task takes five days," you work with probability distributions and let the model run thousands of times. Each run draws random values from those distributions. The result is not a single estimate, but a distribution of possible outcomes.
This shifts the core statement fundamentally. "The project will be finished on September 30" becomes "80 percent likely to be finished by October 15." The difference is tangible. One is a promise you cannot keep. The other is a statement a client can actually plan around.
Monte Carlo in Project Management: Step by Step
In project management, the simulation follows a clear process. We walk through it step by step, from the single task to the finished probability curve.
Step 1: Every task gets a range. Instead of a single number, you estimate three: optimistic (O), when everything runs smoothly, most likely (M) for the normal case and pessimistic (P), when things stall. From these three points the model builds a three-point distribution. Most draws land near M, few at the edges. If you want to go deeper, you choose the shape deliberately: a PERT-beta weights M more heavily, a triangle is coarser but easier to explain.
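The two shapes can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names are hypothetical, the values 3, 5 and 8 days are taken from the example above, and the PERT-beta uses the conventional shape weight of 4, which is an assumption.

```python
import random
from statistics import mean, pstdev

def sample_triangular(o, m, p, rng):
    """Draw one duration from a triangular distribution over [o, p] with mode m."""
    return rng.triangular(o, p, m)  # argument order in the standard library: low, high, mode

def sample_pert(o, m, p, rng, lam=4.0):
    """Draw one duration from a PERT-beta distribution; lam=4 is the conventional weight on M."""
    if p == o:
        return float(o)
    alpha = 1.0 + lam * (m - o) / (p - o)
    beta = 1.0 + lam * (p - m) / (p - o)
    return o + rng.betavariate(alpha, beta) * (p - o)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
tri = [sample_triangular(3, 5, 8, rng) for _ in range(10_000)]
pert = [sample_pert(3, 5, 8, rng) for _ in range(10_000)]

print(f"triangle: mean {mean(tri):.2f}, spread {pstdev(tri):.2f}")
print(f"PERT:     mean {mean(pert):.2f}, spread {pstdev(pert):.2f}")
```

Both samples stay between O and P. The PERT-beta draws cluster more tightly around M, which is what "weights M more heavily" means in practice.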
Step 2: A single run rolls the dice for every duration. It draws exactly one random duration for each task from its distribution and computes one possible project end from them. That is one plausible future: not an average, but a concrete scenario, like a single dice roll for the whole project. Sometimes it comes out to 27 days, sometimes 31. On its own, one run says little. It is just one of many.
Step 3: The network decides, not the sum. Durations do not simply add up along a line. The dependencies in the network diagram determine the path, and at a milestone where several paths converge, the latest one counts: the maximum, not the average. This is path convergence: even if each individual path is usually on time, every additional path raises the chance that at least one comes in late. Picture a group hike that only continues once the slowest walker has reached the meeting point.
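A single run through a small network might look like the following sketch. The four tasks, their estimates and the dictionary layout are hypothetical, and a triangle is used for brevity. The point is the forward pass: a task starts at the maximum of its predecessors' finish times.

```python
import random

# Hypothetical network: A feeds B and C in parallel; D starts once both are done.
TASKS = {
    "A": {"omp": (2, 3, 5), "preds": []},
    "B": {"omp": (8, 10, 14), "preds": ["A"]},
    "C": {"omp": (7, 10, 15), "preds": ["A"]},
    "D": {"omp": (4, 5, 8), "preds": ["B", "C"]},
}

def single_run(tasks, rng):
    """One simulated future: draw a duration per task, then do a forward pass."""
    finish = {}
    for name, task in tasks.items():  # assumes tasks are listed in dependency order
        o, m, p = task["omp"]
        duration = rng.triangular(o, p, m)
        # Path convergence: the latest predecessor decides when this task can start.
        start = max((finish[pred] for pred in task["preds"]), default=0.0)
        finish[name] = start + duration
    return max(finish.values())

rng = random.Random(1)
print(f"one possible project end: {single_run(TASKS, rng):.1f} days")
```

Calling `single_run` again with the same `rng` yields a different scenario, which is exactly the dice roll described in step 2.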
Step 4: 10,000 runs yield the probability. Now you repeat the dice roll, for example 10,000 times. Each run delivers a finish date, and together they form exactly the distribution shown above. Sort the dates in ascending order and the cumulative S-curve emerges, and from it you read off the confidence levels. P80 simply means: in 80 out of 100 simulated worlds you are finished by this date. Histogram and S-curve are two views of the same data; when you pledge a date, the S-curve is the one you read it from.
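Reading P50, P80 and P90 off the sorted results can be sketched as follows. The project in `simulate_once` is hypothetical, and the nearest-rank percentile is one simple convention among several.

```python
import random

def simulate_once(rng):
    """Hypothetical project: task A, then B and C in parallel, then D (durations in days)."""
    a = rng.triangular(2, 5, 3)
    b = rng.triangular(8, 14, 10)
    c = rng.triangular(7, 15, 10)
    d = rng.triangular(4, 8, 5)
    return a + max(b, c) + d

def percentile(sorted_values, q):
    """Nearest-rank value below which roughly a fraction q of the sorted results fall."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

rng = random.Random(7)
results = sorted(simulate_once(rng) for _ in range(10_000))  # the sorted list is the S-curve
p50, p80, p90 = (percentile(results, q) for q in (0.50, 0.80, 0.90))
print(f"P50={p50:.1f}  P80={p80:.1f}  P90={p90:.1f} days")
```

Plotting the sorted results against their rank divided by the number of runs gives the cumulative S-curve; the three percentiles are simply three points on it.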
Why Monte Carlo Beats the Critical Path
It is precisely the path convergence from step 3 that explains why the classic critical path systematically underestimates project duration. It computes with fixed durations and looks only at the longest strand, ignoring both the spread of each duration and the chance that one of the other converging paths finishes later. That makes dates more fragile than a deterministic plan suggests.
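The effect can be made visible with a deliberately simple sketch: three hypothetical parallel paths with identical, symmetric estimates feed one milestone. The deterministic plan uses the most likely value; the simulation takes the maximum of three draws.

```python
import random
from statistics import mean

# Hypothetical milestone fed by three parallel paths, each with (O, M, P) = (8, 10, 12) days.
PATHS = [(8, 10, 12)] * 3

# Classic critical path on the most likely values: the longest path takes 10 days.
deterministic = max(m for _, m, _ in PATHS)

rng = random.Random(3)
simulated = [
    max(rng.triangular(o, p, m) for o, m, p in PATHS)  # the latest path sets the milestone
    for _ in range(10_000)
]
late_share = sum(d > deterministic for d in simulated) / len(simulated)

print(f"deterministic: {deterministic} days, simulated mean: {mean(simulated):.2f} days")
print(f"share of runs later than the deterministic date: {late_share:.0%}")
```

Each path alone is on time in half of all runs, because the estimate is symmetric around M. All three are on time together in only 0.5 × 0.5 × 0.5 = 12.5 percent of runs, so the deterministic date is missed far more often than it is met.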
Tornado Diagram: The Sensitivity Analysis
As a side effect, the simulation delivers a sensitivity analysis. A tornado diagram shows which tasks contribute most to overall uncertainty. That tells your risk management where to focus, instead of watching everywhere at once.
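One common way to rank tasks for such a diagram is the correlation between each task's drawn duration and the project end across all runs. The sketch below assumes that measure; tools may use others. The three serial tasks and their estimates are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Hypothetical serial tasks with (O, M, P) in days; "integration" has the widest range.
TASKS = {"design": (4, 5, 6), "build": (8, 10, 14), "integration": (3, 6, 15)}

rng = random.Random(11)
draws = {name: [] for name in TASKS}
totals = []
for _ in range(5_000):
    run = {name: rng.triangular(o, p, m) for name, (o, m, p) in TASKS.items()}
    for name, duration in run.items():
        draws[name].append(duration)
    totals.append(sum(run.values()))

sensitivity = {name: pearson(values, totals) for name, values in draws.items()}
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
for name in ranking:  # crude text version of the tornado bars
    print(f"{name:12s} {'#' * round(sensitivity[name] * 40)}")
```

The task with the widest range ends up at the top of the ranking, which is where risk management should look first.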
The Three Building Blocks of Uncertainty
Before we simulate, it is worth looking at the mental model. There are three kinds of uncertainty, and these are not competing variants of Monte Carlo to choose between. They are three sources, all of which you model, and a good model covers all of them. So the question is not "which variant do I take," but "which uncertainties are in my project and how do I represent each one."
Variability: the Everyday Spread
Every task sometimes takes longer, sometimes shorter. That is the plus-or-minus-X-percent band, the classic Monte Carlo where every task gets a distribution. Here you deliberately turn all the dials at once, not just a single one. The point is that many small spreads accumulate through the network logic and amplify at points of path convergence.
Event Risks: Discrete Risk Events
The supplier fails or does not. The resource is sick or not. This is not "plus or minus ten percent duration," but "with probability p, event X occurs, and then Y shifts by Z days." Instead of a band around every value, here you place an on/off switch at a few critical points: with probability p the event occurs, otherwise it does not.
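In code, the on/off switch is a single comparison against a random number. The following is a sketch with hypothetical values: a 20 percent chance that the supplier fails, adding 15 days to the delivery task.

```python
import random

def task_with_event_risk(rng, omp=(8, 10, 14), p_event=0.2, delay_days=15):
    """Base duration plus a discrete delay that fires with probability p_event."""
    o, m, p = omp
    duration = rng.triangular(o, p, m)   # everyday spread of the task itself
    if rng.random() < p_event:           # the on/off switch: the event occurs in this run
        duration += delay_days
    return duration

rng = random.Random(21)
runs = [task_with_event_risk(rng) for _ in range(10_000)]
hit_share = sum(d > 14 for d in runs) / len(runs)  # above P only when the event fired
print(f"runs in which the event occurred: {hit_share:.0%}")
```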
Structural Risks: Branching and Repetition
The third source does not lie in the duration of a task, but in the question of whether it runs through as planned at all. An acceptance test can fail, an approval can be rejected, a review can fail. Then the task is not completed on the first attempt but pulls a rework loop along with it: an extra task that does not appear in the smooth plan at all. While an event risk extends an existing duration, a structural risk changes the network diagram itself.
In the simulation you represent this as a probabilistic branch: in every run a random switch decides, with probability p, whether the rework loop with its own duration distribution is inserted. Over 10,000 runs this shows up as a second bump on the right of the distribution, the typical signature of a rework risk. These are exactly the cases a plan misses when it only knows the smooth run-through.
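A probabilistic branch of this kind can be sketched as follows. The failure probability of 30 percent, the durations and the assumption that the retest always passes are hypothetical simplifications.

```python
import random
from collections import Counter

def acceptance_with_rework(rng, p_fail=0.3):
    """Acceptance test that may fail and pull in a rework loop plus a retest."""
    duration = rng.triangular(2, 5, 3)            # first test attempt
    if rng.random() < p_fail:                     # branch: the test fails in this run
        duration += rng.triangular(5, 12, 7)      # rework task, absent from the smooth plan
        duration += rng.triangular(2, 5, 3)       # retest, assumed to pass
    return duration

rng = random.Random(9)
runs = [acceptance_with_rework(rng) for _ in range(10_000)]
histogram = Counter(int(d) for d in runs)         # whole-day buckets
for day in sorted(histogram):
    print(f"{day:2d} days {'#' * (histogram[day] // 100)}")
```

The text histogram shows two separate groups of bars: the smooth runs on the left and the rework runs further to the right, with a gap in between.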
The best models combine all three: a base level of noise on every task, a few discrete risk switches where something can really tip over, and a branch wherever a task can fail and trigger rework.
Correlation: the Bracket Around the Three
The three building blocks rarely act independently, and that is exactly what the frame in the graphic above shows. Bad weather does not delay one outdoor task, but all of them at once; an overloaded supplier hits every order, not just one. That is correlation: a common driver that pulls several tasks in the same direction within a single run.
Correlation is not a fourth building block, but a bracket around the three. It decides whether many small spreads average each other out or build up. Anyone who wrongly models risks as independent systematically underestimates the range, because independent deviations partly cancel each other out across tasks, while correlated ones pile up in the same direction. In practice this is often the biggest lever on a realistic P80.
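A simple way to model a common driver is one shared random factor per run. The sketch below compares ten hypothetical outdoor tasks of five planned days each, once with an independent weather factor per task and once with a single factor for all of them. The factor range of 0.8 to 1.4 is an assumption.

```python
import random
from statistics import mean, pstdev

def total_duration(rng, n_tasks=10, shared_driver=True):
    """Sum of n outdoor tasks; a weather factor stretches them jointly or one by one."""
    shared = rng.uniform(0.8, 1.4)        # one weather draw for the whole run
    total = 0.0
    for _ in range(n_tasks):
        factor = shared if shared_driver else rng.uniform(0.8, 1.4)
        total += 5.0 * factor             # 5 planned days per task (hypothetical)
    return total

rng = random.Random(5)
independent = [total_duration(rng, shared_driver=False) for _ in range(10_000)]
correlated = [total_duration(rng, shared_driver=True) for _ in range(10_000)]

print(f"independent: mean {mean(independent):.1f}, spread {pstdev(independent):.1f} days")
print(f"correlated:  mean {mean(correlated):.1f}, spread {pstdev(correlated):.1f} days")
```

Both variants have the same mean, but the correlated one has a much wider spread, and with it a later P80.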
The Adjustment Levers of a Monte Carlo Simulation
Beyond the question "what do I model," there are further axes where you actually decide.
Distribution shape per task. The triangle (O, M, P) is simple, intuitive and ideal for explaining. The PERT-beta distribution weights the most likely duration more heavily and runs more smoothly; it is the de facto standard in project management and our recommendation. The uniform distribution fits when you genuinely only know the minimum and maximum. Lognormal suits durations with a long upper tail, since tasks rarely get much faster, but sometimes considerably slower.
Target metric of the simulation. The most common question is the schedule (schedule risk): when you will be finished. Alongside that stands cost risk: what the project ends up costing. Both can be coupled, because delay creates extra cost, such as penalties or storage costs. That is exactly what a question like the supplier comparison answers, where schedule and price are weighed against each other.
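The coupling of schedule and cost can be sketched with a penalty that applies to every day past a contractual deadline. The deadline, the base cost and the penalty rate are hypothetical figures.

```python
import random

def schedule_and_cost(rng, deadline=30.0, base_cost=100_000.0, penalty_per_day=2_000.0):
    """One run: draw the project duration, then derive the cost including late penalties."""
    duration = rng.triangular(24, 40, 28)          # hypothetical project duration in days
    overrun = max(0.0, duration - deadline)        # only days past the deadline cost extra
    return duration, base_cost + penalty_per_day * overrun

rng = random.Random(2)
runs = [schedule_and_cost(rng) for _ in range(10_000)]
costs = sorted(cost for _, cost in runs)
p80_cost = costs[int(0.8 * len(costs))]
print(f"P80 cost: {p80_cost:,.0f}")
```

Because cost is derived from the duration within the same run, the two results stay consistent: a late scenario is automatically an expensive one.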
There remains the third axis: where the figures for these distributions come from in the first place, that is the input uncertainty. This is where it is decided whether the simulation rests on solid data or on a gut feeling.
Where the input uncertainty comes from. The strongest basis is experience data, that is the empirical frequency, such as a delivery history. Alongside that stands expert judgment, the classic O/M/P estimate when no data is available. And there is the plan/actual comparison in Merlin Project: Merlin keeps planned and actual values per task. In a running project you measure the typical spread from the variance of past tasks, instead of estimating it. This feeds your simulation with real project data and makes it more reliable with every completed task.
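How plan/actual pairs can feed a simulation is sketched below, independently of any specific tool. The history values are invented for illustration and are not real project data. The approach of resampling past actual-to-planned ratios is one simple option among several.

```python
import random

# Hypothetical (planned, actual) durations in days for completed tasks; illustrative only.
HISTORY = [(5, 6), (3, 3), (8, 11), (2, 2), (10, 9), (4, 6), (6, 7)]
RATIOS = [actual / planned for planned, actual in HISTORY]

def sample_from_history(planned_days, rng):
    """Scale a planned duration by an actual/planned ratio drawn from past tasks."""
    return planned_days * rng.choice(RATIOS)

rng = random.Random(4)
samples = [sample_from_history(5, rng) for _ in range(10_000)]
print(f"5 planned days ranged from {min(samples):.1f} to {max(samples):.1f} days")
```

Every completed task adds one more ratio to the list, so the measured spread replaces the estimated one step by step.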
We deliberately leave out one advanced adjustment lever here, because it would overload an introductory guide:
- Latin Hypercube Sampling covers the value range more evenly and therefore converges faster than naive random sampling. The Wikipedia article on the method explains it well.
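For the curious, the core idea fits in a short sketch: split the range from 0 to 1 into as many equal strata as there are runs, draw one value per stratum, and shuffle the strata independently for each task. The function names are hypothetical, and the mapping to durations assumes a triangular distribution with O smaller than P.

```python
import math
import random

def latin_hypercube(n_runs, n_tasks, rng):
    """n_runs points in the unit cube with exactly one point per stratum in each dimension."""
    columns = []
    for _ in range(n_tasks):
        column = [(i + rng.random()) / n_runs for i in range(n_runs)]
        rng.shuffle(column)               # decouple the strata across tasks
        columns.append(column)
    return list(zip(*columns))            # one tuple of uniform values per run

def triangular_inverse_cdf(u, o, m, p):
    """Map a uniform value u in [0, 1] to a triangular(o, m, p) duration; assumes o < p."""
    cut = (m - o) / (p - o)
    if u < cut:
        return o + math.sqrt(u * (p - o) * (m - o))
    return p - math.sqrt((1 - u) * (p - o) * (p - m))

rng = random.Random(0)
points = latin_hypercube(1_000, 2, rng)
totals = [
    triangular_inverse_cdf(u1, 3, 5, 8) + triangular_inverse_cdf(u2, 2, 4, 9)
    for u1, u2 in points
]
print(f"mean of two serial tasks: {sum(totals) / len(totals):.2f} days")
```

Every part of each task's range is hit exactly once, so fewer runs are needed than with naive sampling to reach the same stability.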
From Theory into Practice
Until now, Monte Carlo simulation was complex and remained the domain of specialist tools with laboriously maintained three-point estimates. With the MCP server in Merlin Project, an AI reads your project plan directly and computes the simulation on demand. You ask Claude in a single sentence to simulate 10,000 project runs from your plan, and get back the S-curve with P50, P80 and P90 explained. A specialist tool with its own data maintenance becomes a single sentence to Claude.
What this looks like in concrete terms, from the setup through the example prompts to two fully worked case studies, is shown in the practical part: How to simulate Monte Carlo with Merlin Project and the MCP server.
How worthwhile structured risk management is in a project is also shown by our piece on Riskology, the Monte Carlo method of the Atlantic Systems Guild.
If you have any questions about this blog article or would like to discuss it, we look forward to your contribution in our forum.
Frequently Asked Questions
How many runs does a Monte Carlo simulation need?
For stable values at P50, P80 and P90, 5,000 to 10,000 runs are usually enough in practice. More runs smooth the curve further but barely change the core result.
PERT-beta or triangular distribution, which should I use?
The triangle is the easiest to explain; the PERT-beta weights the most likely value more strongly and is the de facto standard in project management. When in doubt, use the PERT-beta.
Should I commit to P50, P80 or P90?
P50 is a 50/50 date and better suited to internal planning. For a reliable external commitment, communicate P80 or P90, depending on your risk appetite and any penalty clauses.
How does Monte Carlo differ from the critical path?
The critical path works with fixed durations and systematically underestimates project duration because of path convergence. Monte Carlo factors in the spread and delivers a probability instead of a single date.
Do I need historical data for a Monte Carlo simulation?
No. You can start with expert estimates of an optimistic, most likely and pessimistic value. With real plan/actual data, for example from Merlin Project, the simulation becomes considerably more reliable.
Do I have to account for correlations between tasks?
If one risk hits several tasks at the same time, for example bad weather affecting all outdoor work, then yes. Uncorrelated deviations average each other out, correlated ones do not. Whoever ignores correlation underestimates the range and with it the schedule risk.