Monte Carlo Simulation – Introduction
In this article I will explain about Monte Carlo simulations – what they are, when they are useful and how they can be done with Excel with an easy example. “A Monte Carlo simulation provides an answer to a simple, yet difficult to answer question, by providing an estimate plus an uncertainty tolerance.” Whenever I have a problem that involves uncertainty, I start with a basic MC simulation. Usually, it will give me a pretty good idea what to expect. The rest is fine tuning of inputs and a nice presentation of results. And best of all, I can built it with Excel in a matter of minutes.
An Example Question
Imagine that you received a job offer with a promotion – but you must relocate to a different country, and not just yourself, but your entire family and household. Your employer provides you with a relocation allowance and certain benefits to support your move. But how do you know if the offered benefits will be enough? You need an estimate of the expected costs and compare them with the offered job package, and you need to do so quickly (your boss is waiting for your answer!). But it gets worse: you have never been in that location before and you have only vague information about what costs to expect.
Once you reach home you talk to your spouse, seeking advice (and some comfort), but instead your hear “Let’s make sure our kids get into a good school!”, followed by “Shall we take all our furniture with us or do a garage sale before we leave?”. The pressure is on.
What Monte Carlo Simulations do
There is plenty of material about Monte Carlo simulations available; for the purpose of this article and the above example question, my definition is simple. A Monte Carlo simulation tries to provide an answer to a simple, yet difficult to answer question, by providing an estimate plus an uncertainty tolerance. Let’s look at the question and answer for our example:
- Question: How much will I have to pay out of my own pocket for the relocation?
- Answer: It will cost XYZ or less with a probability of 90%
In fact, the simulation provides more information than that, for example:
- The Average (or “Expected”) Cost (probability: 50%)
- The Lowest Possible Cost
- The Worst Case Scenario Cost
- Everything in between, with probabilities included
After asking the right question, the second most important part of building MC simulations is the input. As for most models, garbage in – garbage out says it all. In our example case we start with identifying the cost categories before getting into the details of each. Relocating is complex and depending on where you relocate to, you might or might not have costs like the below:
- Container / Packing Costs
- Insurance and Customs
- Damage of not insured items (oh yes, that happens!)
- Yearly Rent, to be paid in advance
- Insurances, medical checkups, vaccinations
- School fees for children
- Utilities, local taxes & fees
That sounds like a lot of costs – but after reviewing your job package you start breathing again, as some costs are 100% covered, while others are partially covered. You go through every paragraph, do some web research about the new location and make some phone calls to get some more information about each cost item, and you come up with a table with cost estimates:
|Rent||$ 40,000||$ 35,000||$ 50,000|
|Child 1 School||$ 15,000||$ 12,000||$ 20,000|
|Child 2 School||$ 10,000||$ 10,000||$ 16,000|
|Utilities||$ 4,800||$ 4,000||$ 12,000|
After running the simulation, the result can be interpreted on a chart that shows the cumulative probabilities. The simulation creates 1000 different scenarios with varying inputs. Employer coverage values are subtracted from each scenario and the remaining amounts are summed up. This is the amount you need to pay out of your own pocket for the relocation – or that you “earn”, if you are lucky.
The cumulative probabilities chart sums up all scenario results, ordered by cost. It shows that there are scenarios where your cost is negative – means, you have money to keep. This happens if your allowances are paid to you in cash in full and your costs are lower than the allowances. Scenarios are ordered in increasing order and all probabilities are summed up. The x-axis shows best and worst case scenarios:
- Best case: You can pocket 10,000 USD
- Worst case: You have to pay 30,000 USD
However, both cases are unlikely to occur and none provides the answer we are looking for.
We want to know, with 90% certainty, what our maximum relocation cost will be. To do so, go up the y-axis of the graph until you reach 90%. Then go to the right until you hit the graph line. From there, go down to the x-axis and read the value. In our case, it shows approx. 14,700 USD.
Therefore, our answer is: With 90% certainty (9 out of 10 times) our maximum cost will not exceed 14,700 USD. In fact, it can also be much lower. Alternatively, we can say that only in 1 out of 10 times will our cost be above 14,700 USD.
In the same way we can estimate the probability of having no cost at all: starting from the x-axis at 0 USD, move up to the chart line, then go left to get the probability (approx. 11%, or 1 out of 10 times). So it appears unlikely that all costs are going to be covered, and the new salary must make up for it to make the move financially worthwhile.
From Deterministic to Probabilistic Values
In business environments often the best / most likely / worst case scenario models are used. They are easy to build and easy to understand. However, they can be misleading and fail to answer the question. In our example case, this model would look like below:
The main weakness of this model is that it does not allow a proper risk assessment. Best and worst case scenarios are equally unlikely, and the middle ground is often only the average or expected value. Choosing this scenario is like making decisions by flipping a coin. A Monte Carlo model provides all the required information to make a risk based decision: Severity – “how bad can it get?” and Likelihood – “what’s the probability of it happening?”. It also allows for getting this information for very complex scenarios, while deterministic models often must strongly simplify the case to get any results.
I only used a simple example for now to simplify interpretation and model building. Obviously, this model (like all models) has limitations that can make the results not as good as needed. In this particular example, the problem is that the model assumes that all inputs are independent of each other. In most cases this is unrealistic; e.g. if you pay more rent, then it is likely that you are renting a larger property. A larger property requires more electricity and water, hence utility costs are expected to increase with property size. In statistical terms it means that both variables are correlated; but this model does not take this into account and happily creates scenarios where the yearly rent is low and the utility costs are skyrocketing, or vice versa.
In later articles I will discuss and showcase models that take correlated inputs into account. It can dramatically change results and must be taking into consideration during model building.