Introduction to Monte Carlo Simulation in Python
Intro
This post provides a brief introduction to Monte Carlo simulations in Python, using the numpy, pandas, and plotly libraries.
What Are Monte Carlo Simulations?
Monte Carlo simulation is the process of using random sampling to model the possible paths a variable could take. Rather than deriving some abstract equation to predict an outcome, it is sometimes more practical to randomly simulate that outcome many times. By the Law of Large Numbers, as the number of samples increases, the observed results will approach the theoretical solution.
Nassim Taleb (a statistician and former options trader) describes the process of creating Monte Carlo simulations in his book Fooled by Randomness: “One sets conditions believed to resemble the ones that prevail in reality, and launches a collection of simulations around possible events.” Put another way, the goal of Monte Carlo is to simulate the collection of all possible paths (using random sampling) in order to find the relevant possibilities.
Monte Carlo simulations follow a three-part process, drawn from Introduction to Monte Carlo Simulation, which I will define below.
- Model a system as a single or series of probability density functions
- Repeatedly sample from the probability density functions
- Calculate the relevant statistics
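To make the three steps concrete, here is a minimal sketch that estimates the expected value of a fair die roll. The die is our probability distribution (step 1), we sample it many times (step 2), and the sample mean is our statistic (step 3); by the Law of Large Numbers it should land close to the theoretical answer of 3.5.
from numpy.random import default_rng
rng = default_rng()
samples = rng.integers(1, 7, size=100_000)  # step 2: roll the die 100,000 times
print(samples.mean())  # step 3: should be close to the theoretical 3.5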
Example
A simple example of implementing this is as follows: you flip a coin; if it lands on heads you get +1, and if it lands on tails you get -1. What will the final score be after 100 flips? Let’s try it out below.
from numpy.random import default_rng
import numpy as np
import plotly.express as px
import pandas as pd

rng = default_rng()  # numpy's recommended random number generator
The first thing to do is import the necessary libraries. In this case I’ll be using numpy, pandas, and plotly, along with numpy’s default_rng to create a random number generator.
scores = [0]
num = 0
for i in range(100):
    num += rng.choice([-1, 1])  # each flip moves the score up or down by 1
    scores.append(num)
coin = pd.DataFrame({'Score': scores})
Next we create a list to hold the history of our score, loop through 100 flips modifying the score by each outcome, and collect the results in a dataframe.
fig = px.line(coin, x=coin.index, y='Score')
fig.update_layout(title='100 Coin Flips Score', xaxis_title='Flip Number')
fig.show()
In this example, the underlying distribution is an unbiased coin flip, so there is an even 50/50 chance of flipping either heads or tails. It might be expected that after 100 flips the final score would be 0, but as shown above that isn’t necessarily the case. The flips are independent and random, so while the score may return to 0, it can also drift away on a run of consecutive heads or tails. But what scores might we get if we repeat the above process 5000 times and then plot the probability density function (PDF) of the 5000 final scores? Let’s see below.
num_trials = 5000
trials = np.zeros(num_trials)
# run the 100-flip experiment independently 5000 times,
# keeping only the final score from each trial
for t in range(num_trials):
    num = 0
    for i in range(100):
        num += rng.choice([-1, 1])
    trials[t] = num
Now let’s repeat the earlier process 5000 times, keeping only the final score after each run of 100 flips. I’ll also create an array to hold all 5000 final scores.
fig = px.histogram(trials, histnorm='probability density', marginal='box')
fig.update_layout(showlegend=False, title='PDF of 5000 Outcomes of 100 Flips')
fig.show()
Shown above is the PDF of the 5000 outcomes of 100 coin flips. With equally weighted coin flips we see what might be expected: a median outcome of 0. So the most likely outcome is about 0, but that doesn’t tell the whole story. What is also evident is that this PDF is converging toward a normal distribution, which occurs due to the central limit theorem. Another conclusion that can be drawn is that a final score after 100 flips will rarely be more extreme than +30 or -30 (though it is still possible!).
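We can sanity-check this against theory: each flip contributes +1 or -1 with variance 1, so the final score of 100 flips has a standard deviation of sqrt(100) = 10, which makes scores beyond +30 or -30 three-standard-deviation events. A quick check, reusing the trials array from above:
print(trials.mean())  # should be close to the theoretical mean of 0
print(trials.std())   # should be close to sqrt(100) = 10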
Real World Applications
Now that I have introduced a short example, where is this method used in the real world? I referenced Nassim Taleb at the beginning of this post; he used Monte Carlo simulations as an options trader to calculate the potential future prices of options.
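As a rough illustration of the idea (a minimal sketch of a common textbook approach, not Taleb’s actual method), one can simulate many possible terminal prices of a stock under a geometric Brownian motion model and average the discounted option payoffs. All parameter values below are illustrative assumptions, and rng and np are reused from earlier in the post.
S0, K = 100.0, 105.0          # assumed spot price and strike
r, sigma, T = 0.02, 0.2, 1.0  # assumed risk-free rate, volatility, horizon in years
n_paths = 100_000

# sample terminal prices from the geometric Brownian motion closed form
Z = rng.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

# the discounted average payoff estimates the call option's fair value
payoff = np.maximum(ST - K, 0.0)
print(np.exp(-r * T) * payoff.mean())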
Another practical use is in 3D graphics, such as video games. Simulating every single photon of light emitted from a source in a 3D environment would be far too computationally expensive. As a shortcut, path tracing uses Monte Carlo methods to randomly simulate rays of light. These rays travel through the 3D environment and bounce off objects in the scene. At each bounce more randomly generated rays are created, until the whole scene is sufficiently illuminated.
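At its core, path tracing relies on the same trick as the coin-flip example: estimating a quantity (here, an integral of incoming light) by averaging random samples instead of evaluating every possibility. As a minimal stand-in sketch, here is Monte Carlo integration of sin(x) over [0, pi], whose exact value is 2, again reusing rng and np from above:
n_samples = 100_000
x = rng.uniform(0.0, np.pi, n_samples)  # random sample points in [0, pi]
estimate = np.pi * np.sin(x).mean()     # (b - a) times the average of f(x)
print(estimate)  # should be close to the exact answer of 2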
Resources
Finally, if you want to learn more about Monte Carlo simulations, I have linked some resources below which I think might be useful. One highlight is the link to the Numpy documentation for generating random samples from a specific distribution. It covers basically every standard distribution one might use, such as Gaussian, Pareto, Gamma, and more, so if you know which distribution fits your problem, those can be a useful starting point (see the short sketch after the list below).
- Numpy Random Sampling Documentation and Distributions
- Estimating Pi using Monte Carlo — An interesting YouTube video
- Deep Learning Book — Chapter 17
- The Forgotten Algorithm
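As a quick taste of that documentation, here is a minimal sketch of drawing samples from a few of those built-in distributions with numpy’s generator (the parameter values are arbitrary):
normal_samples = rng.normal(loc=0.0, scale=1.0, size=1000)  # Gaussian
pareto_samples = rng.pareto(a=3.0, size=1000)               # Pareto
gamma_samples = rng.gamma(shape=2.0, scale=1.0, size=1000)  # Gamma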
If you want to access the code or connect more, check out my GitHub, Twitter, and the rest of my Medium.