Homebrew Monte Carlo Simulations for Security Risk Analysis Part 2

Previously I wrote about how I had implemented the simple quantitative analysis from Doug Hubbard’s book ‘How to measure anything in cybersecurity’ into javascript.

When I wrote that code for Monte Carlo simulation I was working with percentage probabilities derived from expected rates of occurrence which I spoke about here. This was a bit clunky and really fell down with high rates of occurrence (2 or more times in the period under analysis).

I briefly swapped messages with Doug later last year and he suggested that the Poisson distribution was likely what I needed, either that or reduce the period of time of analysis. The latter seemed like a cludge especially when in the same portfolio I had risks that were once every five years and in some cases were expected to occur once a day.

An inverse sampling of a Poisson distribution is not easily available in Excel so I switched to using the R programming language.

I started with the qpois function in R that provides the ability to sample quartiles as a result the following call provides the rate of occurrence at each point of sampling given the expected rate of occurence (lambda). The function runif() provides a random number between 0 and 1 to choose the quartile from which to sample the poisson distribution:

occurrence <- 36
qpois(runif(1,0:1),lambda=occurrence)

R provides a great function called replicate() that allows you to take multiple samples into a list (And importantly regenerates random numbers for each sample). The code snippet below runs the Poisson sampling 1000 times and creates a list of 100 samples.

occurrence <- 36
sample <- 1000
occurrencelist <- replicate(sample, qpois(runif(1,0:1),lambda=occurrence), simplify = TRUE)

That is half of a fairly naive Monte Carlo simulation Now we need to consider the consequences of the risk occurring. previously I copied Doug’s use of the lognormal distribution for estimating harm. this distribution has some research that supports its use. However, lognormal distributions have long tails with some big numbers and can deliver unexpected outcomes especially if you have broad uncertainty around the range of outcomes.

The modified PERT distribution allows us to take those uncertain estimates with the skew of a lognormal distribution but is a bit more discriminating of long-tail values which in the previous implementation I was cutting off anyway. The following function qpert() in the mc2d package provides a quartile sample which can be randomised as with the Poisson sample.


minharm <- 10000
likelyharm <- 50000
maxharm <- 250000
confidence <- 4
qpert(runif(1,0:1),min=minharm, mode=likelyharm, max=maxharm, shape=confidence)

Combining the output of the rate of occurrence sampling from the Poisson Distribution with a set of samples from the Modified PERT distribution can be done as follows:

sample <- 10000
occurrence <- 36
minharm <- 10000
likelyharm <- 50000
maxharm <- 250000
confidence <- 4


occurrencelist <- replicate(sample, qpois(runif(1,0:1),lambda=occurrence), simplify = TRUE)

outcomelist <- vector("vector", length(occurrencelist))
x <- 1
for (i in occurrencelist)
{
  loopvector <- unlist(replicate(i, qpert(runif(1,0:1),min=minharm, mode=likelyharm, max=maxharm, shape=confidence),simplify=FALSE))

  average <- sum(loopvector)/length(loopvector)
  outcomelist[x] <- average
  x = x+1
}

print(mean(na.omit(outcomelist)))

This bit of code runs 10000 samples of the simulation and prints a mean (average) of the results. It’s quick and dirty but it works. Hopefully, this will help anyone else following a similar path. There is a maturing open source project doing this released by Netflix called RiskQuant and if you’re looking for a tool to run rather than write your own I recommend that you use that.

As usual, if you spot any glaring errors or just want to tell me how bad my code is you can drop me a line (I know how bad my code is btw).

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.