Bootstrapping
Bootstrapping is a method for evaluating the variance of an estimator using Nboot data sets, each containing N points obtained by random (say, Monte Carlo) sampling, with replacement, from the original set of N points.
More generally, it lets us assign measures of accuracy (bias, variance, confidence intervals, prediction error, or some other such measure) to sample estimates.
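The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bootstrap_variance`, its parameters, the default of 1000 resamples, and the seed are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_variance(data, estimator, n_boot=1000, seed=0):
    """Estimate the variance of `estimator` by resampling `data` with replacement.

    `n_boot` plays the role of Nboot; both it and `seed` are illustrative choices.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Each bootstrap data set has the same size N as the original data.
        resample = rng.choices(data, k=n)
        replicates.append(estimator(resample))
    # Sample variance of the estimator across the Nboot replicates.
    return statistics.variance(replicates)


data = [1, 2, 3, 4, 5]
var_mean = bootstrap_variance(data, statistics.mean)
var_median = bootstrap_variance(data, statistics.median)
print(var_mean, var_median)
```

Because the estimator is passed in as a function, the same sketch works for the mean, the median, or any other statistic computed from a sample.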
For the purposes of our example, let's assume that a candidate takes 1, 2, 3, 4, or 5 months to prepare for the FRM-P2 exam. We then select 5 random candidates from 20 different cities to check how many months they take to prepare for the exam.
We begin with a statistical sample from a population that we know nothing about. Our goal is a 90% confidence interval for the population mean. Whereas other statistical techniques for determining confidence intervals often assume that we know something about the population, such as its standard deviation or the form of its distribution, bootstrapping does not require anything other than the sample.
We now resample with replacement from our sample to form what are known as bootstrap samples. Each bootstrap sample has a size of five, just like our original sample (1, 2, 3, 4, 5). Since each value is put back after it is randomly selected, the same value can be drawn more than once, so the bootstrap samples may differ from the original sample and from each other. The 20 bootstrap samples used in this example are:
- Sample 1: 3,3,1,5,5. Mean of sample 1 = 3.4
- Sample 2: 1,2,2,3,3. Mean of sample 2 = 2.2
- Sample 3: 3,2,5,2,1. Mean of sample 3 = 2.6
- Sample 4: 4,3,4,2,3. Mean of sample 4 = 3.2
- Sample 5: 1,2,4,4,4. Mean of sample 5 = 3
- Sample 6: 2,2,2,2,2. Mean of sample 6 = 2
- Sample 7: 5,4,3,1,1. Mean of sample 7 = 2.8
- Sample 8: 1,1,5,4,3. Mean of sample 8 = 2.8
- Sample 9: 3,3,4,4,1. Mean of sample 9 = 3
- Sample 10: 5,1,3,2,2. Mean of sample 10 = 2.6
- Sample 11: 5,5,4,1,4. Mean of sample 11 = 3.8
- Sample 12: 1,4,4,4,5. Mean of sample 12 = 3.6
- Sample 13: 3,3,1,1,1. Mean of sample 13 = 1.8
- Sample 14: 4,1,5,1,2. Mean of sample 14 = 2.6
- Sample 15: 5,5,5,5,4. Mean of sample 15 = 4.8
- Sample 16: 1,2,5,1,1. Mean of sample 16 = 2
- Sample 17: 4,3,5,2,1. Mean of sample 17 = 3
- Sample 18: 5,1,2,1,5. Mean of sample 18 = 2.8
- Sample 19: 4,4,1,1,2. Mean of sample 19 = 2.4
- Sample 20: 5,5,1,2,3. Mean of sample 20 = 3.2
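A resampling step of the kind that produces samples like those listed above might look as follows. This is only a sketch: the seed is arbitrary, so the samples it draws will not match the 20 samples in the list.

```python
import random

original = [1, 2, 3, 4, 5]  # the five observed preparation times, in months
rng = random.Random(42)     # arbitrary seed, chosen only for reproducibility

# Draw 20 bootstrap samples, each of size 5, sampling with replacement.
bootstrap_samples = [rng.choices(original, k=len(original)) for _ in range(20)]

for i, sample in enumerate(bootstrap_samples, start=1):
    print(f"Sample {i}: {sample}, mean = {sum(sample) / len(sample)}")
```

`random.choices` samples with replacement, which is why a value such as 2 can appear several times within one bootstrap sample.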
Mean
Since we are using bootstrapping to calculate a confidence interval for the population mean, we now calculate the mean of each of our bootstrap samples. These means, arranged in ascending order, are: 1.8, 2, 2, 2.2, 2.4, 2.6, 2.6, 2.6, 2.8, 2.8, 2.8, 3, 3, 3, 3.2, 3.2, 3.4, 3.6, 3.8 and 4.8.
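As a quick check of the arithmetic, the 20 samples from the example can be typed in and their means computed and sorted. The variable names here are illustrative only.

```python
# The 20 bootstrap samples from the example, in the order they are listed.
samples = [
    [3, 3, 1, 5, 5], [1, 2, 2, 3, 3], [3, 2, 5, 2, 1], [4, 3, 4, 2, 3],
    [1, 2, 4, 4, 4], [2, 2, 2, 2, 2], [5, 4, 3, 1, 1], [1, 1, 5, 4, 3],
    [3, 3, 4, 4, 1], [5, 1, 3, 2, 2], [5, 5, 4, 1, 4], [1, 4, 4, 4, 5],
    [3, 3, 1, 1, 1], [4, 1, 5, 1, 2], [5, 5, 5, 5, 4], [1, 2, 5, 1, 1],
    [4, 3, 5, 2, 1], [5, 1, 2, 1, 5], [4, 4, 1, 1, 2], [5, 5, 1, 2, 3],
]

# Mean of each bootstrap sample, arranged in ascending order.
sorted_means = sorted(sum(s) / len(s) for s in samples)
print(sorted_means)
```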
Confidence Interval
We now obtain a confidence interval from our list of bootstrap sample means. Since we want a 90% confidence interval, we use the 5th and 95th percentiles as the endpoints of the interval. The reason for this is that we split 100% - 90% = 10% in half, so that we keep the middle 90% of all of the bootstrap sample means. With 20 means, 5% corresponds to a single value, so we discard the smallest mean (1.8) and the largest mean (4.8). For our example above, this gives a confidence interval of 2 to 3.8 months for the mean time a candidate takes to prepare for the FRM-P2 exam.
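The interval can be computed from the sorted means with a simple trimming rule, sketched below. The function name `percentile_interval` and its parameters are assumptions made for illustration. The rule drops a whole number of values from each tail; other percentile conventions, such as linear interpolation between neighbouring values, can give slightly different endpoints.

```python
# Sorted bootstrap sample means from the example.
sorted_means = [1.8, 2, 2, 2.2, 2.4, 2.6, 2.6, 2.6, 2.8, 2.8,
                2.8, 3, 3, 3, 3.2, 3.2, 3.4, 3.6, 3.8, 4.8]


def percentile_interval(sorted_values, confidence_pct=90):
    """Return (lower, upper) after trimming the same share from each tail.

    Assumes `sorted_values` is already in ascending order.
    """
    n = len(sorted_values)
    tail_pct = (100 - confidence_pct) / 2  # e.g. 5% in each tail for 90%
    k = int(n * tail_pct // 100)           # values to drop from each end
    return sorted_values[k], sorted_values[n - 1 - k]


lower, upper = percentile_interval(sorted_means, 90)
print(lower, upper)
```

With only 20 bootstrap samples the endpoints are coarse; in practice a much larger number of resamples is typically used so that the tail percentiles are better resolved.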
The bootstrap is a robust, non-parametric method that often does well with smaller samples or awkward distributions.
One of the biggest advantages of the bootstrap method is its simplicity. There is no need to assume that the data follow some specific theoretical distribution. We use the empirical distribution instead, which lets us analyze statistics whose theoretical properties cannot be derived mathematically.

