Introduction: To Get Vaccinated or Not? a Project on Observing Herd Immunity Through Disease Simulation
Overview of Project:
Our project explores herd immunity and hopes to encourage people to get vaccinations to decrease infection rates in our communities. Our program simulates how a disease infects a population with different percentages of vaccinated and unvaccinated rates. It shows herd immunity by showing how an increased number of the vaccinated population can decrease the number of people affected.
We model this in Matlab using graph theory concepts. Graph theory is a mathematical way to represent the relationships between objects. In graph theory, graphs have vertices (or nodes) connected by edges (or lines). For our project, the nodes are the individuals concerned and the edges are their connections. For example, if two nodes are connected with an edge then it means they are "friends" or have some form of contact with each other. This contact is a way for the disease to spread. This is why we used graph theory to model our concept because we wanted to see how disease spreads among individuals that are connected in a population.
Our project also involves the Monte Carlo Method. The Monte Carlo method are algorithms that create repeated random sampling to receive numerical results. In our project, we use this method to run our simulation several times changing the percent of initial unvaccinated to see the rate at which people get infected.
All project code is linked at the bottom!
Matlab link to Graph Theory: https://www.mathworks.com/help/symbolic/graph-theo...
Step 1: Create Adjacency Matrix
Create a new script. We are going to call ours 'infectionSim.m'.
We are going to create a variable 'NUMOFPEOPLE'. You can assign it to any integer value. This will represent the number of people in your population.
From now on, we will assume that
NUMOFPEOPLE = 20;
First start by using Matlab's graph theory functions for an undirected graph.
If you are interested in learning more, here is a link for you to read more in depth about it.
Created an adjacency matrix.
adjMatrix = zeros(NUMOFPEOPLE);
This will create a square matrix of 0s. Each row in the matrix is a person. Each column in the matrix is a person or friend that the person meets throughout the day.
See Figure 100 (above) to help visualize what adjMatrix looks like for 20 people.
**From this point on we will assume NUMOFPEOPLE is equal to 20.**
You can try plotting this adjacency matrix. Here is a little more information about plotting these types of matrices.
Note: How adjacency matrix works.
%making the adjacent matrix a = [0,1,0,0,0; 1,0,1,1,1; 0,1,0,0,0; 0,1,0,0,0; 0,1,0,0,0] %plotting g = graph(a); %using the graph function (graph theory) figure(1); h = plot(g);
See Figure 1 (above) to see how to add edges in the adjacency matrix, using the code in "Note".
Step 2: Create Relationships.
Now that the people (vertices or nodes) are created, we need to create a network of relations (lines or edges of the graph). This will simulate how people interact and meet other people throughout a day.
This can be done many ways. One way to complete this task is to first assign a random number to each person to determine how many people each person will interact with in a day.
numOfFriendsMatrix = randi([leastFriendsPersonCanHave,mostFriendsPersonCanHave],1,NUMOFPEOPLE);
This makes a 1 by 20 matrix of random integers representing the number of interactions each person has a day. The columns of this matrix would be the number corresponding to each person. For instance if we assign the leastFriendsPersonCanHave = 2 and mostFriendsPersonCanHave = 5, we would get random values between 2 and 5.
Having trouble with randi()? In terminal, type
Next, we make a randomized matrix (called "allFriendsmatrix") of how each person in the population is connected/interacts within the population.
tempMatrix = ; count = 0; allFriendsMatrix = ; for k = 1:NUMOFPEOPLE while length(tempMatrix) ~= numOfFriendsMatrix(k) count = count +1; temp = randi([1, NUMOFPEOPLE]); tempMatrix(count) = temp; end clear each while length(tempMatrix) ~= 9 tempMatrix = [tempMatrix, NaN]; end allFriendsMatrix = [allFriendsMatrix; tempMatrix]; tempMatrix = ; count = 0; end
In depth explanation of code:
First we create an empty temporary matrix to hold each person's friends/interaction list. We also initialize count, which just keeps track of where to stick the new random connection in the tempMatrix. The for loops run 20 times so that this happens for each individual person in the population. The first while loop runs until each person's tempMatrix is the same length of the randomly assigned number of interactions. In this loop, a random number corresponding to person in population is generated and placed into the tempMatrix. Because the lengths of each of the tempMatrixes are different, we needed to create some NaN values so that we can concatenate all these tempMaticies all into one matrix ('allFriendsMatrix'). The second while loop solves this problem by adding NaN's into each tempMatrix. The while loop was set to run 9 times because it is a number greater than 5, which was the upper bound of friends a person can be assigned. The value '9' is variable and can/must be changed when 'mostFriendsPersonCanHave' is greater than 9. The last three lines of code (excluding the end) adds the tempMatrix into the next row of the 'allFriendsMatrix'. Then it clears out tempMatrix and count for the next person.
This is what the output should look like for the first run through the for loop (before the last three lines).
tempMatrix = 16 8 17 16 13 NaN NaN NaN NaN<br>
allFriendsMatrix = 16 8 17 16 13 NaN NaN NaN NaN 8 8 2 7 11 NaN NaN NaN NaN 10 13 NaN NaN NaN NaN NaN NaN NaN 11 17 2 NaN NaN NaN NaN NaN NaN 10 12 NaN NaN NaN NaN NaN NaN NaN 4 13 2 12 NaN NaN NaN NaN NaN 17 10 9 3 1 NaN NaN NaN NaN 16 16 6 NaN NaN NaN NaN NaN NaN 3 8 17 17 14 NaN NaN NaN NaN 20 19 3 NaN NaN NaN NaN NaN NaN 13 10 NaN NaN NaN NaN NaN NaN NaN 2 18 10 16 NaN NaN NaN NaN NaN 2 6 14 3 13 NaN NaN NaN NaN 8 16 14 8 NaN NaN NaN NaN NaN 7 7 NaN NaN NaN NaN NaN NaN NaN 19 10 9 NaN NaN NaN NaN NaN NaN 10 19 NaN NaN NaN NaN NaN NaN NaN 5 18 NaN NaN NaN NaN NaN NaN NaN 1 7 NaN NaN NaN NaN NaN NaN NaN 16 7 13 10 1 NaN NaN NaN NaN
Next, add these relationships to the adjMatrix.
for eachRow = 1:NUMOFPEOPLE for eachCol = 1:9 if isnan(allFriendsMatrix(eachRow, eachCol)) == 0 adjMatrix(eachRow, allFriendsMatrix(eachRow, eachCol)) = 1; adjMatrix(allFriendsMatrix(eachRow, eachCol), eachRow) = 1; end end end
This double for loop goes through each row and column of the 'allFriendsMatrix'. The if statement will run for all values that aren't 'NaN'. Basically it will create the edges or lines of the graph. So the first line this will make is person 1 to person 16 and person 16 to person 1. Because it is undirected, 1 must be changed for both! We cannot just have the edge 1 to 16 and not 16 to 1. They must be symmetric for it to run properly in Matlab.
In our simulation, we established that people cannot interact with themselves. When we randomized the values, there is a chance our adjacent matrix has this errors.
Let's fix this with the following code:
for each = 1:NUMOFPEOPLE adjMatrix(each, each) = 0; end
This for loop ensures that person 1 isn't connected to person 1, person 2 isn't connected to person 2, etc by making all them all 0. As you can see below in the output section, we have the diagonal of the square matrix from the top left to bottom right are all 0's.
This is the final adjMatrix for this current simulation. This accounts for all the lines in the graph (Figure 2).
adjMatrix = 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 1 0 1 1 0 0 0 1 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1 0 1 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 1 0 1 0 0 0 0 1 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0
See Figure 2 to see graph of 'adjMatrix'.
Step 3: Add Disease Statistics.
Now that your program can create a graph with a set of random people and create random relationships, we need to input the disease's information or statistics to see how these interactions within a population can increase or decrease infection.
Create these variables:
unvacc %type: double; percent chance of unvaccinated people not getting the sickness vacc %type: double; percent chance of vaccinated people not getting the sickness unvacc_perc %type: double; percent population unvaccinated init_infect %type: int; percent population vaccinated
Next we need to do some calculations.
We are going to make a 'infectionMat' which is a 3*NUMOFPEOPLE matrix.
vacc_perc = 1-unvacc_perc; infectionMat = nan(3,NUMOFPEOPLE); number = round(vacc_perc * NUMOFPEOPLE); infectionMat(1,1:number) = vacc; infectionMat(1,number+1:end) = unvacc; infectionMat(2,1:end) = 0; infectionMat(2,1:init_infect) = 1; <br>
line 1: Percent of population unvaccinated calculated
line 2: create a 3*N number of people matrix
line 3: find out number of people vaccinated from vaccinated percentage
line 4: for the vaccinated people, give them an immunity associated with having the vaccine. This value is assigned based from research about the disease.
line 5: for the rest of the population (unvaccinated persons), give them the percent immunity. This value is assigned based from research about the disease.
line 6: initially set all the people to not infected.
line 7: for the number of people initially infected, fill in first couple columns accordingly.
Now that we have set all the parameters for the disease simulation, we are going to randomize the chance of whether the person (both vaccinated and unvaccinated) gets infected. This is done in the next step by assigning random values between 0 and 1 to each person in the third row of this 'infectionMat'.
Step 4: Randomize the Chance a Vaccinated and Unvaccinated Person Can Get Infected.
Next, assign each person a random number, this will be used later to determine whether the person gets infected or not.
for w = 1:length(infectionMat) infectionMat(3,w) = rand; end
This for loop deals with the third row of the 'infectionMat' created in the last step. 'rand' assigns a value between 0 and 1 to each index of row 3.
infectionMat is now complete! This was with a population with 100% vaccination and 1 person initially infected.
infectionMat = Columns 1 through 12 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 1.0000 0 0 0 0 0 0 0 0 0 0 0 0.0869 0.5489 0.3177 0.9927 0.7236 0.5721 0.7172 0.9766 0.4270 0.9130 0.8973 0.8352 Columns 13 through 20 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0.7500 0 0 0 0 0 0 0 0 0.0480 0.3593 0.2958 0.6291 0.1362 0.3740 0.8648 0.2503
row 1: Percent chance of NOT getting the disease
row 2: Infected or not infected (boolean value)
row 3: Number used to check if person who is not infected gets infected if they meet an infected person. If uninfected person meets infected person, this number is greater than the number in row 1 (for the same column), then they are infected. We will code this functionality out in step 7.
Step 5: Create Matrices of People Who Are Unvaccinated and Infected From Initial Information.
Create 2 matrices called "matrixUnvacc" and "matrixInfected" that stores all the infected people from infectionMat. This will be used so that we can color-code the graph of those who are infected, unvaccinated, or vaccinated, helping visualize the impact of unvaccinated versus vaccinated individuals.
clear each matrixInfected = ; matrixUnvacc = ; for h= 1:length(infectionMat) if infectionMat(1,h) == unvacc matrixUnvacc = [matrixUnvacc,h]; end end for person = 1:NUMOFPEOPLE if infectionMat(2, person) == 1 matrixInfected = [matrixInfected, person]; end end
Create two empty matrices to store the numbers of the people who are unvaccinated and infected, respectively. Both for loops run 20 times and if the if statement is satisfied, then the number is added to the correct matrix.
matrixUnvacc = [ ]
matrixInfected = [ 1 ]
Step 6: Plot Initial Graph.
Next we are going to plot the adjacency matrix.
g = graph(adjMatrix); figure(1) p = plot(g, 'NodeColor', 'b', 'MarkerSize', 7); highlight(p, matrixUnvacc, 'NodeColor','g') highlight(p, matrixInfected, 'NodeColor', 'r') title_unvacc = unvacc_perc*100; title(['Percent of people unvaccinated: ', num2str(title_unvacc),'%']); pause(speed)
Graph theory in Matlab has built in functions. When we use the graph() function, we are able to translate the 'adjMatrix' into an actual undirected graph. We then have to create a plot using the plot() function to actually see what it looks like. We set this plot() to a variable so that we can manipulate and change the colors of the plot more easily throughout the simulation. All people (or nodes) are initially set to the color 'blue'. Next, all the unvaccinated people are set to the color 'green'. The infected people are then set to the color 'red'. The title is set according to the certain percentage value of unvaccinated people being tested. The pause() function temporarily stops MatLab execution. We pass through the variable speed which is spread which is calculated in seconds.
See picture (above) to see a random color coded graph.
Learn more about the highlight() function in MatLab.
Step 7: Simulate the Progression of Infection.
Next we need to figure out who gets infected after the interactions (recorded in the adjMatrix) and update graph when someone becomes infected.
Use the adjMatrix to determine which people are infected after their interactions with people in a day.
for eachRow = 1:length(adjMatrix) if infectionMat(2,eachRow) == 1 for eachCol = 1:length(adjMatrix) if adjMatrix(eachRow, eachCol) == 1 % eachRow = the person % eachCol = its friend % each person's friend and see if they are get infected. if infectionMat(3, eachCol) > infectionMat(1,eachCol) infectionMat(2, eachCol) = 1; highlight(p, eachCol, 'NodeColor', 'r') pause(speed) end end end end end
The for loop loops through each person. It checks that if the person is infected, it will check each of the people/friend they interacted with and check if the friend's immunity level was greater than the strength of the disease. This is where the 'infectionMat' we created earlier comes into play. The 1st and 3rd row of each column of the friend is compared and if the 3rd row is greater, it means that the friend didn't have a high enough immunity to escape the disease and ultimately gets infected. We also change to color using highlight() to red if they get infected.
Now your code for the simulation should work! and for any size of population, just change NUMOFPEOPLE!
Step 8: Use Monte Carlo Theory.
To take this one step further and extract data from our simulator ('infectionSim.m'), we wanted to calculate and graph the trend in percent of unvaccinated people who got infected and the percent of vaccinated people who got infected. We hypothesize that the percent of vaccinated people who got infected should be a lot lower than the percent of unvaccinated people who got infected.
Step 9: Make the File ('infectionSim.m') With the Simulation Into a Function.
To run Monte Carlo, we would want to run the simulation multiple times and accumulate data so that we can use that to graph the percentages of people who got infected.
The function could be set up like this:
function output = infectionSim(unvacc, vacc, NUMOFPEOPLE, unvacc_perc, init_infect, speed)
Comment out the variables in your simulation since now you are passing these in through the main file (we will start to write this in step 12) :
unvacc, vacc, NUMOFPEOPLE, unvacc_perc, init_infect
The new variable
will be assigned in the main file (Monte_Carlo.m).
Note: Don't forget the end at the bottom of the function file to end the function!
Step 10: Calculate the Percentage of Unvaccinated and Vaccinated People Who Got Infected.
This calculates the percentage of unvaccinated people who got infected. This code goes at the bottom of the 'infectionSim.m' file.
number_of_unvacc = 0; number_of_infec_unvacc = 0; %calculates percentage of unvaccinated people who got infected for x = 1:length(infectionMat) if infectionMat(1,x) == unvacc number_of_unvacc = number_of_unvacc+1; end if infectionMat(1,x) == unvacc & infectionMat(2,x) == 1 number_of_infec_unvacc = number_of_infec_unvacc +1; end end percentage_of_unvacc_and_infec = (number_of_infec_unvacc / number_of_unvacc)*100;
In the for loop, it will loop over NUMOFPEOPLE times. Each time the number in the infectionMat corresponds to the unvacc number (i.e. 0.95 == 0.95), then the number of unvaccinated people will be increased by 1. Each time the number in the infectionMat corresponds to the unvacc number and they are infected, the number of infected and unvaccinated increases by 1. The last line divides the number of infected, unvaccinated people by total number of unvaccinated people. Then the percentage is calculated from this.
Try to calculate the percentage of vaccinated of people who got infected! (Hint: it is very similar to this above code, however some of variables are changed and names are adjusted.)
Next the percent of people infected based on the total population is calculated:
pre_per_infect = cumsum(infectionMat(2,:)); per_infect = (pre_per_infect(1,NUMOFPEOPLE)/NUMOFPEOPLE)*100;
The cumulative sum is calculated using the second row of the infectionMat, which stores 1s and 0s depending if the person is infected or not. Since the cumsum() function gives back a matrix, we take the last value in the matrix ('pre_per_infect(1,NUMOFPEOPLE)'), which should be the actual sum of all the values from 'infectionMat(2,:)'. By dividing the sum by the NUMOFPEOPLE and multiplying it by 100, we get the final percentage of infected in the total population.
Step 11: Create a Output Variable in Your Function 'infectionSim.m'
output = [per_infect, percentage_of_unvacc_and_infec, percentage_of_vacc_and_infec];
Store this information in output, which will be sent back into main (Monte_Carlo.m) when the function is called and done running. This data is used to graph the points of percent of infected of those who are vaccinated and unvaccinated.
Your 'infectionSim.m' function should be done now! However, it won't run because we still need to write the main!
Step 12: Create a Menu to Get Initial Conditions of the Simulation From User.
Remember how we said the variable
would be created and passed through the main function? We need to obtain the values to pass to the function. Note, order of the values when calling the function do matter!
Start by asking the user to type some answers into the terminal.
------ Sample Menu ------ >> Pick a disease. Note it is case sensitive<br>>> Pertussis >> Flu >> Measles >> Disease Chosen: Flu >> Pick size of population. >> 20 >> 200 >> Population chosen: 20 >> Pick speed of simulation. >> Fast >> Slow >> Speed chosen: Fast
This code below asks the user what disease they want to look into.
disp('Pick a disease. Note it is case sensititive') fprintf('Pertussis\nFlu\nMeasles\n') disease = input('Disease Chosen: ','s'); if isequal(disease,'Pertussis') vacc = .85; %15 percent chance of getting disease unvacc = .20; %80 percent chance of getting disease elseif isequal(disease,'Flu') vacc = .75; %25 percent chance of getting disease unvacc = .31; %69 percent chance of getting disease elseif isequal(disease,'Measles') vacc = .97; %3 percent chance of getting disease unvacc = .10; %90 percent chance of getting disease end
The disp() function prints out the statement to the screen and it also prints out the different options. The disease will be assigned accordingly. This version does not currently account for invalid input. Invalid input will produce an error and stop the program completely. Each disease has vacc and unvacc values associated with it. These values are NOT random. We got these values from researching statistics about the diseases.
Next, we need to ask the user if they want to test a big or small population size for their chosen disease.
disp('Pick size of population.') fprintf('20\n200\n') speed = input('Population chosen: ','s'); if isequal(speed,'20') population_size = 20; elseif isequal(speed,'200') population_size = 200; end
This does prints out a statement to the user and asks the user to enter what size of population it wants to test. This version does not currently account for invalid input. Invalid input will produce an error and stop the program completely. 20 was picked because it is a small sample size that still gives a good idea of how infection spreads throughout a small population. 200 people was picked as the larger option because 200 points plotted on the graph had barely any overlap of points so everything could easily be seen and distinguished from each other.
Next, we need to find the speed of the simulation.
disp('Pick speed of simulation.') fprintf('Fast\nSlow\n') speed = input('Speed chosen: ','s'); if isequal(speed,'Fast') sim_speed = 0; elseif isequal(speed,'Slow') sim_speed = 0.25; end
This process was the same as getting the type of disease and size of population. For fast, there will be no pause. and for slow, there will be a 0.25 second lag in the for loop when running the simulation.
Great! Now we have all the inputs from the user we need! Let's move on to collecting data for different percentages of unvaccinated people.
Step 13: Choose a % of Unvaccinated People and Calculate Average of Unvaccinated and Infected for Chosen Percentage.
This code is for 0% of unvaccinated people.
% ------- %0 Unvaccinated ------------ per_infect_av_0 = ; percentage_of_unvacc_and_infec_av_0 = ; for i = 1:20 out = infectionSim(unvacc, vacc, population_size, 0, 1, sim_speed); per_infect_av_0 = [per_infect_av_0, out(1,1)]; percentage_of_unvacc_and_infec_av_0 = [percentage_of_unvacc_and_infec_av_0, out(1,2)]; end average_infected_0 = mean(per_infect_av_0); average_unvacc_and_infected_0 = mean(percentage_of_unvacc_and_infec_av_0);
The for loop is run 20 times. The output from the function, infectionSim(), is stored in out. Each time the for loop runs, then the percentage of infected in the total population is added to matrix, 'per_infect_av_0'. Additionally, the percentage of unvaccinated and infected is also added each time into the 'percentage_of_unvacc_and_infec_av_0' matrix. In the last two lines, these two, above-mentioned matrices are then averaged and stored in variables. To sum up, the percentages are stored for each simulation, averaged, and graphed. Monte Carlo is used to show the average value of running a simulation and showing the result. For our experimental purposes, we choose to run the simulation 20 times and average those values.
Repeat for all percentages you want to test! This can be done by changing the variable names according to the percentage numbers. We tested for 0%, 5%, 10%, 20%, 30%, and 50%.
The only line that must be changed in the actual code is
out = infectionSim(unvacc, vacc, population_size, 0, 1, sim_speed);
Change the zero to the percent in decimal form. For instance, for 5% unvaccinated simulation, the 0 should be replaced with 0.5.
Step 14: Graph: 'The Trend of Infection in Unvaccinated Vs. Vaccinated for Specified Disease'
This is the code to make a graph of the trend of infection in unvaccinated persons vs. unvaccinated persons.
graph_mat_y = [average_infected_0,average_infected_5,average_infected_10,average_infected_20,average_infected_30,average_infected_50]; graph_mat_x = [0,5,10,20,30,50]; slope = (average_infected_5-average_infected_0)/5; line_y = [average_infected_0,(slope*50)+average_infected_0]; line_x = [0,50]; figure(2) plot(graph_mat_x,graph_mat_y); line(line_x,line_y,'Color','red','LineStyle','--'); title(['Trend in Unvaccination for ',disease]); xlabel('Percentage of Initial Unvaccinated'); ylabel('Percentage of Final Infected')
line 1: assigned y values to the averages of percent infected
line 2: assigned x values to the percentage of initial percent unvaccinated
line 3: calculate the slope of 0% and 5%
line 4: store y values of line. This is a continuation of the 0% to 5% section.
line 5: store y values of line. This line spans the length of the graph.
line 6: create figure
line 7: plot the graph x and y values of the percent infected, who are unvaccinated.
line 8: plot the line. This is used to show that it does not increase linearly, but exponentially.
line 9: Set title for the graph.
line 10-11: Set x and y labels for graph.
Now you should be able to see that the greater percentage of the population unvaccinated, the greater amounts of infection. You will also see that most of the dots that turn red are green dots, showing that the vaccine does help to some degree! Hope you liked this tutorial. Comment if you have any questions!
Step 15: Final Product: What the Simulation Looks Like
All the code can be found here
We have a be nice policy.
Please be positive and constructive.