Supporting Material is available for this work. For more information, follow the link from
the Table of Contents to "Accessing Supporting Material".
Table of Contents
Go Back
Heterogeneity and Network Structure in the Dynamics of Contagion:
Comparing Agent-Based and Differential Equation Models
Hazhir Rahmandad' hazhir@mit.edu
System Dynamics Group, MIT Sloan School of Management, Cambridge MA 02142
Abstract
When is it better to use an agent based (AB) model, and when should differential equation (DE) models
be used? We compare and contrast the dynamics of AB models with those of the corresponding mean-field DE
model, using the common and important context of the spread of contagious disease as an example. We
compare the dynamics of the well-known SEIR model of contagion, a lumped nonlinear DE system, to those of
an explicit AB model of the same system. We examine both the impact of heterogeneity in agent attributes and
the impact of different network structures for the interactions among the agents, including fully connected,
random, Watts- Strogatz small world, scale free, and ring-lattice networks. We further show how agent based
models can be formulated in continuous time while preserving the full stochastic character of state transitions,
allowing AB and DE elements to be combined in the same model. Sensitivity analysis demonstrates the
conditions under which the extra complexity of the AB representation leads to different conclusions compared
to the aggregated DE model.
Introduction
Spurred by growing computational power, agent-based modeling (AB) is increasingly applied to
problems previously modeled with nonlinear differential equations (DE). Scholars interested in the dynamics of
physical, biological, social, and economic phenomena can now choose from highly disaggregate, agent-based
representations to highly aggregated models.
The extended toolset has enabled many interesting and valuable studies, which have contributed to a
better understanding of diverse phenomena. See (Schelling 1978; Carley 1992; Axelrod 1997; Axtell, Epstein et
al. 2002; Epstein 2002; Tesfatsion 2002) for some interesting examples. AB models can readily include
heterogeneity in the attributes of the agents, and in the network structure of their interactions. However, this
increased detail comes at the cost of introducing large numbers of parameters. It can be difficult to analyze the
behavior of an AB model, and the computation required to carry out necessary sensitivity and policy analysis
can be prohibitive, even with small populations of agents. Therefore modelers always choose among different
modeling assumptions and methodologies with different strengths and weaknesses.
Agent-based models in system dynamics are not new. Indeed, the first system dynamics model,
Forrester’s supply chain model (Forrester 1958; Forrester 1961), represents the interactions of four “agents” —
in this case, the four organizations in the supply chain (retailer, wholesaler, distributor, and manufacturer). Each
of these, furthermore, is modeled as a set of “agents” (organizational departments) which exchange information
such as inventory levels, order rates, and cash balances. As in modern agent models, these individual
organizational departments are conceived of as distinct entities, each responsible for specific decisions (e.g.,
production starts in the factory, hiring in the human resources department, advertising in the sales and marketing
organization), and each has access to a specific subset of information cues that they use as a basis for their
decisions. That is, the market and organizations within it are modeled as consisting of distinct boundedly
rational entities, each with a specific and possibly different set of information upon which to base decisions, and
each interacting with others according to well specified rules indeed. While he did not use the now-current
jargon, one of Forrester’s key contributions to modeling methodology was precisely to view systems as
comprised of multiple interacting agents, each perceiving only a limited subset of all available information, and
each with their own local goals, norms, and decision rules. Forrester is adamant that organizations be modeled
in this descriptive fashion and not as a single entity whose overall behavior was conceived of as the solution to a
global optimization problem (see Morecroft (1983; 1985) for elaboration of the representation of bounded
, Special thanks go to John Sterman whose contribution to this paper has been no less than mine.
rationality and limited information in system dynamics models). Many system dynamics models since 1958
likewise consist of multiple agents interacting in some context such as a market, organization, or ecosystem.
While agent-based modeling is not new in system dynamics, there are differences between many
modern agent models and typical system dynamics models. First, while the example of Forrester’s supply chain
model show that SD can portray distinct agents, there are other SD models in which a system is highly
aggregated, with a small number of state variables capturing large populations of agents. Forrester’s (1971)
World Dynamics is perhaps the paradigm case: here the entire world system is portrayed in just five state
variables (aggregate world population, nonrenewable resources, persistent pollution, and stocks of capital in
industry and agriculture).
There are other differences. Most modern agent-based models are stochastic, discrete time systems.
Agents are often modeled as being in a finite number of discrete states, and the state transitions among them are
often stochastic, with probabilities that are either constant or dependent on the states of other agents, as in
cellular automata where a cell’s state or probability of taking a state depends on the states of the cell and its
neighbors; (see Conway’s famous “game of life” and Wolfram (2002); see Larsen and Lomi (2001) for
organizational examples). In contrast, typical system dynamics models are formulated as systems of ordinary
nonlinear differential equations. SD models often do not explicitly model heterogeneity among different
elements accounted for in a stock, instead lumping different entities into a set of stocks at an appropriate level of
disaggregation; within each compartment or stock it is typically assumed that all entities are perfectly mixed.
The flows affecting these stocks are usually formulated using the “mean field” approximation familiar in
physics, that is, as the expected or average value of the underlying probabilistic transitions from one state to
another (though many SD models do include random variations in these average flows). For example, the
perfect mixing assumption within compartments means the probability of exit from a stock, S, is Poisson, that is,
independent of an individual item’s residence time in the stock, 1;, leading to the mean-field approximation that
the outflow is given by the total stock and the mean residence time, S/t;. Understanding where the agent-based
approach brings additional insights can guide system dynamicists to make a better use of agent based ideas in
their modeling projects and therefore can significantly benefit the practitioners and the researchers in the field.
Unfortunately, AB and DE models are seen by many as incompatible modeling paradigms rather than
points on a spectrum, and it is currently difficult to integrate both AB and DE elements in the same modeling
framework. Therefore to compare and understand the effects of different assumptions and to integrate the
differential equation and agent based ideas in mixed mode models some bridging concepts and tools should be
developed. For example, the continuous vs. discrete time dichotomy needs to be reconciled and unit consistency
should be introduced into AB models.
In this paper we contribute to the discussion on both these fronts by comparing the classical SEIR
disease diffusion model developed under both DE and AB paradigms. We use the same software to develop both
versions in parallel and enforce good modeling practices (e.g. unit consistency, independence of simulation
output from the simulation time step) in both models. The results shed light on challenges and opportunities of
integrating DE and AB approaches to build mixed mode system dynamics models that are consistent with good
modeling principles. Moreover, we do extensive sensitivity analysis to examine where the two approaches
converge and under what conditions the extra disaggregation required to build AB models pays off with new
insights or different conclusions. Furthermore, this paper is informative in light of a recent policy debates on
containing a smallpox bioterrorist attack (Halloran, Longini et al. 2002; Kaplan, Craft et al. 2002; Koopman
2002) which highlights the importance of a comparative study between DE and AB models of disease diffusion.
The next section discusses the characteristic differences between AB and DE models and reviews the
literature comparing the two approaches. Next we describe the structure of the DE and AB model. Next the
analysis of the two models is described and finally we present conclusions, lessons learned, and directions for
future work.
Differential Equation vs. Agent Based
It is possible to outline a spectrum of modeling approaches that include different DE and AB models. In
fact in the context of epidemiologic modeling Koopman and colleagues (Koopman, Jacquez et al. 2001) offer a
hierarchy of models based on different aggregation assumptions, and go through the benefits and problems of
each. Their categories include ordinary differential equations (DE), stochastic compartmental (SC) models
where the rates of change are discrete but the rest of the model is similar to DE’s, individual event history (IEH)
where the model tracks the states of individuals, but their network of relationships is ignored, and dynamic
network (DNW) models where not only individuals, but also their network of contacts are explicit.
Even though it is unrealistic to put a hard boundary between DE and AB modeling practices, these two
modeling paradigms as the limits of a spectrum of possibilities differ in some important dimensions. One of the
significant dimensions that separates DE models from AB ones is the level of aggregation. DE models tend to
have a higher level of aggregation (less detail) while AB models lean towards more detailed representation of a
concept. However, there is no dichotomous choice when dealing with aggregation. For example in modeling
world population, one can imagine a continuum of possible aggregation levels, where at the highest level there
is only a single variable representing the number of people, this can be diaggregated by age and country, or
further to the level of individuals. Indeed, why stop at the level of the individual person? Individuals could be
modeled as of ‘agents’ in the form of organs, the organs as consisting of ‘agents’ of cells, the cells as consisting
of organelles, organelles as consisting of molecules, and so on. Agent models, like all models, make
aggregation assumptions and lump together all structure below some threshold. The level of aggregation must
always be chosen to be appropriate to the model purpose and is constrained by available computer power, data
availability, client needs, and the time available for the study.
Another difference between the two approaches is the way change through time is represented. DE
models use a continuous time numerical integration approach where change comes in very small increments
added to each other over very small intervals (denoted DT), time step. AB models typically use events that
change variables to different values in an instant at pre-specified “periods” through simulation. This approach
usually highlights the stochastic nature of change in AB models, even though stochastic elements are commonly
added to DE models as well. Table 1 summarizes the main characteristics of the two modeling paradigms (see
also (Scholl 2001).)
Table 1- Differential Equation vs. Agent Based approaches. The two panes outline the main assumptions
typically used in each method (though there are exceptions for both classes).
Differential Equation Agent Based
- dx/dt = f(x,u) - Set A= {aj, ... an} of agents, each agent has
x, vector of states; u, vector of exogenous states Xq
inputs, including stochastic shocks; f() typically | x can be e.g. health status, location, wealth,
nonlinear beliefs, decision rules, etc.
- Typically in continuous time but difference States x, change according to rules of
equations also common interaction, e.g. Nearest neighbor (on lattice,
Finite number of compartments (elements of | torus, etc.) or other network structure;
x) Stochastic or deterministic.
- No heterogeneity within a compartment. - Discrete time: x,(t) = Rule[x,(t-1)] for all a in
Heterogeneity added by enlarging number of {A}]
compartments, e.g.: - Heterogeneity across agents. Often,
Disaggregation by spatial structure: distribution of states across agents (often
World population P becomes population by assigned randomly)
country, state/province, county, health status, Aggregatio:
gender, etc. Pix Population is sum of agents; Number of people
Disaggregation by attribute in each category (e.g., health status, gender) is
People P become Pix..., where, e.g., i,j, k= sum of agents with those attributes each period.
sex, age, health status, behavior, etc.).
Even though AB models and especially DE models have a fairly long history, to the best of our
knowledge published research at the intersection of the two modeling methodologies is limited. Scholl (2001)
reviews the modeling principles underlying system dynamics and agent based models and suggests that there is
room for the two communities to learn from each other’s work. A few researchers in recent system dynamics
publications have taken on this task by introducing mixed models where agents are modeled in detail with
differential equations (Akkermans 2001; Schieritz 2002; Grobler, Stotz et al. 2003).
On the question of comparing the DE and AB models, a few papers in different domains have looked at
the problem. Picard and Franc (2001) compare the results of an individual based model of forest growth with 3
different aggregate models. The results show that the AB model fit the data better. However, the results should
be interpreted with caution as the AB models by definition have many more parameters that can be used to fit
the data. Further the aggregate models were not fitted to the data; instead their parameters are taken from those
of the AB model. Edwards et al (2003) present an agent based model of innovation selection where
heterogeneous agents have different thresholds for adoption of an innovation and compare it to the
corresponding aggregate model. The analysis suggests that the two models show similar behavior or diverge
under different parameter settings. Specifically, when multiple attractors exist in the deterministic version of the
aggregate model, the behavior may follow a different path than that of agent based model. Koopman et al (2002)
analyze the effect of having integer changes in number of individuals, rather than continuous variables, in
disease modeling. Their results suggest that under some conditions the difference between the two settings
become significant (for small populations). Keeling (1999) formulates a DE model that captures the first order
effects of spatial structures in disease diffusion, traditionally only captured in AB models, that approximates the
behavior of equivalent AB model extremely well even when highly correlated network structures link the
individuals.
In summary these primary studies suggest that in modeling the same phenomenon, the extra complexity
captured in AB models sometimes leads to conclusions that differ from those of the corresponding DE model,
while in many cases the two models have very similar behavior. Such divergence is particularly important if it
changes the policy implications that follow from the modeling effort. For example, Kaplan et al. (2002) used DE
methodology to model the spread of smallpox following a hypothetical bioterrorist attack and compared two
alternative policies, mass vaccination and targeted vaccination, concluding that mass vaccination is far more
effective in lowering the risk of the attack. In contrast, Halloran et al.(2002) develop an AB model and
concluded that targeted vaccination is more effective. It is not clear if the difference in the results come from
differences between the AB and DE methods (Ferguson, Keeling et al. 2003) or other assumptions. The great
importance of the policy issues at stake require an answer.
In this paper we compare the behavior of the classic differential equation SEIR disease diffusion model
with that of a parallel AB model. In this respect we follow the model alignment and docking concept of (Axtell,
Axelrod et al. 1996). However, the AB model is far more complex than the DE version and captures the network
structure between individuals and heterogeneity in individual attributes. This enables us to investigate the
sensitivity of the gap between the two models’ behavior to different parameters and network structures.
Epidemic models are well suited for this type of analysis because the effects of different networks of contacts
among individuals seem to be very important in the diffusion process, leading to a relatively strong test for
differences between the two methodologies.
The results can inform the discussion on the fitness of DE and AB models to investigate different
diffusion scenarios. No matter how powerful computers become, modelers will always face tradeoffs between
the size and complexity of a model and the ability to understand its behavior, carry out sensitivity analysis, and
test policies. Due to limited time, budget, cognitive capabilities and client attention, modelers must always
choose whether to capture more details of agent heterogeneity in attributes or in the network structure of their
interactions, expand the model boundary to capture additional feedback processes, or keep the model simple so
that it can be analyzed more thoroughly. For example, in studying possible responses to a smallpox bioterror
attack, Kaplan, Craft and Wein (2002) augmented the traditional lumped SEIR (Susceptible-Exposed-Infected-
Recovered) epidemic model to include rapid response immunization. By using a lumped DE model they traded
off the ability to consider heterogeneity in agent behavior and the network structure of relationships for the
ability to include the impact of capacity constraints on crash immunization programs in a large city with a
population of millions, and they are able to carry out comprehensive sensitivity analysis on the model. In
contrast, Halloran et al. (2002) used an AB model to consider the same problem, opting to model the contact
network explicitly and capture heterogeneity in the behavior of individuals, while giving up the ability to
examine a population larger than a few thousand, capacity constraints on immunization, and the ability to carry
out comprehensive sensitivity analysis.
Models
Mathematical modeling of infectious disease goes at least back to the 1920’s when Kermack and
McKendrick (1927) developed the SIR (Susceptible, Infectious, Recovered) model. The SIR model is a lumped
nonlinear differential equation model in which all members of the population are in one of three stages of the
disease, namely Susceptible (S), Infected (I), or Recovered (R). Infected individuals can transmit the disease to
susceptibles before they recover (or die) and stop infecting others’. A typical extension of the SIR model is the
SEIR formulation, which includes another state variable, the Exposed population, to capture diseases with
significant latency between the transmission of the disease and the emergence of symptoms and development of
infectiousness. The SEIR model can be parameterized so that the exposed (asymptomatic infected) population
may also transmit the disease but at different rates, reflecting their different health status compared to the
Infectious population. Typically, the rate of contacts between susceptibles and exposed is higher than that
between susceptibles and infectious because the exposed are healthier (asymptomatic) and, often, unaware that
they are potentially contagious, while the infectious are typically ill (and self-quarantined). S(E)IR models have
been developed and applied successfully to many types of infectious diseases. The S(E)IR paradigm (lumped
nonlinear differential equations) has been extended to include multiple additional states (compartments) to
capture more complex disease lifecycles, diagnostic categories and therapeutic protocols, to capture
heterogeneity among the population, and to capture factors such as birth/recruitment of new susceptibles, loss of
immunity of the recovered, and a host of other factors. However, the lumped, aggregated formulation makes a
number of simplifying assumptions including homogeneity of the population and perfect mixing within each
compartment, and typically assumes that the flows between states are expected values. Proponents of agent
based modeling argue that heterogeneity in networks of social contacts and in individual attributes such as
contact rates and infectivity, along with the inherently stochastic nature of contacts among individuals, can make
significant differences in the evolution of an epidemic (see for example (Epstein, Cummings et al. 2004)). In this
paper we analyze and compare a differential equation SEIR model and its agent based counterpart.
Figure 1 shows an overview of the causal structure governing classical SEIR model. Two positive
contagion loops fuel the spread of disease, where more exposed (and subsequently) infectious people result in
more infection and even larger numbers of exposed people. Therefore a few initial exposed/infectious
individuals start a growing epidemic. The main loop that slows down the epidemic after some time is the decline
in the number of susceptibles (the negative Depletion loop in the figure). The two minor negative loops
governing the Emergence and Recovery rates indicate that the rate of emergence and recovery depend on the
number of Exposed/Infectious individuals.
Exposed Infectious
=o Asymptomatic >< Symptomatic | prea
opulation Infection Population E Emergence | Population | ‘opulation
Rate IR Recovery
= Rate ER Rate RR
a
AD,
+ Total Infectious Ay)
aX
um
Total Population rae
i Contact Rate
Figure 1- Feedback structure of the SEIR model. The positive (self-reinforcing) contagion loops generate the
epidemic, which ends when the susceptible population is sufficiently reduced that the negative depletion feedback
dominates the dynamics.
The classic SEIR model embeds a few key assumptions including the following:
e Contagion only occurs as a result of an infectious contact between a susceptible and an exposed (or
infectious) individual.
e = The total population is fixed and only can be in one of the four states of susceptible, exposed, infectious,
and recovered.
e The recovered population is immune and immunity is permanent.
e Individuals remain in the exposed or infectious state for some time before moving to the next state.
These are the boundary assumptions of the classic SEIR model; we maintain them for both the DE and
the AB models we compare in this paper, noting however that all these assumptions are easily and often relaxed
in actual applications of the SEIR framework.
The DE representation of the SEIR framework, however, imposes some additional assumptions that
allow simpler equations. These include:
e Perfect mixing of individuals within each compartment, so that everybody has the same chance of
meeting everybody else. Moreover, the chances of any two people from two compartments contacting
each other is the same.
e Homogeneity of individuals in one compartment, so that the contact frequency and disease parameters
are the same for all the individuals which are in the same state.
e Exponential distribution for waiting times of individuals in each stock.
e Mean field aggregation assumption so that the rates of change equal the expected values for summation
of individual rates of change.
With these assumptions the classical SEIR model, in which new cases arise only from constacts between
the S and I populations, reduces to a few simple equations. Specifically, the infection rate is the expected
number of contacts between infectious and susceptible populations resulting in infection of a susceptible person.
Total contacts between infectious and susceptible populations is given by the total number of infectious
individuals multiplied by their average contact rate, multiplied by the ratio of population that is susceptible, or
Cis*1*(S/N). Here Cis is the average number of contacts for an infectious individual if the rest of the population
was susceptible, I is the number of infectious individuals, and S/N is the ratio of susceptibles to the total
population. Each contact between an infectious and a susceptible has an infectivity ijs, which is the probability
that a susceptible contracts the disease conditional on contact with an infectious person, putting the final
infection rate at:
IR= ijs*Cis*1*(S/N)
Typically the parameters Cis and ijg cannot be independently estimated, and the infection rate is written
IR = kIS/N where k = Cisirs, the rate of infectious contacts between S and I populations, with k estimated to
provide the best fit to the data. While these formulations are mathematically equivalent, it is conceptually more
clear to separate these parameters. Moreover, each of these parameters is potentially influenced by different
feedbacks in the system and therefore keeping them separate enables the modeler to easily incorporate new
feedbacks. For example the contact rate C changes if people stay at home in fear of epidemic, while infectivity i
changes if people wear masks in public (e.g., to protect against SARS) or use safe practices such as condoms
and needle cleaning (e.g., to protect against HIV). The formulation is easily expanded to include the possibility
of infectious contact between exposed and susceptible populations:
IR = (Cygipg E + Cjghig 1) *(S/N)
The “Emergence Rate” and “Recovery Rate” in the DE model follow a simple first order delay, where
ER=E/eé and RR=1/d, consistent with the perfect mixing assumption. The details of the DE model
formulations and how they can be derived from the AB model are discussed in appendix 1.
An agent based model can avoid the simplifying assumptions of DE formulation by introducing
heterogeneity among individuals and assuming different contact structures among them—at the cost of more
model complexity. As in the classic SEIR framework, we assume a fixed population of N individuals who
interact with each other. Each individual is in one of the states of susceptible, exposed, infectious, and
recovered. To keep the parallelism with the DE case in representation and description, we use four stock
variables for each agent: S[j], E[j], I[j], and R{j] for j = (1, ..., N). These stock variables can be | or 0,
depending on whether the agent is in that state or not.
At the beginning of the simulation a few of these individuals are exposed, and the rest are susceptible. If
any of the exposed or infectious individuals come into contact with a susceptible individual, there is a chance
the disease is transmitted to the susceptible person and s/he becomes exposed. Therefore for any short time
interval time step of length DT one can calculate the probability that an individual changes her state from
susceptible to exposed and use that probability and a draw of a random number to determine if an infectious
contact happens between individuals j and k. If an infectious contact happens between individual j and any of
the other people, the individual’s Infection Rate (IR[j]) during that period is calculated so that the state of
individual j changes from susceptible to exposed.
The probability of two individuals contacting each other depends on the structure of the network of
relationship among individuals, as well as the state they are in. Contacts are stochastic, and can only occur
between people who are connected to each other in their network. Heterogeneity is captured in two ways: first,
a class of people may have different mean contact rates. For example, infectious people typically have a lower
contact rate than exposed because they too sick to leave home or self-quarantine. This type of heterogeneity is
captured in the DE model through differences between cjs and ces and between ijs and igs. In addition, the
individuals in a disease state may have different contact rates and infectivities, making explicit the heterogeneity
within compartments that the DE model cannot capture. Agent heterogeneity section describes in detail how
individual heterogeneity and different network structures are captured.
For each individual the transition from E to I states (emergence) and I to R states (recovery) is a
stochastic process determined by the probability of emergence and recovering in each period. Those
probabilities in turn are the reciprocal of the mean emergence time and duration of disease, respectively. To
maintain parallelism with the classic SEIR model we assume an exponential distribution for the residence time
in each state. Details of the formulation for the AB model and how the DE model is derived from it are found in
Appendix 1.
In developing our AB model we adhere to good modeling practices, such as unit consistency and
independence of the dynamics from the time step used in the numerical integration of the equations". In
contrast, many AB models are formulated in discrete time and one cannot change the length of the time period
between calculations without reparameterizing the model. These features not only enable a consistent
integration of AB and DE structures in a single model, but also avoid ad hoc simulation periods which can in
fact change the dynamics of interest (See (Low 1980) for an example of how arbitrary periods can result in
erroneous conclusions about the origins of model behavior).
In summary, the DE model has 4 state variables, three of which are independent since S +E +1I+R=
N, , and the AB model has 4*N states, and must also track interactions among the N individuals. Therefore the
difference in the complexity of the two models increases on the order N’, so the size of the AB version grows
very fast with the agent population. For example with a total population of N = 200, the DE model has a total of
35 variables, parameters, and initial conditions, while the AB model includes over 400,000 (the main
complexity comes from tracking the network of interactions among different agents).
We use two general parameterization schemes. In the base case the DE model uses the mean values of
the parameters characterizing individuals in the AB model. In the calibration case the parameters of the DE
model are estimated to provide the best fit to the mean of an ensemble of simulations of the AB model. In the
base case we parameterize the model based on the estimates for smallpox. This parameter setting makes the
model relevant to the smallpox bio-terrorist attack policy analysis. Moreover, smallpox has a mid range basic
reproduction number" (Rp ~ 3 10) and therefore is a good choice to observe potential differences between DE
and AB models: diseases with Ro << 1 die out quickly for (nearly) any network structure, while for those with
Ro >> 1 a severe epidemic is unavoidable in (nearly) any network of interactions.
Our base case parameters are obtained from parameter values of other studies (Halloran, Longini et al.
2002; Kaplan, Craft et al. 2002; Ferguson, Keeling et al. 2003). There are a few caveats in this parameterization
process. First, smallpox is usually modeled in four, rather than three stages, where an individual is initially
infected but goes through an incubation period, than s/he becomes prodromal with high infectivity but few
symptoms, and then transfers into symptomatic before recovery (or death). This distinction is necessary to
model different vaccination policies which have different effects during different stages. However, for the focus
of our model, we aggregate the incubation and prodromal stages into the exposed population and therefore we
use aggregate estimates for this stage. Second, there is a lot of uncertainty around the values of key parameters;
we select middle ground values from the above studies. Moreover, it is very hard to empirically distinguish the
contact rate and the infectivity, because only their product matters to the infection rate and disease diffusion.
Consequently most studies do not distinguish between the two and report only their product. Therefore we
choose a typical contact rate for asymptomatic people and choose infectivity to yield Ro of approximately 3.
Table 2 reports the base case parameter values for the DE model. The same parameters are used for the AB
model when applicable. It is explained in the text when heterogeneous parameters are used for different agents,
rather than those reported in the table. The parameters are fully described in Appendix 1.
Table 2- Parameter values for the model without calibration of DE parameters.
Parameter Shorthand Value Unit
Total Population N 200 Person
Contact Rate for Healthy Asymptomatic | AC 5 1/Day
Relative Contact Rate for Exposed RCE 0.8 Dimensionless
Relative Contact Rate for Infectious RCI 0.25 Dimensionless
Contact Rate Exposed Ces 4 1/Day
Contact Rate Infectious Cis 1.25 1/Day
Infectivity of Exposed ies 0.05 Dimensionless
Infectivity of Infectious ils 0.06 Dimensionless
Average Incubation Time € 15 Day
Average Duration of Illness Fy 15 Day
Experimental design
We seek to analyze differences in the behavior of AB vs. DE models arising from the network structure
between the individuals and heterogeneity across individuals in their contact frequency. We explore five
different network structures and two settings for heterogeneity of individuals, leading to ten different
experimental settings. For each setting we create an ensemble of 1000 simulations of the AB model. In the
second regime of analysis we find the best fitting parameters for the DE model, which can replicate the mean
behavior of the symptomatic population in AB model for each of the ten network and heterogeneity settings
above. This set of simulations informs the facilities and limitations of the DE model in following different
settings of the AB model behaviors. The same ten experimental conditions are used in these parameter
estimation exercises. The five network structures and two settings of individual heterogeneity, as well as the
calibration of the DE model are explained below.
Network structure
The literature on networks has been growing fast in the last decade. Researchers with different
disciplinary backgrounds, from mathematics to sociology and biology, have contributed to a better
understanding of how widespread and important the networks are in the world around us(Watts 1997; Watts and
Strogatz 1998; Barabasi and Albert 1999; Barabasi 2002; Strogatz 2003). One of the main advantages of an AB
model in comparison to DE models is its ability to capture the relationship network among individuals
(Koopman 2002). While DE models assume a complete mixing of individuals in the population, equivalent to
the possibility of anybody being able to meet anybody else at any time, realistic networks are far more sparse,
with most people contacting only a small fraction of the population. Further, most contacts are comparatively
‘local’ in some sense (geographical or in some other space of similarity). The network structure determines
which individuals can possibily meet each other and potentially transmit the disease. Consequently, what is
important in this setting is the topology of the network of relationships among individuals. Researchers have
identified at least three important types of random network topologies: completely random graphs, small-worlds
networks (Watts and Strogatz 1998), and scale-free topologies (Barabasi and Albert 1999), which cover a large
range of real world networks. In this study we include all three of these random topologies, along with a regular
lattice and a fully connected network, to cover a large range of possible network interactions in the AB model.
All the networks used in the AB model have the same population (N=200) and mean links per node
(k=10). The population of 200 is smaller than most real world settings where modeling is used for policy
analysis, e.g., cities or nations. However, the effects of network structure and stochastic elements seem to be
more important in the small populations (Kaplan and Wein 2003) and therefore this population level gives us a
better opportunity to surface the differences between the DE and AB models. A small population also yields
tolerable computation times, allowing more extensive sensitivity analysis. A mean of 10 links per person is
realistic for relationship networks and makes the model resemble small human population centers, e.g. small
towns
At one extreme is the fully connected network, in which every person interacts, with equal probability,
with every other. While unrealistic, the fully connected network corresponds most closely to the assumptions of
the DE SEIR model. Therefore it allows us to distinguish the differences between AB and DE models arising
solely from stochastic effects and integer vs. continuous state variables. In fact these differences are not always
trivial. One study (Rohani, Keeling et al. 2002) suggests that the stochastic effects are required to account for
the irregular cyclical modes of whooping cough epidemics and another study (Koopman, Chick et al. 2002)
concludes that stochastic effects arising from discrete events can have important effects on the final number of
infected individuals.
Moving away from the uniformity of the classic SEIR assumptions, the next network topology we use is
the Erdos-Renyi (Erdos and Renyi 1960) random graph, the oldest and the best studied random network
topology. Here people do not interact with all others, but with a subset. However, each node (person) has the
same probability of being linked to every other node. Several modifications have been proposed to better align
the properties of this topology to what is observed in the real world networks (see (Newman 2002) for a review).
In our model we use the original formulation where every two nodes are connected with the same probability, p.
Therefore if each node has an average of k links to other nodes (average degree of a vertex=k) in the population,
p, the probability of a link between two individuals, will be k/(N-1), where N is the size of the population.
Random networks also conform well to the assumptions of the DE SEIR model, but are not good
representations of most real life networks, in which most connections are local and only a few are distant. That
is, the probability of your having a link with another person tends to fall off with the distance between you and
that other person (in the relevant space). In their classic paper (Watts and Strogatz 1998), Watts and Strogatz
provide a simple random network which falls between regular network structures and the completely random
ones, and captures realistic features of clustering and short path lengths between any two nodes. They name this
structure small-world, following the small-world, or six degrees of freedom, phenomenon. To build their small-
world topology one starts with a ring lattice with N nodes and k edges per node connecting each node to its
immediate neighbors. The links between nodes are then randomly rewired with probability p (See Figure 2),
resulting in a few long-range connections in a predominantly locally connected lattice structure.
Figure 2- The Watts-Strogatz small-world network structure on a 1-dimensional lattice. The long-range
links are distinguished with thicker, red lines.
Watts and Strogatz examine the diffusion of an infectious disease in this network and conclude that the speed of
diffusion is much faster than in a pure lattice structure with no long range connections, even for very small p
values, because a few long-range links can disseminate the disease to different areas in the network. In this study
we use a similar network with the probability that any link is long-range set at 0.05. This yields a relatively
locally centered network, yet with realistic level of long-range networks to resemble it to human settings, e.g.
small towns (Watts 1997). We also analyze the pure ring lattice network in which there are no long range links
(all links are to immediate neighbors). We expect this lattice structure to show the strongest network effects,
since by restricting contacts to neighbors it stands in direct contrast to the free mixing assumption of the DE
model.
The next network topology we examine is the scale-free network. Scale-free networks attracted a lot of
attention after Barabasi and Albert (1999) showed how they can emerge from the normal growth and evolution
of a network when the probability a new node links to an existing node is increasing in the number of links the
existing node already has. Such rules for network growth, known as preferential attachment, amounts to a rule
where new nodes want to link to those that are already most popular, creating an obvious positive feedback
through which popular nodes become even more popular, while those with few links are unlikely to make new
connections. Preferential attachment can result in a power law for the probability that a node has k links,
Prob(k)=k”, where y empirically has fallen between 2 and 3. Scale free networks are characterized by a number
of “hubs” with a large number of links, while the rest of the network is mainly connected to these hubs. Several
real world examples including networks of relationships among journal citations, movie actors, internet
websites, and some biological systems show scale-free properties. Diffusion in scale-free networks is expected
to be faster than random networks, as the hubs get infected very quickly and then spread the disease to remote
parts of the network (Barthelemy, Barrat et al. 2003). It is possible to argue that large differences across
individuals in the number of contacts they have can result in some scale-free properties. For example, in the
early diffusion of AIDS one individual, usually referred to as patient zero, a flight attendant with a large network
of sexual contacts, is suspected to be linked to over 40 early cases of AIDS, thus seeding the epidemic
throughout North America (Barabasi 2003). Appendix 2 explains how we construct a scale free network for this
study.
Heterogeneity among individuals
While in the DE model the population is homogeneous, the AB model allows each individual to be
different. In our diffusion setting each individual has four relevant characteristics: the contact rate, infectivity,
incubation time, and average disease length. We focus our analysis on contact rate. This will enable us to keep
the dimensions of sensitivity analysis at a manageable level with little compromise on capturing the difference
of DE and AB models. For one thing only the product of the contact rate and infectivity matters. Moreover, the
exponential distribution for incubation time and disease length results in significant heterogeneity across
individuals, even though their underlying probability distribution parameters are identical.
Contact rates for individuals depend on two distinct factors. First, individuals with higher number of
links in the network of relationships have a higher chance of contacting other people. In fact, if an individual is
not restraint by available resources for contact (e.g. time), in expectation her contact rate will be proportional to
the number of links she has. Such setting is equivalent with having similar chances of contact for every link in
the network. However, time is often limited and therefore higher number of links for an individual can result in
lower chance of contact through each link that the individual is a part of. Therefore we can discount the contact
probability based on the number of links the two individuals at each side of a connection have, IAK[i}*KG)* 5
where K{[j] is the total number of links individual j has, and a. is the strength of this discounting. The second
factor that influences the contact rate is the tendency of two individuals to use their links: some links are used
more than others, as some friends are more often visited. In the model, we capture this effect by assigning
individuals different tendencies to use their links, TUL[j], and assuming that the chance that the link between
individuals i and j is used is proportional to multiplication of individual tendencies to use links, TUL[i]*TUL[j].
Therefore, we capture the heterogeneity in individual contact rates by capturing both these factors in the
probability that a link is used for a contact: LCR{i,jJ=L*TUL[i]*TUL[j]/ (K[i}*K[j])” , Where LCR[i,j] is the
average contacts per day for the link between individuals i and j, and L is an appropriate constant.
We use this formulation to design two scenarios for individual contact rates in the AB model that fall in
the two extremes of heterogeneity. In the homogenous case the effect of extra links of an individual is
compensated by the negative effect of link multiplicity (a=1) and individuals have no difference in their
tendency to use their links (o7y;=0). In the heterogeneous extreme, not only number of links differentiate
between low contact and high contact individuals (a=0), but also individual tendencies to use links are
distributed. We use a uniform distribution with a wide range, TUL[j] ~ U(.25, 1.75), to assess the extent of
individual heterogeneity effect. We select the parameter L so that the expected contact rate for each realization
of network equals the total contact rate for the whole population, AC*N. Figure 3 illustrates the distribution of
individual expected contact rates driven from the formulation above in a random sample of 1000 individuals.
Homogeneous Heterogenous
TZ200 B80
160
1000 oo
140 nO
= | 800 @=@ 120 o=1.95
= 100
= | 600
q 80
> | 400 60
40
200
20
)
300 180
r00 160 —
ji o =0.59 140 Ai 0 = 2.58
= |400 5 120
Ss
3 loo 100 =
g 80
& 200 60
40
100 es | |
0 O 0 i.=,a
350 250
300 . o = 1.58
: 200 =
2 [250 n 0 =6.42
a
& 200 pe aT
5
3 150 100 -
BH |ioo
“olmill ' i
ola | ee 0 foo.
300 180
= 160 =
500 o=0.17 ie 0 = 2.13
~ 4 140
5 400 120
E |s00 fee
= 80
= 200 60
a 40
100 _
; i i...
1200 180
160
> ia o=0 140 o = 2.09
= | 800 120
| 00 100
ch 80
& | 400 60
3 40
an || [|
0 )
vow eer eosea | no euoneaor™”
Figure 2- Histogram for expected contact per individual i per day, SLR ii, jb for different network and
J
heterogeneity settings. The mean is 5 for all cases and the standard deviation is indicated on the figure. Histograms
are built on data from 1000 random individuals (5 random realizations of each network setting)
Calibrated DE Model
The DE model is also analyzed when parameters are calibrated to the mean of a large ensemble of AB
simulations. In real world applications the individual level parameter values are rarely known with a reasonable
degree of confidence and are usually estimated from aggregate behavior. For example in epidemiological
context rough estimates for the mean and distribution of the incubation period and duration of infectivity are
usually available from clinical data, however, data on Ry are much less certain and hence it is usually estimated
from historical data. In the case of new emerging diseases, micro-level estimates of key parameters do not exist.
The SARS epidemic provides a typical example, where clinical and biological data were initially unavailable so
key parameters were estimated from data from the first few weeks of outbreak (Dye and Gay 2003; Lipsitch,
Cohen et al. 2003; Riley, Fraser et al. 2003).Therefore comparing the DE model parameterized with the same
underlying parameters as individuals in AB is not realistic for typical policy applications and exaggerates the
difference between the two models.
To capture the realistic situation in which the parameters of the DE model are set to those that provide
the best fit to the aggregate data, we treat the AB model as the “real world” and set the parameters of the DE
model to those that provide the best fit to the mean’ of the ensemble of AB simulatio If there are behavioral
or policy differences between the best-fitting DE and the AB models it would indicate that the aggregation
assumptions of the DE model are inappropriate. If, however, there are no meaningful differences between the
best fitting DE and AB models the aggregation and perfect mixing, and homogeneity assumptions of the DE
model are adequate and the extra detail of the AB model unnecessary. For the calibration cases, the best fitting
parameters for infectivity of exposed and infectious (igs and ijs) as well as the residence times in both exposed
and infected states (¢ and 8) are found and used to simulate the DE model”. Table 3 reports the parameter values
used in the calibrated DE models for each of the network and heterogeneity scenarios.
Table 3- Value of parameters in DE model fitted to mean behavior of symptomatic population in each AB scenario.
The corresponding parameters in the original DE model are 0.05, 0.06, 15 day, and 15 day.
Network Structure
Scale Small
Heterogeneity? | Uniform | Random Free World Lattice
Infectivity of Homogeneous | 0.0501 0.0452 | 0.0425 | 0.0352 | 0.0367
Exposed Heterogeneous | 0.0543 | 0.0565 | 0.076 0.038 | 0.0366
Infectivity of Homogeneous 0.022 0 0) 0.0073 | 0.0068
Infectious Heterogeneous 0 0.0093 | 0.0104 | 0.0023 | 0.0032
Average Homogeneous 13.0 12.0 11.0 8.7 5.9
Incubation Time | jeterogeneous 16.9 7.34 46 87 64
Average Homogeneous 15.2 16.6 17.2 23.5 29.5
Duration of
Illness Heterogeneous 14.1 17.0 17.9 23.4 30.0
In summary our experimental design includes five distinct network structures and two scenarios for
individual heterogeneity in contact frequency, resulting in ten experimental conditions. This design allows us to
examine the effects of three distinct sources of divergence between AB and DE models: First, the stochastic
effects that result from discrete events can be a source of difference between AD and deterministic DE models.
Second, the effect of the structure of the network of relationships among individuals is captured in our study by
examining five different network structures and comparing them with what is observed in the DE model.
Finally, our analysis examines the effect of individual heterogeneity in the pattern of diffusion. These
comparisons are carried forward for both base case and calibrated DE model which constitute two views on
what is the parallel DE model for an AB formulation.
Analysis
For each of the ten scenarios we simulate the AB model 1000 times and compare the results with the DE
model behavior, both for the base case and the calibrated DE model". We discuss two types of sensitivities in
examining the results. Numerical sensitivity captures quantitative changes in the values of variables (e.g., the
peak infected population or final fraction infected). Behavior mode sensitivity assesses changes in the general
mode of behavior (e.g. from S-shaped to linear diffusion). These types of sensitivity are discussed with regard to
three measures. The diffusion fraction, F = R../N“”, is the fraction of the initial population that ultimately is
infected; it measures the total burden of morbidity and mortality borne by the population. The time from the
introduction of patient zero to the maximum of the infected population (the peak time, T,) measures how quickly
the epidemic spreads, a parameter of great importance in mobilizing public health measures. Finally, the peak
value (Imax) of the symptomatic infected population determines the maximum load on public health resources.
Figure 4 shows a sample AB simulation and explains these measures.
Susceptible Populations Recovered
200 al ‘
150
100 a
5
in
Infectious a
50 o| =
Me) a
a
a
0 Se
0; 30 60 90 120 150 180 210 240 270 300%
H H Time (Day)
Figure 4- A sample simulation from AB model, showing the evolution of 4 different population groups. The
main output measures, Diffusion Fraction, Peak Time, and Peak Value, are defined on this figure.
The first set of graphs (Figure 5) shows the envelopes of simulated symptomatic population for each
parameter setting and network structure we discussed. The mean of the AB simulations is shown, along with
the trajectory of the DE model with both the mean AB (Base Case) and best-fit parameters. Also shown are the
simulation envelopes encompassing different percentiles of the ensemble of simulations
A few important patterns are observed in the data. First, the general mode of behavior does not change
across different network or heterogeneity conditions and is similar to that of the base case DE model, except for
the lattice network structure. In nearly all cases the epidemic generates the classic S shape diffusion pattern
where initially the epidemic spreads at an increasing rate, then slows, peaks, decline, and finally ends when the
majority of people have recovered. The similarity is no surprise: The initial growth is driven by the positive
contagion feedbacks where an increase in the exposed and infectious populations increases the rate of infection
and therefore further pushes up the number of exposed and infectious. The epidemic must end when the
susceptible population is sufficiently depleted that the (mean) number of new cases generated by the exposed
and infected populations is less than the rate at which infected individuals recover.
One the other hand there is some numerical sensitivity to the network structure The gap between base
DE and AB models widens as we move from uniform, to random, to scale free, small world, and lattice
structures. In fact, as expected, the homogeneous uniform network is very similar to the DE base case, while the
lattice structure shows the largest difference in the timing and magnitudes of the epidemic.
Homogeneous Heterogeneous
80
7 Es Es
E . P
& 40 M M
= iz
= | . Er
0
80
60 Es
g Ee
S
3 |40
< Bu B
GI M
x P E;
EF
0
80
2 |e
oS
=
4 | . Es Es
3
a
=
z
3
=]
n
g ja
S
2 OE Es
B=j
5 |x
op Bu Bu
& lis Er
x Ee
5 -
0 75 150 225 30¢|0 75 150 225 304
Time (Day) Time (Day)
Figure 5- Comparison of the DE SEIR model to the corresponding AB model. Graphs show the time path of the
Infectious population for 300 days; smallworld and lattice are simulated for 400 and 600 days respectively, but not
shown in this graph. Each panel shows 1000 simulations of the AB model for each network and heterogeneity
setting discussed. The purple line (DE,) is the DE model with parameters set to the mean of those in the AB model.
The black line (ABy) is the mean of the AB simulations, and the grey line (DE,) is the best fitting DE model (ABy
and DE; are hard to distinguish); also shown are the envelopes encompassing 50% (white), 75% (green) and 95%
(blue) of the AB simulations.
Two concepts are central in explaining the variation across different network structures. First, how big
is the network of relationships for an individual in the system: given the contact rate, the bigger the network of
potential contacts, the higher is the chance of meeting a new person and therefore passing the disease to
somebody new. Second, the degree of clustering: when there is a high level of clustering (mostly local
contacts), a given infectious person is likely to contact the same people again and again, so that after these
neighbors are infected, the chance that an infectious person can contact a susceptible and generate a new case
declines, even if the total susceptible population remains high. That is, when most contacts are local, the
epidemic can burn out in a local patch of the network, reducing the effective reproduction rate and slowing
disease diffusion below what would occur if contacts were random. With these concepts in mind, we can explain
the various levels of diffusion across different networks. In the uniform network with homogeneous agents, the
positive contagion loop is strongest, because an individual can have contacts with everybody else, and the
overlap of contacts for two infected individuals is minimum (the chance of sampling the same two people out of
the entire population). The base DE and mean of the AB models are very close. Note, however, that even here
the epidemic in the base DE model is slightly faster than the mean of the AB simulations. In the DE model the
populations in each compartment are treated as continuous variables so that the number of exposed people
increases immediately on introduction of patient zero, while in the AB model some time typically passes before
the first new case is generated by a random encounter between people. In effect, the discrete stochastic nature
of contacts and state transitions adds a small additional delay in the contagion loop, slowing the epidemic
slightly compared to the continuous time and continuous valued DE model. However the base DE model
remains within the 75% interval of AB simulations nearly all the time, so its behavior is statistically
indistinguishable from that of the AB model.
The random network closely follows the uniform network because it shares the second property (few
common links for two infected individuals), yet the diffusion slows down slightly at the end as the number of
common links between infected individuals increases over what is expected in completely homogenous
population. The scale free network behaves in a very similar fashion because a few hubs have very high number
of links; as soon as the infection is transferred to these hubs, they quickly spread the disease across the
population. This results in a period of slow diffusion followed by a burst in disease as soon as the hubs are
infected. In fact in case of heterogeneous scale-free the initial take off is even faster than DE case because hubs
are not only well-connected, but also have larger contact rates. However, the larger contact rates of hubs trades-
off with lower contact rate of majority of other nodes, therefore after infection of main hubs, the rest of the
network has lower contact rate and the epidemic stops at lower levels of diffusion fraction (Compare the median
of homogeneous and heterogeneous scale-free scenarios in table 4). As in the uniform and random networks, the
base DE trajectory remains within the 75% envelope of AB simulations most of the time and almost never falls
outside the 95% interval, so the differences between the base DE and mean of AB simulations is not significant.
Small world networks show slightly slower diffusion speed because their clustering coefficient is high
and they lack nodes with large numbers of links. Nevertheless, the few long-range links in this architecture are
enough to sustain the typical S-shape pattern of diffusion, which is in line with Watts and Strogatz’s original
results about diffusion in small-world networks (Watts and Strogatz 1998).The main impact of individual
heterogeneity is an increase in the dispersion of the results among the ensemble of AB simulations and increase
the speed of diffusion slightly. Nevertheless, the similarity of the right and left columns of the graphs suggests
that in neither of the network scenarios adding individual heterogeneity makes a substantial difference in the
general mode of behavior.
Table 4 shows the diffusion fraction, F. We look at the mean, standard deviation, and the percentage of
simulations with less than 10% diffusion fraction for the disease. The latter category is introduced to track the
relative frequency of the non-diffusion mode of behavior in which the epidemic never takes off. In the DE
model there is always an epidemic if Rg > 1. Due to the stochastic nature of interactions in the AB model, it is
possible that no epidemic occurs or that it ends early even when Ro >1. Table 4 shows that the network structure
strongly conditions the probability of takeoff, with the fraction of cases experiencing takeoff highest for the
uniform and random networks, and lower in networks where most interactions among agents are local (as in the
small world and lattice networks). The lattice network shows the greatest divergence in behavior from the DE
case because the epidemic spreads almost linearly as it reaches a quasi-equilibrium in which encounters with
new susceptibles are balanced by recovery of infectious individuals; the probability the epidemic burns out at
any time in this regime is roughly constant. Note that the 10% cutoff value is arbitrary and the results are not
16
sensitive to it because, except for the case of lattice network structures, two general modes of behavior exist, one
dominant mode where diffusion fraction is fairly high and the epidemic sweeps through the population, and one
where the epidemic dies off because the initially sick individuals recover before they pass the disease to others.
Therefore if more than a few percent of population are infected, then most of the population will be infected and
the diffusion fraction will go beyond 10% anyway.
Table 4- Diffusion fraction F (final fraction of population infected). The fraction of simulations with F <0.1,
as a measure of non-diffusion mode of behavior, for the Diffusion Fraction statistic across different scenarios.
Diffusion Fraction for DE model is: 0.984. Also Diffusion Fractions for the DE model fitted to mean of AB
simulations in each scenario is reported. */**/*** denotes DE simulation falls outside the 90/95/99% confidence
bound for AB simulations.
severe Uniform Random Scale Free | Small World Lattice
g | Median 0.980 0.965 0.929 0.955 0.505
3 Mean 0.971 0.942 0.892 0.876 0.525*
5 Stdev 0.103 0.139 0.181 0.237 0.310
: EDA 11 24 3.9 5.2 8.1
2 Fitted DE 0.943 0.840 0.763 0.548 0.248
a | Median 0.949 0.909 0.838 0.904 0.359
i Mean 0.931 0.865 0.789 0.812 0.404***
i Stdev 0.130 0.195 0.205 0.257 0.268
HELA 1.9 48 6.2 79 12.0
Fitted DE 0.973 0.756 0.660 0.510 0.189
These results distinguish the lattice structure from the rest: while majority of simulations in every other
network structure end up with over 90% diffusion fractions, the lattice structure is widely spread on this
dimension. One interesting issue is that contrary to usual ranking of network structures, scale free networks
have slightly lower diffusion fraction than small-world. The reason is that the power law distribution of network
links leaves most of the nodes in this network with fewer than average links, therefore after diffusion of
epidemic among highly connected hubs the disease may die out before spreading to the whole population. The
high standard deviations for diffusion fractions are because of two distinct modes of behavior, one with almost
complete diffusion, and one with the epidemic dying out shortly after the start and therefore exhibiting low
diffusion fractions. The differential equation model cannot reproduce the second mode of behavior.
To find out how often the epidemic dies off, we look at the fraction of simulations where diffusion
fraction is smaller than 10%. The results suggest that the frequency of non-diffusion is fairly low, less than 10%,
except for the lattice network case. In the lattice network scenarios, the diffusion fraction is fairly evenly
distributed between 0 and | and the diffusion pattern often includes a relatively fixed rate of contagion, a
different mode of behavior than other scenarios and the DE model. The reason is that the epidemic can grow
only locally on a ring lattice and therefore the number of infectious people linked with susceptible ones is more
or less constant throughout the diffusion. If at any time all the individuals on the infectious side of such links
recover before passing the disease to susceptible side of the link, the disease dies out. The probability of this
event is fairly stable throughout the simulation because of the stable number of such links, and therefore the
epidemic has similar chances of burning out throughout most of the lifecycle. Figure 6 shows the histogram of
diffusion fraction for the lattice network. Note that the decreasing trend in the percent of simulations with high
diffusion fractions is consistent with the explanation above, as a constant probability of stopping through time
will lead to decreasing number of epidemics that remain active and get a chance to proceed further.
200
150
100
0-011 03-04 06-07 09-1
01-02 04-05 07-08
02.03 05.06 08-09
Figure 6- The histogram for diffusion fraction in the case of heterogeneous lattice network. The lattice
network is different from other networks which show two distinct modes, one with simulations concentrated at high
diffusion farctions and another with a few simulations in very low levels (no diffusion mode).
Next we examine the dynamics of diffusion as related to the peak time and peak value for the infectious
population. Peak time is important in a real epidemic scenario because it determines the time available to
mobilize public health resources for fighting the epidemic. For well understood diseases these delays include
deploying health workers, information about the disease, setting up treatment facilities, and, if available,
vaccines. For new diseases such as SARS, additional delays are created by the need to isolate the infectious
agent, sequence its genome, identify vectors and routes of transmission, formulate and test treatments and
vaccines, and so on. The peak value, representing the total number of sick people at the peak of epidemic,
determines the maximum load on public health resources, and likewise of great importance for policy purposes.
Table 5 shows the mean and standard deviation of both variables across different scenarios, along with their
values for the base case and fitted DE models.
These results, along with the graphs in figure 5, show that uniform, random, and scale free networks
have similar diffusion speeds that are also close to the deterministic DE model. The epidemic proceeds
significantly slower, which also lowers the peak fraction as more people recover before new cases are added,
peak time increases and peak values fall, indicating slower diffusion as we move from more random to more
locally clustered networks. The standard deviations of both peak times and peak values also increase, indicating
that the more locally clustered networks generate more variability in outcomes. Intuitively, the placement of
patient zero in the network makes no difference in the uniform and random networks, but can matter quite a lot
in highly clustered networks (if patient zero is well connected diffusion is more likely and more likely to be
rapid than if patient zero arrives in a poorly connected hinterland). This increase in standard deviation suggests
that despite the increase in the gap between AB and DE models as we go through different network structures,
our ability to distinguish network structure from a given trajectory does not increase as fast as the differences in
means would suggest (the differences only become significant for peak value in the case of lattice and
(marginally for) small-world networks.) Finally, the behavior of the fitted DE model is very close to the average
for the corresponding AB scenario, suggesting that the DE model can capture most of the network and
heterogeneity effects.
Table 5- The mean and standard deviation for the peak time of infectious population for different networks
scenarios, across 1000 simulations. The peak time for the DE model is 49 days and the peak value is 55 people. Also
reported are the peak times and peak values for DE models fitted to the mean behavior of each AB scenario.
*/**/*** denotes DE simulation falls outside the 90/95/99% confidence bound for AB simulations.
NW Scale Small
Structure | Uniform | Random | Free World Lattice
Mean 50.5 59.5 65.1 98.7 102.3
g Homogeneous | Stdev 9.4 14.6 19.6 42.3 81.0
F Fitted DE 51.4 60.5 66.6 94 90.4
% Mean 47.0 49.8 417 86.5 87.7
Q | Heterogeneous | stdev 10.3 14.7 14.0 37.5 74.3
Fitted DE 48 53.8 45.37 86.8 67.9
ri Mean 58.5 52.1 46.0 32.0 14.1"
3 | Homogeneous | stdev 8.2 9.6 10.7 10.6 53
g Fitted DE 50.8 42.2 36.0 21.3 6.8
% Mean 57.2 50.4 48.6 31.9" 13.1%"
8 Heterogeneous | Sidev 9.6 12.2 13.3 11.5 53
Fitted DE 49.3 418 37.0 21.4 65
The effect of agent heterogeneity on the dynamics of diffusion remains relatively small. The main effect
is that the agent heterogeneity decreases the peak time. To understand this effect we note that individuals with
higher contact rates form a sub-population with higher rates of diffusion than what is observed in the
homogenous case, therefore a quick diffusion of epidemic in this sub-population. This effect is more
pronounced in scale-free network because high contact rate is accompanied with high connectivity, creating a
subpopulation of highly connected hubs that give rise to a quick initial diffusion. On the other hand, in
heterogeneous networks we also have a sub-population with lower-than-average contact rates that becomes
involved later in the epidemic, therefore, at the later stages of diffusion epidemic dies out faster in these
networks when the reproduction number goes bellow | faster for this weakly connected group, than a
homogenous population. This effect is evident in table 4 where heterogeneous networks show lower diffusion
fractions.
One would expect faster diffusion to increase peak value, because more people are infected before those
already sick recover, yet peak value statistics do not show such effect. To understand this we note that at higher
contact rates, each extra contact per day adds less to the chance that an individual contracts the disease from a
neighbor, because at some point it is almost certain that the individual will contract the disease if all her
neighbors are sick. Therefore heterogeneity in contact rates decreases the chances of infection more from people
with lower levels of contact than it adds to the people with higher levels of contact. In fact in the extreme case
where people are either hermits or very social, the effective population size will be reduced and therefore we
will have quick diffusion among a smaller population, leading to smaller peak values.
Tables 4 and 5 also suggest that most of the differences between base DE and different AB scenarios are
statistically insignificant at 90% confidence level. When there is a difference, often network structure, and not
the heterogeneity, appears to be the distinguishing factor. For uniform, random, and scale-free none of the three
metric can distinguish base DE from a typical simulation in the ensemble of AB simulations. For small-world
and lattice networks, the differences are enough to distinguish the two model performances through one or two
of the metrics (Peak Time never becomes significant). This observation highlights both the fact that DE model,
even in the base case, behaves quite similar to AB models, and that network structure and heterogeneity increase
the variability of AB simulations, making it harder to distinguish AB and DE outputs.
The differences between the models are much smaller when we compare the results of the AB model to
the best-fit DE model. Indeed, the dynamics of the best fitting mean-field model and mean of the AB
simulations are hardly distinguishable (R° for different scenarios are between 0.97 and 1.00) and none of the
metrics in tables 4 and 5 distinguish between behavior of a typical AB model and the fitted DE. Note that we fit
the DE model to the mean of infectious population in the ensemble of AB model runs. In practice model
19
parameters are obtained from historical data, analogous to one, or at most a few, realizations of the AB
simulation, degrading the fit and yielding much broader confidence intervals for parameters. Further, when the
epidemic fails to take off, the DE estimate of the basic reproduction rate will be less than one even if the
underlying value of Ro >1. This small sample problem is equally a difficulty in parameterizing the DE and AB
models.
Discussion
The assessment of modeling methods is always contingent on the purpose of the model. The purpose
determines what variables are of interest, what level of precision is required, and what level of sensitivity to
different assumptions is acceptable. Meanwhile, it is useful to categorize different sensitivity types to facilitate
connecting different model purposes with different types of sensitivity to enhance assessment of appropriateness
of different sets of assumptions for a modeling project.
In this context it is useful to distinguish numerical, behavioral, and policy sensitivity (Sterman 2000).
Numerical sensitivity exists when change in parameters or assumptions cause changes in the numerical values of
some output measures that matter to the purpose. Behavioral sensitivity exists when the model’s mode of
behavior is sensitive to some parameter value or assumption, e.g. an S-shaped diffusion mode changes into
overshoot and collapse. Policy sensitivity arises when the recommended policy resulting from a model is
sensitive to changes in a parameter or an assumption, e.g. if mass vaccination is indicated by one set of
assumptions and ring vaccination by another set. Usually the existence of behavior mode sensitivity implies
numerical sensitivity. Policy sensitivity is closely related to the purpose of the model, yet, for most strategic
applications of simulation models, policy sensitivity to an assumption implies both numerical and behavioral
sensitivity.
The importance of each sensitivity type depends on the purpose of the modeling project. Numerical
sensitivity is most important when we deal with point forecasting, or in high precision applications, e.g.
designing the trajectory of the space shuttle. In general, numerical sensitivity to one parameter or assumption
can be important only to the degree that we have confidence about the other parameters and assumptions of the
model and the data used to calibrate and build the model. For example in applying epidemic models to real
world problems, a 1% change in the diffusion fraction as a result of a change in one parameter is not important,
if we don’t have any reliable data on the network structure (which appears to have far more influence on the
diffusion fraction). Behavior mode sensitivity is important when long-term strategies and decisions are studied,
e.g. investigating policies to stimulate the growth of a firm. In these settings the uncertainty around the issue is
so high that the analyst is not concerned with the exact numerical values but cares mainly about the overall
trajectories. Moreover, behavioral sensitivity is also important when the main purpose of the model is to
understand the structural sources of dynamics. In such cases usually a change in the behavior mode is what
signals a shift in the structural features responsible for the dynamics and therefore provides a window into better
understanding of the structure-behavior relationship. Finally, policy sensitivity is by definition important.
Our analysis discusses the numerical and behavioral sensitivity of disease propagation to the main
assumptions that distinguish typical AB and DE models. Specifically, we explore two critical assumptions. First,
that there is a network of relationships between agents governing social interactions; and second, that agents are
heterogeneous in their relevant characteristics. The five network structures span a wide range of different
patterns of relationship, from completely random to highly organized. The results show that the numerical
sensitivity of the aggregation assumption to this dimension depends on the network structure. For uniform,
random, and scale-free structures, numerical sensitivity is low, for a typical small-world it is medium, and for
ring lattice network it is high. Moreover, the stochastic nature of AB models offers an additional insight into the
range of uncertainty in model behavior that comes from different assumptions about the network and agent
heterogeneity. Behavioral sensitivity to this aggregation assumption is evident in existence of two modes of
behavior in the AB models that don’t show up in the DE model. First, the non-diffusion mode of behavior which
happens when initial infectious people recover before they transmit the disease to others, and second, the linear
diffusion that shows up in the lattice network. Nevertheless, these modes of behavior are infrequent (except for
the lattice network case, see table 4) and the dominant mode of behavior in four of the network structures is
identical to the DE case. These results imply that if overall knowledge of network portrays a structure close to
lattice network in being highly local and low variation in the number of links per node, then using a DE model
20
with the same parameters will be inappropriate. We note that in modern human societies, with their high
mobility, a pure lattice is unrealistic: there is simply too much travel and mixing among people. The lattice
network is more appropriate for the spread of diseases among immobile plant populations (see the spatial
epidemic models in e.g. Murray’s Mathematical Biology (Murray 2002).) If the network is small-world with
sparse long-distance links, then disaggregation of simulation model into agents is needed in order to get
numerical precision in our results, but a simple DE model will be able to capture the main dynamics at work.
Finally, the behaviors of uniform, random, and scale-free networks are similar enough to the DE case that the
main value added from disaggregation is in observing the non-diffusion mode and getting an idea about the
variability of model outcomes caused by the stochastic encounters among individuals.
The influence of individual heterogeneity remains even smaller. Despite changing the heterogeneity of
agents in a wide range, the simulations show little sensitivity of dynamics to this dimension. The main insight
from adding heterogeneity is that it distinguishes two sub-populations, one highly connected and one with sparse
interaction, therefore an epidemic diffuses faster in the first sub-population before slowing down in the second
group. The result is slightly sooner peak-times, followed by slightly lower diffusion levels in comparison of
heterogeneous and homogeneous networks. This dynamic is similar to what has been observed in case of
diffusion of HIV, where initial diffusion was faster than later stages due to the host population of dominantly
male, homosexual, with large contact rates.
The calibration results suggest that the simple DE model with fitted parameters can replicate the
behavior of a detailed AB model under multiple network and heterogeneity scenarios. This comparison is
important in the light of the fact that often parameter values for policy models are obtained through conceptually
similar calibration processes. The fitted DE model is hardly distinguishable from a typical AB simulation in any
of the networks and therefore shows little numerical sensitivity, yet, it can does not answer questions about
variability of results, and does not show multiple modes of behavior (e.g. non-diffusion) with a single parameter
setting, unless stochastic elements are added. In short, the calibrated DE model can be used for most policy
purposes, as long as the questions of interest pertain mainly to aggregate variables and their expectations.
The calibration results also highlight an important methodological issue. The parameter values obtained
by fitting the aggregated DE model to the data from an AB simulation (and therefore from the real world) do not
necessarily resemble the individual level parameters governing the micro-level interactions among agents
(compare the fitted parameters in table 3 with the individual level parameters in table 2). The reason is that the
aggregate parameter values not only account for the individual level concept they represent (e.g. number of
contacts per person per day) but also capture the impact on diffusion of the contact network and agent
heterogeneity. This is an important consideration in parameterizing both AB and DE models. For example the
best fitting estimate of Ro may differ substantially from its true value when the network structure is dominated
by local interactions (and that value varies as the disease spreads within a network). Fahse (Fahse, Wissel et al.
1998) offer a method for estimating aggregate level parameters from available individual level parameters so
that one can use the more manageable aggregate model for policy analysis, while making the best use of all
individual level data available.
Our analysis and discussion focused on the effects of different simplifying assumptions typically used in
DE models, as one of the important criteria for deciding between DE vs. AB models. However, there are several
other considerations that need to be taken into account. First, in modeling social systems, data availability is
usually an important concern. In absence of data, disaggregation can indicate whether the model is potentially
sensitive to some aggregation assumption or not, but does not take the analysis any further in the policy
conclusions for real problems. For example our results indicate numerical and behavioral sensitivity of the
diffusion pattern to aggregation assumptions in the case of lattice networks, however, in the absence of any
useful data about the type of the network structure in a population of interest, this result does not warrant the use
of disaggregate AB model because the analyst has no reason to assume the lattice network structure, among all
possibilities.
Second, a more disaggregate, agent based model offers some leverage points and policy options that do
not exist in the case of an aggregate model. For example, if the network of relationships in a population is
known, one can examine the effect of altering this network in an AB model by removing a node or creating new
links, to the overall social behavior of interest in the group, while such questions can not be asked in the DE
model.
21
Third, there is a tradeoff between model complexity and an analyst’s ability to build confidence in and
understand the model. AB models require significantly more computation resources than corresponding DE
models, preventing modelers from carrying out full sensitivity analyses and limiting the size of the population
that can be considered. Our version of the DE SEIR model has a total of 35 variables, initial conditions, and
parameters. To simulate a population of only 200 individuals the corresponding AB model requires between 5
to 50 thousand variables, depending on the network structure, plus the associated parameters, initial conditions,
and random number calls. Complete mapping of model behavior in parameter space takes a few seconds of
simulation time in the DE model but is completely infeasible in the AB counterpart; policy analysis for a
realistic population such as that of a nation or large city is likewise infeasible.
Fourth, the purpose of the modeling project determines the cost of different types of error, and
consequently the potential cost of different assumptions. In the case of the diffusion of epidemics an assumption
that potentially underestimates the chances of diffusion of a disease is likely to be more costly, because it can
result in avoidable morbidity and mortality due to inadequate preparation, immunization, or deployment of
health care resources. Therefore some researchers suggest that one should be mainly concerned with the upper
limits on the speed of diffusion of disease, which usually are very close to the DE model’s behavior (Kaplan and
Lee 1990; Kaplan 1991). The same argument would suggest that the cost of the aggregation assumption is
increased in studies of diffusion of innovations, where diffusion is usually the desired outcome.
Finally, the complexity of a model can be expanded in different directions. One direction is the
disaggregation of populations into their constituent agents, which is one of the typical distinctions between DE
and AB models. Another direction is expanding the boundary of the model by including feedbacks from other
subsystems that can influence the behavior of interest. Yet another option is making the DE model more realistic
by disaggregating the population into additional compartments or including noise in the mean-field rates of
change. In a world of limited resources the researcher frequently needs to make decisions on which of these
directions to choose. For example, by using an aggregate, DE model, Kaplan, Craft and Wein (2002) were able
to include the dynamics of vaccination, including capacity limits on mass vaccination, in their model of
smallpox attack, without sacrificing extensive sensitivity analysis.
Conclusions
In this paper we compare the AB and DE modeling methodologies . The propagation of an epidemic is a
good context for examining this problem because both modeling methods are widely used even though the
differences between the two are potentially pronounced in this setting. Therefore this study provides insights for
choosing an appropriate set of assumptions in the tradeoff between complexity and understandability in other
modeling problems. Moreover, models of diffusion of disease, innovations, and ideas (Rogers 2003) are
structurally very similar, making the results of this study partially applicable to other diffusion problems.
Our analysis suggests that, in the context of diffusion, disaggregating population into agents adds
additional insights mainly when we deal with locally structured networks such as a lattice, or when we are
interested in some marginal modes of behavior such as non-diffusion as a result of stochastic contacts, or
reservoirs of disease in scale-free networks that prevent it from dieing out. In fact even in the case of locally
dense networks, the main mode of behavior is usually well captured by a calibrated DE model. In the other
network structure and heterogeneity scenarios we analyzed, the differences between DE and AB models were
small, warranting AB disaggregation only in the presence of detailed network data on the specific population
under study, and provided that the computational burden does not prevent sensitivity analysis or inclusion of
other key feedbacks that may condition the dynamics. These conclusions are reinforced by other tradeoffs
involved in selecting an appropriate level of aggregation.
We further showed how AB models can be formulated in continuous time so that AB structures can be
integrated with DE structures in a single modeling environment while adhering to good modeling practice
including dimensional consistency and independence of dynamics from the length of the time step used in the
numerical integration of the equations. The ability to mix DE and AB elements in a single model enhances the
ability of analysts to build models with structure and assumptions suited to the problem under study, expanding
the toolkit available to model important policy issues efficiently and effectively.
22,
Appendix 1- The detailed model formulation
In this section we develop an agent based SEIR model of epidemic diffusion and drive the classical
differential equation model from it. At the time when S(E)IR models were developed, in absence of powerful
computers, only the simplest differential equation models where tractable, therefore it is only recently that, with
progress in computational power, AB models of disease are being developed and adopted. Nevertheless,
following the more tangible logic of an AB epidemic model and deriving the DE model from it highlights the
assumptions made in the process and clarifies the relationship of the two. The formulation for infection rate is
explained in detailed for the agent based model, followed by derivation of the aggregate DE formulation for
infection rate from the AB formula.
In and AB model, if an infectious contact happens between individual j and any of the other people, the
individual’s Infection Rate (IR[j]) during that period is calculated so that the state of individual j changes from
susceptible to exposed. Formally:
IR{j]= IF THEN ELSE (TIC[j]>0,1,0)/time step
Here TIC[j] is the “Total Infectious Contacts” in the last time step for individual j and is the summation of all
infectious contacts in that time step between individual j and other people (See Figure 7).
TICHI= Dy CHK)
k
<TIME STEP>
Susceptible S Exposed E simptonnatc'
Infection Rate IR Emergence
' Rate ER
B Total Infectious
Contacts TIC R R
Depletion .
P f Contagion Contagion
Infectious Infectivity of
‘Contacts C Infectious IIS
R ae
Contact Probability
Network CP
Infectivity of
Infection Risk IP Exposed IES
<TIME STEP
«Renoversa R <Exgosed E>
Contact Frequency. Link Contact CONVENES pone
ra <Susceptible S>
for Healthy AC Rate LCR
aa ;
Contact Observed Link
Network NW Per PersonK CR
Relative Contact for
oe *
EffLinkNumon Relative Contact <Relative Contact "Recovered RCR
Contact Coeficient a Risk for Infectious Risk for Exposed
RCI RCE>
Tendency to Use Links
for Individual TUL
Figure 7- The infection rate formulation in the AB model
Contact between individuals j and k happens stochastically. If a uniform random number Rn{j,k] is
smaller than the probability of infectious contact between the two persons, contact will be 1, otherwise, it is 0:
C[j,.k]= IF(CP [j,.k]*CR[j]*CR[k]*IP[k]*S[j]>Rn[j,k],1,0) Where Rn{[j,.k]~Uniform[0,1]
Here CP{j,k] is the probability of a contact between the two individuals during the current time step, if
they were both healthy, CR[j]*CR[k] corrects the probability to reflect that unhealthy individuals have fewer
23
contacts (See below), IP[k] is the probability of infection upon a contact between individual k and a susceptible,
and S[j] is 1 if the individual is susceptible and 0 otherwise.
Probability of contact, CP[j,k], depends on the contact rates for different links (LCR[j,k]) and the
simulation time step, so that we calculated the probability of a contact happening in one time step. For small
probabilities we have:
CP{j,k]=LCR[j,k]*time step
Contact rates for different links are calculated so that the total expected number of contacts remain
similar to that of DE model (AC*N) while we account for different levels of heterogeneity as discussed in the
text:
LCRELKE AC *N NW [j,k]*TUL[k]*TUL[j]
y NW kI*TUL Ry *TULE] (K{k]* KE j])*
oa (K[k]* K[j])*
Where NW\{j,k] is a 0 or | variable that indicates if individuals j and k have a relationship in the given network
of people™ and is described in more detail in the analysis section. Average Contact (AC) is the average number
of contacts per day that an individual in this population has if everybody else was healthy, and N is the total
population. Tendency to Use Links for Individuals (TUL[j,k]) factors in the differences between individuals in
their propensity to use their network links, and is either 1 (for homogeneous setting) or distributed with
Uniform[0.25, 1.75] distribution (for heterogeneous setting). The total number of links individual j has in the
network, K[j], is used to capture cases where having more links reduces the chances of using those links, and a
determines how strong this effect is (a=0 is when no such effect exists, heterogeneous, and o=1 is when extra
links proportionately cut down on chances of contact).
Contact Risks (CR[j]*CR[k]) factor in the effect of different disease stages on individual activity:
CRUE (SUF EG]*RCEHG*RCHRE))/( SGHEG] +15] +RG)
Here Relative Contact rate for Exposed (RCE) and Relative Contact rate for Infectives (RCI) are
parameters between 0 and | which adjusts for the effect of different disease stages on individual activity,
relative to their activity level when healthy.
Infection Probability (IP[k]) depends on which stage the individual is in, so susceptible and recovered
people are not infectious around, while exposed and infectious individuals have a positive possibility of
infecting a susceptible upon contact:
IP[k]= EUj]* iss HE]* iis) SEG) 1G) +RGD.
Here igs and ijs are the infectivities of exposed and infectious people, i.e. the probability of a susceptible
catching the disease upon a contact with an exposed/infectious person.
To drive the DE equivalent of this model, we first note that infection rate (IRpz) in the DE model is for
the whole population rather than individuals, and therefore we are looking for the summation of IR[j] in the AB
model:
> IR[jJ= > TIC[j]/time step= 1/time step* >> C{j,k] *
J J JX
C[j,k] is an stochastic variable. To drive the deterministic DE equivalent, expectation of C[j,k] is used:
IRpz =I/time step* > > CP [j.k]*CR[j]*CR[k]*IP[k]*S[j]
JX
=I/time step* > SUj]*CRU]* > LCR{j,k]*time step CR[k]*IP[k]
J K
In the DE model, homogeneity of population results in everybody having the same number of links,
K{j]=K[k]=K, and same propensity to use their links, TUL[j]=TUL[k]=TUL. Therefore:
LORIKIE AC#*N NW[j,k]}*TUL? AC *N*NW[j,k]
NW [j,k]*TUL? K% SY NW(.K
24
NWJ{j,k] depends on the network structure, however, in the DE model, one assumes that the network
structure is homogeneous, i.e. everybody has the same chance of meeting everybody else at any time. Therefore
NW{j,k]/ > NW(j,kJ=I/N*. Therefore LCR[j,k]J=-AC*N*1/N°=ACWN. Putting this into the equation for
Kj
infection rate we will have:
IRoe= SliFcRGy* >, AC/N#*CR[k]}*IP[k]
i] K
S[j] is 0 for all people which are not susceptible, as CR[j]=1 for susceptibles, we will have:
IRpe =AC/N* > Siiy* > CR[k]*IP[k]
J K
Considering that individuals are always in one and only one of the four states, S[j]}+E[j]+I[j] +RUjJ=1
and therefore
IRpz = AC/N* > Siil* > (S[K]}+E[k]*RCE+I[k]*RCHR[k]) (E[k]* igs +I[k]* iis)
J K
For individuals not exposed or infectious, (E[k]* izs +I[k]* ijs)=0 and therefore its multiplication with
S[k] and R[k] will be zero, so we do not need to include the first term in those cases, and we can rewrite the
equations as:
IRpz = AC/N* > Siil* ss ( E[k]*RCE+I[k]*RCD (E[k]* izs +I[k]* its)
J K
Moreover, an individual can not be in both exposed and infectious states at the same time, therefore
E[k]*I[k]=0. So more simplification is possible:
IRpe = AC/N* > Siil* > ( E[k]*RCE* igs +I[k]*RCI* ijs)
J K
Finally, denoting the total number of susceptible, exposed, and infected S, E, and I respectively, we will
have:
IRpe = AC/N* > S[j]*(RCE* igs > E[k]+RCI* ijs * > I[k])
J K K
=AC/N* > S[j]*(RCE* igs*E+RCI*ijs*1)
J
=AC#(RCE* igs *E+RCI* iys*I)*(S/N)
Note that AC*RCE is the average number of contacts for an exposed individual, if everybody else was
healthy (cps) and AC*RCI is the same variable for infectious infividuals (c)s)“'. Therefore we get to the typical
SEIR formulation for Infection Rate (IRpr) (See Figure 8)
IR pg = (CysiesE + Cysig1) * (S/N)
25
Average
Incubation Time e Average Duration
of IlIness d
Susceptible Exposed _ Infectious ; Recovered
Population § 5 Asymptomatic Symptomatic Population R
Infection | PopulationE | Emergence | Population!| Recovery Lis
Rate ER Rate RR
AR) A®)
Relative Contac
Contagion Contagion Risk for Exposed
RCE
Total Contacts of Contact Freq
Total Population N Asymptomatic “Exposed and
Apfectious Susceptible
Total Infectious Contact Frequency Relative Contact
Contac Infectivity of for Healthy AC saab enous
Exposed IES
Contact Frequency
for Infectious
Total Contacts of
Symptomatic <4 Infectivity of
Infectious Infectious IIS
Figure 8- The structure of SEIR differential equation model.
The events of an individual transferring from Exposed to Infectious, and from Infectious to Recovered,
are more straight forward and follow a similar logic, which is explained in detail for Recovery Rate (RR) (See
Figure 9).
Infectious I' _———<— Recovered R
|Rec overy Rate RR
Infectious | |S er")
+ One Da’
Re A 'y
ecovery fo - a~
a R Recovery Probability of
a” Recovery
<Expected
Duration of Illness Noise Seed Expected Duration
DE Model * crim step» —~—ofliness
AB Model
Figure 9- A very simple DE structure with its parallel AB formulation.
The main idea for formulating the “Recovery Rate RR” is to switch “Infectious I’” from 1 to 0 at some
specific time step which is selected randomly based on the probability of recovery at any time for each
individual. This can be achieved through the following formulation:
RR[j]= If Then Else( R[j]< Probability of Recovery[j], Infectious I’[j]/time step, 0)
Here R is a random number taken from a uniform(0-1) distribution. It is generated each time step for
each individual to test that, if Infectious, the individual can recover at that step or not. The “Probability of
Recovery” is the probability that the individual will recover during a time step and depends on the probability
distribution of recovery times for individuals. Here we assume an exponential distribution with parameter A=1/d,
where d is the average duration of illness in days. Therefore the probability of recovery is independent of how
long the individual has been sick (See footnote ii). This value can be found based on the fact that the probability
of not recovery during one day is 1-One Day/d, which itself equals to probability that an individual does not
recover in any of the time steps during one day, or (Probability of Recovery)'"* *""* *Y, Writing these
equations and recovering the probability of recovery we get:
Probability of Recovery[j]=1-(1-One Day/d)'™ 0" Psy
The parallel DE formulation can be obtained by noticing that recovery rate is the summation of recovery
rates over all the individuals. While the AB model recovery rate for each individual is a stochastic variable for
26
the parallel DE case we use the deterministic expected value of the recovery rate in calculating the summation of
individual recover rates:
ar By 1-(1-OneDay / yTinestep /Oneday iz 1-(1-OneDay / dyTimeste /OneDay
ae TimeStep TimeStep
Expanding (1- One Day /d)'"***"°"* P*’, we note that the first two terms of expansion are a good
approximation for this part of equation, as long as 1/d<1 and time step<1, both of which are good assumptions
in this context. Therefore we replace (1- One Day /d)'™ *""* P*® with 1-time step/d™*:
1-(1-timestep/d) _ I
time step d
Therefore the parallel formulation in the DE case is a simple exponential decay with recovery rate being
equal to population divided by waiting time, RR=I/d. Following this argument, all the equations for the DE
model are listed bellow:
RRp, =! *
ds |
TT IR=IC *(S/N) IC = CysiggE + Cygig]
GE _IR-ER ER=E/e Gl ER-RR RR=I/d
dt dt
27
Appendix 2- The construction of scale free network
Barabasi and Albert (1999) outline a simulation algorithm to grow scale-free networks. This algorithm
starts with m0 number of initial nodes and adds one new node to the structure at every time t. The new node is
connected to the previous nodes through m new links, where the probability of each old node j to receive one of
the new links is m*k;/ > k,, In the resulting network the probability that a node j has connectivity smaller
than k after t time steps is determined by the following equation:
P(ki[t}<k)=1-m°t/(k?(t+m0))
To avoid the computational challenges of growing a thousand networks of size 200*200 outside our
simulation software and importing them for the sensitivity analysis of scale-free network setting, we use the
following algorithms to obtain, in closed form, scale-free networks with link distributions similar to Barabasi
and Albert (BA) algorithm.
We first randomly pick the number of links for each node, Nj, from the given probability distribution of
having k links (equation above). The resulting link numbers are between m and 199, with the exception of the
m0 initial nodes that can have less than m links. Then we find the probability for existence of a link between
individuals j and i, P,, , So that the expected number of links for every individual equals Nj. Finally, we use
these Pi probabilities to determine if there is a link between individuals i and j.
Formally we need to find P., so that we have:
Pa = Ny forall
I
P, =P, for all i and j
P,, =0 for all j
O<P. <1 foralliandj
This set of equations is under-specified, i.e. there are more variables than equations, and therefore one can find
infinite number of link settings that satisfy these conditions. Among these solutions, we pick the one that
assumes no correlation between the links for different nodes, i.e. the probability of nodes i and j having a link is
independent of whether they share other links in the network, and only depends on the total number of links they
can
yen N?N?
P, = ——
have. * ji k=0 l=k 2
Ts
This solution is consistent with the BA algorithm assumption that the probability of link between new nodes to
the older ones only depend on the total links an older node holds, and not on the specific network of
relationships between different old nodes.
The solution for Pi is a polynomial series where each new term adds to the precision of the results. We
use the first three terms (k=0, 1, 2) in our simulations, which is enough to give Pi. s that are 0.1% close to the
real values.
In the algorithm, the parameters m, m0, and t are set in order to get the desired number of links per
node. Moreover, following the examples of BA paper, we keep m=m0. Note that by drawing a random number
and comparing that with P.; to connect every two links, the expected number of links for node j will be N;, but
the exact number of links are binomially distributed, therefore we have a possibility of having unconnected
28
nodes, hermits, in our network. Such agents are indeed not susceptible to disease because they never meet
anybody else. This issue is not problematic in our analysis because the average number of links for each
individual is high enough to ensure the connectedness of the network in limits, and therefore the rare event of
having a few hermits at worst slightly over emphasizes the effect of networks, but does not change the general
trends. Our simulations show that the probability of having a hermit, and individual with no link to any other
individual, is about 0.11% with the parameter settings of the scale free network. Note that with 2 initial exposed
individuals, the chance of both of these individuals being a hermit is as low as 10°, therefore the chances of
non-diffusion as a result of first exposed individuals being a hermit is negligible in our analysis.
References:
Akkermans, H. A. (2001). Emergent Supply Networks: System Dynamics Simulation of Adaptive
Supply Agents. Preceedings of the 34th Hawaii International Conference on Systems Sciences
(HICSS-34), University of Hawaii.
Axelrod, R. (1997). "The dissemination of culture - A model with local convergence and global
polarization." Journal of Conflict Resolution 41(2): 203-226.
Axtell, R. L., J. Axelrod, et al. (1996). "Aligning simulation models: A case study and results."
Computational and Mathematical Organization Theory 1(2): 123-141.
Axtell, R. L., J. M. Epstein, et al. (2002). "Population growth and collapse in a multiagent model of the
Kayenta Anasazi in Long House Valley." Proceedings of the National Academy of Sciences of
the United States of America 99: 7275-7279.
Barabasi, A. L. (2002). Linked: How Everything Is Connected to Everything Else and What It Means.
Cambridge, Massachusetts, Perseus.
Barabasi, A. L. (2003). Linked: How Everything Is Connected to Everything Else and What It Means,
Plume.
Barabasi, A. L. and R. Albert (1999). "Emergence of scaling in random networks." Science 286(5439):
509-512.
Barthelemy, M., A. Barrat, et al. (2003). "Velocity and hierarchical spread of epidemic outbreaks in
complex networks." Condensed Matter.
Carley, K. (1992). "Organizational Learning and Personnel Turnover." Organization Science 3(1): 20-
46.
Dye, C. and N. Gay (2003). "Modeling the SARS epidemic." Science 300(5627): 1884-1885.
Edwards, M., S. Huet, et al. (2003). "Comparing an individual-based model of behavior diffusion with
its mean field aggregate approximation." Journal of Artificial Societies and Social Simulation
6(4).
Epstein, J. M. (2002). "Modeling civil violence: An agent-based computational approach." Proceedings
of the National Academy of Sciences of the United States of America 99: 7243-7250.
Epstein, J. M., D. A. T. Cummings, et al. (2004). Toward a Containment Strategy for Smallpox
Bioterror: An individual-Based Computational Approach, Brookings Institution Press.
Erdos, P. and A. Renyi (1960). "On the evolution of random graphs." Publications of the Mathematical
Institute of the Hungarian Academy of Sciences 5: 17-61.
Fahse, L., C. Wissel, et al. (1998). "Reconciling Classical and Individual-Based Approaches in
Theoretical Population Ecology: A Protocol for Extracting Population Parameters from
Individual-Based Models." The American Naturalist 152(6): 838-852.
Ferguson, N. M., M. J. Keeling, et al. (2003). "Planning for smallpox outbreaks." Nature 425: 681-685.
Forrester, J. W. (1958). "Industrial Dynamics: A Major Breakthrough for Decision Makers." Harvard
Business Review 26(4): 37-66.
Forrester, J. W. (1961). Industrial Dynamics. Cambridge, The M.LT. Press.
Forrester, J. W. (1971). World dynamics. Cambridge, Mass.,, Wright-Allen Press.
29
Grobler, A., M. Stotz, et al. (2003). A Software Interface Between System Dynamics and Agent-based
Simulations- Linking Vensim and RePast. Proceedings of International System Dyanmics
Conference, New York, NY.
Halloran, E. M., I. M. Longini, et al. (2002). "Containing Bioterrorist Smallpox." science 298: 1428-
1432.
Kaplan, E. H. (1991). "Mean-Max Bounds for Worst-Case Endemic Mixing Models." Mathematical
Biosciences 105(1): 97-109.
Kaplan, E. H., D. L. Craft, et al. (2002). "Emergency response to a smallpox attack: The case for mass
vaccination." Proceedings of the National Academy of Sciences 99(16): 10935-10940.
Kaplan, E. H. and Y. S. Lee (1990). "How BAd Can it Get? Bounding Worst Case Endemic
Heterogeneous Mixing Models of HIV/AIDS." Mathematical Biosciences 99: 157-180.
Kaplan, E. H. and L. M. Wein (2003). "Smallpox bioterror response." science 300: 1503.
Keeling, M. J. (1999). "The effects of local spatial structure on epidemiological invasions."
Proceedings of the Royal Society of London Series B-Biological Sciences 266(1421): 859-867.
Kermack, W. and A. McKendrick (1927). "Contributions to the mathematical theory of epidemics."
Proceedings of the Royal Society 115A: 700-721.
Koopman, J. S. (2002). "Contolling Smallpox." science 298: 1342-1344.
Koopman, J. S., S. E. Chick, et al. (2002). "Stochastic effects on endemic infection levels of
disseminating versus local contacts." Mathematical ciences 180: 49-71.
Koopman, J. S., G. Jacquez, et al. (2001). New data and tools for integrating discrete and continuous
population modeling strategies. Population Health and Aging. 954: 268-294.
Lipsitch, M., T. Cohen, et al. (2003). "Transmission dynamics and control of severe acute respiratory
syndrome." Science 300(5627): 1966-1970.
Lomi, A. and E. R. Larsen (2001). Dynamics of organizations : computational modeling and
organization theories. Menlo Park, Calif., AAAI Press/MIT Press.
Low, G. (1980). The Multiplier-accelerator model of business cycles interpreted from a system
dynamcis perspective. Elements of the System Dynamics Method. J. Randars. Waltham, MA,
Pegasus Communications.
Morecroft, J. D. (1983). "system Dybnamics: portraying bounded rationality." OMEGA 11(2): 131-
142.
Morecroft, J. D. W. (1985). "Rationality in the Analysis of Behavioral Simulation Models."
Management Science 31(7): 900-916.
Murray, J. D. (2002). Mathematical biology. New York, Springer.
Newman, M. E. J. (2002). Random Graphs as Models of Networks. Sante Fe Institute Working Paper.
Picard, N. and A. Franc (2001). "Aggregation of an individual-based space-dependant model of forest
dynamics into distribution-based and space-independent models." Ecological Modeling 145(1):
69-84.
Riley, S., C. Fraser, et al. (2003). "Transmission dynamics of the etiological agent of SARS in Hong
Kong: Impact of public health interventions." Science 300(5627): 1961-1966.
Rogers, E. M. (2003). Diffusion of innovations. New York, Free Press.
Rohani, P., M. J. Keeling, et al. (2002). "The interplay between determinism and stochasticity in
childhood diseases." American Naturalist 159(5): 469-481.
Schelling, T. C. (1978). Micromotives and macrobehavior. New York, Norton.
Schieritz, N. (2002). Integrating System Dynamics and Agent-Based Modeling. Proceedings of the
20th International Conference of the System Dynamics Society, Palermo, Italy, The System
Dynamics Society.
Scholl, H. J. (2001). Agent-based and System Dynamics Modeling: A Call for Cross Study and Joint
Research. Proceedings of the 34th Hawaiian International Conference on System Sciences
(HICSS-34), Maui, HI.
30
Sterman, J. (2000). Business
Dynamics: systems thinking and modeling for a copmlex world. Irwin, McGraw-Hill.
Strogatz, S. H. (2003). Sync : the emerging science of spontaneous order. New York, Theia.
Tesfatsion, L. (2002). "Economic agents and markets as emergent phenomena." Proceedings of the
National Academy of Sciences of the United States of America 99: 7191-7192.
Watts, D. J. (1997). The structure and dynamics of small-world systems: xx, 331 leaves.
Watts, D. J. and S. H. Strogatz (1998). "Collective dynamics of "small-world" networks." Nature 393(4
June): 440-442.
Wolfram, S. (2002). A new kind of science. Champaign, IL, Wolfram Media.
‘Traditionally the S(E)IR formulation denotes the R compartment “removed” and lumps both those who recover (and gain
immunity) and those who die. This assumption is not completely harmless when there is a large number of deaths (e.g.,
smallpox, HIV/AIDS), since the at-risk population shrinks as the epidemic progresses, changing contact probabilities
among individuals. Nevertheless we maintain the traditional assumption for comparability to the classical formulation.
“ The current formulations are based on Euler numerical integration technique. Use of Runge-Kutta or other techniques may
create problems in robustness of formulations.
" The basic reproduction number (Ro) is the expected number of people that one infected individual can transmit the
disease to when all others are susceptible, and measures the propensity of the disease to spread (it is related to the net gain
of the feedbacks governing the growth of the infectious populations). In the DE model, for the situation when the entire
population is susceptible the basic reproduction number can be written as:
> Ro=Infectivity[S]*Contact frequency[S]*duration[S], where S is the stage of the disease.
s
The observed reproduction number in the AB model can be determined by tracking how many individuals each person
infects before recovering, and averaging.
“Mean performance was chosen because it gives an overall impression of diffusion pattern across all 1000 simulations. An
alternative would be to calibrate the model to the behavior of a representative simulation from the ensemble of AB
simulations. This alternative has the advantage of being closer to one historical realization of an epidemic.
’ Calibration was done using the Vensim™ software’s optimization engine and mean square error function.
“ The infectivity parameters where constraint to be above 0, the average incubation time to be between 0 and 30 and
average duration of illness to be between 5 and 30 days. The latter constraints are based on the fact that people have some
understanding of disease length and therefore do not calibrate the model with no restriction on parameters.
™ We start a simulation with 2 randomly chosen, initially exposed individuals and simulate for 300 days for Uniform,
Random, and Scale-Free networks, 400 days for Small-World, and 600 days for Lattice network. This ensures that
epidemic finishes in all the simulations. We use a numerical integration time step of 0.5. Sensitivity analysis shows no
change in the overall performance of models when cutting the time step to 0.25 and 0.125. The simulations are done in the
Vensim™ software.
“In fact we use (So-S3o0)/N = (R..-Eo)/N, rather than R.,/N, because Ey is not infected by the endogenous epidemic. R.,/N =
(R..-Eo)/N when Ep is small (as is the case in our simulations). (R..-Eo)/N = (So-S..)/N because at the end of the epidemic all
the population are either in S or R state. Yet we simulate for a finite time of 300 days, therefore we use (So-S3o0)/N rather
than (R3o0-Eo)/N because S399 is closer to its equilibrium value. This will bring our metric closer to what is observed at the
real end of epidemic, should we continued the simulations longer, which is important in case of lattice network where
epidemic is not always finished in 300 days.
™ The network structure can be probabilistic, for example by having a continuous 0-1 variable rather than the binary choice.
Here, except for one case of continuous probability network, we select the binary version as it is more in line with normal
practice in AB modeling of networks, and highlights the effects of network structure better.
* Note that in small time steps, there is little chance that TIC[j]>1 and therefore the Min function is not carried over.
“ The parameter Ces (Cis ) is representing the average number of contacts an asymptomatic (symptomatic) person will have,
if everybody else was asymptomatic. Therefore these parameters are conditional on the other side of contact, which is here
an asymptomatic individual. In general, if we have M different population categories, each with population N;, they would
have (M?+M)/2 different contact rates C; (for every i, j pair as well as i=j. Because contact between i and j is the same as j
and i, C= Cj). In this case the average number of contacts for an individual from each category will be: C= > Cy*N/N
J
where N is the total population. In the case of disease epidemic, we are only interested in contacts between susceptible and
31
infectious populations, therefore, only the contact rates between exposed / infectious and susceptible (Czs and C;s) are
included in typical models. Similar argument is valid for the infectivity parameters, igs and ijs,
“ The parameter One Day has the value | and the unit Day. Even though mathematically the same, it is required to have a
conceptually coherent and unit wise consistent equation.
Back to the Top
32;