Stochastic epidemiological models

24 August 2023

Julien Arino

Department of Mathematics & Data Science Nexus
University of Manitoba*

Canadian Centre for Disease Modelling

*The University of Manitoba campuses are located on original lands of Anishinaabeg, Cree, Oji-Cree, Dakota and Dene peoples, and on the homeland of the Métis Nation. We respect the Treaties that were made on these territories, we acknowledge the harms and mistakes of the past, and we dedicate ourselves to move forward in partnership with Indigenous communities in a spirit of reconciliation and collaboration.


  • Why stochasticity matters
  • Side note: sojourn / residence times
  • Discrete-time Markov chains
  • Continuous-time Markov chains

Remarks / Resources

This is a user-oriented course: I barely touch on the algorithms; instead, I focus on how to use them

Code is available in my subfolder of the CODE directory in the course GitHub repo

Some of the slides are inspired by slides given to me by Linda Allen (Texas Tech) and Frank Ball (University of Nottingham). I recommend Linda's books and articles for more detail

Why stochasticity matters

Running example - SIS model without demography

Constant total population

Basic reproduction number: $\mathcal{R}_0$

In the deterministic world, $\mathcal{R}_0$ rules the world

  • If $\mathcal{R}_0<1$, the disease dies out (disease-free equilibrium)
  • If $\mathcal{R}_0>1$, it becomes established at an endemic equilibrium


In the stochastic world, make that "$\mathcal{R}_0$ rules-ish"


When , extinctions happen quite frequently
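The frequency of extinctions can be explored with a small simulation. The following is a minimal sketch of a Gillespie-type simulation of an SIS continuous-time Markov chain, not the code from the course repo. It assumes standard incidence, so that infections occur at rate $\beta (N-I)I/N$ and recoveries at rate $\gamma I$ (giving $\mathcal{R}_0=\beta/\gamma$ under that assumption). The function names, parameter names and parameter values are all hypothetical.

```python
import random


def simulate_sis_ctmc(beta, gamma, pop_size, i0, t_end, rng):
    """Simulate one SIS sample path; return event times and numbers infected.

    Assumes gamma > 0 and standard incidence (both are assumptions of this sketch).
    """
    t, i = 0.0, i0
    times, infected = [t], [i]
    while i > 0:
        rate_inf = beta * (pop_size - i) * i / pop_size  # S -> I events
        rate_rec = gamma * i                             # I -> S events
        total = rate_inf + rate_rec
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which event happens, proportionally to its rate
        if rng.random() < rate_inf / total:
            i += 1
        else:
            i -= 1
        times.append(t)
        infected.append(i)
    return times, infected


def extinction_fraction(beta, gamma, pop_size, i0, t_end, n_runs, seed=1):
    """Fraction of sample paths in which the infection is extinct by t_end."""
    rng = random.Random(seed)
    extinct = 0
    for _ in range(n_runs):
        _, infected = simulate_sis_ctmc(beta, gamma, pop_size, i0, t_end, rng)
        if infected[-1] == 0:
            extinct += 1
    return extinct / n_runs


# R_0 = 2 > 1 here, yet a sizeable fraction of runs started from I(0) = 1 go extinct
print(extinction_fraction(beta=2.0, gamma=1.0, pop_size=100, i0=1,
                          t_end=20.0, n_runs=200))
```

For large populations, the branching process approximation suggests that the extinction fraction should be roughly $(1/\mathcal{R}_0)^{I(0)}$ when $\mathcal{R}_0>1$, which can be compared with the printed value.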

Types of stochastic systems discussed today

  • Discrete-time Markov chains (DTMC)
  • Continuous-time Markov chains (CTMC)

But there are many others. Of note:

  • Branching processes (BP)
  • Stochastic differential equations (SDE)

Side note: sojourn / residence times

Some probability theory

Suppose that a system can be in two states, $S_1$ and $S_2$

  • At time $t=0$, system is in state $S_1$
  • An event happens at some time $t=\tau$, which triggers the switch from state $S_1$ to state $S_2$

A random variable is a variable that takes random values, that is, a mapping from random experiments to numbers

Let us call $T$ the random variable "time spent in state $S_1$ before switching into state $S_2$"

States can be anything:

  • $S_1$: working, $S_2$: broken;
  • $S_1$: infected, $S_2$: recovered;
  • $S_1$: alive, $S_2$: dead.

We take a collection of objects or individuals in state $S_1$ and want some law for the distribution of the times spent in $S_1$, i.e., a law for $T$

For example, we make light bulbs and would like to tell our customers that on average, our light bulbs last 200 years...

For this, we conduct an infinite number of experiments, and observe the time that it takes, in every experiment, to switch between $S_1$ and $S_2$

From this, deduce a model, which in this context is called a probability distribution

Discrete vs continuous random variables

We assume that $T$ is a continuous random variable, that is, $T$ takes continuous values. Examples of continuous r.v.:

  • height or age of a person (if measured very precisely)
  • distance
  • time

Another type of random variable is the discrete random variable, which takes values in a denumerable set. Examples of discrete r.v.:

  • heads or tails on a coin toss
  • the number rolled on a die
  • height of a person rounded to whole units, or age of a person in whole years

Probability density function

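Assume $T$ continuous; it has a continuous probability density function $f$

  • $f\geq 0$
  • $\int_{-\infty}^{+\infty} f(s)\,ds=1$
  • $\mathbb{P}(a\leq T\leq b)=\int_a^b f(t)\,dt$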


Cumulative distribution function

The cumulative distribution function (c.d.f.) is a function $F(t)$ that characterizes the distribution of $T$, and is defined by

$$F(s)=\mathbb{P}(T\leq s)=\int_{-\infty}^{s} f(x)\,dx$$


Properties of the c.d.f.

  • Since $f$ is a nonnegative function, $F$ is nondecreasing
  • Since $f$ is a probability density function, $\int_{-\infty}^{+\infty} f(s)\,ds=1$, and thus $\lim_{t\to\infty}F(t)=1$


Mean value

For a continuous random variable $T$ with probability density function $f$, the mean value of $T$, denoted $\bar T$ or $E(T)$, is given by

$$E(T)=\int_{-\infty}^{+\infty} t\,f(t)\,dt$$

Survival function

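Another characterization of the distribution of the random variable $T$ is through the survival (or sojourn) function

The survival function of state $S_1$ is given by

$$\mathcal{S}(t)=1-F(t)=\mathbb{P}(T>t)$$

This gives a description of the sojourn time of a system in a particular state (the time spent in the state)

$\mathcal{S}$ is a nonincreasing function (since $\mathcal{S}=1-F$ with $F$ a c.d.f.), and $\mathcal{S}(0)=1$ (since $T$ is a positive random variable)

The average sojourn time is

$$E(T)=\int_0^\infty t\,f(t)\,dt$$

Since $\lim_{t\to\infty}t\,\mathcal{S}(t)=0$,

$$E(T)=\int_0^\infty \mathcal{S}(t)\,dt$$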
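The step from the density to the survival function is an integration by parts. As a short worked derivation, writing $f$ for the p.d.f. of the sojourn time $T$ and $\mathcal{S}(t)=\mathbb{P}(T>t)$ for its survival function (so that $\mathcal{S}'=-f$), and assuming $t\,\mathcal{S}(t)\to 0$ as $t\to\infty$:

```latex
\begin{align*}
E(T) &= \int_0^\infty t\,f(t)\,dt
      = -\int_0^\infty t\,\mathcal{S}'(t)\,dt \\
     &= \Big[-t\,\mathcal{S}(t)\Big]_0^\infty + \int_0^\infty \mathcal{S}(t)\,dt
      = \int_0^\infty \mathcal{S}(t)\,dt
\end{align*}
```

The bracketed boundary term vanishes at $0$ trivially and at infinity by the assumption on $t\,\mathcal{S}(t)$.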

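Expected future lifetime

The expected future lifetime of an individual that has survived to time $t_0$ is

$$\frac{1}{\mathcal{S}(t_0)}\int_0^\infty t\,f(t+t_0)\,dt$$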

Hazard (or failure) rate

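The hazard rate (or failure rate) is

$$h(t)=\lim_{\Delta t\to 0}\frac{\mathcal{S}(t)-\mathcal{S}(t+\Delta t)}{\Delta t\,\mathcal{S}(t)}=\lim_{\Delta t\to 0}\frac{\mathbb{P}(T<t+\Delta t\mid T\geq t)}{\Delta t}=\frac{f(t)}{\mathcal{S}(t)}$$

Gives the probability per unit time of failure between $t$ and $t+\Delta t$, given survival to $t$

We have

$$h(t)=-\frac{d}{dt}\ln \mathcal{S}(t)$$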

The exponential distribution

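The random variable $T$ has an exponential distribution if its probability density function takes the form

$$f(t)=\begin{cases}0 & \text{if } t<0\\ \theta e^{-\theta t} & \text{if } t\geq 0\end{cases}$$

with $\theta>0$. Then the survival function for state $S_1$ is of the form $\mathcal{S}(t)=e^{-\theta t}$, for $t\geq 0$, and the average sojourn time in state $S_1$ is

$$E(T)=\int_0^\infty e^{-\theta t}\,dt=\frac{1}{\theta}$$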

The Dirac distribution

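If on the other hand, for some constant $\omega>0$,

$$\mathcal{S}(t)=\begin{cases}1 & \text{if } 0\leq t\leq \omega\\ 0 & \text{if } t>\omega\end{cases}$$

which means that $T$ has a Dirac delta distribution $\delta_\omega(t)$, then the average sojourn time is

$$E(T)=\int_0^\omega dt=\omega$$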
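The two average sojourn times can be checked numerically by integrating the survival function. This is a minimal sketch using a trapezoidal rule; the function names and the values of `theta`, `omega`, the truncation point `t_max` and the number of steps are arbitrary choices made for illustration.

```python
import math


def mean_sojourn_from_survival(survival, t_max, n_steps):
    """Trapezoidal approximation of the integral of `survival` over [0, t_max]."""
    h = t_max / n_steps
    total = 0.5 * (survival(0.0) + survival(t_max))
    for k in range(1, n_steps):
        total += survival(k * h)
    return total * h


theta = 0.5   # rate of the exponential distribution (illustrative value)
omega = 3.0   # constant sojourn time of the Dirac case (illustrative value)

# Exponential case: survival function exp(-theta t), expected mean 1/theta
mean_exp = mean_sojourn_from_survival(lambda t: math.exp(-theta * t),
                                      t_max=60.0, n_steps=60000)
# Dirac case: survival function equal to 1 up to omega, then 0; expected mean omega
mean_step = mean_sojourn_from_survival(lambda t: 1.0 if t <= omega else 0.0,
                                       t_max=60.0, n_steps=60000)

print(mean_exp, 1 / theta)
print(mean_step, omega)
```

The integral is truncated at `t_max`, which is harmless here because both survival functions are negligible well before that point.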

A cohort model

A model for a cohort with one cause of death

We consider a population consisting of individuals born at the same time (a cohort), for example, the same year

We suppose

  • At time $t=0$, there are initially $N_0>0$ individuals
  • All causes of death are compounded together
  • The time until death, for a given individual, is a random variable $T$, with continuous probability density function $f(t)$ and survival function $\mathcal{S}(t)$

The model

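Denote $N(t)\geq 0$ the population at time $t\geq 0$. Then

$$N(t)=N_0\,\mathcal{S}(t)$$

where $\mathcal{S}$ is the survival function of the time until death, so that $\mathcal{S}(t)$ gives the proportion of $N_0$, the initial population, that is still alive at time $t$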

Case where is exponentially distributed

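Suppose that $T$ has an exponential distribution with mean $1/d$ (or parameter $d$), $f(t)=de^{-dt}$. Then the survival function is $\mathcal{S}(t)=e^{-dt}$ and $N(t)$ takes the form

$$N(t)=N_0e^{-dt}$$

Now note that

$$\frac{d}{dt}N(t)=-dN_0e^{-dt}=-dN(t)$$

with $N(0)=N_0$

The ODE $N'=-dN$ makes the assumption that the time until death is exponentially distributed, with life expectancy at birth $1/d$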
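The link between the ODE and the exponential distribution can be illustrated by sampling. The sketch below draws a time until death for each individual of a cohort from an exponential distribution and compares the number of survivors with the closed form $N_0e^{-dt}$. The cohort size, the value of `d`, the seed and the function name are illustrative assumptions.

```python
import math
import random


def surviving_fraction(lifetimes, t):
    """Fraction of the cohort whose time until death exceeds t."""
    return sum(1 for x in lifetimes if x > t) / len(lifetimes)


rng = random.Random(2023)
d = 1.0 / 80.0   # death rate, i.e., mean lifetime of 80 time units (illustrative)
n0 = 100_000     # initial cohort size (illustrative)

# One exponentially distributed time until death per individual
lifetimes = [rng.expovariate(d) for _ in range(n0)]

for t in (20, 80, 120):
    simulated = n0 * surviving_fraction(lifetimes, t)
    closed_form = n0 * math.exp(-d * t)
    print(t, simulated, round(closed_form))
```

With a cohort this large, the simulated and closed-form counts should differ only by sampling noise.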

Case where has a Dirac delta distribution

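Suppose that $T$ has a Dirac delta distribution at $t=\omega$, giving the survival function

$$\mathcal{S}(t)=\begin{cases}1 & \text{if } 0\leq t\leq \omega\\ 0 & \text{if } t>\omega\end{cases}$$

Then $N(t)$ takes the form

$$N(t)=\begin{cases}N_0 & \text{if } 0\leq t\leq \omega\\ 0 & \text{if } t>\omega\end{cases}$$

All individuals survive until time $\omega$, then they all die at time $\omega$

Here, we have $N'=0$ everywhere except at $t=\omega$, where it is undefined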

Sojourn times in an SIS disease transmission model

An SIS with tweaked recovery

Traditional ODE models assume recovery from disease at a constant per capita rate (often denoted $\gamma$)

Here, assume that, of the individuals who have become infective at time $t_0$, a fraction $P(t-t_0)$ remain infective at time $t\geq t_0$

Thus, considered for $t\geq 0$, the function $P(t)$ is a survival function

Reducing the dimension of the problem

We have $S(t)+I(t)=N$

$N$ is constant (equal to the total population at time $t=0$), so we can deduce the value of $S(t)$, once we know $I(t)$, from the equation $S(t)=N-I(t)$

Model for infectious individuals

Integral equation for the number of infective individuals:

$$I(t)=I_0(t)+\int_0^t \beta\frac{(N-I(u))I(u)}{N}P(t-u)\,du$$

  • $I_0(t)$ number of individuals who were infective at time $t=0$ and still are at time $t$
    • $I_0(t)$ is nonnegative, nonincreasing, and such that $\lim_{t\to\infty}I_0(t)=0$
  • $P(t-u)$ proportion of individuals who became infective at time $u$ and
    who still are at time $t$
  • $\beta(N-I(u))I(u)/N$ is $\beta S(u)I(u)/N$ with $S(u)=N-I(u)$, from the reduction of dimension

Expression under the integral

In the integral equation for the number of infective individuals, consider the term

$$\beta\frac{(N-I(u))I(u)}{N}P(t-u)$$

  • $\beta(N-I(u))I(u)/N$ is the rate at which new infectives are created, at time $u$,
  • multiplying by $P(t-u)$ gives the proportion of those who became infectives at time $u$ and who still are at time $t$

Summing over $[0,t]$ gives the number of infective individuals at time $t$

Case of an exponentially distributed time to recovery

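Suppose that $P(t)$ is such that the sojourn time in the infective state has an exponential distribution with mean $1/\gamma$, i.e., $P(t)=e^{-\gamma t}$

Then the initial condition function $I_0(t)$ takes the form

$$I_0(t)=I_0(0)e^{-\gamma t}$$

with $I_0(0)$ the number of infective individuals at time $t=0$. This is obtained by considering the cohort of initially infectious individuals, giving a model such as the cohort model with exponentially distributed time until death

The integral equation becomes

$$I(t)=I_0(0)e^{-\gamma t}+\int_0^t \beta\frac{(N-I(u))I(u)}{N}e^{-\gamma(t-u)}\,du$$

Taking the time derivative of $I(t)$ yields

$$\begin{aligned}I'(t)&=-\gamma I_0(0)e^{-\gamma t}-\gamma\int_0^t \beta\frac{(N-I(u))I(u)}{N}e^{-\gamma(t-u)}\,du+\beta\frac{(N-I(t))I(t)}{N}\\&=\beta\frac{(N-I(t))I(t)}{N}-\gamma I(t)\end{aligned}$$

which is the classical logistic type ordinary differential equation (ODE) for $I$ in an SIS model without vital dynamics (no birth or death)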
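The behaviour of this logistic type ODE can be sketched numerically. The code below assumes standard incidence, i.e., the equation $I'=\beta(N-I)I/N-\gamma I$, and integrates it with a classical fourth-order Runge-Kutta scheme. Under that assumption the endemic equilibrium is $I^\star=N(1-\gamma/\beta)$ when $\beta>\gamma$. The function names and parameter values are hypothetical.

```python
def sis_rhs(i, beta, gamma, pop_size):
    """Right-hand side of I' = beta (N - I) I / N - gamma I (standard incidence assumed)."""
    return beta * (pop_size - i) * i / pop_size - gamma * i


def solve_sis(i0, beta, gamma, pop_size, t_end, n_steps):
    """Integrate the SIS ODE with classical RK4 and return I(t_end)."""
    h = t_end / n_steps
    i = i0
    for _ in range(n_steps):
        k1 = sis_rhs(i, beta, gamma, pop_size)
        k2 = sis_rhs(i + 0.5 * h * k1, beta, gamma, pop_size)
        k3 = sis_rhs(i + 0.5 * h * k2, beta, gamma, pop_size)
        k4 = sis_rhs(i + h * k3, beta, gamma, pop_size)
        i += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return i


pop_size = 100.0
# beta > gamma: solution approaches the endemic equilibrium N (1 - gamma / beta)
endemic = solve_sis(i0=1.0, beta=2.0, gamma=1.0, pop_size=pop_size,
                    t_end=40.0, n_steps=4000)
# beta < gamma: solution approaches the disease-free equilibrium I = 0
extinct = solve_sis(i0=1.0, beta=0.5, gamma=1.0, pop_size=pop_size,
                    t_end=40.0, n_steps=4000)

print(endemic, pop_size * (1 - 1.0 / 2.0))
print(extinct)
```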

Case of a step function survival function

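Consider the case where the time spent infected has survival function

$$P(t)=\begin{cases}1 & \text{if } 0\leq t\leq \omega\\ 0 & \text{if } t>\omega\end{cases}$$

i.e., the sojourn time in the infective state is a constant $\omega>0$

In this case the integral equation becomes

$$I(t)=I_0(t)+\int_{t-\omega}^{t}\beta\frac{(N-I(u))I(u)}{N}\,du$$

Here, it is more difficult to obtain an expression for $I_0(t)$. It is however assumed that $I_0(t)$ vanishes for $t>\omega$

When differentiated, the integral equation gives, for $t\geq\omega$,

$$I'(t)=I_0'(t)+\beta\frac{(N-I(t))I(t)}{N}-\beta\frac{(N-I(t-\omega))I(t-\omega)}{N}$$

Since $I_0(t)$ vanishes for $t>\omega$, this gives the delay differential equation (DDE)

$$I'(t)=\beta\frac{(N-I(t))I(t)}{N}-\beta\frac{(N-I(t-\omega))I(t-\omega)}{N}$$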

What we know so far

  • The time of sojourn in compartments plays an important role in determining the type of model that we deal with
  • All ODE compartmental models, when they use terms of the form $\kappa X$, make the assumption that the time of sojourn in compartments is exponentially distributed with mean $1/\kappa$
  • At the other end of the spectrum, delay differential equations with discrete delay make the assumption of a constant sojourn time $\omega$, equal for all individuals
  • Both can be true sometimes... but reality is often somewhere in between

Figure: survival function, $\mathcal{S}(t)=\mathbb{P}(T>t)$, for an exponential distribution with mean 80 years


The problems with the exponential distribution

  • Survival drops quickly: in previous graph, about 22% mortality of a cohort by age 20 years ($1-e^{-20/80}\simeq 0.22$)
  • Survival extends way past the mean: in previous graph, about 22% survival to age 120 years ($e^{-120/80}\simeq 0.22$)
  • Acceptable if what matters is mean duration of sojourn over long time period
  • Less so if interested in short term dynamics
  • Exponential distribution with parameter $\theta$ has mean $1/\theta$ and variance $1/\theta^2$, i.e., one parameter controls both the mean and dispersion
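The figures quoted for the exponential distribution with mean 80 years follow directly from its survival function $e^{-\theta t}$ with $\theta=1/80$. A minimal check, with variable and function names chosen only for illustration:

```python
import math

mean_lifetime = 80.0          # mean of the exponential distribution, in years
theta = 1.0 / mean_lifetime   # corresponding rate parameter


def survival(t):
    """Survival function of the exponential distribution with rate theta."""
    return math.exp(-theta * t)


for age in (20, 80, 120):
    mortality = 1.0 - survival(age)
    print(f"age {age}: mortality {mortality:.3f}, survival {survival(age):.3f}")

# Mean and standard deviation are both 1/theta: one parameter controls both
print(1.0 / theta, math.sqrt(1.0 / theta**2))
```

Only about 37% of the cohort reaches the mean age of 80, since $e^{-1}\simeq 0.37$.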

A COVID-19 model: "making Erlangs"

JA & Portet. A simple model for COVID-19. Infectious Disease Modelling 5:309-315 (2020)

Simple way to "fix" sojourn times: sums of exponential distributions

  • Exponential distribution of sojourn times is acceptable if what matters is mean duration of sojourn over long time period
  • For COVID-19, we were trying to give "predictions" over a 2-4 week period, so we need more than the mean

Use a property of exponential distributions, namely, that the sum of i.i.d. (independent and identically distributed) exponential random variables is Erlang distributed

Sum of exponential distributions

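$X_1$ and $X_2$ independent exponential r.v. with rate parameters $\lambda_1$ and $\lambda_2$. Then the p.d.f. of $Z=X_1+X_2$ is the convolution

$$f_Z(z)=\int_0^z f_{X_1}(x_1)f_{X_2}(z-x_1)\,dx_1=\begin{cases}\dfrac{\lambda_1\lambda_2}{\lambda_2-\lambda_1}\left(e^{-\lambda_1 z}-e^{-\lambda_2 z}\right) & \text{if } \lambda_1\neq\lambda_2\\[2mm] \lambda^2 z e^{-\lambda z} & \text{if } \lambda_1=\lambda_2=\lambda\end{cases}$$

The Erlang distribution

P.d.f. of the Erlang distribution

$$f(x;k,\lambda)=\frac{\lambda^k x^{k-1}e^{-\lambda x}}{(k-1)!},\quad x,\lambda\geq 0$$

$k$ shape parameter, $\lambda$ rate parameter (sometimes use scale parameter $1/\lambda$)

So, if $\lambda_1=\lambda_2=\lambda$, $Z=X_1+X_2$ has distribution

$$f_Z(z)=\lambda^2 z e^{-\lambda z}$$

i.e., an Erlang distribution with shape parameter $2$ and rate parameter $\lambda$


Let $X_i$, $i=1,\ldots,k$, be exponential i.i.d. random variables with parameter $\lambda$

Then $\sum_{i=1}^k X_i$ is Erlang distributed with rate parameter $\lambda$ and shape parameter $k$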

So use multiple compartments
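The effect of chaining compartments can be sketched by sampling. In the code below, a sojourn is split into `k` stages, each exponentially distributed with rate `k / mean`. That choice of stage rate is an assumption made here so that the total mean stays fixed while the variance, `mean**2 / k`, shrinks as `k` grows. Function and variable names are hypothetical.

```python
import random
import statistics


def erlang_sojourn(k, mean, rng):
    """Total time spent in k successive exponential stages, each with rate k / mean."""
    rate = k / mean
    return sum(rng.expovariate(rate) for _ in range(k))


rng = random.Random(42)
mean = 5.0  # target mean sojourn time (illustrative value)

for k in (1, 2, 5, 20):
    sample = [erlang_sojourn(k, mean, rng) for _ in range(20000)]
    # Columns: number of stages, sample mean, sample variance, theoretical variance
    print(k, statistics.mean(sample), statistics.variance(sample), mean**2 / k)
```

With `k = 1` this is the plain exponential case; larger `k` concentrates sojourn times around the mean, moving towards the constant sojourn time of the Dirac case.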

Discrete-time Markov chains

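$p(n)=(p_1(n),\ldots,p_r(n))$: probability vector, with $p_i(n)$ describing the probability that at time $n$, the system is in state $S_i$, $i=1,\ldots,r$

$\sum_{i=1}^r p_i(n)=1$ for all $n$, of course

State evolution governed by

$$p(n+1)=p(n)P$$

where $P$ is a stochastic matrix (row sums all equal 1), the transition matrix, with entry $p_{ij}=\mathbb{P}(X_{n+1}=S_j\mid X_n=S_i)$, where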