Modelling COVID-19 with SEIR

"Flattening the curve" has become a household phrase these days, as COVID-19 is ravaging the western world and putting enormous pressure on our healthcare systems. To flatten the curve means to decrease the number of cases in need of hospitalization at any one time to avoid exceeding the available capacity, such as the number of ventilators. Policy makers around the world have enlisted epidemiologists to tell them what needs to be done to flatten the curve in their country. Common actions include limiting crowd sizes and encouraging those who can to work from home. Two questions immediately arise: How early do we need to enact these costly policies and how severe do the policies need to be in terms of decreasing the exponential growth factor.

In this article, I will implement a susceptible-exposed-infected-recovered (SEIR) model to answer these questions. This model has previously been used in New York Times for this purpose, and versions of it is used in research by epidemiologists to model the epidemic curve.


SEIR is a mathematical model for a group of infectious diseases including COVID-19. In the model, all members of a population belong to one of four groups: susceptible, exposed, infectious, or recovered. Susceptible individuals have never been infected, exposed have recently been infected but cannot yet infect others, infectious can infect others, and recovered have developed immunity.

In the model, individuals move from one group to the next. In the beginning, the number of infected individuals increases exponentially since each case is generating a number of secondary cases, those cases each generate secondary cases of their own, etc. After some time, the number of individuals that are susceptible decreases to the point where, for lack of individuals to infect, the number of secondary cases generated per case drops. Eventually it drops below one, at which point the number of infected individuals in the population starts to decrease until finally the disease disappears from the population. The number of infected individuals over time is known as the epidemic curve. Next, I discuss the epidemic parameters that affect the shape of the curve in the SEIR model.


Throughout this article and in our implementation, we are going to use the following parameters. We are going to get numeric values for them from a World Health Organization report and an article by Li et al. These articles are from the early days of COVID-19 and mostly based on what happened in Wuhan.

Number of infectious days. The number of days that an individual is infectious but asymptomatic. Li et al. estimated this value to 2.9 days.

Incubation period. The WHO report says that the incubation period for COVID-19 has a range of 1-14 days and a mean of 5-6 days. Li et al. estimated the value to 5.2 days.

Time from symptom to outcome. The WHO report says that the median time from onset to clinical outcome is approximately 2 weeks for mild cases and 3-6 weeks for severe or critical cases. For simulation purposes, we can say that, on average, mild cases have a recovery time, counted from the first day of symptoms, of 14 - 2.9 = 11.1 days for mild cases, and 7 * (3 + 6) / 2 - 2.9 = 28.6 days for severe cases. Patients that eventually die spend, according to the report, between 2 to 8 weeks in hospital before the final outcome. For the purpose of our simulations, we can use the average number, assuming uniform distribution, of ((2 + 8) * 7 - 2.9) / 2 = 32.1 days.

Time to hospitalization. The WHO report says that the time period from onset to the development of severe disease is 1 week. Based on this, we can set the time to hospitalization to 7 days.

Basic reproduction number. Li et al. estimate the basic reproduction number to be 2.2. The basic reproduction number is the number of secondary cases generated by each infected individual at the start of an outbreak, in a population lacking immunity, without any mitigation policies.

Effective basic reproduction number. The actual number of secondary cases per infected individual is the effective basic reproduction number. The effective basic reproduction number reflects the effect of mitigation and containment policies.

Case fatality rate. The case fatality rate (CFR) is the proportion of deaths compared to the number of diagnosed cases. This is different from the infection fatality rate, which accounts also for undiagnosed cases and asymptomatic cases. Covid-19's CFR at the time of writing varies a lot between different regions because of different testing practices, healthcare availability etc. The WHO report found that the CFR varied between 5.8% in Wuhan to 0.7% in other areas of China.

Model equations

Let $S$ be the number of susceptible individuals, $E$ the number of exposed, $I$ the number of infectious, and $R$ the number of recovered. The dynamics of the SEIR model are characterized by the following for ODEs:

$$ \frac{\rm{d}S}{\rm{d}t} = -\frac{R_t}{T_{\rm{inf}}}\cdot I S,\quad \frac{\rm{d}E}{\rm{d}t} = \frac{R_t}{T_{\rm{inf}}}\cdot I S - T_{\rm{inc}}^{-1} E,\quad \frac{\rm{d}I}{\rm{d}t} = T^{-1}_{\rm{inc}} E - T^{-1}_{\rm{inf}} I,\quad \frac{\rm{d}R}{\rm{d}t} = T^{-1}_{\rm{inf}} I $$

$R_t$ is the effective basic reproduction number. $T_{\rm{inf}}$ is the average number of infectious days. $T_{\rm{inc}}$ is the incubation period in days. In the simulation, $R_t$ will be $R_0$ up to some time $T_int$, when some intervention measure was taken, after that it will be some other constant that is lower than $R_0$. How much lower depends on the effectiveness of the intervention.

In order to get more detailed information about the outcome and the load on the healthcare system, I will follow Gabriel Goh's lead and track also the number of people entering hospitals and leaving them.

There are three ways in which individuals go from the infected state to their final outcome. First, they may have a mild case and self-isolate after the number of infectious days computed, given earlier as 2.9. They will then be self-isolated for the 11.1 days also given earlier before they move to the recovered group. Second, they may have a severe infection, in which case they self-isolate first but then are moved to a hospital after some days, this number of days was given earlier as 7. Once in the hospital, they may either recover or have a fatal outcome. The average time to outcome was given earlier as 28.6 days for severe cases with recovery, and 33.55 days for cases with fatal outcome.

The probability of a fatal outcome can be taken to be the case fatality rate, $P_{\rm{F}} = CFR$. The WHO report says that in their source material, 13.8% of the laboratory confirmed patients went on to develop a severe infection. This can be taken to be the probability of developing a severe infection, $P_{\rm{S}}$. The probability of developing mild symptoms is $1 - P_{\rm{M}} - P_{\rm{S}}$. We now add more equations to model the movement between self-isolation and recovery, from severe infection to hospitalization, and from hospitalization to final outcome. Let $I_\rm{M}$ be the number of removed individuals with mild infection, $I_\rm{S}$ be the number of removed individuals with severe infection, let $I_\rm{H}$ be those with severe infection that have been hospitalized, and $I_\rm{F}$ those that will eventually have a fatal outcome. Further, let $R_\rm{M}$ be the recovered individuals of those that had a mild infection, $R_\rm{S}$ the recovered individuals of those that had a severe infection, and $R_\rm{F}$ those with a fatal outcome. Let $T_\rm{M}$, $T_\rm{S}$, $T_\rm{H}$, $T_\rm{F}$ be the time to outcome for each group. Finally, let $T_\rm{lag}$ be the time that it takes before a severe case is hospitalized.

$$ \frac{\rm{d}I_\rm{M}}{\rm{d}t} = T_\rm{inf}^{-1} P_\rm{M} I - T_\rm{M}^{-1} I_\rm{M},\quad \frac{\rm{d}I_\rm{S}}{\rm{d}t} = T_\rm{inf}^{-1} P_\rm{S} I - T_\rm{lag}^{-1} I_\rm{S} $$ $$ \frac{\rm{d}I_\rm{H}}{\rm{d}t} = T_\rm{lag}^{-1} I_\rm{S} - T_\rm{S}^{-1} I_\rm{H},\quad \frac{\rm{d}I_\rm{F}}{\rm{d}t} = T_\rm{inf}^{-1} P_\rm{F} I - T_\rm{F}^{-1} I_\rm{F} $$ $$ \frac{\rm{d}R_\rm{M}}{\rm{d}t} = T_\rm{M}^{-1} I_\rm{M},\quad \frac{\rm{d}R_\rm{S}}{\rm{d}t} = T_\rm{S}^{-1} I_\rm{H},\quad \frac{\rm{d}R_\rm{F}}{\rm{d}t} = T_\rm{F}^{-1} I_\rm{F} $$



We start by solving the model with $R_t = R_0$, the case of no intervention.

If we look at the number of susceptible individuals, we see that it takes some time but once the infection starts to gain traction, it quickly engulfs approximately 85 percent of population and then dies out (the outbreak in this simulation started with just one individual):

The number of infected at a time follows the famous epidemic curve (I'm multiplying by the population size so that the y axis shows the number of individuals that are infected):

The curve peaks at 70,000 for this population of one million. The number of hospitalized individuals is very important since it tells us what capacity the healthcare system must have. We can plot the number of hospitalized individuals as well:

Remember that in this model, a hospitalized person is not infectious since he has been removed from the population, so a hospitalized person does not count towards the number in the previous curve. Because of this, the curve over hospitalized individuals lags the curve over infectious individuals, and it is broader because individuals spend more time in the hospital than they spend being infectious.

We now solve the model with an intervention, starting at time 100 and bringing $R_0$ down to 60% of the original value. We look in particular on the number of individuals in hospital.

We see that the orange curve, the curve that shows the number of hospitalized individuals given that the intervention takes place, is much lower than what it was without intervention. This is what it means to flatten the curve. By intervening, e.g. by social distancing measures, we can lower the peak and make it more manageable for the healthcare system.


The following papers were used for estimates of epidemic parameters:

Li et al.

World Health Organization.

Special thanks to Gabriel Goh who made his implementation of this model public.

About the author

Calle Ekdahl

Mathematica enthusiast from Gothenburg, Sweden.

Add comment

By Calle Ekdahl