COVID-19/Adjusted Number of Cases

This learning resource explains how to process the number of reported cases so that they provide an estimate of the real number of cases in the total population. The adjusted number of cases at day $t$ is important for forecasting and for the mathematical modelling of COVID-19.

History of Learning Resource

Julian Mendez originally posted this idea here and it is archived here. For an updated reformulation, see below.

Adjusted number of cases

We start with a reference day of data collection with the index $t = 0$ (e.g. January 6, 2020 is $t = 0$, January 7, 2020 is $t = 1$, ...). We use the variable $t$ because it is a time index.

  • $t$: number of days after the reference day of data collection.
  • $n_t$: number of tests performed at day $t$. It defines the baseline against which the positive tested patients are counted.
  • $c_t$: number of COVID-19 positive tests for day $t$, the day on which the samples for the tests were collected. Please keep in mind that a laboratory needs time to process the collected samples, so do not assign positive tests to the day on which they are officially reported. Assign them to the day on which the samples were taken from the patients. The variable $c$ is used because it is the first letter of COVID-19.
  • $p_t = \frac{c_t}{n_t}$: fraction of positive tests among all tests, e.g. $p_t = 0.05$ means that 5% of the tests are positive.
  • Patients who show only minor symptoms and do not show up in the health system for testing create a bias in the fraction $p_t$. The same holds for asymptomatic patients, who show no symptoms and may spread the disease without knowing that they are COVID-19 positive.
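The fraction of positive tests can be computed with a few lines of Python. This is a minimal sketch; the function name and the figures are hypothetical and only illustrate the 5% example.

```python
def positive_fraction(positive_tests: int, tests: int) -> float:
    """Return the fraction of positive tests among all tests of one day."""
    if tests <= 0:
        raise ValueError("at least one test is needed to compute a fraction")
    if not 0 <= positive_tests <= tests:
        raise ValueError("positive tests must lie between 0 and the number of tests")
    return positive_tests / tests


# Hypothetical figures: 2000 samples collected on one day, 100 of them positive.
fraction = positive_fraction(positive_tests=100, tests=2000)
print(f"{fraction:.1%} of the tests are positive")
```

Both counts must refer to the day on which the samples were collected, not to the day on which the results were reported.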

Susceptible, Infected and Recovered (SIR)

The spread of the virus depends on the immune status of the population. Consider two completely different situations for the public health status of a population with $N$ citizens at day $t$, where $S(t)$, $I(t)$ and $R(t)$ denote the number of susceptible, infected and recovered citizens:

$N = S(t) + I(t) + R(t)$
  • (Situation 1: vulnerable population) There is only one infected citizen ($I(t) = 1$) at time index $t$ among the population, and all the other citizens are susceptible to COVID-19 (i.e. $S(t) = N - 1$). Hence there are no citizens who have recovered from the COVID-19 disease (i.e. $R(t) = 0$). An epidemiologically extremely vulnerable community is therefore exposed to a single infected citizen among the population (patient zero). The spread of the disease will initially show exponential growth.
  • (Situation 2: protected population) There is again only one infected citizen ($I(t) = 1$) among the population, but this time all the other citizens have recovered from the COVID-19 infection (i.e. $R(t) = N - 1$). Hence there are no susceptible citizens among the population who can be infected by the single COVID-19 patient (i.e. $S(t) = 0$). The single infected patient cannot infect anybody in the population, because the recovered citizens are immune against a COVID-19 infection.

Lesson learnt: Having a higher percentage of recovered (immune) patients among the population slows down the spreading of the disease in the population.
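The two situations can be compared with a small discrete-time SIR simulation. This is only a sketch under stated assumptions: the transmission rate `beta`, the recovery rate `gamma`, the population size and the number of days are hypothetical values chosen for illustration, not estimates for COVID-19.

```python
def simulate_sir(population, infected, recovered, beta=0.3, gamma=0.1, days=60):
    """Run a daily-step SIR model and return the peak number of infected."""
    s = float(population - infected - recovered)  # susceptible
    i = float(infected)
    r = float(recovered)
    peak = i
    for _ in range(days):
        new_infections = beta * s * i / population  # needs susceptible people
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak


N = 10_000  # hypothetical population size
print("Situation 1, peak infected:", round(simulate_sir(N, infected=1, recovered=0)))
print("Situation 2, peak infected:", round(simulate_sir(N, infected=1, recovered=N - 1)))
```

In situation 2 there are no susceptible citizens, so the number of new infections is zero on every day and the single case simply recovers.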

Learning Task

  • Explain why the people in the compartment "recovered" (R) may return to "susceptible" (S) after a period of time. Do you know a disease for which you keep your immune status for a lifetime?

Vaccination and protected Population

Vaccination moves citizens from the vulnerable status "susceptible" (S) into the status "recovered" (R), because the vaccination "emulates" an infection for the immune system and allows the immune system to produce antibodies against the disease. The virus that causes COVID-19 was new in 2019, and therefore vaccination of the population was not possible before the outbreak. COVID-19 can put a patient into a critical condition, so that she/he must be treated in an Intensive Care Unit (ICU). A protected population would therefore be ideal, but this was not possible because the new virus met a totally vulnerable society.

Lesson learnt: Because vaccination was not possible for COVID-19, the only option to protect the health system is to let the number of cases increase slowly, so that the health system can keep delivering health services to patients. Keep in mind that the health system has other patients in the ICU and that the capacity might not be sufficient for a huge number of COVID-19 cases. Staying at home and reducing the number of physical contacts among the population are therefore used to slow down the epidemiological spread of COVID-19.


Estimation of aggregated COVID-19 Infection among Population

  • Now we consider once again $n_t$ as the number of tests performed at day $t$ and $c_t$ as the number of positive tests among them. The people who are tested are not selected randomly among the population of $N$ people, so that
$\frac{c_t}{n_t} \cdot N$
might be a wrong estimate for the number of infected people among the population. There might be a bias, especially when only patients with symptoms are tested. As a control, a random sample of $\tilde{n}_t$ people can be selected among the population at time $t$. The tested people are selected without consideration of any symptoms, and the selection should be representative of the total population. This study will also detect a number of $\tilde{c}_t$ COVID-19 positive tests. The ratio $\tilde{p}_t = \frac{\tilde{c}_t}{\tilde{n}_t}$ is an estimate for the fraction of infected people among the population. It will probably differ from the ratio $p_t = \frac{c_t}{n_t}$ calculated from the routine tests at day $t$.
With a total population of $N$ people, this leads to a better estimate for the total number of infected people among the population, provided that the number $\tilde{n}_t$ of tests for the randomly selected people is high (see Borel's law of large numbers):
$\tilde{p}_t \cdot N = \frac{\tilde{c}_t}{\tilde{n}_t} \cdot N$
Testing capacity is limited, so a random selection of samples is costly, and this test design might therefore be applied only for calibrating the model. A COVID-19 test is a limited resource, and tests are mostly applied if the patient shows symptoms of COVID-19 or if the immune status must be clarified because someone (e.g. a member of the medical staff) might be a risk for the environment in which she/he is working or living. So the estimate for the total number of people who show an immune response or exposure to the COVID-19 virus must be based on $n_t$ and $c_t$ of the tested patients, i.e. on the fraction $p_t = \frac{c_t}{n_t}$.

If a control test on a randomly selected sample was performed only once, at day $t_0$, and yielded the fraction $\tilde{p}_{t_0}$ of positive tests, then an error correction constant for the number of people in the population who show an immune response or exposure to the COVID-19 virus can be calculated by:

$e = \frac{\tilde{p}_{t_0}}{p_{t_0}}$
  • With the error correction value $e$ and the population size $N$, the total number of people who show an immune response or exposure to the COVID-19 virus at day $t$ can be estimated by
$e \cdot p_t \cdot N$
Please keep in mind that the error correction value $e$ might be updated less often than new cases are reported.
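  • The daily growth rate $g_t$ for the value of new cases $a_t$ is defined as $g_t = \frac{a_t - a_{t-1}}{a_{t-1}}$. E.g. if the value at day $t$ is 1.5 times the value at the day before with index $t-1$, i.e. $a_t = 1.5 \cdot a_{t-1}$, then
$g_t = \frac{1.5 \cdot a_{t-1} - a_{t-1}}{a_{t-1}} = 0.5$.
This means that we have an increase of 50% in the number of new cases for the day $t$. The daily growth rate $g_t$ for the value of new cases could also be negative, e.g. if $a_t = 0.75 \cdot a_{t-1}$, then
$g_t = \frac{0.75 \cdot a_{t-1} - a_{t-1}}{a_{t-1}} = -0.25$.
This means that the number of new cases for the day $t$ decreased by 25%.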
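| date index | tested | positive tests | percentage | adjusted new cases | daily growth rate for new cases |
|---|---|---|---|---|---|
| $t$ days | $n_t$ | $c_t$ | $p_t$ | $a_t$ | $g_t$ |
| $t-1$ days | $n_{t-1}$ | $c_{t-1}$ | $p_{t-1}$ | $a_{t-1}$ | $g_{t-1}$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 1 day | $n_1$ | $c_1$ | $p_1$ | $a_1$ | $g_1$ |
| 0 (reference day) | $n_0$ | $c_0$ | $p_0$ | $a_0$ | undefined |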
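The rows of such a table could be filled programmatically, as in the following sketch. It rests on assumptions that are not confirmed by the text: the adjusted value of a day is taken to be the correction value times the fraction of positive tests times the population size, and the correction value, the population size and all daily counts are hypothetical.

```python
def fill_table(tests, positives, population, correction):
    """Build one table row per day, starting with the reference day (index 0)."""
    rows = []
    previous = None
    for day, (tested, positive) in enumerate(zip(tests, positives)):
        fraction = positive / tested
        adjusted = correction * fraction * population  # assumed adjustment
        # The growth rate is undefined for the reference day or a zero baseline.
        growth = None if not previous else (adjusted - previous) / previous
        rows.append({"day": day, "tested": tested, "positive": positive,
                     "fraction": fraction, "adjusted": adjusted, "growth": growth})
        previous = adjusted
    return rows


# Hypothetical daily counts for the reference day and the two following days.
rows = fill_table(tests=[1000, 1000, 1000], positives=[40, 60, 45],
                  population=1_000_000, correction=0.2)
for row in rows:
    growth = "undefined" if row["growth"] is None else f"{row['growth']:+.0%}"
    print(row["day"], row["tested"], row["positive"],
          f"{row['fraction']:.1%}", round(row["adjusted"]), growth)
```

With these figures the growth rate is undefined for the reference day, then +50%, then -25%, which mirrors the two worked examples for the daily growth rate.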

Logistical Growth and SIR Model

If we assume that logistic growth can be applied to the COVID-19 disease, there is a point in time at which the number of newly detected cases no longer increases. This point in time is reached when the daily growth rate of new cases drops to zero, and it corresponds to the inflection point of the curve in the following graph.

 

When the SIR model is applied to the epidemiological modelling, the logistic growth is, with a delay in time, similar to the green curve of the recovered.

 
 
Blue=Susceptible, Red=Infected, and Green=Recovered


Each member of the population typically progresses from susceptible to infectious to recovered. This can be shown as a flow chart in which the boxes represent the different compartments and the arrows represent the transitions between compartments. An arrow from recovered (R) back to susceptible (S) might be added if patients lose their immune status after a while. This is similar to patients who must refresh their vaccination after a number of years. For some diseases, one infection or one vaccination is sufficient for a lifetime. COVID-19 is a new disease, so it is difficult to estimate in 2020 how well the immune system of a recovered patient will be prepared for a new exposure to the coronavirus.
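The optional arrow from recovered (R) back to susceptible (S) can be sketched by adding one more rate to a daily-step SIR model. The parameter `waning` (the daily fraction of recovered people who lose immunity) and all other values are hypothetical and serve only as an illustration.

```python
def simulate_sirs(population, infected, beta, gamma, waning, days):
    """Daily-step SIR model with an optional flow from R back to S."""
    s, i, r = float(population - infected), float(infected), 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        lost_immunity = waning * r  # arrow from R back to S
        s += lost_immunity - new_infections
        i += new_infections - new_recoveries
        r += new_recoveries - lost_immunity
        history.append((s, i, r))
    return history


# waning=0.0 gives the plain SIR model; a positive value refills compartment S.
for waning in (0.0, 0.01):
    s, i, r = simulate_sirs(10_000, 1, beta=0.3, gamma=0.1,
                            waning=waning, days=365)[-1]
    print(f"waning={waning}: S={s:.0f}, I={i:.0f}, R={r:.0f}")
```

With `waning=0.0` the recovered compartment can only grow. With a positive value, recovered people keep flowing back into the susceptible compartment.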

Learning Task

  • Identify a disease that needs just one vaccination for a lifetime, and identify a disease that needs a new vaccination after a number of years.
  • Explain why an arrow from recovered (R) to susceptible (S) might be added to the flow chart of the SIR model if scientific evidence becomes available for this model extension.

Testing

There are different tests for a viral disease:

  • Polymerase Chain Reaction (PCR): PCR is a method in molecular biology for making millions of copies of a specific DNA segment. For an RNA virus such as the one that causes COVID-19, the viral RNA is first transcribed into DNA (reverse transcription PCR). If the replication fails, the test reports that the sample did not contain the specific segment targeted by the test. Please keep in mind that the test does not address the complete genome of the virus. Therefore fragmented genetic material of the virus that is not capable of programming cells for the production of new viruses might lead to a positive test (false positive). The PCR test is used to detect patients who might infect others. So a PCR test provides information about the red curve of infected people in the population.
  • Antibodies: a test for antibodies against COVID-19 shows whether the immune system was exposed to the COVID-19 virus and responded to the exposure by creating antibodies. This test provides information about the green curve of the recovered. Please keep in mind that the immune system needs time to respond to the exposure to a new virus. Therefore the antibody test might be negative although the patient is infected and able to infect others.

Julian Mendez Contribution

I would like to share an idea to have a more precise understanding of the number of cases of COVID-19. A more precise current number of cases could be approximated by computing the square of cases today divided by the cases one week ago. My suggestion is to use the growth of the previous days. Let us imagine that a region had the following cases:

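| date | cases | daily growth |
|---|---|---|
| 7 days ago | $c_{-7}$ | |
| 6 days ago | $c_{-6}$ | $\frac{c_{-6}}{c_{-7}}$ |
| ⋮ | ⋮ | ⋮ |
| 1 day ago | $c_{-1}$ | $\frac{c_{-1}}{c_{-2}}$ |
| today | $c_0$ | $\frac{c_0}{c_{-1}}$ |

We can approximate the future growth by the past, and say that an adjusted number of cases can be approximated with:

$c_0 \cdot \frac{c_{-6}}{c_{-7}} \cdot \frac{c_{-5}}{c_{-6}} \cdots \frac{c_0}{c_{-1}}$

which is the same as $\frac{c_0^2}{c_{-7}}$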

As an example, these are the values for the first 10 countries in the list on 2020-03-14:

| country | cases today | cases one week ago | adjusted cases today (approx.) |
|---|---|---|---|
| China | 80844 | 80695 | 80993 |
| Italy | 21157 | 5883 | 76087 |
| Iran | 12729 | 5823 | 27825 |
| South Korea | 8162 | 7134 | 9338 |
| Spain | 6391 | 430 | 94988 |
| Germany | 3795 | 684 | 21056 |
| France | 4499 | 949 | 21329 |
| United States | 2794 | 352 | 22177 |
| Switzerland | 1359 | 254 | 7271 |
| United Kingdom | 1140 | 206 | 6309 |

These adjusted values may fluctuate with sudden high values, like in the case of Spain.
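The adjusted values can be recomputed as the square of today's cases divided by the cases one week ago. The following sketch does this for three rows of the table; the function name is chosen here for illustration.

```python
def adjusted_cases(cases_today, cases_week_ago):
    """Square of today's cases divided by the cases one week ago."""
    if cases_week_ago <= 0:
        raise ValueError("the number of cases one week ago must be positive")
    return cases_today ** 2 / cases_week_ago


# (cases today, cases one week ago), taken from the table for 2020-03-14.
reported = {
    "Italy": (21157, 5883),
    "Spain": (6391, 430),
    "Germany": (3795, 684),
}
for country, (today, week_ago) in reported.items():
    print(country, round(adjusted_cases(today, week_ago)))
```

A small count one week ago, as for Spain with 430 cases, inflates the adjusted value strongly, which is the fluctuation mentioned above.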

I hope someone can find this idea useful.

Adjustments due to the delay in the tests

This is a reformulation of the previous section. I originally posted it here. The updated version is here.

We would like to approximate how many true cases there are. Let us assume that:

  • the time between a patient getting infected and the case being reported is always the same
  • people do not significantly change the growth of infected cases

The variables are:

  • $d$ is the number of days between a patient getting infected and the case being reported
  • $r_i$ is the number of reported cases for day $i$
  • $a_i$ is an approximation of the true cases for day $i$
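| day | reported cases | daily growth | true cases | approx. of true cases |
|---|---|---|---|---|
| 0 | $r_0$ | | $r_d$ | |
| 1 | $r_1$ | $\frac{r_1}{r_0}$ | $r_{1+d}$ | |
| ⋮ | ⋮ | ⋮ | ⋮ | |
| $d$ | $r_d$ | $\frac{r_d}{r_{d-1}}$ | $r_{2d}$ | $a_d$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| $i$ | $r_i$ | $\frac{r_i}{r_{i-1}}$ | $r_{i+d}$ | $a_i$ |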

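We would like to find a formula for $a_i$ that approximates the true cases for day $i$, which under the assumptions above are the cases $r_{i+d}$ reported $d$ days later.

One possibility is using the previous $d$ growth rates. In this case:

$a_i = r_i \cdot \frac{r_{i-d+1}}{r_{i-d}} \cdot \frac{r_{i-d+2}}{r_{i-d+1}} \cdots \frac{r_i}{r_{i-1}}$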

Hence,