Search for a unified framework

Contact tracing as a personalization framework

Published: April 8, 2021


  • A framework should incorporate knowledge from the related domains of epidemiology, virology, privacy, user-behavior, public policy, and computer science
  • The framework should react to the information as it is observed, thereby enabling early warning signals

Why do we need contact tracing? Framework in practice


Thinking of digital test-and-trace as a personalization framework has several advantages, especially when experts across different domains need to come together to address the problem. Such a unified framework can potentially serve the following purposes:

  • Facilitate collaboration: It can serve as a common language between researchers across various disciplines, e.g., epidemiologists, privacy experts, computer scientists, etc.
  • Domain transfer: Researchers or practitioners in other domains may develop this framework further to apply it to their respective field of expertise.
  • Algorithm adaptability: In the face of a new contagious virus, not everything about the virus will be known on day zero. Our knowledge of its attributes evolves with lab results and scientific experiments. Moreover, these attributes might be specific to a location (e.g., variants of SARS-CoV-2). Thus, a framework lets practitioners across the globe adapt the algorithms readily as more becomes known about the virus.

Epidemiology & Virology

A general framework will need to capture how the virus replicates and spreads from one host organism to another. Epidemiologists and virologists study these mechanisms. Before we bake in the technological constraints, it will help to familiarize ourselves with some basics of the respective fields.

If the virus can infect the host, the host’s state is termed susceptible. Once the host is infected, the virus keeps replicating until the host’s immune response is strong enough to combat it. Viral load, a quantity studied by virologists, is the number of viral particles inside the host, and it evolves over the course of the infection. For a short period after infection, the concentration of the virus is too low for the host to emit it; this state is referred to as exposed. Once there is enough virus in the body, the host starts emitting it (e.g., via breath), and the host’s state is termed infectious. At any time, the quantity of virus emitted is proportional to the viral load. Finally, once the virus has been eliminated, leaving behind antibodies that protect against future infection, the host’s state is termed recovered.

Symptoms do not show up until the incubation period of the virus has elapsed (e.g., a mean of 5.3 days for SARS-CoV-2). In the context of Covid-19, infectiousness begins roughly two days before symptoms appear. Note that the term infected can refer to either the exposed or the infectious state, so an infected host is not necessarily infectious (or contagious). The four states of susceptible, exposed, infectious, and recovered (SEIR) are mutually exclusive and are often used in state-based modeling of contagious diseases.
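To make the four states concrete, here is a minimal Python sketch that maps the time since infection to an SEIR state. The function name, the `None` convention for a never-infected host, and the 14-day recovery time are assumptions made for illustration; only the infectiousness onset (a 5.3-day mean incubation period minus roughly two days) comes from the numbers quoted above.

```python
from enum import Enum


class State(Enum):
    SUSCEPTIBLE = "S"
    EXPOSED = "E"
    INFECTIOUS = "I"
    RECOVERED = "R"


def seir_state(days_since_infection, infectious_onset=3.3, recovery_day=14.0):
    """Return the SEIR state of a host.

    days_since_infection: None for a host that was never infected.
    infectious_onset: 5.3-day mean incubation minus ~2 days (from the text).
    recovery_day: hypothetical value, chosen only for illustration.
    """
    if days_since_infection is None:
        return State.SUSCEPTIBLE
    if days_since_infection < infectious_onset:
        return State.EXPOSED      # infected, but not yet emitting the virus
    if days_since_infection < recovery_day:
        return State.INFECTIOUS   # emitting the virus
    return State.RECOVERED
```

Because each branch returns exactly one state, the mutual exclusivity of the SEIR states holds by construction.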

Although the shape of the viral load curve varies with individual factors (e.g., age, sex, or pre-existing conditions), we consider the piecewise linear curve shown in Figure 3. This simplification follows from the empirical study conducted by To et al. Viral load is a measure of the number of viral particles in a milliliter of blood. For our purposes, however, we consider a proportional quantity that we call effective viral load. A piecewise linear curve gives us the freedom to adapt the framework to different individual characteristics. It is also valid to assume a simple parametric curve (e.g., a gamma distribution) with just two degrees of freedom.

Figure 3: Effective viral load (unitless) is our proxy for viral load (number of viral particles in each milliliter of blood).
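A piecewise linear effective viral load can be sketched in a few lines of Python. Every breakpoint below (`rise_start`, `peak_day`, `plateau_end`, `recovery_day`) and the unit peak height are hypothetical values chosen for illustration, not the parameters behind Figure 3. The sketch also assumes the effective viral load is zero before the rise starts.

```python
def effective_viral_load(t, rise_start=3.3, peak_day=6.0,
                         plateau_end=8.0, recovery_day=14.0, peak=1.0):
    """Unitless effective viral load, t days after infection.

    All breakpoints are illustrative assumptions. Per-individual factors
    (age, pre-existing conditions, ...) could be modeled by shifting them.
    """
    if t is None or t < rise_start or t >= recovery_day:
        return 0.0                                   # susceptible, early, or recovered
    if t < peak_day:                                 # linear rise to the peak
        return peak * (t - rise_start) / (peak_day - rise_start)
    if t < plateau_end:                              # plateau at the peak
        return peak
    return peak * (recovery_day - t) / (recovery_day - plateau_end)  # linear decay
```

With these defaults, `effective_viral_load(11.0)` is halfway down the decay segment and evaluates to 0.5.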

Demands on the framework

Imagine an oracle that knows everyone’s viral load at any time (without loss of generality, we assume the viral load of susceptible individuals to be zero). It would be relatively easy to control the outbreak without causing much economic disruption by recommending that individuals with a non-zero viral load self-isolate. Therefore, the framework should help us create such an oracle, or rather a predictor that estimates an individual’s effective viral load.

As inputs to the predictor, we would want to use relevant data points such as personal attributes, diagnostic results, symptoms, and encounters. These inputs, however, carry uncertainties arising from inconsistent user behavior or from research that is still incomplete. For example, the RT-PCR test, the most promising diagnostic test for SARS-CoV-2, has a false negative rate of 33% at the viral load peak. It is also unreasonable to rely on users to input their symptoms or personal information all the time. The risk of infection associated with an encounter will depend heavily on characteristics such as location, distance, and duration.
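To see why an imperfect test cannot be taken at face value, consider a simple Bayes update of the probability that someone is infected after a negative result. This is only a sketch: the function name, the prior, and the zero false positive rate are assumptions for illustration, and the 33% false negative rate is the figure quoted above.

```python
def posterior_infected_given_negative(prior, false_negative_rate=0.33,
                                      false_positive_rate=0.0):
    """P(infected | negative test) via Bayes' rule.

    prior: assumed probability of infection before the test.
    false_positive_rate: hypothetical; set to 0 only to keep the example simple.
    """
    p_neg_if_infected = false_negative_rate
    p_neg_if_healthy = 1.0 - false_positive_rate
    evidence = prior * p_neg_if_infected + (1.0 - prior) * p_neg_if_healthy
    if evidence == 0.0:
        raise ValueError("a negative test is impossible under these rates")
    return prior * p_neg_if_infected / evidence


# With a 50% prior, a negative test still leaves roughly a 25% chance of infection.
print(round(posterior_infected_given_negative(0.5), 3))
```

Under these assumptions, a single negative result lowers the estimated risk substantially but does not remove it, which is why the predictor should combine the test with the other inputs.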

Finally, we consider the framework in practice, when people use it on appropriately configured devices (typically in the form of an app). We may ask for user information such as symptoms or pre-existing conditions through deliberate user-behavior design of such an app. To respect the privacy of the users, we would want these inputs to never leave the device. Thus, the framework should use such individual information in isolation. Further, to use the peer-to-peer Bluetooth communication protocol, our framework should respect its constraints. As discussed in the previous article, we would want the framework to pass N-bits of information. Smaller values of N are more favorable to user privacy, so we restrict N to 1, 2, 3, or 4.

Since we want to capture the underlying contagion dynamics, the framework should model how much viral load might have been emitted during the encounter. We would want the communication channel between two user devices to communicate this viral load via N-bits. However, because the viral load predictions depend on inputs that change with time, we must also use the communication channel for updates. We call these messages warning signals in this series; they are referred to as risk messages in Bengio et al. and Gupta et al. Finally, given an individual’s estimated viral load, we would want to recommend user behaviors from a set of well-defined directives (e.g., work from home, avoid public transport) designed by user-behavior researchers in conjunction with PHEs. Such a system of recommending behaviors is empirically deemed effective by Ayres et al.
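One way to picture a warning signal is as a uniform quantization of the estimated effective viral load into 2^N levels. The sketch below is an illustration under stated assumptions, not the encoding used by any deployed protocol: the function names, the uniform spacing, and the normalization by `max_load` are all hypothetical.

```python
def encode_warning_signal(load, n_bits=4, max_load=1.0):
    """Quantize an estimated effective viral load into an n_bits-wide integer."""
    if n_bits not in (1, 2, 3, 4):
        raise ValueError("N is restricted to 1, 2, 3, or 4")
    top_level = 2 ** n_bits - 1                  # e.g. 15 for N = 4
    clipped = min(max(load, 0.0), max_load)      # keep the load in [0, max_load]
    return round(clipped / max_load * top_level)


def decode_warning_signal(level, n_bits=4, max_load=1.0):
    """Map a received level back to an approximate effective viral load."""
    top_level = 2 ** n_bits - 1
    return level / top_level * max_load
```

With N = 1 the signal collapses to a binary alert, while N = 4 distinguishes 16 intensities; this is the trade-off between privacy and expressiveness that the restriction on N controls.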

To recap the above discussion, we want the framework to exhibit the following properties:

  • Capture contagion: The framework should model the amount of viral load emitted from an infectious host during the encounter. To model such transmission, we would want to send this information encoded in N-bits as warning signals to the devices involved in an encounter. This way, we can send warning signals of varying intensities.
  • Use relevant inputs: For an individual, in addition to the warning signals received, the framework may or may not have access to the following informative signals: symptoms, diagnostic test results, personal attributes. A well-designed predictor uses all of these inputs to estimate an individual's viral load.
  • Adapt to inputs: Because these inputs are observed over time, the framework, at regular intervals, should allow for a correction in the intensity of the past warning signals. For example, if the predictor estimates a warning signal of 5 based on today's symptoms, a negative test result two days later would make the predictor estimate a warning signal of 0.
  • Recommend behavior: The framework should be able to recommend increasingly restrictive cautionary behaviors to individuals depending on their infection risk. These recommendations are the only lever for the framework to control the outbreak while minimizing economic disruption.
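The last two properties can be sketched together in Python: correcting past warning signals when the estimate changes, and mapping a warning level to a directive. The thresholds, the ordering of the directives, and the function names are hypothetical; in practice the directives would be designed by user-behavior researchers.

```python
# Hypothetical thresholds on a 4-bit warning level (0..15), least to most restrictive.
RECOMMENDATIONS = [
    (0, "no restrictions"),
    (1, "work from home"),
    (6, "avoid public transport"),
    (11, "self-isolate"),
]


def recommend(level):
    """Return the most restrictive directive whose threshold the level reaches."""
    chosen = RECOMMENDATIONS[0][1]
    for threshold, directive in RECOMMENDATIONS:
        if level >= threshold:
            chosen = directive
    return chosen


def corrections(sent, revised):
    """Return {day: new_level} for past days whose warning signal has changed.

    sent: levels already transmitted, keyed by day.
    revised: levels re-estimated after new inputs (e.g. a negative test).
    """
    return {day: revised[day] for day in sent
            if day in revised and revised[day] != sent[day]}
```

For instance, if a level of 5 was sent for days 0 and 1 and a later negative test revises both to 0, `corrections({0: 5, 1: 5}, {0: 0, 1: 0})` yields update messages for both days.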

Proactive Contact Tracing

The ability to adapt to inputs enables the framework to update its prediction as more information flows through the user network. In a scenario where diagnostic tests are administered only after the symptoms appear (e.g., if tests are a scarce resource), a framework that relies on symptoms will potentially react faster. Similarly, a predictor relying on warning signals will be even faster. Therefore, we call this framework Proactive Contact Tracing (PCT) due to its ability to generate early warning signals.

The accuracy of these warning signals is dependent on the predictor. The N-bits of the communication channel can also be used to transmit the confidence in these predictions. Due to the uncertainty in predictions, we would want to recommend less restrictive behaviors when appropriate. Thus, PCT is a personalization framework to minimize the true viral load of the population.
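As a purely illustrative sketch of how confidence could share the channel, the N bits can be split between a warning level and a confidence bucket. The 2 + 2 split, the function names, and the layout (level in the high bits) are assumptions, not a described part of PCT.

```python
def pack_signal(level, confidence, level_bits=2, confidence_bits=2):
    """Pack a warning level and a confidence bucket into one small integer."""
    if level_bits + confidence_bits > 4:
        raise ValueError("the message must fit in at most 4 bits")
    if not 0 <= level < 2 ** level_bits:
        raise ValueError("level out of range")
    if not 0 <= confidence < 2 ** confidence_bits:
        raise ValueError("confidence out of range")
    return (level << confidence_bits) | confidence   # level occupies the high bits


def unpack_signal(message, confidence_bits=2):
    """Recover (level, confidence) from a packed message."""
    return message >> confidence_bits, message & ((1 << confidence_bits) - 1)
```

The cost of this design is resolution: with 4 bits in total, two bits of confidence leave only four warning intensities instead of sixteen.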

In the following post, we explore how to design a reasonably accurate predictor for the PCT framework.
