Data Management for Assigning and Managing Investigations

Updated May 26, 2020

The development and implementation of a robust data management infrastructure will be critical for assigning and managing investigations, linking clients with confirmed and probable COVID-19 to their contacts, and evaluating success and opportunities for improvement in a case investigation and contact tracing program.  COVID-19 case investigations will likely be triggered by one of three events:

  1. A positive SARS-CoV-2 laboratory test or
  2. A provider report of a confirmed or probable COVID-19 diagnosis or
  3. Identification of a contact as having COVID-19 through contact tracing

Data management systems should be able to capture these three types of events electronically. Ideally, laboratory, provider, and contact case reports should be transmitted electronically to the local health authority and then imported into the system automatically.
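As a minimal sketch of how the three triggering events might be represented when reports arrive electronically, the example below uses a Python enumeration and a small validation function. The names TriggerEvent, IncomingReport, and parse_report, and the field names in the raw report, are hypothetical and chosen only for illustration; a real system would follow the reporting standards used in its jurisdiction.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class TriggerEvent(Enum):
    """The three events that can open a case investigation."""
    POSITIVE_LAB_TEST = "positive_lab_test"
    PROVIDER_REPORT = "provider_report"
    CONTACT_TRACING = "contact_tracing"


@dataclass(frozen=True)
class IncomingReport:
    trigger: TriggerEvent
    received_on: date
    source: str  # hypothetical label for the sending lab, provider, or tracer


def parse_report(raw: dict) -> IncomingReport:
    """Validate a raw electronic report.

    Raises ValueError if the trigger type is not one of the three
    recognized events or the date is not in ISO format.
    """
    return IncomingReport(
        trigger=TriggerEvent(raw["trigger"]),
        received_on=date.fromisoformat(raw["received_on"]),
        source=raw["source"],
    )
```

Rejecting unrecognized trigger types at import time is one way to keep automated imports from silently creating investigations that cannot be classified later.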

Upon receipt of a triggering event, the case investigation and contact tracing system should assign an auto-generated, unique patient identifier to the patient report. The unique IDs should not include any component of personally identifiable information (e.g., date of birth, patient initials). The system should allow new investigations to be assigned to contact tracing staff, with clear modules for the investigation. Module components that should be considered include:

  • Patient locating and sociodemographic information (e.g., date of birth, race/ethnicity, residential address) and COVID-19-specific information (e.g., symptoms, date of symptom onset, date of SARS-CoV-2 testing, test results, hospitalizations, co-morbid conditions).
  • Patient risk assessment (e.g., specific people the patient had close contact with during the contact elicitation window, and community locations the patient visited where they may have exposed others, such as a supermarket, workplace, or public transportation). Greater specificity in the information collected can greatly improve the effectiveness of contact tracing, so the system should be flexible enough to allow for text field entry but structured enough that frequency distributions of locations and people can be quickly obtained.
  • Named contacts should be captured on a separate module which includes all information provided by the patient, as well as additional risk assessment information for the contact (e.g., locations and close contacts they may have had during their contact elicitation window). This module should allow for the entry of specific people (Jane Doe), people with partial contact information (Doug from the neighborhood BBQ) and locations (bus or train routes, neighborhood grocery store). This module should also create a unique identifier that can be assigned to each contact using the same conventions as above, and that will allow for a contact to be linked to the case and to also turn into a patient if they are in fact infected with SARS-CoV-2. Additional information to collect from the contacts includes any symptoms of COVID-19, symptom onset dates, and dates and results of SARS-CoV-2 tests, as well as whether they were previously aware or informed of their exposure.
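The identifier conventions described above can be illustrated with a short Python sketch. It assumes a random UUID is an acceptable identifier (it is derived from no date of birth, initials, or other personal information) and that a contact keeps the same identifier when converted to a patient. The names new_record_id, Contact, named_by, and convert_to_case are hypothetical, not part of any particular system.

```python
import uuid
from dataclasses import dataclass, field


def new_record_id() -> str:
    """Return a random 32-character hex identifier (UUID version 4).

    The value is generated randomly, so it carries no personal
    identifying information.
    """
    return uuid.uuid4().hex


@dataclass
class Contact:
    description: str   # e.g. "Jane Doe" or "Doug from the neighborhood BBQ"
    named_by: str      # record ID of the patient who named this contact
    record_id: str = field(default_factory=new_record_id)
    is_case: bool = False

    def convert_to_case(self) -> None:
        """Mark the contact as a patient while keeping the same record ID,
        so the link back to the naming patient is preserved."""
        self.is_case = True


# Usage: a patient names a contact who later tests positive.
patient_id = new_record_id()
doug = Contact(description="Doug from the neighborhood BBQ", named_by=patient_id)
doug.convert_to_case()
```

Keeping one identifier per person, rather than issuing a second one on conversion, is a design choice made here for simplicity; a system that issues separate contact and patient IDs would need to store the mapping between them.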

In addition, every case and contact form should have a public health ID variable for program evaluation purposes.

Management of COVID-19 investigations will quickly become complicated as the number of case reports increases.  The data system must be relational in nature and be able to link multiple individuals to many other individuals in the system.  Assigning each individual in the system a unique ID will help support this.  However, separate tables will be needed to account for the complex interconnectedness of many of the investigations.
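One way to picture the separate tables mentioned above is a link table that pairs each case with each of its contacts, so that one case can have many contacts and one contact can be named by many cases. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names (person, exposure), column names, and the placeholder IDs "A" through "D" are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id TEXT PRIMARY KEY,           -- auto-generated ID, no personal data
    is_case   INTEGER NOT NULL DEFAULT 0  -- 1 if confirmed or probable case
);
CREATE TABLE exposure (                   -- link table: one row per case-contact pair
    case_id    TEXT NOT NULL REFERENCES person(person_id),
    contact_id TEXT NOT NULL REFERENCES person(person_id),
    PRIMARY KEY (case_id, contact_id)
);
""")

# Placeholder records: A and B are cases; C and D are contacts.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("A", 1), ("B", 1), ("C", 0), ("D", 0)])
# Contact C was named by both cases, which a single-table design could not express.
conn.executemany("INSERT INTO exposure VALUES (?, ?)",
                 [("A", "C"), ("A", "D"), ("B", "C")])


def contacts_of(case_id: str) -> list:
    """All contacts linked to one case."""
    rows = conn.execute(
        "SELECT contact_id FROM exposure WHERE case_id = ? ORDER BY contact_id",
        (case_id,))
    return [r[0] for r in rows]


def cases_naming(contact_id: str) -> list:
    """All cases that named one contact."""
    rows = conn.execute(
        "SELECT case_id FROM exposure WHERE contact_id = ? ORDER BY case_id",
        (contact_id,))
    return [r[0] for r in rows]
```

Because the link table can be queried in either direction, the same structure supports both managing an individual investigation and tracing shared exposures across investigations.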

Ideally, the data system should facilitate many-to-many relationship mapping between identified cases and contacts in order to support data analysis and source-spread mapping of COVID-19 transmission. The data system should be user-friendly, flexible and accessible by mobile device, as well as a laptop or desktop computer.  A cloud-based system will allow the greatest flexibility and ensure routine data storage, protection and updating, but unique jurisdictional laws and regulations may necessitate on-premises data storage.

Additionally, public health informatics, data management and epidemiologic staff should be able to modify the system themselves, without the need for support from the vendor or contractors. This will allow for rapid updates of modules as new data collection needs are identified, the ability to produce both canned and ad-hoc reports for management of investigations, and the ability to seamlessly integrate with other modules to perform quality assurance and more complicated network and epidemiologic analyses. The data system should also be interoperable with existing surveillance systems.

Ensuring local health authority IT and informatics support is critical. Involving relevant IT, informatics, fiscal, and leadership entities early can help overcome delays and cost overruns in implementation and maintenance of the system. Additionally, ultimate responsibility for the maintenance and upkeep of the system should be determined early to ensure the correct partners are engaged in system development. Data security and confidentiality standards should be considered and incorporated into all plans related to case investigation and contact tracing activities.