Guidelines for VSD Data Sharing Using the Research Data Center at the National Center for Health Statistics
The Vaccine Safety Datalink (VSD) Project is a collaborative project between the Centers for Disease Control and Prevention (CDC) and several large health plans. The VSD was established to evaluate vaccine safety issues in the United States by reviewing large linked databases collected at the participating sites as part of their routine health services. Additional background information regarding the VSD can be found on the Vaccine Safety Datalink (VSD) webpage.
There are several ways interested researchers can access VSD data. In 2002, the VSD established a data sharing program at the National Center for Health Statistics (NCHS) Research Data Center (RDC) to allow external Guest Analysts to (1) conduct new vaccine safety studies using VSD data files available at CDC or (2) reanalyze study-specific datasets from published VSD studies. The VSD data sharing program is a three-step process:
- Submission of proposals to CDC’s RDC at NCHS
- Submission of proposals to VSD site Institutional Review Boards (IRBs)
- Use of CDC’s RDC at NCHS
To conduct new vaccine safety studies:
To conduct a new vaccine safety study, an external Guest Analyst may use VSD data files that reside at the RDC. Requests for data for examining new hypotheses are limited to the variables found in the VSD data files (as listed in the data dictionary). Please note: data for new vaccine safety studies are available only through December 31, 2000. Therefore, the RDC will not accept proposals for new vaccine safety studies that request VSD data from after December 31, 2000. VSD sites’ data from after December 31, 2000, may be obtained for new vaccine safety studies by establishing a formal collaboration with the VSD site(s); however, such collaboration is at the discretion of the VSD site and outside the scope of the RDC data sharing program and CDC authority.
To assist Guest Analysts, CDC offers a list of recommended scientific references relevant to conducting research using large linked databases such as the VSD data files, and a data dictionary that lists all the variables in the VSD data files available for new vaccine safety research. Consistent with the Health Insurance Portability and Accountability Act (HIPAA) regulations, proposals for new vaccine safety studies should include only those specific variables that are needed to conduct the proposed analyses, with a brief explanation justifying the use of each variable.
Data collected for the VSD project and accessible through the data sharing program have been created from each participating health plan’s administrative data and were not collected solely for the purpose of scientific research. It should be noted that the quality of the VSD data for new vaccine safety studies cannot be guaranteed, and data quality issues cannot be resolved at the RDC.
To reanalyze study-specific datasets from published VSD studies:
External Guest Analysts who would like to reanalyze a VSD study published by VSD investigators may request the final dataset for the specific study they wish to reanalyze. The final datasets of published studies may include additional variables not listed in the data dictionary referenced above; therefore, the RDC will provide the external Guest Analyst with the necessary data dictionary for the requested dataset. Neither additional sources of “raw” data nor earlier versions of the final dataset are available for reanalysis of published VSD studies.
Most VSD studies published after the establishment of the CDC data sharing program in August 2002 have datasets available for re-analysis. Datasets from some of the earlier published VSD studies may not be available for re-analysis.
All proposals requesting use of VSD data must contain the following information:
- Project title
- Name of proposed investigator and collaborators (per RDC rules, a maximum of three persons may use a workstation)
- Name, address, telephone number, and email address of point-of-contact
- Summary of proposed study (including background, reasons for conducting the study, public health benefits)
- Specific hypothesis of vaccine safety study to be investigated or title of published VSD study to be reanalyzed
- Proposed methodology for new vaccine safety study to be investigated or the specification of methods used in published study. Abstracts of published studies will not be accepted as proposed methodology for new studies or reanalyses. Description of methodology must include:
- Definition of the study population of interest
- Age of the population required (child or adult data are available: 0-17 or 18+)
- Study years of interest (e.g., 199X-2000). Please note that the study years available vary by VSD site
- Approach to select the study population from the data files, based on available fields in the VSD data dictionary
- Type of study to be conducted
- Descriptive studies: specify the variables and values for those variables to be used to select the study population
- Case-control studies: specify criteria for cases and controls
- Cohort studies: specify criteria for exposed and unexposed population
- Specification of the variables that will be required, including:
- Exposures: specify the exposures to be studied. Specific criteria defining exposures based on the VSD data dictionary should be included. For instance, specific vaccines given within 14 days of the outcome of interest
- Outcomes of interest: specify the outcomes to be studied. Specific criteria defining those outcomes based on the VSD data dictionary should be included. For instance, ICD-9 codes for outcomes of interest and type of health care encounter (hospitalization, outpatient encounter, emergency room visit)
- Person-time or enrollment: Specify criteria to determine the calculation of person-time, follow-up time, or participating VSD site enrollment restrictions
- Confounding or control variables, including
- Demographic information
- Pre-existing or co-morbid conditions
- Concurrent vaccinations
- VSD site
- Detailed proposed analytic strategies and statistical methods. VSD data are in SAS software (SAS Institute, Cary, NC) format. Include a description of the output that the Guest Analyst intends to remove from the RDC (table shells, model equations, or test statistics). This will help the reviewers determine the risk of disclosure and plan for the disclosure review.
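As an informal aid for Guest Analysts preparing a submission, the required items above can be treated as a checklist. The following Python sketch is illustrative only and is not an RDC or CDC tool: the field names, the `proposal_problems` function, and the dictionary layout are all hypothetical. The only rules it encodes are that every listed item must be present and that new vaccine safety studies cannot request data from after December 31, 2000.

```python
# Hypothetical checklist helper; field names are illustrative assumptions,
# not an official RDC schema.
LAST_AVAILABLE_YEAR = 2000  # new-study data are available only through Dec 31, 2000

REQUIRED_FIELDS = (
    "project_title",
    "investigators",
    "point_of_contact",
    "summary",
    "hypothesis_or_published_study",
    "study_population",
    "age_group",            # "0-17" or "18+"
    "study_type",           # descriptive, case-control, or cohort
    "variables",            # exposures, outcomes, person-time, confounders
    "analysis_plan",
    "planned_output",       # table shells, model equations, test statistics
)


def proposal_problems(proposal: dict, is_new_study: bool) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing: {name}" for name in REQUIRED_FIELDS
                if not proposal.get(name)]

    years = proposal.get("study_years")  # assumed to be a (start, end) tuple
    if years is None:
        problems.append("missing: study_years")
    else:
        start, end = years
        if start > end:
            problems.append("study_years: start is after end")
        if is_new_study and end > LAST_AVAILABLE_YEAR:
            problems.append("study_years: new-study data end on December 31, 2000")
    return problems
```

A check of this kind only confirms that each item has been supplied; it says nothing about whether the proposal will satisfy the evaluation criteria described next.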
All proposals to access the VSD data through the VSD Data Sharing program at the RDC will be evaluated using the same criteria used by NCHS for other RDC studies. These criteria include:
- Completion of proposal
- Scientific and technical feasibility of the project
- Availability of resources at the RDC
- Risk of disclosure of restricted information
Determination of scientific and technical feasibility includes whether the requested data are available at the RDC (for new vaccine safety studies) or whether the requested final dataset is available (for reanalyses). If the requested data are not available (for the reasons stated above), CDC will notify the external Guest Analyst. The Guest Analyst may resubmit the proposal to use available data, if desired.
After completing the review, RDC staff will notify the external Guest Analyst whether the proposal meets the evaluation criteria. If all the requested data variables can be located for the new vaccine safety study or proposed reanalysis, review of the proposal by the appropriate VSD sites’ IRBs takes place. Submitting applications to the VSD sites’ IRBs is the Guest Analyst’s responsibility. At least two of the VSD sites’ IRBs must approve the proposal for the file to be accessed. Access by external Guest Analysts to a portion of the VSD data files or to datasets from published VSD studies requires review and approval by the appropriate IRBs of the relevant VSD sites.
Review of a proposal submitted by an external Guest Analyst by a VSD site IRB does not imply that CDC approves or endorses the external Guest Analyst’s proposed research. IRB applications may require a more detailed description of the proposed vaccine safety study and may vary according to individual IRB requirements. Furthermore, various IRBs may have different timelines for submission of proposals for review. Each IRB may have specific policies or requirements for data sharing that have not been adopted by the other VSD sites’ IRBs. These policies may include required collaboration with a VSD site investigator, fees associated with the IRB review process, or differing criteria for the IRB review process.
VSD sites’ IRBs have the responsibility to protect the confidentiality and privacy of their members’ medical records and to adhere to the rules and regulations of their respective institution(s). Consequently, each of the VSD sites’ IRBs must review any request for access to the VSD data files that contain information on its participating health plans’ members. Any appeal of an IRB decision by the requestor must follow the procedures of that IRB. CDC is not included in the participating VSD sites’ IRB process at any time.
VSD sites’ IRBs will use their established procedures and timelines to review the proposed research. Approval for access to VSD health plan data contained within the VSD data files applies solely to VSD data files that reside in the NCHS RDC. Approval does not permit the Guest Analyst to obtain additional data contained within the participating health plans’ member medical records or elsewhere.
For new vaccine safety studies, it is possible that an external Guest Analyst may receive approval for access to VSD data from some but not all relevant IRBs. If this occurs, the dataset(s) needed to conduct the new vaccine safety study will still be created, but only with data from the participating VSD sites whose IRBs approved access. For reanalysis of a published VSD study, all relevant IRBs from the VSD sites that participated in the published study must approve the proposal; therefore, if one or more IRBs do not approve access to the VSD data used in the published study, the final dataset cannot be provided and no partial datasets may be provided.
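The difference between the two cases can be summarized in a short Python sketch. This is an illustration under stated assumptions, not an NCHS procedure: the `decide_dataset` function, the `DatasetDecision` type, and the site labels are hypothetical. The sketch also assumes that the requirement of approval by at least two VSD sites’ IRBs is applied as a minimum for new studies; how that minimum interacts with reanalyses is not specified in these guidelines, so it is not modeled here.

```python
from dataclasses import dataclass

MIN_APPROVING_IRBS = 2  # "at least two of the VSD sites' IRBs must approve"


@dataclass(frozen=True)
class DatasetDecision:
    can_create: bool
    sites_included: tuple
    reason: str


def decide_dataset(study_type: str, irb_approvals: dict) -> DatasetDecision:
    """Map final IRB dispositions {site: approved?} to a dataset decision.

    study_type is "new_study" or "reanalysis" (hypothetical labels).
    """
    approved = tuple(sorted(site for site, ok in irb_approvals.items() if ok))

    if study_type == "reanalysis":
        # Every IRB of a site in the published study must approve;
        # no partial datasets are provided.
        if approved and len(approved) == len(irb_approvals):
            return DatasetDecision(True, approved, "all relevant IRBs approved")
        return DatasetDecision(False, (), "not all relevant IRBs approved")

    if study_type == "new_study":
        # Partial approval is allowed: the dataset is built from approving
        # sites only. Applying the two-IRB minimum here is an assumption.
        if len(approved) >= MIN_APPROVING_IRBS:
            return DatasetDecision(True, approved, "limited to approving sites")
        return DatasetDecision(False, (), "fewer than two IRBs approved")

    raise ValueError(f"unknown study type: {study_type!r}")
```

For example, with dispositions `{"Site A": True, "Site B": True, "Site C": False}`, a new study would proceed with data from Site A and Site B only, whereas a reanalysis of a study that used all three sites could not proceed.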
Once the external Guest Analyst has received a response from all relevant site IRBs, NCHS RDC will begin the process of creating or formatting the approved dataset(s). NCHS RDC will not create or prepare the dataset(s) until it receives copies of all final IRB dispositions along with other responses directly from the reviewing site IRBs.
In addition, each proposed investigator must submit a signed copy of the RDC Agreement Regarding Conditions of Access to Confidential Data and the RDC Researcher Affidavit of Confidentiality.
Following receipt of final IRB dispositions, RDC staff will arrange for access to the RDC as described in the general data sharing document. All rules, procedures, and fees as outlined in the general data sharing document will apply.
When an external Guest Analyst has completed work at the RDC and wishes to publish research results and findings using VSD data, the following requirements must be observed:
- External Guest Analysts must submit a copy of these data sharing guidelines with any manuscript submitted to a journal
- External Guest Analysts must submit (to the journal) a copy of the agreement signed prior to conducting research at the RDC
- Disclaimers must be included in the manuscript which state that “the research was conducted using data from the Vaccine Safety Datalink Project, through the data sharing program at the Centers for Disease Control and Prevention.” Any published material using VSD data must acknowledge CDC as the original data source
- An additional disclaimer must be included that states “the analysis, interpretations, and conclusions reported here are the responsibility of the authors and do not represent the views and opinions of the Centers for Disease Control and Prevention, the Federal Government, or participating Vaccine Safety Datalink health plans”