Human Genome Epidemiology (2nd ed.): Building the evidence for using genetic information to improve health and prevent disease

“The findings and conclusions in this book are those of the author(s) and do not
necessarily represent the views of the funding agency.”

 

These chapters were published with modifications by Oxford University Press (2010)

 


Chapter 12 - Genome-wide association studies, field synopses, and the development of the knowledge base on genetic variation and human diseases

Muin J. Khoury, Lars Bertram, Paolo Boffetta, Adam S. Butterworth, Stephen J. Chanock, Siobhan M. Dolan, Isabel Fortier, Montserrat Garcia-Closas, Marta Gwinn, Julian P. T. Higgins, A. Cecile J. W. Janssens, James M. Ostell, Ryan P. Owen, Roberta A. Pagon, Timothy R. Rebbeck, Nathaniel Rothman, Jonine L. Bernstein, Paul R. Burton, Harry Campbell, Anand P. Chokkalingam, Helena Furberg, Julian Little, Thomas R. O’Brien, Daniela Seminara, Paolo Vineis, Deborah M. Winn, Wei Yu, and John P. A. Ioannidis



Table 12-1
Trends in numbers of published articles on human genome epidemiology, meta-analyses, and genome-wide association studies and numbers of genes studied, by year, 2001–2008*
                                     No. of Articles Published
Year  No. of Genes  No. of Diseases  Total   GWAS  Meta-Analyses
2001  633           690              2,492   0     34
2002  794           855              3,196   0     45
2003  832           880              3,476   3     65
2004  1,124         1,021            4,280   0     86
2005  1,308         1,077            5,029   5     113
2006  1,502         1,109            5,364   12    155
2007  2,142         1,292            7,222   104   208
2008  3,336         1,203            7,659   134   236

*HuGE Navigator query.
Genes column does not include the numbers of studied variants per gene (difficult to obtain).
Meta-analyses also include HuGE reviews.
GWAS: Genome-wide association studies (individual genes not counted in genes column, unless featured in the paper).



Table 12-2
Key characteristics of pilot field synopses of genetic associations
                                      No. of         No. of      Threshold for        No. of Statistically       Strong§    World Wide Web
                                      Meta-Analyses  Data Sets*  Meta-Analysis†       Significant Associations‡  (Grade A)  Address
Alzheimer disease||                   228            1,072       4 data sets          53                         NA
Schizophrenia#                        118            1,179       4 data sets          24                         4
DNA repair genes and various cancers  241            1,087       2 independent teams  31                         3          www.episat.org/episat/index.php
Bladder cancer                        36             356         3 data sets          7                          1          Not yet online
Coronary heart disease                48             1,039       -                    4                          0
Preterm birth                         17             87          3 data sets          2                          0
Major depression                      22             131         3 data sets          6                          2          Not yet online

*Total number of data sets included in the meta-analyses (not including data sets that did not undergo meta-analysis).
†Authors’ prerequisite condition for conducting a meta-analysis.
‡Statistically significant (P < 0.05) by random-effects calculations on the default (per-allele) analysis (for coronary heart disease, results are based on a meta-regression model and correspond to effects in the largest studies, while for DNA repair genes, both recessive and dominant models were investigated).
§Grade AAA with regard to all three Venice criteria (18).
||Current on February 27, 2008.
#Current on April 30, 2008.
Data sets: the sum of data sets included in the meta-analyses (not including data sets that did not undergo meta-analysis); threshold: authors’ prerequisite condition for conducting meta-analysis; significant: P < 0.05 by random-effects calculations on the default (per-allele) analysis (for coronary heart disease, results are based on a meta-regression model and correspond to effects in the largest studies, while for DNA repair genes both recessive and dominant models were investigated); strong (grade A): grade AAA in all three Venice criteria; online address: Web site for deposited data sets.
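
The significance criterion used in this table (P < 0.05 by random-effects calculations on the per-allele analysis) can be illustrated with a minimal sketch. The table does not say which random-effects estimator the synopses used, so the DerSimonian-Laird method below is an assumption, as are the function name `random_effects_pool`, the input format (per-study log odds ratios with their standard errors), and the example data.

```python
import math


def random_effects_pool(log_ors, ses):
    """Pool per-study log odds ratios under a random-effects model.

    Assumption: DerSimonian-Laird is used only as a common choice;
    the source does not name its estimator.
    """
    k = len(log_ors)
    w = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    z = pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal P value
    return {
        "or": math.exp(pooled),
        "tau2": tau2,
        "p": p_value,
        "significant": p_value < 0.05,
    }


# Hypothetical data: three studies with identical per-allele log odds ratios.
result = random_effects_pool([0.2, 0.2, 0.2], [0.1, 0.1, 0.1])
print(result["or"], result["p"], result["significant"])
```

With identical study estimates the between-study variance is zero, so the random-effects result coincides with the fixed-effect one; heterogeneous inputs widen the pooled standard error.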



Table 12-3
Venice interim guidelines for assessing the credibility of cumulative evidence on genetic associations (Ioannidis et al., reference 22)
Criteria and Categories Proposed Operationalization
Amount of evidence
A: Large-scale evidence
B: Moderate amount of evidence
C: Little evidence
Thresholds may be defined on the basis of sample size, power, or false-discovery rate considerations. The frequency of the genetic variant of interest should be accounted for. As a simple rule, we suggest that category A require a sample size of more than 1,000 (total number in cases and controls, assuming a 1:1 ratio) evaluated in the least common genetic group of interest; that B correspond to a sample size of 100–1,000 evaluated in this group; and that C correspond to a sample size of less than 100 evaluated in this group (see “Discussion” section in the text and Table 12.2 for further elaboration).
Replication
A: Extensive replication including at least 1 well-conducted meta-analysis with little between-study inconsistency
B: Well-conducted meta-analysis with some methodological limitations or moderate between-study inconsistency
C: No association; no independent replication; failed replication; scattered studies; flawed meta-analysis or large inconsistency
Between-study inconsistency entails statistical considerations (e.g., defined by metrics such as I², where values of 50% and above are considered large and values of 25–50% are considered moderate inconsistency) and also epidemiologic considerations for the similarity/standardization or at least harmonization of phenotyping, genotyping, and analytical models across studies. See “Discussion” section in the text for the threshold (statistical or other) required for claiming replication under different circumstances (e.g., with or without inclusion of the discovery data in situations with massive testing of polymorphisms).
Protection from bias
A: Bias, if at all present, could affect the magnitude but probably not the presence of the association
B: No obvious bias that may affect the presence of the association, but there is considerable missing information on the generation of evidence
C: Considerable potential for or demonstrable bias that can affect even the presence or absence of the association
A prerequisite for A is that the bias due to phenotype measurement, genotype measurement, confounding (population stratification), and selective reporting (for meta-analyses) can be appraised as not being high (as shown in detail in Table 12.4)—plus, there is no other demonstrable bias in any other aspect of the design, analysis, or accumulation of the evidence that could invalidate the presence of the proposed association. In category B, although no strong biases are visible, there is no such assurance that major sources of bias have been minimized or accounted for, because information is missing on how phenotyping, genotyping, and confounding have been handled. Given that occult bias can never be ruled out completely, note that even in category A, we use the qualifier “probably.”
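
The numeric rules in this table lend themselves to a short sketch: the simple sample-size rule for amount of evidence, the I² cut-offs for between-study inconsistency, and the composite grade in which AAA counts as strong. All function names are hypothetical. Treating I² below 25% as "little" inconsistency is a reading of the table rather than a stated rule, and the grades for replication and protection from bias still require the qualitative judgment described above, so they are passed in rather than computed.

```python
def amount_of_evidence(n_least_common_group):
    """Grade by total sample size (cases plus controls) in the
    least common genetic group of interest."""
    if n_least_common_group > 1000:
        return "A"   # large-scale evidence
    if n_least_common_group >= 100:
        return "B"   # moderate amount of evidence
    return "C"       # little evidence


def i_squared(log_ors, ses):
    """I^2 in percent, derived from Cochran's Q with inverse-variance weights."""
    w = [1.0 / se ** 2 for se in ses]
    mean = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def inconsistency(i2_percent):
    """Label between-study inconsistency from I^2 (percent)."""
    if i2_percent >= 50:
        return "large"
    if i2_percent >= 25:
        return "moderate"
    return "little"  # assumption: anything below 25% counts as little


def is_strong(amount, replication, bias):
    """Strong evidence means grade A on all three Venice criteria."""
    return (amount, replication, bias) == ("A", "A", "A")


# Hypothetical example: two studies that disagree markedly.
i2 = i_squared([0.0, 0.4], [0.1, 0.1])
print(amount_of_evidence(1500), inconsistency(i2), is_strong("A", "B", "A"))
```

The statistical label covers only part of the replication criterion; harmonization of phenotyping, genotyping, and analytical models has to be assessed separately.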


Table 12-4
Some checks for retrospective meta-analyses in field synopses of genetic associations
General checks for the occurrence of or susceptibility to potential problems*
  • Small effect size (e.g., odds ratio <1.15-fold from the null value)
  • Association lost with exclusion of first study
  • Association lost with exclusion of HWE-violating studies or with adjustment for HWE
  • Evidence for small-study effect in an asymmetry regression test with proper type I error (Stat Med. 2006;25:3443–3457)
  • Evidence for excess of single studies with formally statistically significant results (Clin Trials. 2007;4:245–253)
Topic- or subject-specific checks: Consider whether they are problems
  • Unclear/misclassified phenotypes with possible differential misclassification against genotyping
  • Differential misclassification of genotyping against phenotypes
  • Major concerns for population stratification (need to justify for affecting odds ratio greater than 1.15-fold, not invoked to date)
  • Any other reason (case-by-case basis) that would render the evidence for association highly questionable

*All general checks are likely to have only modest, imperfect sensitivity and specificity for detecting problems. In particular for effect size, a small effect size may very well reflect a true association, since many genetic associations have small effect sizes. However, if this effect has been documented in a retrospective meta-analysis that is susceptible to publication and other reporting biases, it also needs to be replicated in a prospective setting where such biases cannot operate before high credibility can be attributed to it.
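
Two of the general checks above can be sketched in a few lines: flagging a small effect size (odds ratio less than 1.15-fold from the null) and identifying studies that violate Hardy-Weinberg equilibrium (HWE). The table does not say how HWE violation is tested; the 1-df chi-square goodness-of-fit test on genotype counts and the 0.05 cut-off are assumptions, as are the function names and example counts.

```python
import math


def is_small_effect(odds_ratio, fold=1.15):
    """True when the odds ratio lies less than `fold`-fold from the null (1.0)."""
    return max(odds_ratio, 1.0 / odds_ratio) < fold


def hwe_p_value(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square goodness-of-fit test for HWE.

    Assumption: the source does not specify the HWE test; this is one
    common choice, usually applied to genotype counts in controls.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical studies: (label, genotype counts in controls).
studies = [("study 1", (25, 50, 25)), ("study 2", (50, 0, 50))]
violators = [name for name, counts in studies if hwe_p_value(*counts) < 0.05]
print(violators, is_small_effect(1.10))
```

A meta-analysis would then be rerun without the flagged studies to see whether the association persists. As the footnote stresses, a flag from either check signals a need for scrutiny, not proof of a spurious association.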

 
