Human Genome Epidemiology (2nd ed.): Building the evidence for using genetic information to improve health and prevent disease
John P. A. Ioannidis and Daniela Seminara
In the 2 years since the Epidemiology article on “The Emergence of Networks in Human Genome Epidemiology: Challenges and Opportunities” was published, the contribution of consortia and networks to research in human genome epidemiology has become essential. In particular, the key role of consortia has become evident in the confirmation and validation of primary genome-wide association studies (GWAS) of common diseases and in the performance of the interdisciplinary research needed to begin translating GWAS results into clinical and public health applications (1). A few successful examples of this interactive approach are the Wellcome Trust Case Control Consortium, the Cancer Cohorts Consortium (CoCo), and the Genetic Association Information Network (GAIN), whose paradigm has been emulated in many subsequent studies (2–4).
With the advent of GWAS, it has become obvious that the successful replication of emerging association signals requires very large sample sizes. Genetic effects for discovered common variants have turned out to be even smaller than previously thought, and the power to replicate such associations is very limited, even with very large studies (5,6). Few GWAS have reached genome-wide significance for newly discovered associations using the Stage 1 data alone (7). Consensus recommendations have pointed to the need for sizeable replication studies in similar or ethnically and geographically different populations to confirm the validity of emerging associations between SNPs and disease. Further, a number of weaker associations or associations with rarer variants may still lie below the detection threshold of initial studies due to simple power considerations (8). This has led to new, larger collaborative efforts, in which data from many GWAS and/or many replication studies are merged for extensive meta-analyses, as for example in type 2 diabetes (DIAGRAM initiative) and colon cancer (9,10).
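The gain from pooling can be illustrated with a minimal sketch, assuming a fixed-effect, inverse-variance meta-analysis of per-study log odds ratios. The study estimates, the standard errors, and the genome-wide threshold of 5e-8 are hypothetical values chosen for illustration; none of them come from the consortia discussed above, and real analyses would also need to examine between-study heterogeneity.

```python
import math

# Assumed, conventional genome-wide significance threshold (not stated in the text).
GENOME_WIDE_ALPHA = 5e-8


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def fixed_effect_meta(log_ors, ses):
    """Pool per-study log odds ratios with inverse-variance weights.

    Returns the pooled log odds ratio, its standard error, and the
    two-sided p-value of the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, two_sided_p(pooled / pooled_se)


# Hypothetical input: five studies, each estimating an odds ratio of 1.15
# with a standard error of 0.04 on the log scale.
log_ors = [math.log(1.15)] * 5
ses = [0.04] * 5

single_p = two_sided_p(log_ors[0] / ses[0])
pooled, pooled_se, pooled_p = fixed_effect_meta(log_ors, ses)

print(f"single study p = {single_p:.2e}")
print(f"pooled OR = {math.exp(pooled):.3f}, pooled p = {pooled_p:.2e}")
print("single study genome-wide significant:", single_p < GENOME_WIDE_ALPHA)
print("pooled result genome-wide significant:", pooled_p < GENOME_WIDE_ALPHA)
```

Under these assumed inputs, no single study crosses the threshold on its own, whereas the pooled estimate does, because the pooled standard error shrinks with the square root of the number of equally sized studies.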
It is, therefore, fortunate that, for some common diseases, genomic data originating from more than one consortium may be available. This shows the popularity of the concept, and it may stimulate healthy competition and joint scientific efforts, whenever appropriate. Given that several consortia may develop databases independently, there is a need for some synopsis of the accumulated information, as discussed in Chapter 12. Furthermore, a worldwide collaborative consortia approach will be essential in incorporating the wealth of GWAS data into gene–gene and gene–environment interaction studies.
Availability of data and biospecimens from consortia is also an area that has seen considerable progress in the past 2 years. Several initiatives have been launched to improve on current data sharing practices (11,12). At the same time, issues of proper credit to the original investigators and protection of confidentiality need to be addressed (13).
- Khoury MJ, Bradley L. Why should genomic medicine be more evidence-based? Genomic Med. 2007;1:91–93.
- Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678.
- National Cancer Institute Web site: Cohort Consortium.
- Foundation for the National Institutes of Health Web site.
- Moonesinghe R, Khoury MJ, Liu T, et al. Required sample size and nonreplicability thresholds for heterogeneous genetic associations. Proc Natl Acad Sci USA. 2008;105(2):617–622.
- Burton PR, Hansell AL, Fortier I, et al. Size matters: just how big is BIG?: quantifying realistic sample size requirements for human genome epidemiology. Int J Epidemiol. 2009;38(1):263–273.
- Manolio TA, Brooks LD, Collins FS. A HapMap harvest of insights into the genetics of common disease. J Clin Invest. 2008;118(5):1590–1605.
- NCI-NHGRI Working Group on Replication in Association Studies, Chanock SJ, Manolio T, et al. Replicating genotype-phenotype associations. Nature. 2007;447(7145):655–660.
- Zeggini E, Scott LJ, Saxena R, et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat Genet. 2008;40(5):638–645.
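- COGENT Study, Houlston RS, Webb E, et al. Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer. Nat Genet. 2008;40(12):1426–1435.
- Mailman MD, Feolo M, Jin Y, et al. The NCBI dbGaP database of genotypes and phenotypes. Nat Genet. 2007;39(10):1181–1186.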
- GAIN Collaborative Research Group, Manolio TA, Rodriguez LL, et al. New models of collaboration in genome-wide association studies: the Genetic Association Information Network. Nat Genet. 2007;39(9):1045–1051.
- Homer N, Szelinger S, Redman M, et al. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet. 2008;4(8):e1000167.
- Zerhouni EA, Nabel EG. Protecting aggregate genomic data. Science. 2008;322:44.