Power and Sample Size Calculations for Genetic Case/Control Studies Using Gene-Centric SNP Maps: Application to Human Chromosomes 6, 21, and 22 in Three Populationsde La Vega F.M.a,1 · Gordon D.b,1 · Su X.a,2 · Scafe C.a · Isaac H.a · Gilbert D.A.a · Spier E.G.a
aApplied Biosystems, Foster City, Calif., and bLaboratory of Statistical Genetics, Rockefeller University, New York, N.Y., USA
Power and sample size calculations are critical parts of any research design for genetic association. We present a method that utilizes haplotype frequency information and average marker-marker linkage disequilibrium on SNPs typed in and around all genes on a chromosome. The test statistic used is the classic likelihood ratio test applied to haplotypes in case/control populations. Haplotype frequencies are computed through specification of genetic model parameters. Power is determined by computation of the test’s non-centrality parameter. Power per gene is computed as a weighted average of the power assuming each haplotype is associated with the trait. We apply our method to genotype data from dense SNP maps across three entire chromosomes (6, 21, and 22) for three different human populations (African-American, Caucasian, Chinese), three different models of disease (additive, dominant, and multiplicative) and two trait allele frequencies (rare, common). We perform a regression analysis using these factors, average marker-marker disequilibrium, and the haplotype diversity across the gene region to determine which factors most significantly affect average power for a gene in our data. Also, as a ‘proof of principle’ calculation, we perform power and sample size calculations for all genes within 100 kb of the PSORS1 locus (chromosome 6) for a previously published association study of psoriasis. Results of our regression analysis indicate that four highly significant factors that determine average power to detect association are: disease model, average marker-marker disequilibrium, haplotype diversity, and the trait allele frequency. These findings may have important implications for the design of well-powered candidate gene association studies. Our power and sample size calculations for the PSORS1 gene appear consistent with published findings, namely that there is substantial power (>0.99) for most genes within 100 kb of the PSORS1 locus at the 0.01 significance level.
© 2005 S. Karger AG, Basel