Vol. 69, No. 4, 2010
Issue release date: April 2010
Free Access
Hum Hered 2010;69:219–228
(DOI:10.1159/000291927)

Approaches for Evaluating Rare Polymorphisms in Genetic Association Studies

Li Q.^a · Zhang H.^b · Yu K.^b
^a Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; ^b Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Md., USA

Abstract

Most current genetic association studies, including genome-wide association studies, focus on single nucleotide polymorphisms (SNPs) with a relatively large minor allele frequency (MAF) (e.g. >5%) in the search for genetic loci underlying susceptibility to complex diseases. This strategy of focusing on common SNPs is very effective under the common-disease-common-variant (CDCV) hypothesis, which claims that common diseases are caused by common variants that have relatively small to moderate effects. Although the CDCV hypothesis has become the dogma guiding the conduct of association studies over the past decade, growing evidence from recent empirical data and simulations suggests that the causal genetic polymorphisms for common diseases, including SNPs and copy number variants (CNVs), have a wide spectrum of MAFs, ranging from rare to common. In contrast to methods for common genetic variants, statistical approaches for the analysis of rare variants have received very little attention. Methods developed for common variants usually rely on asymptotic properties, which can be inaccurate when rare variants are studied with a limited sample size. Although Fisher’s exact test can be used in such a scenario, it is usually conservative, which diminishes its usefulness to some extent. Here we propose two novel approaches for the analysis of rare genetic variants. Simulation studies and two real examples demonstrate the advantages of the proposed methods over existing methods.
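
The conservativeness of Fisher’s exact test for rare variants can be illustrated numerically. The following is a minimal sketch and not one of the two approaches proposed in the paper: it computes the two-sided Fisher p-value for a 2 × 2 carrier-by-status table and then, by enumerating every possible table, the exact probability of rejecting the null hypothesis at a nominal level of 0.05. The function names, the sample sizes (50 cases, 50 controls) and the carrier frequency (0.02) are illustrative assumptions.

```python
from math import comb


def hypergeom_pmf(k, n_cases, n_controls, n_carriers):
    """P(k of the n_carriers carriers are cases) with all table margins fixed."""
    total = comb(n_cases + n_controls, n_carriers)
    return comb(n_cases, k) * comb(n_controls, n_carriers - k) / total


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher p-value for the table [[a, b], [c, d]].

    Rows are cases/controls; columns are carriers/non-carriers.
    """
    n_cases, n_controls, n_carriers = a + b, c + d, a + c
    k_min = max(0, n_carriers - n_controls)
    k_max = min(n_cases, n_carriers)
    p_obs = hypergeom_pmf(a, n_cases, n_controls, n_carriers)
    p_value = 0.0
    for k in range(k_min, k_max + 1):
        p_k = hypergeom_pmf(k, n_cases, n_controls, n_carriers)
        # Sum every table that is no more likely than the observed one;
        # the small relative tolerance guards against floating-point ties.
        if p_k <= p_obs * (1 + 1e-7):
            p_value += p_k
    return min(1.0, p_value)


def binom_pmf(k, n, q):
    """Binomial probability of k carriers among n subjects."""
    return comb(n, k) * q**k * (1 - q) ** (n - k)


def actual_size(n_cases, n_controls, carrier_freq, alpha=0.05):
    """Exact null rejection probability when both groups share carrier_freq."""
    size = 0.0
    for a in range(n_cases + 1):
        p_a = binom_pmf(a, n_cases, carrier_freq)
        for c in range(n_controls + 1):
            if fisher_two_sided(a, n_cases - a, c, n_controls - c) <= alpha:
                size += p_a * binom_pmf(c, n_controls, carrier_freq)
    return size


# Assumed illustrative setting: 50 cases, 50 controls, 2% carrier frequency.
size = actual_size(n_cases=50, n_controls=50, carrier_freq=0.02)
print(f"nominal level 0.05, actual size {size:.2e}")
```

Under these assumed settings the printed size should come out well below 0.05, because only tables with several carriers concentrated in one group can reach significance, and such tables are very unlikely when the variant is rare.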


Key Words

  • Association test
  • CDRV
  • Rare polymorphisms

Copyright © 2010 S. Karger AG, Basel


Author Contacts

Kai Yu
Division of Cancer Epidemiology and Genetics
National Cancer Institute, National Institutes of Health
Bethesda, MD 20892 (USA)
Tel. +1 301 594 7206, Fax +1 301 402 0081, E-Mail yuka@mail.nih.gov


Article Information

Received: June 25, 2009
Accepted after revision: November 3, 2009
Published online: March 24, 2010
Number of Print Pages: 10
Number of Figures: 4, Number of Tables: 0, Number of References: 20


Publication Details

Human Heredity (International Journal of Human and Medical Genetics)

Vol. 69, No. 4, Year 2010 (Cover Date: April 2010)

Journal Editor: Devoto M. (Philadelphia, Pa.)
ISSN: 0001-5652 (Print), eISSN: 1423-0062 (Online)

For additional information: http://www.karger.com/HHE


