Hum Hered 2012;74:17–26

The Robustness of Generalized Estimating Equations for Association Tests in Extended Family Data

Suktitipat B. (a, c, d) · Mathias R.A. (b) · Vaidya D. (b) · Yanek L.R. (b) · Young J.H. (b) · Becker L.C. (b) · Becker D.M. (b) · Wilson A.F. (a) · Fallin M.D. (b, c)
(a) Genometrics Section, Inherited Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, (b) Department of Medicine, Johns Hopkins Medical Institutions, and (c) Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Md., USA; (d) Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand

Key Words

  • Generalized estimating equation
  • Variance components analysis
  • Family-based association study
  • Genome-wide scan

Abstract

Variance components analysis (VCA), the traditional method for handling within-family correlations in genetic association studies, is computationally intensive for genome-wide analyses, and its computational burden increases with family size and the number of genetic markers. Alternative approaches that do not require computing familial correlations are preferable, provided that they neither inflate type I error nor reduce power. We performed a simulation study to evaluate practical alternatives to VCA that use regression with generalized estimating equations (GEE) in extended family data. We compared two GEE approaches with a variance components likelihood-based method (FastAssoc): linear regression with GEE applied to the entire extended family structure (GEE-EXT), and GEE applied to nuclear families split from these extended families (GEE-SPL). GEE-EXT was evaluated with and without a robust variance estimator for the standard errors. Average type I error rates from GEE-EXT and FastAssoc were similar to those from GEE-SPL. Type I error rates for GEE-EXT with a robust variance estimator were marginally higher than the nominal rate when the minor allele frequency (MAF) was <0.1, but were close to the nominal rate when the MAF was ≥0.2. All methods gave consistent effect estimates and had similar power. In summary, the GEE framework with the robust variance estimator, which was the computationally fastest approach and required the least data management, appears to work well in extended families and thus provides a reasonable alternative to full variance components approaches for extended pedigrees in a genome-wide association study setting.
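The idea of a robust variance estimator in a GEE fit can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an independence working correlation (the abstract does not say which working correlation was used), a single SNP coded as 0/1/2 with no other covariates, and a toy null simulation in which genotypes are drawn independently within a family, which ignores Mendelian transmission. The names `gee_independence` and `simulate_null` are hypothetical. Under these assumptions the point estimates equal ordinary least squares, and the robust (Huber-White sandwich) standard error is built from per-family sums of score contributions.

```python
import math
import random


def gee_independence(y, g, family):
    """Fit y = b0 + b1*g by GEE with an independence working correlation.

    Returns (b0, b1, naive_se_b1, robust_se_b1). The robust standard error
    uses the cluster (family) sandwich estimator without small-sample
    corrections.
    """
    n = len(y)
    # "Bread": B = X'X for the design matrix X = [1, g].
    sg = sum(g)
    sgg = sum(v * v for v in g)
    sy = sum(y)
    sgy = sum(gi * yi for gi, yi in zip(g, y))
    det = n * sgg - sg * sg
    if det == 0:
        raise ValueError("genotype has no variation; slope is not estimable")
    inv = [[sgg / det, -sg / det], [-sg / det, n / det]]  # B^{-1}

    # With independence working correlation the estimate equals OLS.
    b0 = inv[0][0] * sy + inv[0][1] * sgy
    b1 = inv[1][0] * sy + inv[1][1] * sgy
    resid = [yi - b0 - b1 * gi for yi, gi in zip(y, g)]

    # Naive (model-based) variance assumes independent observations.
    sigma2 = sum(r * r for r in resid) / (n - 2)
    naive_se = math.sqrt(sigma2 * inv[1][1])

    # "Meat": per-family score sums s_f = X_f' r_f (a 2-vector per family).
    scores = {}
    for r, gi, f in zip(resid, g, family):
        s = scores.setdefault(f, [0.0, 0.0])
        s[0] += r
        s[1] += r * gi

    # Var(b1) = sum over families of (row 2 of B^{-1} times s_f)^2.
    a, b = inv[1]
    robust_var = sum((a * s[0] + b * s[1]) ** 2 for s in scores.values())
    return b0, b1, naive_se, math.sqrt(robust_var)


def simulate_null(n_families, family_size, maf, rng):
    """Toy null data: a shared family effect on y, no genotype effect."""
    y, g, fam = [], [], []
    for f in range(n_families):
        u = rng.gauss(0.0, 1.0)  # shared family effect induces correlation
        for _ in range(family_size):
            geno = (rng.random() < maf) + (rng.random() < maf)  # 0, 1 or 2
            g.append(float(geno))
            y.append(u + rng.gauss(0.0, 1.0))
            fam.append(f)
    return y, g, fam


rng = random.Random(2012)
y, g, fam = simulate_null(n_families=200, family_size=8, maf=0.3, rng=rng)
b0, b1, naive_se, robust_se = gee_independence(y, g, fam)
print(f"b1={b1:.4f} naive_se={naive_se:.4f} robust_se={robust_se:.4f}")
```

Repeating the simulation many times and counting how often |b1 / robust_se| exceeds the critical value would give an empirical type I error rate comparable in spirit to the quantity reported above, though not with the pedigree structures or genotype model used in the study.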

Copyright © 2012 S. Karger AG, Basel


Author Contacts

M. Daniele Fallin, PhD
615 N. Wolfe Street
Room 6509
Baltimore, MD 21205 (USA)
Tel. +1 410 955 3643

Article Information

Received: October 25, 2011
Accepted after revision: July 4, 2012
Published online: October 3, 2012
Number of Print Pages: 10
Number of Figures: 4, Number of Tables: 5, Number of References: 20
Additional supplementary material is available online (Number of Parts: 1)

Publication Details

Human Heredity (International Journal of Human and Medical Genetics)

Vol. 74, No. 1, Year 2012 (Cover Date: November 2012)

Journal Editor: Devoto M. (Philadelphia, Pa./Rome)
ISSN: 0001-5652 (Print), eISSN: 1423-0062 (Online)
