Hum Hered 2008;66:87–98

Likelihood-Based Association Analysis for Nuclear Families and Unrelated Subjects with Missing Genotype Data

Dudbridge F.
MRC Biostatistics Unit, Cambridge, UK

Key Words

  • Conditional likelihood
  • Family-based association tests
  • Missing data
  • Population stratification
  • Transmission/disequilibrium test
  • Unphased genotype data

Abstract

Missing data occur in genetic association studies for several reasons, including missing family members and uncertain haplotype phase. Maximum likelihood is a commonly used approach for accommodating missing data, but it can be difficult to apply to family-based association studies because of possible loss of robustness to confounding by population stratification. Here a novel likelihood for nuclear families is proposed, in which distinct sets of association parameters are used to model the parental genotypes and the offspring genotypes. This approach is robust to population structure when the data are complete, and has only minor loss of robustness when there are missing data. It also allows a novel conditioning step that gives valid analysis for multiple offspring in the presence of linkage. Unrelated subjects are included by regarding them as the children of two missing parents. Simulations and theory indicate similar operating characteristics to TRANSMIT, but with no bias with missing data in the presence of linkage. In comparison with FBAT and PCPH, the proposed model is slightly less robust to population structure but has greater power to detect strong effects. In comparison with APL and MITDT, the model is more robust to stratification and can accommodate sibships of any size. The methods are implemented for binary and continuous traits in the software UNPHASED, available from the author.
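The idea of treating an unrelated subject as the child of two missing parents can be pictured with a small data-structure sketch. This is only an illustration under assumed names: `NuclearFamily`, `Offspring`, `family_from_unrelated` and `has_missing_parent` are hypothetical and are not taken from UNPHASED, and the genotype encoding (a pair of allele labels, with `None` for missing) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed encoding: a genotype is an unordered pair of allele labels,
# and None marks a subject who was not genotyped.
Genotype = Optional[Tuple[str, str]]


@dataclass
class Offspring:
    genotype: Genotype
    trait: float  # binary traits can be coded 0/1, continuous traits as-is


@dataclass
class NuclearFamily:
    father: Genotype = None
    mother: Genotype = None
    offspring: List[Offspring] = field(default_factory=list)


def family_from_unrelated(genotype: Genotype, trait: float) -> NuclearFamily:
    """Wrap an unrelated subject as a one-child family with both parents missing."""
    return NuclearFamily(offspring=[Offspring(genotype, trait)])


def has_missing_parent(family: NuclearFamily) -> bool:
    """True when at least one parental genotype is unobserved."""
    return family.father is None or family.mother is None


# Hypothetical data: one complete trio and one unrelated case.
trio = NuclearFamily(father=("A", "G"), mother=("G", "G"),
                     offspring=[Offspring(("A", "G"), 1.0)])
unrelated_case = family_from_unrelated(("A", "A"), 1.0)
```

With one representation for both kinds of subject, a single routine could in principle loop over families and handle missing parental genotypes uniformly; how the likelihood itself treats those missing genotypes is described in the paper and is not reproduced here.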

Copyright © 2008 S. Karger AG, Basel

Author Contacts

Frank Dudbridge
MRC Biostatistics Unit, Institute for Public Health
Robinson Way, Cambridge CB2 0SR (UK)
Tel. +44 1223 330 300, Fax +44 1223 330 388

Article Information

Published online: March 31, 2008
Number of Print Pages: 12
Number of Figures: 1, Number of Tables: 4, Number of References: 42

Publication Details

Human Heredity (International Journal of Human and Medical Genetics)

Vol. 66, No. 2, Year 2008 (Cover Date: March 2008)

Journal Editor: Devoto M. (Philadelphia, Pa.)
ISSN: 0001-5652 (Print), eISSN: 1423-0062 (Online)
