Haplotyping Methods for Pedigrees
Gao G. (a) · Allison D.B. (a) · Hoeschele I. (b)
(a) Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Ala., USA; (b) Virginia Bioinformatics Institute and Department of Statistics, Virginia Tech, Blacksburg, Va., USA
Virginia Bioinformatics Institute, Virginia Tech
Blacksburg, VA 24061-0477 (USA)
Tel. +1 540 231 3135, Fax +1 540 231 2606
Haplotypes provide valuable information in the study of diseases, complex traits, population histories, and evolutionary genetics. With the dramatic increase in the number of available single nucleotide polymorphism (SNP) markers, haplotype inference (haplotyping) using observed genotype data has become an important component of genetic studies in general and of statistical gene mapping in particular. Existing haplotyping methods include (1) population-based methods, (2) methods for pooled DNA samples, and (3) methods for family and pedigree data. The methods and computer programs for population data and pooled DNA samples were reviewed recently in the literature. As several authors noted, family and pedigree datasets are abundant and have unique advantages. In the past twenty years, many haplotyping methods for family and pedigree data have been developed. Therefore, in this contribution we review haplotyping methods and the corresponding computer programs suitable for family and pedigree data and discuss their applications and limitations. We explore the connections among these methods, and describe the challenges that remain to be addressed.
© 2009 S. Karger AG, Basel