A 4,103 marker integrated physical and comparative map of the horse genome
Raudsepp T.a · Gustafson-Seabury A.a · Durkin K.a · Wagner M.L.b · Goh G.a · Seabury C.M.a · Brinkmeyer-Langford C.a · Lee E-J.a · Agarwala R.c · Stallknecht-Rice E.c · Schäffer A.A.c · Skow L.C.a · Tozaki T.d · Yasue H.e · Penedo M.C.T.f · Lyons L.A.g · Khazanehdari K.A.h · Binns M.M.i · MacLeod J.N.j · Distl O.k · Guérin G.l · Leeb T.m · Mickelson J.R.b · Chowdhary B.P.a
aDepartment of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX bDepartment of Veterinary Biosciences, University of Minnesota, St. Paul, MN cNCBI, NIH, DHHS, Bethesda, MD (USA); dLaboratory of Racing Chemistry, Utsunomiya, eNational Institute of Agrobiological Sciences, Tsukuba (Japan); fVeterinary Genetics Laboratory, and gDepartment of Population Health & Reproduction, University of California, Davis, CA (USA) hCentral Veterinary Research Laboratory, Dubai (UAE); iRoyal Veterinary College, London (UK) jGluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, KY (USA) kInstitute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Hannover (Germany) lINRA, Centre de Recherche de Jouy, Jouy-en-Josas (France); mInstitute of Genetics, University of Berne, Berne (Switzerland)
A comprehensive second-generation whole genome radiation hybrid (RH II), cytogenetic and comparative map of the horse genome (2n = 64) has been developed using the 5000rad horse × hamster radiation hybrid panel and fluorescence in situ hybridization (FISH). The map contains 4,103 markers (3,816 RH; 1,144 FISH) assigned to all 31 pairs of autosomes and the X chromosome. The RH maps of individual chromosomes are anchored and oriented using 857 cytogenetic markers. The overall resolution of the map is one marker per 775 kilobase pairs (kb), which represents a more than five-fold improvement over the first-generation map. The RH II incorporates 920 markers shared jointly with the two recently reported meiotic maps. Consequently the two maps were aligned with the RH II maps of individual autosomes and the X chromosome. Additionally, a comparative map of the horse genome was generated by connecting 1,904 loci on the horse map with genome sequences available for eight diverse vertebrates to highlight regions of evolutionarily conserved syntenies, linkages, and chromosomal breakpoints. The integrated map thus obtained presents the most comprehensive information on the physical and comparative organization of the equine genome and will assist future assemblies of whole genome BAC fingerprint maps and the genome sequence. It will also serve as a tool to identify genes governing health, disease and performance traits in horses and assist us in understanding the evolution of the equine genome in relation to other species.
© 2008 S. Karger AG, Basel
High-resolution gene maps are essential for understanding the structure and organization of a genome, determining the location and relative order of genes and markers on chromosomes, obtaining detailed comparative information in relation to other genomes, and isolating gene(s) governing traits of interest. In horses, traits of interest range from those governed by a single gene (e.g., coat color and a number of inherited disorders) to complex traits controlled by the interaction of several genes (e.g., allergies, disease resistance, athletic performance, reproduction, fertility). Horse (Equus caballus, ECA; 2n = 64) whole-genome (WG) maps reported to date are low to medium density and contain ∼700–800 markers distributed on various autosomes and the X chromosome. These maps include the first-generation WG radiation hybrid (RH) and comparative map (Chowdhary et al., 2003, denoted below as RH I), the latest iterations of the two linkage maps (IHRFP – Penedo et al., 2005; AHT – Swinburne et al., 2006), and cytogenetic maps (Milenkovic et al., 2002; Perrocheau et al., 2006). Though all these maps have been successfully used in the recent past to isolate genes governing some monogenic traits and to detect the mutation/variation responsible for the phenotype (see Chowdhary and Raudsepp, 2008), their resolution is not sufficient to study the genetics of complex traits.
In recent years, medium to high density WG or single chromosome RH maps with a resolution of about 1 marker per megabase (Mb) have been generated for a range of livestock and pet species including cattle (Everts-van der Wind et al., 2004, 2005; Itoh et al., 2005; Jann et al., 2006; McKay et al., 2007), pig (Hamasima et al., 2003; Meyers et al., 2005), dog (Breen et al., 2004), etc. These maps are facilitating identification of genes for various traits in different species and are being used to compare genomes of distantly related mammals and study chromosome evolution (Murphy et al., 2005). The maps have also been instrumental in integrating synteny, cytogenetic, and genetic linkage information into a single linearly ordered map and have been useful in assembling the emerging WG sequence information (Rowe et al., 2003; Kwitek et al., 2004; Meyers et al., 2005; Jann et al., 2006; Snelling et al., 2007). Medium- to high-resolution gene maps have been reported for some of the 31 pairs of equine autosomes and the X chromosome (Lee et al., 2004; Raudsepp et al., 2004; Brinkmeyer-Langford et al., 2005; Gustafson-Seabury et al., 2005; Dierks et al., 2006; Wagner et al., 2006; Goh et al., 2007). Since then we have added markers to all chromosomes and produced a high-resolution second-generation map of the entire equine genome (denoted below as RH II), excluding the Y chromosome. This map should serve as a valuable tool for many types of equine genome analysis.
Material and methods
Markers for RH mapping were developed using equine genome resources available from UCSC (http://genome.ucsc.edu/), NCBI (http://www.ncbi.nlm.nih.gov/), HorseMap (http://locus.jouy.inra.fr/cgi-bin/lgbc/mapping/horsemap/intro2.pl/), horse BES (BAC end sequences) databases (http://www.tiho-hannover.de/einricht/zucht/hgp/index.htm), and from published literature. Additionally, a number of gene specific markers were generated from conserved regions of orthologous mammalian genes using alignment (Chenna et al., 2003) of sequences from multiple species. The orthologous genes were chosen from the human genome sequence map at ∼1 Mb intervals as described earlier (Lee et al., 2004; Raudsepp et al., 2004; Goh et al., 2007).
Primers were designed with Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi) or were obtained from publications. All PCR products amplified using heterologous primers were validated by sequencing. Details about all markers included in this study are available in Supplementary Table 1 (for Supplementary Material, see www.karger.com/doi/10.1159/000151313). All markers were genotyped in duplicate on the WG 5000rad horse × hamster RH panel (Chowdhary et al., 2002), resolved on 2% agarose gels and scored manually as described previously (Chowdhary et al., 2003; Brinkmeyer-Langford et al., 2005; Gustafson-Seabury et al., 2005; Wagner et al., 2006; Goh et al., 2007). Genotyping information available from previously published RH maps (Chowdhary et al., 2003; Raudsepp et al., 2004; Brinkmeyer-Langford et al., 2005; Gustafson-Seabury et al., 2005; Wagner et al., 2006; Goh et al., 2007) was included as part of the input for the map computation.
Computations to analyze the genome-wide genotyping data and to construct RH maps for individual chromosomes were performed using the rh_tsp_map (Agarwala et al., 2000; Schäffer et al., 2007; ftp://ftp.ncbi.nih.gov/pub/agarwala/rhmapping/rh_tsp_map.tar), CONCORDE (Applegate et al., 2006; http://www.isye.gatech.edu/~wcook/rh/), and Qsopt (http://www.isye.gatech.edu/~wcook/qsopt) software packages using the same procedures as described in a recently published WG RH map for the cat (Murphy et al., 2007) and in the rh_tsp_map tutorial, but with increased automation (Schäffer et al., 2007). We used a consensus of three formulations of the maximum-likelihood (MLE) criterion (Agarwala et al., 2000).
Markers were assigned to linkage groups by two-point analysis with a LOD score threshold of 7.6. Thresholds with one digit after the decimal between 7 and 8 were considered and LOD 7.6 gave the best balance between the competing objectives of: i) discarding fewer markers with inter-chromosomal scores above the threshold, and ii) unifying more linkage groups. Markers that did not have a score ≥7.6 with any other marker were removed from further analysis. The MLE-consensus maps passed a flips test at LOD threshold 0.5. Markers dropped from the MLE-consensus map were placed in an interval between consecutive markers if the best placement was at least 0.1 LOD units better than second best; multiple markers placed in the same interval passed a flips test. The MLE-consensus markers and placed markers were assigned cR positions by solving instances of a restricted traveling salesman problem. Remaining markers were binned if their best placements spanned at most three adjacent MLE-consensus intervals. The order and orientation of linkage groups on a chromosome were primarily determined by FISH and further verified using available genetic linkage maps (Penedo et al., 2005; Swinburne et al., 2006). Detailed information regarding the RH map for each chromosome (linkage group size, map distances, MLE-consensus, placements, binned markers) is available in Supplementary Table 2.
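The grouping step described above can be illustrated with a minimal sketch. It assumes that two-point LOD scores are already available as a dictionary keyed by marker pairs, and it joins markers by single-linkage clustering at the stated threshold of 7.6. The function name, the marker names, and the scores are hypothetical. This is not the rh_tsp_map implementation, and it omits the check for markers with high inter-chromosomal scores.

```python
from collections import defaultdict

LOD_THRESHOLD = 7.6  # two-point threshold stated in the text


def linkage_groups(markers, pairwise_lod, threshold=LOD_THRESHOLD):
    """Single-linkage clustering of markers on two-point LOD scores.

    markers: iterable of marker names (hypothetical names in the demo).
    pairwise_lod: dict mapping (marker_a, marker_b) -> LOD score.
    Returns (groups, unlinked); unlinked markers have no score >= threshold
    with any other marker and would be removed from further analysis.
    """
    markers = list(markers)
    parent = {m: m for m in markers}

    def find(m):
        # Union-find lookup with path halving.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    linked = set()
    for (a, b), lod in pairwise_lod.items():
        if lod >= threshold:
            linked.update((a, b))
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for m in markers:
        if m in linked:
            groups[find(m)].append(m)
    unlinked = [m for m in markers if m not in linked]
    return sorted(sorted(g) for g in groups.values()), unlinked


if __name__ == "__main__":
    demo_markers = ["M1", "M2", "M3", "M4", "M5", "M6"]
    demo_lod = {("M1", "M2"): 9.1, ("M2", "M3"): 7.6,
                ("M3", "M4"): 5.0, ("M4", "M5"): 8.2}
    print(linkage_groups(demo_markers, demo_lod))
```

In the demo, M3 and M4 stay in separate groups because their score of 5.0 is below the threshold, and M6 is reported as unlinked. Raising the threshold splits groups, while lowering it merges them, which is the trade-off the text describes.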
The CHORI-241 BAC library was used to isolate clones containing markers pertinent for anchoring, ordering and orienting RH groups. Library screening by PCR and BAC DNA isolation followed procedures described earlier (Chowdhary et al., 2003). The BACs were individually labeled with biotin and/or digoxigenin and hybridized in pairs or triplets to horse metaphase or interphase chromosomes. DNA labeling, in situ hybridization, signal detection, microscopy, and image analysis were performed as previously described (Chowdhary et al., 2003).
Comparative information for equine orthologs of human, chimpanzee, dog, cattle, mouse, rat, opossum, and chicken genes was retrieved from the UCSC Genome Browser (http://genome.ucsc.edu/). Homology between regions flanking equine microsatellites and the human genome sequence was obtained from published papers (Tozaki et al., 2007) and by BLAST. Homologies between human genome and equine BES were retrieved from Leeb et al. (2006) and http://www.tiho-hannover.de/einricht/zucht/hgp/index.htm. Blocks of conserved synteny and conserved linkage were defined as described earlier (Nadeau and Sankoff, 1998; Chowdhary et al., 2003, see legends for Supplementary Figs. 1.1–1.X). Comparative positions of centromeres and telomeres were retrieved from available sequence and cytogenetic maps as described below.
Comparative information was retrieved from the UCSC Genome Bioinformatics website (http://genome.ucsc.edu) using the following builds for each species: human – NCBI Build 36.1, browser March 2006; chimpanzee – panTro2 Build 2 v1, browser March 2006; dog – canFam2 v2.0, browser May 2005; cattle – Baylor release Btau_4.0, browser September 2007; mouse – mm9 NCBI Build 37, browser July 2007; rat – rn4 version 3.4, browser November 2004; opossum – monDom4, browser January 2006; and chicken – galGal3 v2.1 draft assembly, browser May 2006. Sequence maps of individual species were used to identify Mb positions of the equine orthologs. Conserved syntenies and conserved linkages were manually demarcated following the convention laid out by Nadeau and Sankoff (1998) and further explained by us (Chowdhary et al., 2003). A minimum of three markers sharing the same order in two species was considered as linked order shared between them (i.e., conserved linkage). Within blocks of conserved linkages, flips up to 5 Mb were not considered as breakage in conservation because such variations could be attributed to, e.g., statistical constraints, assembly errors, and even marginal genotyping errors. Centromere positions of biarmed chromosomes were retrieved from combined cytogenetic and sequence maps and were available only for human and chimpanzee. Centromere positions for acrocentric chromosomes in cattle, dog, mouse, and rat were determined as the lowest Mb position on their sequence map. Centromere positions were not available for opossum and chicken and for chimpanzee chromosomes that have rearrangements compared to their human counterpart (Supplementary Table 3). Locations of telomeres were derived from comparative marker(s) located at or closest to 0 Mb and the highest Mb position on the sequence map for individual chromosomes in each species.
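The demarcation was done manually in this study. The sketch below only illustrates how the two stated rules (at least three markers in shared order, and flips of up to 5 Mb tolerated) could be applied automatically. It assumes the markers are listed in horse map order with their Mb positions on a single chromosome of the compared species. Treating a "flip" as a step back of at most 5 Mb from the furthest position reached is an assumption made for illustration, and all names and positions are hypothetical.

```python
def conserved_linkage_blocks(ordered, min_markers=3, tol_mb=5.0):
    """Split markers (in horse map order) into runs of consistent order.

    ordered: list of (marker_name, position_mb) on ONE chromosome of the
    compared species. Returns a list of (orientation, [marker names]),
    where orientation is "+", "-", or "?" (all markers within tol_mb).
    """
    labels = {1: "+", -1: "-", 0: "?"}
    blocks = []
    run, direction, lo, hi, extreme = [], 0, None, None, None

    def close():
        # Keep the current run only if it meets the minimum marker count.
        if len(run) >= min_markers:
            blocks.append((labels[direction], [name for name, _ in run]))

    for name, pos in ordered:
        if not run:
            run, direction, lo, hi = [(name, pos)], 0, pos, pos
            continue
        if direction == 0:
            # Orientation is undecided while the run spans <= tol_mb.
            new_lo, new_hi = min(lo, pos), max(hi, pos)
            if new_hi - new_lo <= tol_mb:
                lo, hi = new_lo, new_hi
            else:
                direction = 1 if pos > hi else -1
                extreme = pos
            run.append((name, pos))
        elif direction * (pos - extreme) >= -tol_mb:
            # Same orientation, or a tolerated step back of <= tol_mb.
            if direction * (pos - extreme) > 0:
                extreme = pos
            run.append((name, pos))
        else:
            # Order is broken: close the run and start a new one here.
            close()
            run, direction, lo, hi = [(name, pos)], 0, pos, pos
    close()
    return blocks


if __name__ == "__main__":
    demo = [("g1", 10), ("g2", 12), ("g3", 15), ("g4", 14), ("g5", 30),
            ("g6", 80), ("g7", 70), ("g8", 60), ("g9", 50)]
    for orientation, names in conserved_linkage_blocks(demo):
        print(orientation, names)
```

In the demo, the 1 Mb step back from g3 to g4 is tolerated, so g1 to g6 form one ascending block, and g7 to g9 form a second block in the opposite orientation.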
Results and discussion
Markers and retention frequencies (RF). A total of 4,493 markers were genotyped on the 5,000rad Equine WG RH-panel. During and following two-marker LOD score computations, 677 markers (15%) were discarded for one of the following reasons: i) marker retention frequency was below the designated threshold of 5%; ii) genotyping results were inconsistent between duplicate typing with the same marker; iii) the same marker was genotyped by different research groups under alias names; iv) markers had high LOD scores with other markers on at least two distinct chromosomes suggesting that primers do not recognize unique sequences; v) markers had no LOD score ≥7.6 with any other marker; and vi) markers could not be reliably assigned to a multi-marker bin relative to the framework maps. The final RH map contains 3,816 markers (see Table 1 and Supplementary Table 4 and Supplementary Fig. 2.1–2.X for details) of which 1,917 are on maximum likelihood (MLE)-consensus (a.k.a. framework) maps, 1,311 are placed in relation to the MLE-consensus markers, and 588 are binned in intervals spanning at most four MLE-consensus markers (Supplementary Tables 2 and 4).
Table 1. Chromosome-wise information on selected parameters of the integrated RH and comparative map
The average retention frequency (RF) of markers in the panel is 19% (Supplementary Table 5). This ranges from 10.4% for ECA1 to 39.4% for ECA11, which contains the selectable TK1 marker preferentially retained in all cell hybrids. Markers with low RF are mainly found on the larger chromosomes, viz. ECA1, ECA17 and ECAX, while markers with RF above the genome average are present mainly on small chromosomes such as ECA29 and ECA30. This suggests that irradiation-induced breakages were fewer in the smaller chromosomes than the rest of the genome. Along the length of the chromosomes, the retention of markers is also slightly higher in the pericentromeric and telomeric region of many chromosomes, a trend that was also seen in RH I.
LOD score computations for all pairs of markers and subsequent single-linkage clustering partitioned the markers into 102 RH groups distributed over all horse autosomes and the X chromosome (Table 1, Supplementary Fig. 1.1–1.X). On average, there are three RH groups per chromosome. While ECA14, 19, 22, 25, 26, 28, 29, 30, and 31 (Fig. 1) contain only one RH group each, ECA1 and ECA17 have 13 and 10 RH groups, respectively, which coincides with their lowest overall retention frequency (Supplementary Table 5). A break in the RH groups is typically observed at the centromeres of biarmed chromosomes, except for ECA11 and ECAX (Supplementary Fig. 1.11 and 1.X). The large number of RH groups is influenced by regions of low RF as well as our decision to include ∼25 small RH groups to avoid gaps in coverage. The overall size of the map calculated as the sum of the 102 RH group lengths is 38,361 cR. Considering the physical size of the horse genome to be somewhere between 2,462 Mb (UCSC EquCab1, http://genome.ucsc.edu/) and 2,952 Mb (Chowdhary et al., 2003; Supplementary Table 4), 1 cR in the 5000rad equine map correlates on average to ∼64–76 kb.
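The conversion between centiRays and kilobases quoted above follows from dividing each genome size estimate by the total map length. A minimal sketch of that arithmetic, using only the figures given in the text (the variable and function names are illustrative):

```python
TOTAL_MAP_CR = 38_361  # sum of the 102 RH group lengths, in cR

# Genome size estimates quoted in the text, in Mb.
GENOME_MB = {
    "UCSC EquCab1": 2_462,
    "Chowdhary et al. (2003)": 2_952,
}


def kb_per_cr(genome_mb, map_cr=TOTAL_MAP_CR):
    """Average physical length (kb) corresponding to 1 cR."""
    return genome_mb * 1000 / map_cr


if __name__ == "__main__":
    for source, size_mb in GENOME_MB.items():
        print(f"{source}: {kb_per_cr(size_mb):.1f} kb per cR")
```

The two estimates give roughly 64 kb and 77 kb per cR, close to the approximate range stated above.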
Fig. 1. Integrated map of ECA31: RH II (middle), cytogenetic map (left), and comparison with sequence maps of eight vertebrate species (right). Detailed legend and full size map for ECA31 are available in legends for Supplementary Fig. 1.1–1.X and in Supplementary Fig. 1.31, respectively.
Distribution and density of markers. RH II has an average density of 1 marker/775 kb when all 3,816 markers are included, and 1 marker/915 kb for the 3,228 markers assigned a cR position (using the genome length estimate of 2,952 Mb from Chowdhary et al. (2003) for ease of comparison). The marker density is highest on ECA22 with 1 marker/540 kb and lowest on ECA25 with 1 marker/1,330 kb (Table 1). The current map provides a greater than five-fold improvement compared to RH I, where the average density was 1 marker/4,044 kb, making it comparable to the recently reported 3000rad–7000rad WG RH maps in other species (Hamasima et al., 2003; Breen et al., 2004; Everts-van der Wind et al., 2005; Jann et al., 2006; McKay et al., 2007). In cattle, map resolution ranges from 1 marker/440 kb (Jann et al., 2006) to 1 marker/880 kb (Everts-van der Wind et al., 2005); in pig, the density is 1 marker/490 kb (Hamasima et al., 2003) and in dog, the density is 1 marker/900 kb (Breen et al., 2004).
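The density figures can be reproduced from the marker counts and the 2,952 Mb genome length estimate. The following sketch uses only numbers given in the text (the names are illustrative):

```python
GENOME_KB = 2_952 * 1000  # genome length estimate from Chowdhary et al. (2003)
RH_I_KB_PER_MARKER = 4_044  # RH I average density quoted in the text


def kb_per_marker(n_markers, genome_kb=GENOME_KB):
    """Average spacing between markers, in kb."""
    return genome_kb / n_markers


all_markers = kb_per_marker(3_816)   # every marker on the RH II map
positioned = kb_per_marker(3_228)    # markers assigned a cR position
fold_improvement = RH_I_KB_PER_MARKER / all_markers

if __name__ == "__main__":
    print(f"all markers: 1 marker/{all_markers:.0f} kb")
    print(f"positioned markers: 1 marker/{positioned:.1f} kb")
    print(f"improvement over RH I: {fold_improvement:.1f}-fold")
```

The results are approximately 774 kb and 914.5 kb per marker, which the text reports as 775 kb and 915 kb, and an improvement over RH I of slightly more than five-fold.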
The number and distribution of Type I (1,937) and Type II (1,737 microsatellite and 142 other STS) markers is fairly balanced on almost all chromosomes, with a slight bias towards genes on ECA5, 14 and X and towards microsatellites on ECA7, 19 and 24 (Table 1). The large number of polymorphic microsatellites makes RH II useful for genetic studies of horse traits. However, FISH-mapped markers and Type I markers were preferentially selected over others in computing the MLE-consensus map to enable better comparisons with high-resolution RH maps for domestic species that are strongly biased towards genes (Breen et al., 2004; Everts-van der Wind et al., 2004).
Comparison of RH II with previously reported RH maps. RH II has improved almost all important map parameters compared to RH I (Table 2) both overall and on each chromosome. In recent years, medium- to high-resolution RH maps were generated for ten horse chromosomes or chromosomal regions (Lee et al., 2004; Raudsepp et al., 2004; Brinkmeyer-Langford et al., 2005; Gustafson-Seabury et al., 2005; Wagner et al., 2006; Goh et al., 2007); however, maps for all of these chromosomes/regions have been further improved by mapping and analyzing an additional set of markers. For example, the most recently published map for ECA14 (Goh et al., 2007) with 1 marker per 940 kb has been further improved herein to 1 marker per 700 kb.
Table 2. Comparison of RH II and RH I (Chowdhary et al., 2003)
Cytogenetic anchoring of the RH map. The cytogenetic map contains 1,144 markers (Table 1, Supplementary Fig. 1.1–1.X). The majority of the 401 newly FISH-mapped markers were selected systematically from the ends of all RH groups and at regular intervals along the length of larger RH groups (see Fig. 1). Refined multicolor FISH in interphase nuclei using combinations of three markers was applied to resolve the position and orientation of all small RH groups. Altogether RH II contains 857 anchor loci (RH mapped or binned markers also present on the FISH map) that associate RH groups to chromosomes and confirm the computed marker order; 287 markers present only on the cytogenetic map contribute primarily to the comparative map.
The utility of FISH is particularly noted in the assignment of 19 small RH groups containing only 3–5 markers (on ECA1, 3, 6, 7, 13, 16, 17, 18, 23, and 27) and in correcting the location of some of the RH groups or individual markers compared to previously published data. For example, FISH mapping of LPL, SFTPC and CTSB showed that these markers are present on ECA2q and not on ECA9 as reported earlier (Milenkovic et al., 2002; Chowdhary et al., 2003). Similarly, new cytogenetic mapping of microsatellite AHT30 moved a small RH group (two loci) from the previously reported location on ECA22q13 (Swinburne et al., 2000; Chowdhary et al., 2003) to ECA13q13; FISH localization of KNG, UMPS and ZNF148 also showed that these loci are present on ECA19 and not on ECA16 (Godard et al., 2000; Milenkovic et al., 2002).
FISH also resolved discrepancies between the RH map and the most recent iteration of the two meiotic maps. For example, FISH corrects reverse orientation of the meiotic map for ECA26 by Penedo et al. (2005) (Supplementary Fig. 2.26) and both recent meiotic maps for ECA25 (Penedo et al., 2005; Swinburne et al., 2006) (Supplementary Fig. 2.25). With over 1,000 FISH-mapped markers and 857 anchor loci, the horse integrated RH/FISH map is one of the most comprehensive among domestic species and is comparable only to the dog WG map that contains a total of 1,000 FISH markers and 851 anchor loci (Breen et al., 2004). The second-generation WG RH maps for other domestic species are not physically aligned to the chromosomes by FISH.
Comparison of the FISH and RH maps. The observed discrepancies in marker order between the RH and FISH maps are minor and concern single or a few loci scattered over the genome. These are partly attributed to imprecise band designations reported in earlier FISH mapping studies. Further, most of the earlier FISH studies used single-color FISH which cannot precisely order closely located loci. Some anomalies are also due to misidentification of the probes. In such cases, the BAC library was rescreened using published PCR primers, the amplicons were resequenced to verify gene identity, and the new BAC clones were again mapped by FISH. Examples of such corrections include the reassignment of NFIA from ECA5q12→q13 (Milenkovic et al., 2002) to ECA7q12, BRCA2 from ECA17q22 (Milenkovic et al., 2002) to ECA17q14, and AR from ECAXq15→q16 (Milenkovic et al., 2002) to ECAXq12. Several of these discrepancies could also be attributed to isolation and FISH mapping of clones containing another member of the same gene family. For example, primers thought to be for NFIA actually correspond to NFIX.
The 766 markers in the IHRFP male linkage map (Penedo et al., 2005) and the 742 markers in the AHT sex-averaged meiotic map (Swinburne et al., 2006) were, to the extent possible, aligned with the 1,737 microsatellite markers present in RH II on all autosomes and the X chromosome. As a result, there are 920 markers shared between RH II and jointly the two meiotic maps. Alignments between the three maps demonstrate a general agreement in the order and orientation of markers and/or linkage groups (Supplementary Fig. 2.1–2.X). The exceptions include i) minor flips on ECA5, 6, 10, 14, 16, 17, 20, and 21; ii) an evident difference in the relative order of HTG001 and ASB029 in the IHRFP map of ECA4 compared to their order in the other two maps (Supplementary Fig. 2.4); iii) reversals involving entire linkage groups (e.g., for ECA25 and ECA26, as described above); and iv) 17 disparities involving assignment of markers to a different chromosome in one or both meiotic maps compared to RH II (Table 3). More differences were observed between the RH II and IHRFP maps than between the RH II and AHT maps.
Table 3. Discrepancies between RH II and the two recent linkage maps
Comparison of the spacing of framework markers between the linkage and RH maps reveals regions that have high or low recombination rates per cR. Typically recombination is reduced near centromeres and elevated at distal parts of the chromosomes (Rowe et al., 2003). For example, on ECA12 two pairs of markers – AHT027–TKY404 and COR058–UCDEQ497 – are separated by similar distances on the RH map, >127.5 cR and 181.6 cR, respectively. However, their meiotic distances differ by more than ten-fold – 2.2 cM for the two pericentromeric markers AHT027–TKY404 and 23.4 cM for the two distal loci COR058–UCDEQ497 (Supplementary Fig. 2.12, Supplementary Table 2). The approximate genome-wide ratio of physical and genetic distances between the RH II and the IHRFP and the AHT linkage maps is 10.1 cR5000/cM and 13.7 cR5000/cM, respectively. This ratio varies between individual chromosomes but is clearly higher than those reported for cattle – from 4 cR5000/cM (Everts-van der Wind et al., 2004) to 7.5 cR7000/cM (Itoh et al., 2005) indicating reduced coverage of the genome between physical and genetic maps in the horse. The latter may be caused by the gaps between RH groups, as well as a lower number of shared markers between RH and linkage maps compared to cattle. On the whole, alignment of RH and linkage maps indirectly connects meiotic data with the cytogenetic and comparative information and facilitates integration of all available mapping information for the equine genome.
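The contrast between the two ECA12 intervals can be made explicit by computing the local cR/cM ratio for each marker pair from the distances given above. In this minimal sketch the RH distance for AHT027–TKY404 is taken at its stated lower bound of 127.5 cR, so the ratio for that pair is itself a lower bound. The names are illustrative.

```python
# (RH distance in cR, meiotic distance in cM) for the two ECA12 marker pairs.
ECA12_PAIRS = {
    "AHT027-TKY404 (pericentromeric)": (127.5, 2.2),  # RH distance is a lower bound
    "COR058-UCDEQ497 (distal)": (181.6, 23.4),
}


def cr_per_cm(rh_cr, meiotic_cm):
    """Local ratio of RH distance to meiotic distance."""
    return rh_cr / meiotic_cm


ratios = {pair: cr_per_cm(cr, cm) for pair, (cr, cm) in ECA12_PAIRS.items()}

if __name__ == "__main__":
    for pair, ratio in ratios.items():
        print(f"{pair}: {ratio:.1f} cR/cM")
```

The pericentromeric pair gives at least about 58 cR/cM and the distal pair about 7.8 cR/cM, which brackets the genome-wide averages of 10.1 and 13.7 cR5000/cM and is consistent with reduced recombination near the centromere.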
The 1,904 genes and BAC end sequences (BES) present in RH II enable a comparative overview of the organization of the horse genome in relation to eight sequenced vertebrate genomes representing eutherian mammals (human, chimpanzee, dog, cattle, mouse, rat), marsupials (opossum), and birds (chicken). On average, comparative markers are distributed at 1.4 Mb intervals in RH II. Further, BLAST alignment of the flanking sequences of 766 equine microsatellite loci with the human genome sequence (Tozaki et al., 2007) provides additional comparative markers for these two genomes. The new map shows a four- to five-fold improvement in the number of comparative markers over RH I and extends the comparison of the horse genome from human and mouse to six additional species. Because the focus of this study is the high-resolution WG map for the horse, our remarks on map comparisons will be restricted to salient comparative features of the equine genome in relation to the eight sequenced genomes, without expanding on the putative common ancestor.
The comparative map presented in the RH II map figures for 31 autosomes and the X chromosome (Fig. 1; Supplementary Fig. 1.1–1.X) confirms and refines the boundaries of conserved syntenic segments known between the horse and human genomes (Raudsepp et al., 1996; Yang et al., 2004). For example, mapping 39 horse-human comparative loci on ECA27, 60 on ECA26, and 54 on ECA13 reaffirms synteny conservation and improves the previously known boundaries of correspondence with the human chromosomes HSA4/HSA8, HSA3/HSA21, and HSA7/HSA16, respectively. Mapping 14 HSA1 markers to the proximal region of ECA1q shows that the equine segment corresponds to the 225–229 Mb region on HSA1q and confirms recent Zoo-FISH findings (Yang et al., 2004). The map also reveals a previously undetected segment of homology between ECA2q and HSA8p. Furthermore, the new map refines the status of several previously reported horse-human conserved syntenies and corrects a number of single-locus-based homologies previously described between ECA1–HSA22, ECA2–HSA1, ECA3–HSA3, ECA5–HSA22, and ECA7–HSA19 (Milenkovic et al., 2002; Chowdhary et al., 2003).
An overview of conserved synteny or linkage between the horse and the eight compared genomes shows other interesting features. ECA11, ECA17, ECA22, and ECAX are the only chromosomes that share one-to-one homology with human and chimpanzee chromosomes, but the conserved synteny does not translate into conserved gene order along the three autosomes. ECAX seems to be the only chromosome that shares conserved linkage with human, chimpanzee, and also dog counterparts, including the position of the centromere. Broadly, the conservation holds good also for pig (Raudsepp et al., 2004), but not for cattle, mouse, and rat where there are several rearrangements (Supplementary Fig. 1.X).
The centromeres of many equine metacentric chromosomes represent sites of synteny breaks in the genomes of most of the compared species. Examples of such breaks are seen on ECA3, 6, 8, and 10 where the short and the long arms correspond to separate chromosomes in all species including opossum and chicken, suggesting that putative ancestral segments have fused at these points in the horse. Further, the position of equine centromeres and/or telomeres coincides at ∼58% of the locations with human and chimpanzee, at 37% of the locations with dog, but at less than 20% of the locations of telomeres/centromeres in the remaining species (Supplementary Table 3). These observations, together with the overall size of syntenic segments in various species shared with the horse, indicate that the organization of the horse chromosomes resembles human/chimpanzee more closely than other compared species.
Some other interesting aspects about comparative organization of the horse genome in relation to the eight sequenced genomes include:
a) Clustering of synteny breaks or rearrangements at a number of places in the eight genomes. Two such clusters can be seen on ECA1q14 and q15 where a distinct break in synteny is observed in almost all genomes. In cattle, where the break does not occur, a rearrangement is evident at the same spot. Another interesting example of such a reshuffle is seen at ECA9cen where an inversion is evident in all species sharing conserved synteny. Aggregation of these rearrangements or synteny breaks at specific spots in other genomes highlights some of the sites where fusion/reshuffle occurred during the formation of the horse chromosomes. Similar congregation of synteny breaks or rearrangements can be seen on ECA2cen, ECA3cen, ECA7, ECA21, and a number of other locations in relation to the horse genome as indicated by a red vertical line across the eight genomes (Supplementary Fig. 1.1–1.X).
b) Overall, the human and chimpanzee genomes exhibit an almost identical pattern of similarity with the horse genome. However, over the distal parts of ECA3q, 6q, 14q, and ECA21, the chimp genome shows inversions that are not seen in the human genome. These inversions were reported in previous human–chimp genome comparisons (Schmutz et al., 2004; Kehrer-Sawatzki et al., 2005a, b) and likely represent independent events that occurred during the evolution of the chimpanzee chromosomes. However, chimpanzee inversion breakpoints seen on ECA3 and ECA14 comparative maps might have broader evolutionary importance, as they coincide with synteny breaks in dog and rodents, respectively.
c) The mouse/rat genomes have undergone rapid karyotype evolution and therefore the total number of syntenic segments shared between them and other eutherian genomes is considerably higher than that seen for comparisons between non-rodent mammalian genomes. Although a similar trend is seen for horse–mouse/rat chromosome comparisons, some remarkably uninterrupted conserved syntenies and linkages spanning entire equine chromosomes are worth mentioning. For example, the conserved linkage shared between ECA11 and parts of MMU11 and RNO10 is not seen for corresponding chromosomal segments in the other six species. A similar trend is seen in conserved linkage shared between ECA24 and parts of MMU12/RNO6, and between ECA26 and parts of MMU16/RNO11. Additionally, ECA22, 27, 28, 29, and 30 share synteny conservation along their entire length with parts of (or complete) rat chromosomes, while ECA22, 27, and 30 share it with mouse chromosomes.
d) While a number of segments from the chicken genome (particularly the macrochromosomes) individually correspond to parts of single equine chromosomes, the conserved linkage between ECA17 and part of GGA1, and the conserved synteny between ECA9, ECA18, ECA24, ECA28, and ECA31 with parts of GGA2, GGA7, GGA5, GGA1, and GGA3, respectively, are noteworthy because some of these segments most likely represent ancestral vertebrate regions.
e) Among large equine autosomes, only ECA9 shares synteny and linkage conservation with the corresponding opossum chromosome (MDO3). Large blocks of synteny conservation are present also between other equine and marsupial chromosomes, but the gene order is usually rearranged resulting in shorter segments of conserved linkages. It is not yet clear whether lower degree of linkage conservation between eutherian and marsupial genomes is due to evolutionary divergence or an incomplete opossum genome assembly. Likewise, many synteny and linkage rearrangements observed between horse and cattle genomes can probably be attributed to the sequence assembly difficulties rather than to real differences.
f) Finally, comparative maps of some horse chromosomes, e.g., ECA5q, 13qcen and 22q show discrepant synteny or linkage positions of some markers in all other species and most likely reflect inaccuracies in the horse RH map. Such discrepant markers tend to be located at the ends of RH groups and are shown in red font on RH maps (Supplementary Fig. 1.1–1.X).
A WG physical map of BAC contigs based on fingerprinting 150,000 BAC clones from CHORI-241 library is currently under construction (O. Distl, unpublished). Since RH II contains over 500 markers that are derived from CHORI-241 BAC clones, they can be used to verify the BAC fingerprint assembly and anchor it to specific chromosomes. The majority of the 4,103 markers on the integrated map should also serve as an excellent framework with which the WG draft sequence assembly (currently at build 2) of the female horse Twilight (C. Wade, unpublished) could be validated. We used e-PCR (Schuler, 1997) to locate the RH II markers on horse build 2 to quantify how much improvement is possible. Only 2,869 markers had a chromosomal location, and only 2,757 markers had a unique location. Thus, it may be possible to use the locations and primers of the other 1,000+ markers given herein to produce better horse genome assemblies. We anticipate that, as for human (Lander et al., 2001; Olivier et al., 2001; Venter et al., 2001), mouse (Rowe et al., 2003), rat (Kwitek et al., 2004), and other species, the integrated WG RH II and comparative horse map presented in this study will serve as the main framework to support future efforts in both genome sequence and BAC contig assembly.
Request reprints from Terje Raudsepp or Bhanu P. Chowdhary
Department of Veterinary Integrative Biosciences
Texas A&M University, College Station, TX 77843 (USA)
telephone: +1 979 862 2879 or +1 979 458 0519, fax: +1 979 845 9972
e-mail: firstname.lastname@example.org, email@example.com
T.R. and A.G.-S. contributed equally to this work.
This work was supported by USDA/NRI grant 2003-03687 and 2006-04801, the Link Endowment, the Morris Animal Foundation, the Dorothy Russell Havemeyer Foundation, the Grayson Jockey Club and in part by the Intramural Research Program of the NIH, NLM.
Accepted in revised form for publication by M. Schmid: 9 June 2008.
Published online: October 14, 2008
Number of Print Pages : 9
Number of Figures : 1, Number of Tables : 3, Number of References : 45
Cytogenetic and Genome Research
Vol. 122, No. 1, Year 2008 (Cover Date: October 2008)
Journal Editor: Schmid M. (Würzburg)
ISSN: 1424–8581 (Print), eISSN: 1424–859X (Online)
For additional information: http://www.karger.com/CGR