Abstract
The rich fossil record of equids has made them a model for evolutionary processes [1]. Here we present a 1.12-times coverage draft genome from a horse bone recovered from permafrost dated to approximately 560–780 thousand years before present (kyr bp) [2,3]. Our data represent the oldest full genome sequence determined so far by almost an order of magnitude. For comparison, we sequenced the genome of a Late Pleistocene horse (43 kyr bp), and modern genomes of five domestic horse breeds (Equus ferus caballus), a Przewalski’s horse (E. f. przewalskii) and a donkey (E. asinus). Our analyses suggest that the Equus lineage giving rise to all contemporary horses, zebras and donkeys originated 4.0–4.5 million years before present (Myr bp), twice the conventionally accepted time to the most recent common ancestor of the genus Equus [4,5]. We also find that horse population size fluctuated multiple times over the past 2 Myr, particularly during periods of severe climatic changes. We estimate that the Przewalski’s and domestic horse populations diverged 38–72 kyr bp, and find no evidence of recent admixture between the domestic horse breeds and the Przewalski’s horse investigated. This supports the contention that Przewalski’s horses represent the last surviving wild horse population [6]. We find similar levels of genetic variation among Przewalski’s and domestic populations, indicating that the former are genetically viable and worthy of conservation efforts. We also find evidence for continuous selection on the immune system and olfaction throughout horse evolution. Finally, we identify 29 genomic regions among horse breeds that deviate from neutrality and show low levels of genetic variation compared to the Przewalski’s horse. Such regions could correspond to loci selected early during domestication.
Accession codes
Data deposits
All sequence data have been submitted to Sequence Read Archive under accession number SRA082086 and are available for download, together with final BAM and VCF files, de novo donkey scaffolds, and proteomic data at http://geogenetics.ku.dk/publications/middle-pleistocene-omics.
References
Franzen, J. L. The Rise of Horses: 55 Million Years of Evolution (Johns Hopkins Univ. Press, 2010)
Froese, D. G., Westgate, J. A., Reyes, A. V., Enkin, R. J. & Preece, S. J. Ancient permafrost and a future, warmer Arctic. Science 321, 1648 (2008)
Westgate, J. A. et al. Gold Run tephra: A Middle Pleistocene stratigraphic and paleoenvironmental marker across west-central Yukon Territory, Canada. Can. J. Earth Sci. 46, 465–478 (2009)
Eisenmann, V. Origins, dispersals, and migrations of Equus (Mammalia, Perissodactyla). Courier Forschungsinstitut Senckenberg 153, 161–170 (1992)
Forsten, A. Mitochondrial-DNA timetable and the evolution of Equus: Comparison of molecular and paleontological evidence. Ann. Zool. Fenn. 28, 301–309 (1992)
Goto, H. et al. A massively parallel sequencing approach uncovers ancient origins and high genetic variability of endangered Przewalski’s horses. Genome Biol. Evol. 3, 1096–1106 (2011)
Reyes, A. V., Froese, D. G. & Jensen, B. J. Response of permafrost to last interglacial warming: field evidence from non-glaciated Yukon and Alaska. Quat. Sci. Rev. 29, 3256–3274 (2010)
Orlando, L. et al. True single-molecule DNA sequencing of a Pleistocene horse bone. Genome Res. 21, 1705–1719 (2011)
Lindahl, T. Instability and decay of the primary structure of DNA. Nature 362, 709–715 (1993)
Willerslev, E. et al. Ancient biomolecules from deep ice cores reveal a forested southern Greenland. Science 317, 111–114 (2007)
Miller, W. et al. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc. Natl Acad. Sci. USA 109, E2382–E2390 (2012)
Cappellini, E. et al. Proteomic analysis of a Pleistocene mammoth femur reveals more than one hundred ancient bone proteins. J. Proteome Res. 11, 917–926 (2012)
Ginolhac, A. et al. Improving the performance of True Single Molecule Sequencing for ancient DNA. BMC Genomics 13, 177 (2012)
Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010)
Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)
van Doorn, N. L., Wilson, J., Hollund, H., Soressi, M. & Collins, M. J. Site-specific deamidation of glutamine: a new marker of bone collagen deterioration. Rapid Commun. Mass Spectrom. 26, 2319–2327 (2012)
Vilstrup, J. T. et al. Mitochondrial phylogenomics of modern and ancient equids. PLoS ONE 8, e55950 (2013)
MacFadden, B. J. & Carranza-Castaneda, O. Cranium of Dinohippus mexicanus (Mammalia Equidae) from the early Pliocene (latest Hemphillian) of central Mexico and the origin of Equus. Bull. Florida Museum Nat. History 43, 163–185 (2002)
Weinstock, J. et al. Evolution, systematics, and phylogeography of Pleistocene horses in the New World: a molecular perspective. PLoS Biol. 3, e241 (2005)
Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011)
Lorenzen, E. D. et al. Species-specific responses of Late Quaternary megafauna to climate and humans. Nature 479, 359–364 (2011)
International Union for Conservation of Nature. IUCN Red List of Threatened Species, Version 2010.1, http://www.iucnredlist.org (downloaded 11 March 2010)
Reich, D. et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010)
Bowling, A. T. et al. Genetic variation in Przewalski’s horses, with special focus on the last wild caught mare, 231 Orlitza III. Cytogenet. Genome Res. 102, 226–234 (2003)
Wade, C. M. et al. Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865–867 (2009)
Allentoft, M. E. et al. The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proc. R. Soc. Lond. B 279, 4724–4733 (2012)
Kelstrup, C. D., Young, C., Lavallee, R., Nielsen, M. L. & Olsen, J. V. Optimized fast and sensitive acquisition methods for shotgun proteomics on a quadrupole orbitrap mass spectrometer. J. Proteome Res. 11, 3487–3497 (2012)
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
Orlando, L. et al. Revising the recent evolutionary history of equids using ancient DNA. Proc. Natl Acad. Sci. USA 106, 21754–21759 (2009)
Rohland, N. & Hofreiter, M. Ancient DNA extraction from bones and teeth. Nature Protocols 2, 1756–1762 (2007)
Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 6, http://dx.doi.org/10.1101/pdb.prot5448 (2010)
Luo, R. et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1, 18 (2012)
Stanke, M., Steinkamp, R., Waack, S. & Morgenstern, B. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 32, W309–W312 (2004)
Carlton, J. M. et al. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315, 207–212 (2007)
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010)
Li, H. et al. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25, 2078–2079 (2009)
McCue, M. E. et al. A high density SNP array for the domestic horse and extant Perissodactyla: utility for association mapping, genetic diversity, and phylogeny studies. PLoS Genet. 8, e1002451 (2012)
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007)
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006)
R Development Core Team. A language and environment for statistical computing, http://www.R-project.org (R Foundation for Statistical Computing, 2011)
McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010)
Smith, C. I., Chamberlain, A. T., Riley, M. S., Stringer, C. & Collins, M. J. The thermal history of human fossils and the likelihood of successful DNA amplification. J. Hum. Evol. 45, 203–217 (2003)
Ginolhac, A., Rasmussen, M., Gilbert, T. M., Willerslev, E. & Orlando, L. mapDamage: testing for damage patterns in ancient DNA sequences. Bioinformatics 27, 2153–2155 (2011)
Briggs, A. W. et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl Acad. Sci. USA 104, 14616–14621 (2007)
Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnol. 26, 1367–1372 (2008)
Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011)
Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002)
Katoh, K. & Toh, H. Recent developments in the MAFFT multiple sequence alignment program. Brief. Bioinform. 9, 286–298 (2008)
Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006)
Stamatakis, A. et al. RAxML-Light: a tool for computing Terabyte phylogenies. Bioinformatics 28, 2064–2066 (2012)
Sanderson, M. J. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302 (2003)
Shimodaira, H. & Hasegawa, M. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17, 1246–1247 (2001)
Lippold, S., Matzke, N. J., Reissmann, M. & Hofreiter, M. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication. BMC Evol. Biol. 11, 328 (2011)
Achilli, A. et al. Mitochondrial genomes from modern horses reveal the major haplogroups that underwent domestication. Proc. Natl Acad. Sci. USA 109, 2449–2454 (2012)
Warmuth, V. et al. Reconstructing the origin and spread of horse domestication in the Eurasian steppe. Proc. Natl Acad. Sci. USA 109, 8202–8206 (2012)
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007)
Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012)
Rambaut, A. & Drummond, A. J. Tracer v1.5, http://beast.bio.ed.ac.uk/Tracer (2009)
Hudson, R. R. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
Yang, Z. Computational Molecular Evolution (Oxford Univ. Press, 2006)
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nature Protocols 4, 44–57 (2009)
Nielsen, R. Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218 (2005)
Busing, F. M. T. A., Meijer, E. & Van Der Leeden, R. Delete-m Jackknife for Unequal m. Stat. Comput. 9, 3–8 (1999)
Keane, T. M., Creevey, C. J., Pentony, M. M., Naughton, T. J. & McInerney, J. O. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol. Biol. 6, 29 (2006)
Guindon, S. et al. New algorithms and methods to estimate Maximum-Likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010)
Acknowledgements
We thank T. Brand, the laboratory technicians at the Danish National High-throughput DNA Sequencing Centre and the Illumina sequencing platform at SciLifeLab-Uppsala for technical assistance; J. Clausen for help with the donkey samples; S. Rasmussen for computational assistance; J. N. MacLeod and T. Kalbfleisch for discussions involving the re-sequencing of the horse reference genome; and S. Sawyer for providing published ancient horse data. This work was supported by the Danish Council for Independent Research, Natural Sciences (FNU); the Danish National Research Foundation; the Novo Nordisk Foundation; the Lundbeck Foundation (R52-A5062); a Marie-Curie Career Integration grant (FP7 CIG-293845); the National Science Foundation ARC-0909456; National Science Foundation DBI-0906041; the Searle Scholars Program; King Saud University Distinguished Scientist Fellowship Program (DSFP); Natural Sciences and Engineering Research Council of Canada; the US National Science Foundation DMR-0923096; and a grant RC2 HG005598 from the National Human Genome Research Institute (NHGRI). A.G. was supported by a Marie-Curie Intra-European Fellowship (FP7 IEF-299176). M.F. was supported by an EMBO Long-Term Post-doctoral Fellowship (ALTF 229-2011). A.-S.M. was supported by a fellowship from the Swiss National Science Foundation (SNSF). Mi.S. was supported by the Lundbeck Foundation (R82-5062).
Author information
Contributions
L.O. and E.W. initially conceived and headed the project; G.Z. and Ju.W. headed research at BGI; L.O. and E.W. designed the experimental research project set-up, with input from B.S. and R.N.; D.F. and G.D.Z. provided the Thistle Creek specimen, stratigraphic context and morphological information, with input from K.K.; K.H.R., B.S., K.G., D.C.M., D.F.A., K.A.S.A.-R. and M.F.B. provided samples; L.O., J.T.V., Ma.R., M.H., C.M. and J.S. did ancient and modern DNA extractions and constructed Illumina DNA libraries for shotgun sequencing; Ja.W. did the independent replication in Oxford; Ma.S. did ancient DNA extractions and generated target enrichment sequence data; Ji.M. and X.W. did Illumina libraries on donkey extracts; K.M., C.M. and A.S.-O. performed Illumina sequencing for the Middle Pleistocene and the 43-kyr-old horse genomes, the five domestic horse genomes and the Przewalski’s horse genome at Copenhagen, with input from Mo.R.; Ji.M. and X.W. performed Illumina sequencing for the Middle Pleistocene and the donkey genomes at BGI; J.F.T. headed true Single DNA Molecule Sequencing of the Middle Pleistocene genome; A.G., B.P. and Mi.S. did the mapping analyses and generated genome alignments, with input from L.O. and A.K.; Jo.V. and T.S.-P. did the metagenomic analyses, with input from A.G., B.P., S.B. and L.O.; Jo.V. and T.S.-P. did the ab initio prediction of the donkey genes and the identification of the Y chromosome scaffolds, with input from A.G. and Mi.S.; L.O., A.G. and P.L.F.J. did the damage analyses, with input from I.M.; A.G. did the functional SNP assignment; A.M.V.V. and L.O. did the PCA analyses, with input from O.R.; B.S. did the phylogenetic and Bayesian skyline reconstructions on mitochondrial data; Mi.S. did the phylogenetic and divergence dating based on nuclear data, with input from L.O.; A.G. did the PSMC analyses using data generated by C.J.R. and L.A.; L.O. and A.G. did the population divergence analyses, with input from J.C., R.N. and M.F.; L.O., A.G. and T.K. did the selection scans, with input from A.-S.M. and R.N.; A.A., I.M. and M.F. did the admixture analyses, with input from R.N.; L.O. and A.G. did the analysis of paralogues and structural variation; Ja.V. and A.D. did the amino-acid composition analyses; E.C., C.D.K., D.S., L.J.J. and J.V.O. did the proteomic analyses, with input from M.T.P.G. and A.M.V.V.; L.O. and V.E. performed the morphological analyses, with input from D.F. and G.D.Z.; L.O. and E.W. wrote the manuscript, with critical input from M.H., B.S., Jo.M. and all remaining authors.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Information
This file contains Supplementary Text and Data, Supplementary Figures, Supplementary Tables and additional references (see Contents for details). This file was updated on 3 July 2013 to correctly display figure S1.3 (PDF 20068 kb)
Supplementary Figures
This file contains Supplementary Figures S6.8-S6.38, which show DNA fragmentation and nucleotide misincorporation patterns for mitochondrial reads from other ancient samples analyzed in this study. (PDF 2191 kb)
Supplementary Tables
This zipped file contains Supplementary Tables 4.2, 4.3, 4.4, 5.9, 11.3, 11.4, 11.7 and 12.8. (ZIP 10146 kb)
About this article
Cite this article
Orlando, L., Ginolhac, A., Zhang, G. et al. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499, 74–78 (2013). https://doi.org/10.1038/nature12323
DOI: https://doi.org/10.1038/nature12323