Molecular homology and multiple-sequence alignment: an analysis of concepts and practice
David A. Morrison A,D, Matthew J. Morgan B and Scot A. Kelchner C
A Systematic Biology, Uppsala University, Norbyvägen 18D, Uppsala 75236, Sweden.
B CSIRO Ecosystem Sciences, GPO Box 1700, Canberra, ACT 2601, Australia.
C Department of Biology, Utah State University, 5305 Old Main Hill, Logan, UT 84322-5305, USA.
D Corresponding author. Email: email@example.com
Australian Systematic Botany 28(1) 46-62 https://doi.org/10.1071/SB15001
Submitted: 2 February 2015 Accepted: 8 April 2015 Published: 10 September 2015
Sequence alignment is just as much a part of phylogenetics as is tree building, although it is often viewed solely as a necessary tool to construct trees. However, alignment for the purpose of phylogenetic inference is primarily about homology, as it is the procedure that expresses homology relationships among the characters, rather than the historical relationships of the taxa. Molecular homology is rather vaguely defined and understood, despite its importance in the molecular age. Indeed, homology has rarely been evaluated with respect to nucleotide sequence alignments, in spite of the fact that nucleotides are the only data that directly represent genotype. All other molecular data represent phenotype, just as do morphology and anatomy. Thus, efforts to improve sequence alignment for phylogenetic purposes should involve a more refined use of the homology concept at a molecular level. To this end, we present examples of molecular-data levels at which homology might be considered, and arrange them in a hierarchy. The concept that we propose has many levels, which link directly to the developmental and morphological components of homology. Of note, there is no simple relationship between gene homology and nucleotide homology. We also propose terminology with which to better describe and discuss molecular homology at these levels. Our over-arching conceptual framework is then used to shed light on the multitude of automated procedures that have been created for multiple-sequence alignment. Sequence alignment needs to be based on aligning homologous nucleotides, without necessary reference to homology at any other level of the hierarchy. In particular, inference of nucleotide homology involves deriving a plausible scenario for molecular change among the set of sequences. 
Our clarifications should allow the development of a procedure that specifically addresses homology, which is required when performing alignment for phylogenetic purposes, but which does not yet exist.
Additional keywords: multiple alignment, nucleotide alignment, sequence homology.