Australian Systematic Botany Australian Systematic Botany Society
Taxonomy, biogeography and evolution of plants
L. A. S. JOHNSON REVIEW

Multiple sequence alignment for phylogenetic purposes

David A. Morrison

Department of Parasitology (SWEPAR), National Veterinary Institute and Swedish University of Agricultural Sciences, 751 89 Uppsala, Sweden. Email: David.Morrison@bvf.slu.se

Australian Systematic Botany 19(6) 479-539 http://dx.doi.org/10.1071/SB06020
Submitted: 3 July 2006; Accepted: 30 October 2006; Published: 14 December 2006

Abstract

I have addressed the biological rather than the bioinformatics aspects of molecular sequence alignment by covering a series of topics that have been under-valued, particularly within the context of phylogenetic analysis. First, phylogenetic analysis is only one of the many objectives of sequence alignment, and the most appropriate multiple alignment may not be the same for all of these purposes. Phylogenetic alignment thus occupies a specific place within a broader context. Second, homology assessment plays an intricate role in phylogenetic analysis, with sequence alignment constituting primary homology assessment and tree building constituting secondary homology assessment. The objective of phylogenetic alignment thus distinguishes it from other sorts of alignment. Third, I summarise what is known about the serious limitations of using phenetic similarity as a criterion for automated multiple alignment, and provide an overview of what is currently being done to improve these computerised procedures. This synthesises information that is apparently not widely known among phylogeneticists. Fourth, I consider the recent development of automated procedures for combining alignment and tree building, thus integrating primary and secondary homology assessment. Finally, I outline various strategies for increasing the biological content of sequence alignment procedures, which consists of taking known evolutionary processes into account when making alignment decisions. These procedures can be objective and repeatable, and can involve computerised algorithms to automate much of the work. Perhaps the most important suggestion is that alignment should be seen as a process in which new sequences are added to a pre-existing alignment that has been manually curated by the biologist.


References

Aagesen L Petersen G Seberg O 2005 Sequence length variation, indel costs, and congruence in sensitivity analysis. Cladistics 21 15 30

Aboitiz F 1987 Letter to the editor. Cell 51 515 516
doi:10.1016/0092-8674(87)90117-6

Achaz G Boyer F Rocha EPC Viari AC 2006 Repseek, a tool to retrieve approximate repeats from large DNA sequences. Bioinformatics doi:10.1093/bioinformatics/bt1519 in press

Al-Lazikani B Sheinerman FB Honig B 2001 Combining multiple structure and sequence alignments to improve sequence detection and alignment: application to the SH2 domains of janus kinases. Proceedings of the National Academy of Sciences USA 98 14 796 14 801 doi:10.1073/pnas.011577898

Allison L Wallace CS 1994 The posterior probability distribution of alignments and its application to parameter estimation of evolutionary trees and to optimization of multiple alignments. Journal of Molecular Evolution 39 418 430 doi:10.1007/BF00160274

Allison L , Wallace CS , Yee CN (1992) Minimum message length encoding, evolutionary trees and multiple alignment. In ‘Proceedings of the Hawaii international conference on system sciences (HICSS-25).’ pp. 663–674. (IEEE Press: Piscataway)

Althaus E Caprara A Lenhof H-P Reinert K 2002 Multiple sequence alignment with arbitrary gap costs: computing an optimal solution using polyhedral combinatorics. Bioinformatics 18 S4 S16

Anbarasu LA Narayanasamy P Sundararajan V 2000 Multiple molecular sequence alignment by island parallel genetic algorithm. Current Science 78 858 863


Andersen ES Rosenblad MA Larsen N Westergaard JC Burks J Wower IK Wower J Gorodkin J Samuelsson T Zwieb C 2006 The tmRDB and SRPDB resources. Nucleic Acids Research 34 D163 D168
doi:10.1093/nar/gkj142

Anwar T Khan AU 2006 SSRscanner: a program for reporting distribution and location of simple sequence repeats. Bioinformation 1 89 91

Apostolico A Giancarlo R 1998 Sequence alignment in molecular biology. Journal of Computational Biology 5 173 196


Armougom F Moretti S Poirot O Audic S Dumas P Schaeli B Keduas V Notredame C 2006 Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee. Nucleic Acids Research 34 W604 W608


Arvestad L 1997 Aligning coding DNA in the presence of frame-shift errors. Lecture Notes in Computer Science 1264 180 190


Badger JH Eisen JA Ward NL 2005 Genomic analysis of Hyphomonas neptunium contradicts 16S rRNA gene-based phylogenetic analysis: implications for the taxonomy of the orders ‘Rhodobacterales’ and Caulobacterales. International Journal of Systematic and Evolutionary Microbiology 55 1021 1026
doi:10.1099/ijs.0.63510-0

Bafna V Tang H Zhang S 2006 Consensus folding of unaligned RNA sequences revisited. Journal of Computational Biology 13 283 295 doi:10.1089/cmb.2006.13.283

Bahr A Thompson JD Thierry J-C Poch O 2001 BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Research 29 323 326 doi:10.1093/nar/29.1.323

Barta JR 1997 Investigating phylogenetic relationships within the Apicomplexa using sequence data: the search for homology. Methods 13 81 88 doi:10.1006/meth.1997.0501

Barton GJ Sternberg MJE 1987 A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons. Journal of Molecular Biology 198 327 337 doi:10.1016/0022-2836(87)90316-0

Batzoglou S 2005 The many faces of sequence alignment. Briefings in Bioinformatics 6 6 22 doi:10.1093/bib/6.1.6

Bauer M Klau GW Reinert K 2005 a Fast and accurate structural RNA alignment by progressive lagrangian optimization. Lecture Notes in Computer Science 3695 217 228

Bauer M Klau GW Reinert K 2005 b Multiple structural RNA alignment with lagrangian relaxation. Lecture Notes in Computer Science 3692 303 314


Baumel A Ainouche ML Bayer RJ Ainouche AK Misset MT 2002 Molecular phylogeny of hybridizing species from the genus Spartina Schreb. (Poaceae). Molecular Phylogenetics and Evolution 22 303 314
doi:10.1006/mpev.2001.1064

Beebe NW Cooper RD Morrison DA Ellis JT 2000 Subset partitioning of the ribosomal DNA small subunit and its effects on the phylogeny of the Anopheles punctulatus group. Insect Molecular Biology 9 515 520 doi:10.1046/j.1365-2583.2000.00211.x

Bell LH Coggins JR Milner-White EJ 1993 Mix’n’Match: an improved multiple sequence alignment procedure for distantly related proteins using secondary structure predictions, designed to be independent of the choice of gap penalty and scoring matrix. Protein Engineering 6 683 690

Belshaw R Quicke DLJ 2002 Robustness of ancestral state estimates: evolution of life history strategy in ichneumonoid parasitoids. Systematic Biology 51 450 477
doi:10.1080/10635150290069896

Benner SA Cohen MA Gonnet GH 1993 Empirical and structural models for insertions and deletions in the divergent evolution of proteins. Journal of Molecular Biology 229 1065 1082 doi:10.1006/jmbi.1993.1105

Benson G 1997 Sequence alignment with tandem duplication. Journal of Computational Biology 4 351 367

Benson G 1999 Tandem Repeats Finder: a program to analyze DNA sequences. Nucleic Acids Research 27 573 580
doi:10.1093/nar/27.2.573

Bininda-Emonds ORP 2005 TransAlign: using amino acids to facilitate the multiple alignment of protein-coding DNA sequences. BMC Bioinformatics 6 156 doi:10.1186/1471-2105-6-156

Bishop MJ Thompson EA 1986 Maximum likelihood alignment of DNA sequences. Journal of Molecular Biology 190 159 165 doi:10.1016/0022-2836(86)90289-5

Blackshields G Wallace IM Larkin M Higgins DG 2006 Analysis and comparison of benchmarks for multiple sequence alignment. In Silico Biology 6 0030

Blaisdell BE 1986 A measure of the similarity of sets of sequences not requiring sequence alignment. Proceedings of the National Academy of Sciences USA 83 5155 5159
doi:10.1073/pnas.83.14.5155

Bledsoe AH Sheldon FH 1990 Molecular homology and DNA hybridization. Journal of Molecular Evolution 30 425 433 doi:10.1007/BF02101114

Boeva V Regnier M Papatsenko D Makeev V 2006 Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression. Bioinformatics 22 676 684 doi:10.1093/bioinformatics/btk032

Bonizzoni P Della Vedova G 2001 The complexity of multiple sequence alignment with SP-score that is a metric. Theoretical Computer Science 259 63 79 doi:10.1016/S0304-3975(99)00324-2

Brawley SH 1999 Submission and retrieval of an aligned set of nucleic acid sequences. Journal of Phycology 35 433 437 doi:10.1046/j.1529-8817.1999.3520433.x

Brenner SE Chothia C Hubbard TJ 1998 Assessing sequence comparison methods with reliable structurally-identified distant evolutionary relationships. Proceedings of the National Academy of Sciences USA 95 6073 6078 doi:10.1073/pnas.95.11.6073

Briffeuil P Baudoux G Lambert C De Bolle X Vinals C Feytmans E Depiereux E 1998 Comparative analysis of seven multiple protein sequence alignment servers: clues to enhance reliability of predictions. Bioinformatics 14 357 366 doi:10.1093/bioinformatics/14.4.357

Britten RJ Rowen L Williams J Cameron RA 2003 Majority of divergence between closely related DNA samples is due to indels. Proceedings of the National Academy of Sciences USA 100 4661 4665 doi:10.1073/pnas.0330964100

Brower AVZ Schawaroch V 1996 Three steps of homology assessment. Cladistics 12 265 272

Brown JW 1999 The ribonuclease P database. Nucleic Acids Research 27 314
doi:10.1093/nar/27.1.314

Bucka-Lassen K Caprani O Hein J 1999 Combining many multiple alignments in one improved alignment. Bioinformatics 15 122 130 doi:10.1093/bioinformatics/15.2.122

Butler AB Saidel WM 2000 Defining sameness: historical, biological, and generative homology. BioEssays 22 846 853 doi:10.1002/1521-1878(200009)22:9<846::AID-BIES10>3.0.CO;2-R

Campagna D Romualdi C Vitulo N Del Favero M Lexa M Cannata N Valle G 2005 RAP: a new computer program for de novo identification of repeated sequences in whole genomes. Bioinformatics 21 582 588 doi:10.1093/bioinformatics/bti039

Cannone JJ Subramanian S Schnare MN Collett JR D’Souza LM et al 2002 The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics 3 2 doi:10.1186/1471-2105-3-2

Carfi A Pares S Duée E Galleni M Duez C Frère JM Dideberg O 1995 The 3-D structure of a zinc metallo-β-lactamase from Bacillus cereus reveals a new type of protein fold. EMBO Journal 14 4914 4921

Cartmill M 1994 A critique of homology as a morphological concept. American Journal of Physical Anthropology 94 115 123
doi:10.1002/ajpa.1330940109

Cartwright RA 2005 DNA assembly with gaps (DAWG): simulating sequence evolution. Bioinformatics 21 iii31 iii38 doi:10.1093/bioinformatics/bti1200

Castelo AT Martins W Gao GR 2002 TROLL—tandem repeat occurrence locator. Bioinformatics 18 634 636 doi:10.1093/bioinformatics/18.4.634

Catherinot V Labesse G 2004 ViTO: tool for refinement of protein sequence–structure alignments. Bioinformatics 20 3694 3696 doi:10.1093/bioinformatics/bth429

Cerchio S Tucker P 1998 Influence of alignment on the mtDNA phylogeny of Cetacea: questionable support for a Mysticeti / Physeteroidea clade. Systematic Biology 47 336 344 doi:10.1080/106351598260941

Chain P Kurtz S Ohlebusch E Slezak T 2003 An applications-focused review of comparative genomics tools: capabilities, limitations, and future challenges. Briefings in Bioinformatics 4 105 123 doi:10.1093/bib/4.2.105

Chakrabarti S Bhardwaj N Anand PA Sowdhamini R 2004 Improvement of alignment accuracy utilizing sequentially conserved motifs. BMC Bioinformatics 5 167 doi:10.1186/1471-2105-5-167

Chakrabarti S Lanczycki CJ Panchenko AR Przytycka TM Thiessen PA Bryant SH 2006 Refining multiple sequence alignments with conserved core regions. Nucleic Acids Research 34 2598 2606 doi:10.1093/nar/gkl274

Chan SC Wong AKC Chiu DKY 1992 A survey of multiple sequence comparison methods. Bulletin of Mathematical Biology 54 563 598

Chang MSS Benner SA 2004 Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments. Journal of Molecular Biology 341 617 631
doi:10.1016/j.jmb.2004.05.045

Chenna R Sugawara H Koike T Lopez R Gibson TJ Higgins DG Thompson JD 2003 Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Research 31 3497 3500 doi:10.1093/nar/gkg500

Chiaromonte F , Yap VB , Miller W (2002) Scoring pairwise genomic sequence alignments. In ‘Proceedings of the 7th Pacific Symposium on Biocomputing 2002, Lihue, Hawaii’. pp. 115–126.

Chindelevitch L Li Z Blais E Blanchette M 2006 On the inference of parsimonious indel evolutionary scenarios. Journal of Bioinformatics and Computational Biology 4 721 744 doi:10.1142/S0219720006002168

Clamp M Cuff J Searle SM Barton GJ 2004 The Jalview java alignment editor. Bioinformatics 20 426 427 doi:10.1093/bioinformatics/btg430

Cognato AI Vogler AP 2001 Exploring data interaction and nucleotide alignment in a multiple gene analysis of Ips (Coleoptera: Scolytinae). Systematic Biology 50 758 780 doi:10.1080/106351501753462803

Cole JR Chai B Farris RJ Wang Q Kulam SA McGarrell DM Garrity GM Tiedje JM 2005 The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis. Nucleic Acids Research 33 D294 D296 doi:10.1093/nar/gki038

Cooper A Lalueza-Fox C Anderson S Rambaut A Austin J Ward R 2001 Complete mitochondrial genome sequences of two extinct moas clarify ratite evolution. Nature 409 704 707 doi:10.1038/35055536

Corpet F 1988 Multiple sequence alignment with hierarchical clustering. Nucleic Acids Research 16 10 881 10 890

Corpet F Michot B 1994 RNAlign program: alignment of RNA sequences using both primary and secondary structures. Computer Applications in the Biosciences 10 389 399


Cozzetto D Tramontano A 2005 Relationship between multiple sequence alignments and quality of protein comparative models. Proteins: Structure, Function, and Bioinformatics 58 151 157
doi:10.1002/prot.20284

Croan DG Morrison DA Ellis JT 1997 Evolution of the genus Leishmania revealed by comparison of DNA and RNA polymerase gene sequences. Molecular and Biochemical Parasitology 89 149 159 doi:10.1016/S0166-6851(97)00111-4

Dalli D Wilm A Mainz I Steger G 2006 STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time. Bioinformatics 22 1593 1599 doi:10.1093/bioinformatics/btl142

Darling ACE Mau B Blattner FR Perna NT 2004 Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Research 14 1394 1403 doi:10.1101/gr.2289704

De Laet JE (2005) Parsimony and the problem of inapplicables in sequence data. In ‘Parsimony, phylogeny, and genomics.’ (Ed. VA Albert) pp. 81–116. (Oxford University Press: Oxford)

Deléage G Clerc FF Roux B Gautheron DC 1988 ANTHEPROT: a package for protein sequence analysis using a microcomputer. Computer Applications in the Biosciences 4 351 356

De Rijk P De Wachter R 1993 DCSE, an interactive tool for sequence alignment and secondary structure research. Bioinformatics 9 735 740


DeSantis TZ Hugenholtz P Larsen N Rojas M Brodie EL Keller K Huber T Dalevi D Hu P Andersen GL 2006 a Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology 72 5069 5072
doi:10.1128/AEM.03006-05

DeSantis TZ Hugenholtz P Keller K Brodie EL Larsen N Piceno YM Phan R Andersen GL 2006 b NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes. Nucleic Acids Research 34 W394 W399 doi:10.1093/nar/gkj156

Dewey CN Pachter L 2006 Evolution at the nucleotide level: the problem of multiple whole-genome alignment. Human Molecular Genetics 15 R51 R56 doi:10.1093/hmg/ddl056

Do CB Mahabhashyam MSP Brudno M Batzoglou S 2005 ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Research 15 330 340 doi:10.1101/gr.2821705

Domingues FS Lackner P Andreeva A Sippl MJ 2000 Structure-based evaluation of sequence comparison and fold recognition alignment accuracy. Journal of Molecular Biology 297 1003 1013 doi:10.1006/jmbi.2000.3615

Donoghue MJ , Sanderson MJ (1994) Complexity and homology in plants. In ‘Homology: the hierarchical basis of comparative biology’. (Ed. BK Hall) pp. 393–421. (Academic Press: San Diego)

Doolittle RF 1981 Similar amino acid sequences: chance or common ancestry? Science 214 149 159 doi:10.1126/science.7280687

Duret L , Abdeddaïm S (2000) Multiple alignments for structural, functional, or phylogenetic analyses of homologous sequences. In ‘Bioinformatics: sequence, structure, and databanks.’ (Ed. D Higgins, W Taylor) pp. 51–76. (Oxford University Press: Oxford)

Ebedes J Datta A 2004 Multiple sequence alignment in parallel on a workstation cluster. Bioinformatics 20 1193 1195 doi:10.1093/bioinformatics/bth055

Eddy SR 1998 Profile hidden markov models. Bioinformatics 14 755 763 doi:10.1093/bioinformatics/14.9.755

Eddy SR 2002 a A memory efficient dynamic programming algorithm for optimal structural alignment of a sequence to an RNA secondary structure. BMC Bioinformatics 3 18 doi:10.1186/1471-2105-3-18

Eddy SR 2002 b Computational genomics of noncoding RNA genes. Cell 109 137 140 doi:10.1016/S0092-8674(02)00727-4

Edgar RC 2004 a Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Research 32 380 385 doi:10.1093/nar/gkh180

Edgar RC 2004 b MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32 1792 1797 doi:10.1093/nar/gkh340

Edgar RC 2004 c MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5 113 doi:10.1186/1471-2105-5-113

Edgar RC Sjölander K 2004 A comparison of scoring functions for protein sequence profile alignment. Bioinformatics 20 1301 1308 doi:10.1093/bioinformatics/bth090

Edgar RC Batzoglou S 2006 Multiple sequence alignment. Current Opinion in Structural Biology 16 368 373 doi:10.1016/j.sbi.2006.04.004

Elias I 2003 Settling the intractability of multiple alignment. Lecture Notes in Computer Science 2906 352 363

Ellis J Morrison D 1995 Effects of sequence alignment on the phylogeny of Sarcocystis deduced from 18S rDNA sequences. Parasitology Research 81 696 699
doi:10.1007/BF00931849

Errami M Geourjon C Deléage G 2003 Conservation of amino acids into multiple alignments involved in pairwise interactions in three-dimensional protein structures. Journal of Bioinformatics and Computational Biology 1 505 520 doi:10.1142/S0219720003000228

Feng D-F Doolittle RF 1987 Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25 351 360

Finn RD Mistry J Schuster-Böckler B Griffiths-Jones S Hollich V et al 2006 Pfam: clans, web tools and services. Nucleic Acids Research 34 D247 D251
doi:10.1093/nar/gkj149

Fitch WM 2000 Homology: a personal view on some of the problems. Trends in Genetics 16 227 231 doi:10.1016/S0168-9525(00)02005-9

Fitch WM Smith TF 1983 Optimal sequence alignments. Proceedings of the National Academy of Sciences USA 80 1382 1386 doi:10.1073/pnas.80.5.1382

Fleißner R (2004) ‘Sequence alignment and phylogenetic inference.’ (Logos Verlag: Berlin)

Fleissner R Metzler D von Haeseler A 2005 Simultaneous statistical multiple alignment and phylogeny reconstruction. Systematic Biology 54 548 561 doi:10.1080/10635150590950371

Frith MC Hansen U Spouge JL Weng Z 2004 Finding functional sequence elements by multiple local alignment. Nucleic Acids Research 32 189 200 doi:10.1093/nar/gkh169

Gagnon S Bourbeau D Levesque RC 1996 Secondary structures and features of the 18S, 5.8S and 26S ribosomal RNAs from the Apicomplexan parasite Toxoplasma gondii. Gene 173 129 135 doi:10.1016/0378-1119(96)00215-6

Gardner PP Giegerich R 2004 A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinformatics 5 140 doi:10.1186/1471-2105-5-140

Gardner PP Wilm A Washietl S 2005 A benchmark of multiple sequence alignment programs upon structural RNAs. Nucleic Acids Research 33 2433 2439 doi:10.1093/nar/gki541

Geiger DL 2002 Stretch coding and block coding: two new strategies to represent questionably aligned DNA sequences. Journal of Molecular Evolution 54 191 199 doi:10.1007/s00239-001-0001-5

Gille C Frömmel C 2001 STRAP: editor for structural alignments of proteins. Bioinformatics 17 377 378 doi:10.1093/bioinformatics/17.4.377

Gillespie JJ 2004 Characterizing regions of ambiguous alignment caused by the expansion and contraction of hairpin-stem loops in ribosomal RNA molecules. Molecular Phylogenetics and Evolution 33 936 943 doi:10.1016/j.ympev.2004.08.004

Gillespie JJ Yoder MJ Wharton RA 2005 a Predicted secondary structure for 28S and 18S rRNA from Ichneumonoidea (Insecta: Hymenoptera: Apocrita): impact on sequence alignment and phylogeny estimation. Journal of Molecular Evolution 61 114 137 doi:10.1007/s00239-004-0246-x

Gillespie JJ McKenna CH Yoder MJ Gutell RR Johnston JS Kathirithamby J Cognato AI 2005 b Assessing the odd secondary structural properties of nuclear small subunit ribosomal RNA sequences (18S) of the twisted-wing parasites (Insecta: Strepsiptera). Insect Molecular Biology 14 625 643 doi:10.1111/j.1365-2583.2005.00591.x

Giribet G 2001 Exploring the behavior of POY, a program for direct optimization of molecular data. Cladistics 17 S60 S70 doi:10.1111/j.1096-0031.2001.tb00105.x

Giribet G (2002) Relationship among metazoan phyla as inferred from 18S rRNA sequence data: a methodological approach. In ‘Molecular systematics and evolution: theory and practice’. (Eds R DeSalle, G Giribet, W Wheeler) pp. 85–101. (Birkhäuser Verlag: Basel)

Giribet G 2005 Generating implied alignments under direct optimization using POY. Cladistics 21 396 402 doi:10.1111/j.1096-0031.2005.00071.x

Giribet G Wheeler WC 1999 On gaps. Molecular Phylogenetics and Evolution 13 132 143 doi:10.1006/mpev.1999.0643

Giribet G , Wheeler WC , Muona J (2002) DNA multiple sequence alignments. In ‘Molecular systematics and evolution: theory and practice’. (Eds R DeSalle, G Giribet, W Wheeler) pp. 107–114. (Birkhäuser Verlag: Basel)

Gonnet GH Korostensky C Benner S 2000 Evaluation measures of multiple sequence alignments. Journal of Computational Biology 7 261 276 doi:10.1089/10665270050081513

Gotoh O 1982 An improved algorithm for matching biological sequences. Journal of Molecular Biology 162 705 708 doi:10.1016/0022-2836(82)90398-9

Gotoh O 1990 Consistency of optimal sequence alignments. Bulletin of Mathematical Biology 52 509 525

Gotoh O 1995 A weighting scheme and algorithm for aligning many phylogenetically related sequences. Computer Applications in the Biosciences 11 543 551


Gotoh O 1996 Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. Journal of Molecular Biology 264 823 838
doi:10.1006/jmbi.1996.0679

Gotoh O 1999 Multiple sequence alignment: algorithms and applications. Advances in Biophysics 36 159 206 doi:10.1016/S0065-227X(99)80007-0

Gough J 2005 Convergent evolution of domain architectures is rare. Bioinformatics 21 1464 1471 doi:10.1093/bioinformatics/bti204

Graham SW Reeves PA Burns ACE Olmstead RG 2000 Microstructural changes in noncoding chloroplast DNA: interpretation, evolution, and utility of indels and inversions in basal angiosperm phylogenetic inference. International Journal of Plant Sciences 161 S83 S96 doi:10.1086/317583

Grasso C Lee C 2004 Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems. Bioinformatics 20 1546 1556 doi:10.1093/bioinformatics/bth126

Greenberg HJ Hart WE Lancia G 2004 Opportunities for combinatorial optimization in computational biology. INFORMS Journal on Computing 16 211 231 doi:10.1287/ijoc.1040.0073

Griffiths-Jones S 2005 RALEE—RNA alignment editor in emacs. Bioinformatics 21 257 259 doi:10.1093/bioinformatics/bth489

Griffiths-Jones S Moxon S Marshall M Khanna A Eddy SR Bateman A 2005 Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Research 33 D121 D124 doi:10.1093/nar/gki081

Gu X Li W-H 1995 The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment. Journal of Molecular Evolution 40 464 473 doi:10.1007/BF00164032

Gueneau de Novoa P Williams KP 2004 The tmRNA website: reductive evolution of tmRNA in plastids and other endosymbionts. Nucleic Acids Research 32 D104 D108 doi:10.1093/nar/gkh102

Gupta SK Kececioglu JD Schäffer AA 1995 Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. Journal of Computational Biology 2 459 472

Gutell RR Lee JC Cannone JJ 2002 The accuracy of ribosomal RNA comparative structure models. Current Opinion in Structural Biology 12 301 310
doi:10.1016/S0959-440X(02)00339-1

Hall TA 1999 BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41 95 98

Hancock JM Vogler AP 2000 How slippage-derived sequences are incorporated into rRNA variable-region secondary structure: implications for phylogeny reconstruction. Molecular Phylogenetics and Evolution 14 366 374
doi:10.1006/mpev.1999.0709

Haszprunar G 1998 Parsimony analysis as a specific kind of homology estimation and the implications for character weighting. Molecular Phylogenetics and Evolution 9 333 339 doi:10.1006/mpev.1998.0496

Heger A Holm L 2000 Rapid automatic detection and alignment of repeats in protein sequences. Proteins: Structure, Function, and Genetics 41 224 237 doi:10.1002/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z

Hein J 1990 Unified approach to alignment and phylogenies. Methods in Enzymology 183 626 645

Hein J 1994 An algorithm combining DNA and protein alignment. Journal of Theoretical Biology 167 169 174
doi:10.1006/jtbi.1994.1062

Hein J Støvlbæk J 1996 Combined DNA and protein alignment. Methods in Enzymology 266 402 418

Helm M Brulé H Friede D Giegé R Pütz J Florentz C 2000 Search for characteristic structural features of mammalian mitochondrial tRNAs. RNA 6 1356 1379
doi:10.1017/S1355838200001047

Henneke CM 1989 A multiple sequence alignment algorithm for homologous proteins using secondary structure information and optionally keying alignments to functionally important sites. Computer Applications in the Biosciences 5 141 150

Hennig W (1966) ‘Phylogenetic systematics.’ [Transl. DD Davis, R Zangerl from W Hennig (1950) ‘Grundzüge einer theorie der phylogenetischen systematik.’ (Deutscher Zentralverlag: Berlin)] (University of Illinois Press: Urbana)

Henikoff S 1991 Playing with blocks: some pitfalls of forcing multiple alignments. The New Biologist 3 1148 1154


Heringa J 1999 Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Computers and Chemistry 23 341 364
doi:10.1016/S0097-8485(99)00012-1

Hickson RE Simon C Cooper A Spicer GS Sullivan J Penny D 1996 Conserved sequence motifs, alignment, and secondary structure for the third domain of animal 12S rRNA. Molecular Biology and Evolution 13 150 169

Hickson RE Simon C Perrey SW 2000 The performance of several multiple-sequence alignment programs in relation to secondary-structure features for an rRNA sequence. Molecular Biology and Evolution 17 530 539


Higgins DG Thompson JD Gibson TJ 1996 Using CLUSTAL for multiple sequence alignments. Methods in Enzymology 266 383 402


Higgins DG Blackshields G Wallace IM 2005 Mind the gaps: progress in progressive alignment. Proceedings of the National Academy of Sciences USA 102 10 411 10 412
doi:10.1073/pnas.0504801102

Higgs PG 2000 RNA secondary structure: physical and computational aspects. Quarterly Reviews of Biophysics 33 199 253 doi:10.1017/S0033583500003620

Hillis DM (1994) Homology in molecular biology. In ‘Homology: the hierarchical basis of comparative biology’. (Ed. BK Hall) pp. 339–368. (Academic Press: San Diego)

Hirosawa M Totoki Y Hoshida M Ishikawa M 1995 Comprehensive study of iterative algorithms of multiple sequence alignment. Computer Applications in the Biosciences 11 13 18

Hofacker IL Bernhart SHF Stadler PF 2004 Alignment of RNA base pairing probability matrices. Bioinformatics 20 2222 2227
doi:10.1093/bioinformatics/bth229

Hogeweg P Hesper B 1984 The alignment of sets of sequences and the construction of phyletic trees: an integrated method. Journal of Molecular Evolution 20 175 186 doi:10.1007/BF02257378

Holm L Sander C 1996 Mapping the protein universe. Science 273 595 603 doi:10.1126/science.273.5275.595

Holmes I 2005 Accelerated probabilistic inference of RNA structure evolution. BMC Bioinformatics 6 73 doi:10.1186/1471-2105-6-73

Holmes I Durbin R 1998 Dynamic programming alignment accuracy. Journal of Computational Biology 5 493 504

Hoot SB Douglas AW 1998 Phylogeny of the Proteaceae based on atpB and atpB–rbcL intergenic spacer region sequences. Australian Systematic Botany 11 301 320
doi:10.1071/SB98027

Hua Y Jiang T Wu B 1999 Aligning DNA sequences to minimize the change in protein. Journal of Combinatorial Optimization 3 227 245 doi:10.1023/A:1009889710983

Huang X Miller W 1991 A time-efficient, linear-space local similarity algorithm. Advances in Applied Mathematics 12 337 357 doi:10.1016/0196-8858(91)90017-D

Hudak J , McClure MA (1999) A comparative analysis of computational motif-detection methods. In ‘Proceedings of the 4th Pacific Symposium on Biocomputing 1999, Hawaii’. pp. 138–149.

Janies DA Wheeler WC 2001 Efficiency of parallel direct optimization. Cladistics 17 S71 S82 doi:10.1111/j.1096-0031.2001.tb00106.x

Jennings AJ Edge CM Sternberg MJE 2001 An approach to improving multiple alignments of protein sequences using predicted secondary structure. Protein Engineering 14 227 231 doi:10.1093/protein/14.4.227

Jeon Y-S Chung H Park S Hur I Lee J-H Chun J 2005 jPHYDIT: a JAVA-based integrated environment for molecular phylogeny of ribosomal RNA sequences. Bioinformatics 21 3171 3173 doi:10.1093/bioinformatics/bti463

Jiang T , Lawler EL , Wang L (1994) Aligning sequences via an evolutionary tree: complexity and approximation. In ‘Proceedings of the 26th annual ACM symposium on theory of computing’. pp. 760–769. (ACM Press: New York)

Johnson MS Sali A Blundell TL 1990 Phylogenetic relationships from three-dimensional protein structures. Methods in Enzymology 183 670 690

Johnson R 1982 Parsimony principles in phylogenetic systematics: a critical re-appraisal. Evolutionary Theory 6 79 90


Just W 2001 Computational complexity of multiple sequence alignment with SP-score. Journal of Computational Biology 8 615 623
doi:10.1089/106652701753307511

Just W Della Vedova G 2004 Multiple sequence alignment as a facility location problem. INFORMS Journal on Computing 16 430 440 doi:10.1287/ijoc.1040.0093

Karaca M Bilgen M Onus AN Ince AG Elmasulu SY 2005 Exact Tandem Repeats Analyzer (E-TRA): a new program for DNA sequence mining. Journal of Genetics 84 49 54

Karp RM 2002 Mathematical challenges from genomics and molecular biology. Notices of the AMS 49 544 553


Karplus K Hu B 2001 Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set. Bioinformatics 17 713 720
doi:10.1093/bioinformatics/17.8.713

Katoh K Misawa K Kuma K Miyata T 2002 MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30 3059 3066 doi:10.1093/nar/gkf436

Katoh K Kuma K Toh H Miyata T 2005 a MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33 511 518 doi:10.1093/nar/gki198

Katoh K Kuma K Miyata T Toh H 2005 b Improvement in the accuracy of multiple sequence alignment program MAFFT. Genome Informatics 16 22 33

Kawakita A Sota T Ascher JS Ito M Tanaka H Kato M 2003 Evolution and phylogenetic utility of alignment gaps within intron sequences of three nuclear genes in bumble bees (Bombus). Molecular Biology and Evolution 20 87 92 doi:10.1093/molbev/msg007

Kececioglu J, Starrett D (2004) Aligning alignments exactly. In ‘Proceedings of the 8th ACM conference on research in computational molecular biology (RECOMB’04)’. pp. 85–96. (ACM Press: New York)

Kececioglu J Kim E 2006 Simple and fast inverse alignment. Lecture Notes in Computer Science 3909 441 455

Keightley PD Johnson T 2004 MCALIGN: stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution. Genome Research 14 442 450 doi:10.1101/gr.1571904

Kelchner SA 2000 The evolution of non-coding chloroplast DNA and its application in plant systematics. Annals of the Missouri Botanical Garden 87 482 498 doi:10.2307/2666142

Kelchner SA 2002 Group II introns as phylogenetic tools: structure, function, and evolutionary constraints. American Journal of Botany 89 1651 1669

Kelchner SA Wendel JF 1996 Hairpins create minute inversions in non-coding regions of chloroplast DNA. Current Genetics 30 259 262 doi:10.1007/s002940050130

Kelchner SA Clark LG 1997 Molecular evolution and phylogenetic utility of the chloroplast rpl16 intron in Chusquea and the Bambusoideae (Poaceae). Molecular Phylogenetics and Evolution 8 385 397 doi:10.1006/mpev.1997.0432

Kjer KM 1995 Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: an example of alignment and data presentation from the frogs. Molecular Phylogenetics and Evolution 4 314 330 doi:10.1006/mpev.1995.1028

Kjer KM 1997 An alignment template for amphibian 12S rRNA, domain III: conserved primary and secondary structural motifs. Journal of Herpetology 31 599 604 doi:10.2307/1565621

Kjer KM 2004 Aligned 18S and insect phylogeny. Systematic Biology 53 506 514 doi:10.1080/10635150490445922

Kjer KM Baldridge GD Fallon AM 1994 Mosquito large subunit ribosomal RNA: simultaneous alignment of primary and secondary structure. Biochimica et Biophysica Acta 1217 147 155

Kjer KM Gillespie JJ Ober KA 2006 Opinions on multiple sequence alignment, and an empirical comparison of repeatability and accuracy between POY and structural alignment. Systematic Biology in press

Kleinjung J Douglas N Heringa J 2002 Parallelized multiple alignment. Bioinformatics 18 1270 1271 doi:10.1093/bioinformatics/18.9.1270

Knudsen B Miyamoto M 2003 Sequence alignments and pair hidden Markov models using evolutionary history. Journal of Molecular Biology 333 453 460 doi:10.1016/j.jmb.2003.08.015

Kolodny R Koehl P Levitt M 2005 Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. Journal of Molecular Biology 346 1173 1188 doi:10.1016/j.jmb.2004.12.032

Kreitman M 1983 Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304 412 417 doi:10.1038/304412a0

Kroken S Taylor JW 2001 Outcrossing and recombination in the lichenized fungus Letharia. Fungal Genetics and Biology 34 83 92 doi:10.1006/fgbi.2001.1291

Kurtz S Schleiermacher C 1999 REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15 426 427 doi:10.1093/bioinformatics/15.5.426

Lambert C Van Campenhout J-M DeBolle X Depiereux E 2003 Review of common sequence alignment methods: clues to enhance reliability. Current Genomics 4 131 146 doi:10.2174/1389202033350038

Lancia G Ravi R 1999 GESTALT: genomic Steiner alignments. Lecture Notes in Computer Science 1645 101 114

Lassmann T Sonnhammer ELL 2002 Quality assessment of multiple alignment programs. FEBS Letters 529 126 130 doi:10.1016/S0014-5793(02)03189-7

Lassmann T Sonnhammer ELL 2005 Automatic assessment of alignment quality. Nucleic Acids Research 33 7120 7128 doi:10.1093/nar/gki1020

Laurenne NM Broad GR Quicke DLJ 2006 Direct optimization and multiple alignment of 28S D2–D3 rDNA sequences: problems with indels on the way to a molecular phylogeny of the cryptine ichneumon wasps (Insecta: Hymenoptera). Cladistics 22 442 473 doi:10.1111/j.1096-0031.2006.00112.x

Lawrence CJ Malmberg RL Muszynski MG Dawe RK 2002 Maximum likelihood methods reveal conservation of function among closely related kinesin families. Journal of Molecular Evolution 54 42 53 doi:10.1007/s00239-001-0016-y

Lawrence CJ Zmasek CM Dawe RK Malmberg RL 2004 LumberJack: a heuristic tool for sequence alignment exploration and phylogenetic inference. Bioinformatics 20 1977 1979 doi:10.1093/bioinformatics/bth180

Lebrun E Santini JM Brugna M Ducluzeau A-L Ouchane S Schoepp-Cothenet B Baymann F Nitschke W 2006 The Rieske protein: a case study on the pitfalls of multiple sequence alignments and phylogenetic reconstruction. Molecular Biology and Evolution 23 1180 1191 doi:10.1093/molbev/msk010

Lecompte O Thompson JD Plewniak F Thierry J-C Poch O 2001 Multiple alignment of complete sequences (MACS) in the post-genomic era. Gene 270 17 30 doi:10.1016/S0378-1119(01)00461-9

Lee MSY 2001 Unalignable sequences and molecular evolution. Trends in Ecology and Evolution 16 681 685 doi:10.1016/S0169-5347(01)02313-8

Lenhof H-P Reinert K Vingron M 1998 A polyhedral approach to RNA sequence structure alignment. Journal of Computational Biology 5 517 530

Li K-B 2003 ClustalW-MPI: ClustalW analysis using distributed and parallel computing. Bioinformatics 19 1585 1586 doi:10.1093/bioinformatics/btg192

Lombard V Camon EB Parkinson HE Hingamp P Stoesser G Redaschi N 2002 EMBL-Align: a new public nucleotide and amino acid multiple sequence alignment database. Bioinformatics 18 763 764 doi:10.1093/bioinformatics/18.5.763

Löytynoja A Milinkovitch MC 2001 SOAP, cleaning multiple alignments from unstable blocks. Bioinformatics 17 573 574 doi:10.1093/bioinformatics/17.6.573

Löytynoja A Milinkovitch MC 2003 A hidden Markov model for progressive multiple alignment. Bioinformatics 19 1505 1513 doi:10.1093/bioinformatics/btg193

Löytynoja A Goldman N 2005 An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences USA 102 10557 10562 doi:10.1073/pnas.0409137102

Lu CL Huang YP 2005 A memory-efficient algorithm for multiple sequence alignment with constraints. Bioinformatics 21 23 30

Ludwig W Strunk O Westram R Richter L Meier H et al 2004 ARB: a software environment for sequence data. Nucleic Acids Research 32 1363 1371 doi:10.1093/nar/gkh293

Lunter G, Drummond AJ, Miklós I, Hein J (2005) Statistical alignment: recent progress, new applications, and challenges. In ‘Statistical methods in molecular evolution’. (Ed. R Nielsen) pp. 375–405. (Springer: New York)

Manohar A, Batzoglou S (2005) TreeRefiner: a tool for refining a multiple alignment on a phylogenetic tree. In ‘Proceedings of the 2005 IEEE computational systems bioinformatics conference (CSB’05)’. pp. 111–119. (IEEE Press: Piscataway)

Marchler-Bauer A Panchenko AR Ariel N Bryant SH 2002 Comparison of sequence and structure alignments for protein domains. Proteins: Structure, Function, and Genetics 48 439 446 doi:10.1002/prot.10163

Marchler-Bauer A Anderson JB Cherukuri PF DeWeese-Scott C Geer LY et al 2005 CDD: a conserved domain database for protein classification. Nucleic Acids Research 33 D192 D196 doi:10.1093/nar/gki069

Margulies EH Chen CW Green ED 2006 Differences between pair-wise and multi-sequence alignment methods affect vertebrate genome comparisons. Trends in Genetics 22 187 193 doi:10.1016/j.tig.2006.02.005

Marsden B Abagyan R 2004 SAD—a normalized structural alignment database: improving sequence–structure alignments. Bioinformatics 20 2333 2344 doi:10.1093/bioinformatics/bth244

Marti-Renom MA Madhusudhan MS Sali A 2004 Alignment of protein sequences by their profiles. Protein Science 13 1071 1087 doi:10.1110/ps.03379804

May ACW 2004 Percent sequence identity: the need to be explicit. Structure 12 737 738 doi:10.1016/j.str.2004.04.001

McClure MA Vasi TK Fitch WM 1994 Comparative analysis of multiple protein-sequence alignment methods. Molecular Biology and Evolution 11 571 592

Mecham J Clement M Snell Q Freestone T Seppi K Crandall K 2006 Jumpstarting phylogenetic analysis. International Journal of Bioinformatics Research and Applications 2 19 35

Miklós I Lunter GA Holmes I 2004 A “long indel” model for evolutionary sequence alignment. Molecular Biology and Evolution 21 529 540 doi:10.1093/molbev/msh043

Milinkovitch MC LeDuc RG Adachi J Farnir F Georges M Hasegawa M 1996 Effects of character weighting and species sampling on phylogeny reconstruction: a case study based on DNA sequence data in cetaceans. Genetics 144 1817 1833

Miller W 2001 Comparison of genomic DNA sequences: solved and unsolved problems. Bioinformatics 17 391 397 doi:10.1093/bioinformatics/17.5.391

Mindell DP (1991) Aligning DNA sequences: homology and phylogenetic weighting. In ‘Phylogenetic analysis of DNA sequences’. (Eds MM Miyamoto, J Cracraft) pp. 73–89. (Oxford University Press: New York)

Morell V 1996 TreeBASE: the roots of phylogeny. Science 273 569 doi:10.1126/science.273.5275.569

Morgenstern B 1999 DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15 211 218 doi:10.1093/bioinformatics/15.3.211

Morgenstern B Prohaska SJ Pohler D Stadler PF 2006 Multiple sequence alignment with user-defined anchor points. Algorithms for Molecular Biology 1 6 doi:10.1186/1748-7188-1-6

Morris P Cobabe E 1991 Cuvier meets Watson and Crick: the utility of molecules as classical homologies. Biological Journal of the Linnean Society 44 307 324

Morrison DA 2006 Phylogenetic analyses of parasites in the new millennium. Advances in Parasitology 63 1 124

Morrison DA Ellis JT 1997 Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. Molecular Biology and Evolution 14 428 441

Mugridge NB Morrison DA Jäkel T Heckeroth AR Tenter AM Johnson AM 2000 Effects of sequence alignment and structural domains of ribosomal DNA on phylogeny reconstruction for the protozoan family Sarcocystidae. Molecular Biology and Evolution 17 1842 1853

Myers G Selznick S Zhang Z Miller W 1996 Progressive multiple alignment with constraints. Journal of Computational Biology 3 563 572

Nguyen HD Yoshihara I Yamamori K Yasunaga M 2002 Aligning multiple protein sequences by parallel hybrid genetic algorithm. Genome Informatics 13 123 132

Nicholas HB Ropelewski AJ Deerfield DW 2002 Strategies for multiple sequence alignment. BioTechniques 32 572 591

Notredame C 2002 Recent progress in multiple sequence alignment: a survey. Pharmacogenomics 3 131 144 doi:10.1517/14622416.3.1.131

Notredame C O’Brien EA Higgins DG 1997 RAGA: RNA sequence alignment by genetic algorithm. Nucleic Acids Research 25 4570 4580 doi:10.1093/nar/25.22.4570

Notredame C Holm L Higgins DG 1998 COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14 407 422 doi:10.1093/bioinformatics/14.5.407

Notredame C Higgins DG Heringa J 2000 T-coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302 205 217 doi:10.1006/jmbi.2000.4042

Nozaki Y Bellgard M 2005 Statistical evaluation and comparison of a pairwise alignment algorithm that a priori assigns the number of gaps rather than employing gap penalties. Bioinformatics 21 1421 1428 doi:10.1093/bioinformatics/bti198

O’Brien EA Notredame C Higgins DG 1998 Optimization of ribosomal RNA profile alignments. Bioinformatics 14 332 341 doi:10.1093/bioinformatics/14.4.332

O’Donnell K Kistler HC Tacke BK Casper HH 2000 Gene genealogies reveal global phylogeographic structure and reproductive isolation among lineages of Fusarium graminearum, the fungus causing wheat scab. Proceedings of the National Academy of Sciences USA 97 7905 7910 doi:10.1073/pnas.130193297

Ogden TH Rosenberg MS 2006 Multiple sequence alignment accuracy and phylogenetic inference. Systematic Biology 55 314 328 doi:10.1080/10635150500541730

Ohlson T Wallner B Elofsson A 2004 Profile–profile methods provide improved fold recognition: a study of different profile–profile alignment methods. Proteins: Structure, Function, and Bioinformatics 57 188 197 doi:10.1002/prot.20184

Oliver T Schmidt B Nathan D Clemens R Maskell D 2005 Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW. Bioinformatics 21 3431 3432 doi:10.1093/bioinformatics/bti508

Ophir R Graur D 1997 Patterns and rates of indel evolution in processed pseudogenes from humans and murids. Gene 205 191 202 doi:10.1016/S0378-1119(97)00398-3

O’Sullivan O Suhre K Abergel C Higgins DG Notredame C 2004 3DCoffee: combining protein sequences and structures within multiple sequence alignments. Journal of Molecular Biology 340 385 395 doi:10.1016/j.jmb.2004.04.058

Page RDM 2000 Comparative analysis of secondary structure of insect mitochondrial small subunit ribosomal RNA using maximum weighted matching. Nucleic Acids Research 28 3839 3845 doi:10.1093/nar/28.20.3839

Parida L Floratos A Rigoutsos I 1999 An approximation algorithm for alignment of multiple sequences using motif discovery. Journal of Combinatorial Optimization 3 247 275 doi:10.1023/A:1009841927822

Parmentier G Trystram D Zola J 2004 Cache-based parallelization of multiple sequence alignment problem. Lecture Notes in Computer Science 3149 1005 1012

Pascarella S Argos P 1992 Analysis of insertions/deletions in protein structures. Journal of Molecular Biology 224 461 471 doi:10.1016/0022-2836(92)91008-D

Patterson C 1988 Homology in classical and molecular biology. Molecular Biology and Evolution 5 603 625

Pearson WR Sierk ML 2005 The limits of protein sequence comparison? Current Opinion in Structural Biology 15 254 260 doi:10.1016/j.sbi.2005.05.005

Pedersen CNS Lyngsø R Hein J 1998 Comparison of coding DNA. Lecture Notes in Computer Science 1448 153 173

Pei J Grishin NV 2006 MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information. Nucleic Acids Research 34 4364 4374 doi:10.1093/nar/gkl514

Pei J Sadreyev R Grishin NV 2003 PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 19 427 428 doi:10.1093/bioinformatics/btg008

Petersen G Seberg O Aagesen L Frederiksen S 2004 An empirical test of the treatment of indels during optimization alignment based on the phylogeny of the genus Secale (Poaceae). Molecular Phylogenetics and Evolution 30 733 742 doi:10.1016/S1055-7903(03)00206-9

Pettersson EU Ljunggren EL Morrison DA Mattsson JG 2005 Functional analysis and localisation of a class delta glutathione S-transferase from Sarcoptes scabiei. International Journal for Parasitology 35 39 48 doi:10.1016/j.ijpara.2004.09.006

Phillips A 2006 Homology assessment and molecular sequence alignment. Journal of Biomedical Informatics 39 18 33 doi:10.1016/j.jbi.2005.11.005

Phillips A Janies D Wheeler W 2000 Multiple sequence alignment in phylogenetic analysis. Molecular Phylogenetics and Evolution 16 317 330 doi:10.1006/mpev.2000.0785

Pible O Imbert G Pellequer J-L 2005 INTERALIGN: interactive alignment editor for distantly related protein sequences. Bioinformatics 21 3166 3167 doi:10.1093/bioinformatics/bti474

de Pinna MCC 1991 Concepts and tests of homology in the cladistic paradigm. Cladistics 7 367 394 doi:10.1111/j.1096-0031.1991.tb00045.x

Poch O Delarue M 1996 Converting sequence block alignments into structural insights. Methods in Enzymology 266 662 680

Pollard DA Bergman CM Stoye J Celniker SE Eisen MB 2004 Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics 5 6 doi:10.1186/1471-2105-5-6

Ponting CP, Birney E (2000) Identification of domains from protein sequences. In ‘Protein structure prediction: methods and protocols’. (Ed. DM Webster) pp. 53–69. (Humana Press: Totowa)

Qian B Goldstein RA 2001 Distribution of indel lengths. Proteins: Structure, Function, and Genetics 45 102 104 doi:10.1002/prot.1129

Raghava GPS Searle SMJ Audley PC Barber JD Barton GJ 2003 OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinformatics 4 47 doi:10.1186/1471-2105-4-47

Rainaldi G Volpicella M Licciulli F Liuni S Gallerani R Ceci LR 2003 PLMItRNA, a database on the heterogeneous genetic origin of mitochondrial tRNA genes and tRNAs in photosynthetic eukaryotes. Nucleic Acids Research 31 436 438 doi:10.1093/nar/gkg080

Raphael B Zhi D Tang H Pevzner P 2004 A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Research 14 2336 2346 doi:10.1101/gr.2657504

Redelings BD Suchard MA 2005 Joint Bayesian estimation of alignment and phylogeny. Systematic Biology 54 401 418 doi:10.1080/10635150590947041

Reeck GR de Haën C Teller DC Doolittle RF Fitch WM et al 1987 “Homology” in proteins and nucleic acids: a terminology muddle and a way out of it. Cell 50 667 doi:10.1016/0092-8674(87)90322-9

Reese JT Pearson WR 2002 Empirical determination of effective gap penalties for sequence comparison. Bioinformatics 18 1500 1507 doi:10.1093/bioinformatics/18.11.1500

Reinert K Stoye J Will T 2000 An iterative method for faster sum-of-pairs multiple sequence alignment. Bioinformatics 16 808 814 doi:10.1093/bioinformatics/16.9.808

Riaz T Wang Y Li K-B 2004 Multiple sequence alignment using tabu search. Conferences in Research and Practice in Information Technology 29 223 232

Riaz T Wang Y Li K-B 2005 Tabu search algorithm for post-processing multiple sequence alignment. Journal of Bioinformatics and Computational Biology 3 145 156 doi:10.1142/S0219720005000928

Rice KA Donoghue MJ Olmstead RG 1997 Analyzing large data sets: rbcL 500 revisited. Systematic Biology 46 554 563 doi:10.2307/2413696

Rieppel O (1994) Homology, topology, and typology: the history of modern debates. In ‘Homology: the hierarchical basis of comparative biology’. (Ed. BK Hall) pp. 63–100. (Academic Press: San Diego)

Rieppel O Kearney M 2002 Similarity. Biological Journal of the Linnean Society 75 59 82 doi:10.1046/j.1095-8312.2002.00006.x

Rinsma-Melchert I 1993 The expected number of matches in optimal global sequence alignments. New Zealand Journal of Botany 31 219 230

Rodriguez R, Vriend G (1997) Professional gambling. In ‘Biomolecular structure and dynamics: recent experimental and theoretical advances’. (Eds G Vergoten, T Theophanides) pp. 79–120. (Kluwer Academic Publishers: Dordrecht)

Rosenberg MS 2005 a Evolutionary distance estimation and fidelity of pair wise sequence alignment. BMC Bioinformatics 6 102 doi:10.1186/1471-2105-6-102

Rosenberg MS 2005 b MySSP: non-stationary evolutionary sequence simulation, including indels. Evolutionary Bioinformatics Online 1 81 83

Roshan U Livesay DR 2006 Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics 22 2715 2721 doi:10.1093/bioinformatics/btl472

Rost B Valencia A 1996 Pitfalls of protein sequence analysis. Current Opinion in Biotechnology 7 457 461 doi:10.1016/S0958-1669(96)80124-8

Sadreyev RI Grishin NV 2004 Estimates of statistical significance for comparison of individual positions in multiple sequence alignments. BMC Bioinformatics 5 106 doi:10.1186/1471-2105-5-106

Sammeth M Heringa J 2006 Global multiple-sequence alignment with repeats. Proteins: Structure, Function, and Bioinformatics 64 263 274 doi:10.1002/prot.20957

Sammeth M Weniger T Harmsen D Stoye J 2005 Alignment of tandem repeats with excision, duplication, substitution and indels (EDSI). Lecture Notes in Computer Science 3692 276 290

Sanchis A Michelena JM Latorre A Quicke DLJ Gärdenfors U Belshaw R 2001 The phylogenetic analysis of variable-length sequence data: elongation factor-1α introns in European populations of the parasitoid wasp genus Pauesia (Hymenoptera: Braconidae: Aphidiinae). Molecular Biology and Evolution 18 1117 1131

Sankoff D, Cedergren RJ (1983) Simultaneous comparison of three or more sequences related by a tree. In ‘Time warps, string edits, and macromolecules: the theory and practice of sequence comparison’. (Eds D Sankoff, JB Kruskal) pp. 253–264. (Addison-Wesley: Reading)

Sankoff D Morel C Cedergren RJ 1973 Evolution of 5S RNA and the non-randomness of base replacement. Nature 245 232 234 doi:10.1038/245232a0

Sauder JM Arthur JW Dunbrack RL 2000 Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins: Structure, Function, and Genetics 40 6 22 doi:10.1002/(SICI)1097-0134(20000701)40:1<6::AID-PROT30>3.0.CO;2-7

Schmollinger M Nieselt K Kaufmann M Morgenstern B 2004 DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors. BMC Bioinformatics 5 128 doi:10.1186/1471-2105-5-128

Schuler GD Altschul SF Lipman DJ 1991 A workbench for multiple alignment construction and analysis. Proteins 9 180 190 doi:10.1002/prot.340090304

Schultes EA Hraber PT LaBean TH 1999 Estimating the contributions of selection and self-organization in RNA secondary structure. Journal of Molecular Evolution 49 76 83 doi:10.1007/PL00006536

Schultz J Maisel S Gerlach D Müller T Wolf M 2005 A common core of secondary structure of the internal transcribed spacer 2 (ITS2) throughout the Eukaryota. RNA 11 361 364 doi:10.1261/rna.7204505

Schwikowski B Vingron M 1997 a The deferred path heuristic for the generalized tree alignment problem. Journal of Computational Biology 4 415 431

Schwikowski B Vingron M 1997 b A clustering approach to generalized tree alignment with application to Alu repeats. Lecture Notes in Computer Science 1278 115 124

Schwikowski B Vingron M 2003 Sequence graphs: boosting iterated dynamic programming using locally suboptimal solutions. Discrete Applied Mathematics 127 95 117 doi:10.1016/S0166-218X(02)00288-3

Shakhnovich BE 2005 Improving the precision of the structure–function relationship by considering phylogenetic context. PLoS Computational Biology 1 e9 doi:10.1371/journal.pcbi.0010009

Shull VL Vogler AP Baker MD Maddison DR Hammond PM 2001 Sequence alignment of 18S ribosomal RNA and the basal relationships of adephagan beetles: evidence for monophyly of aquatic families and the placement of Trachypachidae. Systematic Biology 50 945 969 doi:10.1080/106351501753462894

Siddharthan R 2006 Sigma: multiple alignment of weakly-conserved non-coding DNA sequence. BMC Bioinformatics 7 143 doi:10.1186/1471-2105-7-143

Siebert S Backofen R 2005 MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons. Bioinformatics 21 3352 3359 doi:10.1093/bioinformatics/bti550

Simmons MP 2004 Independence of alignment and tree search. Molecular Phylogenetics and Evolution 31 874 879 doi:10.1016/j.ympev.2003.10.008

Simmons MP Ochoterena H 2000 Gaps as characters in sequence-based phylogenetic analysis. Systematic Biology 49 369 381 doi:10.1080/10635159950173889

Simmons MP Freudenstein JV 2003 The effects of increasing genetic distance on alignment of, and tree construction from, rDNA internal transcribed spacer sequences. Molecular Phylogenetics and Evolution 26 444 451 doi:10.1016/S1055-7903(02)00366-4

Simmons MP Carr TG O’Neill K 2004 Relative character-state space, amount of potential phylogenetic information, and heterogeneity of nucleotide and amino acid characters. Molecular Phylogenetics and Evolution 32 913 926 doi:10.1016/j.ympev.2004.04.011

Simossis VA Heringa J 2004 Integrating protein secondary structure prediction and multiple sequence alignment. Current Protein and Peptide Science 5 249 266 doi:10.2174/1389203043379675

Simossis VA Heringa J 2005 PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Research 33 W289 W294 doi:10.1093/nar/gki390

Simossis VA Kleinjung J Heringa J 2005 Homology-extended sequence alignment. Nucleic Acids Research 33 816 824 doi:10.1093/nar/gki233

Slowinski JB 1998 The number of multiple alignments. Molecular Phylogenetics and Evolution 10 264 266 doi:10.1006/mpev.1998.0522

Sluys R 1996 The notion of homology in current comparative biology. Journal of Zoological Systematics and Evolutionary Research 34 145 152

Smith NGC Hurst LD 1998 Sensitivity of patterns of molecular evolution to alterations in methodology: a critique of Hughes and Yeager. Journal of Molecular Evolution 47 493 500 doi:10.1007/PL00013151

del Sol Mesa A Pazos F Valencia A 2003 Automatic methods for predicting functionally important residues. Journal of Molecular Biology 326 1289 1302 doi:10.1016/S0022-2836(02)01451-1

Sprinzl M Vassilenko KS 2005 Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Research 33 D139 D140 doi:10.1093/nar/gki012

Stebbings LA Mizuguchi K 2004 HOMSTRAD: recent developments of the homologous protein structure alignment database. Nucleic Acids Research 32 D203 D207 doi:10.1093/nar/gkh027

Stocsits RR Hofacker IL Fried C Stadler PF 2005 Multiple sequence alignments of partially coding nucleic acid sequences. BMC Bioinformatics 6 160 doi:10.1186/1471-2105-6-160

Stoye J Evers D Meyer F 1998 Rose: generating sequence families. Bioinformatics 14 157 163 doi:10.1093/bioinformatics/14.2.157

Subramanian AR Weyer-Menkhoff J Kaufmann M Morgenstern B 2005 DIALIGN-T: an improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics 6 66 doi:10.1186/1471-2105-6-66

Sze S-H Lu Y Yang Q 2006 A polynomial time solvable formulation of multiple sequence alignment. Journal of Computational Biology 13 309 319 doi:10.1089/cmb.2006.13.309

Szklarczyk R Heringa J 2004 Tracking repeats using significance and transitivity. Bioinformatics 20 i311 i317 doi:10.1093/bioinformatics/bth911

Szymanski M Barciszewska MZ Erdmann VA Barciszewski J 2002 5S ribosomal RNA database. Nucleic Acids Research 30 176 178 doi:10.1093/nar/30.1.176

Taylor WR 1986 Identification of protein sequence homology by consensus template alignment. Journal of Molecular Biology 188 233 258 doi:10.1016/0022-2836(86)90308-6

Taylor WR 1987 Multiple sequence alignment by a pairwise algorithm. Computer Applications in the Biosciences 3 81 87

Taylor WR 1996 Multiple protein sequence alignment: algorithms and gap insertion. Methods in Enzymology 266 343 367

Teeling H Gloeckner FO 2006 RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits. BMC Bioinformatics 7 66 doi:10.1186/1471-2105-7-66

Telford MJ Wise MJ Gowri-Shankar V 2005 Consideration of RNA secondary structure significantly improves likelihood-based estimates of phylogeny: examples from the Bilateria. Molecular Biology and Evolution 22 1129 1136 doi:10.1093/molbev/msi099

Terry MD Whiting MF 2005 Comparison of two alignment techniques within a single complex data set: POY versus Clustal. Cladistics 21 272 281 doi:10.1111/j.1096-0031.2005.00063.x

Thébault P Monestié A Higgins DG 1999 MIAH: automatic alignment of eukaryotic SSU rRNAs. Bioinformatics 15 341 342 doi:10.1093/bioinformatics/15.4.341

Thompson JD Higgins DG Gibson TJ 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22 4673 4680

Thompson JD Gibson TJ Plewniak F Jeanmougin F Higgins DG 1997 The CLUSTAL-X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 25 4876 4882 doi:10.1093/nar/25.24.4876

Thompson JD Plewniak F Poch O 1999 a BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15 87 88 doi:10.1093/bioinformatics/15.1.87

Thompson JD Plewniak F Poch O 1999 b A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research 27 2682 2690 doi:10.1093/nar/27.13.2682

Thompson JD Plewniak F Thierry J-C Poch O 2000 DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches. Nucleic Acids Research 28 2919 2926 doi:10.1093/nar/28.15.2919

Thompson JD Plewniak F Ripp R Thierry J-C Poch O 2001 Towards a reliable objective function for multiple sequence alignments. Journal of Molecular Biology 314 937 951 doi:10.1006/jmbi.2001.5187

Thompson JD Thierry JC Poch O 2003 RASCAL: rapid scanning and correction of multiple sequence alignments. Bioinformatics 19 1155 1161 doi:10.1093/bioinformatics/btg133

Thompson JD Koehl P Ripp R Poch O 2005 BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins: Structure, Function, and Bioinformatics 61 127 136 doi:10.1002/prot.20527

Thomsen R, Fogel GB, Krink T (2002) A Clustal alignment improver using evolutionary algorithms. In ‘Proceedings of the fourth congress on evolutionary computation (CEC-2002)’. (Eds DB Fogel, X Yao, G Greenwood, H Iba, P Marrow, M Shackleton) pp. 121–126. (IEEE Press: Piscataway)

Thomsen R, Fogel GB, Krink T (2003) Improvement of Clustal-derived sequence alignments with evolutionary algorithms. In ‘Proceedings of the fifth congress on evolutionary computation (CEC-2003)’. (Eds DR Sarker, R Reynolds, H Abbass, KC Tan, B McKay, D Essam, T Gedeon) pp. 1499–1507. (IEEE Press: Piscataway)

Thorne JL Kishino H 1992 Freeing phylogenies from artifacts of alignment. Molecular Biology and Evolution 9 1148 1162

Thorne JL Churchill GA 1995 Estimation and reliability of molecular sequence alignments. Biometrics 51 100 113 doi:10.2307/2533318

Thorne JL Kishino H Felsenstein J 1991 An evolutionary model for maximum likelihood alignment of DNA sequences. Journal of Molecular Evolution 33 114 124 doi:10.1007/BF02193625

Thorne JL Kishino H Felsenstein J 1992 Inching toward reality: an improved likelihood model for sequence evolution. Journal of Molecular Evolution 34 3 16 doi:10.1007/BF00163848

Titus TA Frost DR 1996 Molecular homology assessment and phylogeny in the lizard family Opluridae (Squamata: Iguania). Molecular Phylogenetics and Evolution 6 49 62 doi:10.1006/mpev.1996.0057

Touzet H Perriquet O 2004 CARNAC: folding families of related RNAs. Nucleic Acids Research 32 W142 W145

Trystram D Zola J 2005 Parallel multiple sequence alignment with decentralized cache support. Lecture Notes in Computer Science 3648 1217 1226

Tsai YT Huang YP Yu CT Lu CL 2004 MuSiC: a tool for multiple sequence alignment with constraints. Bioinformatics 20 2309 2311 doi:10.1093/bioinformatics/bth220

Tyson H 1992 Relationships between amino acid sequences determined through optimum alignments, clustering, and specific distance patterns: application to a group of scorpion toxins. Genome 35 360 371

van Valen L 1982 Homology and causes. Journal of Morphology 173 305 312 doi:10.1002/jmor.1051730307

Van Walle I Lasters I Wyns L 2004 Align-m—a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics 20 1428 1435 doi:10.1093/bioinformatics/bth116

Van Walle I Lasters I Wyns L 2005 SABmark—a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21 1267 1268 doi:10.1093/bioinformatics/bth493

Varani G, Pardi A (1994) Structure of RNA. In ‘RNA–protein interactions’. (Eds K Nagai, IW Mattaj) pp. 1–24. (IRL Press: Oxford)

Vingron M (1999) Sequence alignment and phylogeny construction. In ‘Mathematical support for molecular biology’. (Eds M Farach-Colton, FS Roberts, M Vingron, M Waterman) pp. 53–64. (American Mathematical Society: Providence)

Vingron M, Waterman MS (1994) Sequence alignments and penalty choice: review of concepts, case studies and implications. Journal of Molecular Biology 235, 1–12. doi:10.1016/S0022-2836(05)80006-3

Vingron M, von Haeseler A (1997) Towards integration of multiple alignment and phylogenetic tree construction. Journal of Computational Biology 4, 23–34.

Vogt G, Etzold T, Argos P (1995) An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. Journal of Molecular Biology 249, 816–831. doi:10.1006/jmbi.1995.0340

Vogt L (2002) Testing and weighting characters. Organisms, Diversity and Evolution 2, 319–333. doi:10.1078/1439-6092-00051

Wagner GP (1989) The biological homology concept. Annual Review of Ecology and Systematics 20, 51–69. doi:10.1146/annurev.es.20.110189.000411

Wallace IM, Blackshields G, Higgins DG (2005 a) Multiple sequence alignments. Current Opinion in Structural Biology 15, 261–266. doi:10.1016/j.sbi.2005.04.002

Wallace IM, O’Sullivan O, Higgins DG (2005 b) Evaluation of iterative alignment algorithms for multiple alignment. Bioinformatics 21, 1408–1414. doi:10.1093/bioinformatics/bti159

Wallace IM, O’Sullivan O, Higgins DG, Notredame C (2006) M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Research 34, 1692–1699. doi:10.1093/nar/gkl091

Wang G, Dunbrack RL (2004) Scoring profile-to-profile sequence alignments. Protein Science 13, 1612–1626. doi:10.1110/ps.03601504

Wang L, Jiang T (1994) On the complexity of multiple sequence alignment. Journal of Computational Biology 1, 337–348.

Wang Y, Li K-B (2004) An adaptive and iterative algorithm for refining multiple sequence alignment. Computational Biology and Chemistry 28, 141–148. doi:10.1016/j.compbiolchem.2004.02.001

Wareham HT (1995) A simplified proof of the NP- and MAX SNP-hardness of multiple sequence tree alignment. Journal of Computational Biology 2, 509–514.

Waterman MS (1995) ‘Introduction to computational biology: maps, sequences and genomes.’ (Chapman & Hall: London)

Wegner K, Jansen S, Wuchty S, Gauges R, Kummer U (2004) CombAlign: a protein sequence comparison algorithm considering recombinations. In Silico Biology 4, 0021.

Wegnez M (1987) Letter to the editor. Cell 51, 516. doi:10.1016/0092-8674(87)90118-8

Wernersson R, Pedersen AG (2003) RevTrans: multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Research 31, 3537–3539. doi:10.1093/nar/gkg609

Westbrook J, Feng Z, Chen L, Yang H, Berman HM (2003) The Protein Data Bank and structural genomics. Nucleic Acids Research 31, 489–491. doi:10.1093/nar/gkg068

Wexler Y, Yakhini Z, Kashi Y, Geiger D (2005) Finding approximate tandem repeats in genomic sequences. Journal of Computational Biology 12, 928–942. doi:10.1089/cmb.2005.12.928

Wheeler WC (1993) The triangle inequality and character analysis. Molecular Biology and Evolution 10, 707–712.

Wheeler WC (1994) Sources of ambiguity in nucleic acid sequence alignment. In ‘Molecular ecology and evolution: approaches and applications’. (Eds B Schierwater, B Streit, GP Wagner, R DeSalle) pp. 323–352. (Birkhäuser Verlag: Basel)

Wheeler WC (1995) Sequence alignment, parameter sensitivity, and phylogenetic analysis of molecular data. Systematic Biology 44, 321–331. doi:10.2307/2413595

Wheeler W (1996) Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics 12, 1–9. doi:10.1111/j.1096-0031.1996.tb00189.x

Wheeler W (1998) Alignment characters, dynamic programming and heuristic solutions. In ‘Molecular approaches to ecology and evolution’. (Eds R DeSalle, B Schierwater) pp. 243–251. (Birkhäuser Verlag: Basel)

Wheeler WC (1999) Fixed character states and the optimization of molecular sequence data. Cladistics 15, 379–385. doi:10.1111/j.1096-0031.1999.tb00274.x

Wheeler W (2001 a) Homology and DNA sequence data. In ‘The character concept in evolutionary biology’. (Ed. GP Wagner) pp. 303–317. (Academic Press: San Diego)

Wheeler W (2001 b) Homology and the optimization of DNA sequence data. Cladistics 17, S3–S11. doi:10.1111/j.1096-0031.2001.tb00100.x

Wheeler WC (2002) Optimization alignment: down, up, error, and improvements. In ‘Techniques in molecular systematics and evolution’. (Eds R DeSalle, G Giribet, W Wheeler) pp. 55–69. (Birkhäuser Verlag: Basel)

Wheeler WC (2003 a) Iterative pass optimization of sequence data. Cladistics 19, 254–260. doi:10.1111/j.1096-0031.2003.tb00368.x

Wheeler WC (2003 b) Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search. Cladistics 19, 261–268. doi:10.1111/j.1096-0031.2003.tb00369.x

Wheeler WC (2003 c) Search-based optimization. Cladistics 19, 348–355. doi:10.1111/j.1096-0031.2003.tb00378.x

Wheeler WC (2005) Alignment, dynamic homology, and optimization. In ‘Parsimony, phylogeny, and genomics’. (Ed. VA Albert) pp. 71–80. (Oxford University Press: Oxford)

Wheeler WC (2006) Dynamic homology and the likelihood criterion. Cladistics 22, 157–170. doi:10.1111/j.1096-0031.2006.00096.x

Wheeler WC, Gladstein DS (1994) MALIGN: a multiple sequence alignment program. Journal of Heredity 85, 417–418.

Whelan S, de Bakker PIW, Quevillon E, Rodriguez N, Goldman N (2006) PANDIT: an evolution-centric database of protein and associated nucleotide domains with inferred trees. Nucleic Acids Research 34, D327–D331. doi:10.1093/nar/gkj087

Whiting AS, Sites JW, Pellegrino KCM, Rodrigues MT (2006) Comparing alignment methods for inferring the history of the new world lizard genus Mabuya (Squamata: Scincidae). Molecular Phylogenetics and Evolution 38, 719–730. doi:10.1016/j.ympev.2005.11.011

Williams DM (1993) A note on molecular homology: multiple patterns from single datasets. Cladistics 9, 233–245. doi:10.1111/j.1096-0031.1993.tb00221.x

Winnepenninckx B, Backeljau T (1996) 18S rRNA alignments derived from different secondary structure models can produce alternative phylogenies. Journal of Zoological Systematics and Evolutionary Research 34, 135–143.

Winter WP, Walsh KA, Neurath H (1968) Homology as applied to proteins. Science 162, 1433. doi:10.1126/science.162.3861.1433

Wrabl JO, Grishin NV (2004) Gaps in structurally similar proteins: towards improvement of multiple sequence alignment. Proteins: Structure, Function, and Bioinformatics 54, 71–87. doi:10.1002/prot.10508

Wuyts J, Perrière G, Van de Peer Y (2004) The European ribosomal RNA database. Nucleic Acids Research 32, D101–D103. doi:10.1093/nar/gkh065

Xiao L, Sulaiman IM, Ryan UM, Zhou L, Atwill ER, Tischler ML, Zhang X, Fayer R, Lal AA (2002) Host adaptation and host-parasite co-evolution in Cryptosporidium: implications for taxonomy and public health. International Journal for Parasitology 32, 1773–1785. doi:10.1016/S0020-7519(02)00197-2

Yamada S, Gotoh O, Yamana H (2004) Extension of Prrn: implementation of a doubly nested randomized iterative refinement strategy under a piecewise linear gap cost. Genome Informatics 15, P082.

Yu H, Deng M (2005) ClustalY: speed up the guide tree building for ClustalW. In ‘Proceedings of the eighth international conference on high-performance computing in Asia-Pacific region (HPCASIA’05)’. pp. 608–610. (IEEE Press: Piscataway)

Yuan J, Amend A, Borkowski J, DeMarco R, Bailey W, Liu Y, Xie G, Blevins R (1999) MULTICLUSTAL: a systematic method for surveying ClustalW alignment parameters. Bioinformatics 15, 862–863. doi:10.1093/bioinformatics/15.10.862

Zhang X, Kahveci T (2006) A new approach for alignment of multiple proteins. In ‘Proceedings of the 11th Pacific Symposium on Biocomputing 2006, Hawaii’. pp. 339–350.

Zhou H, Zhou Y (2005) SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structure. Bioinformatics 21, 3615–3621. doi:10.1093/bioinformatics/bti582

Zhu J, Liu JS, Lawrence CE (1998) Bayesian adaptive sequence alignment algorithms. Bioinformatics 14, 25–39. doi:10.1093/bioinformatics/14.1.25

Zwieb C (1997) The uRNA database. Nucleic Acids Research 25, 102–103. doi:10.1093/nar/25.1.102

