Version 3.2

CLUSTALW
Multiple Sequence Alignment

Selected Sequence(s)
  • Caenorhabditis elegans cosmid T28A8, complete sequence_
  • S.cerevisiae chromosome XIII cosmid 8520_
  • Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_
  • Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
  • Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
  • Human DNA mismatch repair (hmlh1) mRNA, complete cds_


    Fasta label (*)     Workbench label
    GENPEPT:3880333     Caenorhabditis elegans cosmid T28A8, complete sequence_
    GENPEPT:825572      S.cerevisiae chromosome XIII cosmid 8520_
    GENPEPT:3192877     Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_
    GENPEPT:1724118     Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete
    GENPEPT:7595954     Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
    GENPEPT:466462      Human DNA mismatch repair (hmlh1) mRNA, complete cds_

    (*) Clustal W truncates Fasta labels at the first space (e.g. ">abc def" becomes ">abc").
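The truncation rule in the note above can be sketched in a few lines of Python. The function name `truncate_fasta_label` is hypothetical and only illustrates the rule; it is not part of Clustal W or the Workbench. The alignment below also shows the labels with an underscore where the table has a colon; this sketch does not model that.

```python
def truncate_fasta_label(header: str) -> str:
    """Keep only the part of a Fasta header line before the first space."""
    # split(" ", 1) cuts at the first space only; a header without spaces is unchanged.
    return header.split(" ", 1)[0]


print(truncate_fasta_label(">abc def"))  # >abc
```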


    Sequence alignment

    Consensus key (see documentation for details)
    * - single, fully conserved residue
    : - conservation of strong groups
    . - conservation of weak groups
      - no consensus
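As a rough illustration of the key above, the following Python sketch assigns a consensus symbol to one alignment column. The strong and weak group lists are an assumption, written down as they are commonly given in the Clustal W documentation, and should be checked against the documentation for the version in use. Treating any column that contains a gap as "no consensus" is also an assumption of this sketch.

```python
# Assumed residue groups (as commonly listed in the Clustal W documentation).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one column of aligned residues."""
    residues = set(column.upper())
    if not residues or "-" in residues:
        return " "  # assumption: empty or gapped columns get no symbol
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # every residue falls inside one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # every residue falls inside one weak group
    return " "


# The last four columns of the first alignment block (L, D, A, then K/K/K/Q/N/G).
for col in ["LLLLIL", "DDDDDD", "AAAAAA", "KKKQNG"]:
    print(repr(consensus_symbol(col)))
```

For these four columns the sketch yields `:`, `*`, `*` and a blank, which agrees with the end of the first consensus line in the alignment.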
    
    
    CLUSTAL W (1.81) multiple sequence alignment
    
    
    GENPEPT_7595954      -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
    GENPEPT_1724118      -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
    GENPEPT_466462       -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
    GENPEPT_3192877      ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
    GENPEPT_825572       --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
    GENPEPT_3880333      MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
                                                 *: *   ***::****::  * **:**: **.:** 
    
    GENPEPT_7595954      STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
    GENPEPT_1724118      STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
    GENPEPT_466462       STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
    GENPEPT_3192877      STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
    GENPEPT_825572       ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
    GENPEPT_3880333      ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
                         :* * : :: **:*::*: ***.**.: *: ::****:****  ****  : *:******
    
    GENPEPT_7595954      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
    GENPEPT_1724118      LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
    GENPEPT_466462       LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
    GENPEPT_3192877      LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
    GENPEPT_825572       LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
    GENPEPT_3880333      LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
                         ***:****::.: :*  . :*.::..: :**:   .** **::** *   ***:*:  * 
    
    GENPEPT_7595954      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
    GENPEPT_1724118      KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
    GENPEPT_466462       KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSVFGNA
    GENPEPT_3192877      QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
    GENPEPT_825572       RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
    GENPEPT_3880333      NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
                         . : .  :*   : :.: *:::*   :.*: ::  :    . .    .  : :  : .  
    
    GENPEPT_7595954      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
    GENPEPT_1724118      VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
    GENPEPT_466462       VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESTS
    GENPEPT_3192877      ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQM----------LLFINQRLVESTA
    GENPEPT_825572       VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSISP---------IFFINNRLVTCDL
    GENPEPT_3880333      VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
                         ::  :: .   .     :      : :     . .             .*** * * .  
    
    GENPEPT_7595954      LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
    GENPEPT_1724118      LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
    GENPEPT_466462       LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
    GENPEPT_3192877      LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
    GENPEPT_825572       LRRALNSVYSNYLPKGNRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
    GENPEPT_3880333      LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
                         *: .:: : .    :    *  : : :    :********..* ** ::.*:: :   ..
    
    GENPEPT_7595954      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
    GENPEPT_1724118      SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
    GENPEPT_466462       SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
    GENPEPT_3192877      ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
    GENPEPT_825572       AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
    GENPEPT_3880333      KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
                           :      .    .      .                    . .              :
    
    GENPEPT_7595954      TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
    GENPEPT_1724118      TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
    GENPEPT_466462       TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
    GENPEPT_3192877      TDSTEQKLDKFLAPLVK-------------------------------------------
    GENPEPT_825572       AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
    GENPEPT_3880333      SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
                         :.    *                                                     
    
    GENPEPT_7595954      SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
    GENPEPT_1724118      SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
    GENPEPT_466462       NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
    GENPEPT_3192877      ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
    GENPEPT_825572       NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
    GENPEPT_3880333      STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
                                                       :  .                         .
    
    GENPEPT_7595954      NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
    GENPEPT_1724118      NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
    GENPEPT_466462       NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
    GENPEPT_3192877      RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
    GENPEPT_825572       NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
    GENPEPT_3880333      HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
                         .: *:  :::.:       * . : .  :** :: .   .  *.   *:  :  ..  *:
    
    GENPEPT_7595954      FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
    GENPEPT_1724118      FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
    GENPEPT_466462       FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
    GENPEPT_3192877      FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-----DGAADI
    GENPEPT_825572       FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
    GENPEPT_3880333      FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
                         ***  : .* * .   :        : :  * .       :                 . 
    
    GENPEPT_7595954      LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
    GENPEPT_1724118      LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
    GENPEPT_466462       LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
    GENPEPT_3192877      LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
    GENPEPT_825572       IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
    GENPEPT_3880333      LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
                         : . : :: :*:.:.: :           :  :* *:. . * :  **. :  *  :*::
    
    GENPEPT_7595954      DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
    GENPEPT_1724118      DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
    GENPEPT_466462       DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
    GENPEPT_3192877      EQETRCFETFCRETARFY--------------AQLDWREGATAVFSRWT--MEHVLFPAF
    GENPEPT_825572       EDEQECLDGILREIALLYIPDMVPKVDTSDASLSEDEKAQFINRKEHISSLLEHVLFPCI
    GENPEPT_3880333      DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
                         ::* . :  : :  . ::                              .  ::.::   .
    
    GENPEPT_7595954      RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
    GENPEPT_1724118      RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
    GENPEPT_466462       RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
    GENPEPT_3192877      KKYLLPPR---IKDQIYELTNLPTLYKVFERC--
    GENPEPT_825572       KRRFLAPRHILKD--VVEIANLPDLYKVFERC--
    GENPEPT_3880333      KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
                         :  ::.*.       : ::::   * ******  
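To work with the alignment programmatically, the interleaved blocks have to be joined back into one string per sequence. The Python sketch below is a minimal reader, assuming the input contains only the alignment body (header, sequence rows and consensus rows) and that each sequence row is exactly a label followed by residues; `parse_clustal` is a hypothetical helper name, not part of Clustal W.

```python
def parse_clustal(text):
    """Join interleaved Clustal alignment blocks into {label: aligned sequence}."""
    chunks = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) != 2:
            continue  # blank lines, the CLUSTAL header, most consensus rows
        label, residues = tokens
        if not all(ch.isalpha() or ch == "-" for ch in residues):
            continue  # consensus rows contain only '*', ':' and '.'
        chunks.setdefault(label, []).append(residues)
    return {label: "".join(parts) for label, parts in chunks.items()}


# Two rows from each of the last two blocks of the alignment above.
sample = """
GENPEPT_7595954      DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
GENPEPT_3880333      DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV

GENPEPT_7595954      RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_3880333      KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
"""

alignment = parse_clustal(sample)
for label, seq in alignment.items():
    print(label, len(seq), seq[-10:])
```

Removing the `-` characters from a joined string gives back the unaligned sequence for that label.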
    
    
    

    Clustal W dendrogram



    Unrooted tree (generated by Phylip's Drawtree)




    Phylip-format dendrogram

    (
    GENPEPT_466462:0.05763,
    (
    GENPEPT_7595954:0.03521,
    GENPEPT_1724118:0.04802)
    :0.02510,
    (
    GENPEPT_3192877:0.24567,
    (
    GENPEPT_825572:0.33043,
    GENPEPT_3880333:0.37568)
    :0.04833)
    :0.19089);
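The dendrogram above is in Newick (parenthesised) notation, where each `name:length` pair is a leaf and each `(...):length` is an internal branch. The Python sketch below is a minimal recursive reader for trees of this simple shape (no quoted labels, no comments); `parse_newick` and `leaf_depths` are hypothetical helper names. It reports the summed branch length from the central node to each leaf.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    s = "".join(text.split()).rstrip(";")  # drop whitespace and the final ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1  # skip '('
            children.append(node())
            while s[pos] == ",":
                pos += 1  # skip ','
                children.append(node())
            pos += 1  # skip ')'
        start = pos
        while pos < len(s) and s[pos] not in ":,()":
            pos += 1
        name = s[start:pos]
        length = 0.0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return (name, length, children)

    return node()


def leaf_depths(tree, depth=0.0):
    """Map each leaf name to the summed branch length from the top node."""
    name, length, children = tree
    depth += length
    if not children:
        return {name: depth}
    depths = {}
    for child in children:
        depths.update(leaf_depths(child, depth))
    return depths


newick = """(GENPEPT_466462:0.05763,(GENPEPT_7595954:0.03521,GENPEPT_1724118:0.04802)
:0.02510,(GENPEPT_3192877:0.24567,(GENPEPT_825572:0.33043,GENPEPT_3880333:0.37568)
:0.04833):0.19089);"""

depths = leaf_depths(parse_newick(newick))
for name, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(f"{name:18s}{depth:.5f}")
```

Because the tree is unrooted, these depths are measured from the three-way node at the top level of the string, not from a biological root.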
    
    

    Clustal W options and diagnostic messages

    Alignment type: Protein                 Alignment order: aligned                
    
                        Pairwise alignment parameters
    
    Method: accurate                        
    Matrix: Gonnet                          
    Gap open penalty: 10.00                 Gap extension penalty: 0.10             
    
                        Multiple alignment parameters
    
    Matrix: Gonnet                          Negative matrix?: no                    
    Gap open penalty: 10.00                 Gap extension penalty: 0.20             
    % identity for delay: 30                Residue-specific gap penalties: on      
    Penalize end gaps: on                   Hydrophilic gap penalties: on           
    Gap separation distance: 0              Hydrophilic residues: GPSNDQEKR         
    
    
    
    
     CLUSTAL W (1.81) Multiple Sequence Alignments
    
    
    
    Sequence type explicitly set to Protein
    Sequence format is Pearson
    Sequence 1: GENPEPT_466462       756 aa
    Sequence 2: GENPEPT_7595954      760 aa
    Sequence 3: GENPEPT_1724118      757 aa
    Sequence 4: GENPEPT_3192877      663 aa
    Sequence 5: GENPEPT_825572       769 aa
    Sequence 6: GENPEPT_3880333      779 aa
    Start of Pairwise alignments
    Aligning...
    Sequences (1:2) Aligned. Score:  88
    Sequences (1:3) Aligned. Score:  86
    Sequences (1:4) Aligned. Score:  51
    Sequences (1:5) Aligned. Score:  36
    Sequences (1:6) Aligned. Score:  32
    Sequences (2:3) Aligned. Score:  91
    Sequences (2:4) Aligned. Score:  50
    Sequences (2:5) Aligned. Score:  36
    Sequences (2:6) Aligned. Score:  32
    Sequences (3:4) Aligned. Score:  48
    Sequences (3:5) Aligned. Score:  36
    Sequences (3:6) Aligned. Score:  32
    Sequences (4:5) Aligned. Score:  38
    Sequences (4:6) Aligned. Score:  32
    Sequences (5:6) Aligned. Score:  29
    Time for pairwise alignment: 2.274260
    
    Guide tree        file created:   [../tmp-dir/20758.CLUSTALW.dnd]
    Start of Multiple Alignment
    There are 5 groups
    Aligning...
    Group 1: Sequences:   2      Score:15725
    Group 2: Sequences:   3      Score:15397
    Group 3: Sequences:   4      Score:10921
    Group 4: Sequences:   5      Score:10155
    Group 5: Sequences:   6      Score:7685
    Time for multiple alignment: 5.505672
    
    Alignment Score 30082
    CLUSTAL-Alignment file created  [../tmp-dir/20758.CLUSTALW.aln]
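The pairwise scores in the log above can be pulled out with a regular expression. The Python sketch below does this for a few of the logged lines. Reading each score as a percent identity and converting it to a distance of 1 - score/100 is an assumption about how Clustal W uses these numbers; the log itself does not state it. `pairwise_scores` and `score_to_distance` are hypothetical helper names.

```python
import re

# Matches lines such as "Sequences (1:2) Aligned. Score:  88".
PAIR_RE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def pairwise_scores(log_text):
    """Return {(i, j): score} for every pairwise line found in the log text."""
    return {(int(i), int(j)): int(score) for i, j, score in PAIR_RE.findall(log_text)}


def score_to_distance(score):
    """Assumed conversion: treat the score as percent identity."""
    return 1.0 - score / 100.0


log_excerpt = """
Sequences (1:2) Aligned. Score:  88
Sequences (1:3) Aligned. Score:  86
Sequences (1:6) Aligned. Score:  32
Sequences (2:3) Aligned. Score:  91
Sequences (5:6) Aligned. Score:  29
"""

scores = pairwise_scores(log_excerpt)
closest = max(scores, key=scores.get)
print("closest pair:", closest, "score:", scores[closest])
```

In this excerpt the highest-scoring pair is sequences 2 and 3 (GENPEPT_7595954 and GENPEPT_1724118), which are also joined first in the dendrogram.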
    
    

    Citation

      Algorithm Citation:

      Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved software for multiple sequence alignment. Computer Applications in the Biosciences (CABIOS), 8(2):189-191.

      Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680.

      Felsenstein, J. (1989) PHYLIP -- Phylogeny Inference Package (Version 3.2). Cladistics, 5:164-166.

      Program Citation:

      CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson, modified; any errors are due to the modifications.

      PHYLIP: Felsenstein, J. (1993) PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle.


    Copyright (C) 1999, Board of Trustees of the University of Illinois.