Human DNA mismatch repair (hmlh1) mRNA, complete cds_
Fasta label (*) | Workbench label |
---|---|
GENPEPT:3880333 | Caenorhabditis elegans cosmid T28A8, complete sequence_ |
GENPEPT:825572 | S.cerevisiae chromosome XIII cosmid 8520_ |
GENPEPT:3192877 | Drosophila melanogaster mutL homolog (Mlh1) gene, complete cds_ |
GENPEPT:1724118 | Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete |
GENPEPT:7595954 | Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds. |
GENPEPT:466462 | Human DNA mismatch repair (hmlh1) mRNA, complete cds_ |
(*) Clustal W cuts off Fasta labels after the first space (e.g. ">abc def" becomes ">abc").
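The truncation described in the footnote can be imitated as follows. This is a minimal sketch, not Clustal W's own code; the function name `clustal_label` is hypothetical and it only models the "cut at the first space" behaviour stated above.

```python
def clustal_label(header):
    """Return a FASTA header line truncated at the first space.

    Hypothetical helper: mimics only the truncation described in the
    footnote, nothing else Clustal W may do to labels.
    """
    # Drop the leading ">" if present, then keep everything before the first space.
    text = header[1:] if header.startswith(">") else header
    return ">" + text.split(" ", 1)[0]


print(clustal_label(">abc def"))
```

Applied to a header such as ">GENPEPT:466462 Human DNA mismatch repair", only the identifier before the first space would survive under this assumption.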
Sequence alignment
Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
(blank) - no consensus
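The key above can be turned into a small per-column classifier. This is a sketch under assumptions: the strong and weak residue groups below are reproduced from the Clustal W documentation as best recalled and should be checked against it, and treating any column that contains a gap as "no consensus" is likewise an assumption. The name `consensus_symbol` is hypothetical.

```python
# Residue groups as listed in the Clustal W documentation (assumed; verify before relying on them).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Classify one alignment column (one character per sequence)."""
    residues = set(column.upper())
    # Assumption: an empty column or a column with a gap gets no symbol.
    if not residues or "-" in residues:
        return " "
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall inside one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall inside one weak group
    return " "


for col in ("VVVVVV", "RRRRKR", "CS", "A-AAAA"):
    print(repr(col), "->", repr(consensus_symbol(col)))
```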
CLUSTAL W (1.81) multiple sequence alignment
GENPEPT_7595954 -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_1724118 -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
GENPEPT_466462 -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_3192877 ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
GENPEPT_825572 --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
GENPEPT_3880333 MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
*: * ***::****:: * **:**: **.:**
GENPEPT_7595954 STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
GENPEPT_1724118 STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
GENPEPT_466462 STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
GENPEPT_3192877 STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
GENPEPT_825572 ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
GENPEPT_3880333 ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
:* * : :: **:*::*: ***.**.: *: ::****:**** **** : *:******
GENPEPT_7595954 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
GENPEPT_1724118 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
GENPEPT_466462 LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
GENPEPT_3192877 LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
GENPEPT_825572 LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
GENPEPT_3880333 LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
***:****::.: :* . :*.::..: :**: .** **::** * ***:*: *
GENPEPT_7595954 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_1724118 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_466462 KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSVFGNA
GENPEPT_3192877 QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
GENPEPT_825572 RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
GENPEPT_3880333 NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
. : . :* : :.: *:::* :.*: :: : . . . : : : .
GENPEPT_7595954 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_1724118 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESAA
GENPEPT_466462 VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCIF----------LLFINHRLVESTS
GENPEPT_3192877 ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQM----------LLFINQRLVESTA
GENPEPT_825572 VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSISP---------IFFINNRLVTCDL
GENPEPT_3880333 VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
:: :: . . : : : . . .*** * * .
GENPEPT_7595954 LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
GENPEPT_1724118 LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_466462 LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_3192877 LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
GENPEPT_825572 LRRALNSVYSNYLPKGNRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
GENPEPT_3880333 LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
*: .:: : . : * : : : :********..* ** ::.*:: : ..
GENPEPT_7595954 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
GENPEPT_1724118 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
GENPEPT_466462 SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
GENPEPT_3192877 ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
GENPEPT_825572 AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
GENPEPT_3880333 KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
: . . . . . :
GENPEPT_7595954 TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
GENPEPT_1724118 TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
GENPEPT_466462 TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
GENPEPT_3192877 TDSTEQKLDKFLAPLVK-------------------------------------------
GENPEPT_825572 AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
GENPEPT_3880333 SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
:. *
GENPEPT_7595954 SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
GENPEPT_1724118 SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
GENPEPT_466462 NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
GENPEPT_3192877 ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
GENPEPT_825572 NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
GENPEPT_3880333 STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
: . .
GENPEPT_7595954 NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_1724118 NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_466462 NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_3192877 RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_825572 NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
GENPEPT_3880333 HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
.: *: :::.: * . : . :** :: . . *. *: : .. *:
GENPEPT_7595954 FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
GENPEPT_1724118 FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_466462 FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_3192877 FYQRMIYEFQNCSEITICPPLPLKELLILSLESRAAGWTPEDEDKAELA-----DGAADI
GENPEPT_825572 FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
GENPEPT_3880333 FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
*** : .* * . : : : * . : .
GENPEPT_7595954 LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_1724118 LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_466462 LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
GENPEPT_3192877 LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
GENPEPT_825572 IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
GENPEPT_3880333 LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
: . : :: :*:.:.: : : :* *:. . * : **. : * :*::
GENPEPT_7595954 DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
GENPEPT_1724118 DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
GENPEPT_466462 DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
GENPEPT_3192877 EQETRCFETFCRETARFY--------------AQLDWREGATAVFSRWT--MEHVLFPAF
GENPEPT_825572 EDEQECLDGILREIALLYIPDMVPKVDTSDASLSEDEKAQFINRKEHISSLLEHVLFPCI
GENPEPT_3880333 DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
::* . : : : . :: . ::.:: .
GENPEPT_7595954 RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_1724118 RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
GENPEPT_466462 RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
GENPEPT_3192877 KKYLLPPR---IKDQIYELTNLPTLYKVFERC--
GENPEPT_825572 KRRFLAPRHILKD--VVEIANLPDLYKVFERC--
GENPEPT_3880333 KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
: ::.*. : :::: * ******
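An interleaved alignment like the one above can be stitched back into one gapped string per sequence. The following is a minimal sketch, assuming each sequence line has exactly two whitespace-separated fields (label, then residues and gaps) and that parsing starts after a header line beginning with "CLUSTAL"; the function name `parse_clustal` is hypothetical.

```python
def parse_clustal(text):
    """Concatenate the blocks of an interleaved Clustal alignment per label."""
    sequences = {}
    in_alignment = False
    for line in text.splitlines():
        if line.startswith("CLUSTAL"):
            in_alignment = True  # everything after the header is alignment text
            continue
        if not in_alignment:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # blank lines and most consensus lines
        name, chunk = fields
        # Consensus lines contain '*', ':' and '.', so they fail this check.
        if not all(ch.isalpha() or ch == "-" for ch in chunk):
            continue
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


def ungapped_length(gapped):
    """Number of residues once gap characters are removed."""
    return len(gapped.replace("-", ""))
```

Calling `ungapped_length` on each parsed sequence gives a quick sanity check against the sequence lengths that Clustal W reports in its diagnostic messages.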
Clustal W dendrogram
Unrooted tree (generated by Phylip's Drawtree)
Phylip-format dendrogram
(
GENPEPT_466462:0.05763,
(
GENPEPT_7595954:0.03521,
GENPEPT_1724118:0.04802)
:0.02510,
(
GENPEPT_3192877:0.24567,
(
GENPEPT_825572:0.33043,
GENPEPT_3880333:0.37568)
:0.04833)
:0.19089);
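The dendrogram above is in Newick notation: nested parentheses with `name:branch_length` entries. The sketch below parses it and sums branch lengths along the path between two leaves. It is a small illustrative parser that assumes well-formed input without quoted names or comments; `parse_newick`, `leaf_paths` and `tree_distance` are hypothetical names, not part of PHYLIP or Clustal W.

```python
TREE = """(
GENPEPT_466462:0.05763,
(GENPEPT_7595954:0.03521,GENPEPT_1724118:0.04802):0.02510,
(GENPEPT_3192877:0.24567,
(GENPEPT_825572:0.33043,GENPEPT_3880333:0.37568):0.04833):0.19089);"""


def parse_newick(text):
    """Parse a Newick string into nested (name, length, children) tuples."""
    s = "".join(text.split())  # remove spaces and line breaks
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # consume ")"
        start = pos
        while pos < len(s) and s[pos] not in ":,();":
            pos += 1
        name = s[start:pos]
        length = 0.0  # nodes without an explicit length (the root) get 0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",();":
                pos += 1
            length = float(s[start:pos])
        return (name, length, children)

    return node()


def leaf_paths(node, path=()):
    """Map each leaf name to its root-to-leaf path of (node id, branch length)."""
    name, length, children = node
    here = path + ((id(node), length),)
    if not children:
        return {name: here}
    paths = {}
    for child in children:
        paths.update(leaf_paths(child, here))
    return paths


def tree_distance(tree, a, b):
    """Sum of branch lengths on the path between leaves a and b."""
    paths = leaf_paths(tree)
    pa, pb = paths[a], paths[b]
    shared = 0
    while shared < min(len(pa), len(pb)) and pa[shared][0] == pb[shared][0]:
        shared += 1  # skip the part of the path both leaves have in common
    return sum(l for _, l in pa[shared:]) + sum(l for _, l in pb[shared:])


tree = parse_newick(TREE)
print(tree_distance(tree, "GENPEPT_7595954", "GENPEPT_1724118"))
```

For the mouse and rat sequences the path consists of just their two terminal branches (0.03521 and 0.04802), which is the shortest leaf-to-leaf path in this tree.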
Clustal W options and diagnostic messages
Alignment type: Protein
Alignment order: aligned
Pairwise alignment parameters
Method: accurate
Matrix: Gonnet
Gap open penalty: 10.00
Gap extension penalty: 0.10
Multiple alignment parameters
Matrix: Gonnet
Negative matrix?: no
Gap open penalty: 10.00
Gap extension penalty: 0.20
% identity for delay: 30
Residue-specific gap penalties: on
Penalize end gaps: on
Hydrophilic gap penalties: on
Gap separation distance: 0
Hydrophilic residues: GPSNDQEKR
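To give a feel for what the gap open and gap extension values mean, here is a sketch of a plain affine gap cost. It assumes the convention cost = open + length x extension, which is only one of several conventions in use. It also ignores the position-specific adjustments (residue-specific and hydrophilic gap penalties, gap separation) that the options above switch on, so it does not reproduce Clustal W's actual scoring. The function name `affine_gap_cost` is hypothetical.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of one gap of `length` residues under a simple affine model.

    Assumed convention: the extension penalty is charged for every
    gapped position, including the first one.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


# Values taken from the parameter listing above.
for n in (1, 5, 10):
    pairwise = affine_gap_cost(n, 10.00, 0.10)
    multiple = affine_gap_cost(n, 10.00, 0.20)
    print(n, pairwise, multiple)
```

With an open penalty this much larger than the extension penalty, one long gap is far cheaper than several short ones under this model.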
CLUSTAL W (1.81) Multiple Sequence Alignments
Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: GENPEPT_466462 756 aa
Sequence 2: GENPEPT_7595954 760 aa
Sequence 3: GENPEPT_1724118 757 aa
Sequence 4: GENPEPT_3192877 663 aa
Sequence 5: GENPEPT_825572 769 aa
Sequence 6: GENPEPT_3880333 779 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 88
Sequences (1:3) Aligned. Score: 86
Sequences (1:4) Aligned. Score: 51
Sequences (1:5) Aligned. Score: 36
Sequences (1:6) Aligned. Score: 32
Sequences (2:3) Aligned. Score: 91
Sequences (2:4) Aligned. Score: 50
Sequences (2:5) Aligned. Score: 36
Sequences (2:6) Aligned. Score: 32
Sequences (3:4) Aligned. Score: 48
Sequences (3:5) Aligned. Score: 36
Sequences (3:6) Aligned. Score: 32
Sequences (4:5) Aligned. Score: 38
Sequences (4:6) Aligned. Score: 32
Sequences (5:6) Aligned. Score: 29
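The pairwise score lines above can be collected into a lookup table. This sketch assumes the log format shown here. It also assumes that the reported score is a percent identity, so that 1 - score/100 can serve as a rough distance; that interpretation comes from the Clustal W documentation as best recalled, not from this log. The names `parse_pairwise_scores`, `closest_pair` and `score_to_distance` are hypothetical.

```python
import re

# Matches lines such as "Sequences (1:2) Aligned. Score: 88".
SCORE_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def parse_pairwise_scores(log_text):
    """Return {(i, j): score} for every pairwise score line in the log."""
    return {
        (int(i), int(j)): int(score)
        for i, j, score in SCORE_LINE.findall(log_text)
    }


def closest_pair(scores):
    """Pair of sequence numbers with the highest pairwise score."""
    return max(scores, key=scores.get)


def score_to_distance(score):
    """Assumed conversion from a percent-identity score to a distance."""
    return 1.0 - score / 100.0
```

For the log above, the highest score is 91 for sequences 2 and 3 (GENPEPT_7595954 and GENPEPT_1724118), the same two sequences that are joined first in the dendrogram.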
Time for pairwise alignment: 2.274260
Guide tree file created: [../tmp-dir/20758.CLUSTALW.dnd]
Start of Multiple Alignment
There are 5 groups
Aligning...
Group 1: Sequences: 2 Score:15725
Group 2: Sequences: 3 Score:15397
Group 3: Sequences: 4 Score:10921
Group 4: Sequences: 5 Score:10155
Group 5: Sequences: 6 Score:7685
Time for multiple alignment: 5.505672
Alignment Score 30082
CLUSTAL-Alignment file created [../tmp-dir/20758.CLUSTALW.aln]
Citation
Algorithm Citation:
Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved
software for multiple sequence alignment. Computer Applications in the
Biosciences (CABIOS), 8(2):189-191.
Thompson J.D., Higgins D.G., Gibson T.J. "CLUSTAL W: improving the
sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice."
Nucleic Acids Res. 22:4673-4680(1994).
Felsenstein, J. 1989. PHYLIP -- Phylogeny Inference Package
(Version 3.2). Cladistics 5: 164-166.
Program Citation:
CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson,
modified; any errors are due to the modifications.
PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package)
version 3.5c. Distributed by the author. Department of Genetics,
University of Washington, Seattle.
Copyright (C) 1999, Board of Trustees of the University of Illinois.