Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds.
Fasta label (*) | Workbench label |
---|---|
MLH1_HUMAN | DNA MISMATCH REPAIR PROTEIN MLH1 (MUTL PROTEIN HOMOLOG 1) [Homo sapiens (Human)] |
GENPEPT:3880333 | Caenorhabditis elegans cosmid T28A8, complete sequence |
GENPEPT:460627 | Saccharomyces cerevisiae DNA mismatch repair (MLH1) gene, complete |
GENPEPT:7304079 | Drosophila melanogaster genomic scaffold 142000013386047 section 5 |
GENPEPT:1724118 | Rattus norvegicus mismatch repair protein (MLH1) mRNA, complete |
GENPEPT:7595954 | Mus musculus MutL homolog 1 protein (MLH1) mRNA, complete cds. |
(*) Clustal W truncates Fasta labels at the first space (e.g. ">abc def" becomes ">abc").
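The truncation rule in the note above can be illustrated with a short sketch. The function name `clustal_label` is hypothetical and is not part of Clustal W or the Workbench; the code only mimics the stated behaviour of keeping everything before the first whitespace.

```python
def clustal_label(fasta_header: str) -> str:
    """Return the part of a FASTA header that survives truncation
    at the first whitespace character (hypothetical helper)."""
    text = fasta_header.lstrip(">").strip()
    # split() with no arguments splits on any run of whitespace
    return text.split()[0] if text else ""


print(clustal_label(">abc def"))  # abc
```

Labels that differ only after the first space would collide under this rule, so it may be worth checking that the truncated labels are still unique before submitting sequences.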
Sequence alignment
Consensus key (see documentation for details)
* - single, fully conserved residue
: - conservation of strong groups
. - conservation of weak groups
(space) - no consensus
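As a rough illustration of how the consensus key is applied, the sketch below assigns a symbol to each alignment column. The strong and weak residue groups are reproduced from memory of the Clustal W documentation and should be checked against it; the function names and the toy input are assumptions made for this example only.

```python
# Residue groups as recalled from the Clustal W documentation (verify before relying on them).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # any gap in the column means no consensus
    if len(residues) == 1:
        return "*"  # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall inside one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall inside one weak group
    return " "


def consensus_line(rows: list[str]) -> str:
    """Build the consensus line for equal-length aligned rows."""
    return "".join(consensus_symbol("".join(col)) for col in zip(*rows))


# Toy three-sequence alignment, not taken from the MLH1 data.
print(repr(consensus_line(["ASCW", "ATSW", "AAAK"])))  # '*:. '
```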
CLUSTAL W (1.81) multiple sequence alignment
GENPEPT_7595954 -----------------MAFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_1724118 -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMTENCLDAK
MLH1_HUMAN -----------------MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAK
GENPEPT_7304079 ---------------MAEYLQPGVIRKLDEVVVNRIAAGEIIQRPANALKELLENSLDAQ
GENPEPT_460627 --------------------MSLRIKALDASVVNKIAAGEIIISPVNALKEMMENSIDAN
GENPEPT_3880333 MWHCGYRTRNCDEFSKIEFSLMGLIQRLPQDVVNRMAAGEVLARPCNAIKELVENSLDAG
*: * ***::****:: * **:**: **.:**
GENPEPT_7595954 STNIQVVVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLASISTYGFRGEA
GENPEPT_1724118 STNIQVIVREGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQTFEDLAMISTYGFRGEA
MLH1_HUMAN STSIQVIVKEGGLKLIQIQDNGTGIRKEDLDIVCERFTTSKLQSFEDLASISTYGFRGEA
GENPEPT_7304079 STHIQVQVKAGGLKLLQIQDNGTGIRREDLAIVCERFTTSKLTRFEDLSQIATFGFRGEA
GENPEPT_460627 ATMIDILVKEGGIKVLQITDNGSGINKADLPILCERFTTSKLQKFEDLSQIQTYGFRGEA
GENPEPT_3880333 ATEIMVNMQNGGLKLLQVSDNGKGIEREDFALVCERFATSKLQKFEDLMHMKTYGFRGEA
:* * : :: **:*::*: ***.**.: *: ::****:**** **** : *:******
GENPEPT_7595954 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRR
GENPEPT_1724118 LASISHVAHVTITTKTADGKCAYRASYSDGKLQAPPKPCAGNQGTLITVEDLFYNIITRK
MLH1_HUMAN LASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPKPCAGNQGTQITVEDLFYNIATRR
GENPEPT_7304079 LASISHVAHLSIQTKTAKEKCGYKATYADGKLQGQPKPCAGNQGTIICIEDLFYNMPQRR
GENPEPT_460627 LASISHVARVTVTTKVKEDRCAWRVSYAEGKMLESPKPVAGKDGTTILVEDLFFNIPSRL
GENPEPT_3880333 LASLSHVAKVNIVSKRADAKCAYQANFLDGKMTADTKPAAGKNGTCITATDLFYNLPTRR
***:****::.: :* . :*.::..: :**: .** **::** * ***:*: *
GENPEPT_7595954 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
GENPEPT_1724118 KALKNPSEEYGKILEVVGRYSIHNSGISFSVKKQGETVSDVRTLPNATTVDNIRSIFGNA
MLH1_HUMAN KALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNASTVDNIRSIFGNA
GENPEPT_7304079 QALRSPAEEFQRLSEVLARYAVHNPRVGFTLRKQGDAQPALRTPVASSRSENIRIIYGAA
GENPEPT_460627 RALRSHNDEYSKILDVVGRYAIHSKDIGFSCKKFGDSNYSLSVKPSYTVQDRIRTVFNKS
GENPEPT_3880333 NKMTTHGEEAKMVNDTLLRFAIHRPDVSFALRQ--NQAGDFRTKGDGNFRDVVCNLLGRD
. : . :* : :.: *:::* :.*: :: : . . . : : : .
GENPEPT_7595954 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
GENPEPT_1724118 VSRELIEVG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESAA
MLH1_HUMAN VSRELIEIG-CEDKTLAFK-MNGYISNANYSVKKCI----------FLLFINHRLVESTS
GENPEPT_7304079 ISKELLEFS-HRDEVYKFE-AECLITQVNYSAKKCQ----------MLLFINQRLVESTA
GENPEPT_460627 VASNLITFHISKVEDLNLESVDGKVCNLNFISKKSIS---------LIFFINNRLVTCDL
GENPEPT_3880333 VADTILPLS-LNSTRLKFT-FTGHISKPIASATAAIAQNRKTSRSFFSVFINGRSVRCDI
:: :: . . : : : . . : .*** * * .
GENPEPT_7595954 LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILQRVQQHIE
GENPEPT_1724118 LKKAIEAVYAAYLPKNTHPFLYLILEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
MLH1_HUMAN LRKAIETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIE
GENPEPT_7304079 LRTSVDSIYATYLPRGHHPFVYMSLTLPPQNLDVNVHPTKHEVHFLYQEEIVDSIKQQVE
GENPEPT_460627 LRRALNSVYSNYLPKGFRPFIYLGIVIDPAAVDVNVHPTKREVRFLSQDEIIEKIANQLH
GENPEPT_3880333 LKHPIDEVLG--ARQLHAQFCALHLQIDETRIDVNVHPTKNSVIFLEKEEIIEEIRAYFE
*: .:: : . : * : : : :********..* ** ::.*:: : ..
GENPEPT_7595954 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAARPTTGVASSSTSGSGDKVYAYQMVR
GENPEPT_1724118 SKLLGSNSSRMYFTQTLLPGLAG------PSGEAVKSTTGIASSSTSGSGDKVHAYQMVR
MLH1_HUMAN SKLLGSNSSRMYFTQTLLPGLAG------PSGEMVKSTTSLTSSSTSGSSDKVYAHQMVR
GENPEPT_7304079 ARLLGSNATRTFYKQLRLPGAP-----------------DLDETQLADKTQRIYPKEMVR
GENPEPT_460627 AELSAIDTSRTFKASSISTNKPESLIPFNDTIESDRNRKSLRQAQVVENSYTTANSQLRK
GENPEPT_3880333 KVIGEIFGFEALDVEKPEEEQPD--------IENLVMIPMSQSLKSIEAIRKPDTKPEFK
: . . . . . :
GENPEPT_7595954 TDSRDQKLDAFLQPVSSLVPSQPQDPAPVRGARTEGSPERATREDEEMLALPAPAEAAAE
GENPEPT_1724118 TDSRDQKLDAFMQPVSRRLPSQPQD--PVPGNRTEGSPEKAMQKDQEISELPAPMEAAAD
MLH1_HUMAN TDSREQKLDAFLQPLSKPLSSQPQ--AIVTEDKTDISSGRARQQDEEMLELPAPAEVAAK
GENPEPT_7304079 TDSTEQKLDKFLAPLVK-------------------------------------------
GENPEPT_460627 AKRQENKLVRIDASQAKITSFLSSS--QQFNFEGSSTKRQLSEPKVTNVSHSQEAEKLTL
GENPEPT_3880333 SSPSAWKSDKKRVDYMEVRTDAKERKIDEFVTRGGAVGPTTSNDDIFGGSGILKRARTED
:. *
GENPEPT_7595954 SENLERESLMETSDAAQKAAPTSSPGSSRKRHREDSDVEMVENASGKEMTAACYPRRRII
GENPEPT_1724118 SASLERESVIGASEVVAPQRHPSSPGSSRKRHPEDSDVEMMENDSRKEMTAACYPRRRII
MLH1_HUMAN NQSLEGDTTKGTSEMSEKRGPTSS--NPRKRHREDSDVEMVEDDSRKEMTAACTPRRRII
GENPEPT_7304079 ----------------SDSGVSSSSSQEASRLPEES------------FRVTAAKKSREV
GENPEPT_460627 NESEQPRDANTINDNDLKDQPKKKQKLGDYKVPSIADDEKNALPISKDGYIRVPKERVNV
GENPEPT_3880333 STGGEKEPEDLNTDFDDVSMVSLVSTADGRRLNESQD-----LGEDDDVDFEYGKTHREF
: . .
GENPEPT_7595954 NLTSVLSLQEEISERCHETLREILRNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_1724118 NLTSVLSLQEEINDRGHETLREMLRNHTFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
MLH1_HUMAN NLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQW--ALAQHQTKLYLLNTTKLSEEL
GENPEPT_7304079 RLSSVLDMRKRVERQCSVQLRSTLKNLVYVGCVDERR--ALFQHETRLYMCNTRSFSEEL
GENPEPT_460627 NLTSIKKLREKVDDSIHRELTDIFANLNYVGVVDEERRLAAIQHDLKLFLIDYGSVCYEL
GENPEPT_3880333 HFESIEVLRKEIIANSSQSLREMFKTSTFVGSINVKQ--VLIQFGTSLYHLDFSTVLREF
.: *: :::.: * . : . :** :: . . *. *: : .. *:
GENPEPT_7595954 FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEDDGPKEGLA-----EYIVEF
GENPEPT_1724118 FYQILIYDFANFGVLRLPEPAPLFDFAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
MLH1_HUMAN FYQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLA-----EYIVEF
GENPEPT_7304079 FYQRMIYEFQNCSEITISPPLPLKELLILSLESEAAGWTPEDGDKAELA-----DGAADI
GENPEPT_460627 FYQIGLTDFANFGKINLQSTNVSDDIVLYNLLSEFDELN-DDASK---------EKIISK
GENPEPT_3880333 FYQISVFSFGNYGSYRLDE-EPPAIIEILELLGELSTREPNYAAFEVFANVENRFAAEKL
*** : .* * . : : : * . : . .
GENPEPT_7595954 LKKKAEMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
GENPEPT_1724118 LKKKAKMLADYFSVEIDEEGN--------LIGLPLLIDSYVPPLEGLPIFILRLATEVNW
MLH1_HUMAN LKKKAEMLADYFSLEIDEEGN--------LIGLPLLIDNYVPPLEGLPIFILRLATEVNW
GENPEPT_7304079 LLKKAPIMREYFGLRISEDGM--------LESLPSLLHQHRPCVAHLPVYLLRLATEVDW
GENPEPT_460627 IWDMSSMLNEYYSIELVNDGLDNDLKSVKLKSLPLLLKGYIPSLVKLPFFIYRLGKEVDW
GENPEPT_3880333 LAEHADLLHDYFAIKLDQLENGR----LHITEIPSLVHYFVPQLEKLPFLIATLVLNVDY
: . : :: :*:.:.: : : :* *:. . * : **. : * :*::
GENPEPT_7595954 DEEKECFESLSKECAMFYSIRKQYILEESTLSGQQSDMPGSTSKPWKWT--VEHIIYKAF
GENPEPT_1724118 DEE-ECFESLSKECAVFYSIRKQYILEESALSGQQSDMPGSPSKPWKWT--VEHIIYKAF
MLH1_HUMAN DEEKECFESLSKECAMFYSIRKQYISEESTLSGQQSEVPGSIPNSWKWT--VEHIVYKAL
GENPEPT_7304079 EQETRCFETFCRETARFY--------------AQLDWREGATAGFSRWT--MEHVLFPAF
GENPEPT_460627 EDEQECLDGILREIALLYIPDMVPKVDTLDASLSEDEKAQFINRKEHISSLLEHVLFPCI
GENPEPT_3880333 DDEQNTFRTICRAIGDLFTLDTN---------FITLDKKISAFSATPWKTLIKEVLMPLV
::* . : : : . :: . ::.:: .
GENPEPT_7595954 RSHLLPPKHFTEDGNVLQLANLPDLYKVFERC--
GENPEPT_1724118 RSHLLPPKHFTEDGNVLQLANLPDLCKVFERC--
MLH1_HUMAN RSHILPPKHFTEDGNILQLANLPDLYKVFERC--
GENPEPT_7304079 KKYLLPPPRIKD--QIYELTNLPTLYKVFERC--
GENPEPT_460627 KRRFLAPRHILK--DVVEIANLPDLYKVFERC--
GENPEPT_3880333 KRKFIPPEHFKQAGVIRQLADSHDLYKVFERCGT
: ::.* :: . : :::: * ******
Clustal W dendrogram
Unrooted tree (generated by Phylip's Drawtree)
Phylip-format dendrogram
(
(
GENPEPT_7595954:0.03502,
GENPEPT_1724118:0.04820)
:0.02520,
(
GENPEPT_7304079:0.24257,
(
GENPEPT_460627:0.33127,
GENPEPT_3880333:0.37354)
:0.04885)
:0.19165,
MLH1_HUMAN:0.05620);
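The dendrogram above is in Newick (Phylip) format, where each `name:length` pair gives a leaf and the length of the branch leading to it. A minimal sketch for pulling out the leaf branch lengths with a regular expression follows; `leaf_branch_lengths` is a hypothetical helper, and a full Newick parser would be needed to recover the tree topology or internal branch lengths.

```python
import re

NEWICK = """(
(
GENPEPT_7595954:0.03502,
GENPEPT_1724118:0.04820)
:0.02520,
(
GENPEPT_7304079:0.24257,
(
GENPEPT_460627:0.33127,
GENPEPT_3880333:0.37354)
:0.04885)
:0.19165,
MLH1_HUMAN:0.05620);"""


def leaf_branch_lengths(newick: str) -> dict[str, float]:
    """Map each leaf label to the length of its terminal branch.

    Internal branch lengths follow a ')' rather than a label,
    so the pattern below skips them.
    """
    pairs = re.findall(r"([A-Za-z0-9_]+):([0-9.]+)", newick)
    return {name: float(length) for name, length in pairs}


leaves = leaf_branch_lengths(NEWICK)
for name, length in sorted(leaves.items(), key=lambda item: item[1]):
    print(f"{name}\t{length:.5f}")
```

Sorting by terminal branch length puts the three mammalian sequences first, which is consistent with their high pairwise scores in the run log.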
Clustal W options and diagnostic messages
Alignment type: Protein
Alignment order: aligned
Pairwise alignment parameters
Method: accurate
Matrix: Gonnet
Gap open penalty: 10.00
Gap extension penalty: 0.10
Multiple alignment parameters
Matrix: Gonnet
Negative matrix?: no
Gap open penalty: 10.00
Gap extension penalty: 0.20
% identity for delay: 30
Residue-specific gap penalties: on
Penalize end gaps: on
Hydrophilic gap penalties: on
Gap separation distance: 0
Hydrophilic residues: GPSNDQEKR
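To give a feel for the gap open and gap extension penalties listed above, here is a sketch of a generic affine gap cost. It is an assumption-laden simplification: it uses the common form "open + extension × length", and it ignores the position-specific adjustments (residue-specific and hydrophilic gap penalties, gap separation distance) that Clustal W applies on top of the base values. The function name is hypothetical.

```python
def affine_gap_cost(length: int, gap_open: float = 10.0, gap_extend: float = 0.2) -> float:
    """Generic affine gap cost: one opening charge plus a per-residue extension charge.

    Defaults mirror the multiple alignment parameters in this report;
    this is not Clustal W's exact scoring.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


# One long gap is cheaper than several short gaps covering the same number of residues.
print(affine_gap_cost(6))
print(3 * affine_gap_cost(2))
```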
CLUSTAL W (1.81) Multiple Sequence Alignments
Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: GENPEPT_7595954 760 aa
Sequence 2: GENPEPT_1724118 757 aa
Sequence 3: GENPEPT_7304079 664 aa
Sequence 4: GENPEPT_460627 769 aa
Sequence 5: GENPEPT_3880333 779 aa
Sequence 6: MLH1_HUMAN 756 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 91
Sequences (1:3) Aligned. Score: 50
Sequences (1:4) Aligned. Score: 36
Sequences (1:5) Aligned. Score: 32
Sequences (1:6) Aligned. Score: 88
Sequences (2:3) Aligned. Score: 48
Sequences (2:4) Aligned. Score: 36
Sequences (2:5) Aligned. Score: 32
Sequences (2:6) Aligned. Score: 86
Sequences (3:4) Aligned. Score: 37
Sequences (3:5) Aligned. Score: 33
Sequences (3:6) Aligned. Score: 51
Sequences (4:5) Aligned. Score: 29
Sequences (4:6) Aligned. Score: 36
Sequences (5:6) Aligned. Score: 32
Time for pairwise alignment: 1.192039
Guide tree file created: [../tmp-dir/7919.CLUSTALW.dnd]
Start of Multiple Alignment
There are 5 groups
Aligning...
Group 1: Sequences: 2 Score:15725
Group 2: Sequences: 3 Score:15401
Group 3: Sequences: 4 Score:10965
Group 4: Sequences: 5 Score:10162
Group 5: Sequences: 6 Score:7702
Time for multiple alignment: 2.113608
Alignment Score 30185
CLUSTAL-Alignment file created [../tmp-dir/7919.CLUSTALW.aln]
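The pairwise scores in the log above (generally understood to be percent identities between each sequence pair) can be collected into a lookup table for further inspection. The sketch below parses the "Sequences (i:j) Aligned. Score: n" lines; `parse_pairwise_scores` is a hypothetical helper, and the indices refer to the "Sequence 1..6" numbering in the same log.

```python
import re

LOG = """Sequences (1:2) Aligned. Score: 91
Sequences (1:3) Aligned. Score: 50
Sequences (1:4) Aligned. Score: 36
Sequences (1:5) Aligned. Score: 32
Sequences (1:6) Aligned. Score: 88
Sequences (2:3) Aligned. Score: 48
Sequences (2:4) Aligned. Score: 36
Sequences (2:5) Aligned. Score: 32
Sequences (2:6) Aligned. Score: 86
Sequences (3:4) Aligned. Score: 37
Sequences (3:5) Aligned. Score: 33
Sequences (3:6) Aligned. Score: 51
Sequences (4:5) Aligned. Score: 29
Sequences (4:6) Aligned. Score: 36
Sequences (5:6) Aligned. Score: 32"""

PAIR_LINE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def parse_pairwise_scores(log: str) -> dict[tuple[int, int], int]:
    """Map (i, j) sequence index pairs to their pairwise score."""
    return {(int(i), int(j)): int(score) for i, j, score in PAIR_LINE.findall(log)}


scores = parse_pairwise_scores(LOG)
closest = max(scores, key=scores.get)
most_distant = min(scores, key=scores.get)
print("closest pair:", closest, scores[closest])
print("most distant pair:", most_distant, scores[most_distant])
```

With the values in this run, the closest pair is sequences 1 and 2 (mouse and rat) and the most distant pair is sequences 4 and 5 (yeast and C. elegans).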
Citation
Algorithm Citation:
Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992) CLUSTAL V: improved
software for multiple sequence alignment. Computer Applications in the
Biosciences (CABIOS), 8(2):189-191.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Research, 22:4673-4680.
Felsenstein, J. (1989) PHYLIP -- Phylogeny Inference Package
(Version 3.2). Cladistics, 5:164-166.
Program Citation:
CLUSTAL W: Julie D. Thompson, Desmond G. Higgins and Toby J. Gibson,
modified; any errors are due to the modifications.
PHYLIP: Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package)
version 3.5c. Distributed by the author. Department of Genetics,
University of Washington, Seattle.
Copyright (C) 1999, Board of Trustees of the University of Illinois.