============================================================
GCATGCTGGCTCCTTTGGGATCGATCCGTCCGGTTCTTCTCCGGCCGGCCACCTCTCGAA
GGTGACGCTGTCGCCGACGAGCCACCGACATCCGACCGACAGCCCCCGACAGCGCTCCTA
CGCGGTGCCGACATGACACCGACACCGCAGGTCGGACGACGGGGGCTCAGGCGCGACGGG
CGCGGATCACGACGGCCGTACCGCCGCGACGGCGAGCACCGCCGCGCCGCCGAGGAGTGG
CCGAAGGAGTGAAGATCGGTTACGGACCGTAAAGGAGTACCTGGCGCACCGGCGCGTTGT
CGCATCGTCGTCCCGGCCGGTGGCGGAGCATGCCACCCATGCTGTCCGGTCTTCTGGCCA
GATTGGTCAAACTGCTGCTCGGGCGCCACGGCAGTGCGCTGCACTGGAGGGCCGCGGGTG
CCGCGACGGTCCTCCTGGTGATCGTCCTCCTCGCGGGCTCGTACTTGGCCGTCCTGGCTG
AGCGCGGCGCACCGGGCGCGCAGCTGATCACGTATCCGCGGGCGCTGTGGTGGTCCGTGG
AGACCGCGACGACCGTCGGCTACGGCGACCTGTACCCCGTGACTCTGTGGGGCCGGCTCG
TGGCCGTGGTGGTGATGGTCGCCGGGATCACCTCCTTCGGTCTGGTGACCGCCGCGCTGG
CCACCTGGTTCGTCGGCCGGGAACAAGAGCGCCGGGGCCACTTCGTGCGCCACTCCGAGA
AGGCCGCCGAGGAGGCGTACACGCGGACGACCCGGGCGCTGCACGAGCGTTTCGACCGTT
TGGAGCGAATGCTCGACGACAACCGCCGGTGACTCCGCCGGTGACCGCCCGAGCGAGGCC
GCACCGATGAGTCTGCGGCGGTTGTGCGGTCTACCCGTCGACGAAGGGAGCGCACCATGC
GCAAGATCATCATTTGCACGTTCCTGACGCTGGACGGCGTCATGCAGGCGCCGGGCGGCC
CGGACGAGGACGCCGAGAGCGGCTTCGAACACGGCGGCTGGCAGAAGCCGGTGGACGACG
ACGAGGTCGGCACGGCCATCGCCGGCTGGTACGAGGACTCCGACGCCATGCTCCTCGGCC
GCAAGACCTACGACATCTTCGCGTCGTACTGGCCGACCGCCGACCCCGACAACCCGTTCA
CCCATCGGATGAACAGCATGC
============================================================

Use ORF Finder to predict the coding region of the gene above.


(a) Which open reading frame is the correct one?

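1. Go to http://www.ncbi.nih.gov/ and click ORF Finder on the right side of the page to open ORF Finder.

2. Paste the DNA sequence above into the text box and click the OrfFind button.

3. When the results appear, select the first green box on the third bar from the top, then click the Accept button.

4. Before accepting the sequence, you can set the program at the top of the page to blastp and click the blast button to search the database for similar proteins. If similar proteins are found, this ORF is the correct one, and you can accept it with the Accept button in the middle of the page. Repeat these two steps to accept the other protein in the same reading frame.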
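
The scan that ORF Finder performs can be approximated in a few lines of Python. This is only a minimal sketch, not ORF Finder's actual implementation: it assumes the standard genetic code, accepts only ATG as a start codon, and looks at the three forward frames only (ORF Finder also scans the reverse strand). The function names `translate` and `forward_orfs` are hypothetical helpers, and the demo fragment is the first ten codons of the accepted ORF with a stop codon appended for illustration.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unknown codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def forward_orfs(dna, min_residues=30):
    """Return (start, frame, protein) for ATG-initiated ORFs in frames +1 to +3.

    start is 1-based, matching the coordinates in ORF Finder headers.
    """
    dna = dna.upper()
    found = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                protein = translate(dna[i:])
                if len(protein) >= min_residues:
                    found.append((i + 1, frame + 1, protein))
                i += 3 * (len(protein) + 1)  # skip past this ORF and its stop codon
            else:
                i += 3
    return found


if __name__ == "__main__":
    # Illustrative fragment: two padding bases, ten codons from the ORF, TGA, padding.
    demo = "CC" + "ATGCCACCCATGCTGTCCGGTCTTCTGGCC" + "TGA" + "GG"
    for start, frame, protein in forward_orfs(demo, min_residues=5):
        print(f"start={start} frame=+{frame} protein={protein}")
```

Running `forward_orfs` on the full sequence above with a larger `min_residues` gives a candidate list to compare against the green boxes in the ORF Finder display; the blastp check in step 4 is still what decides which candidate is real.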


(b) How many residues does this protein have? Give the sequence in FASTA format.

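1. On the page from part (a), set the drop-down menu at the top to FASTA and click the View button on the left.

2. This returns the protein sequences in FASTA format. The first ORF encodes 160 residues (the trailing * marks the stop codon). The second encodes 88 residues and runs to the end of the input sequence without a stop codon.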
>lcl|Sequence 1 ORF:330..812 Frame +3
MPPMLSGLLARLVKLLLGRHGSALHWRAAGAATVLLVIVLLAGSYLAVLAERGAPGAQLITYPRALWWSV
ETATTVGYGDLYPVTLWGRLVAVVVMVAGITSFGLVTAALATWFVGREQERRGHFVRHSEKAAEEAYTRT
TRALHERFDRLERMLDDNRR*
>lcl|Sequence 2 ORF:897..1161 Frame +3
MRKIIICTFLTLDGVMQAPGGPDEDAESGFEHGGWQKPVDDDEVGTAIAGWYEDSDAMLLGRKTYDIFAS
YWPTADPDNPFTHRMNSM
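
The residue counts can be checked directly from the FASTA records above. The following is a small sketch using only the Python standard library; `read_fasta` and `residue_count` are hypothetical helper names, and the trailing `*` is treated as a stop-codon marker rather than a residue.

```python
FASTA_TEXT = """\
>lcl|Sequence 1 ORF:330..812 Frame +3
MPPMLSGLLARLVKLLLGRHGSALHWRAAGAATVLLVIVLLAGSYLAVLAERGAPGAQLITYPRALWWSV
ETATTVGYGDLYPVTLWGRLVAVVVMVAGITSFGLVTAALATWFVGREQERRGHFVRHSEKAAEEAYTRT
TRALHERFDRLERMLDDNRR*
>lcl|Sequence 2 ORF:897..1161 Frame +3
MRKIIICTFLTLDGVMQAPGGPDEDAESGFEHGGWQKPVDDDEVGTAIAGWYEDSDAMLLGRKTYDIFAS
YWPTADPDNPFTHRMNSM
"""


def read_fasta(text):
    """Parse FASTA text into a dict mapping header (without '>') to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def residue_count(sequence):
    """Count residues, ignoring a trailing '*' stop-codon marker."""
    return len(sequence.rstrip("*"))


if __name__ == "__main__":
    for header, sequence in read_fasta(FASTA_TEXT).items():
        print(header, "->", residue_count(sequence), "residues")
```

As a cross-check, the first ORF spans 812 - 330 + 1 = 483 nucleotides, which is 161 codons: the residues plus one stop codon.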


(c) Use blast to search the nr database. Set E value to 0.0000001 with PAM70 matrix. Show 10 top hits.

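1. From the top of the NCBI home page, select BLAST and open the blastp page.

2. Enter each of the FASTA format sequences obtained in (b) into the query box. Below it, set the E value and the matrix to the values given above, and set descriptions to 10 to limit the number of hits shown. The search then returns the top hits, which are used in (d).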
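
For reference, the same settings can be written down as a parameter set for a scripted submission. The sketch below only builds and prints a URL-encoded query string; it does not contact NCBI. The parameter names (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`, `EXPECT`, `MATRIX`, `HITLIST_SIZE`) are my reading of the NCBI BLAST URL API and should be checked against its current documentation before use; `blast_put_params` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def blast_put_params(query, evalue="1e-7", matrix="PAM70", max_hits=10):
    """Build a URL-encoded parameter string for a blastp search against nr.

    Assumption: the keys follow the NCBI BLAST URL API naming; verify before use.
    """
    params = {
        "CMD": "Put",                  # submit a new search
        "PROGRAM": "blastp",           # protein query against protein database
        "DATABASE": "nr",
        "QUERY": query,
        "EXPECT": evalue,              # E value threshold 0.0000001
        "MATRIX": matrix,              # PAM70 scoring matrix
        "HITLIST_SIZE": str(max_hits), # keep the 10 top hits
    }
    return urlencode(params)


if __name__ == "__main__":
    # First residues of the second ORF from (b), used here only as a short example query.
    print(blast_put_params("MRKIIICTFLTLDGVMQAPGGPDEDAESGF"))
```

PAM70 is tuned for shorter, more closely related matches than the default BLOSUM62, so expect the hit list to differ from a default-settings search.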


(d) Please identify this protein. Give its name, accession, function, and organism.

1. From the results above, take the entry with the lowest E value for each query sequence. This gives two proteins:
(1) voltage gated potassium channel; accession NP_631700; function: potassium channel; organism: Streptomyces coelicolor A3(2)
(2) putative secreted protein; accession NP_631701; function: unknown; organism: Streptomyces coelicolor A3(2)


(e) Use blast 2 sequence to compare the same protein in Homo sapiens.

1. Search Entrez for voltage gated potassium channel to find the similar human gene.

2. Note the GI number of that gene and enter it in blast2seq.

3. Click the Align button to get the result.
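
To make concrete what a pairwise comparison reports, here is a minimal Smith-Waterman local alignment score in Python. It is a teaching sketch, not what blast2seq runs: BLAST uses a heuristic search with a substitution matrix and affine gap costs, whereas this version assumes a flat match/mismatch score and a linear gap penalty. The function name and the scoring values are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Assumptions: flat match/mismatch scores and a linear gap penalty.
    Only the score is computed; the alignment itself is not traced back.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Fragment of the first ORF from (b).
    orf_fragment = "ETATTVGYGDLYPVTLWGRL"
    # Toy comparison string made up for illustration; NOT a real human sequence.
    toy_fragment = "TTVGYGDMYP"
    print(smith_waterman_score(orf_fragment, toy_fragment))
```

Because the score is local, a short conserved stretch shared by two otherwise unrelated proteins still scores well, which is why blast2seq can align a bacterial channel against a much longer human one.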


(f) Search for conserved domains of this gene in Pfam database. How many domains can you find? Give the name, Pfam number and consensus sequence of the conserved domain.

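1. Go to the Pfam home page and click Protein Search to open the search page.

2. Paste the protein sequence translated by ORF Finder into the text box and click the submit button.

3. No conserved domain is found for the first translated sequence; one conserved domain is found for the second.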
Domain name: RibD_C
Accession number: PF01872
Consensus sequence:
>lcl|consensus RibD C-terminal domain. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins.
------------------------------------------------------------
------------------------------------------------------------
---------------------------------------YVTLKYAMSLDGKTATATGSS
KWITGEEARRDVHQLRAEADAILVGAGTVLADNPSLTVRWVDGRQA----------RQPV
RVVVDSSLRVPLDARVLNt-DEAPTVIATTETaDSEKIEKLKELgvEVLVLGDDR--VDL
KELLEELYE-RGIRSVLVEGGA-TLNGSFLKAGLVDELIIYIAPKILG-GNAPTLVGGEG
FQKLADALRLRLKS-------------------------


Bonus: What is special about this protein?

Ans: This year's Nobel laureate Roderick MacKinnon was awarded the prize for his discovery of the spatial structure of potassium channels.