Posted on December 8, 2023 (author: 宋楊萬里)

Cytochrome c sequence retrieval and analysis

1. Log in to the NCBI website and search for cytochrome c protein sequences. Cytochrome c protein sequences from 14 species, including human, rat, yeast and Drosophila, were selected and compiled into the following table:

NO.  ACC NO    Organism
1    AAA28437  fruit fly
2    AAA21711  Rattus norvegicus
3    -         Homo sapiens
4    P00006    Bos taurus
5    CAA25046  Gallus gallus
6    S11172    yeast
7    AAC80552  Tigriopus californicus
8    CCSF      starfish
9    CCCA      common carp
10   CCHOZ     common zebra
11   AAL67777  Actinobacillus lignieresii
12   CCHOD     donkey
13   AAB86817  Pichia stipitis
14   CAA25899  Mus musculus

Protein sequences (numbered as in the table):

1
mgvpagdvek gkklfvqrca qchtveaggk
hkvgpnlhgl igrktgqaag faytdankak
gitwnedtlf eylenpkkyi pgtkmifagl
kkpnergdli aylksatk

2
mgdvekgkki fvqkcaqcht vekggkhktg
pnlhglfgrk tgqaagfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkkg
eradliaylk katne

3
mgdvekgkki fimkcsqcht vekggkhktg
pnlhglfgrk tgqapgysyt aanknkgiiw
gedtlmeyle npkkyipgtk mifvgikkke
eradliaylk katne

4
gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwg
eetlmeylen pkkyipgtkm ifagikkkge
redliaylkk atne

5
mgdiekgkki fvqkcsqcht vekggkhktg
pnlhglfgrk tgqaegfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkks
ervdliaylk datsk

6
mpyapgdekk gaslfktrca qchtvekgga
nkvgpnlhgv fgrktgqaeg fsyteanrdk
gitwdeetlf aylenpkkyi pgtkmafagf
kkpadrnnvi tylkkatse

7
mgdidkgkki fvqkctqcht ieaggkhkvg
pnlhgmygrq tgkaagysyt dankskgvtw
neetldiylt npkkyipgtk mvfaglkkkg
dredliaylk sasss

8
gqvekgkkif vqrcaqchtv ekagkhktgp
nlngilgrkt gqaagfsytd anrnkgitwk
netlfeylen pkkyipgtkm vfaglkkqke
rqdliaylea atk

9
gdvekgkkvf vqkcaqchtv zbggkhkvgp
nlwglfgrkt gqapgfsytb abkskgivwb
zztlmeylzb pkkyipgtkm ifagikkkge
radliaylks ats

10
gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwk
eetlmeylen pkkyipgtkm ifagikkkte
redliaylkk atne

11
mtkllqkiaf ilplvfslva xaemvdtfqf
qnetdrvrav alakslrcpq cqnqnlvesn
attayklrle vyemvnqgkt deeiikimte
rfghfvnykp pfna

12
gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwk
eetlmeylen pkkyipgtkm ifagikkkte
redliaylkk atne

13
mpapfekgse kkgatlfktr clqchtveeg
gphkvgpnlh gimgrksgqa vgysytdank
kkgvewseqt msdylenpkk yipgtkmafg
glkkpkdrnd lvtylasatk

14
mgdvekgkki fvqkcaqcht vekggkhktg
pnlhglfgrk tgqaagfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkkg
eradliaylk katne
2. Save the retrieved sequences as a text file in FASTA format.
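Step 2 can also be scripted. The sketch below is one possible way to do it, not the procedure the author used: it formats records as FASTA text, with a `>` header line followed by the sequence wrapped at 60 characters. The function names, the header layout (accession followed by organism) and the file name `cytc.fasta` are assumptions made for illustration, and only two of the 14 records are shown.

```python
from pathlib import Path

# Two of the 14 records from the table (accession, organism, sequence).
RECORDS = [
    ("AAA28437", "fruit fly",
     "mgvpagdvekgkklfvqrcaqchtveaggkhkvgpnlhgligrktgqaagfaytdankak"
     "gitwnedtlfeylenpkkyipgtkmifaglkkpnergdliaylksatk"),
    ("AAA21711", "Rattus norvegicus",
     "mgdvekgkkifvqkcaqchtvekggkhktgpnlhglfgrktgqaagfsytdanknkgitw"
     "gedtlmeylenpkkyipgtkmifagikkkgeradliaylkkatne"),
]


def to_fasta(records, width=60):
    """Return FASTA text: one '>' header per record, sequence wrapped at `width`."""
    lines = []
    for accession, organism, seq in records:
        seq = "".join(seq.split()).upper()  # drop any spaces, use upper case
        lines.append(f">{accession} {organism}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def write_fasta(records, path):
    """Write the records to `path` (hypothetical file name chosen by the caller)."""
    Path(path).write_text(to_fasta(records), encoding="ascii")
```

Calling `write_fasta(RECORDS, "cytc.fasta")` would produce a file that alignment programs accepting FASTA input can load.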
3. Take the second sequence (AAA21711) as a representative and predict the primary, secondary and tertiary structure of the protein.
a. The primary structure was analyzed with /tools/; the results are as follows:
User-provided sequence:
    1          11         21         31         41         51
    |          |          |          |          |          |
  1 MGDVEKGKKI FIMKCSQCHT VEKGGKHKTG PNLHGLFGRK TGQAPGYSYT AANKNKGIIW 60
 61 GEDTLMEYLE NPKKYIPGTK MIFVGIKKKE ERADLIAYLK KATNE
References and documentation are available.
Number of amino acids: 105
Molecular weight: 11748.7
Theoretical pI: 9.59
Amino acid composition:
Ala (A) 6 5.7%
Arg (R) 2 1.9%
Asn (N) 5 4.8%
Asp (D) 3 2.9%
Cys (C) 2 1.9%
Gln (Q) 2 1.9%
Glu (E) 8 7.6%
Gly (G) 13 12.4%
His (H) 3 2.9%
Ile (I) 8 7.6%
Leu (L) 6 5.7%
Lys (K) 18 17.1%
Met (M) 4 3.8%
Phe (F) 3 2.9%
Pro (P) 4 3.8%
Ser (S) 2 1.9%
Thr (T) 7 6.7%
Trp (W) 1 1.0%
Tyr (Y) 5 4.8%
Val (V) 3 2.9%
Asx (B) 0 0.0%
Glx (Z) 0 0.0%
Xaa (X) 0 0.0%
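The composition table can be re-derived from the submitted sequence by plain counting. The sketch below is an illustrative check, not the tool's own code: it tallies each residue, converts the tally to a percentage of the 105 residues, and sums the acidic (Asp + Glu) and basic (Arg + Lys) residues. The function and variable names are made up for this example.

```python
from collections import Counter

# The user-provided sequence shown at the top of the report.
SEQ = ("MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
       "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE")


def composition(seq):
    """Map each residue letter to (count, percent of sequence length)."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: (c, round(100.0 * c / n, 1)) for aa, c in sorted(counts.items())}


comp = composition(SEQ)
negative = SEQ.count("D") + SEQ.count("E")  # Asp + Glu
positive = SEQ.count("R") + SEQ.count("K")  # Arg + Lys

for aa, (count, percent) in comp.items():
    print(f"{aa} {count:3d} {percent:5.1f}%")
print("negatively charged:", negative)
print("positively charged:", positive)
```

For this sequence the counts agree with the table, for example 18 Lys (17.1%) and 13 Gly (12.4%).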
Total number of negatively charged residues (Asp + Glu): 11
Total number of positively charged residues (Arg + Lys): 20
Atomic composition:
Carbon C 526
Hydrogen H 845
Nitrogen N 143
Oxygen O 149
Sulfur S 6
Formula: C526H845N143O149S6
Total number of atoms: 1669
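The atom total and the molecular weight both follow from the formula. As a rough cross-check, the sketch below parses the formula string, sums the atoms, and multiplies each count by an average atomic mass. The atomic masses are standard average values chosen here as an assumption; the report does not say which values the tool uses, so agreement with the reported 11748.7 is only expected to within a few tenths of a dalton.

```python
import re

# Average atomic masses (assumed values, not taken from the report).
ATOMIC_MASS = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994, "S": 32.065}


def parse_formula(formula):
    """Turn 'C526H845...' into {'C': 526, 'H': 845, ...}."""
    return {element: int(n) for element, n in re.findall(r"([A-Z][a-z]?)(\d+)", formula)}


atoms = parse_formula("C526H845N143O149S6")
total_atoms = sum(atoms.values())
mol_weight = sum(ATOMIC_MASS[element] * n for element, n in atoms.items())

print("total atoms:", total_atoms)
print("molecular weight: %.1f" % mol_weight)
```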
Extinction coefficients:
Conditions: 6.0 M guanidium hydrochloride, 0.02 M phosphate buffer, pH 6.5
Extinction coefficients are in units of M^-1 cm^-1.
The first table lists values computed assuming ALL Cys
residues appear as half cystines, whereas the second table
assumes that NONE do.

                    276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient     12795    12727    12505    12210    11720
Abs 0.1% (=1 g/l)    1.089    1.083    1.064    1.039    0.998

                    276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient     12650    12600    12385    12090    11600
Abs 0.1% (=1 g/l)    1.077    1.072    1.054    1.029    0.987
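The 280 nm column of both tables can be reproduced with a weighted count of Tyr, Trp and cystine. This is a sketch under an assumption: the per-residue values 1280 (Tyr), 5690 (Trp) and 120 (cystine) are not stated in the report; they are used here because they reproduce the reported 280 nm figures. The other wavelengths need their own sets of values and are not covered.

```python
SEQ = ("MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
       "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE")

# Assumed molar extinction values at 280 nm (M^-1 cm^-1).
TYR_280 = 1280
TRP_280 = 5690
CYSTINE_280 = 120

MOL_WEIGHT = 11748.7  # molecular weight reported above


def extinction_280(seq, all_cys_paired):
    """Extinction coefficient at 280 nm; two Cys form one cystine when paired."""
    cystines = seq.count("C") // 2 if all_cys_paired else 0
    return seq.count("Y") * TYR_280 + seq.count("W") * TRP_280 + cystines * CYSTINE_280


def abs_0_1_percent(ext_coefficient):
    """Absorbance of a 1 g/l solution."""
    return ext_coefficient / MOL_WEIGHT


for paired in (True, False):
    ext = extinction_280(SEQ, paired)
    print(paired, ext, round(abs_0_1_percent(ext), 3))
```

With 5 Tyr, 1 Trp and 2 Cys this gives 12210 when both Cys are counted as one cystine and 12090 when they are not.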
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
                            >20 hours (yeast, in vivo).
                            >10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 11.38
This classifies the protein as stable.
Aliphatic index: 66.00
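The aliphatic index is a weighted sum of the mole percentages of Ala, Val, Ile and Leu. The sketch below uses the commonly cited weights 2.9 for Val and 3.9 for Ile and Leu; the report does not print the formula, so treating these as the weights behind the value above is an assumption, although they do reproduce it.

```python
SEQ = ("MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
       "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE")


def mole_percent(seq, residue):
    """Percentage of positions occupied by `residue`."""
    return 100.0 * seq.count(residue) / len(seq)


def aliphatic_index(seq, a=2.9, b=3.9):
    """X(Ala) + a * X(Val) + b * (X(Ile) + X(Leu)), with X in mole percent."""
    return (mole_percent(seq, "A")
            + a * mole_percent(seq, "V")
            + b * (mole_percent(seq, "I") + mole_percent(seq, "L")))


print("Aliphatic index: %.2f" % aliphatic_index(SEQ))
```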
Grand average of hydropathicity (GRAVY): -0.706
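GRAVY is the mean hydropathy value over all residues. A minimal sketch, assuming the Kyte-Doolittle scale (the scale is not named in the report, but it reproduces the value above):

```python
SEQ = ("MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
       "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE")

# Kyte-Doolittle hydropathy values.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Sum of hydropathy values divided by sequence length."""
    return sum(HYDROPATHY[aa] for aa in seq) / len(seq)


print("GRAVY: %.3f" % gravy(SEQ))
```

A negative value such as this one indicates a hydrophilic protein overall, which fits the high Lys content in the composition table.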
b. The secondary structure was predicted with:
/cgi-bin/npsa_?page=npsa_
The results are as follows:
GOR4 result for : UNK_162940
Abstract
GOR secondary structure prediction method version IV, J. Garnier, J.-F. Gibrat, B. Robson,
Methods in Enzymology, R.F. Doolittle Ed., vol 266, 540-553, (1996)
        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MGDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQAAGFSYTDANKNKGITWGEDTLMEYLE
cccccccceeeeeecccceeeecccccccccceeeecccccccccceeeccccccccceecccchhhhhc
NPKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
ccccccccchhhhhhhhhhcchhhhhhhhhhceec
Sequence length : 105
GOR4 :
   Alpha helix      (Hh) :    25 is  23.81%
   310 helix        (Gg) :     0 is   0.00%
   Pi helix         (Ii) :     0 is   0.00%
   Beta bridge      (Bb) :     0 is   0.00%
   Extended strand  (Ee) :    21 is  20.00%
   Beta turn        (Tt) :     0 is   0.00%
   Bend region      (Ss) :     0 is   0.00%
   Random coil      (Cc) :    59 is  56.19%
   Ambiguous states (?)  :     0 is   0.00%
   Other states          :     0 is   0.00%
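The summary percentages are simply the share of each state letter in the per-residue prediction string (h = helix, e = extended strand, c = coil). The sketch below recounts them from the two prediction lines shown under the sequence; the variable and function names are illustrative only.

```python
from collections import Counter

# Per-residue GOR4 states, written in blocks of ten positions.
PRED = (
    "ccccccccee" "eeeeccccee" "eecccccccc" "cceeeecccc" "cccccceeec"
    "ccccccccee" "cccchhhhhc" "ccccccccch" "hhhhhhhhhc" "chhhhhhhhh"
    "hceec"
)

STATES = {"h": "Alpha helix", "e": "Extended strand", "c": "Random coil"}


def summarize(pred):
    """Map each state name to (count, percent of residues)."""
    counts = Counter(pred)
    n = len(pred)
    return {name: (counts.get(code, 0), round(100.0 * counts.get(code, 0) / n, 2))
            for code, name in STATES.items()}


for name, (count, percent) in summarize(PRED).items():
    print(f"{name:16s}: {count:3d} is {percent:6.2f}%")
```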
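c. The tertiary structure was predicted with /urbm/bioinfo/esypred/
The results are as follows: e-mail: *******************
4. Infer a phylogenetic tree with the PHYLIP software.
a. Open the file → in the File drop-down menu click Load Sequence → make the selection in the pop-up window and click Open → in the Alignment drop-down menu click Do Complete Alignment → click ALIGN → in the File drop-down menu click Save Sequence as → under Format in the pop-up window choose PHYLIP → OK → the file is obtained, as follows:
14 111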
fruit --MGVPAGDV EKGKKLFVQR CAQCHTVEAG GKHKVGPNLH GLIGRKTGQA
starfish -------GQV EKGKKIFVQR CAQCHTVEKA GKHKTGPNLN GILGRKTGQA
common -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
donkey -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Bos -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Rattus ------MGDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Mus ------MGDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Homo ------MGDV EKGKKIFIMK CSQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Gallus ------MGDI EKGKKIFVQK CSQCHTVEKG GKHKTGPNLH GLFGRKTGQA
carp -------GDV EKGKKVFVQK CAQCHTVZBG GKHKVGPNLW GLFGRKTGQA
Tigriopus ------MGDI DKGKKIFVQK CTQCHTIEAG GKHKVGPNLH GMYGRQTGKA
yeast --MPYAPGDE KKGASLFKTR CAQCHTVEKG GANKVGPNLH GVFGRKTGQA
Pichia MPAPFEKGSE KKGATLFKTR CLQCHTVEEG GPHKVGPNLH GIMGRKSGQA
Actinobaci -----MTKLL QKIAFILPLV FSLVAXAEMV DTFQFQNETD RVR--AVALA
AGFAYTDANK AKGITWNEDT LFEYLENPKK YIPGTKMIFA GLKKPNERGD
AGFSYTDANR NKGITWKNET LFEYLENPKK YIPGTKMVFA GLKKQKERQD
PGFSYTDANK NKGITWKEET LMEYLENPKK YIPGTKMIFA GIKKKTERED
PGFSYTDANK NKGITWKEET LMEYLENPKK YIPGTKMIFA GIKKKTERED
PGFSYTDANK NKGITWGEET LMEYLENPKK YIPGTKMIFA GIKKKGERED
AGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKGERAD
AGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKGERAD
PGYSYTAANK NKGIIWGEDT LMEYLENPKK YIPGTKMIFV GIKKKEERAD
EGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKSERVD
PGFSYTBABK SKGIVWBZZT LMEYLZBPKK YIPGTKMIFA GIKKKGE---
AGYSYTDANK SKGVTWNEET LDIYLTNPKK YIPGTKMVFA GLKKKGDRED
EGFSYTEANR DKGITWDEET LFAYLENPKK YIPGTKMAFA GFKKPADRNN
VGYSYTDANK KKGVEWSEQT MSDYLENPKK YIPGTKMAFG GLKKPKDRND
KSLRCPQCQN QNLVESNATT AYKLRLEVYE MVNQGKTDEE IIKIMTERFG
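LIAYLKSATK -
LIAYLEAATK -
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKDATS K
---------- -
LIAYLKSASS S
VITYLKKATS E
LVTYLASATK -
HFVNYKPPFN A
b. Go to the EXE folder and launch the SEQBOOT program. Enter the file name and press Enter, then type R to change the parameter and set the number of replicates to 200. Type Y to confirm the parameters, then enter an odd random number seed, 3. The program starts running and writes an outfile file to the EXE folder.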
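What SEQBOOT does in step b is bootstrap resampling: each replicate is built by drawing alignment columns at random, with replacement, until the replicate is as long as the original alignment. The sketch below illustrates the idea on a 10-column slice of three rows of the alignment. It is not SEQBOOT itself: the function names are made up, and Python's random number generator will not reproduce SEQBOOT's replicates even with the same seed.

```python
import random

# Columns 11-20 of three rows of the alignment above (toy input).
ALIGNMENT = {
    "fruit": "EKGKKLFVQR",
    "starfish": "EKGKKIFVQR",
    "Homo": "EKGKKIFIMK",
}


def bootstrap_alignment(alignment, rng):
    """Resample columns with replacement; every row uses the same column draw."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=200, seed=3):
    """Return `n_replicates` resampled alignments from one seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


replicates = bootstrap_replicates(ALIGNMENT)
print(len(replicates), "replicates; first one:", replicates[0])
```

Because whole columns are drawn, the rows of a replicate stay aligned with one another, which is what lets the later steps compute one tree per replicate.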
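c. Rename the outfile file to infile. Launch the PROTDIST program. Type M to change the parameters, type D to select data sets, and enter 200. Type Y to confirm the parameters. The program starts running and writes an outfile file to the EXE folder.
d. Rename this outfile to infile; to avoid a clash with the existing infile, first rename the existing file to infile1. In the EXE folder, choose the algorithm that infers a phylogenetic tree from a distance matrix by launching the NEIGHBOR program. Type M to change the parameters, type D to select data sets, and enter 200. Enter an odd random number seed, 3. Type Y to confirm the parameters. The program starts running and writes two result files, outfile and outtree, to the EXE folder. The outtree file is a tree file that can be opened with software such as TreeView. The outfile file is a report of the analysis, containing the tree and some other output, and can be opened directly in Notepad. Part of the results are as follows:
Consensus tree program, version 3.6b
Species in order:
  1. Homo
  2. Mus
  3. Rattus
  4. Bos
  5. common
  6. donkey
  7. carp
  8. Tigriopus
  9. yeast
  10. Actinobaci
  11. Pichia
  12. Gallus
  13. starfish
  14. fruit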
Sets included in the consensus tree

Set (species in order)     How many times out of  200.00
....**.... ....                166.00
.**....... ....                161.00
.......*** *.**                133.00
........** *.**                 93.00
...******* ****                 84.00
...***.... ....                 83.00
........** ....                 80.00
.......*** ****                 70.00
........** *.*.                 68.00
........** *...                 61.00
...****... ....                 57.00
Sets NOT included in consensus tree:

Set (species in order)     How many times out of  200.00
......**** *.**                 63.00
[remaining rows, for partitions that occur no more often than this, not reproduced]
Extended majority rule consensus tree

CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of 200.00 trees

                                                 +-------------Pichia
                                          +-61.0-|
                                          |      |      +------yeast
                                   +-68.0-|      +-80.0-|
                                   |      |             +------Actinobaci
                            +-93.0-|      |
                            |      |      +--------------------starfish
                     +133.0-|      |
                     |      |      +---------------------------fruit
              +-70.0-|      |
              |      |      +----------------------------------Tigriopus
       +-84.0-|      |
       |      |      +-----------------------------------------Gallus
       |      |
       |      |                                         +------common
       |      |                                  +166.0-|
       |      |                                  |      +------donkey
       |      |                           +-83.0-|
       |      |                           |      +-------------Bos
       |      +----------------------57.0-|
       |                                  +--------------------carp
+------|
|      |                                                +------Rattus
|      +------------------------------------------161.0-|
|                                                       +------Mus
|
+--------------------------------------------------------------Homo
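Each dot-and-star row in the CONSENSE report encodes one group of species: position i refers to species i in the "Species in order" list, and `*` marks membership. The sketch below decodes such a row into names and converts a count out of 200 trees into a bootstrap percentage. The helper names are made up for this example; the two rows used are the first two of the "Sets included" list.

```python
SPECIES = ["Homo", "Mus", "Rattus", "Bos", "common", "donkey", "carp",
           "Tigriopus", "yeast", "Actinobaci", "Pichia", "Gallus",
           "starfish", "fruit"]


def decode_set(pattern):
    """Return the species marked with '*' in a row such as '....**.... ....'."""
    flags = pattern.replace(" ", "")
    if len(flags) != len(SPECIES):
        raise ValueError("pattern length does not match the species list")
    return [name for name, flag in zip(SPECIES, flags) if flag == "*"]


def support_percent(count, n_trees=200):
    """Share of bootstrap trees that contain the group, in percent."""
    return 100.0 * count / n_trees


for pattern, count in [("....**.... ....", 166), (".**....... ....", 161)]:
    print(decode_set(pattern), support_percent(count))
```

Read this way, the group of "common" and "donkey" appears in 166 of 200 trees (83%), and the group of Mus and Rattus in 161 of 200 (80.5%).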
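e. Rename the outtree file to intree and launch the DRAWTREE program. Enter the font1 file name as the parameter. Type Y to confirm the parameters. The program starts running and the Tree Preview window appears.
f. Launch the DRAWGRAM program. Enter the font1 file name as the parameter. Type Y to confirm the parameters. The program starts running and the Tree Preview window appears.
g. Rename the outfile file in the EXE folder to outfile1 so that it is not overwritten by the newly generated outfile. Launch the CONSENSE program. Type Y to confirm the settings. New outfile and outtree files are generated in the EXE folder. Open the outfile file in Notepad. Rename the intree file in the EXE folder to intree1 and rename outtree to intree. Launch the DRAWTREE program. Enter the font1 file name as the parameter. Type Y to confirm the parameters. The program starts running and the Tree Preview window appears.
h. Launch the DRAWGRAM program. Enter the font1 file name as the parameter. Type Y to confirm the parameters. The program starts running and the Tree Preview window appears.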
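The distance matrices that step c produces and step d consumes hold one evolutionary distance per pair of sequences. PROTDIST estimates these with an amino acid substitution model; the sketch below deliberately uses a much simpler stand-in, the p-distance (the fraction of compared columns that differ, skipping columns with a gap), just to show what such a matrix contains. The function names are illustrative, and the input is the first 50 alignment columns of three rows.

```python
# First 50 columns of three rows of the ClustalX alignment (toy input).
ALIGNMENT = {
    "Rattus": "------MGDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQA",
    "Homo":   "------MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQA",
    "Gallus": "------MGDIEKGKKIFVQKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQA",
}


def p_distance(a, b):
    """Fraction of differing residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Return (names, square matrix of pairwise p-distances)."""
    names = list(alignment)
    matrix = [[p_distance(alignment[i], alignment[j]) for j in names] for i in names]
    return names, matrix


names, matrix = distance_matrix(ALIGNMENT)
for name, row in zip(names, matrix):
    print(f"{name:8s}", " ".join(f"{d:.4f}" for d in row))
```

A distance-based method such as NEIGHBOR then joins the closest pairs first; with 200 bootstrap data sets this yields 200 trees, which CONSENSE summarizes.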
Published: 2023-12-08 02:42:47