CuGenDBv2

CmoCh08G006330 (gene) of Cucurbita moschata (Rifu) v1 genome

Gene ID: CmoCh08G006330
Organism: Cucurbita moschata Rifu (Cucurbita moschata (Rifu) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: Cmo_Chr08:4106513..4107460
RNA-Seq Expression: CmoCh08G006330
Synteny: CmoCh08G006330
Gene Ontology terms:
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
  IPR025314 - Domain of unknown function DUF4219
  IPR036875 - Zinc finger, CCHC-type superfamily
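
The genome location field follows a `chromosome:start..end` pattern. As a minimal sketch (the function name `parse_location` is hypothetical, and treating both coordinates as 1-based and inclusive is an assumption, not something this record states), the field could be split into its parts like this:

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_location("Cmo_Chr08:4106513..4107460")

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)
```

Under that inclusive-coordinate assumption the locus spans 948 bp on Cmo_Chr08.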


Homology
GenBank top hits    e-value    % identity    Alignment
XP_012483005.1 PREDICTED: uncharacterized protein LOC105797604 [Gossypium raimondii]    2.0e-101    67.54
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------
        MEG S+FS V PPVFDGDNYQMWAVRMETYL+ALDLWE +EEDY+VP LPANPTVAQIK QKEKKTRKSKAK CLF                        
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------

Query:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE
                               + KMKESESVKEYSD         RLLGS LNDSRIVEKLLV V EKFEAT TTLEN+KDLSKISL E LNALQAQE
Subjt:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE

Query:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS-SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQV
        QRR ++Q+GVIEG LP+KHQDN+RYK KK FKNQST+G++ S NYQK+K GG KKSYPP  HCEKKGHPP+KCW+RPD KCSKCNQLGHEAVICKVKGQV
Subjt:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS-SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQV

Query:  KEVDA
        +EVDA
Subjt:  KEVDA

XP_022927156.1 uncharacterized protein LOC111434088 [Cucurbita moschata]    2.4e-99    69.76
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------
        M GESSFS VAPP+FDGDNYQMWAVRMETYL+ALDLWE IEEDY+VP LPANPTVAQIKLQKEKKTRKSKAK CLF                        
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------

Query:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE
                               + KMKESESVKEYSD         RLLGSVLNDSRIVEKLLV + EKFEATITTLEN+KDLSKISL E LNALQAQE
Subjt:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE

Query:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAV
        QRR +RQ+GVIEG L +KHQD+ RYKN K FKNQ T GDSSANYQKTKGGGFKKSYPP RHCEKKGHPPYKCWRRPD  CSKCNQLGHEA+
Subjt:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAV

XP_022932242.1 uncharacterized protein LOC111438605 [Cucurbita moschata]    1.9e-104    73.24
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPT----VAQIKLQKEKKTRKSKAKTCL---FVTKMKESESVKEYSD--
        M GESSFSTVAPPVFDGDNYQ+W VRMETYL+ALDLWEAIEEDY+VP LP NPT      + + + +++ R  K    +    + KMKES+SVKEY D  
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPT----VAQIKLQKEKKTRKSKAKTCL---FVTKMKESESVKEYSD--

Query:  -------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS
               RLLGS+LNDS+IVEKLLV V EKFEATITTLEN+KDLSKISL E LNALQAQEQRR +RQ+GVIE  L +KHQD+SRYKN K FKNQ T GDS
Subjt:  -------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS

Query:  SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
         ANYQKTKGGGFKKSYPP RHCEKKGHPPYKCWRRPD KCSKCNQLGHE +I KVKGQVKEVDAQ+VDQEEEDQLF+VT SS K
Subjt:  SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK

XP_022959005.1 uncharacterized protein LOC111460124 [Cucurbita moschata]    2.3e-110    70.59
Query:  GESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF--------------------------
        GESSFS VAPPVFDGDNYQMWAVRMETYL+ALDLWEAIEEDY+VP LPANPTVAQIKLQKEKKTRKSKAK CLF                          
Subjt:  GESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF--------------------------

Query:  ---------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQR
                             + KMK+SESVKEYS+         RLLGS+LNDSRIVEKLLV V EKFEATITTLEN+KDLSKISL E LNALQAQEQ+
Subjt:  ---------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQR

Query:  RFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEV
        R +RQ+GVIEG L +KHQDNSRYKN K FKNQ T GDSS NYQKTKGGGFKKSYP  RHCEKK HPPYKCWRRPD  CSKCNQLGHEAVICKVK  VKEV
Subjt:  RFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEV

Query:  DAQVVDQ-EEEDQLFVVTCSSGK
        DAQVVDQ EEEDQL +VT SS K
Subjt:  DAQVVDQ-EEEDQLFVVTCSSGK

XP_022964088.1 uncharacterized protein LOC111464225 [Cucurbita moschata]    1.7e-142    90.54
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT----------------------
        MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT                      
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT----------------------

Query:  ------KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK
              KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK
Subjt:  ------KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK

Query:  KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
        KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
Subjt:  KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK

TrEMBL top hits    e-value    % identity    Alignment
A0A6J1EGX0 uncharacterized protein LOC111434088    1.2e-99    69.76
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------
        M GESSFS VAPP+FDGDNYQMWAVRMETYL+ALDLWE IEEDY+VP LPANPTVAQIKLQKEKKTRKSKAK CLF                        
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------

Query:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE
                               + KMKESESVKEYSD         RLLGSVLNDSRIVEKLLV + EKFEATITTLEN+KDLSKISL E LNALQAQE
Subjt:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE

Query:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAV
        QRR +RQ+GVIEG L +KHQD+ RYKN K FKNQ T GDSSANYQKTKGGGFKKSYPP RHCEKKGHPPYKCWRRPD  CSKCNQLGHEA+
Subjt:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAV

A0A6J1EPL3 uncharacterized protein LOC111436671    2.8e-98    68.24
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------
        M GESSFS VAP VFDGDNYQMWAVR+ETYL+ LDLWEA EEDY+VP LPANP VAQIKLQKEK TRKSKAK CLF                        
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF------------------------

Query:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE
                               + KMKESE VKEYSD         RLLGS+LNDSRIVEKLLV V EKFEATITTLEN+KDLSKISL   LNALQAQE
Subjt:  -----------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQE

Query:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVK
        QRR +RQ+GVIEG L +KHQDNSRYKN K FKNQ  NGD SANYQK KGGGFKKSYPP RHCEKKGHPPYKCWRRPD  CSKCNQLGHEAVICK K
Subjt:  QRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVK

A0A6J1EW37 uncharacterized protein LOC111438605    9.1e-105    73.24
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPT----VAQIKLQKEKKTRKSKAKTCL---FVTKMKESESVKEYSD--
        M GESSFSTVAPPVFDGDNYQ+W VRMETYL+ALDLWEAIEEDY+VP LP NPT      + + + +++ R  K    +    + KMKES+SVKEY D  
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPT----VAQIKLQKEKKTRKSKAKTCL---FVTKMKESESVKEYSD--

Query:  -------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS
               RLLGS+LNDS+IVEKLLV V EKFEATITTLEN+KDLSKISL E LNALQAQEQRR +RQ+GVIE  L +KHQD+SRYKN K FKNQ T GDS
Subjt:  -------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDS

Query:  SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
         ANYQKTKGGGFKKSYPP RHCEKKGHPPYKCWRRPD KCSKCNQLGHE +I KVKGQVKEVDAQ+VDQEEEDQLF+VT SS K
Subjt:  SANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK

A0A6J1H529 uncharacterized protein LOC111460124    1.1e-110    70.59
Query:  GESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF--------------------------
        GESSFS VAPPVFDGDNYQMWAVRMETYL+ALDLWEAIEEDY+VP LPANPTVAQIKLQKEKKTRKSKAK CLF                          
Subjt:  GESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLF--------------------------

Query:  ---------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQR
                             + KMK+SESVKEYS+         RLLGS+LNDSRIVEKLLV V EKFEATITTLEN+KDLSKISL E LNALQAQEQ+
Subjt:  ---------------------VTKMKESESVKEYSD---------RLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQR

Query:  RFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEV
        R +RQ+GVIEG L +KHQDNSRYKN K FKNQ T GDSS NYQKTKGGGFKKSYP  RHCEKK HPPYKCWRRPD  CSKCNQLGHEAVICKVK  VKEV
Subjt:  RFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEV

Query:  DAQVVDQ-EEEDQLFVVTCSSGK
        DAQVVDQ EEEDQL +VT SS K
Subjt:  DAQVVDQ-EEEDQLFVVTCSSGK

A0A6J1HJT1 uncharacterized protein LOC111464225    8.5e-143    90.54
Query:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT----------------------
        MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT                      
Subjt:  MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVT----------------------

Query:  ------KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK
              KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK
Subjt:  ------KMKESESVKEYSDRLLGSVLNDSRIVEKLLVIVLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNK

Query:  KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
        KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
Subjt:  KIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRPDVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK

SwissProt top hits    e-value    % identity    Alignment
No hits found

Arabidopsis top hits    e-value    % identity    Alignment
No hits found
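
Each alignment block above pairs a `Query:` line with a midline in which a letter marks an identical residue, `+` marks a conservative substitution, and a space marks a mismatch or gap. The sketch below counts those columns for one block. The function name `midline_stats` is hypothetical, and the record does not state how its % identity column was computed, so this illustrates only the common identities-over-columns convention. The sample strings are the first ten columns of the Gossypium raimondii hit.

```python
def midline_stats(query: str, midline: str) -> tuple[int, int, int]:
    """Return (identities, positives, columns) for one alignment block."""
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    # A letter on the midline that equals the query residue is an identity.
    identities = sum(1 for q, m in zip(query, midline) if m.isalpha() and m == q)
    # '+' marks a conservative substitution; positives include identities.
    positives = identities + midline.count("+")
    return identities, positives, len(query)


query = "MEGESSFSTV"
midline = "MEG S+FS V"
ident, pos, cols = midline_stats(query, midline)
print(f"{ident}/{cols} identical, {pos}/{cols} positive")
```

For this ten-column fragment the count is 7 identities and 8 positives. A figure for a whole hit would need the counts summed over every block of that hit, gap columns included.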

Sequences
CDS sequence
ATGGAAGGAGAATCCAGTTTTTCAACTGTCGCACCACCAGTCTTTGATGGAGACAATTATCAAATGTGGGCAGTTCGAATGGAGACTTATCTGAAGGCTTTGGATCTTTG
GGAAGCAATAGAAGAGGATTACAAGGTCCCTTCACTTCCAGCAAATCCTACTGTAGCACAAATCAAATTACAAAAGGAAAAGAAGACAAGGAAATCAAAGGCGAAAACTT
GCTTATTTGTCACTAAGATGAAGGAGTCGGAGTCGGTGAAAGAGTACTCTGACAGACTGCTTGGTTCTGTGTTAAATGATTCCAGGATCGTTGAAAAGCTGCTAGTCATT
GTTCTAGAGAAGTTTGAAGCCACCATTACTACTCTGGAGAACTCCAAAGACTTGTCAAAGATTTCTCTTATAGAGCCCTTGAATGCTTTACAAGCACAAGAGCAAAGGAG
GTTTATAAGGCAACAAGGGGTGATTGAAGGTGTCTTACCTATTAAGCATCAAGACAACAGCAGGTATAAAAACAAGAAAATTTTTAAAAATCAATCGACGAATGGAGATT
CATCTGCCAATTATCAGAAGACAAAAGGAGGAGGTTTCAAAAAATCCTATCCACCTTGGCGCCATTGTGAGAAGAAAGGTCATCCACCATACAAGTGTTGGAGAAGACCT
GACGTCAAATGCTCCAAATGCAATCAACTTGGACATGAAGCTGTGATCTGCAAAGTTAAAGGTCAGGTGAAAGAAGTAGATGCACAAGTAGTTGATCAAGAAGAAGAAGA
TCAATTGTTTGTTGTCACTTGTTCCTCAGGCAAATAA
mRNA sequence
ATGGAAGGAGAATCCAGTTTTTCAACTGTCGCACCACCAGTCTTTGATGGAGACAATTATCAAATGTGGGCAGTTCGAATGGAGACTTATCTGAAGGCTTTGGATCTTTG
GGAAGCAATAGAAGAGGATTACAAGGTCCCTTCACTTCCAGCAAATCCTACTGTAGCACAAATCAAATTACAAAAGGAAAAGAAGACAAGGAAATCAAAGGCGAAAACTT
GCTTATTTGTCACTAAGATGAAGGAGTCGGAGTCGGTGAAAGAGTACTCTGACAGACTGCTTGGTTCTGTGTTAAATGATTCCAGGATCGTTGAAAAGCTGCTAGTCATT
GTTCTAGAGAAGTTTGAAGCCACCATTACTACTCTGGAGAACTCCAAAGACTTGTCAAAGATTTCTCTTATAGAGCCCTTGAATGCTTTACAAGCACAAGAGCAAAGGAG
GTTTATAAGGCAACAAGGGGTGATTGAAGGTGTCTTACCTATTAAGCATCAAGACAACAGCAGGTATAAAAACAAGAAAATTTTTAAAAATCAATCGACGAATGGAGATT
CATCTGCCAATTATCAGAAGACAAAAGGAGGAGGTTTCAAAAAATCCTATCCACCTTGGCGCCATTGTGAGAAGAAAGGTCATCCACCATACAAGTGTTGGAGAAGACCT
GACGTCAAATGCTCCAAATGCAATCAACTTGGACATGAAGCTGTGATCTGCAAAGTTAAAGGTCAGGTGAAAGAAGTAGATGCACAAGTAGTTGATCAAGAAGAAGAAGA
TCAATTGTTTGTTGTCACTTGTTCCTCAGGCAAATAA
Protein sequence
MEGESSFSTVAPPVFDGDNYQMWAVRMETYLKALDLWEAIEEDYKVPSLPANPTVAQIKLQKEKKTRKSKAKTCLFVTKMKESESVKEYSDRLLGSVLNDSRIVEKLLVI
VLEKFEATITTLENSKDLSKISLIEPLNALQAQEQRRFIRQQGVIEGVLPIKHQDNSRYKNKKIFKNQSTNGDSSANYQKTKGGGFKKSYPPWRHCEKKGHPPYKCWRRP
DVKCSKCNQLGHEAVICKVKGQVKEVDAQVVDQEEEDQLFVVTCSSGK
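
The CDS and protein sequences above can be checked against each other by translating the CDS with the standard genetic code. This is a minimal sketch assuming the standard nuclear code applies to this gene; the names `CODON_TABLE` and `translate` are illustrative and not part of the database. The two sample inputs are the first 30 and last 15 nucleotides of the CDS.

```python
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, ignoring whitespace and line breaks."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


print(translate("ATGGAAGGAGAATCCAGTTTTTCAACTGTC"))  # start of the CDS
print(translate("TCCTCAGGCAAATAA"))                 # end of the CDS, with stop
```

The first call yields `MEGESSFSTV`, matching the start of the protein sequence, and the second yields `SSGK*`, matching its last four residues followed by the stop codon. Passing the full CDS, line breaks included, should reproduce the whole protein with a trailing `*`.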