CuGenDBv2

MS016238 (gene) of Bitter gourd (TR) v1 genome

Gene ID: MS016238
Organism: Momordica charantia cv. TR (Bitter gourd (TR) v1)
Description: Stress response NST1-like protein
Genome location: scaffold9_2:1354526..1357635
RNA-Seq Expression: MS016238
Synteny: MS016238
Gene Ontology terms: GO:0016021 - integral component of membrane (cellular component)
InterPro domains: NA
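The genome location field can be split into its parts programmatically. The following is a minimal sketch, not part of CuGenDB itself: the helper name `parse_location` is hypothetical, and it assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re

def parse_location(location):
    """Split 'seqid:start..end' into (seqid, start, end).

    Assumption: coordinates are 1-based and inclusive.
    """
    match = re.fullmatch(r"(\S+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

seqid, start, end = parse_location("scaffold9_2:1354526..1357635")
# Length of the genomic span under the inclusive-coordinate assumption.
span = end - start + 1
print(seqid, start, end, span)
```

Under that assumption the gene spans 3110 bp on scaffold9_2.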


Homology
GenBank top hits (e value, % identity, alignment)
XP_004152172.1 uncharacterized protein LOC101207869 [Cucumis sativus] (e value: 3.1e-85; % identity: 74.31)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR SW  FSKR KP +TRSFCSKSH  TN        NG+NKVE DLSSY EAYKQLDNLD MTASKILFT P KKKKFG+DFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KKEEEEKAKQ+ELEETE+I E NPELQEVK RLDKLE TIKEIAVESRK SG+G  TKNSEK  +  K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           +S+KS++DHLG QKI  APVLPKGR SEST++++ +H N GGGSSPDA+R
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

XP_008454169.1 PREDICTED: uncharacterized protein LOC103494654 [Cucumis melo] (e value: 3.0e-80; % identity: 71.94)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR SW  FSKR KP +TRSFCSK H  TN        NG+NKV+ DLSSY EAYKQLDNLDFMTASKILFT P KKKKFG+DFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KKEEEEKAKQ+ELEE E+I E NPELQEVK RLDKLE+TIKEIAVESRK SG+G  TKNSEK  +  K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           + +KS++DHLG QKI  APVLPK   SEST++E+ +H N G GSS D KR
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

XP_022153425.1 uncharacterized protein LOC111020934 [Momordica charantia] (e value: 5.2e-125; % identity: 99.59)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY
        MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY

Query:  ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSEKDREGGKVKHGENNMGNASESSKS
        ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT KNSEKDREGGKVKHGENNMGNASESSKS
Subjt:  ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSEKDREGGKVKHGENNMGNASESSKS

Query:  VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
        VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
Subjt:  VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

XP_022956225.1 uncharacterized protein LOC111457985 [Cucurbita moschata] (e value: 9.5e-79; % identity: 73.91)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR S  RFSKR KPFQT  FCSKS   TN        NG+NKVESDLSSY EAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KK EEE AKQ++LEE EEI +KN ELQEVK RLDKLEETIKEIAVESRK SGSG  TKNSEK +   K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           + SKS++DHLG QKI  APVLPK R   ST+ E+ +H N GG SSPD+KR
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

XP_038901255.1 uncharacterized protein LOC120088201 [Benincasa hispida] (e value: 2.6e-84; % identity: 75.9)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN----NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYL
        M  AR SW RFSKR KPF+T SFCSKSH   N     NG+NKVESDLSSY EAYKQLDNLDFMTASKILFT P  KKKFGIDFHLVQLFF CMPSLAVYL
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN----NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYL

Query:  VAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMGNAS
        VAQYARYEMRKMEADLELK+KKEEEEKAKQ+ELEETEEI EKN ELQEVKIRLDKLEETIKEIAVE RK SG+G  TKNSEK ++  K KHG N      
Subjt:  VAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMGNAS

Query:  ESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAK
        + SKS++D LG QKI  APVLPKGR SEST++E+G+H N  GGSSP AK
Subjt:  ESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAK

TrEMBL top hits (e value, % identity, alignment)
A0A0A0KTR7 Uncharacterized protein (e value: 1.5e-85; % identity: 74.31)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR SW  FSKR KP +TRSFCSKSH  TN        NG+NKVE DLSSY EAYKQLDNLD MTASKILFT P KKKKFG+DFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KKEEEEKAKQ+ELEETE+I E NPELQEVK RLDKLE TIKEIAVESRK SG+G  TKNSEK  +  K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           +S+KS++DHLG QKI  APVLPKGR SEST++++ +H N GGGSSPDA+R
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

A0A1S3BZ75 uncharacterized protein LOC103494654 (e value: 1.4e-80; % identity: 71.94)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR SW  FSKR KP +TRSFCSK H  TN        NG+NKV+ DLSSY EAYKQLDNLDFMTASKILFT P KKKKFG+DFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KKEEEEKAKQ+ELEE E+I E NPELQEVK RLDKLE+TIKEIAVESRK SG+G  TKNSEK  +  K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT-TKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           + +KS++DHLG QKI  APVLPK   SEST++E+ +H N G GSS D KR
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

A0A6J1DGT5 uncharacterized protein LOC111020934 (e value: 2.5e-125; % identity: 99.59)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY
        MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQY

Query:  ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSEKDREGGKVKHGENNMGNASESSKS
        ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGT KNSEKDREGGKVKHGENNMGNASESSKS
Subjt:  ARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSEKDREGGKVKHGENNMGNASESSKS

Query:  VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
        VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
Subjt:  VEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

A0A6J1GVZ5 uncharacterized protein LOC111457985 (e value: 4.6e-79; % identity: 73.91)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR S  RFSKR KPFQT  FCSKS   TN        NG+NKVESDLSSY EAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNN-------NGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KK EEE AKQ++LEE EEI +KN ELQEVK RLDKLEETIKEIAVESRK SGSG  TKNSEK +   K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           + SKS++DHLG QKI  APVLPK R   ST+ E+ +H N GG SSPD+KR
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

A0A6J1IQM5 uncharacterized protein LOC111479631 (e value: 8.7e-78; % identity: 72.73)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGN-------NKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA
        MF AR S  RFSKR KPFQT  FCSKS   TN N N       NKVESDLSSY EAYKQLDNLDFMTA KILFT+PPKKKKFGIDFHLVQLFFVCMPSLA
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGN-------NKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLA

Query:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG
        VYLVAQYARYEMRKMEADLELK+KK EEE AKQ++L+E EEI +KN ELQEVK RLDKLEETIKEIAVESRK SGSG  TKNSEK +   K KHG N   
Subjt:  VYLVAQYARYEMRKMEADLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSG-TTKNSEKDREGGKVKHGENNMG

Query:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR
           + SKS++DHLG QKI  APVLPK R   ST+ E+ +H N GG SSPD+KR
Subjt:  NASESSKSVEDHLGRQKIELAPVLPKGRESESTSQENGQHPNDGGGSSPDAKR

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT1G80700.1 unknown protein (e value: 7.4e-37; % identity: 48.91)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSH-----TPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVY
        M   R SW   S R K ++TR FC+K       T ++   ++  ES +S Y E YK+LD LDF+TA+KILFT+PPKK KFG D+H+VQ   VC+PS+AVY
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSH-----TPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVY

Query:  LVAQYARYEMRKMEADLELKRKKEEEEKAK---QMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSE
        LVAQYAR +M+ M+A+L  K++KEEE+K K   + +  + E   + + EL E++ RL K+EETIKEI +E++KPSG+  TK  E
Subjt:  LVAQYARYEMRKMEADLELKRKKEEEEKAK---QMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSE

AT1G80980.1 unknown protein (e value: 4.3e-37; % identity: 48.91)
Query:  MFGARVSWGRFSKRFKPFQTRSFCSKSH-----TPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVY
        M   R SW   S R K ++TR FC+K       T ++   +++ ES +S Y E YK+LD LDF+TA+KILFT+PPKK KFG D+H+VQ   VC+PS+AVY
Subjt:  MFGARVSWGRFSKRFKPFQTRSFCSKSH-----TPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVY

Query:  LVAQYARYEMRKMEADLELKRKKEEEEKAK---QMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSE
        LVAQYAR +M+ M+A+L  K++KEEE+K K   + +  + E   + + EL E++ RL K+EETIKEI +E++KPSG+  TK  E
Subjt:  LVAQYARYEMRKMEADLELKRKKEEEEKAK---QMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSE


Sequences
CDS sequence
ATGTTTGGCGCCAGAGTCAGTTGGGGTCGATTTTCAAAGCGATTCAAGCCTTTCCAAACCAGATCATTCTGCTCCAAATCCCACACTCCCACCAATAACAATGGCAACAA
CAAGGTTGAGTCGGATCTGAGCAGCTACGGTGAGGCTTATAAGCAGCTGGATAACCTGGACTTCATGACCGCCTCCAAGATCCTCTTCACTGATCCTCCCAAGAAGAAGA
AATTTGGGATTGATTTCCATCTGGTGCAACTCTTCTTTGTTTGCATGCCTTCTTTGGCTGTTTATTTGGTGGCCCAATATGCTCGTTATGAAATGAGGAAAATGGAAGCG
GACCTGGAGCTGAAAAGGAAGAAAGAAGAAGAAGAGAAAGCTAAACAAATGGAGTTAGAAGAGACCGAAGAAATTCAGGAAAAGAATCCAGAGCTACAGGAGGTAAAAAT
AAGACTGGATAAACTTGAGGAGACCATAAAGGAAATTGCTGTTGAATCCAGAAAACCATCAGGAAGTGGTACGACAAAGAACTCTGAAAAAGATCGAGAAGGTGGTAAAG
TCAAACATGGGGAAAACAACATGGGGAATGCGTCAGAATCAAGTAAATCTGTGGAAGACCATCTTGGCAGACAAAAAATAGAACTTGCTCCAGTTTTGCCCAAAGGGCGC
GAGAGCGAGTCTACATCACAAGAAAATGGTCAGCATCCAAACGACGGTGGAGGATCCTCTCCAGATGCCAAGAGA
mRNA sequence
ATGTTTGGCGCCAGAGTCAGTTGGGGTCGATTTTCAAAGCGATTCAAGCCTTTCCAAACCAGATCATTCTGCTCCAAATCCCACACTCCCACCAATAACAATGGCAACAA
CAAGGTTGAGTCGGATCTGAGCAGCTACGGTGAGGCTTATAAGCAGCTGGATAACCTGGACTTCATGACCGCCTCCAAGATCCTCTTCACTGATCCTCCCAAGAAGAAGA
AATTTGGGATTGATTTCCATCTGGTGCAACTCTTCTTTGTTTGCATGCCTTCTTTGGCTGTTTATTTGGTGGCCCAATATGCTCGTTATGAAATGAGGAAAATGGAAGCG
GACCTGGAGCTGAAAAGGAAGAAAGAAGAAGAAGAGAAAGCTAAACAAATGGAGTTAGAAGAGACCGAAGAAATTCAGGAAAAGAATCCAGAGCTACAGGAGGTAAAAAT
AAGACTGGATAAACTTGAGGAGACCATAAAGGAAATTGCTGTTGAATCCAGAAAACCATCAGGAAGTGGTACGACAAAGAACTCTGAAAAAGATCGAGAAGGTGGTAAAG
TCAAACATGGGGAAAACAACATGGGGAATGCGTCAGAATCAAGTAAATCTGTGGAAGACCATCTTGGCAGACAAAAAATAGAACTTGCTCCAGTTTTGCCCAAAGGGCGC
GAGAGCGAGTCTACATCACAAGAAAATGGTCAGCATCCAAACGACGGTGGAGGATCCTCTCCAGATGCCAAGAGA
Protein sequence
MFGARVSWGRFSKRFKPFQTRSFCSKSHTPTNNNGNNKVESDLSSYGEAYKQLDNLDFMTASKILFTDPPKKKKFGIDFHLVQLFFVCMPSLAVYLVAQYARYEMRKMEA
DLELKRKKEEEEKAKQMELEETEEIQEKNPELQEVKIRLDKLEETIKEIAVESRKPSGSGTTKNSEKDREGGKVKHGENNMGNASESSKSVEDHLGRQKIELAPVLPKGR
ESESTSQENGQHPNDGGGSSPDAKR
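
The CDS and protein sequences listed above can be cross-checked by translating the CDS. The following is a minimal sketch rather than the database's own procedure: the `translate` helper is hypothetical, it assumes the standard genetic code, and it uses only the last line of the CDS (which starts in frame, since the preceding 660 nucleotides are a multiple of three).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard table applies to this nuclear gene).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate an in-frame nucleotide string codon by codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

# Final 75 nucleotides of the CDS and final 25 residues of the protein,
# copied from the record above.
cds_tail = (
    "GAGAGCGAGTCTACATCACAAGAAAATGGTCAGCATCCAAACGACGGT"
    "GGAGGATCCTCTCCAGATGCCAAGAGA"
)
protein_tail = "ESESTSQENGQHPNDGGGSSPDAKR"

print(translate(cds_tail) == protein_tail)
```

The same function can be applied to the full 735-nucleotide CDS after joining its lines. The listed CDS ends with the codon for the final arginine and carries no stop codon.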