CuGenDBv2

Spg031734 (gene) of Sponge gourd (cylindrica) v1 genome

Gene ID: Spg031734
Organism: Luffa cylindrica (Sponge gourd (cylindrica) v1)
Description: Unknown protein
Genome location: scaffold11:44422035..44424910
RNA-Seq Expression: Spg031734
Synteny: Spg031734
Gene Ontology terms: NA
InterPro domains: NA
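
The genome location above can be split into a scaffold name and a coordinate pair. The following is a minimal sketch, assuming the `scaffold:start..end` string uses 1-based, inclusive coordinates (an assumption, not something this record states); the function names `parse_location` and `span_length` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import Tuple


def parse_location(location: str) -> Tuple[str, int, int]:
    """Split a 'seqid:start..end' string into its three parts."""
    match = re.fullmatch(r"(\S+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(location: str) -> int:
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    _, start, end = parse_location(location)
    return abs(end - start) + 1


if __name__ == "__main__":
    loc = "scaffold11:44422035..44424910"
    print(parse_location(loc))
    print(span_length(loc))  # 2876 under the inclusive-coordinate assumption
```

Under that assumption the gene spans 2876 bp on scaffold11, which is longer than the CDS listed further down, so the genomic span presumably includes non-coding sequence.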


Homology
GenBank top hits (e value, % identity, alignment)
XP_004133783.1 uncharacterized protein LOC101222847 isoform X1 [Cucumis sativus] (e value: 1.0e-85; % identity: 79.65)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHEAS
        DKTRN++ DDINDFGTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS INQLY LA SSDDPASP DGND  VRSLHEAS
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHEAS

Query:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK
        QHMSEHALDVLMPE A+ +HGEVMLLPDM++S DC +TQKVLMQDSAVK EE+Q+ +EVKGQ+ E  Q  EEQK IS AA+SQADL S+DKEPDM HKTK
Subjt:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK

Query:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT
          S++KP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT

XP_008437823.1 PREDICTED: uncharacterized protein LOC103483139 [Cucumis melo] (e value: 1.9e-84; % identity: 79.13)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ
        DKTRN++ DDIND GTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS+INQLY LASSDDPASPEDGND  VRSLHEASQ
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ

Query:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP
        HMSEHALDVLMPE A+ +HGEVMLLPDM QS DC +TQKVLMQDSAVK EE+Q+ +E K Q+ E  Q  EEQK +S AA+SQAD +S+DKEPDM HK K 
Subjt:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP

Query:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT
        SS+VKP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT

XP_022974818.1 uncharacterized protein LOC111473601 isoform X2 [Cucurbita maxima] (e value: 8.6e-85; % identity: 79.74)
Query:  QDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEAS
        +DKT+NMRHD INDF TIATDGWPSSMLGDNDNNLEE+FLKIEAAQS+VHELKNRIDKV+NENPMKFSSINQ+Y+LASSDDPASPEDGNDVFVRSLHEAS
Subjt:  QDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEAS

Query:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKT
        QHMSE A DVLMPENA+ SHGEVMLLPDMIQSADC RST+KVL+QDSAVK EE QI EEV GQ  EQ  KLEEQ I SP     ADLAS  +EPDMQHKT
Subjt:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKT

Query:  KPSSSVKP---KKTRKRGRRKIGSSKQNRKAT
        +  S+ KP   KKTRKRGRRK G  KQ RK T
Subjt:  KPSSSVKP---KKTRKRGRRKIGSSKQNRKAT

XP_031737513.1 uncharacterized protein LOC101222847 isoform X2 [Cucumis sativus] (e value: 2.0e-86; % identity: 79.4)
Query:  YQDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHE
        Y DKTRN++ DDINDFGTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS INQLY LA SSDDPASP DGND  VRSLHE
Subjt:  YQDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHE

Query:  ASQHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHK
        ASQHMSEHALDVLMPE A+ +HGEVMLLPDM++S DC +TQKVLMQDSAVK EE+Q+ +EVKGQ+ E  Q  EEQK IS AA+SQADL S+DKEPDM HK
Subjt:  ASQHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHK

Query:  TKPSSSVKP---KKTRKRGRRKIGSSKQNRKAT
        TK  S++KP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  TKPSSSVKP---KKTRKRGRRKIGSSKQNRKAT

XP_038886802.1 uncharacterized protein LOC120076910 [Benincasa hispida] (e value: 3.5e-86; % identity: 78.7)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ
        DKTRN++ DDINDFGT+ATDGW SSMLGD+DNNL+++FLKIEAAQSKVHELKNRIDKV+NENPMKFS+INQLY LASSDDPASPEDGND  VRSLHEASQ
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ

Query:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP
        H+SEHALDVLMPE A+ +HGEVMLLPDM QSADC +T+KVL QDSAVK EE+Q+ E VKGQ+ E  QKLEEQKIIS AAVSQ+DL S DKEP+  HKTK 
Subjt:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP

Query:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT
         S+ KP   K+TRKRGRRKIGSSK+NRKAT
Subjt:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT

TrEMBL top hits (e value, % identity, alignment)
A0A0A0L704 Uncharacterized protein (e value: 4.9e-86; % identity: 79.65)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHEAS
        DKTRN++ DDINDFGTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS INQLY LA SSDDPASP DGND  VRSLHEAS
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLA-SSDDPASPEDGNDVFVRSLHEAS

Query:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK
        QHMSEHALDVLMPE A+ +HGEVMLLPDM++S DC +TQKVLMQDSAVK EE+Q+ +EVKGQ+ E  Q  EEQK IS AA+SQADL S+DKEPDM HKTK
Subjt:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK

Query:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT
          S++KP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT

A0A1S3AUK2 uncharacterized protein LOC103483139 (e value: 9.2e-85; % identity: 79.13)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ
        DKTRN++ DDIND GTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS+INQLY LASSDDPASPEDGND  VRSLHEASQ
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ

Query:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP
        HMSEHALDVLMPE A+ +HGEVMLLPDM QS DC +TQKVLMQDSAVK EE+Q+ +E K Q+ E  Q  EEQK +S AA+SQAD +S+DKEPDM HK K 
Subjt:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP

Query:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT
        SS+VKP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT

A0A5A7U0V8 Uncharacterized protein (e value: 9.2e-85; % identity: 79.13)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ
        DKTRN++ DDIND GTIATDGW SSMLG+NDNNLE+IFLKIEAAQSKVHELKNRIDKV+NENPMKFS+INQLY LASSDDPASPEDGND  VRSLHEASQ
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ

Query:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP
        HMSEHALDVLMPE A+ +HGEVMLLPDM QS DC +TQKVLMQDSAVK EE+Q+ +E K Q+ E  Q  EEQK +S AA+SQAD +S+DKEPDM HK K 
Subjt:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKP

Query:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT
        SS+VKP   KKTRKRGRRKIGSSK+NRKAT
Subjt:  SSSVKP---KKTRKRGRRKIGSSKQNRKAT

A0A6J1IEX6 uncharacterized protein LOC111473601 isoform X3 (e value: 2.7e-84; % identity: 79.65)
Query:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ
        +KT+NMRHD INDF TIATDGWPSSMLGDNDNNLEE+FLKIEAAQS+VHELKNRIDKV+NENPMKFSSINQ+Y+LASSDDPASPEDGNDVFVRSLHEASQ
Subjt:  DKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQ

Query:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK
        HMSE A DVLMPENA+ SHGEVMLLPDMIQSADC RST+KVL+QDSAVK EE QI EEV GQ  EQ  KLEEQ I SP     ADLAS  +EPDMQHKT+
Subjt:  HMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTK

Query:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT
          S+ KP   KKTRKRGRRK G  KQ RK T
Subjt:  PSSSVKP---KKTRKRGRRKIGSSKQNRKAT

A0A6J1IHF9 uncharacterized protein LOC111473601 isoform X2 (e value: 4.1e-85; % identity: 79.74)
Query:  QDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEAS
        +DKT+NMRHD INDF TIATDGWPSSMLGDNDNNLEE+FLKIEAAQS+VHELKNRIDKV+NENPMKFSSINQ+Y+LASSDDPASPEDGNDVFVRSLHEAS
Subjt:  QDKTRNMRHDDINDFGTIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEAS

Query:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKT
        QHMSE A DVLMPENA+ SHGEVMLLPDMIQSADC RST+KVL+QDSAVK EE QI EEV GQ  EQ  KLEEQ I SP     ADLAS  +EPDMQHKT
Subjt:  QHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADC-RSTQKVLMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKT

Query:  KPSSSVKP---KKTRKRGRRKIGSSKQNRKAT
        +  S+ KP   KKTRKRGRRK G  KQ RK T
Subjt:  KPSSSVKP---KKTRKRGRRKIGSSKQNRKAT

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT3G59670.1 unknown protein (e value: 3.1e-08; % identity: 26.75)
Query:  DNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDP----ASPEDGNDVFVRSLHEASQHMSEHALD--VLMPENAMNSHGEV
        D D+ LEE+  KIE   S+VH LK ++D V+++N  +FSS   L LLA+S  P    ++  +G+ +   +++ ASQHM+++ L   V   E  ++S+G+ 
Subjt:  DNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDP----ASPEDGNDVFVRSLHEASQHMSEHALD--VLMPENAMNSHGEV

Query:  MLLPDMIQS-----ADCRST--------------QKVLMQDSAVKE-----------EEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEP
          +PD+I+S     AD   T                +L+++   +E           +E +  EE +G      Q+ EE +  +     +  L  + +E 
Subjt:  MLLPDMIQS-----ADCRST--------------QKVLMQDSAVKE-----------EEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEP

Query:  DMQHKTKPSSSVKPKKTRKRGRRKIGSS
         +      S  + P+  R RG  +  SS
Subjt:  DMQHKTKPSSSVKPKKTRKRGRRKIGSS

AT4G37440.2 unknown protein (e value: 9.3e-05; % identity: 26.88)
Query:  TIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVF----------------VRSLHEAS
        T  ++  P     + D  LE+I LKIEAA+S+   LK R+DKV++ENP  F   N +  L ++D   S E    +                 V+S   +S
Subjt:  TIATDGWPSSMLGDNDNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVF----------------VRSLHEAS

Query:  QHMS---EHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPE
         H+S   +   D+L+ E               I ++  R  + ++   + VK E+  I E
Subjt:  QHMS---EHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKVLMQDSAVKEEEVQIPE


Sequences
CDS sequence
ATGGGATTAGTGGAAGAAGTTGATGTTCCTAAAGTCTTGCAATCCAGCGAGCCTATGAGTGACTTCGGAGATTGGTGGATGATGTACTGTCGTCTGGGAGACTTGTGGAT
TACTAATCAGCCTCTGAAAGTATGTCTTTTCTTTGAGGAGTTCGACTTAAGTCTAGCTAGACATAGGGGTTGCGGAGAGATGATCAAGGAGTTCCTACCCCACCCTCTCT
CTTGGAGAGAGGAATACCAGGACAAGACAAGGAATATGAGACATGATGACATCAATGACTTCGGGACAATTGCAACTGATGGATGGCCATCTTCTATGTTGGGAGATAAT
GATAATAATTTGGAAGAAATCTTTCTAAAAATTGAAGCTGCTCAGTCAAAAGTTCATGAGTTGAAGAACAGAATTGACAAGGTGATGAATGAAAATCCCATGAAGTTCTC
TTCAATCAATCAGCTATACCTTCTTGCATCAAGTGATGATCCCGCTTCACCTGAAGATGGAAATGATGTGTTCGTTAGGTCTTTGCATGAAGCATCACAACACATGTCTG
AGCATGCATTAGATGTACTTATGCCAGAAAACGCGATGAACAGTCATGGAGAGGTCATGCTACTTCCTGATATGATTCAGAGCGCAGATTGTAGGAGTACTCAGAAAGTT
CTGATGCAAGATTCCGCAGTCAAGGAAGAAGAGGTGCAAATTCCGGAAGAGGTTAAAGGTCAGATTACTGAACAGCCTCAGAAATTGGAGGAGCAGAAAATCATTTCTCC
AGCTGCAGTTTCTCAAGCTGACTTAGCCTCAGAAGACAAGGAGCCTGACATGCAGCACAAAACGAAACCCTCTTCTTCTGTCAAACCTAAGAAAACAAGAAAGCGGGGAA
GACGAAAAATTGGTTCGAGTAAGCAGAATCGGAAAGCAACACGGTAG
mRNA sequence
ATGGGATTAGTGGAAGAAGTTGATGTTCCTAAAGTCTTGCAATCCAGCGAGCCTATGAGTGACTTCGGAGATTGGTGGATGATGTACTGTCGTCTGGGAGACTTGTGGAT
TACTAATCAGCCTCTGAAAGTATGTCTTTTCTTTGAGGAGTTCGACTTAAGTCTAGCTAGACATAGGGGTTGCGGAGAGATGATCAAGGAGTTCCTACCCCACCCTCTCT
CTTGGAGAGAGGAATACCAGGACAAGACAAGGAATATGAGACATGATGACATCAATGACTTCGGGACAATTGCAACTGATGGATGGCCATCTTCTATGTTGGGAGATAAT
GATAATAATTTGGAAGAAATCTTTCTAAAAATTGAAGCTGCTCAGTCAAAAGTTCATGAGTTGAAGAACAGAATTGACAAGGTGATGAATGAAAATCCCATGAAGTTCTC
TTCAATCAATCAGCTATACCTTCTTGCATCAAGTGATGATCCCGCTTCACCTGAAGATGGAAATGATGTGTTCGTTAGGTCTTTGCATGAAGCATCACAACACATGTCTG
AGCATGCATTAGATGTACTTATGCCAGAAAACGCGATGAACAGTCATGGAGAGGTCATGCTACTTCCTGATATGATTCAGAGCGCAGATTGTAGGAGTACTCAGAAAGTT
CTGATGCAAGATTCCGCAGTCAAGGAAGAAGAGGTGCAAATTCCGGAAGAGGTTAAAGGTCAGATTACTGAACAGCCTCAGAAATTGGAGGAGCAGAAAATCATTTCTCC
AGCTGCAGTTTCTCAAGCTGACTTAGCCTCAGAAGACAAGGAGCCTGACATGCAGCACAAAACGAAACCCTCTTCTTCTGTCAAACCTAAGAAAACAAGAAAGCGGGGAA
GACGAAAAATTGGTTCGAGTAAGCAGAATCGGAAAGCAACACGGTAG
Protein sequence
MGLVEEVDVPKVLQSSEPMSDFGDWWMMYCRLGDLWITNQPLKVCLFFEEFDLSLARHRGCGEMIKEFLPHPLSWREEYQDKTRNMRHDDINDFGTIATDGWPSSMLGDN
DNNLEEIFLKIEAAQSKVHELKNRIDKVMNENPMKFSSINQLYLLASSDDPASPEDGNDVFVRSLHEASQHMSEHALDVLMPENAMNSHGEVMLLPDMIQSADCRSTQKV
LMQDSAVKEEEVQIPEEVKGQITEQPQKLEEQKIISPAAVSQADLASEDKEPDMQHKTKPSSSVKPKKTRKRGRRKIGSSKQNRKATR
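
The CDS and protein sequences above can be checked against each other by translating the CDS. The following is a minimal sketch, assuming the standard genetic code applies to this nuclear gene; the `translate` helper and the codon-table construction are illustrative and not part of CuGenDB. Line breaks in the listed CDS fall inside codons, so the sketch strips whitespace before reading triplets.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon.

    Whitespace (including line breaks) is removed first; codons that
    are not in the table (for example, ones containing N) become 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 36 nt and last 12 nt of the CDS listed in this record.
    print(translate("ATGGGATTAGTGGAAGAAGTTGATGTTCCTAAAGTC"))
    print(translate("GCAACACGGTAG"))
```

Pasting the full CDS into `translate` and comparing the result with the protein sequence is a quick consistency check; the two short fragments used here correspond to the start (`MGLVEEVDVPKV`) and the end (`ATR`, followed by the `TAG` stop) of the listed protein.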