CuGenDBv2

HG10004675 (gene) of Bottle gourd (Hangzhou Gourd) v1 genome

Gene ID: HG10004675
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: DUF4050 domain-containing protein
Genome location: Chr08:19434435..19436988
RNA-Seq Expression: HG10004675
Synteny: HG10004675
Gene Ontology terms: GO:0016020 - membrane (cellular component)
InterPro domains: IPR025124 - Domain of unknown function DUF4050


Homology
GenBank top hits (e value, % identity, alignment)
KAA0064925.1 uncharacterized protein E6C27_scaffold82G002430 [Cucumis melo var. makuwa] (e value: 2.0e-95; % identity: 88.78)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCC+ASALGNAMD PSKGLRVKD+E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DS SNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYDSLLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

XP_004138726.1 uncharacterized protein LOC101216869 [Cucumis sativus] (e value: 5.8e-95; % identity: 87.76)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCCTASALGNAMD PSKGLRVK++E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ S Q+ DSHSNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYD+LLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

XP_008445211.1 PREDICTED: uncharacterized protein LOC103488310 isoform X1 [Cucumis melo] (e value: 2.0e-95; % identity: 88.78)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCC+ASALGNAMD PSKGLRVKD+E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DS SNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYDSLLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

XP_022951409.1 uncharacterized protein LOC111454240 isoform X1 [Cucurbita moschata] (e value: 5.1e-91; % identity: 85.2)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLLSRLEGCSS KPCCSFLQFSG+YLRALI+L+VDN+KLLFHRRSC G CT  ALG+AMD PS GLRV+DQE KKQCLPEN  SSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DSH+N+GSST+FVNSGLLLWNETRKQW GNK S+SQK+V+EPKISWNATYDSLLTTNKPFPE +PLAEMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

XP_038885342.1 uncharacterized protein LOC120075759 isoform X1 [Benincasa hispida] (e value: 1.3e-99; % identity: 91.33)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL RLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASAL NAMD PSKGLRVKDQE KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SA S DSHSNIGSSTDFVNSGLLLWNETRKQW GNKMS+ QKQVQEPKISW+ATYDSLL TNKPFPEPVPL EMI+FLVDVWEQ+GLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

TrEMBL top hits (e value, % identity, alignment)
A0A0A0LPL3 Uncharacterized protein (e value: 2.8e-95; % identity: 87.76)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCCTASALGNAMD PSKGLRVK++E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ S Q+ DSHSNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYD+LLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

A0A1S3BC47 uncharacterized protein LOC103488310 isoform X1 (e value: 9.7e-96; % identity: 88.78)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCC+ASALGNAMD PSKGLRVKD+E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DS SNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYDSLLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

A0A5A7VGA9 Uncharacterized protein (e value: 9.7e-96; % identity: 88.78)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLL+RLEGCSSK PCCSFLQFSGEY+RALILLMVD IKLLFH+R   GCC+ASALGNAMD PSKGLRVKD+E KKQCLPEN PSSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DS SNIGSSTDFVNSGLLLWNETRKQW GNKMS SQKQVQEPKISWNATYDSLLTTNKPFPE +PL EMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

A0A6J1GIP5 uncharacterized protein LOC111454240 isoform X1 (e value: 2.5e-91; % identity: 85.2)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLLSRLEGCSS KPCCSFLQFSG+YLRALI+L+VDN+KLLFHRRSC G CT  ALG+AMD PS GLRV+DQE KKQCLPEN  SSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DSH+N+GSST+FVNSGLLLWNETRKQW GNK S+SQK+V+EPKISWNATYDSLLTTNKPFPE +PLAEMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

A0A6J1KQM2 uncharacterized protein LOC111496323 isoform X1 (e value: 1.2e-90; % identity: 84.69)
Query:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV
        MYSRCCLLSRLEGCSS KPCCSFLQFSG+YLRALI+L+VDN+KLLFHRRSC G CT  ALG+AMD PS GLRV DQE KKQCLP+N  SSSTCEMDNSTV
Subjt:  MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTV

Query:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        WSQRS+ SAQS DSH+N+GSST+FVNSGLLLWNETRKQW GNK S+SQK+V+EPKISWNATYDSLLTTNKPFPE +PLAEMIEFLVDVWEQEGLYD
Subjt:  WSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

SwissProt top hits (e value, % identity, alignment)
No hits found
Arabidopsis top hits (e value, % identity, alignment)
AT1G15350.1 unknown protein (e value: 9.4e-27; % identity: 44.81)
Query:  CHGCCT--ASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSA----QSQDSHSNIGSSTDFVNSGLLLWNETRKQWAG-NK
        C GC     S   +  D PS  +    +  KK  + E+  S+ST +MDN T  SQ S+ S+     SQ +  N  +  ++VN GLLLWN+TR++W G +K
Subjt:  CHGCCT--ASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSA----QSQDSHSNIGSSTDFVNSGLLLWNETRKQWAG-NK

Query:  MSDSQKQVQEPKISWN-ATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
         ++     Q  K++WN ATYDSLL +NK FP+P+PL EM++FLVD+WEQEGLYD
Subjt:  MSDSQKQVQEPKISWN-ATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

AT1G15350.2 unknown protein (e value: 9.4e-27; % identity: 44.81)
Query:  CHGCCT--ASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSA----QSQDSHSNIGSSTDFVNSGLLLWNETRKQWAG-NK
        C GC     S   +  D PS  +    +  KK  + E+  S+ST +MDN T  SQ S+ S+     SQ +  N  +  ++VN GLLLWN+TR++W G +K
Subjt:  CHGCCT--ASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSA----QSQDSHSNIGSSTDFVNSGLLLWNETRKQWAG-NK

Query:  MSDSQKQVQEPKISWN-ATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
         ++     Q  K++WN ATYDSLL +NK FP+P+PL EM++FLVD+WEQEGLYD
Subjt:  MSDSQKQVQEPKISWN-ATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

AT4G32342.1 unknown protein (e value: 5.5e-35; % identity: 52.53)
Query:  NIKLLFHRRSCHGCCTAS-ALGNAMDEPSKGLRVKDQEVKK-QCLPENCPSSSTCEMD-NSTVWSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRK
        N K L +  +C GCC     L   +DEPSKGL+++ + VKK     ++  S+STC+MD N T+ SQ S P    Q S SN   ST+FVN GL+LWN TR+
Subjt:  NIKLLFHRRSCHGCCTAS-ALGNAMDEPSKGLRVKDQEVKK-QCLPENCPSSSTCEMD-NSTVWSQRSIPSAQSQDSHSNIGSSTDFVNSGLLLWNETRK

Query:  QWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLY
        QW    ++  Q  V EP ISWN+TYDSLL+TNK FP+P+PL EM+ FLVDVWE+EGLY
Subjt:  QWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLY

AT5G25360.1 unknown protein (e value: 8.8e-41; % identity: 56.38)
Query:  CHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSAQSQDSHSNIGSS---TDFVNSGLLLWNETRKQWAGNKMSDS
        C GCC    L  A+DEPSKGLR++ + VKK  + E+  S+STCEMDNST+ SQRS+ S    ++ S   S+   T+FVN GL LWN+TR+QW  N  S  
Subjt:  CHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSAQSQDSHSNIGSS---TDFVNSGLLLWNETRKQWAGNKMSDS

Query:  QKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        + +V+EP ISWNATY+SLL  NK F  P+PL EM++FLVDVWEQEGLYD
Subjt:  QKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD

AT5G25360.2 unknown protein (e value: 8.8e-41; % identity: 56.38)
Query:  CHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSAQSQDSHSNIGSS---TDFVNSGLLLWNETRKQWAGNKMSDS
        C GCC    L  A+DEPSKGLR++ + VKK  + E+  S+STCEMDNST+ SQRS+ S    ++ S   S+   T+FVN GL LWN+TR+QW  N  S  
Subjt:  CHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSAQSQDSHSNIGSS---TDFVNSGLLLWNETRKQWAGNKMSDS

Query:  QKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
        + +V+EP ISWNATY+SLL  NK F  P+PL EM++FLVDVWEQEGLYD
Subjt:  QKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
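
The % identity values in the hit tables can be approximately rechecked from the alignment rows. The sketch below assumes the usual BLAST-style convention, which this page does not state explicitly: in the row printed between Query and Subjt, a letter marks an identical residue, "+" a conservative substitution, and a space a mismatch or gap, and % identity is identical columns divided by alignment length. Under that assumption, 88.78 for the first GenBank hit would be consistent with 174 identical columns out of 196. The function name `alignment_stats` is hypothetical and not part of CuGenDB.

```python
def alignment_stats(query: str, midline: str) -> dict:
    """Summarize one BLAST-style alignment from its Query row and midline.

    query   -- the Query rows of the alignment, concatenated (may contain '-')
    midline -- the rows printed between Query and Subjt, concatenated, with
               the left-hand indentation removed so columns line up with query
    """
    if not query:
        raise ValueError("query row is empty")
    # Trailing spaces of the midline are easily lost in copied text; restore them.
    midline = midline.ljust(len(query))
    if len(midline) != len(query):
        raise ValueError("midline is longer than the query row")

    identical = sum(1 for symbol in midline if symbol.isalpha())
    positives = identical + midline.count("+")
    length = len(query)
    return {
        "identical": identical,
        "positives": positives,
        "length": length,
        "pct_identity": round(100.0 * identical / length, 2),
    }


if __name__ == "__main__":
    # First 20 columns of the KAA0064925.1 alignment from this record.
    print(alignment_stats("MYSRCCLLSRLEGCSSKKPC", "MYSRCCLL+RLEGCSSK PC"))
```

To check a full hit, concatenate all of its Query rows into one string and all of its midline rows into another before calling the function.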


Sequences
CDS sequence
ATGTATTCTAGGTGTTGTCTCCTCAGCCGCTTAGAGGGTTGCTCTAGCAAGAAACCATGTTGTTCATTCTTACAGTTTTCTGGAGAATATCTGCGCGCTCTTATTCTTTT
GATGGTGGATAATATCAAGCTTCTTTTCCATAGAAGAAGCTGTCATGGATGCTGCACTGCATCTGCACTAGGTAATGCAATGGACGAGCCGTCTAAAGGTCTGAGAGTTA
AAGACCAAGAAGTAAAGAAACAATGCTTACCTGAAAATTGCCCGAGCTCTAGCACATGTGAAATGGACAACAGTACAGTTTGGTCCCAGAGAAGCATTCCATCAGCCCAG
TCACAAGATTCTCACAGTAATATTGGGAGCAGTACAGACTTTGTAAACTCTGGACTACTTCTTTGGAATGAGACCAGGAAACAATGGGCTGGAAATAAAATGTCCGACAG
CCAAAAGCAAGTTCAAGAACCCAAAATAAGCTGGAATGCTACTTATGACAGCTTATTAACAACGAACAAACCGTTCCCCGAGCCCGTACCTCTTGCTGAGATGATAGAGT
TTCTTGTTGATGTCTGGGAGCAGGAGGGTCTATATGACTGA
mRNA sequence
ATGTATTCTAGGTGTTGTCTCCTCAGCCGCTTAGAGGGTTGCTCTAGCAAGAAACCATGTTGTTCATTCTTACAGTTTTCTGGAGAATATCTGCGCGCTCTTATTCTTTT
GATGGTGGATAATATCAAGCTTCTTTTCCATAGAAGAAGCTGTCATGGATGCTGCACTGCATCTGCACTAGGTAATGCAATGGACGAGCCGTCTAAAGGTCTGAGAGTTA
AAGACCAAGAAGTAAAGAAACAATGCTTACCTGAAAATTGCCCGAGCTCTAGCACATGTGAAATGGACAACAGTACAGTTTGGTCCCAGAGAAGCATTCCATCAGCCCAG
TCACAAGATTCTCACAGTAATATTGGGAGCAGTACAGACTTTGTAAACTCTGGACTACTTCTTTGGAATGAGACCAGGAAACAATGGGCTGGAAATAAAATGTCCGACAG
CCAAAAGCAAGTTCAAGAACCCAAAATAAGCTGGAATGCTACTTATGACAGCTTATTAACAACGAACAAACCGTTCCCCGAGCCCGTACCTCTTGCTGAGATGATAGAGT
TTCTTGTTGATGTCTGGGAGCAGGAGGGTCTATATGACTGA
Protein sequence
MYSRCCLLSRLEGCSSKKPCCSFLQFSGEYLRALILLMVDNIKLLFHRRSCHGCCTASALGNAMDEPSKGLRVKDQEVKKQCLPENCPSSSTCEMDNSTVWSQRSIPSAQ
SQDSHSNIGSSTDFVNSGLLLWNETRKQWAGNKMSDSQKQVQEPKISWNATYDSLLTTNKPFPEPVPLAEMIEFLVDVWEQEGLYD
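
The CDS and protein sequences listed for HG10004675 can be cross-checked by translating the CDS. The following is a minimal sketch, assuming the standard nuclear genetic code (NCBI translation table 1) and a CDS that starts in frame; the helper name `translate` is hypothetical and the code is not taken from CuGenDB.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str, to_stop: bool = True) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids.

    Whitespace and line breaks are ignored, so the wrapped CDS lines of the
    record can be pasted as they are. Codons containing anything other than
    A, C, G, T become 'X'. With to_stop=True, translation ends at the first
    stop codon and the stop itself is not included.
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for pos in range(0, len(seq), 3):
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First nine codons and last six codons of the CDS given in this record.
    print(translate("ATGTATTCTAGGTGTTGTCTCCTCAGC"))
    print(translate("GAGGGTCTATATGACTGA"))
```

Passing the full 591-nucleotide CDS to `translate` and comparing the result with the protein sequence is a quick way to confirm that both were copied intact.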