CuGenDBv2

Sgr002592 (gene) of Monk fruit (Qingpiguo) v1 genome

Gene ID: Sgr002592
Organism: Siraitia grosvenorii cv. Qingpiguo (Monk fruit (Qingpiguo) v1)
Description: Protein of unknown function (DUF1639)
Genome location: tig00001641:10619..16518
RNA-Seq Expression: Sgr002592
Synteny: Sgr002592
Gene Ontology terms: NA
InterPro domains: IPR012438 - Protein of unknown function DUF1639
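
The genome location above uses a `contig:start..end` notation. As a minimal sketch, assuming the coordinates are 1-based and inclusive (the record itself does not state this), the string can be split into its parts and the locus span computed. The names `Locus` and `parse_location` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A genomic interval on a named contig."""
    contig: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


def parse_location(text: str) -> Locus:
    """Parse a 'contig:start..end' string into a Locus."""
    match = re.fullmatch(r"(\S+):(\d+)\.\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    contig, start, end = match.groups()
    return Locus(contig, int(start), int(end))


locus = parse_location("tig00001641:10619..16518")
print(locus.contig, locus.start, locus.end, locus.length)
```

Under that assumption the locus spans 5900 bp on contig tig00001641.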


Homology
GenBank top hits (e value, % identity, alignment)
XP_004137319.1 uncharacterized protein LOC101214785 [Cucumis sativus] (e value: 2.2e-76; % identity: 76.56)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        M+  P+RS PLHNFSLP LKWGSQRFLKCMKVSSNSN S LDHPS  R+SKSYQFRARP++S+  NF K++S  N +HSKQKP+H   DR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK+LKFSI +EGGEDESAAARPWNLRTRRAACKAP +ER +ELGSSSTK  + KKEK RTAL+VSLSKEEL++DFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

XP_008453422.1 PREDICTED: uncharacterized protein LOC103494136 [Cucumis melo] (e value: 3.7e-76; % identity: 77.03)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        M+  P+RS PLHNFSLP LKWGSQRFLKCMKVSSNSN S LDHPS  R+SKSYQFRARP+NS+  NF K++S  N +HSKQKP H   DR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK+LKFSI +EGGEDESAAARPWNLRTRRAACKAP +ER +ELGSSSTK  + KK+K RTAL VSLSKEEL+EDFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

XP_022134637.1 uncharacterized protein LOC111006857 [Momordica charantia] (e value: 1.9e-80; % identity: 83.41)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVS--SNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE
        MA APERS PLHNFSLPYLKWGSQRFLKCMKVS  SNSNSSAL HPSAQRESKSYQFRAR MNSR ANF+K     +PSHSKQKP  A       SSSIE
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVS--SNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE

Query:  IMRERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELG-SSSTKVITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPK
         MRE+IMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAP EER +ELG SSSTK + +KEK RTALSVSLSKEEL+EDFAALVGRLPRRPK
Subjt:  IMRERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELG-SSSTKVITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPK

Query:  KRPRVVQKQLD
        KRPRVVQKQLD
Subjt:  KRPRVVQKQLD

XP_022921716.1 uncharacterized protein LOC111429881 [Cucurbita moschata] (e value: 1.7e-68; % identity: 71.7)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        MA AP+RSKPLHNFSLPYLKWGSQRFLKCMK+SSNSN      P+A R+S+SY+ R RP+NS+GAN  + S       S  KPS  NND   GSSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTKVITKKEKE----RTALSVSLSKEELDEDFAALVGRLPRRP
        RE+IMLDIREESK+LKFSI +EGGE ESAAARPWNLRTRRAACKAP +ER  E GSSS K +TKKEKE    R+ L VSLSKEEL+EDFA LVG+LPRRP
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTKVITKKEKE----RTALSVSLSKEELDEDFAALVGRLPRRP

Query:  KKRPRVVQKQLD
        KKRPR VQKQLD
Subjt:  KKRPRVVQKQLD

XP_038898793.1 uncharacterized protein LOC120086296 [Benincasa hispida] (e value: 5.5e-80; % identity: 79.9)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        MA  P+RS PLHNFSLPYLKWGSQRFLKCMKVSSNS+ S LDHPS QR+SKSYQFRARP+NS+  NF KLSS  NP+HSKQKP+   NDR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK++KFSI +EGGEDESAAARPWNLRTRRAACKAPQEE+  ELGSSSTK ++ KKEK RTAL VSLSKEEL+EDFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

TrEMBL top hits (e value, % identity, alignment)
A0A0A0LS42 Uncharacterized protein (e value: 1.0e-76; % identity: 76.56)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        M+  P+RS PLHNFSLP LKWGSQRFLKCMKVSSNSN S LDHPS  R+SKSYQFRARP++S+  NF K++S  N +HSKQKP+H   DR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK+LKFSI +EGGEDESAAARPWNLRTRRAACKAP +ER +ELGSSSTK  + KKEK RTAL+VSLSKEEL++DFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

A0A1S3BXD4 uncharacterized protein LOC103494136 (e value: 1.8e-76; % identity: 77.03)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        M+  P+RS PLHNFSLP LKWGSQRFLKCMKVSSNSN S LDHPS  R+SKSYQFRARP+NS+  NF K++S  N +HSKQKP H   DR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK+LKFSI +EGGEDESAAARPWNLRTRRAACKAP +ER +ELGSSSTK  + KK+K RTAL VSLSKEEL+EDFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

A0A5A7UX47 DUF1639 domain-containing protein (e value: 1.8e-76; % identity: 77.03)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        M+  P+RS PLHNFSLP LKWGSQRFLKCMKVSSNSN S LDHPS  R+SKSYQFRARP+NS+  NF K++S  N +HSKQKP H   DR   SSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR
        RE+IMLDIREESK+LKFSI +EGGEDESAAARPWNLRTRRAACKAP +ER +ELGSSSTK  + KK+K RTAL VSLSKEEL+EDFA LVGRLPRRPKKR
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTK-VITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKR

Query:  PRVVQKQLD
        PR VQKQ+D
Subjt:  PRVVQKQLD

A0A6J1C056 uncharacterized protein LOC111006857 (e value: 9.1e-81; % identity: 83.41)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVS--SNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE
        MA APERS PLHNFSLPYLKWGSQRFLKCMKVS  SNSNSSAL HPSAQRESKSYQFRAR MNSR ANF+K     +PSHSKQKP  A       SSSIE
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVS--SNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE

Query:  IMRERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELG-SSSTKVITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPK
         MRE+IMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAP EER +ELG SSSTK + +KEK RTALSVSLSKEEL+EDFAALVGRLPRRPK
Subjt:  IMRERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELG-SSSTKVITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPK

Query:  KRPRVVQKQLD
        KRPRVVQKQLD
Subjt:  KRPRVVQKQLD

A0A6J1E256 uncharacterized protein LOC111429881 (e value: 8.0e-69; % identity: 71.7)
Query:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM
        MA AP+RSKPLHNFSLPYLKWGSQRFLKCMK+SSNSN      P+A R+S+SY+ R RP+NS+GAN  + S       S  KPS  NND   GSSSIEIM
Subjt:  MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIM

Query:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTKVITKKEKE----RTALSVSLSKEELDEDFAALVGRLPRRP
        RE+IMLDIREESK+LKFSI +EGGE ESAAARPWNLRTRRAACKAP +ER  E GSSS K +TKKEKE    R+ L VSLSKEEL+EDFA LVG+LPRRP
Subjt:  RERIMLDIREESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTKVITKKEKE----RTALSVSLSKEELDEDFAALVGRLPRRP

Query:  KKRPRVVQKQLD
        KKRPR VQKQLD
Subjt:  KKRPRVVQKQLD

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT1G25370.1 Protein of unknown function (DUF1639) (e value: 6.4e-18; % identity: 33.47)
Query:  TAPE-RSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSS------ALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSS
        T PE RSK LHNF LP L WG+QR LKC K+ S SN++        DH   +R S   +F   P+ S    F                 H    ++    
Subjt:  TAPE-RSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSS------ALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSS

Query:  SIEIMRERIMLDIREESKKLKFSIPEEGGEDES--------------------AAARPWNLRTRR-AACKAPQEERKVELGSSSTKVITKK---------
         IE  R ++M D++ E+ K+  S+  +G  +E                        +PWNLR RR AACK P+    +  G    + + K          
Subjt:  SIEIMRERIMLDIREESKKLKFSIPEEGGEDES--------------------AAARPWNLRTRR-AACKAPQEERKVELGSSSTKVITKK---------

Query:  ----EKERTALSVSLSKEELDEDFAALVG-RLPRRPKKRPRVVQKQLD
            EK+R   S+ LSK+E++EDF  +VG R PRRPKKR + VQK+LD
Subjt:  ----EKERTALSVSLSKEELDEDFAALVG-RLPRRPKKRPRVVQKQLD

AT1G48770.1 Protein of unknown function (DUF1639) (e value: 5.8e-19; % identity: 31.94)
Query:  ERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIMRERIM
        ERSK LHNFSLP L+WG QRFL+C+                                         +L +P  S   P HA  +R+   + + + R    
Subjt:  ERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIMRERIM

Query:  LDIREESKKLKFSIPEEGGEDES---AAARPWNLRTRRAACKAPQEERKVELGSSSTKVIT-------KKEKERTALSVSLSKEELDEDFAALVGRL-PR
                         GG       AAA+PWNLR RRAAC  P EE  +E+G +  + I         K+ E++  S++LS++E+++DF+ + G+  P+
Subjt:  LDIREESKKLKFSIPEEGGEDES---AAARPWNLRTRRAACKAPQEERKVELGSSSTKVIT-------KKEKERTALSVSLSKEELDEDFAALVGRL-PR

Query:  RPKKRPRVVQKQLDFV
        RPKKRPR+VQK+L+ +
Subjt:  RPKKRPRVVQKQLDFV

AT1G68340.1 Protein of unknown function (DUF1639) (e value: 3.2e-17; % identity: 34.45)
Query:  TAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSA--QR-ESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE-
        T  ERSK L NFSLP L WG+QR L+C K   +         S   QR   +S  F +   N R     K+ S       + +     + R       E 
Subjt:  TAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSA--QR-ESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIE-

Query:  IMRERIMLDIREESKKLKFSIPEEGGEDESAA---ARPWNLRTRRAACKA------PQEERKVELGSSSTKVITKKEKERTALSVSLSKEELDEDFAALV
        I R++++L   EE K+++    +   E   AA    RPWNLR RRAACKA        + +  E   + + +  +  K+R+ L  +LSK+E++ED+  ++
Subjt:  IMRERIMLDIREESKKLKFSIPEEGGEDESAA---ARPWNLRTRRAACKA------PQEERKVELGSSSTKVITKKEKERTALSVSLSKEELDEDFAALV

Query:  G-RLPRRPKKRPRVVQKQLD---FVSNMSEIKSSLWEV
        G + PRRPKKR R VQKQ+D   F S ++EI   L+ V
Subjt:  G-RLPRRPKKRPRVVQKQLD---FVSNMSEIKSSLWEV

AT3G18295.1 Protein of unknown function (DUF1639) (e value: 1.1e-25; % identity: 35.84)
Query:  PERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIMRERI
        PERSK LHNF+LPYL+WG QRFL+C+K+                          P ++R  +F   SS  +P H       ++N   +G          +
Subjt:  PERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIMRERI

Query:  MLDIREESKKLKFSIPEEGGE----DESAAARPWNLRTRRAACKAPQEERKVELGSSSTKV-----------------ITKKEKERTALSVSLSKEELDE
         LD+  ++ + K S+   GG+    D  AAARPWNLRTRRAAC  P  +    +  SS+ +                   + + E+   SVSL +EE+++
Subjt:  MLDIREESKKLKFSIPEEGGE----DESAAARPWNLRTRRAACKAPQEERKVELGSSSTKV-----------------ITKKEKERTALSVSLSKEELDE

Query:  DFAALVG-RLPRRPKKRPRVVQKQLD
        DF+AL+G R PRRPKKRPR+VQKQ++
Subjt:  DFAALVG-RLPRRPKKRPRVVQKQLD

AT3G60410.1 Protein of unknown function (DUF1639) (e value: 3.0e-07; % identity: 27.92)
Query:  ATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQR---------ESKSYQFRARPMNSRG---------ANFAKLSSLTNPSHSK--Q
        +++P +S PLHNF L  L+W +       ++   S+ S L   +  +         E+    F  RP   +G         A+ +   S T    SK   
Subjt:  ATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQR---------ESKSYQFRARPMNSRG---------ANFAKLSSLTNPSHSK--Q

Query:  KPSHANNDRTAGS--------SSIEIMRERIMLDIREESKKLKFSIPEEGGED-ESAAARPWNLRTRR----------------AACKAPQEERKVELGS
        +    NN+ TA S        +S+++  +     I  E ++    I + GG++ +    + WNLR RR                 +C     E K  LG+
Subjt:  KPSHANNDRTAGS--------SSIEIMRERIMLDIREESKKLKFSIPEEGGED-ESAAARPWNLRTRR----------------AACKAPQEERKVELGS

Query:  SSTKVI------------TKKEKERTALSVSLSKEELDEDFAALVGRLP-RRPKKRPRVVQKQLD
          T+ I            T++++++  LS+SLSK E+DED  AL G  P RRPKKR + VQKQLD
Subjt:  SSTKVI------------TKKEKERTALSVSLSKEELDEDFAALVGRLP-RRPKKRPRVVQKQLD
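
The alignments above follow the usual BLAST pairwise layout: on the line between `Query:` and `Subjt:`, a letter marks an identical residue, `+` marks a conservative substitution, and a space marks a mismatch or gap. As a rough sketch, the counts behind a percent-identity figure can be tallied from one block of that middle line. The function name `midline_stats` is hypothetical, and the result for a single block is not expected to reproduce the whole-alignment percentages reported in the hit lines.

```python
def midline_stats(query: str, midline: str) -> tuple[int, int, int]:
    """Return (identical, positives, columns) for one alignment block.

    `query` is the aligned query segment (gaps as '-'), and `midline` is
    the match line printed beneath it. The midline may have lost trailing
    spaces, so the column count is taken from the query segment.
    """
    columns = len(query)
    midline = midline.ljust(columns)[:columns]
    # Letters on the midline are identities; '+' are conservative substitutions.
    identical = sum(ch.isalpha() for ch in midline)
    positives = identical + midline.count("+")
    return identical, positives, columns


# Final block of the first GenBank hit (XP_004137319.1) as shown above.
identical, positives, columns = midline_stats("PRVVQKQLD", "PR VQKQ+D")
print(f"{identical}/{columns} identical, {positives}/{columns} positives")
```

To approximate a whole-alignment figure, the per-block counts would need to be summed over every block of a hit before dividing.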


Sequences
CDS sequence
ATGGCTACGGCGCCGGAAAGATCAAAGCCACTGCACAACTTCTCCTTGCCGTATCTCAAATGGGGTTCACAAAGATTCCTCAAGTGTATGAAAGTTTCTTCCAACTCCAA
CTCGTCTGCGCTTGATCACCCATCTGCTCAACGTGAATCGAAATCCTATCAATTCCGGGCTAGACCCATGAATTCTCGGGGCGCGAACTTCGCCAAGCTTTCTTCTCTCA
CGAACCCGAGTCATTCCAAACAGAAACCAAGTCATGCGAACAATGATCGCACGGCCGGCAGCAGTTCCATCGAAATCATGCGAGAAAGGATCATGCTCGATATCAGGGAG
GAATCGAAGAAACTCAAATTTTCAATTCCTGAAGAGGGTGGCGAGGACGAGTCGGCTGCGGCGAGGCCGTGGAATTTGAGGACGCGCAGAGCAGCATGTAAGGCTCCTCA
GGAAGAGAGGAAGGTGGAATTGGGTTCATCGTCGACGAAGGTTATAACGAAGAAGGAGAAGGAGCGAACTGCTTTATCTGTCTCGCTGTCGAAGGAGGAGCTGGACGAGG
ACTTCGCGGCGCTGGTCGGTAGGCTACCGAGGAGGCCAAAGAAGAGGCCTAGGGTTGTACAAAAGCAATTGGACTTTGTCAGCAATATGAGTGAGATCAAGTCTTCGCTA
TGGGAAGTTGTTAACTCAGTTCACCAGTATGATCTTAACAGCTTCTACTTCAACATCGGAGTTGTCAGTTTCAACTTGTCGGTCACCACAATTACTAGATGCAGGTCAAC
TCTACAAGCAGAGCATGTTATAAGGAGATGGAGGAGTTTTCAGGTGCATGGATGCTCATCTTTCTTAAGCAAGGCTTTAAAGGTTGGGCTGAAATACATTGAAGTAGGGA
TTCAGAGCACTGCTGAGTCTGGTTCTAAAACGCGGATGGATGACTTGCATGGGAGCAAGGAGGTGGTACATTCAAAATGCGGTTTATATAGAGATTGTTGCCATAAAACC
CTCCGAGATTCTTCTTCAGATGGGGCTTTTGGAGTTTTGAATAAGAATTTGCAGCAGAATTTGAAAATGGGACTCTGGCTTTTCACATCCTTCAATGGCGTTCGTTTTGG
GTTTGGCACTGAACCAGTTGAGTTACTTCGTGATAGAAGTGGTAGTGAGCAAAACGAGCTCTTTCTACTGTCAAAAGTAACACTGTTACTTCTTCGGAACCCCCAGAAAG
ATTTGGACCGACTTTTTTTCTCCAATTCAGAATTCAAAATGTTAGCTGATTTTTCTGTGTTCTCTTTCTTTGAATTCTCGGTCGTTGCAGCAGGAGGAAGAGGAGGAAGT
GAGGAAGTGGGAACGATGACGCCATTGAAGAATAGCTCGTCTGCAGAAGAAGACTCGTGCTCAAAGCTACCACTAACACAAAACTCAAATTCAGAATTAGAATCCAACAG
CGATAAGTCTCTTGGAGATTCATGTGGCTTGATGGGCAGCATCTCGGACGGGACAAGGTCATTTGACAGTGAAATTCCAGGCAAACATTTTATCGAACGCATGGTGGGCA
CTTCACTCCCAGAAACTTTCAGTCGGGAAAAGGCGCAAAACTTGAAGACATTTCAGGAAACGAAAATGGCAGGGAAAGTACGCTTAAACACGAAGCGAAAGAAACACAAA
GGAAATGGTCGAAAATTGAGAAACAAGCAACTGGGTTCTGAAGATCAAAGGGAGACAGAAAATCAAAAACCAAAGCGACAGCCTCTGGCGTCCCGAAAATCGAGGATGCC
GGATCGGAGAGATAGAAAATGA
mRNA sequence
ATGGCTACGGCGCCGGAAAGATCAAAGCCACTGCACAACTTCTCCTTGCCGTATCTCAAATGGGGTTCACAAAGATTCCTCAAGTGTATGAAAGTTTCTTCCAACTCCAA
CTCGTCTGCGCTTGATCACCCATCTGCTCAACGTGAATCGAAATCCTATCAATTCCGGGCTAGACCCATGAATTCTCGGGGCGCGAACTTCGCCAAGCTTTCTTCTCTCA
CGAACCCGAGTCATTCCAAACAGAAACCAAGTCATGCGAACAATGATCGCACGGCCGGCAGCAGTTCCATCGAAATCATGCGAGAAAGGATCATGCTCGATATCAGGGAG
GAATCGAAGAAACTCAAATTTTCAATTCCTGAAGAGGGTGGCGAGGACGAGTCGGCTGCGGCGAGGCCGTGGAATTTGAGGACGCGCAGAGCAGCATGTAAGGCTCCTCA
GGAAGAGAGGAAGGTGGAATTGGGTTCATCGTCGACGAAGGTTATAACGAAGAAGGAGAAGGAGCGAACTGCTTTATCTGTCTCGCTGTCGAAGGAGGAGCTGGACGAGG
ACTTCGCGGCGCTGGTCGGTAGGCTACCGAGGAGGCCAAAGAAGAGGCCTAGGGTTGTACAAAAGCAATTGGACTTTGTCAGCAATATGAGTGAGATCAAGTCTTCGCTA
TGGGAAGTTGTTAACTCAGTTCACCAGTATGATCTTAACAGCTTCTACTTCAACATCGGAGTTGTCAGTTTCAACTTGTCGGTCACCACAATTACTAGATGCAGGTCAAC
TCTACAAGCAGAGCATGTTATAAGGAGATGGAGGAGTTTTCAGGTGCATGGATGCTCATCTTTCTTAAGCAAGGCTTTAAAGGTTGGGCTGAAATACATTGAAGTAGGGA
TTCAGAGCACTGCTGAGTCTGGTTCTAAAACGCGGATGGATGACTTGCATGGGAGCAAGGAGGTGGTACATTCAAAATGCGGTTTATATAGAGATTGTTGCCATAAAACC
CTCCGAGATTCTTCTTCAGATGGGGCTTTTGGAGTTTTGAATAAGAATTTGCAGCAGAATTTGAAAATGGGACTCTGGCTTTTCACATCCTTCAATGGCGTTCGTTTTGG
GTTTGGCACTGAACCAGTTGAGTTACTTCGTGATAGAAGTGGTAGTGAGCAAAACGAGCTCTTTCTACTGTCAAAAGTAACACTGTTACTTCTTCGGAACCCCCAGAAAG
ATTTGGACCGACTTTTTTTCTCCAATTCAGAATTCAAAATGTTAGCTGATTTTTCTGTGTTCTCTTTCTTTGAATTCTCGGTCGTTGCAGCAGGAGGAAGAGGAGGAAGT
GAGGAAGTGGGAACGATGACGCCATTGAAGAATAGCTCGTCTGCAGAAGAAGACTCGTGCTCAAAGCTACCACTAACACAAAACTCAAATTCAGAATTAGAATCCAACAG
CGATAAGTCTCTTGGAGATTCATGTGGCTTGATGGGCAGCATCTCGGACGGGACAAGGTCATTTGACAGTGAAATTCCAGGCAAACATTTTATCGAACGCATGGTGGGCA
CTTCACTCCCAGAAACTTTCAGTCGGGAAAAGGCGCAAAACTTGAAGACATTTCAGGAAACGAAAATGGCAGGGAAAGTACGCTTAAACACGAAGCGAAAGAAACACAAA
GGAAATGGTCGAAAATTGAGAAACAAGCAACTGGGTTCTGAAGATCAAAGGGAGACAGAAAATCAAAAACCAAAGCGACAGCCTCTGGCGTCCCGAAAATCGAGGATGCC
GGATCGGAGAGATAGAAAATGA
Protein sequence
MATAPERSKPLHNFSLPYLKWGSQRFLKCMKVSSNSNSSALDHPSAQRESKSYQFRARPMNSRGANFAKLSSLTNPSHSKQKPSHANNDRTAGSSSIEIMRERIMLDIRE
ESKKLKFSIPEEGGEDESAAARPWNLRTRRAACKAPQEERKVELGSSSTKVITKKEKERTALSVSLSKEELDEDFAALVGRLPRRPKKRPRVVQKQLDFVSNMSEIKSSL
WEVVNSVHQYDLNSFYFNIGVVSFNLSVTTITRCRSTLQAEHVIRRWRSFQVHGCSSFLSKALKVGLKYIEVGIQSTAESGSKTRMDDLHGSKEVVHSKCGLYRDCCHKT
LRDSSSDGAFGVLNKNLQQNLKMGLWLFTSFNGVRFGFGTEPVELLRDRSGSEQNELFLLSKVTLLLLRNPQKDLDRLFFSNSEFKMLADFSVFSFFEFSVVAAGGRGGS
EEVGTMTPLKNSSSAEEDSCSKLPLTQNSNSELESNSDKSLGDSCGLMGSISDGTRSFDSEIPGKHFIERMVGTSLPETFSREKAQNLKTFQETKMAGKVRLNTKRKKHK
GNGRKLRNKQLGSEDQRETENQKPKRQPLASRKSRMPDRRDRK
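
The CDS and protein sequences above can be cross-checked by translating the CDS. The sketch below assumes the standard nuclear genetic code (NCBI translation table 1), which the record does not state explicitly. It uses only the first 39 nucleotides of the CDS for brevity, and `translate` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Whitespace and line breaks are removed first; a trailing partial
    codon is ignored; unknown codons are rendered as 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 39 nt of the CDS listed above.
cds_start = "ATGGCTACGGCGCCGGAAAGATCAAAGCCACTGCACAAC"
print(translate(cds_start))
```

Pasting the full CDS, line breaks included, into `translate` and comparing the result with the listed protein sequence is a quick way to confirm that the two entries agree.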