CuGenDBv2

Lsi04G013690 (gene) of Bottle gourd (USVL1VR-Ls) v1 genome

Gene ID: Lsi04G013690
Organism: Lagenaria siceraria USVL1VR-Ls (Bottle gourd (USVL1VR-Ls) v1)
Description: Hydroxyproline-rich glycoprotein family protein
Genome location: chr04:21484509..21485654
RNA-Seq Expression: Lsi04G013690
Synteny: Lsi04G013690
Gene Ontology terms: NA
InterPro domains: NA
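The genome location field can be split into chromosome, start, and end programmatically. The sketch below is illustrative only: the function name `parse_location` is hypothetical, and it assumes the coordinates are 1-based and inclusive (the usual GFF-style convention), which this page does not state.

```python
import re

def parse_location(location: str) -> tuple[str, int, int, int]:
    """Split 'chrom:start..end' into its parts and the span length.

    Assumption: coordinates are 1-based and inclusive, so the
    span length is end - start + 1.
    """
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if end < start:
        raise ValueError("end coordinate is smaller than start coordinate")
    return chrom, start, end, end - start + 1

chrom, start, end, span = parse_location("chr04:21484509..21485654")
print(chrom, start, end, span)
```

Under the inclusive-coordinate assumption, the span for this gene comes out at 1146 bp.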


Homology
GenBank top hits (e value, % identity, alignment)
KAE8649926.1 hypothetical protein Csa_011922 [Cucumis sativus]; e value: 2.6e-115; % identity: 77.85
Query:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR
        MNSTDQL NFEA A+IS  KPD EPKKQVRRRR S RRLYKE PLDMAEARREIVTALKLHRA STKE AREQQQKQDQE KQS PLFPQ   CFEAEGR
Subjt:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMA
        RKSRRNPRIYP  SYDCSFYLENGSG VAPPP  +NLN EIPIQ FDDDFKT          SFW PPSSYIC T+SCPDTHQE+PKS+SL EEEG LMA
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMA

Query:  SDFLFWSNNDPTRESEKDMQQGAV--EEAM-AMAEIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKIED
        SD +FW NNDPT  SEKDMQQ  V  EEAM AMA+I+SMSMDVKALE D  HSSDNAMEFP+WLSIN+D L Q+SNY CVEEDYLQ PDLSCFD  KIED
Subjt:  SDFLFWSNNDPTRESEKDMQQGAV--EEAM-AMAEIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKIED

Query:  VDGDWLA
        +D +WLA
Subjt:  VDGDWLA

KAG6608324.1 hypothetical protein SDJN03_01666, partial [Cucurbita argyrosperma subsp. sororia]; e value: 6.2e-93; % identity: 63.38
Query:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR
        MNSTDQLCNFEA  KI QP+P   GE KKQVRRRRQSRRLYK+ PL+MAEARREIVTALKLHRASTKEA+EQQQKQDQ+IK S+P++P Q  PCFE E R
Subjt:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSISLS
         KSRRNPRIYP    DCSFY ENGS F+APPPVAQ+L+ +IPIQ       F+D             +  SF PPSSYIC T      THQEVPKSISLS
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSISLS

Query:  EEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFDVH--------HSSDNAMEFPEWLSINNDILQQHSNYKCVEE
        EEEG+LMASD LFWSNN PT ESEK++  GAV     EE   +AEIRSM      ++   H          S+ AMEFP+WLSIN+D LQ  SNY    E
Subjt:  EEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFDVH--------HSSDNAMEFPEWLSINNDILQQHSNYKCVEE

Query:  DYLQYPDLSCFDFGKIEDVDGDWLA
        DYLQ PDLSC D G+IEDVDGDWLA
Subjt:  DYLQYPDLSCFDFGKIEDVDGDWLA

XP_016901295.1 PREDICTED: uncharacterized protein LOC103493717 [Cucumis melo]; e value: 6.3e-114; % identity: 76.7
Query:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR
        MNS DQL NFEA A+IS  KPD EPKKQVRRRR S RRLYKE PLDMAEARREIVTALKLHRA STKE AREQQQKQDQE KQS PLFP+L  CFEAEGR
Subjt:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM
        RKS+RNPRIYP  SYDCSFYLENGSGFVAPPP  +NLN EIPIQ FDDDFKT          SFW PPSSYIC TVSCPDT HQE PKS+SL EEEG LM
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM

Query:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI
        ASD +FW NNDPT  +EKDMQQ AV  EEAMAMA  +++SMSMDVKALE D HHSSDNAM FP+W+SIN+D LQQ+SNY CVEED LQ PDLSCFD GKI
Subjt:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI

Query:  EDVDGDWLA
        ED+  +WLA
Subjt:  EDVDGDWLA

XP_022940715.1 uncharacterized protein LOC111446225 [Cucurbita moschata]; e value: 4.7e-93; % identity: 63.83
Query:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR
        MNSTDQLCNFEA  KI QP+P   GE KKQVRRRRQSRRLYK+ PL+MAEARREIVTALKLHRASTKEA+EQQQKQDQ+IK S+P++P Q  PCFE E R
Subjt:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD--------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSIS
         KSRRNPRIYP    DCSFY ENGS F+APPPVAQ+L+ +IPIQ       F+D               +  SF PPSSYIC T      THQEVPKSIS
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD--------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSIS

Query:  LSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHSNYK
        LSEEEG+LMASD LFWSNN PT ESEK++  GAV     EE   +AEIR  S+D K LE D   H          S+ AMEFP+WLSIN+D LQ  SNY+
Subjt:  LSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHSNYK

Query:  CVEEDYLQYPDLSCFDFGKIEDVDGDWLA
           EDYLQ PDLSC D G+IEDVDGDWLA
Subjt:  CVEEDYLQYPDLSCFDFGKIEDVDGDWLA

XP_038897806.1 uncharacterized protein LOC120085720 [Benincasa hispida]; e value: 3.3e-110; % identity: 76.66
Query:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRK
        MNS DQLCNFEA A+ISQPKPDGE KKQVRRRR S RRLYKE PLDMAEARREIVTALKLHRASTKEAREQQQKQDQ+I QS+P+FPQL PCFE +GRRK
Subjt:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRK

Query:  SRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT------SFWPPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWS
        SRRN R YP    DCSFYLENGSGFVAPP VAQNL  EIP Q FDDDFKT      SFWPPSSYI  TVSC  THQEVPKSISLSEEEG LMASD +FW 
Subjt:  SRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT------SFWPPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWS

Query:  NNDPTRESEKDMQQGAVEE--AMAMAEIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDF
        NND     +KDMQ+GAVEE  A AMAE+R M+MDVKALE D HHS +N MEF +W SIN+D LQQHSNY CVEEDYLQ PDLS + F
Subjt:  NNDPTRESEKDMQQGAVEE--AMAMAEIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDF

TrEMBL top hits (e value, % identity, alignment)
A0A0A0L091 Uncharacterized protein; e value: 1.9e-92; % identity: 77.11
Query:  MAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIF
        MAEARREIVTALKLHRA STKE AREQQQKQDQE KQS PLFPQ   CFEAEGRRKSRRNPRIYP  SYDCSFYLENGSG VAPPP  +NLN EIPIQ F
Subjt:  MAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIF

Query:  DDDFKT----------SFW-PPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV--EEAM-AMAEIRSMSMDVKAL
        DDDFKT          SFW PPSSYIC T+SCPDTHQE+PKS+SL EEEG LMASD +FW NNDPT  SEKDMQQ  V  EEAM AMA+I+SMSMDVKAL
Subjt:  DDDFKT----------SFW-PPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV--EEAM-AMAEIRSMSMDVKAL

Query:  EFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDF
        E D  HSSDNAMEFP+WLSIN+D L Q+SNY CVEEDYLQ PDLS + F
Subjt:  EFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDF

A0A1S4DZY0 uncharacterized protein LOC103493717; e value: 3.1e-114; % identity: 76.7
Query:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR
        MNS DQL NFEA A+IS  KPD EPKKQVRRRR S RRLYKE PLDMAEARREIVTALKLHRA STKE AREQQQKQDQE KQS PLFP+L  CFEAEGR
Subjt:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM
        RKS+RNPRIYP  SYDCSFYLENGSGFVAPPP  +NLN EIPIQ FDDDFKT          SFW PPSSYIC TVSCPDT HQE PKS+SL EEEG LM
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM

Query:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI
        ASD +FW NNDPT  +EKDMQQ AV  EEAMAMA  +++SMSMDVKALE D HHSSDNAM FP+W+SIN+D LQQ+SNY CVEED LQ PDLSCFD GKI
Subjt:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI

Query:  EDVDGDWLA
        ED+  +WLA
Subjt:  EDVDGDWLA

A0A5A7V8V7 Putative WRKY transcription factor protein 1 isoform X2; e value: 3.1e-114; % identity: 76.7
Query:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR
        MNS DQL NFEA A+IS  KPD EPKKQVRRRR S RRLYKE PLDMAEARREIVTALKLHRA STKE AREQQQKQDQE KQS PLFP+L  CFEAEGR
Subjt:  MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQS-RRLYKETPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQEIKQSVPLFPQLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM
        RKS+RNPRIYP  SYDCSFYLENGSGFVAPPP  +NLN EIPIQ FDDDFKT          SFW PPSSYIC TVSCPDT HQE PKS+SL EEEG LM
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKT----------SFW-PPSSYICLTVSCPDT-HQEVPKSISLSEEEGKLM

Query:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI
        ASD +FW NNDPT  +EKDMQQ AV  EEAMAMA  +++SMSMDVKALE D HHSSDNAM FP+W+SIN+D LQQ+SNY CVEED LQ PDLSCFD GKI
Subjt:  ASDFLFWSNNDPTRESEKDMQQGAV--EEAMAMA--EIRSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKI

Query:  EDVDGDWLA
        ED+  +WLA
Subjt:  EDVDGDWLA

A0A6J1FRD8 uncharacterized protein LOC111446225; e value: 2.3e-93; % identity: 63.83
Query:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR
        MNSTDQLCNFEA  KI QP+P   GE KKQVRRRRQSRRLYK+ PL+MAEARREIVTALKLHRASTKEA+EQQQKQDQ+IK S+P++P Q  PCFE E R
Subjt:  MNSTDQLCNFEAVAKISQPKPD--GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFEAEGR

Query:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD--------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSIS
         KSRRNPRIYP    DCSFY ENGS F+APPPVAQ+L+ +IPIQ       F+D               +  SF PPSSYIC T      THQEVPKSIS
Subjt:  RKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD--------------FKTSFWPPSSYICLTVS-CPDTHQEVPKSIS

Query:  LSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHSNYK
        LSEEEG+LMASD LFWSNN PT ESEK++  GAV     EE   +AEIR  S+D K LE D   H          S+ AMEFP+WLSIN+D LQ  SNY+
Subjt:  LSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHSNYK

Query:  CVEEDYLQYPDLSCFDFGKIEDVDGDWLA
           EDYLQ PDLSC D G+IEDVDGDWLA
Subjt:  CVEEDYLQYPDLSCFDFGKIEDVDGDWLA

A0A6J1IXC1 uncharacterized protein LOC111480786; e value: 8.1e-91; % identity: 62.05
Query:  MNSTDQLCNFEAVAKISQPKPD------GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFE
        MNSTDQLCNFEA  KI QP+P       GE KKQVRRRR++RRLYK+ PL+MAEARREIVTALKLHRASTKEA+EQQQKQDQ+IK S+P++P Q  PCFE
Subjt:  MNSTDQLCNFEAVAKISQPKPD------GEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFP-QLCPCFE

Query:  AEGRRKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD-------------FKTSFWPPSSYICLTVS-CPDTHQEVPK
         E R KSRRNPRIYP    DCSFY +NGS F+APPPVAQ+L+ +IPIQ       F+D              +  SF  PSSYIC T      TH+EVPK
Subjt:  AEGRRKSRRNPRIYPGYSYDCSFYLENGSGFVAPPPVAQNLNAEIPIQI------FDDD-------------FKTSFWPPSSYICLTVS-CPDTHQEVPK

Query:  SISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHS
        SISLSEEEG+LMASD LFWSNN PT ESEK++  GAV     EE   +AEIR  SMD K LE D   H          S+ AMEFP+WLSIN+D LQ  S
Subjt:  SISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAV-----EEAMAMAEIRSMSMDVKALEFD--VH--------HSSDNAMEFPEWLSINNDILQQHS

Query:  NYKCVEEDYLQYPDLSCFDFGKIEDVDGDWLA
        NY+   EDYLQ PDLSC D G+IEDVDGDWLA
Subjt:  NYKCVEEDYLQYPDLSCFDFGKIEDVDGDWLA

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT5G21280.1 hydroxyproline-rich glycoprotein family protein; e value: 2.7e-14; % identity: 29.78
Query:  KKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRKSRRNPR---IYPGYSYDCSFYLENGS
        KKQVRRR  + R Y+E  L+MAEARREIVTALK HRAS ++A      Q     Q + LF    P         S  NP    + P      +   ++ +
Subjt:  KKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRKSRRNPR---IYPGYSYDCSFYLENGS

Query:  GFVAPPPVAQNLNAEIPIQIFDDDFKTS---FWPPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAVEEAMAMAEI
         F+       + ++          F T+   +  PS     T +  D+  ++P S   S  E  ++ S   +WS          ++    VE      EI
Subjt:  GFVAPPPVAQNLNAEIPIQIFDDDFKTS---FWPPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAVEEAMAMAEI

Query:  RSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKIEDVDG-DWLA
        +  + +V  +E DV     + MEFP WL+   + L    N           P LSC + G+IE +DG DWLA
Subjt:  RSMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKIEDVDG-DWLA
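In the alignment blocks above, the Subjt rows repeat the Query rows, so the unlabeled middle row is the only one that carries match information. The sketch below shows one way to estimate percent identity from that row. It assumes the usual BLAST midline convention (a letter marks an identical residue, `+` a conservative substitution, a space a mismatch or gap), and the function name `identity_from_midline` is hypothetical. The reported % identity values come from the database and are not recomputed here.

```python
def identity_from_midline(query_row: str, midline_row: str) -> tuple[int, int, float]:
    """Count identical positions in one alignment block.

    Assumption: letters in the midline mark identical residues;
    '+' and spaces are not identities. The query row sets the
    block length, because trailing spaces in the midline may
    have been stripped from the text.
    """
    length = len(query_row)
    if length == 0:
        raise ValueError("empty alignment block")
    # Pad a midline whose trailing spaces were stripped.
    midline = midline_row.ljust(length)[:length]
    identical = sum(1 for ch in midline if ch.isalpha())
    return identical, length, 100.0 * identical / length

# Last block of the first GenBank hit (query 'VDGDWLA', midline '+D +WLA').
identical, length, percent = identity_from_midline("VDGDWLA", "+D +WLA")
print(f"{identical}/{length} identical ({percent:.2f}%)")
```

To get a figure for a whole hit, sum the identical counts and block lengths over all of its blocks before dividing.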


Sequences
CDS sequence
ATGAACTCTACAGACCAACTCTGCAACTTTGAAGCTGTTGCAAAAATCTCACAGCCAAAGCCAGATGGAGAACCAAAGAAACAGGTTAGAAGGAGGCGTCAAAGCCGGCG
GCTTTACAAGGAAACGCCTCTGGATATGGCTGAGGCTAGAAGAGAGATTGTAACTGCACTTAAACTCCACAGAGCATCAACTAAAGAAGCAAGAGAACAGCAACAAAAAC
AAGACCAAGAAATTAAACAATCAGTTCCTCTGTTTCCTCAATTATGCCCATGTTTTGAAGCTGAAGGAAGAAGGAAATCCAGGAGAAATCCCAGGATATACCCAGGTTAC
TCCTATGATTGCTCATTTTATTTGGAAAATGGGTCTGGTTTTGTTGCTCCTCCACCTGTTGCACAGAATCTCAATGCAGAAATCCCTATACAAATCTTTGATGATGATTT
CAAAACTTCATTCTGGCCCCCATCTTCATATATTTGTCTCACTGTTTCTTGTCCTGATACTCATCAGGAAGTTCCCAAATCAATTTCATTATCTGAGGAAGAAGGGAAGT
TAATGGCTTCTGATTTCTTGTTTTGGTCCAATAATGATCCAACTAGAGAGAGTGAAAAAGATATGCAGCAAGGGGCAGTGGAGGAAGCTATGGCTATGGCTGAGATCAGG
TCCATGTCCATGGATGTGAAAGCTTTGGAGTTTGATGTTCACCATAGTTCTGATAATGCTATGGAATTTCCAGAGTGGTTGAGCATCAATAATGATATTTTGCAGCAGCA
TTCGAATTATAAATGCGTAGAGGAGGATTATCTTCAATATCCTGACCTATCCTGCTTCGACTTTGGGAAGATTGAAGATGTGGATGGAGATTGGTTAGCATGA
mRNA sequence
ATGAACTCTACAGACCAACTCTGCAACTTTGAAGCTGTTGCAAAAATCTCACAGCCAAAGCCAGATGGAGAACCAAAGAAACAGGTTAGAAGGAGGCGTCAAAGCCGGCG
GCTTTACAAGGAAACGCCTCTGGATATGGCTGAGGCTAGAAGAGAGATTGTAACTGCACTTAAACTCCACAGAGCATCAACTAAAGAAGCAAGAGAACAGCAACAAAAAC
AAGACCAAGAAATTAAACAATCAGTTCCTCTGTTTCCTCAATTATGCCCATGTTTTGAAGCTGAAGGAAGAAGGAAATCCAGGAGAAATCCCAGGATATACCCAGGTTAC
TCCTATGATTGCTCATTTTATTTGGAAAATGGGTCTGGTTTTGTTGCTCCTCCACCTGTTGCACAGAATCTCAATGCAGAAATCCCTATACAAATCTTTGATGATGATTT
CAAAACTTCATTCTGGCCCCCATCTTCATATATTTGTCTCACTGTTTCTTGTCCTGATACTCATCAGGAAGTTCCCAAATCAATTTCATTATCTGAGGAAGAAGGGAAGT
TAATGGCTTCTGATTTCTTGTTTTGGTCCAATAATGATCCAACTAGAGAGAGTGAAAAAGATATGCAGCAAGGGGCAGTGGAGGAAGCTATGGCTATGGCTGAGATCAGG
TCCATGTCCATGGATGTGAAAGCTTTGGAGTTTGATGTTCACCATAGTTCTGATAATGCTATGGAATTTCCAGAGTGGTTGAGCATCAATAATGATATTTTGCAGCAGCA
TTCGAATTATAAATGCGTAGAGGAGGATTATCTTCAATATCCTGACCTATCCTGCTTCGACTTTGGGAAGATTGAAGATGTGGATGGAGATTGGTTAGCATGA
Protein sequence
MNSTDQLCNFEAVAKISQPKPDGEPKKQVRRRRQSRRLYKETPLDMAEARREIVTALKLHRASTKEAREQQQKQDQEIKQSVPLFPQLCPCFEAEGRRKSRRNPRIYPGY
SYDCSFYLENGSGFVAPPPVAQNLNAEIPIQIFDDDFKTSFWPPSSYICLTVSCPDTHQEVPKSISLSEEEGKLMASDFLFWSNNDPTRESEKDMQQGAVEEAMAMAEIR
SMSMDVKALEFDVHHSSDNAMEFPEWLSINNDILQQHSNYKCVEEDYLQYPDLSCFDFGKIEDVDGDWLA
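
The CDS and protein sequences above can be cross-checked by translating the CDS. This is a minimal sketch that assumes the standard genetic code (the page does not say which translation table was used); the helper name `translate` is hypothetical. Only the first 36 and last 15 nucleotides of the CDS are used as examples.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    # Remove line breaks and other whitespace from wrapped sequence text.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

cds_start = "ATGAACTCTACAGACCAACTCTGCAACTTTGAAGCT"  # first 36 nt of the CDS
cds_end = "GATTGGTTAGCATGA"                          # last 15 nt of the CDS
print(translate(cds_start))  # MNSTDQLCNFEA
print(translate(cds_end))    # DWLA
```

Both fragments agree with the start (MNSTDQLCNFEA) and end (DWLA) of the listed protein sequence.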