CuGenDBv2

ClCG05G015950 (gene) of Watermelon (Charleston Gray) v2.5 genome

Gene ID: ClCG05G015950
Organism: Citrullus lanatus subsp. vulgaris cv. Charleston Gray (Watermelon (Charleston Gray) v2.5)
Description: Hydroxyproline-rich glycoprotein family protein
Genome location: CG_Chr05:27951377..27952255
RNA-Seq Expression: ClCG05G015950
Synteny: ClCG05G015950
Gene Ontology terms: NA
InterPro domains: NA
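The genome location field packs a sequence ID and a coordinate range into one string. A minimal sketch of splitting it and computing the span, assuming the coordinates are 1-based and inclusive (the record does not state the convention); the helper names `parse_location` and `span_length` are hypothetical:

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the span, assuming 1-based inclusive coordinates."""
    return end - start + 1


seqid, start, end = parse_location("CG_Chr05:27951377..27952255")
print(seqid, span_length(start, end))  # CG_Chr05 879
```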


Homology
GenBank top hits | e value | % identity | Alignment
KAE8649926.1 hypothetical protein Csa_011922 [Cucumis sativus] | 1.7e-114 | 78.77
Query:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR
        MNSTDQL NF A AQ S  +PD EPKKQ+RRRR S RRLYKE+PLDMAEARREIVTALKLHRA STKE AREQQQKQDQ+ KQS PLF Q   CFEAEGR
Subjt:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR

Query:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMA
        RKSRRNPRIYP CSY CSFYL NGSG VAPPP  +NLNTEIPIQ+FDD FKT+DTCSSF SLS W PPSSYICPT+SCPDTHQE+PKS SL EE G LMA
Subjt:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMA

Query:  SDLFWSNNDPTGESEKDMQRWAVEEEKAM-AMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWY
        SD+FW NNDPTG SEKDMQ+  V EE+AM AMA+I+SMSMDVKALE DG HSSDNAMEFPDWL INDDFL Q+ NY+CVE DYLQ PDLS +
Subjt:  SDLFWSNNDPTGESEKDMQRWAVEEEKAM-AMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWY

KAG6608324.1 hypothetical protein SDJN03_01666, partial [Cucurbita argyrosperma subsp. sororia] | 1.9e-86 | 63.31
Query:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR
        MNSTDQLCNF AT     QPQP GE KKQ+RRRRQSRRLYK+MPL+MAEARREIVTALKLHRASTKEA+EQQQKQDQQIK S+P++  Q  PCFE E R 
Subjt:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR

Query:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCS--------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKSTSLSE
        KSRRNPRIYP     CSFY  NGS F+APPP+AQ+L+ +IPIQ+        DT S        SFYSLS  PPSSYICPT      THQEVPKS SLSE
Subjt:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCS--------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKSTSLSE

Query:  EAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNCVEGD
        E G+LMASDLFWSNN PTGESEK++       EEE+   +AEIR  SM+ K LE DG             S+ AMEFPDWL INDDFL    NY     D
Subjt:  EAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNCVEGD

Query:  YLQYPDLS
        YLQ PDLS
Subjt:  YLQYPDLS

XP_016901295.1 PREDICTED: uncharacterized protein LOC103493717 [Cucumis melo] | 8.6e-111 | 76.35
Query:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR
        MNS DQL NF A AQ S  +PD EPKKQ+RRRR S RRLYKE+PLDMAEARREIVTALKLHRA STKE AREQQQKQDQ+ KQS PLF +L  CFEAEGR
Subjt:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR

Query:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM
        RKS+RNPRIYP CSY CSFYL NGSGFVAPPP  +NLNTEIPIQ+FDD FKT+DTCSSF SLS W PPSSYICPTVSCPDT HQE PKS SL EE G LM
Subjt:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM

Query:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI
        ASD+FW NNDPTG +EKDMQ+ AV EE+AMAMA  +++SMSMDVKALE D HHSSDNAM FPDW+ INDD L Q+ NY+CVE D LQ PDLS + I
Subjt:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI

XP_022940715.1 uncharacterized protein LOC111446225 [Cucurbita moschata] | 1.5e-86 | 63.14
Query:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR
        MNSTDQLCNF AT     QPQP GE KKQ+RRRRQSRRLYK+MPL+MAEARREIVTALKLHRASTKEA+EQQQKQDQQIK S+P++  Q  PCFE E R 
Subjt:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR

Query:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQ------SFDDGFKTVDTCS------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKST
        KSRRNPRIYP     CSFY  NGS F+APPP+AQ+L+ +IPIQ      +F+D    V  CS      SFYSLS  PPSSYICPT      THQEVPKS 
Subjt:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQ------SFDDGFKTVDTCS------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKST

Query:  SLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNC
        SLSEE G+LMASDLFWSNN PTGESEK++       EEE+   +AEIR  S+D K LE DG             S+ AMEFPDWL INDDFL    NY  
Subjt:  SLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNC

Query:  VEGDYLQYPDLS
           DYLQ PDLS
Subjt:  VEGDYLQYPDLS

XP_038897806.1 uncharacterized protein LOC120085720 [Benincasa hispida] | 3.5e-112 | 74.74
Query:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRK
        MNS DQLCNF A AQ SQP+PDGE KKQ+RRRR S RRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQI QS+P+F QL PCFE +GRRK
Subjt:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRK

Query:  SRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLWPPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMASDL
        SRRN R YP     CSFYL NGSGFVAPP +AQNL TEIP QSFDD FKT    SS+  LS WPPSSYI PTVSC  THQEVPKS SLSEE G LMASD+
Subjt:  SRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLWPPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMASDL

Query:  FWSNNDPTGESEKDMQRWAVEEEKAMAMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGINFS
        FW NND     +KDMQ  AVEE +A AMAE+R M+MDVKALE+DGHHS +N MEF DW  INDDFL QH NY+CVE DYLQ PDLSWY  N S
Subjt:  FWSNNDPTGESEKDMQRWAVEEEKAMAMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGINFS

TrEMBL top hits | e value | % identity | Alignment
A0A0A0L091 Uncharacterized protein | 3.2e-103 | 80.08
Query:  MAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSF
        MAEARREIVTALKLHRA STKE AREQQQKQDQ+ KQS PLF Q   CFEAEGRRKSRRNPRIYP CSY CSFYL NGSG VAPPP  +NLNTEIPIQ+F
Subjt:  MAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSF

Query:  DDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWAVEEEKAM-AMAEIRSMSMDVKALE
        DD FKT+DTCSSF SLS W PPSSYICPT+SCPDTHQE+PKS SL EE G LMASD+FW NNDPTG SEKDMQ+  V EE+AM AMA+I+SMSMDVKALE
Subjt:  DDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWAVEEEKAM-AMAEIRSMSMDVKALE

Query:  TDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGINFS
         DG HSSDNAMEFPDWL INDDFL Q+ NY+CVE DYLQ PDLSWY  NFS
Subjt:  TDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGINFS

A0A1S4DZY0 uncharacterized protein LOC103493717 | 4.2e-111 | 76.35
Query:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR
        MNS DQL NF A AQ S  +PD EPKKQ+RRRR S RRLYKE+PLDMAEARREIVTALKLHRA STKE AREQQQKQDQ+ KQS PLF +L  CFEAEGR
Subjt:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR

Query:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM
        RKS+RNPRIYP CSY CSFYL NGSGFVAPPP  +NLNTEIPIQ+FDD FKT+DTCSSF SLS W PPSSYICPTVSCPDT HQE PKS SL EE G LM
Subjt:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM

Query:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI
        ASD+FW NNDPTG +EKDMQ+ AV EE+AMAMA  +++SMSMDVKALE D HHSSDNAM FPDW+ INDD L Q+ NY+CVE D LQ PDLS + I
Subjt:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI

A0A5A7V8V7 Putative WRKY transcription factor protein 1 isoform X2 | 4.2e-111 | 76.35
Query:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR
        MNS DQL NF A AQ S  +PD EPKKQ+RRRR S RRLYKE+PLDMAEARREIVTALKLHRA STKE AREQQQKQDQ+ KQS PLF +L  CFEAEGR
Subjt:  MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQS-RRLYKEMPLDMAEARREIVTALKLHRA-STKE-AREQQQKQDQQIKQSVPLFSQLCPCFEAEGR

Query:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM
        RKS+RNPRIYP CSY CSFYL NGSGFVAPPP  +NLNTEIPIQ+FDD FKT+DTCSSF SLS W PPSSYICPTVSCPDT HQE PKS SL EE G LM
Subjt:  RKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLW-PPSSYICPTVSCPDT-HQEVPKSTSLSEEAGKLM

Query:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI
        ASD+FW NNDPTG +EKDMQ+ AV EE+AMAMA  +++SMSMDVKALE D HHSSDNAM FPDW+ INDD L Q+ NY+CVE D LQ PDLS + I
Subjt:  ASDLFWSNNDPTGESEKDMQRWAVEEEKAMAMA--EIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGI

A0A6J1FRD8 uncharacterized protein LOC111446225 | 7.2e-87 | 63.14
Query:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR
        MNSTDQLCNF AT     QPQP GE KKQ+RRRRQSRRLYK+MPL+MAEARREIVTALKLHRASTKEA+EQQQKQDQQIK S+P++  Q  PCFE E R 
Subjt:  MNSTDQLCNFVAT-AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEAEGRR

Query:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQ------SFDDGFKTVDTCS------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKST
        KSRRNPRIYP     CSFY  NGS F+APPP+AQ+L+ +IPIQ      +F+D    V  CS      SFYSLS  PPSSYICPT      THQEVPKS 
Subjt:  KSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQ------SFDDGFKTVDTCS------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKST

Query:  SLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNC
        SLSEE G+LMASDLFWSNN PTGESEK++       EEE+   +AEIR  S+D K LE DG             S+ AMEFPDWL INDDFL    NY  
Subjt:  SLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYNC

Query:  VEGDYLQYPDLS
           DYLQ PDLS
Subjt:  VEGDYLQYPDLS

A0A6J1IXC1 uncharacterized protein LOC111480786 | 1.1e-84 | 61.34
Query:  MNSTDQLCNFVAT-----AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEA
        MNSTDQLCNF AT         QPQP GE KKQ+RRRR++RRLYK+MPL+MAEARREIVTALKLHRASTKEA+EQQQKQDQQIK S+P++  Q  PCFE 
Subjt:  MNSTDQLCNFVAT-----AQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLF-SQLCPCFEA

Query:  EGRRKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCS---------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKS
        E R KSRRNPRIYP     CSFY  NGS F+APPP+AQ+L+ +IPIQ+        DT S         SFYSLS   PSSYICPT      TH+EVPKS
Subjt:  EGRRKSRRNPRIYPGCSYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCS---------SFYSLSLWPPSSYICPTVS-CPDTHQEVPKS

Query:  TSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYN
         SLSEE G+LMASDLFWSNN PTGESEK++       EEE+   +AEIR  SMD K LE DG             S+ AMEFPDWL INDDFL    NY 
Subjt:  TSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWA--VEEEKAMAMAEIRSMSMDVKALETDGH----------HSSDNAMEFPDWLGINDDFLHQHWNYN

Query:  CVEGDYLQYPDLS
            DYLQ PDLS
Subjt:  CVEGDYLQYPDLS

SwissProt top hits | e value | % identity | Alignment
No hits found
Arabidopsis top hits | e value | % identity | Alignment
AT5G21280.1 hydroxyproline-rich glycoprotein family protein | 1.2e-12 | 32.39
Query:  KKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRKSRRNPRIYPGCSYHCSFYLGNGSGFV
        KKQ+RRR  + R Y+E  L+MAEARREIVTALK HRAS ++A      Q     Q + LFS   P             P  +   +   +F L N     
Subjt:  KKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRKSRRNPRIYPGCSYHCSFYLGNGSGFV

Query:  APPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLWPPSSYICPTVSCPDTHQEVPK--STSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWAVEEEK
           PL  NLN     Q F+D  +T  T SS  S S    SS I PT     +    P   +T+ S+ A +L +S          GE+      W  E   
Subjt:  APPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLWPPSSYICPTVSCPDTHQEVPK--STSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWAVEEEK

Query:  AMAMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWN
             EI+  + +V  +E D      + MEFP WL   ++ L   +N
Subjt:  AMAMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWN
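In the alignment blocks above, the unlabeled middle line marks each column: a letter for an identical residue, `+` for a conservative substitution, and a space for a mismatch or gap. A minimal sketch of counting identities from a query line and its midline, assuming that BLAST-style convention holds for this page; the function name `identity_from_midline` is hypothetical, and the short strings are the first 20 columns of the first GenBank alignment, so the result is not expected to reproduce the full-alignment percentages in the tables:

```python
def identity_from_midline(query: str, midline: str) -> tuple[int, int]:
    """Return (identical columns, total columns) for one alignment row.

    A column counts as identical when the midline repeats the query residue.
    Gap columns ('-' in the query) are included in the total, never in the
    identical count.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    identical = sum(1 for q, m in zip(query, midline) if q != "-" and q == m)
    return identical, len(query)


query = "MNSTDQLCNFVATAQFSQPQ"
midline = "MNSTDQL NF A AQ S  +"
identical, total = identity_from_midline(query, midline)
print(f"{identical}/{total} = {100 * identical / total:.1f}% identity")
# prints "13/20 = 65.0% identity"
```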


Sequences
CDS sequence
ATGAACTCTACAGACCAACTCTGCAACTTTGTAGCTACTGCACAATTCTCACAGCCACAGCCAGATGGAGAACCAAAGAAACAGATTAGAAGGAGGCGCCAAAGCCGGCG
GCTTTACAAAGAAATGCCTCTGGATATGGCTGAGGCTAGAAGAGAGATTGTAACTGCACTTAAACTCCACAGAGCATCAACCAAAGAAGCAAGAGAGCAGCAACAAAAAC
AGGATCAACAAATTAAACAATCAGTTCCTCTGTTTTCTCAATTATGTCCATGTTTTGAAGCTGAAGGAAGAAGAAAATCCAGGAGAAATCCCAGGATATACCCAGGTTGT
TCATATCATTGCTCATTTTATTTGGGAAATGGGTCTGGTTTTGTTGCTCCTCCACCTCTTGCACAGAATCTCAATACAGAGATCCCTATACAAAGCTTTGATGATGGTTT
CAAAACTGTGGATACTTGTTCTTCATTTTATTCACTTTCACTCTGGCCCCCATCTTCATATATTTGTCCCACTGTTTCTTGTCCTGATACTCATCAGGAAGTTCCCAAAT
CAACTTCATTATCTGAGGAAGCAGGGAAGCTAATGGCTTCTGATTTGTTTTGGTCAAATAATGATCCAACTGGAGAGAGTGAAAAAGATATGCAGCGGTGGGCGGTGGAG
GAGGAGAAGGCTATGGCTATGGCTGAGATCAGGTCCATGTCCATGGATGTGAAAGCTTTGGAGACTGATGGCCACCATAGTTCTGATAATGCTATGGAATTTCCAGATTG
GTTGGGCATTAATGATGATTTTCTGCATCAGCATTGGAATTATAATTGCGTAGAGGGGGATTATCTTCAATATCCTGACCTATCTTGGTATGGAATTAACTTTTCTTAA
mRNA sequence
ATGAACTCTACAGACCAACTCTGCAACTTTGTAGCTACTGCACAATTCTCACAGCCACAGCCAGATGGAGAACCAAAGAAACAGATTAGAAGGAGGCGCCAAAGCCGGCG
GCTTTACAAAGAAATGCCTCTGGATATGGCTGAGGCTAGAAGAGAGATTGTAACTGCACTTAAACTCCACAGAGCATCAACCAAAGAAGCAAGAGAGCAGCAACAAAAAC
AGGATCAACAAATTAAACAATCAGTTCCTCTGTTTTCTCAATTATGTCCATGTTTTGAAGCTGAAGGAAGAAGAAAATCCAGGAGAAATCCCAGGATATACCCAGGTTGT
TCATATCATTGCTCATTTTATTTGGGAAATGGGTCTGGTTTTGTTGCTCCTCCACCTCTTGCACAGAATCTCAATACAGAGATCCCTATACAAAGCTTTGATGATGGTTT
CAAAACTGTGGATACTTGTTCTTCATTTTATTCACTTTCACTCTGGCCCCCATCTTCATATATTTGTCCCACTGTTTCTTGTCCTGATACTCATCAGGAAGTTCCCAAAT
CAACTTCATTATCTGAGGAAGCAGGGAAGCTAATGGCTTCTGATTTGTTTTGGTCAAATAATGATCCAACTGGAGAGAGTGAAAAAGATATGCAGCGGTGGGCGGTGGAG
GAGGAGAAGGCTATGGCTATGGCTGAGATCAGGTCCATGTCCATGGATGTGAAAGCTTTGGAGACTGATGGCCACCATAGTTCTGATAATGCTATGGAATTTCCAGATTG
GTTGGGCATTAATGATGATTTTCTGCATCAGCATTGGAATTATAATTGCGTAGAGGGGGATTATCTTCAATATCCTGACCTATCTTGGTATGGAATTAACTTTTCTTAA
Protein sequence
MNSTDQLCNFVATAQFSQPQPDGEPKKQIRRRRQSRRLYKEMPLDMAEARREIVTALKLHRASTKEAREQQQKQDQQIKQSVPLFSQLCPCFEAEGRRKSRRNPRIYPGC
SYHCSFYLGNGSGFVAPPPLAQNLNTEIPIQSFDDGFKTVDTCSSFYSLSLWPPSSYICPTVSCPDTHQEVPKSTSLSEEAGKLMASDLFWSNNDPTGESEKDMQRWAVE
EEKAMAMAEIRSMSMDVKALETDGHHSSDNAMEFPDWLGINDDFLHQHWNYNCVEGDYLQYPDLSWYGINFS
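The protein sequence can be checked against the CDS by translating it codon by codon. A minimal sketch using the standard genetic code, which is an assumption here since the record does not name a translation table; `translate` is a hypothetical helper, and the two inputs are the first 24 and last 18 nucleotides of the CDS listed above:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks from wrapped input
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGAACTCTACAGACCAACTCTGC"))  # MNSTDQLC
print(translate("GGAATTAACTTTTCTTAA"))        # GINFS
```

Passing the full CDS, with its line breaks left in, should yield a string that can be compared directly with the protein sequence after joining its lines.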