CuGenDBv2

CaUC04G067030 (gene) of Watermelon (USVL246-FR2) v1 genome

Gene ID: CaUC04G067030
Organism: Citrullus amarus (Watermelon (USVL246-FR2) v1)
Description: DNA glycosylase
Genome location: Ciama_Chr04:1025119..1029563
RNA-Seq Expression: CaUC04G067030
Synteny: CaUC04G067030
Gene Ontology terms:
    GO:0006281 - DNA repair (biological process)
    GO:0006355 - regulation of transcription, DNA-templated (biological process)
    GO:0003824 - catalytic activity (molecular function)
InterPro domains:
    IPR007592 - GLABROUS1 enhancer-binding protein family
    IPR011257 - DNA glycosylase
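
The genome location above follows a `sequence:start..end` pattern. As a minimal sketch, the Python snippet below splits that string into its parts and computes the length of the gene span. The function name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re


def parse_location(location):
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {location!r}")
    seqid, start, end = match.groups()
    return seqid, int(start), int(end)


seqid, start, end = parse_location("Ciama_Chr04:1025119..1029563")

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span = end - start + 1
print(seqid, start, end, span)
```

Under that assumption the span covers the whole gene region on the chromosome, which may include introns and is therefore not expected to equal the CDS length.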


Homology
GenBank top hits (columns: hit, e value, % identity, alignment)
KAG6585875.1 hypothetical protein SDJN03_18608, partial [Cucurbita argyrosperma subsp. sororia] | e value: 8.0e-128 | % identity: 70.46
Query:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK
        + KL   + DF+LE+AVCNHG FMM PNQWIPSSKTLQRPLRLSNS++S+ VSINQ+SS LLT+QIHS  +L P+D+  ILDQV RMLRLTEKDEDE+R+
Subjt:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK

Query:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN
        FQ+LHP AKQ+GFGR+FRSP++FED +KSIL+CNT+W+RTL MA +LCE+QA+M  +++KRKRK     GNFPNA EVCRMGVE LK H LGYRA +++ 
Subjt:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN

Query:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY
        FAQ V++  I+LQ       +P+ FPKIKGFGPFATAN+FMCLGFY QLPIDTETIRH+KQVHG Q+C  KTV EDVKQIYD YAP+QCLAYWLELV+YY
Subjt:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY

Query:  ESKFGKLSELCSLDYHKISGATLNL
        E+KFGKLSEL S DYHKISG+TL+L
Subjt:  ESKFGKLSELCSLDYHKISGATLNL

XP_021905122.1 uncharacterized protein LOC110820055 isoform X2 [Carica papaya] | e value: 2.1e-88 | % identity: 52.69
Query:  NFDAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTS-SFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDE
        N    L E    F+LE+AVCNHG FMMPPN W PS KTL+RPLRLSN +SSV+ SI+  S S  L IQ+H    +S  D+  IL+QV RMLR+++KDE+ 
Subjt:  NFDAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTS-SFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDE

Query:  LRKFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQ---ARMSSQNRKRKRKLI----------------GNFPNAEEVCRM
        +R+FQ +H  AK  GFGR+FRSP++FED +KS+LLCN TW RTL MA  LCELQ    R  S  +++KRK                  GNFPNAEE+  +
Subjt:  LRKFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQ---ARMSSQNRKRKRKLI----------------GNFPNAEEVCRM

Query:  GVELLKKH-NLGYRAGFIINFAQRVQNATIDLQNPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPF
          +LL++   LGYRA ++IN AQ V++  +DL N  +  KIKGFG F  AN+ MC+GFY+ +P DTET+RH+KQVHG + C+  T+ +DVK IYDKY+PF
Subjt:  GVELLKKH-NLGYRAGFIINFAQRVQNATIDLQNPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPF

Query:  QCLAYWLELVEYYESKFGKLSELCSLDYHKISGA
        Q LAYW EL+ YYESK GKLSEL    Y  ++G+
Subjt:  QCLAYWLELVEYYESKFGKLSELCSLDYHKISGA

XP_022156993.1 uncharacterized protein LOC111023822 [Momordica charantia] | e value: 1.2e-120 | % identity: 69.33
Query:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK
        D  L E    FDLERAVCNHG FMMPPN+WIPSSKTLQRPLRL++S +SV VSI+Q SS LL IQIHSS + SP D+Q ILDQV RMLR+TE+DE+ +R 
Subjt:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK

Query:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMS----SQNRKRKRK-------LIGNFPNAEEVCRMGVELLKKHNLG
        FQ+LH KAK++GFGRLFRSPT+FEDA+KSILLCN TW+RTLAMAGQLCELQA++     +  +KRKRK         GNFP A E+CRM V LL+KH +G
Subjt:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMS----SQNRKRKRK-------LIGNFPNAEEVCRMGVELLKKHNLG

Query:  YRAGFIINFAQRVQNATIDLQNPN---NFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLEL
        YRA +II+ AQRVQN  IDLQ      +FPKIKGFGPF TAN+FMCLG Y +LPIDTETIRH+KQVHGRQ CN KT  E VK +YDKYAPFQCLAYW+EL
Subjt:  YRAGFIINFAQRVQNATIDLQNPN---NFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLEL

Query:  VEYYESKFGKLSELCSLDYHKISGAT
        VEYYES+FGKLSEL   DY KISG T
Subjt:  VEYYESKFGKLSELCSLDYHKISGAT

XP_022951918.1 uncharacterized protein LOC111454659 [Cucurbita moschata] | e value: 1.4e-127 | % identity: 70.46
Query:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK
        + KL   + DF+LE+AVCNHG FMM PNQWIPSSKTLQRPLRLSNS++S+ VSINQ+SS LLT+QIHS  +L P+D+  ILDQV RMLRLTEKDEDE+R+
Subjt:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK

Query:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN
        FQ+LHP AKQ+GFGR+FRSP++FED +KSIL+CNT+W+RTL MA +LCE+QA+M  +++KRKRK     GNFPNA EVCRMGVE LK H LGYRA +++ 
Subjt:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN

Query:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY
        FAQ V++  I+LQ       +P+ FPKIKGFGPFATAN+FMCLGFY QLPIDTETIRH+KQVHG Q+C  KTV EDVKQIYD YAP+QCLAYWLELV+YY
Subjt:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY

Query:  ESKFGKLSELCSLDYHKISGATLNL
        E+KFGKLSEL S DYHKISG+TL+L
Subjt:  ESKFGKLSELCSLDYHKISGATLNL

XP_038877617.1 uncharacterized protein LOC120069874 [Benincasa hispida] | e value: 1.4e-135 | % identity: 87.06
Query:  IFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAA-LSPQDQQTILDQVVRMLRLTEKDEDELRKFQSLHP
        + DFDLE+AVCNHGQFMMPPNQWIPSSKTLQRPLRLS+S+SSVFVSINQ SS LLTIQIHSS+  LSPQDQQ ILDQVVRMLRLTEKDEDELRKFQSLHP
Subjt:  IFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAA-LSPQDQQTILDQVVRMLRLTEKDEDELRKFQSLHP

Query:  KAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQ-NRKRKRKL------IGNFPNAEEVCRMGVELLKKHNLGYRAGFIINFA
        +AKQMGFGRLFRSPT+FEDALKSILLCNTTWKRTLAMAGQLCELQA+M  Q  RKRKRKL      IGNFPNAEEVCRMGVELLKKH LGYRA +IINFA
Subjt:  KAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQ-NRKRKRKL------IGNFPNAEEVCRMGVELLKKHNLGYRAGFIINFA

Query:  QRVQNATIDLQNPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLE
        + VQ+  IDLQNPN FPKIKGFGPFATAN+ MCLG YRQLPIDTETIRH+KQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLE
Subjt:  QRVQNATIDLQNPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLE

TrEMBL top hits (columns: hit, e value, % identity, alignment)
A0A438CJ05 Uncharacterized protein | e value: 7.1e-82 | % identity: 49.24
Query:  FDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQ-TSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRKFQSLHPKAK
        F+LE AVCNHG FMM PN WIPS+KTLQRPLRL++  +S+  SI+   +   + +++H +  +SP DQ+ IL  V RMLR++++DE ++++F  + P+AK
Subjt:  FDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQ-TSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRKFQSLHPKAK

Query:  QMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKR-------------KRKLIGNFPNAEEVCRMGVELLKKH-NLGYRAGFI
           FGR+FRSP++FED +KSILLCN  W+RTL MA  LCELQ  +    RKR             + + IGNFPN+ E+  +  E LKK  NLGYRA  I
Subjt:  QMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKR-------------KRKLIGNFPNAEEVCRMGVELLKKH-NLGYRAGFI

Query:  INFAQRVQNATIDLQN-------------PNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAY
        +  A  ++N  + LQN              +   K KGFGPFA AN+ MC+G+Y+++P D+ET RH+K++HGR+    K   +DVK+IYDKYAPFQCLAY
Subjt:  INFAQRVQNATIDLQN-------------PNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAY

Query:  WLELVEYYESKFGKLSELCSLDYHKISGA
        WLEL EYY+S+FGKLSEL   +YH I+G+
Subjt:  WLELVEYYESKFGKLSELCSLDYHKISGA

A0A6A1W9S6 Uncharacterized protein | e value: 5.6e-87 | % identity: 48.25
Query:  KLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQ----TSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDEL
        +L E +  F++E+AVCNHG FMM PN WIPS+KTLQRPLRL+NS  SV VSI+     T++++L IQ+H +  +SPQD++ IL+QV RMLR++E+DE  L
Subjt:  KLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQ----TSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDEL

Query:  RKFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQ---------------ARMSSQNRKRKRKL------------------
        R+FQ+LHP+AK+ GFGR FRSP++FEDA+KS+LLCN TW RTL MA  LCELQ               AR  S+ R  KRK                   
Subjt:  RKFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQ---------------ARMSSQNRKRKRKL------------------

Query:  ------------IGNFPNAEEVCRMGVELLKKH-NLGYRAGFIINFAQRVQNATIDLQNPNN------------FPKIKGFGPFATANLFMCLGFYRQLP
                    +GNFP+++EV  +    L+ H NLGYRA +I+  A++V++  + L+  ++              KIKGFGPFA AN+ MC+G+Y+ +P
Subjt:  ------------IGNFPNAEEVCRMGVELLKKH-NLGYRAGFIINFAQRVQNATIDLQNPNN------------FPKIKGFGPFATANLFMCLGFYRQLP

Query:  IDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYYESKFGKLSELCSLDYHKISGA
        +DTET+RH++QVHGR+    +TV EDVK +YDK+APFQ LAYW EL+E+YE KFGKLSEL +  Y  +SG+
Subjt:  IDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYYESKFGKLSELCSLDYHKISGA

A0A6J1DS88 uncharacterized protein LOC111023822 | e value: 6.0e-121 | % identity: 69.33
Query:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK
        D  L E    FDLERAVCNHG FMMPPN+WIPSSKTLQRPLRL++S +SV VSI+Q SS LL IQIHSS + SP D+Q ILDQV RMLR+TE+DE+ +R 
Subjt:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK

Query:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMS----SQNRKRKRK-------LIGNFPNAEEVCRMGVELLKKHNLG
        FQ+LH KAK++GFGRLFRSPT+FEDA+KSILLCN TW+RTLAMAGQLCELQA++     +  +KRKRK         GNFP A E+CRM V LL+KH +G
Subjt:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMS----SQNRKRKRK-------LIGNFPNAEEVCRMGVELLKKHNLG

Query:  YRAGFIINFAQRVQNATIDLQNPN---NFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLEL
        YRA +II+ AQRVQN  IDLQ      +FPKIKGFGPF TAN+FMCLG Y +LPIDTETIRH+KQVHGRQ CN KT  E VK +YDKYAPFQCLAYW+EL
Subjt:  YRAGFIINFAQRVQNATIDLQNPN---NFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLEL

Query:  VEYYESKFGKLSELCSLDYHKISGAT
        VEYYES+FGKLSEL   DY KISG T
Subjt:  VEYYESKFGKLSELCSLDYHKISGAT

A0A6J1GJ25 uncharacterized protein LOC111454659 | e value: 6.6e-128 | % identity: 70.46
Query:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK
        + KL   + DF+LE+AVCNHG FMM PNQWIPSSKTLQRPLRLSNS++S+ VSINQ+SS LLT+QIHS  +L P+D+  ILDQV RMLRLTEKDEDE+R+
Subjt:  DAKLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRK

Query:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN
        FQ+LHP AKQ+GFGR+FRSP++FED +KSIL+CNT+W+RTL MA +LCE+QA+M  +++KRKRK     GNFPNA EVCRMGVE LK H LGYRA +++ 
Subjt:  FQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRK---LIGNFPNAEEVCRMGVELLKKHNLGYRAGFIIN

Query:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY
        FAQ V++  I+LQ       +P+ FPKIKGFGPFATAN+FMCLGFY QLPIDTETIRH+KQVHG Q+C  KTV EDVKQIYD YAP+QCLAYWLELV+YY
Subjt:  FAQRVQNATIDLQ-------NPNNFPKIKGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYY

Query:  ESKFGKLSELCSLDYHKISGATLNL
        E+KFGKLSEL S DYHKISG+TL+L
Subjt:  ESKFGKLSELCSLDYHKISGATLNL

A0A6P4BPN5 uncharacterized protein LOC107434191 | e value: 1.2e-81 | % identity: 49.12
Query:  KLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQT---SSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELR
        +L E    F+LE+AVCNHG FMM PN WIPS+KTLQRPLRLS+  +S  VSI+     S  LL I +HS    S  D+  IL QV RMLR++E+DE ++R
Subjt:  KLRELIFDFDLERAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQT---SSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELR

Query:  KFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNR---KRKR----------KLIGNFPNAEEVCRMGVELLKKH
        +FQ   PKAK  GFGRLFRSP++FEDA+KSILLCN TW ++L MA  LCELQ  +++  +   KRKR            +GNFP ++E+  +    L++ 
Subjt:  KFQSLHPKAKQMGFGRLFRSPTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNR---KRKR----------KLIGNFPNAEEVCRMGVELLKKH

Query:  N--LGYRAGFIINFAQRVQNATIDLQ--------NPNNFPKI-------KGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQ
           LGYRA +I+  A+ V++  + L+         P ++ ++        GFGP+  AN+FMC+G Y+ +P+DTETIRHI+QVHGR+ C+ KTV++ V++
Subjt:  N--LGYRAGFIINFAQRVQNATIDLQ--------NPNNFPKI-------KGFGPFATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQ

Query:  IYDKYAPFQCLAYWLELVEYYESKFGKLSELCSLDYHKIS
        IYDK+APFQCLAYW+EL++ YE KFGKLSEL    Y  +S
Subjt:  IYDKYAPFQCLAYWLELVEYYESKFGKLSELCSLDYHKIS

SwissProt top hits (columns: hit, e value, % identity, alignment)
No hits found

Arabidopsis top hits (columns: hit, e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGTCGGGCTCAGACAAATCAGATAAAGAGTACAAAATAGAGTCGGATGTGGAAAGTAGTGAAGAATATTCAGAATATTGTAGGTCAAAAAGAAGAAGGAAAGCA
CAAAGAAGAAGAAAAAAGAGATCATCATCAGGTGGAGTATTGGAATGGGAGGAAGATGATGAAATCACTGTGTTGAAACAACTCTACGAATTCTCAGCTGGGAGC
TGCAAGGGTTTGTATTCAAAGGCATTTTATGAACATGTGAAGCCAAAATTAATGAACAGAGAAGTGACGATGAGTGAAATGAGCAGCAAAATTGGTGGATTCAAA
AACCAATATTTGGAGTGGAGAAGGAAAGGCAATAGTAATAATCATGAATTGAGATTGTTAGGGCCTCACCGACGCCAAGTGTTTGTGTTATCGAACAGAATATGG
GGAGATTATGATGAGGAAATGAAGCATTTGGAAGGGTTTAGGGAGTTTCTTGAAGAGCTGGAGATCGATATCAATTGTTTGATGCCTTCATCTTTGGAATATCTT
AAGAAAGAGTGGGAAGTTCAAATGCTTATGTACATTCAGCTTTTGTCAATCAAGTCCAACTTTGATGCTAAGCTTAGGGAATTGATCTTTGATTTCGATCTTGAG
AGAGCAGTTTGTAACCATGGGCAATTTATGATGCCACCAAACCAATGGATTCCTTCTTCTAAAACTCTCCAACGTCCACTTCGTCTCTCTAATTCAAACTCTTCT
GTATTTGTCTCTATCAACCAAACTTCGTCTTTTCTTCTCACCATTCAAATCCACTCTTCTGCTGCTCTCTCTCCCCAAGATCAACAAACTATATTGGATCAAGTG
GTTCGGATGCTTAGGCTTACGGAGAAAGATGAGGATGAGTTGAGGAAATTTCAAAGTTTGCATCCCAAAGCCAAACAGATGGGATTTGGTCGGCTTTTTCGGTCT
CCCACTGTTTTTGAAGATGCACTCAAGTCCATCCTTCTATGCAATACCACGTGGAAAAGGACACTGGCAATGGCTGGACAGCTGTGTGAGCTCCAAGCCAGAATG
AGCAGCCAAAATAGGAAGAGAAAAAGGAAATTAATTGGGAATTTTCCAAATGCAGAAGAAGTTTGTAGAATGGGCGTTGAATTGTTGAAGAAGCATAATCTTGGT
TACAGAGCTGGTTTCATCATTAACTTTGCTCAACGCGTTCAAAATGCCACAATTGATCTCCAAAATCCTAATAATTTCCCTAAAATCAAAGGCTTCGGACCTTTT
GCAACCGCTAATCTATTCATGTGCCTCGGATTTTACCGTCAACTTCCTATTGATACTGAAACTATAAGGCACATAAAACAGGTACATGGAAGACAATTTTGCAAC
AATAAGACAGTACGGGAAGATGTCAAACAAATTTACGACAAGTATGCTCCATTCCAGTGCTTGGCCTATTGGTTGGAGCTTGTGGAATATTACGAGAGCAAATTC
GGGAAGCTAAGTGAACTGTGCTCCCTTGATTATCACAAGATCAGTGGCGCCACCCTCAACCTTTGA
mRNA sequence
ATGTCGGGCTCAGACAAATCAGATAAAGAGTACAAAATAGAGTCGGATGTGGAAAGTAGTGAAGAATATTCAGAATATTGTAGGTCAAAAAGAAGAAGGAAAGCA
CAAAGAAGAAGAAAAAAGAGATCATCATCAGGTGGAGTATTGGAATGGGAGGAAGATGATGAAATCACTGTGTTGAAACAACTCTACGAATTCTCAGCTGGGAGC
TGCAAGGGTTTGTATTCAAAGGCATTTTATGAACATGTGAAGCCAAAATTAATGAACAGAGAAGTGACGATGAGTGAAATGAGCAGCAAAATTGGTGGATTCAAA
AACCAATATTTGGAGTGGAGAAGGAAAGGCAATAGTAATAATCATGAATTGAGATTGTTAGGGCCTCACCGACGCCAAGTGTTTGTGTTATCGAACAGAATATGG
GGAGATTATGATGAGGAAATGAAGCATTTGGAAGGGTTTAGGGAGTTTCTTGAAGAGCTGGAGATCGATATCAATTGTTTGATGCCTTCATCTTTGGAATATCTT
AAGAAAGAGTGGGAAGTTCAAATGCTTATGTACATTCAGCTTTTGTCAATCAAGTCCAACTTTGATGCTAAGCTTAGGGAATTGATCTTTGATTTCGATCTTGAG
AGAGCAGTTTGTAACCATGGGCAATTTATGATGCCACCAAACCAATGGATTCCTTCTTCTAAAACTCTCCAACGTCCACTTCGTCTCTCTAATTCAAACTCTTCT
GTATTTGTCTCTATCAACCAAACTTCGTCTTTTCTTCTCACCATTCAAATCCACTCTTCTGCTGCTCTCTCTCCCCAAGATCAACAAACTATATTGGATCAAGTG
GTTCGGATGCTTAGGCTTACGGAGAAAGATGAGGATGAGTTGAGGAAATTTCAAAGTTTGCATCCCAAAGCCAAACAGATGGGATTTGGTCGGCTTTTTCGGTCT
CCCACTGTTTTTGAAGATGCACTCAAGTCCATCCTTCTATGCAATACCACGTGGAAAAGGACACTGGCAATGGCTGGACAGCTGTGTGAGCTCCAAGCCAGAATG
AGCAGCCAAAATAGGAAGAGAAAAAGGAAATTAATTGGGAATTTTCCAAATGCAGAAGAAGTTTGTAGAATGGGCGTTGAATTGTTGAAGAAGCATAATCTTGGT
TACAGAGCTGGTTTCATCATTAACTTTGCTCAACGCGTTCAAAATGCCACAATTGATCTCCAAAATCCTAATAATTTCCCTAAAATCAAAGGCTTCGGACCTTTT
GCAACCGCTAATCTATTCATGTGCCTCGGATTTTACCGTCAACTTCCTATTGATACTGAAACTATAAGGCACATAAAACAGGTACATGGAAGACAATTTTGCAAC
AATAAGACAGTACGGGAAGATGTCAAACAAATTTACGACAAGTATGCTCCATTCCAGTGCTTGGCCTATTGGTTGGAGCTTGTGGAATATTACGAGAGCAAATTC
GGGAAGCTAAGTGAACTGTGCTCCCTTGATTATCACAAGATCAGTGGCGCCACCCTCAACCTTTGA
Protein sequence
MSGSDKSDKEYKIESDVESSEEYSEYCRSKRRRKAQRRRKKRSSSGGVLEWEEDDEITVLKQLYEFSAGSCKGLYSKAFYEHVKPKLMNREVTMSEMSSKIGGFK
NQYLEWRRKGNSNNHELRLLGPHRRQVFVLSNRIWGDYDEEMKHLEGFREFLEELEIDINCLMPSSLEYLKKEWEVQMLMYIQLLSIKSNFDAKLRELIFDFDLE
RAVCNHGQFMMPPNQWIPSSKTLQRPLRLSNSNSSVFVSINQTSSFLLTIQIHSSAALSPQDQQTILDQVVRMLRLTEKDEDELRKFQSLHPKAKQMGFGRLFRS
PTVFEDALKSILLCNTTWKRTLAMAGQLCELQARMSSQNRKRKRKLIGNFPNAEEVCRMGVELLKKHNLGYRAGFIINFAQRVQNATIDLQNPNNFPKIKGFGPF
ATANLFMCLGFYRQLPIDTETIRHIKQVHGRQFCNNKTVREDVKQIYDKYAPFQCLAYWLELVEYYESKFGKLSELCSLDYHKISGATLNL
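
The CDS and protein sequences listed above can be cross-checked against each other by translating the CDS. The sketch below is one way to do that in plain Python. It assumes the standard nuclear genetic code, and the `translate` helper is a hypothetical name written for this illustration rather than anything provided by the database.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each position (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS into protein, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons and last seven codons of the CDS shown above.
cds_start = "ATGTCGGGCTCAGACAAATCAGATAAAGAG"
cds_end = "GGCGCCACCCTCAACCTTTGA"
print(translate(cds_start))  # MSGSDKSDKE
print(translate(cds_end))    # GATLNL
```

Passing the full CDS text, line breaks included, to `translate` and comparing the result with the protein sequence would extend this spot check to the whole record.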