CuGenDBv2

CaUC09G163060 (gene) of Watermelon (USVL246-FR2) v1 genome

Gene ID: CaUC09G163060
Organism: Citrullus amarus (Watermelon (USVL246-FR2) v1)
Description: GATA transcription factor-like protein
Genome location: Ciama_Chr09:4907580..4913432
RNA-Seq Expression: CaUC09G163060
Synteny: CaUC09G163060
Gene Ontology terms:
GO:0006355 - regulation of transcription, DNA-templated (biological process)
GO:0008270 - zinc ion binding (molecular function)
GO:0043565 - sequence-specific DNA binding (molecular function)
InterPro domains:
IPR000679 - Zinc finger, GATA-type
IPR013088 - Zinc finger, NHR/GATA-type


Homology
GenBank top hits (e-value, % identity, alignment)
KAG6575793.1 hypothetical protein SDJN03_26432, partial [Cucurbita argyrosperma subsp. sororia] (e-value 1.6e-97, 78.48% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIA K NW FSLAQFQRLRR GLTTCRTADPSVHANDDN PAV SGEPE+SQDNLEPD+AK+NYER+DSKQ DSNGP  P KAQYASSPRLETT V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
         QASKPITQQKRA STV+DDVSC+G  GGP    + NR  + K+QE+D ++YYKHHKASPLAEIEF DTRKPITRATDGTAYDG GKDVIGWLPEQ DT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        +DSL+RATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

XP_022991237.1 uncharacterized protein LOC111487953 [Cucurbita maxima] (e-value 3.6e-97, 79.32% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIA K NWAFSLAQFQRLRR GLTTCRTADPSVHANDDN PAV SGEPE+SQDNLEPD AKANY  +DSKQ DSNGP  P KAQYASSPRLETT V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
         QASKPITQQKRA STV+ DVSC+G  GGP  E + NR  + K+QEED ++YYKHHKASPLAEIEFADTRKPITRATDGTAYDG GKDVI WLPEQ DT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        +DSLRRATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

XP_023548846.1 uncharacterized protein LOC111807374 [Cucurbita pepo subsp. pepo] (e-value 2.1e-97, 78.9% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIA K NW+FSLAQFQRLRR GLTTCRTADPSVHANDDN PAV SGEPE+SQDNLEPD+AKANYER+DSKQ DSNGP  P KAQYASSPRLETT V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
         QASKPITQQKRA STV+ DVSC+G  GGP    + NR  + K+QEED ++YYKHHKASPLAEIEF DTRKPITRATDGTAYDG GKDVIGWLPEQ DT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        +DSLRRA EIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

XP_038899333.1 uncharacterized protein LOC120086662 isoform X1 [Benincasa hispida] (e-value 8.9e-104, 86.73% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIAPKSNWA SLAQFQRLRR  LTT RTADPSVHANDDNDPAVLSGEPE SQDNLEPDN KANYER+D K  DSNGP G  KAQ+ASSPRLET  V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
        GQASKPITQQKR QSTV D+VSC+GVYGGPLE+ K NR TE K+QEEDN+DYYKHHKASPLAEIEFADTRKPITRATDGTAYDG GKDVIGWLPEQLDT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSR
        +DSLRRATEIWKQNAMRGDPDAPQSR
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSR

XP_038899334.1 uncharacterized protein LOC120086662 isoform X2 [Benincasa hispida] (e-value 3.2e-101, 85.84% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIAPKSNWA SLAQFQRLRR  LTT RTADPSVHANDDNDPAVLSGEPE   DNLEPDN KANYER+D K  DSNGP G  KAQ+ASSPRLET  V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
        GQASKPITQQKR QSTV D+VSC+GVYGGPLE+ K NR TE K+QEEDN+DYYKHHKASPLAEIEFADTRKPITRATDGTAYDG GKDVIGWLPEQLDT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSR
        +DSLRRATEIWKQNAMRGDPDAPQSR
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSR

TrEMBL top hits (e-value, % identity, alignment)
A0A0A0K9G7 Uncharacterized protein (e-value 4.8e-95, 78.6% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGG--LTTCRTADPSVHAN---DDNDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPR
        S L AIAPKSNWAF + QFQ LRRGG  LTT RTADPS+HAN   DDNDPAVLSGEPERSQDNLEPDNAKANY+ R+D KQ DS GP G   AQ+ASSPR
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGG--LTTCRTADPSVHAN---DDNDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPR

Query:  LETTAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLP
        LETT VGQASKPITQQKRA S  IDDVSC+GVYGGPLE+ K NR TE K++EEDN+DYYKHHKASPLAEIEFADTRKPITRATDGTAYDG    VIGWLP
Subjt:  LETTAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLP

Query:  EQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        EQ+DT +DSLRRATEIWKQNAMRGDPDAPQSR  +  + GEE+
Subjt:  EQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

A0A1S3BR22 uncharacterized protein LOC103492778 (e-value 2.5e-96, 77.51% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDD--NDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPRLET
        SRL AIAP+SNWA  + QFQ LRRGGLTT RTADPSVHANDD  NDP+VLSGEPERSQDNLEPDNAKANYE R+D KQ DSNGP GP KAQ+ASSPRLET
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDD--NDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPRLET

Query:  TAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQE---------EDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKD
        T VGQASKPITQQKRA S  IDDVSC+GVYGGPLEE K +R TE KD+E         EDN+DYYKHHKASPLAEIEF DTRKPITRATDGTA  G GK 
Subjt:  TAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQE---------EDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKD

Query:  VIGWLPEQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        VIGWLPEQ+DT +DSLRRATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  VIGWLPEQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

A0A5D3D3D5 Uncharacterized protein (e-value 6.2e-95, 77.02% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDD--NDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPRLET
        SRL AIAP+SNWA  + Q Q LRRGGLTT RTADPSVHANDD  NDP+VLSGEPERSQDNLEPDNAKANYE R+D KQ  SNGP GP KAQ+ASSPRLET
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDD--NDPAVLSGEPERSQDNLEPDNAKANYE-REDSKQEDSNGPLGPHKAQYASSPRLET

Query:  TAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQE--------EDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDV
        T VGQASKPITQQKRA S  IDDVSC+GVYGGPLEE K +R TE KD+E        EDN+DYYKHHKASPLAEIEF DTRKPITRATDGTA  G GK V
Subjt:  TAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQE--------EDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDV

Query:  IGWLPEQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        IGWLPEQ+DT +DSLRRATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  IGWLPEQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

A0A6J1GPT4 uncharacterized protein LOC111456388 (e-value 1.9e-96, 77.22% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIA K NW FSLAQFQRLRR GLTTCRTADPSVHANDDN PAV SGEPE+SQDNLEPD+AK+NYER+DSKQ DSNGP  P KAQYASSPRLETT V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
         QASKPITQQKRA STV+ DVSC+G  GGP    + NR  + K+Q++D ++YYKHHKASPLAEIEF DTRKPITRATDGTAYDG GKD+IGWLPEQ DT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        +DSL+RATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

A0A6J1JL82 uncharacterized protein LOC111487953 (e-value 1.7e-97, 79.32% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV
        SRLTAIA K NWAFSLAQFQRLRR GLTTCRTADPSVHANDDN PAV SGEPE+SQDNLEPD AKANY  +DSKQ DSNGP  P KAQYASSPRLETT V
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAV

Query:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA
         QASKPITQQKRA STV+ DVSC+G  GGP  E + NR  + K+QEED ++YYKHHKASPLAEIEFADTRKPITRATDGTAYDG GKDVI WLPEQ DT 
Subjt:  GQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWLPEQLDTA

Query:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY
        +DSLRRATEIWKQNAMRGDPDAPQSR  +  + GE++
Subjt:  EDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEY

SwissProt top hits (e-value, % identity, alignment)
No hits found
Arabidopsis top hits (e-value, % identity, alignment)
AT1G02700.1 unknown protein (e-value 1.8e-46, 48.72% identity)
Query:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHA-NDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTA
        SRL A A  +         +RL  G  T+ RTADP +HA ND  DPA+   +PE   D   P  A      +  +      PL P K+  A++ +LE+T 
Subjt:  SRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHA-NDDNDPAVLSGEPERSQDNLEPDNAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTA

Query:  VGQASKPITQQKRAQSTV----IDDVSCVGVYGG--PLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWL
        VG  S+P  QQKR  ST     +D VSC G+ G   P +E +   +   +D+ E ++++YKHHKASPL+EIEFADTRKPIT+ATDGTAY  AGKDVIGWL
Subjt:  VGQASKPITQQKRAQSTV----IDDVSCVGVYGG--PLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFADTRKPITRATDGTAYDGAGKDVIGWL

Query:  PEQLDTAEDSLRRATEIWKQNAMRGDPDA-PQSR
        PEQLDTAE+SL +AT I+K+NA RGDP+  P SR
Subjt:  PEQLDTAEDSLRRATEIWKQNAMRGDPDA-PQSR

AT3G22800.1 Leucine-rich repeat (LRR) family protein (e-value 8.5e-04, 54.74% identity)
Query:  CDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCP-PPPSPPSCDTCVYPSPPPPSSVQPYPPADGGQFPGLAPPPPNPILPYFPYYYYSPP
        C  P   PPPPPPPP    PPPPPPP PPPP P    P PPP PPS    VYP PPPP  V P PP+    +P   PPPP+P     PY Y SPP
Subjt:  CDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCP-PPPSPPSCDTCVYPSPPPPSSVQPYPPADGGQFPGLAPPPPNPILPYFPYYYYSPP

AT4G16140.1 proline-rich family protein (e-value 5.2e-09, 44.22% identity)
Query:  IFLTVFLFSLSKVSLLSFATVVVKDQVS-CSMCSTCDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCPPPPS----------PPSCDTCVYPSPP
        +FL  F F+ +        T+    Q++ C+MC++CDNPCQ  P PPPPP     P PPPPSP     T+ CPPPPS          PP+  +  Y  PP
Subjt:  IFLTVFLFSLSKVSLLSFATVVVKDQVS-CSMCSTCDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCPPPPS----------PPSCDTCVYPSPP

Query:  PPSSVQPY--PPADGGQFPGLAPPPPNPILPYFPYYYYSPPFASAKS
          SS   Y  PP  GG +P    PPPNPI+PYFP+YYY+PP  S  S
Subjt:  PPSSVQPY--PPADGGQFPGLAPPPPNPILPYFPYYYYSPPFASAKS

AT5G49280.1 hydroxyproline-rich glycoprotein family protein (e-value 5.5e-11, 42.11% identity)
Query:  TVFLFSLSKVSLLSFA----TVVVKDQ-VSCSMCSTCDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCPPPPSPPSC---DTCVYPSP-------
        T  L++LS + ++       TV  KD+ VSC+MCS+CDNPC         PV   PPPP P  PPP  PT+ CPPPPSPPS     +  YP P       
Subjt:  TVFLFSLSKVSLLSFA----TVVVKDQ-VSCSMCSTCDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCPPPPSPPSC---DTCVYPSP-------

Query:  --PPP----SSVQPYPPADGGQFPGLAPPPPNPILPYFPYYYYSPPFASAKS-VPFSWKFLPLPFLIILCL
          PPP         YPP   G +P   PPPPNPI+PYFP+YY++PP  S       S+  +   F + LCL
Subjt:  --PPP----SSVQPYPPADGGQFPGLAPPPPNPILPYFPYYYYSPPFASAKS-VPFSWKFLPLPFLIILCL


Sequences
CDS sequence
ATGCTCAAAGCTGAAGTAATTGGTGATTTTGGTTTGGATCCTTCCATCATGGCCTTCCTCATGAACCGCGCTGCCGATCTCTCTGCTTCCACCTTCAACCCATCT
CTAATGCCCGACGCTTCGAAACAGAGTCTGCTCAATATGGAGCCGCAGAAACAGAGGGCCTGCGTCCACTGTCGCACCACCAGAACCCCTCTCTGGAGAGCCGGT
CCGGCTGGGCCAAGGTCGCTGTGCAATGCATGTGGGATTCGATACAGGAAGACGAAGAATAATAATAATAATAGTAATGGAGGAGTGAATAATAAGATGGGAAAA
GGGAAGAAAGTGGGAGAAGGATCATTGAAGGTGAGAGTGGTGAGCTTAGGAAGAGAAATAGTTGAAGAGGCAATCGGAGAAGAAGAACAGGCGGCGGCGATGCTT
CTTATGGCTCTATCTTCCGGCTATTCGAGACTGACGGCGATCGCACCGAAATCGAATTGGGCCTTTTCTCTGGCCCAATTCCAACGCCTCCGGCGAGGTGGTCTG
ACGACATGTCGTACTGCTGACCCTTCCGTTCACGCCAACGACGACAACGACCCCGCCGTTTTATCCGGTGAACCCGAGAGATCACAGGATAATTTAGAGCCGGAT
AATGCGAAGGCCAATTACGAAAGGGAGGACTCTAAACAGGAAGATTCAAATGGGCCATTAGGGCCACATAAGGCCCAATACGCTTCCTCCCCTCGGTTAGAAACC
ACTGCAGTGGGCCAGGCCTCAAAGCCCATTACTCAGCAAAAGAGAGCCCAGAGTACGGTGATCGACGACGTGAGTTGCGTCGGCGTTTACGGCGGGCCTTTGGAG
GAGGCGAAAACAAACAGAAGAACTGAAACGAAAGATCAGGAGGAAGACAATAAAGACTATTACAAGCACCACAAGGCGTCTCCGTTGGCGGAGATCGAGTTTGCT
GATACGCGTAAGCCGATAACCAGAGCGACGGACGGGACGGCGTACGACGGGGCCGGGAAGGATGTGATTGGGTGGTTGCCGGAGCAGCTGGATACGGCGGAGGAT
TCGCTTCGGAGAGCGACGGAGATTTGGAAACAAAATGCAATGCGTGGAGATCCGGATGCTCCACAGTCGAGGCATGCCGTGGGATTTGTAAGTGGAGAAGAGTAC
TACGGAGCAAAAGCAAGCATAAATGTATGGGCGCCGCGGGTGACGAATCACTTCCAATTCACATCAATGGCCAATCTCATCTCCATTTTCCTTACTGTCTTCCTC
TTTTCACTCTCCAAAGTTTCCCTTCTCTCTTTCGCCACCGTCGTGGTCAAGGATCAGGTTAGCTGTTCCATGTGCTCAACTTGTGACAACCCATGCCAACTCCCC
CCGCCGCCGCCTCCTCCGCCGGTTGTTGACTGCCCGCCGCCCCCTCCGCCTCCATCTCCACCACCGCCGCTCCCCACTTCCCAATGCCCTCCGCCTCCGTCGCCT
CCCTCCTGTGATACATGCGTCTACCCTTCTCCGCCGCCGCCGTCCTCCGTCCAGCCGTACCCTCCGGCCGACGGTGGCCAATTTCCCGGACTTGCTCCGCCGCCG
CCGAACCCAATTTTACCGTATTTTCCTTACTATTATTACAGTCCCCCCTTTGCTTCTGCCAAATCCGTTCCATTTTCATGGAAATTTTTGCCTCTGCCTTTCTTA
ATCATCCTTTGCCTCTAG
mRNA sequence
CTCTTAAGGACACTTTCTTCAAACACCTTCCTTTCTAGTTTGTGATGCTCAAAGCTGAAGTAATTGGTGATTTTGGTTTGGATCCTTCCATCATGGCCTTCCTCA
TGAACCGCGCTGCCGATCTCTCTGCTTCCACCTTCAACCCATCTCTAATGCCCGACGCTTCGAAACAGAGTCTGCTCAATATGGAGCCGCAGAAACAGAGGGCCT
GCGTCCACTGTCGCACCACCAGAACCCCTCTCTGGAGAGCCGGTCCGGCTGGGCCAAGGTCGCTGTGCAATGCATGTGGGATTCGATACAGGAAGACGAAGAATA
ATAATAATAATAGTAATGGAGGAGTGAATAATAAGATGGGAAAAGGGAAGAAAGTGGGAGAAGGATCATTGAAGGTGAGAGTGGTGAGCTTAGGAAGAGAAATAG
TTGAAGAGGCAATCGGAGAAGAAGAACAGGCGGCGGCGATGCTTCTTATGGCTCTATCTTCCGGCTATTCGAGACTGACGGCGATCGCACCGAAATCGAATTGGG
CCTTTTCTCTGGCCCAATTCCAACGCCTCCGGCGAGGTGGTCTGACGACATGTCGTACTGCTGACCCTTCCGTTCACGCCAACGACGACAACGACCCCGCCGTTT
TATCCGGTGAACCCGAGAGATCACAGGATAATTTAGAGCCGGATAATGCGAAGGCCAATTACGAAAGGGAGGACTCTAAACAGGAAGATTCAAATGGGCCATTAG
GGCCACATAAGGCCCAATACGCTTCCTCCCCTCGGTTAGAAACCACTGCAGTGGGCCAGGCCTCAAAGCCCATTACTCAGCAAAAGAGAGCCCAGAGTACGGTGA
TCGACGACGTGAGTTGCGTCGGCGTTTACGGCGGGCCTTTGGAGGAGGCGAAAACAAACAGAAGAACTGAAACGAAAGATCAGGAGGAAGACAATAAAGACTATT
ACAAGCACCACAAGGCGTCTCCGTTGGCGGAGATCGAGTTTGCTGATACGCGTAAGCCGATAACCAGAGCGACGGACGGGACGGCGTACGACGGGGCCGGGAAGG
ATGTGATTGGGTGGTTGCCGGAGCAGCTGGATACGGCGGAGGATTCGCTTCGGAGAGCGACGGAGATTTGGAAACAAAATGCAATGCGTGGAGATCCGGATGCTC
CACAGTCGAGGCATGCCGTGGGATTTGTAAGTGGAGAAGAGTACTACGGAGCAAAAGCAAGCATAAATGTATGGGCGCCGCGGGTGACGAATCACTTCCAATTCA
CATCAATGGCCAATCTCATCTCCATTTTCCTTACTGTCTTCCTCTTTTCACTCTCCAAAGTTTCCCTTCTCTCTTTCGCCACCGTCGTGGTCAAGGATCAGGTTA
GCTGTTCCATGTGCTCAACTTGTGACAACCCATGCCAACTCCCCCCGCCGCCGCCTCCTCCGCCGGTTGTTGACTGCCCGCCGCCCCCTCCGCCTCCATCTCCAC
CACCGCCGCTCCCCACTTCCCAATGCCCTCCGCCTCCGTCGCCTCCCTCCTGTGATACATGCGTCTACCCTTCTCCGCCGCCGCCGTCCTCCGTCCAGCCGTACC
CTCCGGCCGACGGTGGCCAATTTCCCGGACTTGCTCCGCCGCCGCCGAACCCAATTTTACCGTATTTTCCTTACTATTATTACAGTCCCCCCTTTGCTTCTGCCA
AATCCGTTCCATTTTCATGGAAATTTTTGCCTCTGCCTTTCTTAATCATCCTTTGCCTCTAG
Protein sequence
MLKAEVIGDFGLDPSIMAFLMNRAADLSASTFNPSLMPDASKQSLLNMEPQKQRACVHCRTTRTPLWRAGPAGPRSLCNACGIRYRKTKNNNNNSNGGVNNKMGK
GKKVGEGSLKVRVVSLGREIVEEAIGEEEQAAAMLLMALSSGYSRLTAIAPKSNWAFSLAQFQRLRRGGLTTCRTADPSVHANDDNDPAVLSGEPERSQDNLEPD
NAKANYEREDSKQEDSNGPLGPHKAQYASSPRLETTAVGQASKPITQQKRAQSTVIDDVSCVGVYGGPLEEAKTNRRTETKDQEEDNKDYYKHHKASPLAEIEFA
DTRKPITRATDGTAYDGAGKDVIGWLPEQLDTAEDSLRRATEIWKQNAMRGDPDAPQSRHAVGFVSGEEYYGAKASINVWAPRVTNHFQFTSMANLISIFLTVFL
FSLSKVSLLSFATVVVKDQVSCSMCSTCDNPCQLPPPPPPPPVVDCPPPPPPPSPPPPLPTSQCPPPPSPPSCDTCVYPSPPPPSSVQPYPPADGGQFPGLAPPP
PNPILPYFPYYYYSPPFASAKSVPFSWKFLPLPFLIILCL
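
The CDS and protein sequences above can be cross-checked against each other. The following is a minimal sketch, not part of the database record: it assumes the standard genetic code (NCBI translation table 1), and the names `translate` and `cds_matches_protein` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1), with codons
# enumerated in T, C, A, G order so they line up with the amino-acid string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS into one-letter amino acids ('*' marks a stop codon).

    Whitespace and line breaks are ignored, so the wrapped sequence block
    can be pasted in as-is. Codons not in the table (for example, ones
    containing N) are reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3)
    )


def cds_matches_protein(cds: str, protein: str) -> bool:
    """Return True if the CDS translates to the protein, ignoring one final stop."""
    translated = translate(cds)
    if translated.endswith("*"):
        translated = translated[:-1]
    expected = "".join(protein.split()).upper()
    return translated == expected


# First ten and last six codons of the CDS listed in this record.
print(translate("ATGCTCAAAGCTGAAGTAATTGGTGATTTT"))  # MLKAEVIGDF
print(translate("ATCATCCTTTGCCTCTAG"))              # IILCL*
```

Passing the full CDS block and the full protein block to `cds_matches_protein` would indicate whether the two listings agree under the assumed codon table.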