CuGenDBv2

ClCG01G013890 (gene) of Watermelon (Charleston Gray) v2.5 genome

Gene ID: ClCG01G013890
Organism: Citrullus lanatus subsp. vulgaris cv. Charleston Gray (Watermelon (Charleston Gray) v2.5)
Description: Reverse transcriptase
Genome location: CG_Chr01:27905985..27911668
RNA-Seq Expression: ClCG01G013890
Synteny: ClCG01G013890
Gene Ontology terms:
  GO:0006807 - nitrogen compound metabolic process (biological process)
  GO:0043170 - macromolecule metabolic process (biological process)
  GO:0044238 - primary metabolic process (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0003824 - catalytic activity (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
  IPR001878 - Zinc finger, CCHC-type
  IPR005162 - Retrotransposon gag domain
  IPR021109 - Aspartic peptidase domain superfamily
  IPR036875 - Zinc finger, CCHC-type superfamily
  IPR041577 - Reverse transcriptase/retrotransposon-derived protein, RNase H-like domain
  IPR043128 - Reverse transcriptase/Diguanylate cyclase domain
  IPR043502 - DNA/RNA polymerase superfamily
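
The genome location field packs the sequence name and both coordinates into one string. A minimal Python sketch for splitting it and computing the gene span follows; the function name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which the record itself does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


seqid, start, end = parse_location("CG_Chr01:27905985..27911668")
# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(seqid, start, end, span)  # CG_Chr01 27905985 27911668 5684
```

Under that assumption the gene spans 5684 bp on CG_Chr01.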


Homology
GenBank top hits (e-value, % identity, alignment)
XP_022155000.1 uncharacterized protein LOC111022144 [Momordica charantia]; e-value 1.4e-65; identity 48.81%
Query:  FIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDLLYDYYFPDTVKDDKETEF
        FI++FKRYGPPTF G SE+ATAAE+WI +LE+ + YL CED  KV+GA+FMLR EA                      FKDLLYDYY+ +TVKD KE EF
Subjt:  FIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDLLYDYYFPDTVKDDKETEF

Query:  LHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALS
        LHL QG++SV Q ERKFTELSRFA +L+  +  +IKRF++GL + IRG V L+ P ++A A+R  LIMDK+ + K     S  E  +SS  KRK  P  +
Subjt:  LHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALS

Query:  DQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPT-ATQGGKQKAHIFALTKKEAEDTD
        D + +A    +      P+C  C K H  QCW   + CF+CG+E HFAR CP    ANT +   +  PT +TQG  Q+A +FALT+KEA D +
Subjt:  DQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPT-ATQGGKQKAHIFALTKKEAEDTD

XP_022155925.1 uncharacterized protein LOC111022925 [Momordica charantia]; e-value 1.5e-56; identity 45.89%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     I  E   FIR+FKR+GPP F G SE+ TAAE+W+ +LE+L+ YL C D  KVRGA+FML+ EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP TV+++K  EFL LTQ S+ V Q ERKFTELSRF    + T + +I +FI GL  +I+G++ LKEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGG
            +SS  KRK     S Q S+ H     +  T P C  C K+H   CW+ +RIC++C KEGHFAR C   G +NT     +   TAT  G
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGG

XP_022156326.1 uncharacterized protein LOC111023247 [Momordica charantia]; e-value 2.5e-67; identity 46.43%
Query:  MPP-HGWRPRGGLDMPALPGDGANRQQRVNPP----IPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA-
        MPP H  R R   D    P  G     +  PP     P     FI++FKRYGPPTF G SE+ATA E+WI +LE+L+ YL CED  KV+GA+FMLR EA 
Subjt:  MPP-HGWRPRGGLDMPALPGDGANRQQRVNPP----IPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA-

Query:  ---------------------FKDLLYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTT
                             FK+LLYDYY+P+TVKD KE EFLHL QG++SV Q ERKFTELSRFA +L+ T   +IKRF++GL + IRG V L+ PTT
Subjt:  ---------------------FKDLLYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTT

Query:  FAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEA
        +A A+R  L+MDK+ + K        E  +SS  KRK P   +D   +A    +      P+C  C K H  QCW   + CF+CG+EGHFAR CP    A
Subjt:  FAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEA

Query:  NTDKPTPK-ALPTATQGGKQKAHIFALTKKEAEDTD
        NT +   +   P +TQG  Q+A +FALT+KEA D +
Subjt:  NTDKPTPK-ALPTATQGGKQKAHIFALTKKEAEDTD

XP_022156328.1 LOW QUALITY PROTEIN: uncharacterized protein LOC111023249 [Momordica charantia]; e-value 9.3e-62; identity 46.01%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     IP +   FIR+FK +GPP F G SE+ TAAE+W+ +LE+L+ YL C D  KVRGA+FMLR EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP   +++K  EFL LTQGS++V Q ERKFTELSRF    V T + +I +FI GL  +I+G++ LKEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANT---DKPTPKALPTATQGGKQKAH
            ++S  KRK     + Q+S+ H   + +    P+C  C K+H   CWL ++ICFKC KEGHF R C   G +NT    + TP A  TATQGG Q A 
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANT---DKPTPKALPTATQGGKQKAH

Query:  IFALTKKEAEDTD
        +FALT+ + E  +
Subjt:  IFALTKKEAEDTD

XP_022157413.1 uncharacterized protein LOC111024114 [Momordica charantia]; e-value 5.8e-56; identity 43.55%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     IP +   FIR+FKR+GPP F G SE+ TA E+W+ +LE+L+ YL C D  KVRGA+FMLR EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP TV+++K  EFL LTQGS++V Q ERKFTELSRF    + T + +I +FI GL  +I+G++ +KEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGGKQKAHIFA
            +SS  KRK     S Q+S+ H     +    P+C  C K+H   CWL +RICF+C K                   TP A   A QGG Q+A +FA
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGGKQKAHIFA

Query:  LTKKEAEDTD
        LT+ + E  +
Subjt:  LTKKEAEDTD

TrEMBL top hits (e-value, % identity, alignment)
A0A6J1DL73 uncharacterized protein LOC111022144; e-value 6.7e-66; identity 48.81%
Query:  FIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDLLYDYYFPDTVKDDKETEF
        FI++FKRYGPPTF G SE+ATAAE+WI +LE+ + YL CED  KV+GA+FMLR EA                      FKDLLYDYY+ +TVKD KE EF
Subjt:  FIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDLLYDYYFPDTVKDDKETEF

Query:  LHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALS
        LHL QG++SV Q ERKFTELSRFA +L+  +  +IKRF++GL + IRG V L+ P ++A A+R  LIMDK+ + K     S  E  +SS  KRK  P  +
Subjt:  LHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALS

Query:  DQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPT-ATQGGKQKAHIFALTKKEAEDTD
        D + +A    +      P+C  C K H  QCW   + CF+CG+E HFAR CP    ANT +   +  PT +TQG  Q+A +FALT+KEA D +
Subjt:  DQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPT-ATQGGKQKAHIFALTKKEAEDTD

A0A6J1DNV8 uncharacterized protein LOC111022925; e-value 7.4e-57; identity 45.89%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     I  E   FIR+FKR+GPP F G SE+ TAAE+W+ +LE+L+ YL C D  KVRGA+FML+ EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP TV+++K  EFL LTQ S+ V Q ERKFTELSRF    + T + +I +FI GL  +I+G++ LKEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGG
            +SS  KRK     S Q S+ H     +  T P C  C K+H   CW+ +RIC++C KEGHFAR C   G +NT     +   TAT  G
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGG

A0A6J1DQB9 Reverse transcriptase; e-value 4.5e-62; identity 46.01%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     IP +   FIR+FK +GPP F G SE+ TAAE+W+ +LE+L+ YL C D  KVRGA+FMLR EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP   +++K  EFL LTQGS++V Q ERKFTELSRF    V T + +I +FI GL  +I+G++ LKEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANT---DKPTPKALPTATQGGKQKAH
            ++S  KRK     + Q+S+ H   + +    P+C  C K+H   CWL ++ICFKC KEGHF R C   G +NT    + TP A  TATQGG Q A 
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANT---DKPTPKALPTATQGGKQKAH

Query:  IFALTKKEAEDTD
        +FALT+ + E  +
Subjt:  IFALTKKEAEDTD

A0A6J1DTA8 uncharacterized protein LOC111024114; e-value 2.8e-56; identity 43.55%
Query:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL
        GA  QQ     IP +   FIR+FKR+GPP F G SE+ TA E+W+ +LE+L+ YL C D  KVRGA+FMLR EA                      FKDL
Subjt:  GANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA----------------------FKDL

Query:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR
        LY+YYFP TV+++K  EFL LTQGS++V Q ERKFTELSRF    + T + +I +FI GL  +I+G++ +KEPTT+AAA+R  L+MDK   ++PQ   S+
Subjt:  LYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMDKNAAKKPQATHSR

Query:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGGKQKAHIFA
            +SS  KRK     S Q+S+ H     +    P+C  C K+H   CWL +RICF+C K                   TP A   A QGG Q+A +FA
Subjt:  WEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGGKQKAHIFA

Query:  LTKKEAEDTD
        LT+ + E  +
Subjt:  LTKKEAEDTD

A0A6J1DUM2 uncharacterized protein LOC111023247; e-value 1.2e-67; identity 46.43%
Query:  MPP-HGWRPRGGLDMPALPGDGANRQQRVNPP----IPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA-
        MPP H  R R   D    P  G     +  PP     P     FI++FKRYGPPTF G SE+ATA E+WI +LE+L+ YL CED  KV+GA+FMLR EA 
Subjt:  MPP-HGWRPRGGLDMPALPGDGANRQQRVNPP----IPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYLNCEDHLKVRGAIFMLRDEA-

Query:  ---------------------FKDLLYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTT
                             FK+LLYDYY+P+TVKD KE EFLHL QG++SV Q ERKFTELSRFA +L+ T   +IKRF++GL + IRG V L+ PTT
Subjt:  ---------------------FKDLLYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTT

Query:  FAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEA
        +A A+R  L+MDK+ + K        E  +SS  KRK P   +D   +A    +      P+C  C K H  QCW   + CF+CG+EGHFAR CP    A
Subjt:  FAAALRATLIMDKNAAKKPQATHSRWEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEA

Query:  NTDKPTPK-ALPTATQGGKQKAHIFALTKKEAEDTD
        NT +   +   P +TQG  Q+A +FALT+KEA D +
Subjt:  NTDKPTPK-ALPTATQGGKQKAHIFALTKKEAEDTD

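
In the alignment blocks above, the unlabelled middle line marks each column: a letter for an identical residue, `+` for a conservative substitution, and a space for a mismatch or gap. A small Python sketch of how identities and positives can be tallied from one block is given below. The function name `score_block` is hypothetical, and the example strings are the final block of the XP_022156328.1 alignment with the 8-character `Query:  ` prefix removed. The reported % identity covers a whole alignment, so a single block will not reproduce it.

```python
def score_block(query: str, midline: str) -> tuple[int, int, int]:
    """Return (identities, positives, columns) for one alignment block.

    Letters in the midline are identical residues; '+' marks a
    conservative substitution. Positives include identities.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives, len(query)


query = "IFALTKKEAEDTD"
midline = "+FALT+ + E  +"
identities, positives, columns = score_block(query, midline)
print(identities, positives, columns)  # 5 9 13
```

Summing the three counts over every block of a hit and dividing identities by columns gives a percent identity for that alignment as displayed here.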
SwissProt top hits
No hits found

Arabidopsis top hits
No hits found

Sequences
CDS sequence
ATGAGCAATTTAGAAAGGCAAGATGGAAAGCTTCCTTCTCAACCTGAGTCCGATGAGGATCATACTCCACTTGAGCCCACTTCCAGAGGAGGACCACCTCCTAAATCTAC
AAACGGAGTAAAATTTGACCCTCCTGTTTCTCTGAACTCTAATGTGTCTAAAGCTTCTTTTTCTTCTAGGTTGACAAAACCCAAGGTGGATGAAGTTGACAGAGAACTTC
TTGGGTCATGCTTGATTCAGTTGGATGAATGTTCGTTTATTCATCCTTTAGGTGTAGTAGAAGACGTTTTAGTGCAAGTAAATGAGCTAATATTTCCTAAAGATTTTTAC
ATGCTAAAAATGGAAGAGTCTAGTTCTCCTTCATCTCCATCCATTTTGCTTGGTCGCCCCTTCATGAAGACGACCAAAACAAAAATAGATGTTGATGAAGGAACGTTATT
GGTCAAGTTTGACAGGGAGATTATTGGCGTTATAGATTCTTTGGTACAGGAAGTAATTTGGGACACCTACGATGATGAGGATGAGGATGAAAATTATGGAGAATGCACGA
TGCTCCTACTAGAATTAAAACAACTCCCTGAGCACTTGAAGAATGCTTATCTTAGAGAGGAGATAGCATTAAAAGATGGGTCCAAGCCATGCGATCAGTCCCAAAGGCGT
TTAAACTTAGCCTTGAGGGAGGTCGTGATGAAAAAGATCTTTAAGTTGCAGGAAGCAGGTAGTATCTATCTTATTTCTGACAGTGAATGGAAAACAGGTATCCCTGTAGT
TAGAAATGAAAAGCTAGATGTTCCTGTTAAGTTTCAAAACGAGTGGAGAATGTGTATTGATTTTCGGAAGTTGAATAGTGGAAGCTCTTTTCCTGTTTCCTTGATGGTTT
TTCAAGATTTTACCAAATTCCAATTTATCGGGATGATTATGAAAAGACGACATTCACTTGTCCATTTGGAACTTTTGCATTCAGACGGATTCCTTTCGTTCTATGCAATG
CCCCAGGCACATTTCAGAGATGACATGTTATTTCTGAACAGGGAATTGAAGTTGATTAAGCAAAGATTGATGTTATTGTTAGCCTCTCATACGCCACAAATGTTCATTAA
ACATTTCAATAAGATTGCATTGCTGATGACCACCTTGCTGCAAAAAGATATAGAGTTTAGTTTCAATGATGAATTCAAGCAAGCGTTTGGCAAAATTAAGGCGGCCCTAG
ATAGTGCTCCAATTGTGCAAGCTCCTAGGTGGGATTTCCCATTTAAGATTATGTGCAATGTGAGCAACTACGCTGTGGGAGCTGCCTTGGGCCAAAGGGAGTGCTTCTCT
GAAACAGGTTATAGGAAGAATTTGGGATTTGTTGTGGCTGCTAGATATGTGGTTAAGTTGCCTAGAGTTAATGGTCTTCCTCAATCTCCTCTCCATCACCAGACGATGCC
GCCACATGGATGGAGACCTAGAGGAGGCTTAGACATGCCTGCGCTTCCCGGTGACGGAGCAAACAGGCAGCAAAGGGTGAACCCTCCCATCCCCCCAGAAGTTCCCCCAT
TTATAAGGAATTTCAAGCGCTATGGGCCTCCGACCTTCGGCGGTGGGTCAGAGAAAGCTACGGCAGCTGAGCAGTGGATTGTAAAGCTGGAGTCATTGTTTGAGTACCTA
AATTGCGAGGATCATCTTAAGGTCAGAGGAGCAATTTTCATGCTTCGAGACGAGGCGTTTAAAGACCTTCTATATGACTACTACTTTCCTGACACAGTGAAGGATGACAA
GGAAACAGAGTTCCTGCATTTGACCCAGGGTAGCATGTCGGTGATCCAAGATGAAAGGAAGTTCACTGAGCTGTCTCGTTTTGCACCCGACCTGGTGAGTACGTCAGAGA
GGAGGATTAAGAGGTTCATCAGGGGCCTGTGCGAGAAAATTAGAGGTGTGGTCGCCTTAAAGGAGCCGACGACTTTTGCTGCAGCGCTTAGGGCCACCCTGATCATGGAC
AAAAATGCGGCTAAGAAACCTCAGGCGACACACTCACGTTGGGAGGCTAGTGCCTCATCTGAATTTAAAAGGAAGTCTCCCCCAGCTCTGTCAGATCAAACTTCCAAGGC
CCATCATCCGACCTCGGGTCAAGCTATCACCCTCCCATTGTGTAGCTTGTGCAACAAGCATCACTTGTGGCAATGCTGGCTAGACCAAAGGATTTGCTTCAAGTGTGGAA
AGGAAGGTCACTTTGCAAGAATGTGCCCAAGTAAAGGGGAGGCCAACACAGACAAGCCGACCCCGAAAGCCCTACCAACAGCTACTCAAGGAGGAAAACAAAAGGCACAC
ATCTTTGCACTGACCAAAAAGGAGGCTGAGGATACGGATTTTGGAACAACGGTTCCAATGCAATTTGAATCACTCAAATCGGAGTTCAAACGAAGAAGATATGGCCAAAA
CAAGCTTAATGGGAAATTCCCAAACTGA
mRNA sequence
ATGAGCAATTTAGAAAGGCAAGATGGAAAGCTTCCTTCTCAACCTGAGTCCGATGAGGATCATACTCCACTTGAGCCCACTTCCAGAGGAGGACCACCTCCTAAATCTAC
AAACGGAGTAAAATTTGACCCTCCTGTTTCTCTGAACTCTAATGTGTCTAAAGCTTCTTTTTCTTCTAGGTTGACAAAACCCAAGGTGGATGAAGTTGACAGAGAACTTC
TTGGGTCATGCTTGATTCAGTTGGATGAATGTTCGTTTATTCATCCTTTAGGTGTAGTAGAAGACGTTTTAGTGCAAGTAAATGAGCTAATATTTCCTAAAGATTTTTAC
ATGCTAAAAATGGAAGAGTCTAGTTCTCCTTCATCTCCATCCATTTTGCTTGGTCGCCCCTTCATGAAGACGACCAAAACAAAAATAGATGTTGATGAAGGAACGTTATT
GGTCAAGTTTGACAGGGAGATTATTGGCGTTATAGATTCTTTGGTACAGGAAGTAATTTGGGACACCTACGATGATGAGGATGAGGATGAAAATTATGGAGAATGCACGA
TGCTCCTACTAGAATTAAAACAACTCCCTGAGCACTTGAAGAATGCTTATCTTAGAGAGGAGATAGCATTAAAAGATGGGTCCAAGCCATGCGATCAGTCCCAAAGGCGT
TTAAACTTAGCCTTGAGGGAGGTCGTGATGAAAAAGATCTTTAAGTTGCAGGAAGCAGGTAGTATCTATCTTATTTCTGACAGTGAATGGAAAACAGGTATCCCTGTAGT
TAGAAATGAAAAGCTAGATGTTCCTGTTAAGTTTCAAAACGAGTGGAGAATGTGTATTGATTTTCGGAAGTTGAATAGTGGAAGCTCTTTTCCTGTTTCCTTGATGGTTT
TTCAAGATTTTACCAAATTCCAATTTATCGGGATGATTATGAAAAGACGACATTCACTTGTCCATTTGGAACTTTTGCATTCAGACGGATTCCTTTCGTTCTATGCAATG
CCCCAGGCACATTTCAGAGATGACATGTTATTTCTGAACAGGGAATTGAAGTTGATTAAGCAAAGATTGATGTTATTGTTAGCCTCTCATACGCCACAAATGTTCATTAA
ACATTTCAATAAGATTGCATTGCTGATGACCACCTTGCTGCAAAAAGATATAGAGTTTAGTTTCAATGATGAATTCAAGCAAGCGTTTGGCAAAATTAAGGCGGCCCTAG
ATAGTGCTCCAATTGTGCAAGCTCCTAGGTGGGATTTCCCATTTAAGATTATGTGCAATGTGAGCAACTACGCTGTGGGAGCTGCCTTGGGCCAAAGGGAGTGCTTCTCT
GAAACAGGTTATAGGAAGAATTTGGGATTTGTTGTGGCTGCTAGATATGTGGTTAAGTTGCCTAGAGTTAATGGTCTTCCTCAATCTCCTCTCCATCACCAGACGATGCC
GCCACATGGATGGAGACCTAGAGGAGGCTTAGACATGCCTGCGCTTCCCGGTGACGGAGCAAACAGGCAGCAAAGGGTGAACCCTCCCATCCCCCCAGAAGTTCCCCCAT
TTATAAGGAATTTCAAGCGCTATGGGCCTCCGACCTTCGGCGGTGGGTCAGAGAAAGCTACGGCAGCTGAGCAGTGGATTGTAAAGCTGGAGTCATTGTTTGAGTACCTA
AATTGCGAGGATCATCTTAAGGTCAGAGGAGCAATTTTCATGCTTCGAGACGAGGCGTTTAAAGACCTTCTATATGACTACTACTTTCCTGACACAGTGAAGGATGACAA
GGAAACAGAGTTCCTGCATTTGACCCAGGGTAGCATGTCGGTGATCCAAGATGAAAGGAAGTTCACTGAGCTGTCTCGTTTTGCACCCGACCTGGTGAGTACGTCAGAGA
GGAGGATTAAGAGGTTCATCAGGGGCCTGTGCGAGAAAATTAGAGGTGTGGTCGCCTTAAAGGAGCCGACGACTTTTGCTGCAGCGCTTAGGGCCACCCTGATCATGGAC
AAAAATGCGGCTAAGAAACCTCAGGCGACACACTCACGTTGGGAGGCTAGTGCCTCATCTGAATTTAAAAGGAAGTCTCCCCCAGCTCTGTCAGATCAAACTTCCAAGGC
CCATCATCCGACCTCGGGTCAAGCTATCACCCTCCCATTGTGTAGCTTGTGCAACAAGCATCACTTGTGGCAATGCTGGCTAGACCAAAGGATTTGCTTCAAGTGTGGAA
AGGAAGGTCACTTTGCAAGAATGTGCCCAAGTAAAGGGGAGGCCAACACAGACAAGCCGACCCCGAAAGCCCTACCAACAGCTACTCAAGGAGGAAAACAAAAGGCACAC
ATCTTTGCACTGACCAAAAAGGAGGCTGAGGATACGGATTTTGGAACAACGGTTCCAATGCAATTTGAATCACTCAAATCGGAGTTCAAACGAAGAAGATATGGCCAAAA
CAAGCTTAATGGGAAATTCCCAAACTGA
Protein sequence
MSNLERQDGKLPSQPESDEDHTPLEPTSRGGPPPKSTNGVKFDPPVSLNSNVSKASFSSRLTKPKVDEVDRELLGSCLIQLDECSFIHPLGVVEDVLVQVNELIFPKDFY
MLKMEESSSPSSPSILLGRPFMKTTKTKIDVDEGTLLVKFDREIIGVIDSLVQEVIWDTYDDEDEDENYGECTMLLLELKQLPEHLKNAYLREEIALKDGSKPCDQSQRR
LNLALREVVMKKIFKLQEAGSIYLISDSEWKTGIPVVRNEKLDVPVKFQNEWRMCIDFRKLNSGSSFPVSLMVFQDFTKFQFIGMIMKRRHSLVHLELLHSDGFLSFYAM
PQAHFRDDMLFLNRELKLIKQRLMLLLASHTPQMFIKHFNKIALLMTTLLQKDIEFSFNDEFKQAFGKIKAALDSAPIVQAPRWDFPFKIMCNVSNYAVGAALGQRECFS
ETGYRKNLGFVVAARYVVKLPRVNGLPQSPLHHQTMPPHGWRPRGGLDMPALPGDGANRQQRVNPPIPPEVPPFIRNFKRYGPPTFGGGSEKATAAEQWIVKLESLFEYL
NCEDHLKVRGAIFMLRDEAFKDLLYDYYFPDTVKDDKETEFLHLTQGSMSVIQDERKFTELSRFAPDLVSTSERRIKRFIRGLCEKIRGVVALKEPTTFAAALRATLIMD
KNAAKKPQATHSRWEASASSEFKRKSPPALSDQTSKAHHPTSGQAITLPLCSLCNKHHLWQCWLDQRICFKCGKEGHFARMCPSKGEANTDKPTPKALPTATQGGKQKAH
IFALTKKEAEDTDFGTTVPMQFESLKSEFKRRRYGQNKLNGKFPN
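
The CDS and protein sequences can be cross-checked by translating codons with the standard genetic code. The Python sketch below is illustrative only: the names `CODON_TABLE` and `translate` are hypothetical, the standard nuclear code is assumed, and only the first 30 nucleotides of the CDS above are used, so it says nothing about the rest of the record.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon.

    Whitespace and line breaks are ignored; a trailing partial codon is
    dropped. Ambiguous bases such as N raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    residues = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First ten codons of the CDS sequence listed above.
print(translate("ATGAGCAATTTAGAAAGGCAAGATGGAAAG"))  # MSNLERQDGK
```

The output matches the first ten residues of the protein sequence, MSNLERQDGK.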