CuGenDBv2

Lag0038649 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0038649
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Reverse transcriptase
Genome location: chr2:22302916..22310858
RNA-Seq Expression: Lag0038649
Synteny: Lag0038649
Gene Ontology terms:
    GO:0003676 - nucleic acid binding (molecular function)
    GO:0003824 - catalytic activity (molecular function)
    GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
    IPR001878 - Zinc finger, CCHC-type
    IPR005162 - Retrotransposon gag domain
    IPR021109 - Aspartic peptidase domain superfamily
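The genome location above is given as a `chromosome:start..end` string. As a minimal sketch of how such a string could be split into its parts, the Python below parses it and computes the span length. The function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_location("chr2:22302916..22310858")

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the gene spans 7943 bp on chr2.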


Homology
GenBank top hits (e value, % identity, alignment)
XP_022156067.1 uncharacterized protein LOC111023035 [Momordica charantia] (e value: 4.3e-82; % identity: 51.67)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        TV+++K  EFL L QG++ V QY+RKFTELSRF    + T + KI +F+ GLR EI+G + LKE TT+ AA++ A +MDK + +   Q +  +GS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPT----HGNNQKTRVFALTKEKV
        R+    SS+  S   +H  Q Q  PP C  C K H G CW G +ICF+C KEGH+ R CP  G     N    G + PT     G  Q+ RVFALT+  V
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPT----HGNNQKTRVFALTKEKV

Query:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV
        E    VVT TILV  +PA+ LFDSG SHSF++S F + A +ELE LG+LLSVSTPS S+++  QVVK G++S  G  L  +LI+LDM+DFDVILGMDWL 
Subjt:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV

Query:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK
         N+A I+C+KKEV FRLP G NF FKG K
Subjt:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK

XP_022156328.1 LOW QUALITY PROTEIN: uncharacterized protein LOC111023249 [Momordica charantia] (e value: 4.4e-79; % identity: 51.54)
Query:  KDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ
        +++K  EFL L QG++ V QYERKFTELSRF    V T + KI +F+ GLR EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS SGVKR+
Subjt:  KDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ

Query:  --PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQA-PTHGNNQKTRVFALTKEKVEEVTT
            S++  S+  +H  Q Q  PPVC  C K H   CW G +ICFKC KEGH+ R C   G   T+  +QK   A  T G  Q  RVFALT+  VE    
Subjt:  --PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQA-PTHGNNQKTRVFALTKEKVEEVTT

Query:  VVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQAT
        VVT TIL+  IPA+ LFDSG SHSF++S F R A +ELE  G+ LSVSTPS S+++  QVVK G++S  G  L   LI+L+M+DFDVILGMDWL  N+A 
Subjt:  VVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQAT

Query:  IDCAKKEVCFRLPFGGNFKFKGAK
        I+C+KKEV F L  G NF FKG K
Subjt:  IDCAKKEVCFRLPFGGNFKFKGAK

XP_022156992.1 uncharacterized protein LOC111023821 [Momordica charantia] (e value: 2.0e-79; % identity: 50.46)
Query:  DDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ-
        ++K  EFL L QG++ V QYERKFTELSRF    +   + KI +F+ GL  EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS+SGVKR+ 
Subjt:  DDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ-

Query:  -PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQK-GLQAPTHGNNQKTRVFALTKEKVEEVTTV
           SS+ PS+  +H  Q Q  PPVC  C K H G CW G  IC++C KEGH+ R CP  G P T+   Q+  +     G   + RVFALT+  V     V
Subjt:  -PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQK-GLQAPTHGNNQKTRVFALTKEKVEEVTTV

Query:  VTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQATI
        V  T+LV  +PA+ LFDS  SHSF++S F R A +ELE LG+LLSVSTPS S+++  Q+VK G++S  G  L  +LI+LDM+DFDVILGMDWL  NQA I
Subjt:  VTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQATI

Query:  DCAKKEVCFRLPFGGNFKFKGAK
        DC+KKE  FRLP   NF FKG K
Subjt:  DCAKKEVCFRLPFGGNFKFKGAK

XP_022158750.1 uncharacterized protein LOC111025215 [Momordica charantia] (e value: 2.7e-84; % identity: 51.98)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        TV+++K  EFL L QG++ V +YERKFTELSRF    + T + KI +F+ GLR EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEP----GTENPAQKGLQAPTHGNNQKTRVFALTKEKV
        R+    SS+ PS+  +H  Q Q  PPVC  C K H G CW G +IC++C KEGH+ R CP  G      G   PA    Q  TH    + RVFALT+  V
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEP----GTENPAQKGLQAPTHGNNQKTRVFALTKEKV

Query:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV
        E    VVT T+LV  +PA+ LFDSG SHSF++S F   A +ELE LG+LLSVSTPS S+++  QVVK G++S  G  L  +LI+LDM+DFDVILGMDWL 
Subjt:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV

Query:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK
         N+A IDC+KK+V FRLP G NF FKG K
Subjt:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK

XP_022159077.1 uncharacterized protein LOC111025517 [Momordica charantia] (e value: 4.1e-77; % identity: 50.15)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        T+KD KE EFLH   G + V QYERKFTELS FA +L+ T   KIKRFVKGLR+ IRG V L+ P T+  A++G  IMD +VS  ++QP  EVGS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPTHGNNQKTRVFALTKEKVEEVT
        R+  P  ++ P +  +   Q Q  PPVC  C KR  GQCW+G++ CF+CG+EGH+ R C                 + T  N Q+        ++     
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPTHGNNQKTRVFALTKEKVEEVT

Query:  TVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQA
        +    T LV  +PA+VLFD G SH+F+S+ F RQA++ELEPLG+LLSVSTPS S+++A Q+V+ GE+S     L ARLI+LDM+DFDVILGMDWL  NQA
Subjt:  TVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQA

Query:  TIDCAKKEVCFRLPFGGNFKFKG
         I+C+K+EV F+LP G +F FKG
Subjt:  TIDCAKKEVCFRLPFGGNFKFKG

TrEMBL top hits (e value, % identity, alignment)
A0A6J1DQB9 Reverse transcriptase (e value: 2.1e-79; % identity: 51.54)
Query:  KDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ
        +++K  EFL L QG++ V QYERKFTELSRF    V T + KI +F+ GLR EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS SGVKR+
Subjt:  KDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ

Query:  --PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQA-PTHGNNQKTRVFALTKEKVEEVTT
            S++  S+  +H  Q Q  PPVC  C K H   CW G +ICFKC KEGH+ R C   G   T+  +QK   A  T G  Q  RVFALT+  VE    
Subjt:  --PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQA-PTHGNNQKTRVFALTKEKVEEVTT

Query:  VVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQAT
        VVT TIL+  IPA+ LFDSG SHSF++S F R A +ELE  G+ LSVSTPS S+++  QVVK G++S  G  L   LI+L+M+DFDVILGMDWL  N+A 
Subjt:  VVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQAT

Query:  IDCAKKEVCFRLPFGGNFKFKGAK
        I+C+KKEV F L  G NF FKG K
Subjt:  IDCAKKEVCFRLPFGGNFKFKGAK

A0A6J1DR22 uncharacterized protein LOC111023035 (e value: 2.1e-82; % identity: 51.67)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        TV+++K  EFL L QG++ V QY+RKFTELSRF    + T + KI +F+ GLR EI+G + LKE TT+ AA++ A +MDK + +   Q +  +GS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPT----HGNNQKTRVFALTKEKV
        R+    SS+  S   +H  Q Q  PP C  C K H G CW G +ICF+C KEGH+ R CP  G     N    G + PT     G  Q+ RVFALT+  V
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPT----HGNNQKTRVFALTKEKV

Query:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV
        E    VVT TILV  +PA+ LFDSG SHSF++S F + A +ELE LG+LLSVSTPS S+++  QVVK G++S  G  L  +LI+LDM+DFDVILGMDWL 
Subjt:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV

Query:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK
         N+A I+C+KKEV FRLP G NF FKG K
Subjt:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK

A0A6J1DTE5 uncharacterized protein LOC111023821 (e value: 9.6e-80; % identity: 50.46)
Query:  DDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ-
        ++K  EFL L QG++ V QYERKFTELSRF    +   + KI +F+ GL  EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS+SGVKR+ 
Subjt:  DDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQ-

Query:  -PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQK-GLQAPTHGNNQKTRVFALTKEKVEEVTTV
           SS+ PS+  +H  Q Q  PPVC  C K H G CW G  IC++C KEGH+ R CP  G P T+   Q+  +     G   + RVFALT+  V     V
Subjt:  -PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQK-GLQAPTHGNNQKTRVFALTKEKVEEVTTV

Query:  VTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQATI
        V  T+LV  +PA+ LFDS  SHSF++S F R A +ELE LG+LLSVSTPS S+++  Q+VK G++S  G  L  +LI+LDM+DFDVILGMDWL  NQA I
Subjt:  VTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQATI

Query:  DCAKKEVCFRLPFGGNFKFKGAK
        DC+KKE  FRLP   NF FKG K
Subjt:  DCAKKEVCFRLPFGGNFKFKGAK

A0A6J1DWP4 uncharacterized protein LOC111025215 (e value: 1.3e-84; % identity: 51.98)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        TV+++K  EFL L QG++ V +YERKFTELSRF    + T + KI +F+ GLR EI+G + LKEPTT+ AA++ A +MDK + +   Q +  +GS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEP----GTENPAQKGLQAPTHGNNQKTRVFALTKEKV
        R+    SS+ PS+  +H  Q Q  PPVC  C K H G CW G +IC++C KEGH+ R CP  G      G   PA    Q  TH    + RVFALT+  V
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEP----GTENPAQKGLQAPTHGNNQKTRVFALTKEKV

Query:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV
        E    VVT T+LV  +PA+ LFDSG SHSF++S F   A +ELE LG+LLSVSTPS S+++  QVVK G++S  G  L  +LI+LDM+DFDVILGMDWL 
Subjt:  EEVTTVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLV

Query:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK
         N+A IDC+KK+V FRLP G NF FKG K
Subjt:  ENQATIDCAKKEVCFRLPFGGNFKFKGAK

A0A6J1DYU5 uncharacterized protein LOC111025517 (e value: 2.0e-77; % identity: 50.15)
Query:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK
        T+KD KE EFLH   G + V QYERKFTELS FA +L+ T   KIKRFVKGLR+ IRG V L+ P T+  A++G  IMD +VS  ++QP  EVGS+SGVK
Subjt:  TVKDDKETEFLHLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVK

Query:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPTHGNNQKTRVFALTKEKVEEVT
        R+  P  ++ P +  +   Q Q  PPVC  C KR  GQCW+G++ CF+CG+EGH+ R C                 + T  N Q+        ++     
Subjt:  RQ--PTSSNYPSKNQRHQTQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPTHGNNQKTRVFALTKEKVEEVT

Query:  TVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQA
        +    T LV  +PA+VLFD G SH+F+S+ F RQA++ELEPLG+LLSVSTPS S+++A Q+V+ GE+S     L ARLI+LDM+DFDVILGMDWL  NQA
Subjt:  TVVTETILVFKIPAFVLFDSGLSHSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQA

Query:  TIDCAKKEVCFRLPFGGNFKFKG
         I+C+K+EV F+LP G +F FKG
Subjt:  TIDCAKKEVCFRLPFGGNFKFKG

SwissProt top hits
No hits found

Arabidopsis top hits
No hits found

Sequences
CDS sequence
ATGACTTGGCCGATCCATTCGCGTGCTGGCTGTGGGGGATATACGTTTAAGCTTTGCGGCGACGGCGTTTCTCTGGTGTTGCTAGTCGTCGACGGTGACTTGAAG
ATGGCGGGTCTTGATAGCCGGCGAAAAGAAATGGACGAAGAAGAAGAATATGGTGGTGAGAAGAAATGGACGAGAGAGAAAAAAGAAAATCCTGCTCCTCAGCAT
CACCAAGCTCGTAGTATTCCTCATCACCTGCAAAATCTGGTGGTGGGTCAGTACAAAACCACGGTACTGAATACTGTTAAGGACGACAAGGAAACCGAATTTTTG
CACTTGGTACAAGGCAACATGTTAGTGGTCCAATATGAAAGGAAGTTTACTGAACTTTCTCGATTTGCTCCGGACCTGGTCAACACTCCTGAGCGGAAGATTAAA
AGGTTTGTCAAAGGCTTGAGAGAGGAGATTCGAGGCTCAGTGGCTCTGAAAGAGCCTACCACATTTGTTGCAGCGCTCAAGGGCGCTTTTATTATGGATAAGAAT
GTTTCGAAAAAGGCTCTACAACCTCGCTGGGAGGTCGGTTCTACATCAGGGGTCAAGAGACAACCTACATCTTCGAATTACCCTAGCAAAAATCAGCGACATCAA
ACTCAAGGGCAGGCTCCTCCACCTGTGTGTAACATTTGCAATAAGCGTCATAATGGCCAATGTTGGTCTGGGCACCAAATTTGCTTTAAATGCGGAAAGGAAGGC
CATTACGTCAGACTCTGTCCTAACAAAGGTGAGCCAGGTACTGAGAATCCAGCCCAAAAGGGTCTTCAAGCGCCTACTCATGGGAACAATCAAAAGACGCGTGTC
TTTGCGCTCACTAAGGAAAAAGTTGAAGAAGTCACTACAGTGGTGACAGAGACTATATTGGTTTTTAAAATTCCTGCTTTTGTGTTATTTGACTCGGGGTTGAGT
CACTCTTTTGTTTCATCAAAATTCACTCGACAAGCAAGTATAGAATTAGAGCCTTTAGGCTATTTGCTTTCGGTGTCCACACCGTCAAGCTCAATAATGCTTGCC
CAGCAAGTGGTGAAGATGGGAGAAGTCTCTATTGCAGGACATATCCTTAGAGCAAGGTTAATTAAGTTGGATATGAAGGACTTTGATGTGATATTAGGCATGGAT
TGGTTAGTAGAGAATCAGGCGACAATTGATTGTGCTAAAAAAGAAGTTTGCTTTCGGTTACCCTTTGGTGGGAATTTCAAGTTTAAAGGCGCCAAGCGAGAAGGT
CTAAAGGTGCAACTTGTTTGCTGCAATTCTTTCCCTTCCAAGACTAGTTTTGGGTTCCCACAAGCCGGATCTCGATTCCCAAGAGTCATAGCGGGTGATCCGACA
TGGTGGTGTTCTTGGAGCATTGTGCAGTATTCTTCAAGTTCATCGGTCGACCGCTACTGTCGACCTCCGACAACCGGCCGCTATCAACCTCCACTAATCGGCCTC
CGTCGATTGACCGCTACAGTCAACCTCAAACTACCAGGCACCGCCGATCTGCCACTACTGGCGACCTCTGACTACCGGCAGCTGCCAACCTCCATCGAACGATCG
CCACCATCGACCTCCAACTATCGATCGTCGTTGACCCCGCTGACCAGCCTCCTCCGACTGACAGCTACTGTCGACCTCAGACTACCAGGCACTGCCTACCAGCCA
CTACCGTCGACCTCTGACTATCGACAACCGCCGACCTCCGCCGAACGACTGCTACCGTCGACCTTTGGCTACCAGGTGCTGCCAACCTCTTATTACTGTCGACCA
ACCATCGTCGATTGGCCACAATCGTTGACCTCCGACTATAGAGCAACCGACTGCCAACATCGATTGACGATTCTGCCGACCCCTGCTCCGTTAACTAGCTTCCAC
CGACAGAACGCTACCGTCGACCTCCGACTACCTGGCACCGTCGACCAACCACCATCGTCGACCTCCGACTACCGAACGTTGTTGACCCCTGTTTACCAGCCTCTG
CCGACTGACCGTTACCGTCGACCTCTAACTACCGACCAGCCACTACCGTCGACCTCTAACTACCGACAGCCGTCGACCTCCATCGAACGACTGCTACCGTCGACT
TCTCACTACCAGGCATTGTCGACCTCTACTGACCGACATATGGCGATCGATCGTCATCGTCGACCTTCGACTACCAGCCACCACCGACCAACCGCCACCGTCGAC
CTCTGA
mRNA sequence
ATGACTTGGCCGATCCATTCGCGTGCTGGCTGTGGGGGATATACGTTTAAGCTTTGCGGCGACGGCGTTTCTCTGGTGTTGCTAGTCGTCGACGGTGACTTGAAG
ATGGCGGGTCTTGATAGCCGGCGAAAAGAAATGGACGAAGAAGAAGAATATGGTGGTGAGAAGAAATGGACGAGAGAGAAAAAAGAAAATCCTGCTCCTCAGCAT
CACCAAGCTCGTAGTATTCCTCATCACCTGCAAAATCTGGTGGTGGGTCAGTACAAAACCACGGTACTGAATACTGTTAAGGACGACAAGGAAACCGAATTTTTG
CACTTGGTACAAGGCAACATGTTAGTGGTCCAATATGAAAGGAAGTTTACTGAACTTTCTCGATTTGCTCCGGACCTGGTCAACACTCCTGAGCGGAAGATTAAA
AGGTTTGTCAAAGGCTTGAGAGAGGAGATTCGAGGCTCAGTGGCTCTGAAAGAGCCTACCACATTTGTTGCAGCGCTCAAGGGCGCTTTTATTATGGATAAGAAT
GTTTCGAAAAAGGCTCTACAACCTCGCTGGGAGGTCGGTTCTACATCAGGGGTCAAGAGACAACCTACATCTTCGAATTACCCTAGCAAAAATCAGCGACATCAA
ACTCAAGGGCAGGCTCCTCCACCTGTGTGTAACATTTGCAATAAGCGTCATAATGGCCAATGTTGGTCTGGGCACCAAATTTGCTTTAAATGCGGAAAGGAAGGC
CATTACGTCAGACTCTGTCCTAACAAAGGTGAGCCAGGTACTGAGAATCCAGCCCAAAAGGGTCTTCAAGCGCCTACTCATGGGAACAATCAAAAGACGCGTGTC
TTTGCGCTCACTAAGGAAAAAGTTGAAGAAGTCACTACAGTGGTGACAGAGACTATATTGGTTTTTAAAATTCCTGCTTTTGTGTTATTTGACTCGGGGTTGAGT
CACTCTTTTGTTTCATCAAAATTCACTCGACAAGCAAGTATAGAATTAGAGCCTTTAGGCTATTTGCTTTCGGTGTCCACACCGTCAAGCTCAATAATGCTTGCC
CAGCAAGTGGTGAAGATGGGAGAAGTCTCTATTGCAGGACATATCCTTAGAGCAAGGTTAATTAAGTTGGATATGAAGGACTTTGATGTGATATTAGGCATGGAT
TGGTTAGTAGAGAATCAGGCGACAATTGATTGTGCTAAAAAAGAAGTTTGCTTTCGGTTACCCTTTGGTGGGAATTTCAAGTTTAAAGGCGCCAAGCGAGAAGGT
CTAAAGGTGCAACTTGTTTGCTGCAATTCTTTCCCTTCCAAGACTAGTTTTGGGTTCCCACAAGCCGGATCTCGATTCCCAAGAGTCATAGCGGGTGATCCGACA
TGGTGGTGTTCTTGGAGCATTGTGCAGTATTCTTCAAGTTCATCGGTCGACCGCTACTGTCGACCTCCGACAACCGGCCGCTATCAACCTCCACTAATCGGCCTC
CGTCGATTGACCGCTACAGTCAACCTCAAACTACCAGGCACCGCCGATCTGCCACTACTGGCGACCTCTGACTACCGGCAGCTGCCAACCTCCATCGAACGATCG
CCACCATCGACCTCCAACTATCGATCGTCGTTGACCCCGCTGACCAGCCTCCTCCGACTGACAGCTACTGTCGACCTCAGACTACCAGGCACTGCCTACCAGCCA
CTACCGTCGACCTCTGACTATCGACAACCGCCGACCTCCGCCGAACGACTGCTACCGTCGACCTTTGGCTACCAGGTGCTGCCAACCTCTTATTACTGTCGACCA
ACCATCGTCGATTGGCCACAATCGTTGACCTCCGACTATAGAGCAACCGACTGCCAACATCGATTGACGATTCTGCCGACCCCTGCTCCGTTAACTAGCTTCCAC
CGACAGAACGCTACCGTCGACCTCCGACTACCTGGCACCGTCGACCAACCACCATCGTCGACCTCCGACTACCGAACGTTGTTGACCCCTGTTTACCAGCCTCTG
CCGACTGACCGTTACCGTCGACCTCTAACTACCGACCAGCCACTACCGTCGACCTCTAACTACCGACAGCCGTCGACCTCCATCGAACGACTGCTACCGTCGACT
TCTCACTACCAGGCATTGTCGACCTCTACTGACCGACATATGGCGATCGATCGTCATCGTCGACCTTCGACTACCAGCCACCACCGACCAACCGCCACCGTCGAC
CTCTGA
Protein sequence
MTWPIHSRAGCGGYTFKLCGDGVSLVLLVVDGDLKMAGLDSRRKEMDEEEEYGGEKKWTREKKENPAPQHHQARSIPHHLQNLVVGQYKTTVLNTVKDDKETEFL
HLVQGNMLVVQYERKFTELSRFAPDLVNTPERKIKRFVKGLREEIRGSVALKEPTTFVAALKGAFIMDKNVSKKALQPRWEVGSTSGVKRQPTSSNYPSKNQRHQ
TQGQAPPPVCNICNKRHNGQCWSGHQICFKCGKEGHYVRLCPNKGEPGTENPAQKGLQAPTHGNNQKTRVFALTKEKVEEVTTVVTETILVFKIPAFVLFDSGLS
HSFVSSKFTRQASIELEPLGYLLSVSTPSSSIMLAQQVVKMGEVSIAGHILRARLIKLDMKDFDVILGMDWLVENQATIDCAKKEVCFRLPFGGNFKFKGAKREG
LKVQLVCCNSFPSKTSFGFPQAGSRFPRVIAGDPTWWCSWSIVQYSSSSSVDRYCRPPTTGRYQPPLIGLRRLTATVNLKLPGTADLPLLATSDYRQLPTSIERS
PPSTSNYRSSLTPLTSLLRLTATVDLRLPGTAYQPLPSTSDYRQPPTSAERLLPSTFGYQVLPTSYYCRPTIVDWPQSLTSDYRATDCQHRLTILPTPAPLTSFH
RQNATVDLRLPGTVDQPPSSTSDYRTLLTPVYQPLPTDRYRRPLTTDQPLPSTSNYRQPSTSIERLLPSTSHYQALSTSTDRHMAIDRHRRPSTTSHHRPTATVD
L
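As a quick consistency check between the CDS and protein sequences listed above, the following Python sketch translates a nucleotide string with the standard genetic code. It assumes the standard code applies to this gene, which the record does not state. The helper `translate` and the compact codon-table construction are illustrative and not part of CuGenDB.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0; stop codons are rendered as '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First six codons and last four codons of the CDS shown above.
print(translate("ATGACTTGGCCGATCCAT"))  # MTWPIH
print(translate("GTCGACCTCTGA"))        # VDL*
```

The two fragments translate to the first six residues (MTWPIH) and the last three residues plus the stop codon (VDL*) of the protein sequence above. Pasting in the full CDS with line breaks removed would allow the same comparison across the whole sequence.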