CuGenDBv2

Lag0041430 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0041430
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Reverse transcriptase
Genome location: chr13:17648607..17657698
RNA-Seq Expression: Lag0041430
Synteny: Lag0041430
Gene Ontology terms:
    GO:0015074 - DNA integration (biological process)
    GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
    IPR001584 - Integrase, catalytic core
    IPR012337 - Ribonuclease H-like superfamily
    IPR025724 - GAG-pre-integrase domain
    IPR029472 - Retrotransposon Copia-like, N-terminal
    IPR036397 - Ribonuclease H superfamily
    IPR041373 - Reverse transcriptase, RNase H-like domain
    IPR043128 - Reverse transcriptase/Diguanylate cyclase domain
    IPR043502 - DNA/RNA polymerase superfamily


Homology
GenBank top hits (e-value, % identity, alignment)
KAA8524269.1 hypothetical protein F0562_010692 [Nyssa sinensis], e-value 1.2e-127, 41.83% identity
Query:  MASSSSSL-SESVSASFLQPNTS---IFLLSNICNLVPIRLDSTNYLFWKFQVESMLRAHSLFDIVDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDH
        MA+++ +L + S +++   PN S   IFLLSNICNL+  RLDS+NY+ WKFQ+ S+L+AHSL   +DGT PCP KF+ D  G     +N  Y  W  QD 
Subjt:  MASSSSSL-SESVSASFLQPNTS---IFLLSNICNLVPIRLDSTNYLFWKFQVESMLRAHSLFDIVDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDH

Query:  TLITLINVTLSKQAFSFVVRCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAESIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFN
         L+TL+N TLS+ A S V+   +S+E W AL + FS+ TRS+I +LKSALH +SK   +SID Y+ +IK+  D L +VSV ++DEDIL+Y LNG P ++N
Subjt:  TLITLINVTLSKQAFSFVVRCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAESIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFN

Query:  SFRTSIRTRKDSVTLDELHSLLKSEAKFIEQQNKIVANPLFNPTAMYA-------NLGRGSTSSGFRGRGRSNQGRGFSPGNSNPVQGRGSSGNFSPNPA
        +F+TSIRT+ +++TL+E++++LK E + IE  +K   +P F P AM A       +  RG + S F GRGR  +GR  + G      GR  S NF  +  
Subjt:  SFRTSIRTRKDSVTLDELHSLLKSEAKFIEQQNKIVANPLFNPTAMYA-------NLGRGSTSSGFRGRGRSNQGRGFSPGNSNPVQGRGSSGNFSPNPA

Query:  SSNSVNAGCGSNGNTPNNQGSSNSGQGRVICQICNRPGHGALDCFNRLNLSYQGRYPPSKLVSMAVANDPSSTTST--WLADSGCNIHVTHNSSNLALNS
             N    +     +NQ S+NS    V+CQICN+ GH ALDC++R++ SYQG+ P  +L +M+   +  S  S   W  D+G   H+T + +NL    
Subjt:  SSNSVNAGCGSNGNTPNNQGSSNSGQGRVICQICNRPGHGALDCFNRLNLSYQGRYPPSKLVSMAVANDPSSTTST--WLADSGCNIHVTHNSSNLALNS

Query:  NYNGEEAITLANGQAFPVAQAGFGTLSTSQNDLHLSNLFCVPDLTTNLLLVSQCCIDNNCIFVFDAEWFSIQDKPSGRVLYMSKSRDGLYPISAAVKTLS
         Y G++ IT+ANGQA  ++ +G  ++  + +   L+N+ CVP + TNLL V Q C DN+C F+FD+E F IQDK + ++L+   S  GLYP+  +  T  
Subjt:  NYNGEEAITLANGQAFPVAQAGFGTLSTSQNDLHLSNLFCVPDLTTNLLLVSQCCIDNNCIFVFDAEWFSIQDKPSGRVLYMSKSRDGLYPISAAVKTLS

Query:  STTSL--------LNASFLNHVPV----------CATTSRTSTTDLWHYRLGHPSTVVLHKLLSTYSIA---HDAPLQNKDYISCLQGKMTKLPFPLSIS
        S  SL         N    NH P+           A   +  +T LWH RLGHPST  L  +LS+ SI      APL       CL GKMTKLPFPLS +
Subjt:  STTSL--------LNASFLNHVPV----------CATTSRTSTTDLWHYRLGHPSTVVLHKLLSTYSIA---HDAPLQNKDYISCLQGKMTKLPFPLSIS

Query:  ESHAPLELIHSDFWGPSPSLSVSCFKYY----------------------------GITHQRSCPYTPEQNGVVEHKHRSIVDIALSLMFHASVLLEF
        ES APL+L+HSD WGP+P  S   F YY                            GI H+RSCP+TP+QNG+ E KHR IV+  L+L+  AS+ L++
Subjt:  ESHAPLELIHSDFWGPSPSLSVSCFKYY----------------------------GITHQRSCPYTPEQNGVVEHKHRSIVDIALSLMFHASVLLEF

XP_038972405.1 uncharacterized protein LOC120104748 [Phoenix dactylifera], e-value 1.9e-117, 49.71% identity
Query:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----
        +R I  S+ + R + +  D H  I+E     + P++ E    ++    D    +Y   DS W+SPVQ V KKGG+T+V N++NELIPTRTVTGWR+    
Subjt:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----

Query:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------
                                   AYYCFLDGYSGYNQI+I+PEDQEKTTF    G     R+  G          C++A                 
Subjt:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------

Query:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN
              F     + S VL+RCE+T LV+NWEKCHFMV+EGIVLGH+IS  GLEVDRAKIE+IE+L PP +VKG++SFLGH GFYRRFIKDFSKISKPLCN
Subjt:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN

Query:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY
        LL  D VFDF+ DC  AF  LK  L+SAPI+               S   + A  GQ     +H IYYASRVLN AQ+NY T EKELLA+VFAF KFR Y
Subjt:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY

Query:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---
        LVGSKV V+TDH+ I+YL  KKDAKPRL RWVLLLQEFDLEI DK+G ENV+ADHL RL+   S   +  I +SFPDEQL AV         VN+L    
Subjt:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---

Query:  ----MDWRQKKKFKHDI
            + + QKKKF  D+
Subjt:  ----MDWRQKKKFKHDI

XP_038973683.1 uncharacterized protein LOC120105384 [Phoenix dactylifera], e-value 1.9e-117, 49.71% identity
Query:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----
        +R I  S+ + R + +  D H  I+E     + P++ E    ++    D    +Y   DS W+SPVQ V KKGG+T+V N++NELIPTRTVTGWR+    
Subjt:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----

Query:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------
                                   AYYCFLDGYSGYNQI+I+PEDQEKTTF    G     R+  G          C++A                 
Subjt:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------

Query:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN
              F     + S VL+RCE+T LV+NWEKCHFMV+EGIVLGH+IS  GLEVDRAKIE+IE+L PP +VKG++SFLGH GFYRRFIKDFSKISKPLCN
Subjt:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN

Query:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY
        LL  D VFDF+ DC  AF  LK  L+SAPI+               S   + A  GQ     +H IYYASRVLN AQ+NY T EKELLA+VFAF KFR Y
Subjt:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY

Query:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---
        LVGSKV V+TDH+ I+YL  KKDAKPRL RWVLLLQEFDLEI DK+G ENV+ADHL RL+   S   +  I +SFPDEQL AV         VN+L    
Subjt:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---

Query:  ----MDWRQKKKFKHDI
            + + QKKKF  D+
Subjt:  ----MDWRQKKKFKHDI

XP_038976300.1 uncharacterized protein LOC120107204 [Phoenix dactylifera], e-value 2.5e-117, 49.52% identity
Query:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----
        +R I  S+ + R + +  D H  I+E     + P++ E    ++    D    +Y   DS W+SPVQ V KKGG+T+V N++NELIPTRTVTGWR+    
Subjt:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----

Query:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------
                                   AYYCFLDGYSGYNQI+I+PEDQEKTTF    G     R+  G          C++A                 
Subjt:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------

Query:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN
              F     + S VL+RCE+T LV+NWEKCHFMV+EGI+LGH+IS  GLEVDRAKIE+IE+L PP +VKG++SFLGH GFYRRFIKDFSKISKPLCN
Subjt:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN

Query:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY
        LL  D VFDF+ DC  AF  LK  L+SAPI+               S   + A  GQ     +H IYYASRVLN AQ+NY T EKELLA+VFAF KFR Y
Subjt:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY

Query:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---
        LVGSKV V+TDH+ I+YL  KKDAKPRL RWVLLLQEFDLEI DK+G ENV+ADHL RL+   S   +  I +SFPDEQL AV         VN+L    
Subjt:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---

Query:  ----MDWRQKKKFKHDI
            + + QKKKF  D+
Subjt:  ----MDWRQKKKFKHDI

XP_038976409.1 uncharacterized protein LOC113461320 [Phoenix dactylifera], e-value 1.9e-117, 49.71% identity
Query:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----
        +R I  S+ + R + +  D H  I+E     + P++ E    ++    D    +Y   DS W+SPVQ V KKGG+T+V N++NELIPTRTVTGWR+    
Subjt:  IRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----

Query:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------
                                   AYYCFLDGYSGYNQI+I+PEDQEKTTF    G     R+  G          C++A                 
Subjt:  ------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA-----------------

Query:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN
              F     + S VL+RCE+T LV+NWEKCHFMV+EGIVLGH+IS  GLEVDRAKIE+IE+L PP +VKG++SFLGH GFYRRFIKDFSKISKPLCN
Subjt:  ------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCN

Query:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY
        LL  D VFDF+ DC  AF  LK  L+SAPI+               S   + A  GQ     +H IYYASRVLN AQ+NY T EKELLA+VFAF KFR Y
Subjt:  LLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPY

Query:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---
        LVGSKV V+TDH+ I+YL  KKDAKPRL RWVLLLQEFDLEI DK+G ENV+ADHL RL+   S   +  I +SFPDEQL AV         VN+L    
Subjt:  LVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVE--------VNHLC---

Query:  ----MDWRQKKKFKHDI
            + + QKKKF  D+
Subjt:  ----MDWRQKKKFKHDI

TrEMBL top hits (e-value, % identity, alignment)
A0A2G9FWY3 Reverse transcriptase, e-value 1.4e-113, 53.72% identity
Query:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT
        +Y   DS+WVSPVQCV KKGG+T+V N  NELIPTRTVTGWR+                              + +YCFLDGYSGYNQI IAPEDQEK T
Subjt:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT

Query:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE
        F    G     R+  G          C++A                       F     + S VLKRCEDT L++NWEKCHFMV+EGIVLGH++S  G+E
Subjt:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE

Query:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA
        VD+AK+E IE+L PP SVKG++SFLGHAGFYRRFIKDFSKISKPLCNLL  D  F+F+  CR AF  LK  L+SAPI+               S   V A
Subjt:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA

Query:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA
          GQ        IYYAS+ LN+AQ+NY T EKELLA+VFAF KFR YLVG+KV V+TDHA IRYL  KKDAKPRL RWVLLLQEFDLEI D+KG+EN IA
Subjt:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA

Query:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV
        DHL RL+  +   + + I D+FPDEQL A+
Subjt:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV

A0A2G9HYA0 Reverse transcriptase, e-value 1.4e-113, 53.72% identity
Query:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT
        +Y   DS+WVSPVQCV KKGG+T+V N  NELIPTRTVTGWR+                              + +YCFLDGYSGYNQI IAPEDQEKTT
Subjt:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT

Query:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE
        F    G     R+  G          C++A                       F     + S VLKRCEDT L++NWEKCHFMV+EGIVLGH++S  G+E
Subjt:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE

Query:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA
        VD+AK+E IE+L PP SVKG++SFLGHAGFYRRFIKDFSKISKPLCNLL  D  F+F+  C  AF  LK  L+SAPI+               S   V A
Subjt:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA

Query:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA
          GQ        IYYAS+ LN+AQ+NY T EKELLA+VFAF KFR YLVG+KV V+TDHA IRYL  KKDAKPRL RWVLLLQEFDLEI D+KG+EN IA
Subjt:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA

Query:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV
        DHL RL+  +   + + I D+FPDEQL A+
Subjt:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV

A0A2G9HYD8 Reverse transcriptase, e-value 4.7e-114, 53.72% identity
Query:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT
        +Y   DS+WVSPVQCV KKGG+T+V N  NELIPTRTVTGWR+                              + +YCFLDGYSGYNQI IAPEDQEKTT
Subjt:  VYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRI----------------------------GWQAYYCFLDGYSGYNQITIAPEDQEKTT

Query:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE
        F    G     R+  G          C++A                       F     + S VLKRCEDT LV+NWEKCHFMV+EGIVLGH++S  G+E
Subjt:  FIALTG-----RLLLG---------ECLLA-----------------------FAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLE

Query:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA
        VD+AK+E IE+L PP SVKG++SFLGHAGFYRRFIKDFSKISKPLCNLL  D  F F+  C  AF+ LK  L+SAPI+               S   + A
Subjt:  VDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILC-------------CSRCYVWA

Query:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA
          GQ        IYYAS+ LN+AQ+NY T EKELLA+VFAF KFR YLVG+KV V+TDHA IRYL  KKDAKPRL RWVLLLQEFDLEI D+KG+EN IA
Subjt:  KAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIA

Query:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV
        DHL RL+  +   + + I D+FPDEQL A+
Subjt:  DHLPRLDPSSSLLKQSTIFDSFPDEQLFAV

A0A5J5A1U7 Integrase catalytic domain-containing protein, e-value 5.7e-128, 41.83% identity
Query:  MASSSSSL-SESVSASFLQPNTS---IFLLSNICNLVPIRLDSTNYLFWKFQVESMLRAHSLFDIVDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDH
        MA+++ +L + S +++   PN S   IFLLSNICNL+  RLDS+NY+ WKFQ+ S+L+AHSL   +DGT PCP KF+ D  G     +N  Y  W  QD 
Subjt:  MASSSSSL-SESVSASFLQPNTS---IFLLSNICNLVPIRLDSTNYLFWKFQVESMLRAHSLFDIVDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDH

Query:  TLITLINVTLSKQAFSFVVRCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAESIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFN
         L+TL+N TLS+ A S V+   +S+E W AL + FS+ TRS+I +LKSALH +SK   +SID Y+ +IK+  D L +VSV ++DEDIL+Y LNG P ++N
Subjt:  TLITLINVTLSKQAFSFVVRCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAESIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFN

Query:  SFRTSIRTRKDSVTLDELHSLLKSEAKFIEQQNKIVANPLFNPTAMYA-------NLGRGSTSSGFRGRGRSNQGRGFSPGNSNPVQGRGSSGNFSPNPA
        +F+TSIRT+ +++TL+E++++LK E + IE  +K   +P F P AM A       +  RG + S F GRGR  +GR  + G      GR  S NF  +  
Subjt:  SFRTSIRTRKDSVTLDELHSLLKSEAKFIEQQNKIVANPLFNPTAMYA-------NLGRGSTSSGFRGRGRSNQGRGFSPGNSNPVQGRGSSGNFSPNPA

Query:  SSNSVNAGCGSNGNTPNNQGSSNSGQGRVICQICNRPGHGALDCFNRLNLSYQGRYPPSKLVSMAVANDPSSTTST--WLADSGCNIHVTHNSSNLALNS
             N    +     +NQ S+NS    V+CQICN+ GH ALDC++R++ SYQG+ P  +L +M+   +  S  S   W  D+G   H+T + +NL    
Subjt:  SSNSVNAGCGSNGNTPNNQGSSNSGQGRVICQICNRPGHGALDCFNRLNLSYQGRYPPSKLVSMAVANDPSSTTST--WLADSGCNIHVTHNSSNLALNS

Query:  NYNGEEAITLANGQAFPVAQAGFGTLSTSQNDLHLSNLFCVPDLTTNLLLVSQCCIDNNCIFVFDAEWFSIQDKPSGRVLYMSKSRDGLYPISAAVKTLS
         Y G++ IT+ANGQA  ++ +G  ++  + +   L+N+ CVP + TNLL V Q C DN+C F+FD+E F IQDK + ++L+   S  GLYP+  +  T  
Subjt:  NYNGEEAITLANGQAFPVAQAGFGTLSTSQNDLHLSNLFCVPDLTTNLLLVSQCCIDNNCIFVFDAEWFSIQDKPSGRVLYMSKSRDGLYPISAAVKTLS

Query:  STTSL--------LNASFLNHVPV----------CATTSRTSTTDLWHYRLGHPSTVVLHKLLSTYSIA---HDAPLQNKDYISCLQGKMTKLPFPLSIS
        S  SL         N    NH P+           A   +  +T LWH RLGHPST  L  +LS+ SI      APL       CL GKMTKLPFPLS +
Subjt:  STTSL--------LNASFLNHVPV----------CATTSRTSTTDLWHYRLGHPSTVVLHKLLSTYSIA---HDAPLQNKDYISCLQGKMTKLPFPLSIS

Query:  ESHAPLELIHSDFWGPSPSLSVSCFKYY----------------------------GITHQRSCPYTPEQNGVVEHKHRSIVDIALSLMFHASVLLEF
        ES APL+L+HSD WGP+P  S   F YY                            GI H+RSCP+TP+QNG+ E KHR IV+  L+L+  AS+ L++
Subjt:  ESHAPLELIHSDFWGPSPSLSVSCFKYY----------------------------GITHQRSCPYTPEQNGVVEHKHRSIVDIALSLMFHASVLLEF

A0A6J1E110 uncharacterized protein LOC111025424, e-value 1.6e-114, 41.06% identity
Query:  PPQLLMGGQGSFAP-QNSESSLEAMMKEYMAHTDATIQSNQASMRALELHVGQLANELKARPQGKLPQILNILKGSNKD---AGASGSVLDVEPPYVPLP
        PPQ     +    P QN+ S+LE  MKEYMA TDA IQS  ASMR  E  +G LAN LK RPQG       + K   K+   A    S L  + P +P  
Subjt:  PPQLLMGGQGSFAP-QNSESSLEAMMKEYMAHTDATIQSNQASMRALELHVGQLANELKARPQGKLPQILNILKGSNKD---AGASGSVLDVEPPYVPLP

Query:  PYVPPLPFPQRQSLRIRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGE---------------------------
                                                + +    +P+++E PTL+ KPL  HLKY YLG+                           
Subjt:  PYVPPLPFPQRQSLRIRWILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGE---------------------------

Query:  ---------DSNWVSPVQCVHK------------------------KGGVTMVSNKDNELIPTRTVTGWRI----------------------------G
                 D   +S   C+HK                        K G+T+ +N+ NELI TRTV+GWR+                             
Subjt:  ---------DSNWVSPVQCVHK------------------------KGGVTMVSNKDNELIPTRTVTGWRI----------------------------G

Query:  WQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA----------------FAMLQQHFSG-------VLKRCEDTQ
         + +YCFLDGYSGYNQITIAPEDQ KTTF    G     R+  G          C++A                F++    F         VL+RCE T 
Subjt:  WQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTG-----RLLLG---------ECLLA----------------FAMLQQHFSG-------VLKRCEDTQ

Query:  LVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAAL
        LV+NWEKCHFMV+EGIVLGH+ISK G+EVD AKI++I +L PP +VKGI+SFLGH GFYRRFIKDF+KISKPLC LL  D  F F  DC K+FE LK AL
Subjt:  LVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAAL

Query:  MSAPIL------------CCSRCY-VWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAK
         SAPI+            C +  Y + A  GQ     +HP+YYAS+ L  AQ+NY T EKELLA+VFAF KFR YL+G+KV VFTDH+ ++YL +KKDAK
Subjt:  MSAPIL------------CCSRCY-VWAKAGQ----FIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAK

Query:  PRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIF-DSFPDEQLFAVE
        PRL RW+LLLQEFD+E+ D+KG+EN +ADHL RL+  S L    T+  + F DEQL  V+
Subjt:  PRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIF-DSFPDEQLFAVE

SwissProt top hits (e-value, % identity, alignment)
P04323 Retrovirus-related Pol polyprotein from transposon 17.6, e-value 3.8e-36, 33.95% identity
Query:  LGECLLAFAMLQQHFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISK
        L + ++    L +H      V ++     L +  +KC F+ +E   LGH ++ +G++ +  KIE I++   P   K I++FLG  G+YR+FI +F+ I+K
Subjt:  LGECLLAFAMLQQHFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISK

Query:  PLCNLLCTDHVFD-FNADCRKAFETLKAALMSAPIL-------------CCSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRP
        P+   L  +   D  N +   AF+ LK  +   PIL               S   + A   Q  HP+ Y SR LNE ++NY T+EKELLA+V+A   FR 
Subjt:  PLCNLLCTDHVFD-FNADCRKAFETLKAALMSAPIL-------------CCSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRP

Query:  YLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQST
        YL+G    + +DH  + +L   KD   +L RW + L EFD +I   KG EN +AD L R+    + L + T
Subjt:  YLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQST

P10394 Retrovirus-related Pol polyprotein from transposon 412, e-value 1.1e-30, 28.85% identity
Query:  ITIAPEDQEKTTFIALTG-----RLLLGECLLAFAMLQQH----FSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPN
        + IAP   ++   IA +G       L  + L+     ++H     + V  +C +  L ++ EKC F + E   LGH+ +  G+  D  K +VI+    P+
Subjt:  ITIAPEDQEKTTFIALTG-----RLLLGECLLAFAMLQQH----FSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPN

Query:  SVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPIL------------------CCSRCYVWAKAGQFIHPIYY
             + F+    +YRRFIK+F+  S+ +  L   +  F++  +C+KAF  LK+ L++  +L                   C         G  + P+ Y
Subjt:  SVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPIL------------------CCSRCYVWAKAGQFIHPIYY

Query:  ASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRL------DPS
        ASR   + + N  T E+EL A+ +A + FRPY+ G   TV TDH  + YL S  +   +L R  L L+E++  +   KG +N +AD L R+      D +
Subjt:  ASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRL------DPS

Query:  SSLLKQSTIFDS
         ++LK +T F S
Subjt:  SSLLKQSTIFDS

P10401 Retrovirus-related Pol polyprotein from transposon gypsy, e-value 1.2e-29, 27.75% identity
Query:  IGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGRLLLGECLLAFAMLQ---------------------------------------QHFSGVLKRC
        +G   ++  LD  SGY+QI +A  D+EKT+F    G+     C L F +                                         +H   VLK  
Subjt:  IGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGRLLLGECLLAFAMLQ---------------------------------------QHFSGVLKRC

Query:  EDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLL------CTDHV-----FDF
         D  + ++ EK  F  +    LG  +SK+G + D  K++ I+    P+ V  ++SFLG A +YR FIKDF+ I++P+ ++L       + H+      +F
Subjt:  EDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLL------CTDHV-----FDF

Query:  NADCRKAFETLKAALMSAPILC--------------CSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSK-VTVFTDH
        N   R AF+ L+  L S  ++                S   + A   Q   PI   SR L + + NY T E+ELLA+V+A  K + +L GS+ + +FTDH
Subjt:  NADCRKAFETLKAALMSAPILC--------------CSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSK-VTVFTDH

Query:  ATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPR
          + +  + ++   ++ RW   + + + ++  K G EN +AD L R
Subjt:  ATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPR

P20825 Retrovirus-related Pol polyprotein from transposon 297, e-value 9.4e-35, 28.82% identity
Query:  DNELIPTRTVTGWRIGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGR----------------------------------LLLGECLLAFAMLQQ
        D   IP       ++G   Y+  +D   G++QI +  E   KT F   +G                                   + L + ++    L +
Subjt:  DNELIPTRTVTGWRIGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGR----------------------------------LLLGECLLAFAMLQQ

Query:  HFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFD
        H +    V  +  D  L +  +KC F+ KE   LGH ++ +G++ +  K++ I     P   K I++FLG  G+YR+FI +++ I+KP+ + L      D
Subjt:  HFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFD

Query:  F-NADCRKAFETLKAALMSAPIL-------------CCSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDH
            +  +AFE LKA ++  PIL               S   + A   Q  HPI + SR LN+ ++NY  +EKELLA+V+A   FR YL+G +  + +DH
Subjt:  F-NADCRKAFETLKAALMSAPIL-------------CCSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDH

Query:  ATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRL
          +R+L + K+   +L RW + L E+  +I+  KG EN +AD L R+
Subjt:  ATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRL

Q8I7P9 Retrovirus-related Pol polyprotein from transposon opus, e-value 1.6e-34, 26.9% identity
Query:  VSNKDNELIPTRTVTGWRIGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGR----------------------------------LLLGECLLAFA
        V+  D   IP    T   +G   Y+  LD  SG++QI +   D  KT F  L G+                                  + + + ++   
Subjt:  VSNKDNELIPTRTVTGWRIGWQAYYCFLDGYSGYNQITIAPEDQEKTTFIALTGR----------------------------------LLLGECLLAFA

Query:  MLQQHFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLL---
            H+     VL       L +N EK HF+  +   LG+ ++ +G++ D  K+  I  + PP SVK ++ FLG   +YR+FI+D++K++KPL NL    
Subjt:  MLQQHFSG---VLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLL---

Query:  --------CTDHVFDFNADCRKAFETLKAALMSAPIL---CCSRCY-------VWAKAGQFI-------HPIYYASRVLNEAQVNYKTVEKELLAMVFAF
                 +      +    ++F  LK+ L S+ IL   C ++ +        WA              PI Y SR LN+ + NY T+EKE+LA++++ 
Subjt:  --------CTDHVFDFNADCRKAFETLKAALMSAPIL---CCSRCY-------VWAKAGQFI-------HPIYYASRVLNEAQVNYKTVEKELLAMVFAF

Query:  VKFRPYLVGS-KVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVEVNH
           R YL G+  + V+TDH  + +    ++   +L RW   ++E++ E+  K G  NV+AD L R+ P  + L  ST  D+ P++ + ++   H
Subjt:  VKFRPYLVGS-KVTVFTDHATIRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVEVNH

Arabidopsis top hits (e-value, % identity, alignment)
AT1G34070.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162), e-value 5.9e-08, 26.1% identity
Query:  IFLLSNICNLVPIRLD--STNYLFWKFQVESMLRAHSL-FDI---VDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDHTL-ITLINVTLSKQAFSFVV
        I+ +SNI + +P+ LD   +NY  W+     +   H L FD+   +DGT               L   N     W  +D  + ++L      KQ     V
Subjt:  IFLLSNICNLVPIRLD--STNYLFWKFQVESMLRAHSL-FDI---VDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDHTL-ITLINVTLSKQAFSFVV

Query:  RCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAE-SIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFNSFRTSIRTRKDSVTLDEL
           +S+++W  +   F +   +   +L S L   +K + +  + DY  ++K+  D L  V V V D ++++Y LNG   KF++    I+ R+   + D+ 
Subjt:  RCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAE-SIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFNSFRTSIRTRKDSVTLDEL

Query:  HSLLKSEAKFIEQQNKIVANPLFNPTAM----------------YANLGR-GSTSSGFRGRGRSN---QGRG
         ++L+ E    E + K    P  NPT +                  N  R G    G+RGRGR N   +GRG
Subjt:  HSLLKSEAKFIEQQNKIVANPLFNPTAM----------------YANLGR-GSTSSGFRGRGRSN---QGRG

ATMG00860.1 DNA/RNA polymerases superfamily protein, e-value 3.6e-13, 35.54% identity
Query:  HFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHR--ISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDF
        H   VL+  E  Q   N +KC F   +   LGHR  IS  G+  D AK+E +     P +   ++ FLG  G+YRRF+K++ KI +PL  LL   +   +
Subjt:  HFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHR--ISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRFIKDFSKISKPLCNLLCTDHVFDF

Query:  NADCRKAFETLKAALMSAPIL
              AF+ LK A+ + P+L
Subjt:  NADCRKAFETLKAALMSAPIL


Sequences
CDS sequence
ATGGGTTCTTCATCTATGGCTTCTTCCTCTTCAAGTCTCTCAGAAAGTGTTTCTGCTTCTTTTCTTCAGCCGAACACTTCGATTTTTCTCCTCTCAAATATATGCAATCT
TGTTCCTATTCGTCTTGATTCCACAAACTATCTCTTTTGGAAATTTCAAGTTGAATCCATGTTGAGAGCTCACTCCTTGTTCGATATTGTTGATGGAACTATCCCTTGCC
CGCCCAAATTTCTCTGTGATGCCGAAGGAAACAAACTTACAACAGTTAATACAGCCTACACTCAGTGGATCGCGCAAGATCACACTCTTATCACTCTGATCAATGTCACT
CTCTCCAAGCAAGCCTTTTCGTTCGTCGTCAGGTGTAAATCGTCCAAAGAGGTATGGGAAGCCCTATCTAAGCATTTTTCTTCCCTAACCAGATCTCATATTCACAAATT
GAAATCCGCCTTACACATTGTATCGAAGTCTCTTGCTGAATCCATAGATGACTATCTGATTCGTATTAAAGAAACTGTTGATAAATTAGAGACTGTTTCAGTCACGGTTG
ATGATGAGGATATTCTTCTCTATACTCTCAATGGCCGACCTGCTAAGTTCAACTCTTTTCGAACTTCTATTCGCACAAGAAAAGATTCAGTAACTCTAGATGAACTTCAT
TCTCTTCTGAAGTCTGAAGCAAAGTTCATCGAACAGCAAAACAAAATCGTCGCCAATCCTCTTTTCAATCCTACTGCTATGTATGCAAATCTAGGAAGAGGATCCACTTC
GTCTGGCTTCCGTGGTCGAGGCCGATCCAATCAAGGACGAGGTTTTTCTCCTGGCAATTCTAATCCAGTTCAGGGTCGAGGATCCAGTGGCAATTTTTCTCCCAATCCAG
CCTCTTCTAACTCGGTGAATGCTGGTTGTGGTTCGAATGGTAATACACCGAATAATCAGGGAAGCTCTAACTCTGGTCAAGGCCGAGTCATATGTCAGATCTGTAATAGG
CCTGGTCATGGTGCTCTTGACTGCTTCAATCGGTTGAATCTGTCTTATCAAGGTAGATATCCTCCGTCTAAGCTGGTTTCTATGGCTGTTGCCAACGATCCTTCTTCCAC
TACTTCAACATGGCTTGCAGACAGTGGATGCAACATCCATGTCACTCATAATTCCTCTAATTTGGCCCTGAACTCTAACTATAATGGAGAAGAAGCAATAACCCTTGCAA
ATGGTCAGGCTTTCCCTGTTGCACAGGCTGGTTTTGGTACTCTCTCGACCTCACAAAATGATCTTCATCTATCCAACTTGTTTTGCGTCCCTGATTTGACTACCAATCTG
TTATTAGTTTCCCAATGTTGCATCGATAATAATTGCATCTTTGTTTTTGATGCCGAATGGTTTTCTATTCAGGACAAGCCTTCGGGACGAGTACTATACATGAGCAAGAG
TAGAGATGGATTATATCCTATATCCGCTGCTGTCAAAACATTGAGTTCTACCACTAGTTTGTTGAATGCCTCTTTTTTAAATCATGTTCCTGTTTGTGCTACGACTTCCC
GTACATCTACTACTGATTTGTGGCACTATAGACTTGGTCATCCATCAACTGTTGTCTTACATAAATTGTTGTCCACATACTCCATTGCACATGATGCTCCATTACAAAAT
AAAGACTATATTAGCTGTCTGCAGGGAAAAATGACTAAGTTGCCTTTTCCCTTGTCTATATCTGAATCTCATGCTCCACTTGAATTGATTCACAGTGATTTTTGGGGACC
TTCTCCCTCTTTGTCTGTTTCTTGTTTTAAATACTATGGCATTACTCATCAACGGTCTTGTCCTTATACACCGGAACAAAATGGTGTTGTCGAGCATAAGCATCGTTCTA
TTGTTGATATTGCTTTATCCCTGATGTTTCATGCCTCTGTACTTTTGGAGTTTTGCCTTATAATAGTCACAAGTTACAACCTAAGACTCGTCAACATGTTTTCATGGGCT
ACTGATAACTCTCCAAAAGAATGCATTGCTGAGCGACTGGAGGGAGCAATTTCTGTGCTGCAGCAAAGCTGGGAGCAAAACTGCCACGTCACAGCTCGCGTGAGTTTGGT
GCATGAGCAATCTGCCTGGGGTAAGGGCGATCTTGCAATGATTGCTAATGCTCTTAAGAATGTGACAGTGATTAGTCATCAGCAGCCACCAGCTGTAGAGCCTGCTGCAG
TGGTGAACCAAGTTGCAGAGGAAGCATGTGTCTATTGTGGTGAAGATCACAACTACGAGTTTTGCCCCAGAATCCAGCTTCTGTGTTTTTTGTTGGCGCAACCACCCCAA
CTTCTCATGGGAGGACAAGGAAGCTTTGCCCCACAAAATTCAGAGAGTTCTCTCGAGGCAATGATGAAAGAATATATGGCTCATACAGATGCCACAATTCAAAGTAATCA
AGCTTCAATGAGAGCCCTGGAATTGCATGTGGGCCAGCTAGCTAATGAGCTGAAGGCAAGGCCTCAAGGGAAACTTCCTCAGATACTGAACATCCTAAAGGGAAGCAACA
AAGATGCTGGAGCATCTGGTTCTGTTCTAGATGTGGAACCACCTTATGTGCCGCTCCCACCTTATGTACCACCTCTACCTTTTCCACAAAGGCAAAGCCTAAGAATCAGA
TGGATTCTAGAGAGCATAATTGTTGAGAGAGCAATACAGGATTCGGCTGACAAGCATTCGAAGATCATGGAGTGGAAGGCTCCTCCTATTAAGCCATCCCTGATTGAGGC
ACCCACTTTAGATTTGAAGCCCTTGTCAGATCATCTAAAGTATGTGTATCTTGGGGAAGATAGCAATTGGGTAAGCCCTGTCCAATGTGTTCATAAGAAAGGAGGTGTCA
CTATGGTGAGCAATAAAGATAATGAGTTGATCCCAACCAGGACAGTAACTGGTTGGAGGATTGGCTGGCAGGCCTACTACTGTTTCTTAGATGGTTATTCTGGGTATAAC
CAGATTACTATTGCTCCTGAGGATCAGGAAAAAACCACTTTCATTGCCCTTACGGGACGTTTGCTTTTAGGCGAATGCCTTTTGGCCTTTGCAATGCTCCAGCAACATTT
CAGCGGTGTGTTAAAAAGATGTGAGGATACCCAACTAGTTATCAATTGGGAGAAATGCCACTTCATGGTGAAGGAGGGCATAGTGTTAGGACATAGGATTTCTAAGAATG
GTCTAGAAGTTGATAGAGCAAAAATTGAGGTGATTGAAAGATTAGAACCACCGAATTCAGTGAAAGGGATTCAGAGTTTTTTAGGTCATGCTGGATTTTATAGGAGGTTC
ATAAAGGATTTTTCGAAAATCAGTAAACCTCTTTGTAACTTATTGTGTACTGATCATGTTTTTGACTTTAATGCAGATTGTAGGAAAGCTTTTGAAACTTTAAAAGCTGC
TTTAATGTCAGCACCCATTCTTTGTTGCAGTAGGTGCTATGTTTGGGCAAAAGCAGGACAATTTATCCATCCTATATACTATGCAAGCAGGGTTCTAAATGAGGCACAAG
TCAACTATAAAACTGTTGAAAAAGAGTTGTTAGCTATGGTGTTCGCTTTTGTGAAATTCCGGCCATATTTGGTTGGATCAAAAGTCACGGTGTTCACGGATCATGCAACA
ATAAGGTACTTAAAGTCTAAGAAAGATGCAAAGCCTAGACTAAATCGTTGGGTTTTATTATTGCAGGAATTCGACTTGGAGATAAATGATAAGAAGGGATCAGAGAATGT
CATTGCAGATCATTTGCCGCGTCTTGATCCATCATCATCTTTGTTGAAGCAATCTACTATTTTCGATTCTTTTCCAGATGAACAGCTCTTTGCTGTTGAGGTAAACCATT
TATGTATGGATTGGAGGCAGAAGAAGAAGTTTAAGCATGATATTGCTTAA
mRNA sequence
ATGGGTTCTTCATCTATGGCTTCTTCCTCTTCAAGTCTCTCAGAAAGTGTTTCTGCTTCTTTTCTTCAGCCGAACACTTCGATTTTTCTCCTCTCAAATATATGCAATCT
TGTTCCTATTCGTCTTGATTCCACAAACTATCTCTTTTGGAAATTTCAAGTTGAATCCATGTTGAGAGCTCACTCCTTGTTCGATATTGTTGATGGAACTATCCCTTGCC
CGCCCAAATTTCTCTGTGATGCCGAAGGAAACAAACTTACAACAGTTAATACAGCCTACACTCAGTGGATCGCGCAAGATCACACTCTTATCACTCTGATCAATGTCACT
CTCTCCAAGCAAGCCTTTTCGTTCGTCGTCAGGTGTAAATCGTCCAAAGAGGTATGGGAAGCCCTATCTAAGCATTTTTCTTCCCTAACCAGATCTCATATTCACAAATT
GAAATCCGCCTTACACATTGTATCGAAGTCTCTTGCTGAATCCATAGATGACTATCTGATTCGTATTAAAGAAACTGTTGATAAATTAGAGACTGTTTCAGTCACGGTTG
ATGATGAGGATATTCTTCTCTATACTCTCAATGGCCGACCTGCTAAGTTCAACTCTTTTCGAACTTCTATTCGCACAAGAAAAGATTCAGTAACTCTAGATGAACTTCAT
TCTCTTCTGAAGTCTGAAGCAAAGTTCATCGAACAGCAAAACAAAATCGTCGCCAATCCTCTTTTCAATCCTACTGCTATGTATGCAAATCTAGGAAGAGGATCCACTTC
GTCTGGCTTCCGTGGTCGAGGCCGATCCAATCAAGGACGAGGTTTTTCTCCTGGCAATTCTAATCCAGTTCAGGGTCGAGGATCCAGTGGCAATTTTTCTCCCAATCCAG
CCTCTTCTAACTCGGTGAATGCTGGTTGTGGTTCGAATGGTAATACACCGAATAATCAGGGAAGCTCTAACTCTGGTCAAGGCCGAGTCATATGTCAGATCTGTAATAGG
CCTGGTCATGGTGCTCTTGACTGCTTCAATCGGTTGAATCTGTCTTATCAAGGTAGATATCCTCCGTCTAAGCTGGTTTCTATGGCTGTTGCCAACGATCCTTCTTCCAC
TACTTCAACATGGCTTGCAGACAGTGGATGCAACATCCATGTCACTCATAATTCCTCTAATTTGGCCCTGAACTCTAACTATAATGGAGAAGAAGCAATAACCCTTGCAA
ATGGTCAGGCTTTCCCTGTTGCACAGGCTGGTTTTGGTACTCTCTCGACCTCACAAAATGATCTTCATCTATCCAACTTGTTTTGCGTCCCTGATTTGACTACCAATCTG
TTATTAGTTTCCCAATGTTGCATCGATAATAATTGCATCTTTGTTTTTGATGCCGAATGGTTTTCTATTCAGGACAAGCCTTCGGGACGAGTACTATACATGAGCAAGAG
TAGAGATGGATTATATCCTATATCCGCTGCTGTCAAAACATTGAGTTCTACCACTAGTTTGTTGAATGCCTCTTTTTTAAATCATGTTCCTGTTTGTGCTACGACTTCCC
GTACATCTACTACTGATTTGTGGCACTATAGACTTGGTCATCCATCAACTGTTGTCTTACATAAATTGTTGTCCACATACTCCATTGCACATGATGCTCCATTACAAAAT
AAAGACTATATTAGCTGTCTGCAGGGAAAAATGACTAAGTTGCCTTTTCCCTTGTCTATATCTGAATCTCATGCTCCACTTGAATTGATTCACAGTGATTTTTGGGGACC
TTCTCCCTCTTTGTCTGTTTCTTGTTTTAAATACTATGGCATTACTCATCAACGGTCTTGTCCTTATACACCGGAACAAAATGGTGTTGTCGAGCATAAGCATCGTTCTA
TTGTTGATATTGCTTTATCCCTGATGTTTCATGCCTCTGTACTTTTGGAGTTTTGCCTTATAATAGTCACAAGTTACAACCTAAGACTCGTCAACATGTTTTCATGGGCT
ACTGATAACTCTCCAAAAGAATGCATTGCTGAGCGACTGGAGGGAGCAATTTCTGTGCTGCAGCAAAGCTGGGAGCAAAACTGCCACGTCACAGCTCGCGTGAGTTTGGT
GCATGAGCAATCTGCCTGGGGTAAGGGCGATCTTGCAATGATTGCTAATGCTCTTAAGAATGTGACAGTGATTAGTCATCAGCAGCCACCAGCTGTAGAGCCTGCTGCAG
TGGTGAACCAAGTTGCAGAGGAAGCATGTGTCTATTGTGGTGAAGATCACAACTACGAGTTTTGCCCCAGAATCCAGCTTCTGTGTTTTTTGTTGGCGCAACCACCCCAA
CTTCTCATGGGAGGACAAGGAAGCTTTGCCCCACAAAATTCAGAGAGTTCTCTCGAGGCAATGATGAAAGAATATATGGCTCATACAGATGCCACAATTCAAAGTAATCA
AGCTTCAATGAGAGCCCTGGAATTGCATGTGGGCCAGCTAGCTAATGAGCTGAAGGCAAGGCCTCAAGGGAAACTTCCTCAGATACTGAACATCCTAAAGGGAAGCAACA
AAGATGCTGGAGCATCTGGTTCTGTTCTAGATGTGGAACCACCTTATGTGCCGCTCCCACCTTATGTACCACCTCTACCTTTTCCACAAAGGCAAAGCCTAAGAATCAGA
TGGATTCTAGAGAGCATAATTGTTGAGAGAGCAATACAGGATTCGGCTGACAAGCATTCGAAGATCATGGAGTGGAAGGCTCCTCCTATTAAGCCATCCCTGATTGAGGC
ACCCACTTTAGATTTGAAGCCCTTGTCAGATCATCTAAAGTATGTGTATCTTGGGGAAGATAGCAATTGGGTAAGCCCTGTCCAATGTGTTCATAAGAAAGGAGGTGTCA
CTATGGTGAGCAATAAAGATAATGAGTTGATCCCAACCAGGACAGTAACTGGTTGGAGGATTGGCTGGCAGGCCTACTACTGTTTCTTAGATGGTTATTCTGGGTATAAC
CAGATTACTATTGCTCCTGAGGATCAGGAAAAAACCACTTTCATTGCCCTTACGGGACGTTTGCTTTTAGGCGAATGCCTTTTGGCCTTTGCAATGCTCCAGCAACATTT
CAGCGGTGTGTTAAAAAGATGTGAGGATACCCAACTAGTTATCAATTGGGAGAAATGCCACTTCATGGTGAAGGAGGGCATAGTGTTAGGACATAGGATTTCTAAGAATG
GTCTAGAAGTTGATAGAGCAAAAATTGAGGTGATTGAAAGATTAGAACCACCGAATTCAGTGAAAGGGATTCAGAGTTTTTTAGGTCATGCTGGATTTTATAGGAGGTTC
ATAAAGGATTTTTCGAAAATCAGTAAACCTCTTTGTAACTTATTGTGTACTGATCATGTTTTTGACTTTAATGCAGATTGTAGGAAAGCTTTTGAAACTTTAAAAGCTGC
TTTAATGTCAGCACCCATTCTTTGTTGCAGTAGGTGCTATGTTTGGGCAAAAGCAGGACAATTTATCCATCCTATATACTATGCAAGCAGGGTTCTAAATGAGGCACAAG
TCAACTATAAAACTGTTGAAAAAGAGTTGTTAGCTATGGTGTTCGCTTTTGTGAAATTCCGGCCATATTTGGTTGGATCAAAAGTCACGGTGTTCACGGATCATGCAACA
ATAAGGTACTTAAAGTCTAAGAAAGATGCAAAGCCTAGACTAAATCGTTGGGTTTTATTATTGCAGGAATTCGACTTGGAGATAAATGATAAGAAGGGATCAGAGAATGT
CATTGCAGATCATTTGCCGCGTCTTGATCCATCATCATCTTTGTTGAAGCAATCTACTATTTTCGATTCTTTTCCAGATGAACAGCTCTTTGCTGTTGAGGTAAACCATT
TATGTATGGATTGGAGGCAGAAGAAGAAGTTTAAGCATGATATTGCTTAA
Protein sequence
MGSSSMASSSSSLSESVSASFLQPNTSIFLLSNICNLVPIRLDSTNYLFWKFQVESMLRAHSLFDIVDGTIPCPPKFLCDAEGNKLTTVNTAYTQWIAQDHTLITLINVT
LSKQAFSFVVRCKSSKEVWEALSKHFSSLTRSHIHKLKSALHIVSKSLAESIDDYLIRIKETVDKLETVSVTVDDEDILLYTLNGRPAKFNSFRTSIRTRKDSVTLDELH
SLLKSEAKFIEQQNKIVANPLFNPTAMYANLGRGSTSSGFRGRGRSNQGRGFSPGNSNPVQGRGSSGNFSPNPASSNSVNAGCGSNGNTPNNQGSSNSGQGRVICQICNR
PGHGALDCFNRLNLSYQGRYPPSKLVSMAVANDPSSTTSTWLADSGCNIHVTHNSSNLALNSNYNGEEAITLANGQAFPVAQAGFGTLSTSQNDLHLSNLFCVPDLTTNL
LLVSQCCIDNNCIFVFDAEWFSIQDKPSGRVLYMSKSRDGLYPISAAVKTLSSTTSLLNASFLNHVPVCATTSRTSTTDLWHYRLGHPSTVVLHKLLSTYSIAHDAPLQN
KDYISCLQGKMTKLPFPLSISESHAPLELIHSDFWGPSPSLSVSCFKYYGITHQRSCPYTPEQNGVVEHKHRSIVDIALSLMFHASVLLEFCLIIVTSYNLRLVNMFSWA
TDNSPKECIAERLEGAISVLQQSWEQNCHVTARVSLVHEQSAWGKGDLAMIANALKNVTVISHQQPPAVEPAAVVNQVAEEACVYCGEDHNYEFCPRIQLLCFLLAQPPQ
LLMGGQGSFAPQNSESSLEAMMKEYMAHTDATIQSNQASMRALELHVGQLANELKARPQGKLPQILNILKGSNKDAGASGSVLDVEPPYVPLPPYVPPLPFPQRQSLRIR
WILESIIVERAIQDSADKHSKIMEWKAPPIKPSLIEAPTLDLKPLSDHLKYVYLGEDSNWVSPVQCVHKKGGVTMVSNKDNELIPTRTVTGWRIGWQAYYCFLDGYSGYN
QITIAPEDQEKTTFIALTGRLLLGECLLAFAMLQQHFSGVLKRCEDTQLVINWEKCHFMVKEGIVLGHRISKNGLEVDRAKIEVIERLEPPNSVKGIQSFLGHAGFYRRF
IKDFSKISKPLCNLLCTDHVFDFNADCRKAFETLKAALMSAPILCCSRCYVWAKAGQFIHPIYYASRVLNEAQVNYKTVEKELLAMVFAFVKFRPYLVGSKVTVFTDHAT
IRYLKSKKDAKPRLNRWVLLLQEFDLEINDKKGSENVIADHLPRLDPSSSLLKQSTIFDSFPDEQLFAVEVNHLCMDWRQKKKFKHDIA