CuGenDBv2

Lag0023442 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0023442
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Retrovirus-related Pol polyprotein from transposon RE1
Genome location: chr7:48253466..48256108
RNA-Seq Expression: Lag0023442
Synteny: Lag0023442
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR012337 - Ribonuclease H-like superfamily
IPR025724 - GAG-pre-integrase domain
IPR036397 - Ribonuclease H superfamily


Homology
GenBank top hits (columns: e value, % identity, alignment)
GAU19483.1 hypothetical protein TSUD_77270 [Trifolium subterraneum]  e value: 8.3e-156  % identity: 40.36
Query:  SPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQ
        S + N L + + S+KLDR+N+ LWK+L LP++R  KL+G++ G + CP+ +I++++     +SA                           +  W A DQ
Subjt:  SPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQ

Query:  LLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNP
         LLGW+ NSMT E+ATQ++  E S++LW   Q L G  +R++  YL+  F   RKG +KM +YL  MK+  D L  AG+PVS  +L+ Q L GLD EYNP
Subjt:  LLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNP

Query:  VVAMIQGRVGIS-----------------------------CGYRQNSYGRGGQRGNGYRG---RGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRF
        VV  +  +  +S                                   S  RG    N +RG   RG   G G G    N   CQV G   H+A++C+ RF
Subjt:  VVAMIQGRVGIS-----------------------------CGYRQNSYGRGGQRGNGYRG---RGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRF

Query:  NKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDG
        +K ++   N + G +         S NAF+ASQ++         V D  WY DSGASNHVT       + TE+ G   + VG+G KL I   GSS L   
Subjt:  NKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDG

Query:  YNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSN
           LNL ++L VPNI KNL+SVSKLA DNN+ VEF EN C VKDK TG+V+LKG L DGL Q                        +  K   SAFV   
Subjt:  YNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSN

Query:  SSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVW
         S       WH+RLGHP+ KVL+  ++SC + V  ++   FC+ACQ+GK H+LPF  S S A    EL+HTD+WGPAP+ +  G KYYV FVDD+SR+ W
Subjt:  SSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVW

Query:  IYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA----------------
        IYPLKQK++T  AF  F  + +NQF+  +K IQ + GGEY P+ KL    GI+ R++CPYTSQQNGR ERKH HI E GLTLLA                
Subjt:  IYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA----------------

Query:  ---------QVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSFGKA
                 QV   ++P +LM +K  D+  L+ FGCAC+PCL+PY   K Q+H+ +C +LG S  HKG+KC+ + GR+FIS +V FNE  FPF   F   
Subjt:  ---------QVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSFGKA

Query:  STSPTDSPSSPPIHMWFSNLPTVSA
            T SP    I++  ++ P  +A
Subjt:  STSPTDSPSSPPIHMWFSNLPTVSA

GAU51268.1 hypothetical protein TSUD_412550 [Trifolium subterraneum]  e value: 2.5e-152  % identity: 39.5
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        NSP  N L + I S+KLDR N+ LWK+L L ++R  KL+G++ G   CP+ ++                           TS+     +NP +  W+A D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        Q LLGWL NSM  ++ATQ++  E S++LW   Q L G  ++++  YL+  F  +RKG +KM EYL  MK+ +D L  AGSP+S  +L+ Q L GLD EYN
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY
        PVV  +  ++                         G++     N   +   RGN +  RG  R         G G G  +N +  CQV    GH+A++C 
Subjt:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY

Query:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL
         RF++ +TG     R  +T    Q            S + F+ASP    D  WY DSGA+NHVT   +      E+ G   + VG+G KL I   GS+ L
Subjt:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL

Query:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV
            N LNL +VL VP I KNL+SVSKL  DNN+ VEF  N C VKDK TG+ +LKG L DGL Q  N    V +S                        
Subjt:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV

Query:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR
                 K  WH++LGHP+ KVL+  +K C++ +  +++  FC+ACQFGK H+LPF  S S       LIH+D+WGPAP+ S  G KYYV F+DD+SR
Subjt:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR

Query:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------
        + WI+PLKQK+DT  AF  F  + +NQF+  +K IQ + GGEY  + K+    GI+ R++CPYTSQQNGR ERKH H+ E+GLTLLAQ            
Subjt:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------

Query:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF
             +IN         ++P +LMFK+  D+ AL+ FGCAC+PCL+PY   K QFH+ +C ++G S  HKG+KC+ + GR+F+S +V FNE  FPF   F
Subjt:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF

PNX76291.1 gag/pol polyprotein - maize retrotransposon Hopscotch, partial [Trifolium pratense]  e value: 3.2e-155  % identity: 40.19
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        NS   N L + + S+KLDR N+ LW+++ LPI+R  +L+G++ G+K CP+ +I+A              A SS+               NP++E W A D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        Q LLGWL NSMT  +ATQ++  E S +LW   Q L G  +R++  YL+  F  +RKG +KM +YL  MK+ AD L  AG+P+S  +L+ Q L GLD EYN
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGRVGISC--------------------------------------GYRQNSYGRGGQRGNGYRGRG-RARGNGYGNYNNNRPVCQVYGKPGHV
        PVV  +  +  +S                                       G R NS        N +RG   R    G G   + +  CQV G   H+
Subjt:  PVVAMIQGRVGISC--------------------------------------GYRQNSYGRGGQRGNGYRGRG-RARGNGYGNYNNNRPVCQVYGKPGHV

Query:  ALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYI
        A++C+ RF+K ++     N   N         S NAF+ASQ++         + D  WY DSGASNHVT   +   N +E+ G   + VG+G KL I   
Subjt:  ALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYI

Query:  GSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSV
        GSS L      LNL ++L VP I KNL+SVSKLA DNN+ VEF EN C VKDK TG+ +L+G L DGL Q     +S  VS                   
Subjt:  GSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSV

Query:  LSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFV
                      K  WH++LGHP+ KVL+  +KSC++ +  +++  FC+ACQ+GK H LPF  S S A    EL+HTD+WGPAP+ S  G KYYV F+
Subjt:  LSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFV

Query:  DDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-------
        DD++R+ WIYPLKQK+DT  AF  F  MV+NQF   +K+IQ + GGEY P+ K     GI+ R++CPYTSQQNGR ERKH HI E GLTLLAQ       
Subjt:  DDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-------

Query:  ------------------VINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFP
                          V + K+P +L+ K+  D+ +L+ FGCAC+P L+PY   K QFH+ +C +LG S  HKG+KCV + GR+FIS +V FNE  FP
Subjt:  ------------------VINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFP

Query:  FSTSFGKAST---SPTDSPSSP-PIH
        F   F        + T+SPSS  P+H
Subjt:  FSTSFGKAST---SPTDSPSSP-PIH

PNX94503.1 putative retrotransposon Ty1-copia subclass protein, partial [Trifolium pratense]  e value: 1.2e-159  % identity: 42.12
Query:  SIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP
        S+KLDR NF LWK+L LP++R  K +G++ G K CP  ++       TS   TE                     INP Y+ W A DQ LLGWL NSMT 
Subjt:  SIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP

Query:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGRVGIS
        ++ATQV+  E S++LW   Q L G  +R++  YL+  F  + K  +KM +YL  MK+ AD L  AGSP+S  +L+ Q L GLD EYNPVV  +  +  IS
Subjt:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGRVGIS

Query:  CGYRQ-----------------------------------NSYG-RGGQRGNGYR----GRGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRFNKEF
            Q                                   N +G RGG RG+  R    GRGRAR +        RP+CQ+ GK GH A +CY RF+K +
Subjt:  CGYRQ-----------------------------------NSYG-RGGQRGNGYR----GRGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRFNKEF

Query:  TGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKL
        T   +   GE +  A                  FVASP    D  WY DSGASNHVT     + +  E  G   + VG+G KL I   GS+ L D    +
Subjt:  TGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKL

Query:  NLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKA
        NL NVL VP I KNL+SVSKL  DNN  VEF EN+C VKDK TG+ +LKG L DGL Q                S+N     N +     +         
Subjt:  NLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKA

Query:  VSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPL
          K +WH++LGHP+ KVLE  +K  ++ +  ++K  FC+ACQFGK H+LPF  S S A    +LIHTD+WGPAP+ S    KYYV F+DD+SR+ WI+PL
Subjt:  VSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPL

Query:  KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-----------------VI
        KQK++T  AFN F  +V+NQF+  +K I+ + GGEY P+ K     GI+ +++CPYTSQQNGR ERKH H+ E+GLTLLAQ                 +I
Subjt:  KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-----------------VI

Query:  N--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF
        N         ++P TL+FKK  D+ AL+ FGCAC+PCL+PY   K QFH+ +C +LG S  HKG+KCV + GRVF+S +V FNE  FPF   F
Subjt:  N--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF

PNY01489.1 copia-like polyprotein, partial [Trifolium pratense]  e value: 1.3e-145  % identity: 39.67
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        NS   N L + I S+KLDR N+ LWK+L LP++R  K +G++ G K CP+ ++                           TS+     +NP ++ WMA D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        Q LLGWL NSM  ++ATQ++  E S++LW   Q L G  ++++  YL+  F  +RKG +KM EYL  MK+ +D L  +GSP+S  +L+ Q L GLD EYN
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY
        PVV  +  ++                         G++     N   +   RGN +  RG  R         G G G  +N +  CQV    GH A++C 
Subjt:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY

Query:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL
         RF++ +TG     R  +T    Q            S + FVASP    D  WY DSGASNHVT   +      E+ G   + VG+G KL I   GS+ L
Subjt:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL

Query:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV
            N LNL +VL VP I KNL+SVSKL  DNN+FVEF  N C VKDK TG+ +LKG L DGL Q  +V+                  +N +  V  +  
Subjt:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV

Query:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR
                 K  WH++LGHP+ KVLE  +K C++ +  +++  FC+ACQFGK H+LPF  S S       LIH+D+WGPAP+ S  G KYYV F+DD+SR
Subjt:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR

Query:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------
        + WI+PLKQK+DT  AF  F  + +NQF+  +K IQ + GGEY  + K+    GI+ R++CPYTSQQNGR ERKH H+VE+GLTLLAQ            
Subjt:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------

Query:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATG
             +IN         ++P +LMFK+  D+ AL+ FGCAC+PCL+PY   K QFH+ +C ++G S  HKG     A G
Subjt:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATG

TrEMBL top hits (columns: e value, % identity, alignment)
A0A2K3LCM1 Gag/pol polyprotein-maize retrotransposon Hopscotch (Fragment)  e value: 1.5e-155  % identity: 40.19
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        NS   N L + + S+KLDR N+ LW+++ LPI+R  +L+G++ G+K CP+ +I+A              A SS+               NP++E W A D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        Q LLGWL NSMT  +ATQ++  E S +LW   Q L G  +R++  YL+  F  +RKG +KM +YL  MK+ AD L  AG+P+S  +L+ Q L GLD EYN
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGRVGISC--------------------------------------GYRQNSYGRGGQRGNGYRGRG-RARGNGYGNYNNNRPVCQVYGKPGHV
        PVV  +  +  +S                                       G R NS        N +RG   R    G G   + +  CQV G   H+
Subjt:  PVVAMIQGRVGISC--------------------------------------GYRQNSYGRGGQRGNGYRGRG-RARGNGYGNYNNNRPVCQVYGKPGHV

Query:  ALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYI
        A++C+ RF+K ++     N   N         S NAF+ASQ++         + D  WY DSGASNHVT   +   N +E+ G   + VG+G KL I   
Subjt:  ALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYI

Query:  GSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSV
        GSS L      LNL ++L VP I KNL+SVSKLA DNN+ VEF EN C VKDK TG+ +L+G L DGL Q     +S  VS                   
Subjt:  GSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSV

Query:  LSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFV
                      K  WH++LGHP+ KVL+  +KSC++ +  +++  FC+ACQ+GK H LPF  S S A    EL+HTD+WGPAP+ S  G KYYV F+
Subjt:  LSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFV

Query:  DDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-------
        DD++R+ WIYPLKQK+DT  AF  F  MV+NQF   +K+IQ + GGEY P+ K     GI+ R++CPYTSQQNGR ERKH HI E GLTLLAQ       
Subjt:  DDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-------

Query:  ------------------VINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFP
                          V + K+P +L+ K+  D+ +L+ FGCAC+P L+PY   K QFH+ +C +LG S  HKG+KCV + GR+FIS +V FNE  FP
Subjt:  ------------------VINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFP

Query:  FSTSFGKAST---SPTDSPSSP-PIH
        F   F        + T+SPSS  P+H
Subjt:  FSTSFGKAST---SPTDSPSSP-PIH

A0A2K3MUJ9 Putative retrotransposon Ty1-copia subclass protein (Fragment)  e value: 6.0e-160  % identity: 42.12
Query:  SIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP
        S+KLDR NF LWK+L LP++R  K +G++ G K CP  ++       TS   TE                     INP Y+ W A DQ LLGWL NSMT 
Subjt:  SIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP

Query:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGRVGIS
        ++ATQV+  E S++LW   Q L G  +R++  YL+  F  + K  +KM +YL  MK+ AD L  AGSP+S  +L+ Q L GLD EYNPVV  +  +  IS
Subjt:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGRVGIS

Query:  CGYRQ-----------------------------------NSYG-RGGQRGNGYR----GRGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRFNKEF
            Q                                   N +G RGG RG+  R    GRGRAR +        RP+CQ+ GK GH A +CY RF+K +
Subjt:  CGYRQ-----------------------------------NSYG-RGGQRGNGYR----GRGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRFNKEF

Query:  TGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKL
        T   +   GE +  A                  FVASP    D  WY DSGASNHVT     + +  E  G   + VG+G KL I   GS+ L D    +
Subjt:  TGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKL

Query:  NLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKA
        NL NVL VP I KNL+SVSKL  DNN  VEF EN+C VKDK TG+ +LKG L DGL Q                S+N     N +     +         
Subjt:  NLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKA

Query:  VSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPL
          K +WH++LGHP+ KVLE  +K  ++ +  ++K  FC+ACQFGK H+LPF  S S A    +LIHTD+WGPAP+ S    KYYV F+DD+SR+ WI+PL
Subjt:  VSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPL

Query:  KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-----------------VI
        KQK++T  AFN F  +V+NQF+  +K I+ + GGEY P+ K     GI+ +++CPYTSQQNGR ERKH H+ E+GLTLLAQ                 +I
Subjt:  KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ-----------------VI

Query:  N--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF
        N         ++P TL+FKK  D+ AL+ FGCAC+PCL+PY   K QFH+ +C +LG S  HKG+KCV + GRVF+S +V FNE  FPF   F
Subjt:  N--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF

A0A2Z6MBG6 Integrase catalytic domain-containing protein  e value: 4.0e-156  % identity: 40.36
Query:  SPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQ
        S + N L + + S+KLDR+N+ LWK+L LP++R  KL+G++ G + CP+ +I++++     +SA                           +  W A DQ
Subjt:  SPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQ

Query:  LLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNP
         LLGW+ NSMT E+ATQ++  E S++LW   Q L G  +R++  YL+  F   RKG +KM +YL  MK+  D L  AG+PVS  +L+ Q L GLD EYNP
Subjt:  LLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNP

Query:  VVAMIQGRVGIS-----------------------------CGYRQNSYGRGGQRGNGYRG---RGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRF
        VV  +  +  +S                                   S  RG    N +RG   RG   G G G    N   CQV G   H+A++C+ RF
Subjt:  VVAMIQGRVGIS-----------------------------CGYRQNSYGRGGQRGNGYRG---RGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRF

Query:  NKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDG
        +K ++   N + G +         S NAF+ASQ++         V D  WY DSGASNHVT       + TE+ G   + VG+G KL I   GSS L   
Subjt:  NKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDG

Query:  YNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSN
           LNL ++L VPNI KNL+SVSKLA DNN+ VEF EN C VKDK TG+V+LKG L DGL Q                        +  K   SAFV   
Subjt:  YNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSN

Query:  SSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVW
         S       WH+RLGHP+ KVL+  ++SC + V  ++   FC+ACQ+GK H+LPF  S S A    EL+HTD+WGPAP+ +  G KYYV FVDD+SR+ W
Subjt:  SSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVW

Query:  IYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA----------------
        IYPLKQK++T  AF  F  + +NQF+  +K IQ + GGEY P+ KL    GI+ R++CPYTSQQNGR ERKH HI E GLTLLA                
Subjt:  IYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA----------------

Query:  ---------QVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSFGKA
                 QV   ++P +LM +K  D+  L+ FGCAC+PCL+PY   K Q+H+ +C +LG S  HKG+KC+ + GR+FIS +V FNE  FPF   F   
Subjt:  ---------QVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSFGKA

Query:  STSPTDSPSSPPIHMWFSNLPTVSA
            T SP    I++  ++ P  +A
Subjt:  STSPTDSPSSPPIHMWFSNLPTVSA

A0A2Z6P4D5 Integrase catalytic domain-containing protein  e value: 1.2e-152  % identity: 39.5
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        NSP  N L + I S+KLDR N+ LWK+L L ++R  KL+G++ G   CP+ ++                           TS+     +NP +  W+A D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        Q LLGWL NSM  ++ATQ++  E S++LW   Q L G  ++++  YL+  F  +RKG +KM EYL  MK+ +D L  AGSP+S  +L+ Q L GLD EYN
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY
        PVV  +  ++                         G++     N   +   RGN +  RG  R         G G G  +N +  CQV    GH+A++C 
Subjt:  PVVAMIQGRV-------------------------GISCGYRQNSYGRGGQRGNGYRGRGRAR---------GNGYGNYNNNRPVCQVYGKPGHVALECY

Query:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL
         RF++ +TG     R  +T    Q            S + F+ASP    D  WY DSGA+NHVT   +      E+ G   + VG+G KL I   GS+ L
Subjt:  QRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCL

Query:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV
            N LNL +VL VP I KNL+SVSKL  DNN+ VEF  N C VKDK TG+ +LKG L DGL Q  N    V +S                        
Subjt:  TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFV

Query:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR
                 K  WH++LGHP+ KVL+  +K C++ +  +++  FC+ACQFGK H+LPF  S S       LIH+D+WGPAP+ S  G KYYV F+DD+SR
Subjt:  LSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSR

Query:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------
        + WI+PLKQK+DT  AF  F  + +NQF+  +K IQ + GGEY  + K+    GI+ R++CPYTSQQNGR ERKH H+ E+GLTLLAQ            
Subjt:  YVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQ------------

Query:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF
             +IN         ++P +LMFK+  D+ AL+ FGCAC+PCL+PY   K QFH+ +C ++G S  HKG+KC+ + GR+F+S +V FNE  FPF   F
Subjt:  -----VIN--------GKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF

A0A803PEH4 Uncharacterized protein  e value: 1.3e-151  % identity: 41.78
Query:  MNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLY
        +NQ  S+KLDR+N+ LWK +   I+R ++L G+LSG  +CP                        E V+ G T  T     NP+YE+W+  DQLL+GWLY
Subjt:  MNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLY

Query:  NSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQG
        +SMT  +AT+VMG   +  L   ++ L+G  S++K D  R + Q +RKG+  MSEYLR  K+ ++ L  AG P    +LV+ VL GLD EY  +V  I+ 
Subjt:  NSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQG

Query:  R--------------------------------------VGISCGYRQNSYGRGGQ---------------RGNGYRGRGRARGNGYGNYNNNRPVCQVY
        R                                        ++     N  GRG Q               RG   R RGR RG G G    +RP CQVY
Subjt:  R--------------------------------------VGISCGYRQNSYGRGGQ---------------RGNGYRGRGRARGNGYGNYNNNRPVCQVY

Query:  GKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNK
        GK GH A  CY RF++ + G  + N   N   A Q+N + +A         FVA+PE++   +W+ADSGASNH+T+D   +    +Y G E V VG+G+K
Subjt:  GKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNK

Query:  LHISYIGSSCLT-DGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRV
        L I++IG+  L  +  N L L+++L VP I KNLVSVSKLA DNNV +EF+ NFCLVKDK T +V+L G L D L Q         + +   +SS+  + 
Subjt:  LHISYIGSSCLT-DGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRV

Query:  NNAEKSVLSAFVLS-----NSSKAVS-----KAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWG
        +N     LSAF +S     N S+  S       V H+RLGHPS+KVL   ++S ++ V  N     CDACQ+GKAH LPF  S +RA S  +LIHTDLWG
Subjt:  NNAEKSVLSAFVLS-----NSSKAVS-----KAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWG

Query:  PAPVQSVQGLKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHI
        PAP+ S     YY+ FVDDYSRY W+YPLK K+D  AAF  F A+V+NQF   +KS+++++GGEY P   L ++ GI+ +  CP+TS QNGR +RKH H 
Subjt:  PAPVQSVQGLKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHI

Query:  VEMGLTLLAQVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF
        VEMGLTLLAQ                           CFPCLR YQ+ KFQFHS KC  LG S  +KG+KC++ TGR++IS +V FNE  FPF T F
Subjt:  VEMGLTLLAQVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSF

SwissProt top hits (columns: e value, % identity, alignment)
P04146 Copia protein  e value: 1.1e-25  % identity: 25.1
Query:  CQVYGKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVG
        C   G+ GH+  +C+      +    N    EN +    +     AF+  +  N       ++ +  +  DSGAS+H+  D +   +  E     ++ V 
Subjt:  CQVYGKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVG

Query:  -DGNKLHISYIGSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSN
          G  ++ +  G   L + + ++ LE+VL       NL+SV +L ++  + +EF ++   +       V   G LN           +V V N    S N
Subjt:  -DGNKLHISYIGSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSN

Query:  STRVNNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPS-LKVLESAIKSC----SLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFEL--IHTDLW
        +   NN                     +WH+R GH S  K+LE   K+     SL        + C+ C  GK   LPF +   +   K  L  +H+D+ 
Subjt:  STRVNNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPS-LKVLESAIKSC----SLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFEL--IHTDLW

Query:  GPAPVQSVQGLKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKH
        GP    ++    Y+VIFVD ++ Y   Y +K K+D ++ F  F+A  +  F+  V  +  +NG EY+   + + C   GI   LT P+T Q NG +ER  
Subjt:  GPAPVQSVQGLKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKH

Query:  MHIVEMGLTLL---------------------------AQVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHK
          I E   T++                           A V + KTP  +   K      LRVFG   +  ++  Q  KF   S K  ++G  P   G K
Subjt:  MHIVEMGLTLL---------------------------AQVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHK

Query:  CVTATGRVFI
           A    FI
Subjt:  CVTATGRVFI

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94  e value: 1.6e-40  % identity: 24.36
Query:  ESWMAVDQLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYL-RQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVL
        E W  +D+     +   ++ +V   ++  + ++ +W  ++ L+  ++   + YL +Q++            +L +       L   G  +   +    +L
Subjt:  ESWMAVDQLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYL-RQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVL

Query:  LGLDEEY-NPVVAMIQGRVGISC--------------------GYRQNSYGRGG--QRGNGYRGRGRARGNGYGNYNNNRPVCQVYGKPGHVALEC--YQ
          L   Y N    ++ G+  I                      G    + GRG   QR +   GR  ARG       +    C    +PGH   +C   +
Subjt:  LGLDEEY-NPVVAMIQGRVGISC--------------------GYRQNSYGRGG--QRGNGYRGRGRARGNGYGNYNNNRPVCQVYGKPGHVALEC--YQ

Query:  RFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLH--ISYIGSSC
        +   E +G +N    +NT    Q+N +   F+  +     ++ PE      W  D+ AS+H T   +       Y   +  TV  GN  +  I+ IG  C
Subjt:  RFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLH--ISYIGSSC

Query:  L-TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSA
        + T+    L L++V  VP++  NL+S   L RD       ++ + L K      V+ KG     L +                        NAE   +  
Subjt:  L-TDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSA

Query:  FVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDY
          L+ +   +S  +WHKR+GH S K L+   K   +        + CD C FGK H + F  S  R  +  +L+++D+ GP  ++S+ G KY+V F+DD 
Subjt:  FVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDY

Query:  SRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLL----------
        SR +W+Y LK K+  +  F  F A+V+ +    +K ++++NGGEY      + C S GI+   T P T Q NG  ER +  IVE   ++L          
Subjt:  SRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLL----------

Query:  -------AQVINGKTPMTLMFK--------KTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKC-VTATGRVFISINVTFNEAE
                 +IN    + L F+        K + +  L+VFGC  F  +   Q  K    S  C ++G      G++       +V  S +V F E+E
Subjt:  -------AQVINGKTPMTLMFK--------KTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKC-VTATGRVFISINVTFNEAE

Q12501 Transposon Ty2-OR2 Gag-Pol polyprotein (e value: 6.0e-16; % identity: 27.72)
Query:  NSSKAVSKAVW---HKRLGHPSLKVLESAIKSCSLPVV-------SNEKPQFCDACQFGKA----HILPFSESISRASSKFELIHTDLWGPAPVQSVQGL
        N SK+V+K  +   H+ LGH + + ++ ++K  ++  +       SN     C  C  GK+    H+         +   F+ +HTD++GP         
Subjt:  NSSKAVSKAVW---HKRLGHPSLKVLESAIKSCSLPVV-------SNEKPQFCDACQFGKA----HILPFSESISRASSKFELIHTDLWGPAPVQSVQGL

Query:  KYYVIFVDDYSRYVWIYPL--KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLT
         Y++ F D+ +R+ W+YPL  +++      F   LA +KNQF++ V  IQ + G EY    +HK   + GI    T    S+ +G  ER +  ++    T
Subjt:  KYYVIFVDDYSRYVWIYPL--KQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYI--PIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLT

Query:  LL
        LL
Subjt:  LL

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 (e value: 2.3e-84; % identity: 29.13)
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        N+  LN  M+ +T  KL  +N+L+W      +   Y+L G L G    P   I                          GT + P   +NP Y  W   D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN
        +L+   +  +++  V   V     + ++W  +++++   S      LR   +Q  KG   + +Y++ + +  D L   G P+     V +VL  L EEY 
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYN

Query:  PVVAMIQGR-------------------------------VGISCGYRQ---NSYGRGGQRGNGYRGRG---------RARGNGYGNYNNNRPV---CQV
        PV+  I  +                                  +  +R     +    G R N Y  R          ++  N + N N ++P    CQ+
Subjt:  PVVAMIQGR-------------------------------VGISCGYRQ---NSYGRGGQRGNGYRGRG---------RARGNGYGNYNNNRPV---CQV

Query:  YGKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGN
         G  GH A  C Q   + F    N  +  +  P T   P  N  + S    P+ ++       +W  DSGA++H+T+D+N ++    Y G + V V DG+
Subjt:  YGKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGN

Query:  KLHISYIGSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRV
         + IS+ GS+ L+     LNL N+L VPNI KNL+SV +L   N V VEF      VKD  TG  +L+G   D L ++  +A+S  VS  A  SS +T  
Subjt:  KLHISYIGSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRV

Query:  NNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVS-NEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQG
                              + WH RLGHP+  +L S I + SL V++ + K   C  C   K++ +PFS+S   ++   E I++D+W  +P+ S   
Subjt:  NNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVS-NEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQG

Query:  LKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA
         +YYVIFVD ++RY W+YPLKQK+     F  F  +++N+F + + +  ++NGGE++ + +     GI    + P+T + NG +ERKH HIVE GLTLL+
Subjt:  LKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLA

Query:  QVINGKT-------------------------PMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCV-TATGRVFISIN
             KT                         P   +F  + ++  LRVFGCAC+P LRPY   K    S +C +LG S     + C+   T R++IS +
Subjt:  QVINGKT-------------------------PMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCV-TATGRVFISIN

Query:  VTFNEAEFPFSTSFGKASTSPTDSPSSPPIHMW--FSNLPT
        V F+E  FPFS     A+ SP          +W   + LPT
Subjt:  VTFNEAEFPFSTSFGKASTSPTDSPSSPPIHMW--FSNLPT

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 (e value: 4.0e-84; % identity: 29.58)
Query:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD
        N+  LN  M+ +T  KL  +N+L+W      +   Y+L G L G    P   I                          GT + P   +NP Y  W   D
Subjt:  NSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVD

Query:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNL-KMSEYLRIMKSHADNLGQAGSPVSPR--------------
        +L+   +  +++  V   V     + ++W  +++++   S      LR + +  +   L K  ++   ++   +NL     PV  +              
Subjt:  QLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNL-KMSEYLRIMKSHADNLGQAGSPVSPR--------------

Query:  ----NLVSQVLLGLDEEYNPVVAMIQGRVGISCGYRQNSYGRGGQRGNGYRGRGRARGNGYGNYNNNRPV------CQVYGKPGHVALECYQRFNKEFTG
            N  S++L     E  P+ A +      +    QN+ G      N        + +  G+ ++NR        CQ+    GH A  C Q    + T 
Subjt:  ----NLVSQVLLGLDEEYNPVVAMIQGRVGISCGYRQNSYGRGGQRGNGYRGRGRARGNGYGNYNNNRPV------CQVYGKPGHVALECYQRFNKEFTG

Query:  PQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKLNL
            N+ ++T P T   P  N  V S    P+ A+       +W  DSGA++H+T+D+N ++    Y G + V + DG+ + I++ GS+ L      L+L
Subjt:  PQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASPEIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKLNL

Query:  ENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKAVS
          VL VPNI KNL+SV +L   N V VEF      VKD  TG  +L+G   D L ++  +A+S  VS  A   S +T                       
Subjt:  ENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKAVS

Query:  KAVWHKRLGHPSLKVLESAIKSCSLPVVS-NEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPLK
         + WH RLGHPSL +L S I + SLPV++ + K   C  C   K+H +PFS S   +S   E I++D+W  +P+ S+   +YYVIFVD ++RY W+YPLK
Subjt:  KAVWHKRLGHPSLKVLESAIKSCSLPVVS-NEKPQFCDACQFGKAHILPFSESISRASSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPLK

Query:  QKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQVINGKT--------------
        QK+     F  F ++V+N+F + + ++ ++NGGE++ +       GI    + P+T + NG +ERKH HIVEMGLTLL+     KT              
Subjt:  QKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKHMHIVEMGLTLLAQVINGKT--------------

Query:  -----------PMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCV-TATGRVFISINVTFNEAEFPFSTSFGKASTSP
                   P   +F +  ++  L+VFGCAC+P LRPY   K +  S++CA++G S     + C+   TGR++ S +V F+E  FPFST+    STS 
Subjt:  -----------PMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCV-TATGRVFISINVTFNEAEFPFSTSFGKASTSP

Query:  TDSPSSPPIHMWFSNLPT
             S P     + LPT
Subjt:  TDSPSSPPIHMWFSNLPT

Arabidopsis top hits (e value, % identity, alignment)
AT1G34070.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162) (e value: 5.8e-06; % identity: 22.45)
Query:  IKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP-
        + ++ SN+  W+ L L    S+ + GH+ G  L        TN  +                                  +W   D ++   LY ++TP 
Subjt:  IKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINPQYESWMAVDQLLLGWLYNSMTP-

Query:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGR
        +     +    S+++W  I+  F     A+   L    +    G++++++Y R MK  AD+L     PV+ RNLV  VL GL+ +++ ++ +I+ R
Subjt:  EVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNPVVAMIQGR

ATMG00300.1 Gag-Pol-related retrotransposon family protein (e value: 4.5e-06; % identity: 25.95)
Query:  RVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFG
        R +LKG  +D L   +  +     SN A  + + TR                        +WH RL H S + +E  +K   L        +FC+ C +G
Subjt:  RVVLKGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFG

Query:  KAHILPFSESISRASSKFELIHTDLWGPAPV
        K H + FS       +  + +H+DLWG   V
Subjt:  KAHILPFSESISRASSKFELIHTDLWGPAPV


Sequences
CDS sequence
ATGACCAACGTCAATCCCATCGGGTATCCTACCTTGGCCACTGGAACCATGAATTTCAACAGTCCCTCGTTGAACCAGCTCATGAATCAAATTACGTCTATCAAGTTGGA
TAGAAGCAACTTTTTGCTGTGGAAGAATCTTGCACTACCCATTCTTCGCAGCTACAAGCTCGAAGGGCACCTATCGGGAGAGAAATTGTGCCCTAAGATGTATATCTCTG
CAACAAATGAAGGTAACACAAGCTCAAGTGCTACTGAAGCAGGAGCCTCTAGTTCGGAAGGAGTCGTAAGTGGAGGAACAAGCTCAACTCCAGAGGCGACTATCAATCCG
CAATATGAGTCGTGGATGGCGGTCGATCAGCTGCTCTTGGGGTGGCTGTACAACTCGATGACGCCAGAGGTTGCCACTCAAGTGATGGGATTCGAAAAATCTCAAGAGTT
GTGGGCAGCAATACAGGAGCTTTTTGGCGTGCAATCTCGAGCAAAGGAGGATTACCTCCGCCAGGTATTTCAACAATCACGAAAAGGTAACTTAAAGATGTCTGAATATC
TGCGTATTATGAAGAGTCATGCAGATAACTTAGGTCAAGCAGGGAGTCCAGTGTCGCCAAGGAACCTAGTGTCTCAAGTACTTCTCGGTCTCGATGAGGAGTATAATCCG
GTGGTGGCCATGATCCAGGGCAGAGTAGGGATCTCTTGTGGCTATCGTCAAAATTCGTATGGCAGAGGCGGTCAAAGAGGAAATGGCTATAGAGGACGAGGAAGAGCTCG
AGGAAATGGCTATGGGAACTATAACAATAACAGACCTGTGTGCCAAGTATACGGCAAACCAGGGCATGTGGCTTTAGAGTGCTATCAGAGGTTCAACAAAGAATTCACTG
GACCTCAAAATCAGAATAGAGGTGAAAACACTCGGCCAGCTACACAAAGTAATCCATCACCCAATGCGTTTGTTGCCAGCCAAAGTACAAATCCGTTCGTGGCCTCACCT
GAGATTGTTATAGATCCAAGCTGGTATGCTGACAGCGGGGCCTCCAATCATGTAACGACAGATTACAATGCCATGGCTAATCCAACTGAGTATGAAGGTACTGAAAGGGT
TACGGTGGGTGATGGTAATAAACTGCATATTTCTTATATTGGTAGTTCTTGTCTAACTGATGGCTATAACAAGCTAAACCTGGAAAACGTGTTATGTGTGCCAAATATTG
TCAAAAATTTAGTTAGTGTATCAAAACTAGCTAGAGATAACAACGTTTTTGTGGAATTTCATGAGAATTTTTGCTTGGTTAAGGACAAGGCTACGGGTCGAGTGGTGCTG
AAAGGAGCGCTTAACGATGGGCTAAATCAGTTTGTGAATGTCGCTGCCTCTGTTGATGTGTCTAATTCAGCAAGAAGAAGTTCAAACAGCACAAGAGTCAATAATGCTGA
AAAGTCTGTTTTATCTGCATTTGTTTTATCCAATAGTTCTAAGGCTGTGTCTAAGGCCGTATGGCACAAAAGACTTGGACATCCGTCTTTAAAGGTCCTTGAGTCTGCTA
TTAAGTCATGTAGTCTTCCAGTTGTGTCTAATGAGAAGCCTCAATTTTGTGATGCTTGTCAATTTGGGAAAGCACACATACTTCCCTTTTCTGAATCTATATCTCGAGCT
TCGTCAAAATTTGAATTAATCCACACAGACCTTTGGGGTCCAGCACCCGTTCAGTCAGTTCAAGGTTTGAAATACTATGTTATTTTTGTGGATGATTACAGCAGATATGT
GTGGATTTATCCTCTTAAACAAAAGAATGACACCTATGCTGCCTTTAATCACTTTCTGGCCATGGTTAAGAATCAGTTTGATAGTTGTGTCAAGTCTATACAAACAAACA
ATGGTGGAGAGTATATTCCTATTCATAAATTGTGTGAGTCTTTGGGGATTAAAATTCGTCTAACATGCCCGTATACATCTCAGCAGAATGGAAGGACAGAGAGAAAGCAC
ATGCATATAGTGGAGATGGGTCTCACCTTGCTTGCTCAGGTAATCAATGGCAAGACTCCTATGACACTCATGTTCAAGAAAACCATTGACTTTGGTGCGTTGAGAGTATT
TGGCTGCGCCTGCTTCCCCTGCCTTCGTCCGTATCAAACGCAAAAATTTCAATTCCACTCAGAGAAGTGTGCCTATCTCGGCCCTAGTCCAGTTCATAAAGGGCATAAGT
GTGTTACAGCCACTGGGAGAGTATTTATCTCCATAAATGTGACATTCAATGAGGCTGAATTTCCTTTCTCCACCAGCTTTGGCAAGGCTTCAACATCTCCAACGGATTCT
CCCTCATCACCACCTATTCATATGTGGTTTTCAAACCTACCAACTGTCTCAGCTAACCAGCAACCCACAGAATCTATCACTTACCCACCAGTGACCCCACATATGTAG
mRNA sequence
ATGACCAACGTCAATCCCATCGGGTATCCTACCTTGGCCACTGGAACCATGAATTTCAACAGTCCCTCGTTGAACCAGCTCATGAATCAAATTACGTCTATCAAGTTGGA
TAGAAGCAACTTTTTGCTGTGGAAGAATCTTGCACTACCCATTCTTCGCAGCTACAAGCTCGAAGGGCACCTATCGGGAGAGAAATTGTGCCCTAAGATGTATATCTCTG
CAACAAATGAAGGTAACACAAGCTCAAGTGCTACTGAAGCAGGAGCCTCTAGTTCGGAAGGAGTCGTAAGTGGAGGAACAAGCTCAACTCCAGAGGCGACTATCAATCCG
CAATATGAGTCGTGGATGGCGGTCGATCAGCTGCTCTTGGGGTGGCTGTACAACTCGATGACGCCAGAGGTTGCCACTCAAGTGATGGGATTCGAAAAATCTCAAGAGTT
GTGGGCAGCAATACAGGAGCTTTTTGGCGTGCAATCTCGAGCAAAGGAGGATTACCTCCGCCAGGTATTTCAACAATCACGAAAAGGTAACTTAAAGATGTCTGAATATC
TGCGTATTATGAAGAGTCATGCAGATAACTTAGGTCAAGCAGGGAGTCCAGTGTCGCCAAGGAACCTAGTGTCTCAAGTACTTCTCGGTCTCGATGAGGAGTATAATCCG
GTGGTGGCCATGATCCAGGGCAGAGTAGGGATCTCTTGTGGCTATCGTCAAAATTCGTATGGCAGAGGCGGTCAAAGAGGAAATGGCTATAGAGGACGAGGAAGAGCTCG
AGGAAATGGCTATGGGAACTATAACAATAACAGACCTGTGTGCCAAGTATACGGCAAACCAGGGCATGTGGCTTTAGAGTGCTATCAGAGGTTCAACAAAGAATTCACTG
GACCTCAAAATCAGAATAGAGGTGAAAACACTCGGCCAGCTACACAAAGTAATCCATCACCCAATGCGTTTGTTGCCAGCCAAAGTACAAATCCGTTCGTGGCCTCACCT
GAGATTGTTATAGATCCAAGCTGGTATGCTGACAGCGGGGCCTCCAATCATGTAACGACAGATTACAATGCCATGGCTAATCCAACTGAGTATGAAGGTACTGAAAGGGT
TACGGTGGGTGATGGTAATAAACTGCATATTTCTTATATTGGTAGTTCTTGTCTAACTGATGGCTATAACAAGCTAAACCTGGAAAACGTGTTATGTGTGCCAAATATTG
TCAAAAATTTAGTTAGTGTATCAAAACTAGCTAGAGATAACAACGTTTTTGTGGAATTTCATGAGAATTTTTGCTTGGTTAAGGACAAGGCTACGGGTCGAGTGGTGCTG
AAAGGAGCGCTTAACGATGGGCTAAATCAGTTTGTGAATGTCGCTGCCTCTGTTGATGTGTCTAATTCAGCAAGAAGAAGTTCAAACAGCACAAGAGTCAATAATGCTGA
AAAGTCTGTTTTATCTGCATTTGTTTTATCCAATAGTTCTAAGGCTGTGTCTAAGGCCGTATGGCACAAAAGACTTGGACATCCGTCTTTAAAGGTCCTTGAGTCTGCTA
TTAAGTCATGTAGTCTTCCAGTTGTGTCTAATGAGAAGCCTCAATTTTGTGATGCTTGTCAATTTGGGAAAGCACACATACTTCCCTTTTCTGAATCTATATCTCGAGCT
TCGTCAAAATTTGAATTAATCCACACAGACCTTTGGGGTCCAGCACCCGTTCAGTCAGTTCAAGGTTTGAAATACTATGTTATTTTTGTGGATGATTACAGCAGATATGT
GTGGATTTATCCTCTTAAACAAAAGAATGACACCTATGCTGCCTTTAATCACTTTCTGGCCATGGTTAAGAATCAGTTTGATAGTTGTGTCAAGTCTATACAAACAAACA
ATGGTGGAGAGTATATTCCTATTCATAAATTGTGTGAGTCTTTGGGGATTAAAATTCGTCTAACATGCCCGTATACATCTCAGCAGAATGGAAGGACAGAGAGAAAGCAC
ATGCATATAGTGGAGATGGGTCTCACCTTGCTTGCTCAGGTAATCAATGGCAAGACTCCTATGACACTCATGTTCAAGAAAACCATTGACTTTGGTGCGTTGAGAGTATT
TGGCTGCGCCTGCTTCCCCTGCCTTCGTCCGTATCAAACGCAAAAATTTCAATTCCACTCAGAGAAGTGTGCCTATCTCGGCCCTAGTCCAGTTCATAAAGGGCATAAGT
GTGTTACAGCCACTGGGAGAGTATTTATCTCCATAAATGTGACATTCAATGAGGCTGAATTTCCTTTCTCCACCAGCTTTGGCAAGGCTTCAACATCTCCAACGGATTCT
CCCTCATCACCACCTATTCATATGTGGTTTTCAAACCTACCAACTGTCTCAGCTAACCAGCAACCCACAGAATCTATCACTTACCCACCAGTGACCCCACATATGTAG
Protein sequence
MTNVNPIGYPTLATGTMNFNSPSLNQLMNQITSIKLDRSNFLLWKNLALPILRSYKLEGHLSGEKLCPKMYISATNEGNTSSSATEAGASSSEGVVSGGTSSTPEATINP
QYESWMAVDQLLLGWLYNSMTPEVATQVMGFEKSQELWAAIQELFGVQSRAKEDYLRQVFQQSRKGNLKMSEYLRIMKSHADNLGQAGSPVSPRNLVSQVLLGLDEEYNP
VVAMIQGRVGISCGYRQNSYGRGGQRGNGYRGRGRARGNGYGNYNNNRPVCQVYGKPGHVALECYQRFNKEFTGPQNQNRGENTRPATQSNPSPNAFVASQSTNPFVASP
EIVIDPSWYADSGASNHVTTDYNAMANPTEYEGTERVTVGDGNKLHISYIGSSCLTDGYNKLNLENVLCVPNIVKNLVSVSKLARDNNVFVEFHENFCLVKDKATGRVVL
KGALNDGLNQFVNVAASVDVSNSARRSSNSTRVNNAEKSVLSAFVLSNSSKAVSKAVWHKRLGHPSLKVLESAIKSCSLPVVSNEKPQFCDACQFGKAHILPFSESISRA
SSKFELIHTDLWGPAPVQSVQGLKYYVIFVDDYSRYVWIYPLKQKNDTYAAFNHFLAMVKNQFDSCVKSIQTNNGGEYIPIHKLCESLGIKIRLTCPYTSQQNGRTERKH
MHIVEMGLTLLAQVINGKTPMTLMFKKTIDFGALRVFGCACFPCLRPYQTQKFQFHSEKCAYLGPSPVHKGHKCVTATGRVFISINVTFNEAEFPFSTSFGKASTSPTDS
PSSPPIHMWFSNLPTVSANQQPTESITYPPVTPHM
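
The protein sequence above appears to correspond to a standard-code translation of the CDS (the CDS begins ATG ACC AAC GTC AAT CCC and the protein begins MTNVNP). The following is a minimal sketch of that relationship, not the database's own pipeline. It assumes the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the gene is translated with the standard nuclear code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 1, stopping at the first stop codon.

    Codons containing characters outside A/C/G/T are reported as 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon: end of the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First six codons of the CDS listed above.
print(translate("ATGACCAACGTCAATCCC"))  # MTNVNP
```

To check the full record, the CDS lines would be concatenated with line breaks removed and the result compared to the concatenated protein sequence.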