CuGenDBv2

CSPI02G15880 (gene) of Cucumber (PI 183967) v1 genome

Gene ID: CSPI02G15880
Organism: Cucumis sativus L. var. sativus cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Integrase catalytic domain-containing protein
Genome location: Chr2:15347244..15349890
RNA-Seq Expression: CSPI02G15880
Synteny: CSPI02G15880
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR001878 - Zinc finger, CCHC-type
IPR005162 - Retrotransposon gag domain
IPR012337 - Ribonuclease H-like superfamily
IPR025724 - GAG-pre-integrase domain
IPR029472 - Retrotransposon Copia-like, N-terminal
IPR036397 - Ribonuclease H superfamily


Homology
GenBank top hits (e-value, % identity, alignment)
KAA0065480.1 Cysteine-rich RLK (receptor-like protein kinase) 8 [Cucumis melo var. makuwa] (e-value: 2.4e-179; identity: 48.84%)
Query:  ESSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSI
        +SST+GS    ++  +S  DAQLNP+ +HHS+ PT  +V+ PL G+ NY+SWSRAM++A+SG+NK GFITG I+KPS+G LL AW CNND++ASWI+NS+
Subjt:  ESSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSI

Query:  SKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNES
        SKEIAAS++Y G++KEIWDEL++R+KQSNGP IYQLRK+ VT  QG+L++E YY K+ TIWQ L EYR  ++CTC G K  ID L +E++M FLMGLN+S
Subjt:  SKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNES

Query:  YSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG-SSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVH
        Y+ +RAQILL+ PLP IN VFSL+IQEE+QRS G  +P I+ + L   S    S+D+++KK+ RP CS CG KGH ADKCYK HGYPPG++  NS NS+ 
Subjt:  YSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG-SSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVH

Query:  QRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLN----------------------TLQNGEN--------------------FKN--
           D +  N        +      FF+SLNS+QY+QL+ +L  HL                       T  N ++                    FKN  
Subjt:  QRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLN----------------------TLQNGEN--------------------FKN--

Query:  ETTHIAV-----------------------LKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQH
         T ++ V                       LKDVL++  F YNL+SVS LL     ++ F  + C+IQD      IGKA   NGLY+L  K    NCI  
Subjt:  ETTHIAV-----------------------LKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQH

Query:  TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT
           +  +S   WH+R+GH S   ++ L+  + +S+       CH+CPLAKQ+RLSF   NN+A + FDL+H DIWGPFK P++ G+ YF T+V D  R+T
Subjt:  TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT

Query:  WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIF
        WVY+L  KSD+L ++P+F +LIETQFSKVIK FRSDNAPEL   + FA+ GT HQFSC   PQQNSVVERKHQHLLNVARAL F  +   +F
Subjt:  WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIF

KAE8733468.1 hypothetical protein F3Y22_tig00001120pilonHSYRG00022 [Hibiscus syriacus] (e-value: 1.4e-155; identity: 41.08%)
Query:  VSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIK--KPSEGNLLSAWKCNNDVIASWIINSISKEIAASLV
        V   +   D   NP+ LH S  P   LV   L  ++N+ SW R+MMLALS KNK+GF+ G I+   PS  N  +AW   N+++ SW++NS+SK+IAASL+
Subjt:  VSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIK--KPSEGNLLSAWKCNNDVIASWIINSISKEIAASLV

Query:  YNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMD---ECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRA
        Y+    E+W++L +R++QSNGP ++ L+K L    Q  LSV  YY ++  +W EL   R +    +C C G ++M+     E V+ FLMGLNESY+ IR 
Subjt:  YNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMD---ECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRA

Query:  QILLIDPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTI
        QILL+DPLPPI++VFSL++QEE QR++ S P I      A + + +    ++K  T  +CS+C   GHT D+CYKL GYPPG+   N ++S   R  N+ 
Subjt:  QILLIDPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTI

Query:  QNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQNGENFKNETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLK
                                 Q  QL+ ML +    LQ+  +    TT I        + DF++NLLSVS LLK    ++ F  ++CLIQD  L +
Subjt:  QNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQNGENFKNETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLK

Query:  TIGKAELTNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDI
         IG+ EL  GLYLL+M + + N   +   +  +  S WH R+GHPS+  +N L  ++ + +    K+ C I PLAKQ+RL FP+  +    +F+LIHCDI
Subjt:  TIGKAELTNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDI

Query:  WGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQH
        WGP+K  T+ G + F T+V D SR  WVYLL+HKSD+  +I  F+ +I+ QF   IK FRSDNAPEL F +LF   G  HQFSC  TPQQNSVV+RKHQH
Subjt:  WGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQH

Query:  LLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDI
        LL VARAL FQSKV + FWG+C+L+A YLINR P   LSN +P+  L+    DY+ ++ FGCL + ST    + KF  RA P VF+G+PPG+KGYR+Y +
Subjt:  LLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDI

Query:  AKRK--FFISRDVLFFEELFPFHSIKE-----KDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNPPDH
          +K  F + + V    ++   H I++      D      F           +C++  D++   PT  DS   +   D  +  +  S  T   P +
Subjt:  AKRK--FFISRDVLFFEELFPFHSIKE-----KDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNPPDH

KZV25004.1 Cysteine-rich RLK (receptor-like protein kinase) 8 [Dorcoceras hygrometricum] (e-value: 2.0e-165; identity: 40.85%)
Query:  ITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNG
        I  + L+   +P+ LH+   P   LVS PL GS NY++W RAM++AL+ KNK+GFI   I +P SE  L  +W   N ++ SWI+NS+++ IA SL+Y  
Subjt:  ITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNG

Query:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLID
          +EIW +L ER+ +SN P IYQ++K L    QGS+ V  YY K+ T+W EL +Y+P   CTC   ++  ++ N E VM FLMGLN+SY+Q+RAQ+L+I+
Subjt:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLID

Query:  PLPPINRVFSLIIQEERQRSI------------GSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH---RLANSNNS
        PLP I +VF+L+IQEERQRSI            G   ++ S    A S R   + K  + D R ICS+C ++ HT DKCYKLHGYPPGH   +   S  S
Subjt:  PLPPINRVFSLIIQEERQRSI------------GSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH---RLANSNNS

Query:  VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQN-------------GENFKNETTHI------------------------
         H  Q ++    + +  ++   +      SL   Q  QL+  L + L T QN                  + T+HI                        
Subjt:  VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQN-------------GENFKNETTHI------------------------

Query:  --------------------------------AVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVN
                                         VL++VLY+P F++NLLSVS+L  +   ++SF   +C IQD   ++ IG  +    LY+L+  +  + 
Subjt:  --------------------------------AVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVN

Query:  CIQHTALMCKV---SASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIV
             + +C     ++ +WH+RMGHPS ++++ L  ++ I +  +   +CH C L+KQRRL     NNI+  IF+L+H D WGPF   +  G  +F TIV
Subjt:  CIQHTALMCKV---SASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIV

Query:  YDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW
         D SRYTWVY+L+ KSD+L + P F +++ TQF   +K  RSDNAPEL F D FAK G TH  SC   PQQNSVVERKHQH+LNVARAL+FQS +PL +W
Subjt:  YDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW

Query:  GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFP
         +C+ ++ YLINRTP  +L++ TPF  L  K   Y+ +K FGCL YAST   +R KF PRA  CVF+G+PPG KGY+L ++   + FISRDV+F E  FP
Subjt:  GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFP

Query:  FHS
        + +
Subjt:  FHS

RVW82526.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Vitis vinifera] (e-value: 4.6e-162; identity: 38.68%)
Query:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGS-NNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLL-SAWKCNNDVIASWIINSISKEIAASLVYNGNV
        S ++   +P+ LH+   P+ +LVS  LAGS +NY SW R+M+ AL+ KNK+GFI G I +P+  +LL S W   N ++ SW+ NS+ KEIA S++Y+   
Subjt:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGS-NNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLL-SAWKCNNDVIASWIINSISKEIAASLVYNGNV

Query:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPL
         EIW++L ER+ Q +GP I++L++ ++  TQGS  V  YY ++ ++W EL E++ +  C C G +  ++    E VM FL+GLNES++ I+AQILL++P 
Subjt:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPL

Query:  PPINRVFSLIIQEERQRSIGSSPSIESIT-----LMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNG
        PP+N+VFSL++QEE QRS+ +S S    T       A S     ++ S+ +  RP+C++C   GHT D+CYK+HGY PG R   +      R +  + N 
Subjt:  PPINRVFSLIIQEERQRSIGSSPSIESIT-----LMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNG

Query:  --NDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTH-----------LNTLQNG-ENF---------------------KNETTHIA-------------
           +++T       SA    L  DQ+ QLL +L  H            N LQ    NF                        T H+              
Subjt:  --NDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTH-----------LNTLQNG-ENF---------------------KNETTHIA-------------

Query:  --------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTAL
                                  VL+ VLYIP F++NL+S+S L + + F+  F    C IQD    K IG       LYLL   +     I    +
Subjt:  --------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTAL

Query:  MCKVSAS----MWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRY
        +   +++    +WH R+ HPS  +++ L   +++    N    C ICPLAKQ+RL F   NN++ + FDLIHCDIWGPF  PTH G  YF TIV D +R 
Subjt:  MCKVSAS----MWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRY

Query:  TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLS
        TWV+LL  KSD+  + P+F  +++T+F   IK  RSDNAPELN  +LF +    H FSC  TPQQNSVVERKHQH+LNVARAL FQS +P+ +WG+CVL+
Subjt:  TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLS

Query:  AAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKE
        + YLINR P  LL+N TPF  L  K   Y+ +K+FGCL Y+ST    R KF PRA PCVF+G+P G KGY++ D+   +  +SR+V F E +FPF  + +
Subjt:  AAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKE

Query:  KDILISHDFLEQFVIP-CPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKN
         +  ++ DF  + V+P  P+       D+  + P   DS       +D +PH ++   T++
Subjt:  KDILISHDFLEQFVIP-CPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKN

XP_012857659.1 PREDICTED: uncharacterized protein LOC105976934 [Erythranthe guttata] (e-value: 2.9e-164; identity: 39.37%)
Query:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP--SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNV
        S +D   +P+ LH S  P   LVS+ L   +NY++W+RAMM++L+ KNK+GFI G I KP   E  LL+AW  NN ++ SWI+N+IS +I AS++Y+ + 
Subjt:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP--SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNV

Query:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRP---MDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLI
         +IW++LK R+ Q+NGP I+QLR++L   TQ   SV +Y+ K+  IW EL  +RP      C+C G  K+ D  + E VM FLMGLN+S +  R QILL+
Subjt:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRP---MDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLI

Query:  DPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSER--------------RFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNS
        DPLPPIN+VF+L+ QEER RS+  + S +    +A + R              +F    S++KD +  C++C   GHT +KCY+LHG+PPG++       
Subjt:  DPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSER--------------RFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNS

Query:  VHQRQDNT---IQNGNDKVTEVSKRNQSA----------FFASLNSDQYTQLLGMLQTHL-NTLQNGENFKN----ETTHIA------------------
           +   T   +   +D V   +  N  +          F  ++ + Q  QLL  + +HL N      + KN    +T+HI+                  
Subjt:  VHQRQDNT---IQNGNDKVTEVSKRNQSA----------FFASLNSDQYTQLLGMLQTHL-NTLQNGENFKN----ETTHIA------------------

Query:  -----------------------------------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTI
                                                             VL +V Y+P+FK+NL+SVS LL    + + F + +  IQD+ L+  I
Subjt:  -----------------------------------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTI

Query:  GKAELTNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEIS-DFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIW
        GK     GLY+L   +   + I+H A   K+SA++WH R+GH    ++  LAK   +S D  +    C++CPLAKQ+RL F   ++++  +FDLIHCDIW
Subjt:  GKAELTNGLYLLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEIS-DFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIW

Query:  GPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHL
        GPFK P+++G  YF T+V D SR+TWV+LL+ KS+++ V+PRFLK++  QF K IKVFRSDNA EL F+ LF + G  HQFSC YTPQQN++VERKHQH+
Subjt:  GPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHL

Query:  LNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKA-DYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDI
        LNVAR+L FQS +P+ +W EC+L+A +LINR P   L++ +P+  L+  K  DY+ +K+FGCL +A+  S ++SKFDPRA  CVF+G+P GIKGY+L D+
Subjt:  LNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKA-DYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDI

Query:  AKRKFFISRDVLFFEELFPFHSIKEKDILISHDF--LEQFVIPCPLFDCL-EKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNP
           K FISRDV+F E ++PF +       IS DF  L   VIP    D + E   S+        SP +   V    P   +S   K P
Subjt:  AKRKFFISRDVLFFEELFPFHSIKEKDILISHDF--LEQFVIPCPLFDCL-EKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNP

TrEMBL top hits (e-value, % identity, alignment)
A0A2N9GZW3 Integrase catalytic domain-containing protein (e-value: 6.0e-168; identity: 39.98%)
Query:  DAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE--GNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEI
        D   + + LHH  +P   LVS  L G +NY +WSR+M++AL+ KNK+GF+ G+I++P +      +AW   N ++ SW++NS+SKEIA+S++Y    KEI
Subjt:  DAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE--GNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEI

Query:  WDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPI
        W++L+ER+ Q NGP I++++K +   +Q + SV  YY ++ ++W EL  +RP+ +C+C   K ++D    E+VM FLMGLN+S+S +RAQIL+ DPLP I
Subjt:  WDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPI

Query:  NRVFSLIIQEERQRSI---GSSPSIESITLMANSE---RRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGND
         + F+L+IQEERQR+I     +P+ +S+ L    E     +  ++S KKD RPICS+CG  GHT DKCYKLHGYPPG++     +S HQ     +++ + 
Subjt:  NRVFSLIIQEERQRSI---GSSPSIESITLMANSE---RRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGND

Query:  KVTE------VSKRNQSAFFASLNSDQ-------------------------------------------------------------------------
          T+      +S  +  A  ASL S Q                                                                         
Subjt:  KVTE------VSKRNQSAFFASLNSDQ-------------------------------------------------------------------------

Query:  ----YTQLLGMLQTHLNTLQNGENFKNETTHIA--------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLL
            +T +   + T+++ L NGE  K   THI         +L DVL +P F +NL+S+S L       + F    C IQD    K IG     NGLY L
Subjt:  ----YTQLLGMLQTHLNTLQNGENFKNETTHIA--------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLL

Query:  RMKNERVN------CIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPT
        +   + V          HTA+       +WH R+GHPS+SR++ L  +I     P+  E C +C ++KQ+RL F    + A+  FDLIHCDIWGP+  PT
Subjt:  RMKNERVN------CIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPT

Query:  HAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARAL
             YF TIV D +R TWV+L++ KS+   +I  F  LI+TQFS  IK+ RSDN PE      +A+ GT HQ SC  TPQQN+ VERKHQHLL VARAL
Subjt:  HAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARAL

Query:  MFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFIS
         FQ+ +PL FWG CVL+A +LINR P  LL N +PF  LFKK  +Y+ ++ FGCL YA+T S NR KF PR++ CV +G+P GIKGYRL D+  ++ F+S
Subjt:  MFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFIS

Query:  RDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFD
        RDVLF+E  FPFH+++      S       V+P P+ D
Subjt:  RDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFD

A0A2N9HKE6 Uncharacterized protein (e-value: 3.2e-161; identity: 42.4%)
Query:  DAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE--GNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEI
        D   + + LHH  +P   LVS  L G +NY +WSR+M++AL+ KNK+GF+ G+I++P +      +AW   N ++ SW++NS+SKEIA+S++Y    KEI
Subjt:  DAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE--GNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEI

Query:  WDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPI
        W++L+ER+ Q NGP I++++K +   +Q + SV  YY ++ ++W EL  +RP+ +C+C   K ++D    E+VM FLMGLN+S+S +RAQIL+ DPLP I
Subjt:  WDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPI

Query:  NRVFSLIIQEERQRSI---GSSPSIESITLMANSE---RRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGND
         + F+L+IQEERQR+I     +P+ +S+ L    E     +  ++S KKD RPICS+CG  GHT DKCYKLHGYPPG++     +S HQ     +++ + 
Subjt:  NRVFSLIIQEERQRSI---GSSPSIESITLMANSE---RRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGND

Query:  KVTE------VSKRNQSAFFASLNSDQY---TQLLGMLQTHLNTLQNGENFKNETTHIAVLKDVLYIPDFKYNLLSVST---------LLKDDKFAISFA
          T+      +S  +  A  ASL S Q+    Q++   Q    T        +  +H      V  +  F     S++T         +L      +   
Subjt:  KVTE------VSKRNQSAFFASLNSDQY---TQLLGMLQTHLNTLQNGENFKNETTHIAVLKDVLYIPDFKYNLLSVST---------LLKDDKFAISFA

Query:  DSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERV------NCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLS
         S  L  D    K IG     NGLY L+   + V      +   HTA+       +WH R+GHPS+SR++ L  +I     P+  E C +C ++KQ+RL 
Subjt:  DSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERV------NCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLS

Query:  FPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQ
        F    + A+  FDLIHCDIWGP+  PT     YF TIV D +R TWV+L++ KS+   +I  F  LI+TQFS  IK+ RSDN PE      +A+ GT HQ
Subjt:  FPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQ

Query:  FSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQ
         SC  TPQQN+ VERKHQHLL VARAL FQ+ +PL FWG CVL+A +LINR P  LL N   F  LFKK  +Y+ ++ FGCL YA+T S NR KF PR++
Subjt:  FSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQ

Query:  PCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIK
         CV +G+P GIKGYRL D+  ++ F+SRDVLF+E  FPFH+++
Subjt:  PCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIK

A0A2Z7AT15 Cysteine-rich RLK (Receptor-like protein kinase) 8 (e-value: 9.6e-166; identity: 40.85%)
Query:  ITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNG
        I  + L+   +P+ LH+   P   LVS PL GS NY++W RAM++AL+ KNK+GFI   I +P SE  L  +W   N ++ SWI+NS+++ IA SL+Y  
Subjt:  ITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKP-SEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNG

Query:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLID
          +EIW +L ER+ +SN P IYQ++K L    QGS+ V  YY K+ T+W EL +Y+P   CTC   ++  ++ N E VM FLMGLN+SY+Q+RAQ+L+I+
Subjt:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLID

Query:  PLPPINRVFSLIIQEERQRSI------------GSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH---RLANSNNS
        PLP I +VF+L+IQEERQRSI            G   ++ S    A S R   + K  + D R ICS+C ++ HT DKCYKLHGYPPGH   +   S  S
Subjt:  PLPPINRVFSLIIQEERQRSI------------GSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGH---RLANSNNS

Query:  VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQN-------------GENFKNETTHI------------------------
         H  Q ++    + +  ++   +      SL   Q  QL+  L + L T QN                  + T+HI                        
Subjt:  VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLNTLQN-------------GENFKNETTHI------------------------

Query:  --------------------------------AVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVN
                                         VL++VLY+P F++NLLSVS+L  +   ++SF   +C IQD   ++ IG  +    LY+L+  +  + 
Subjt:  --------------------------------AVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVN

Query:  CIQHTALMCKV---SASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIV
             + +C     ++ +WH+RMGHPS ++++ L  ++ I +  +   +CH C L+KQRRL     NNI+  IF+L+H D WGPF   +  G  +F TIV
Subjt:  CIQHTALMCKV---SASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIV

Query:  YDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW
         D SRYTWVY+L+ KSD+L + P F +++ TQF   +K  RSDNAPEL F D FAK G TH  SC   PQQNSVVERKHQH+LNVARAL+FQS +PL +W
Subjt:  YDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW

Query:  GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFP
         +C+ ++ YLINRTP  +L++ TPF  L  K   Y+ +K FGCL YAST   +R KF PRA  CVF+G+PPG KGY+L ++   + FISRDV+F E  FP
Subjt:  GECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFP

Query:  FHS
        + +
Subjt:  FHS

A0A438HDI8 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e-value: 2.2e-162; identity: 38.68%)
Query:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGS-NNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLL-SAWKCNNDVIASWIINSISKEIAASLVYNGNV
        S ++   +P+ LH+   P+ +LVS  LAGS +NY SW R+M+ AL+ KNK+GFI G I +P+  +LL S W   N ++ SW+ NS+ KEIA S++Y+   
Subjt:  SDLDAQLNPFMLHHSITPTTNLVSTPLAGS-NNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLL-SAWKCNNDVIASWIINSISKEIAASLVYNGNV

Query:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPL
         EIW++L ER+ Q +GP I++L++ ++  TQGS  V  YY ++ ++W EL E++ +  C C G +  ++    E VM FL+GLNES++ I+AQILL++P 
Subjt:  KEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPL

Query:  PPINRVFSLIIQEERQRSIGSSPSIESIT-----LMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNG
        PP+N+VFSL++QEE QRS+ +S S    T       A S     ++ S+ +  RP+C++C   GHT D+CYK+HGY PG R   +      R +  + N 
Subjt:  PPINRVFSLIIQEERQRSIGSSPSIESIT-----LMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNG

Query:  --NDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTH-----------LNTLQNG-ENF---------------------KNETTHIA-------------
           +++T       SA    L  DQ+ QLL +L  H            N LQ    NF                        T H+              
Subjt:  --NDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTH-----------LNTLQNG-ENF---------------------KNETTHIA-------------

Query:  --------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTAL
                                  VL+ VLYIP F++NL+S+S L + + F+  F    C IQD    K IG       LYLL   +     I    +
Subjt:  --------------------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTAL

Query:  MCKVSAS----MWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRY
        +   +++    +WH R+ HPS  +++ L   +++    N    C ICPLAKQ+RL F   NN++ + FDLIHCDIWGPF  PTH G  YF TIV D +R 
Subjt:  MCKVSAS----MWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRY

Query:  TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLS
        TWV+LL  KSD+  + P+F  +++T+F   IK  RSDNAPELN  +LF +    H FSC  TPQQNSVVERKHQH+LNVARAL FQS +P+ +WG+CVL+
Subjt:  TWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLS

Query:  AAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKE
        + YLINR P  LL+N TPF  L  K   Y+ +K+FGCL Y+ST    R KF PRA PCVF+G+P G KGY++ D+   +  +SR+V F E +FPF  + +
Subjt:  AAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKE

Query:  KDILISHDFLEQFVIP-CPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKN
         +  ++ DF  + V+P  P+       D+  + P   DS       +D +PH ++   T++
Subjt:  KDILISHDFLEQFVIP-CPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKN

A0A5A7VE66 Cysteine-rich RLK (Receptor-like protein kinase) 8 (e-value: 1.2e-179; identity: 48.84%)
Query:  ESSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSI
        +SST+GS    ++  +S  DAQLNP+ +HHS+ PT  +V+ PL G+ NY+SWSRAM++A+SG+NK GFITG I+KPS+G LL AW CNND++ASWI+NS+
Subjt:  ESSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSI

Query:  SKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNES
        SKEIAAS++Y G++KEIWDEL++R+KQSNGP IYQLRK+ VT  QG+L++E YY K+ TIWQ L EYR  ++CTC G K  ID L +E++M FLMGLN+S
Subjt:  SKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNES

Query:  YSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG-SSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVH
        Y+ +RAQILL+ PLP IN VFSL+IQEE+QRS G  +P I+ + L   S    S+D+++KK+ RP CS CG KGH ADKCYK HGYPPG++  NS NS+ 
Subjt:  YSQIRAQILLIDPLPPINRVFSLIIQEERQRSIG-SSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVH

Query:  QRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLN----------------------TLQNGEN--------------------FKN--
           D +  N        +      FF+SLNS+QY+QL+ +L  HL                       T  N ++                    FKN  
Subjt:  QRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQYTQLLGMLQTHLN----------------------TLQNGEN--------------------FKN--

Query:  ETTHIAV-----------------------LKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQH
         T ++ V                       LKDVL++  F YNL+SVS LL     ++ F  + C+IQD      IGKA   NGLY+L  K    NCI  
Subjt:  ETTHIAV-----------------------LKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQH

Query:  TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT
           +  +S   WH+R+GH S   ++ L+  + +S+       CH+CPLAKQ+RLSF   NN+A + FDL+H DIWGPFK P++ G+ YF T+V D  R+T
Subjt:  TALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYT

Query:  WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIF
        WVY+L  KSD+L ++P+F +LIETQFSKVIK FRSDNAPEL   + FA+ GT HQFSC   PQQNSVVERKHQHLLNVARAL F  +   +F
Subjt:  WVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIF

SwissProt top hits (e-value, % identity, alignment)
P04146 Copia protein (e-value: 3.4e-35; identity: 26.17%)
Query:  CSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGN-----DKVTEVSKRNQSAFFAS-------LNSDQ-YTQLLGML-QTHLNTLQNGE
        C +CG +GH    C+         R+ N+ N  +++Q  T  +        +V   S  +   F          +N +  YT  + ++    +   + GE
Subjt:  CSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGN-----DKVTEVSKRNQSAFFAS-------LNSDQ-YTQLLGML-QTHLNTLQNGE

Query:  NF--------KNETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMCKVSASM
                  +    H   L+DVL+  +   NL+SV   L++   +I F  S   I    L+  +  + + N + ++  +   +N  +H     K +  +
Subjt:  NF--------KNETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMCKVSASM

Query:  WHKRMGHPSISRINELAKMIEISDFP-------NCKEVCHICPLAKQRRLSFPIL---NNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTW
        WH+R GH S  ++ E+ +    SD         +C E+C  C   KQ RL F  L    +I   +F ++H D+ GP    T    +YF   V   + Y  
Subjt:  WHKRMGHPSISRINELAKMIEISDFP-------NCKEVCHICPLAKQRRLSFPIL---NNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTW

Query:  VYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPEL---NFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVL
         YL+++KSD+  +   F+   E  F+  +     DN  E      R    K G ++  +  +TPQ N V ER  + +   AR ++  +K+   FWGE VL
Subjt:  VYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPEL---NFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVL

Query:  SAAYLINRTPMVLL--SNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFE
        +A YLINR P   L  S+ TP+     KK     ++ FG   Y    +  + KFD ++   +F+G+ P   G++L+D    KF ++RDV+  E
Subjt:  SAAYLINRTPMVLL--SNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFE

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e-value: 1.1e-57; identity: 26.37%)
Query:  GSNNYSSWSRAM--MLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVT
        G N +S+W R M  +L   G +KV  +    KKP +      W   ++  AS I   +S ++  +++     + IW  L+  Y      +   L+K L  
Subjt:  GSNNYSSWSRAM--MLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSISKEIAASLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVT

Query:  TTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIGSSPSIESI
             +S    +     ++  L+          E   K I  LN+         L  SY  +   IL       +  V S ++  E+ R    +     I
Subjt:  TTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPPINRVFSLIIQEERQRSIGSSPSIESI

Query:  TL-MANSERRFSSD----------KSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNT---IQNGNDKVTEVSKRNQSAFFASL
        T     S +R S++          K++ K     C NC   GH    C       P  R      S  +  DNT   +QN ++ V  +++  +    +  
Subjt:  TL-MANSERRFSSD----------KSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNT---IQNGNDKVTEVSKRNQSAFFASL

Query:  NSDQYTQLLG-------------MLQTHLNTLQNGENFKNETTHIA------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKW
         S+                     +     T++ G    ++   I             VLKDV ++PD + NL+S    L  D +   FA+       KW
Subjt:  NSDQYTQLLG-------------MLQTHLNTLQNGENFKNETTHIA------------VLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKW

Query:  LLKTIGKAELTNGLY--LLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCK-EVCHICPLAKQRRLSFPILNNIAENIFD
         L T G   +  G+    L   N  +   +  A   ++S  +WHKRMGH S   +  LAK   IS       + C  C   KQ R+SF   +    NI D
Subjt:  LLKTIGKAELTNGLY--LLRMKNERVNCIQHTALMCKVSASMWHKRMGHPSISRINELAKMIEISDFPNCK-EVCHICPLAKQRRLSFPILNNIAENIFD

Query:  LIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPEL---NFRDLFAKTGTTHQFSCAYTPQQN
        L++ D+ GP +  +  G+ YF T + D SR  WVY+L+ K  + QV  +F  L+E +  + +K  RSDN  E     F +  +  G  H+ +   TPQ N
Subjt:  LIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPEL---NFRDLFAKTGTTHQFSCAYTPQQN

Query:  SVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPG
         V ER ++ ++   R+++  +K+P  FWGE V +A YLINR+P V L+   P      K+  Y+ +K FGC A+A  P   R+K D ++ PC+F+G+   
Subjt:  SVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPG

Query:  IKGYRLYDIAKRKFFISRDVLFFE-ELFPFHSIKEKDILISHDFLEQFV-IPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNPPDH
          GYRL+D  K+K   SRDV+F E E+     + EK   + +  +  FV IP    +    E + D    +E   +    ++         EE ++P   
Subjt:  IKGYRLYDIAKRKFFISRDVLFFE-ELFPFHSIKEKDILISHDFLEQFV-IPCPLFDCLEKEDSIDARPTTEDSPEDSHGVDDQNPHISNSEETKNPPDH

Query:  TTHH
           H
Subjt:  TTHH

Q07791 Transposon Ty2-DR3 Gag-Pol polyprotein | e value: 2.9e-10 | % identity: 24.34
Query:  NFKNET-THIAVLKDVLYIPDFKYNLLSVSTLLKDD------KFAISFADSNCLIQDKWLLKTIGKAELTNGLYLL--RMKNERVNCIQHTALMCKVSAS
        NF+N T T I      L+ P+  Y+LLS+S L   +      +  +  +D   L      +   G     +  YL+   +    +N +  +  + K    
Subjt:  NFKNET-THIAVLKDVLYIPDFKYNLLSVSTLLKDD------KFAISFADSNCLIQDKWLLKTIGKAELTNGLYLL--RMKNERVNCIQHTALMCKVSAS

Query:  MWHKRMGHPSISRINELAK-----MIEISDFPNCKEVCHICP------LAKQRRLSFPILN-NIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSR
        + H+ +GH +   I +  K      ++ SD        + CP        K R +    L    +   F  +H DI+GP      +  SYF +   +K+R
Subjt:  MWHKRMGHPSISRINELAK-----MIEISDFPNCKEVCHICP------LAKQRRLSFPILN-NIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSR

Query:  YTWVYLLEHKSD--ILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDL---FAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW
        + WVY L  + +  IL V    L  I+ QF+  + V + D   E   + L   F   G T  ++     + + V ER ++ LLN  R L+  S +P   W
Subjt:  YTWVYLLEHKSD--ILQVIPRFLKLIETQFSKVIKVFRSDNAPELNFRDL---FAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFW

Query:  GECV
           V
Subjt:  GECV

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 | e value: 1.1e-52 | % identity: 23.85
Query:  AQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSA----------WKCNNDVIASWIINSISKEIAASLVY
        A     +L+++     N+ +     S NY  WSR +     G    GF+ G    P       A          WK  + +I S ++ +IS  +  ++  
Subjt:  AQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSA----------WKCNNDVIASWIINSISKEIAASLVY

Query:  NGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEY-RPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQIL
             +IW+ L++ Y   +  H+ QLR  L   T+G+ +++ Y   + T + +L    +PMD              + E V   L  L E Y  +  QI 
Subjt:  NGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEY-RPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQIL

Query:  LIDPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSER----------------------------------RFSSDKSKKKDTRPICSNCGYKGHT
          D  P +  +   ++  E +    SS ++  IT  A S R                                   F  + ++ K     C  CG +GH+
Subjt:  LIDPLPPINRVFSLIIQEERQRSIGSSPSIESITLMANSER----------------------------------RFSSDKSKKKDTRPICSNCGYKGHT

Query:  ADKCYKLHGYPPGHRLANSNNS------------VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQ-YTQLLGMLQTHLNTL---QNGENFKNETTH
        A +C +L      H L++ N+              +    +   + N  +   +  + ++ F +L+  Q YT    ++    +T+     G    +  + 
Subjt:  ADKCYKLHGYPPGHRLANSNNS------------VHQRQDNTIQNGNDKVTEVSKRNQSAFFASLNSDQ-YTQLLGMLQTHLNTL---QNGENFKNETTH

Query:  IAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNE------RVNCIQHTALMC----KVSASMWHKRMG
           L ++LY+P+   NL+SV  L   +  ++ F  ++  ++D           L  G+ LL+ K +       +   Q  +L      K + S WH R+G
Subjt:  IAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNE------RVNCIQHTALMC----KVSASMWHKRMG

Query:  HPSISRINELAKMIEISDF-PNCKEV-CHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVI
        HP+ S +N +     +S   P+ K + C  C + K  ++ F      +    + I+ D+W      +H  + Y+   V   +RYTW+Y L+ KS + +  
Subjt:  HPSISRINELAKMIEISDF-PNCKEV-CHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVI

Query:  PRFLKLIETQFSKVIKVFRSDNAPE-LNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSN
          F  L+E +F   I  F SDN  E +   + F++ G +H  S  +TP+ N + ERKH+H++     L+  + +P  +W      A YLINR P  LL  
Subjt:  PRFLKLIETQFSKVIKVFRSDNAPE-LNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSN

Query:  NTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPF
         +PF  LF    +Y+ ++ FGC  Y      N+ K D +++ CVF+G+      Y    +   + +ISR V F E  FPF
Subjt:  NTPFAALFKKKADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPF

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | e value: 4.3e-46 | % identity: 23.65
Query:  LNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSA----------WKCNNDVIASWIINSISKEIAASLVYNG
        +N  +L+ +++  T L ST      NY  WSR +     G    GF+ G    P       A          W+  + +I S I+ +IS  +  ++    
Subjt:  LNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSA----------WKCNNDVIASWIINSISKEIAASLVYNG

Query:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMD---------ECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQ
           +IW+ L++ Y   +  H+ QLR                   IT   Q  +  +PMD         E   +  K +ID + A+     L  ++E    
Subjt:  NVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMD---------ECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQ

Query:  IRAQILLIDPLPPINRVFSLIIQ-----EERQRSIGSSPSIESITLMANSERRFSS-DKSKKKDTRPI---CSNCGYKGHTADKCYKLHGYPPGHRLANS
          +++L ++    +    +++          Q + G + +  +    +NS +  SS  +S  +  +P    C  C  +GH+A +C +LH +        S
Subjt:  IRAQILLIDPLPPINRVFSLIIQ-----EERQRSIGSSPSIESITLMANSERRFSS-DKSKKKDTRPI---CSNCGYKGHTADKCYKLHGYPPGHRLANS

Query:  NNSVHQRQD------NTIQNGNDKVTEVSKRNQ-SAFFASLNSDQ-YTQLLGMLQTHLNTL---QNGENFKNETTHIAVLKDVLYIPDFKYNLLSVSTLL
         +     Q       N+  N N+ + +    +  ++ F +L+  Q YT    ++    +T+     G      ++    L  VLY+P+   NL+SV  L 
Subjt:  NNSVHQRQD------NTIQNGNDKVTEVSKRNQ-SAFFASLNSDQ-YTQLLGMLQTHLNTL---QNGENFKNETTHIAVLKDVLYIPDFKYNLLSVSTLL

Query:  KDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMC-KVSASMWHKRMGHPSISRINELAKMIEISDF-PNCKEV-CHICPL
          ++ ++ F  ++  ++D      + + +  + LY   + + +   +   A  C K + S WH R+GHPS++ +N +     +    P+ K + C  C +
Subjt:  KDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALMC-KVSASMWHKRMGHPSISRINELAKMIEISDF-PNCKEV-CHICPL

Query:  AKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPE-LNFRDLF
         K  ++ F      +    + I+ D+W      +   + Y+   V   +RYTW+Y L+ KS +      F  L+E +F   I    SDN  E +  RD  
Subjt:  AKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQVIPRFLKLIETQFSKVIKVFRSDNAPE-LNFRDLF

Query:  AKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNR
        ++ G +H  S  +TP+ N + ERKH+H++ +   L+  + VP  +W      A YLINR P  LL   +PF  LF +  +Y  +K FGC  Y      NR
Subjt:  AKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKKKADYNIIKTFGCLAYASTPSVNR

Query:  SKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIK------------EKDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTE
         K + +++ C FMG+      Y    I   + + SR V F E  FPF +                    SH  L    +  P   CL        RP + 
Subjt:  SKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIK------------EKDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTTE

Query:  DSPEDSHGVDDQN-PHISNSEETKNPPDHTTHH
         SP  +  V   N P  S S  + + P   +H+
Subjt:  DSPEDSHGVDDQN-PHISNSEETKNPPDHTTHH

Arabidopsis top hits | e value | % identity | Alignment
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). | e value: 5.8e-22 | % identity: 29.65
Query:  SVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE-GNLLSAWKCNNDVIASWIINSISKEIAASLV
        SVS TS        P  +HH   P+   +       +NY +W       L    K GFI G + KP     L   W+  N ++  W++NS++ ++  S++
Subjt:  SVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSE-GNLLSAWKCNNDVIASWIINSISKEIAASLV

Query:  YNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEG-----SKKMIDFLNAEFVMIFLMG--LNESYS
        Y     ++W++L+  +       IYQLR+ L T  QG  SVE Y+ K++ +W EL EY P+ EC C G     +K+  +    E    FLMG  LN+ + 
Subjt:  YNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEG-----SKKMIDFLNAEFVMIFLMG--LNESYS

Query:  QIRAQILLIDPLPPINRVFSLIIQEE
         +  +I+   P P ++  F+++   E
Subjt:  QIRAQILLIDPLPPINRVFSLIIQEE


Sequences
CDS sequence
ATGACGACACCTGAGAGTTCGACGTCCGGCAGCTCCAACTACTCAGTTTCGATTACCTCTTCTGATCTTGACGCCCAACTAAATCCCTTCATGCTTCATCATTCCATCAC
TCCAACCACCAACCTTGTTTCTACACCATTGGCAGGATCGAACAACTACTCATCATGGAGTAGAGCAATGATGCTGGCCTTATCTGGAAAGAACAAAGTTGGGTTCATCA
CTGGCCTGATCAAGAAACCTTCAGAAGGTAATCTATTATCCGCTTGGAAATGCAACAATGATGTGATAGCTTCTTGGATTATCAACTCTATTTCAAAAGAGATAGCAGCA
AGCCTCGTTTATAATGGAAACGTAAAGGAAATATGGGATGAATTGAAAGAAAGGTACAAACAGTCCAATGGACCTCACATATACCAGTTACGAAAGGACCTAGTAACCAC
TACACAAGGAAGTTTATCAGTTGAAATATACTATGCAAAAATCACCACTATATGGCAGGAACTTGTGGAATATCGTCCTATGGATGAATGCACTTGTGAAGGATCAAAGA
AAATGATCGATTTCTTGAATGCAGAATTCGTAATGATCTTTCTCATGGGATTAAATGAATCCTATTCGCAAATTAGAGCCCAAATTTTGTTGATTGATCCTTTACCTCCC
ATAAATAGAGTCTTTTCTCTAATCATCCAAGAAGAAAGACAAAGATCCATTGGATCTTCACCCTCCATTGAGAGCATCACATTGATGGCTAACTCTGAAAGAAGATTTTC
TTCTGATAAATCCAAGAAGAAAGATACAAGACCTATATGCTCCAACTGTGGCTATAAAGGACACACTGCTGACAAATGCTACAAGTTACATGGCTACCCACCCGGACATA
GACTTGCCAACAGCAATAATTCTGTCCATCAAAGACAGGACAACACAATCCAAAATGGAAATGACAAAGTGACAGAAGTTTCTAAGAGGAATCAATCTGCCTTTTTTGCT
AGTCTCAACAGTGATCAATATACACAACTTCTGGGCATGCTTCAAACTCATCTCAACACACTTCAAAATGGTGAGAATTTCAAAAATGAGACTACACACATAGCAGTCCT
GAAAGATGTACTTTATATTCCTGACTTCAAGTATAACCTGCTGTCAGTAAGTACTCTCCTCAAGGATGACAAATTTGCCATATCATTTGCTGATTCTAATTGTCTTATTC
AGGACAAGTGGCTTTTGAAAACGATTGGGAAGGCTGAATTAACTAATGGGCTCTACCTACTCAGAATGAAAAATGAAAGAGTTAATTGCATTCAGCACACTGCACTAATG
TGTAAAGTCTCGGCCTCTATGTGGCATAAACGGATGGGACATCCCTCTATCAGCAGAATAAATGAGCTAGCTAAGATGATAGAAATTTCTGACTTTCCAAATTGTAAAGA
AGTCTGCCATATTTGTCCCTTAGCTAAACAAAGACGTCTCTCTTTTCCTATATTGAATAACATTGCTGAAAATATATTTGATCTTATACATTGTGACATATGGGGTCCTT
TCAAAACCCCAACACATGCTGGTCATTCATATTTTGCCACCATTGTATATGATAAATCTAGATACACTTGGGTATATCTTTTGGAACATAAGAGTGATATCCTACAAGTT
ATTCCTAGATTTTTGAAGTTAATTGAAACCCAATTTTCAAAAGTCATCAAGGTCTTTCGATCTGACAATGCTCCTGAGTTGAATTTCAGGGATCTTTTTGCCAAAACTGG
AACAACTCATCAATTCTCGTGTGCTTACACTCCTCAACAAAATTCAGTAGTGGAAAGAAAACACCAACACCTTCTTAACGTAGCAAGAGCATTGATGTTCCAATCAAAGG
TTCCTCTTATCTTTTGGGGAGAATGTGTTCTGAGTGCTGCATACTTGATCAACAGAACGCCTATGGTATTACTATCAAATAACACTCCCTTTGCTGCTCTATTCAAGAAG
AAAGCAGATTACAACATCATTAAGACCTTCGGGTGTCTTGCCTATGCCTCTACCCCCTCAGTAAACAGATCTAAGTTTGATCCTAGAGCACAACCTTGTGTTTTTATGGG
GTTCCCACCAGGCATCAAAGGATACAGATTATATGACATAGCCAAGAGAAAGTTCTTTATATCTAGGGATGTCCTATTCTTTGAAGAACTATTTCCCTTTCATTCTATCA
AAGAAAAGGACATTCTGATCTCCCATGACTTCCTTGAGCAATTCGTCATACCATGCCCTCTATTTGATTGCCTAGAAAAGGAAGATAGTATTGATGCAAGACCTACGACA
GAGGATAGCCCTGAAGACAGCCACGGTGTTGATGATCAAAATCCACATATCAGTAACTCAGAAGAAACCAAAAATCCTCCCGACCACACCACCCACCATCTTACCTAA
mRNA sequence
ATGACGACACCTGAGAGTTCGACGTCCGGCAGCTCCAACTACTCAGTTTCGATTACCTCTTCTGATCTTGACGCCCAACTAAATCCCTTCATGCTTCATCATTCCATCAC
TCCAACCACCAACCTTGTTTCTACACCATTGGCAGGATCGAACAACTACTCATCATGGAGTAGAGCAATGATGCTGGCCTTATCTGGAAAGAACAAAGTTGGGTTCATCA
CTGGCCTGATCAAGAAACCTTCAGAAGGTAATCTATTATCCGCTTGGAAATGCAACAATGATGTGATAGCTTCTTGGATTATCAACTCTATTTCAAAAGAGATAGCAGCA
AGCCTCGTTTATAATGGAAACGTAAAGGAAATATGGGATGAATTGAAAGAAAGGTACAAACAGTCCAATGGACCTCACATATACCAGTTACGAAAGGACCTAGTAACCAC
TACACAAGGAAGTTTATCAGTTGAAATATACTATGCAAAAATCACCACTATATGGCAGGAACTTGTGGAATATCGTCCTATGGATGAATGCACTTGTGAAGGATCAAAGA
AAATGATCGATTTCTTGAATGCAGAATTCGTAATGATCTTTCTCATGGGATTAAATGAATCCTATTCGCAAATTAGAGCCCAAATTTTGTTGATTGATCCTTTACCTCCC
ATAAATAGAGTCTTTTCTCTAATCATCCAAGAAGAAAGACAAAGATCCATTGGATCTTCACCCTCCATTGAGAGCATCACATTGATGGCTAACTCTGAAAGAAGATTTTC
TTCTGATAAATCCAAGAAGAAAGATACAAGACCTATATGCTCCAACTGTGGCTATAAAGGACACACTGCTGACAAATGCTACAAGTTACATGGCTACCCACCCGGACATA
GACTTGCCAACAGCAATAATTCTGTCCATCAAAGACAGGACAACACAATCCAAAATGGAAATGACAAAGTGACAGAAGTTTCTAAGAGGAATCAATCTGCCTTTTTTGCT
AGTCTCAACAGTGATCAATATACACAACTTCTGGGCATGCTTCAAACTCATCTCAACACACTTCAAAATGGTGAGAATTTCAAAAATGAGACTACACACATAGCAGTCCT
GAAAGATGTACTTTATATTCCTGACTTCAAGTATAACCTGCTGTCAGTAAGTACTCTCCTCAAGGATGACAAATTTGCCATATCATTTGCTGATTCTAATTGTCTTATTC
AGGACAAGTGGCTTTTGAAAACGATTGGGAAGGCTGAATTAACTAATGGGCTCTACCTACTCAGAATGAAAAATGAAAGAGTTAATTGCATTCAGCACACTGCACTAATG
TGTAAAGTCTCGGCCTCTATGTGGCATAAACGGATGGGACATCCCTCTATCAGCAGAATAAATGAGCTAGCTAAGATGATAGAAATTTCTGACTTTCCAAATTGTAAAGA
AGTCTGCCATATTTGTCCCTTAGCTAAACAAAGACGTCTCTCTTTTCCTATATTGAATAACATTGCTGAAAATATATTTGATCTTATACATTGTGACATATGGGGTCCTT
TCAAAACCCCAACACATGCTGGTCATTCATATTTTGCCACCATTGTATATGATAAATCTAGATACACTTGGGTATATCTTTTGGAACATAAGAGTGATATCCTACAAGTT
ATTCCTAGATTTTTGAAGTTAATTGAAACCCAATTTTCAAAAGTCATCAAGGTCTTTCGATCTGACAATGCTCCTGAGTTGAATTTCAGGGATCTTTTTGCCAAAACTGG
AACAACTCATCAATTCTCGTGTGCTTACACTCCTCAACAAAATTCAGTAGTGGAAAGAAAACACCAACACCTTCTTAACGTAGCAAGAGCATTGATGTTCCAATCAAAGG
TTCCTCTTATCTTTTGGGGAGAATGTGTTCTGAGTGCTGCATACTTGATCAACAGAACGCCTATGGTATTACTATCAAATAACACTCCCTTTGCTGCTCTATTCAAGAAG
AAAGCAGATTACAACATCATTAAGACCTTCGGGTGTCTTGCCTATGCCTCTACCCCCTCAGTAAACAGATCTAAGTTTGATCCTAGAGCACAACCTTGTGTTTTTATGGG
GTTCCCACCAGGCATCAAAGGATACAGATTATATGACATAGCCAAGAGAAAGTTCTTTATATCTAGGGATGTCCTATTCTTTGAAGAACTATTTCCCTTTCATTCTATCA
AAGAAAAGGACATTCTGATCTCCCATGACTTCCTTGAGCAATTCGTCATACCATGCCCTCTATTTGATTGCCTAGAAAAGGAAGATAGTATTGATGCAAGACCTACGACA
GAGGATAGCCCTGAAGACAGCCACGGTGTTGATGATCAAAATCCACATATCAGTAACTCAGAAGAAACCAAAAATCCTCCCGACCACACCACCCACCATCTTACCTAA
Protein sequence
MTTPESSTSGSSNYSVSITSSDLDAQLNPFMLHHSITPTTNLVSTPLAGSNNYSSWSRAMMLALSGKNKVGFITGLIKKPSEGNLLSAWKCNNDVIASWIINSISKEIAA
SLVYNGNVKEIWDELKERYKQSNGPHIYQLRKDLVTTTQGSLSVEIYYAKITTIWQELVEYRPMDECTCEGSKKMIDFLNAEFVMIFLMGLNESYSQIRAQILLIDPLPP
INRVFSLIIQEERQRSIGSSPSIESITLMANSERRFSSDKSKKKDTRPICSNCGYKGHTADKCYKLHGYPPGHRLANSNNSVHQRQDNTIQNGNDKVTEVSKRNQSAFFA
SLNSDQYTQLLGMLQTHLNTLQNGENFKNETTHIAVLKDVLYIPDFKYNLLSVSTLLKDDKFAISFADSNCLIQDKWLLKTIGKAELTNGLYLLRMKNERVNCIQHTALM
CKVSASMWHKRMGHPSISRINELAKMIEISDFPNCKEVCHICPLAKQRRLSFPILNNIAENIFDLIHCDIWGPFKTPTHAGHSYFATIVYDKSRYTWVYLLEHKSDILQV
IPRFLKLIETQFSKVIKVFRSDNAPELNFRDLFAKTGTTHQFSCAYTPQQNSVVERKHQHLLNVARALMFQSKVPLIFWGECVLSAAYLINRTPMVLLSNNTPFAALFKK
KADYNIIKTFGCLAYASTPSVNRSKFDPRAQPCVFMGFPPGIKGYRLYDIAKRKFFISRDVLFFEELFPFHSIKEKDILISHDFLEQFVIPCPLFDCLEKEDSIDARPTT
EDSPEDSHGVDDQNPHISNSEETKNPPDHTTHHLT