CuGenDBv2

Lag0031619 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0031619
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Integrase catalytic domain-containing protein
Genome location: chr11:10953732..10957924
RNA-Seq Expression: Lag0031619
Synteny: Lag0031619
Gene Ontology terms:
  GO:0015074 - DNA integration (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0004523 - RNA-DNA hybrid ribonuclease activity (molecular function)
  GO:0016740 - transferase activity (molecular function)
InterPro domains:
  IPR001584 - Integrase, catalytic core
  IPR002156 - Ribonuclease H domain
  IPR012337 - Ribonuclease H-like superfamily
  IPR025724 - GAG-pre-integrase domain
  IPR026960 - Reverse transcriptase zinc-binding domain
  IPR036397 - Ribonuclease H superfamily
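
The genome location string in this record can be turned into a locus span with a few lines of Python. This is a minimal sketch: it assumes the coordinates are 1-based and inclusive, which the page does not state, and `parse_location` and `span_length` are hypothetical helpers, not part of CuGenDB.

```python
import re

def parse_location(location):
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)

def span_length(start, end):
    # Assumes 1-based inclusive coordinates (an assumption, not stated by the source).
    return end - start + 1

chrom, start, end = parse_location("chr11:10953732..10957924")
print(chrom, span_length(start, end))
```

Under that assumption the locus spans 4,193 bp on chr11.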


Homology
GenBank top hits (e-value, % identity, alignment)
KAB2595380.1 hypothetical protein D8674_030830 [Pyrus ussuriensis x Pyrus communis]; e-value 8.4e-58; identity 43.25%
Query:  QLSQSISPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFW
        Q   SI P  A   + +L  + +WH R GHPS      +L  + IT         C+SCL+GK T+LPF+  T  S+ P  ++HSD+W P+P  S+ GF 
Subjt:  QLSQSISPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFW

Query:  YYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMY
        YY+  +D+ ++F WIFPL+ KSD       F  F+  Q S ++K  +SDGGGEY+N  VH FL  +G+LH +SCP+ P+QNG+AE+KHR ++   +TL+ 
Subjt:  YYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMY

Query:  HAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        HA++P EFW     TA++L+NR+P++ +   +PF+ LYG  P +SHL+IFGC
Subjt:  HAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

PKA60107.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Apostasia shenzhenica]; e-value 1.7e-58; identity 46.34%
Query:  SPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFV
        SPT     +    + +LWH RLGHP L  +H+++    +      T R C +C+K K+ KL F+ ST  ST PL L++SD+W P+P +S  GF YY  FV
Subjt:  SPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFV

Query:  DDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPL
        DDFSKFTW+FP+  KSD+  I   F   +E   +C ++ F SD GGEY  +++H  L+S G+ H+ SCPH PEQNG AE+KHR I+   L  + HA  P 
Subjt:  DDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPL

Query:  EFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        +FW  A  TA++L+NRLP+  +  +TP QKLYG  PD + LRIFGC
Subjt:  EFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

TQD88914.1 hypothetical protein C1H46_025506 [Malus baccata]; e-value 6.8e-60; identity 45.49%
Query:  TCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDD
        T  +VF+ +  ++ LWH RLGHP+ +VL   L  S+IT       + C +CL+GK TKLPF +  S S  PL ++H+DVW PSP+ S+ G+ YY+ F+D+
Subjt:  TCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDD

Query:  FSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEF
         +++TWIFP+  K+ V  I  QF  +I+N  S ++K  +SDGGGEY++     FL++KG++HQ+SCP+ PEQNG+AE+K+R +V  A+TL+  A +  +F
Subjt:  FSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEF

Query:  WYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        W+HA  T+ +L+NRLP+S +   +PF+ LY   P L HLR+FGC
Subjt:  WYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

TQD95848.1 hypothetical protein C1H46_018590 [Malus baccata]; e-value 3.7e-58; identity 45.64%
Query:  SVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSK
        + F+ +  +  +WH RLGHPS  ++ ++L  S ++         C+ CL+GK +KLPF+ S   S  P  ++HSDVW P+P VS+ GF YY+ F++  +K
Subjt:  SVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSK

Query:  FTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYH
        F WIFP+  KSDVSS    F  FI NQ   S+K   SDGGGEYI ++   FLS KG+ HQ SC H PEQNG+AE+KHR I+  ++TL+  A +P  FW  
Subjt:  FTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYH

Query:  AFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        A  T+++L+NR+PS A+G  +PFQ +Y + P + HL++FGC
Subjt:  AFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

TQE09310.1 hypothetical protein C1H46_005046 [Malus baccata]; e-value 8.9e-60; identity 40.85%
Query:  DEFCTRVRIRMDYIPYPHMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLP
        DEF   ++   D +    ++  L N+ A   L + +S SP   + ++ +  S+ LWH RLGHP+  VL   L  + I+     +   C++CL+GK T LP
Subjt:  DEFCTRVRIRMDYIPYPHMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLP

Query:  FAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGV
        F    S S  P  ++H+DVW PSPSVS+  + YY+ F+D+ +++TWIFP++ K+ V  +  QF  F+ N    S++  +SDGGGEY+      FL +KG+
Subjt:  FAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGV

Query:  LHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        LH +SCP+ P+QNG+AE+K+R I   A+TL+  A +P +FWYHA  TA++L+NR+P+  +   +PF+KLY   P L HL+IFGC
Subjt:  LHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
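
In the alignments above, the unlabelled middle line follows the usual BLAST convention: a letter marks an identical residue, `+` a conservative substitution, and a space a mismatch or gap. A rough sketch of how a percent identity could be recomputed from such a midline follows. `midline_identity` is a hypothetical helper, the input is a toy string rather than one of the hits above, and the function assumes the column-aligning indentation has been stripped while trailing spaces are preserved.

```python
def midline_identity(midline):
    """Percent identity from a BLAST-style midline.

    Letters mark identical residues; '+' marks a conservative
    substitution and a space marks a mismatch or gap.
    """
    if not midline:
        raise ValueError("empty midline")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / len(midline)

# Toy midline (hypothetical, not taken from the hits above):
# 6 identical residues, 1 conservative substitution, 1 mismatch.
print(midline_identity("LWH+LG P"))
```

For a real hit, the midlines of all alignment blocks would need to be concatenated before calling the function, since the reported identity covers the whole alignment.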

TrEMBL top hits (e-value, % identity, alignment)
A0A2N9ER29 Integrase catalytic domain-containing protein; e-value 2.5e-60; identity 44.65%
Query:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA
        H   P++ D   +    S ++S T          S  LWH RLGHP      +VLH  L V+S+++   F    C  C++GK  +LPF  S S++TRPL 
Subjt:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA

Query:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN
        L+H+DVW P+P  S +G  YY+ F+DD+++FTW FPL YKS V    + F   +EN L C +K  RSD GGEY       F SS G+LHQ SCPH  +QN
Subjt:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN

Query:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        G+AE+KHR IV +ALTL+  + +PL FW +AF+TA+FL+NRLPS +   ++P++ L+G  PD    R+FGC
Subjt:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

A0A2N9EUP4 Integrase catalytic domain-containing protein; e-value 2.5e-60; identity 44.65%
Query:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA
        H   P++ D   +    S ++S T          S  LWH RLGHP      +VLH  L V+S+++   F    C  C++GK  +LPF  S S++TRPL 
Subjt:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA

Query:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN
        L+H+DVW P+P  S +G  YY+ F+DD+++FTW FPL YKS V    + F   +EN L C +K  RSD GGEY       F SS G+LHQ SCPH  +QN
Subjt:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN

Query:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        G+AE+KHR IV +ALTL+  + +PL FW +AF+TA+FL+NRLPS +   ++P++ L+G  PD    R+FGC
Subjt:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

A0A2N9F5A6 Integrase catalytic domain-containing protein; e-value 2.5e-60; identity 44.65%
Query:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA
        H   P++ D   +    S ++S T          S  LWH RLGHP      +VLH  L V+S+++   F    C  C++GK  +LPF  S S++TRPL 
Subjt:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA

Query:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN
        L+H+DVW P+P  S +G  YY+ F+DD+++FTW FPL YKS V    + F   +EN L C +K  RSD GGEY       F SS G+LHQ SCPH  +QN
Subjt:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN

Query:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        G+AE+KHR IV +ALTL+  + +PL FW +AF+TA+FL+NRLPS +   ++P++ L+G  PD    R+FGC
Subjt:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

A0A2N9G1Z9 Integrase catalytic domain-containing protein; e-value 2.5e-60; identity 44.65%
Query:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA
        H   P++ D   +    S ++S T          S  LWH RLGHP      +VLH  L V+S+++   F    C  C++GK  +LPF  S S++TRPL 
Subjt:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA

Query:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN
        L+H+DVW P+P  S +G  YY+ F+DD+++FTW FPL YKS V    + F   +EN L C +K  RSD GGEY       F SS G+LHQ SCPH  +QN
Subjt:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN

Query:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        G+AE+KHR IV +ALTL+  + +PL FW +AF+TA+FL+NRLPS +   ++P++ L+G  PD    R+FGC
Subjt:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

A0A2N9HMR4 Integrase catalytic domain-containing protein; e-value 2.5e-60; identity 44.65%
Query:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA
        H   P++ D   +    S ++S T          S  LWH RLGHP      +VLH  L V+S+++   F    C  C++GK  +LPF  S S++TRPL 
Subjt:  HMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPS----LNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLA

Query:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN
        L+H+DVW P+P  S +G  YY+ F+DD+++FTW FPL YKS V    + F   +EN L C +K  RSD GGEY       F SS G+LHQ SCPH  +QN
Subjt:  LLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQN

Query:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        G+AE+KHR IV +ALTL+  + +PL FW +AF+TA+FL+NRLPS +   ++P++ L+G  PD    R+FGC
Subjt:  GVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

SwissProt top hits (e-value, % identity, alignment)
P04146 Copia protein; e-value 1.9e-33; identity 35.56%
Query:  LWHLRLGHPS----LNVLHKLLFV-SSITHDTQFTCRDCESCLKGKATKLPFAM---STSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTW
        LWH R GH S    L +  K +F   S+ ++ + +C  CE CL GK  +LPF      T I  RPL ++HSDV  P   V++    Y++ FVD F+ +  
Subjt:  LWHLRLGHPS----LNVLHKLLFV-SSITHDTQFTCRDCESCLKGKATKLPFAM---STSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTW

Query:  IFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFT
         + + YKSDV S+ + FV   E   +  + +   D G EY++  + +F   KG+ +  + PH P+ NGV+E+  R+I   A T++  A +   FW  A  
Subjt:  IFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFT

Query:  TAMFLLNRLPSSAI--GFMTPFQKLYGYAPDLSHLRIFG
        TA +L+NR+PS A+     TP++  +   P L HLR+FG
Subjt:  TAMFLLNRLPSSAI--GFMTPFQKLYGYAPDLSHLRIFG

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94; e-value 6.5e-37; identity 33.99%
Query:  LWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS
        LWH R+GH S   L  L   S I++    T + C+ CL GK  ++ F  S+      L L++SDV  P    S+ G  Y++ F+DD S+  W++ L  K 
Subjt:  LWHLRLGHPSLNVLHKLLFVSSITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS

Query:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR
         V  + ++F   +E +    LK  RSD GGEY +R   E+ SS G+ H+++ P  P+ NGVAE+ +R+IV    +++  A +P  FW  A  TA +L+NR
Subjt:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR

Query:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGCEQSCGFCGKSAQNEVIQAQFPCL
         PS  + F  P +         SHL++FGC ++     K  + ++     PC+
Subjt:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGCEQSCGFCGKSAQNEVIQAQFPCL

Q12491 Transposon Ty2-B Gag-Pol polyprotein; e-value 2.6e-17; identity 26.39%
Query:  LWHLRLGHPSLNVLHKLLFVSSITH----DTQF---TCRDCESCLKGKATKLPFAMSTSI----STRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSK
        L H  LGH +   + K L  +++T+    D ++   +   C  CL GK+TK      + +    S  P   LH+D++ P   +  S   Y+I F D+ ++
Subjt:  LWHLRLGHPSLNVLHKLLFVSSITH----DTQF---TCRDCESCLKGKATKLPFAMSTSI----STRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSK

Query:  FTWIFPLVYKSDVS--SIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFW
        F W++PL  + + S  ++    + FI+NQ +  +   + D G EY N+++H+F +++G+    +       +GVAE+ +R+++    TL++ + +P   W
Subjt:  FTWIFPLVYKSDVS--SIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFW

Query:  YHAFTTAMFLLNRLPS
        + A   +  + N L S
Subjt:  YHAFTTAMFLLNRLPS

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1; e-value 4.3e-49; identity 41.74%
Query:  WHLRLGHPSLNVLHKLLFVSSIT-HDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS
        WH RLGHP+ ++L+ ++   S++  +       C  CL  K+ K+PF+ ST  STRPL  ++SDVW  SP +S   + YY+ FVD F+++TW++PL  KS
Subjt:  WHLRLGHPSLNVLHKLLFVSSIT-HDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS

Query:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR
         V      F   +EN+    +  F SD GGE++  ++ E+ S  G+ H  S PH PE NG++E+KHR IV   LTL+ HA +P  +W +AF  A++L+NR
Subjt:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR

Query:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGC
        LP+  +   +PFQKL+G +P+   LR+FGC
Subjt:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGC

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2; e-value 2.1e-48; identity 36.27%
Query:  WHLRLGHPSLNVLHKLLFVSSI-THDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS
        WH RLGHPSL +L+ ++   S+   +       C  C   K+ K+PF+ ST  S++PL  ++SDVW  SP +S+  + YY+ FVD F+++TW++PL  KS
Subjt:  WHLRLGHPSLNVLHKLLFVSSI-THDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKS

Query:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR
         V      F   +EN+    +    SD GGE++   + ++LS  G+ H  S PH PE NG++E+KHR IV + LTL+ HA VP  +W +AF+ A++L+NR
Subjt:  DVSSIIRQFVPFIENQLSCSLKFFRSDGGGEYINRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNR

Query:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGC-----------------EQSCGFCGKS-AQNEVIQAQFPCLGQKESKEVNYEEK
        LP+  +   +PFQKL+G  P+   L++FGC                  + C F G S  Q+  +    P      S+ V ++E+
Subjt:  LPSSAIGFMTPFQKLYGYAPDLSHLRIFGC-----------------EQSCGFCGKS-AQNEVIQAQFPCLGQKESKEVNYEEK

Arabidopsis top hits (e-value, % identity, alignment)
AT1G10000.1 Ribonuclease H-like superfamily protein; e-value 1.9e-07; identity 25.13%
Query:  PKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKT--LSPNEVVKAS--
        PK+K+ LWKA   ALP    + ++ + +   C  C    E   H+ + C    ++W N AP       +   G   L+  + L KT  L P  +  A+  
Subjt:  PKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKT--LSPNEVVKAS--

Query:  -QIIWAIWFKRNQLKQSNSRITASRIIEDI-EVLVDRMQREDSYQSIP------PENHSSHGCWMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGS
          I W IW  RNQL   NS  +   +IE + + + D +  + +  ++P      P  +S+     ++    +   + AAW   + + G GW+ + +  S
Subjt:  -QIIWAIWFKRNQLKQSNSRITASRIIEDI-EVLVDRMQREDSYQSIP------PENHSSHGCWMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGS

AT1G33710.1 RNA-directed DNA polymerase (reverse transcriptase)-related family protein; e-value 5.0e-08; identity 24.82%
Query:  KSIWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNE
        K++W   + PK   H+W    + LPT   +   G+   T C LC    E+ +H+F  C+    +W   +  L ++       W  L  W       SP  
Subjt:  KSIWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNE

Query:  VVK--ASQIIWAIWFKRNQLKQSNSRITASRIIEDIE
        + K     +++AIW +RN    ++  I  S + + I+
Subjt:  VVK--ASQIIWAIWFKRNQLKQSNSRITASRIIEDIE

AT3G09510.1 Ribonuclease H-like superfamily protein; e-value 5.8e-17; identity 22.49%
Query:  KPIRANDSLKGKRVADILNVDGS---WKTGVIQDSFIPSDADAILSMTKRNMAVNDKIIWGVDKKGLFSVKSAYHLAVNEENQFSASSSESSSSNRIWKS
        +P+   ++ K   + ++    GS   W    I      SD   I  +        DKIIW  +  G ++V+S Y L  ++ +    + +    S  +   
Subjt:  KPIRANDSLKGKRVADILNVDGS---WKTGVIQDSFIPSDADAILSMTKRNMAVNDKIIWGVDKKGLFSVKSAYHLAVNEENQFSASSSESSSSNRIWKS

Query:  IWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCL-DYWDFLY-KTLSPNE
        IW L   PK+K  LW+AL+ AL T E +  +G+  +  C  C  +NE++ H  + C      W     SL +   +  D  + + +  +F+   T+S   
Subjt:  IWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCL-DYWDFLY-KTLSPNE

Query:  VVKASQIIWAIWFKRNQLKQSNSRITASRIIEDIEVLV-DRMQREDSYQSIP-PENHSSHGC--WMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGS
         +    +IW IW  RN +  +  R + S+ +   +    D +    S++  P P    +     W +   +  K N  A +         GWI+R+  G+
Subjt:  VVKASQIIWAIWFKRNQLKQSNSRITASRIIEDIEVLV-DRMQREDSYQSIP-PENHSSHGC--WMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGS

Query:  LIGAGGKKTSKKIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETIDLSEVSVDIDEIVGW
         I  G  K +        E+  ++  L Q  T  + Y      +V +E D   ++ L+N  +   S ++  +++I  W
Subjt:  LIGAGGKKTSKKIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETIDLSEVSVDIDEIVGW

AT3G25270.1 Ribonuclease H-like superfamily protein; e-value 7.6e-17; identity 24.82%
Query:  IWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNEVV
        IW LK+ PK+K  LWK L+ AL T +N+K++ +  +  C  C  ++E  +H+F+ C   +++W     S     +L   G       + L  +   N   
Subjt:  IWLLKSRPKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNEVV

Query:  K----ASQIIWAIWFKRNQLKQSNSRITASRIIEDIEVLVDRMQREDSY-QSIPPENHSSHGCWMSSKESQW--------KINIGAAWFDDAGIGGLGWI
        +    A  I+W +W  RNQL      I+    ++     V   +  ++Y QS+  + HSS     +   ++W        K N   A+         GW+
Subjt:  K----ASQIIWAIWFKRNQLKQSNSRITASRIIEDIEVLVDRMQREDSY-QSIPPENHSSHGCWMSSKESQW--------KINIGAAWFDDAGIGGLGWI

Query:  VRDSEGSLIGAG---GKKTSKKIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETID
        +RD  G  +G+G   G  TS  +E  F   +  ++         Q Y      +V+ E D+ ++ +L+N E ++
Subjt:  VRDSEGSLIGAG---GKKTSKKIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETID

AT4G29090.1 Ribonuclease H-like superfamily protein; e-value 9.9e-25; identity 25.57%
Query:  RVADILNVDG-SWKTGVIQDSFIPSDADAILSMTKRNMAVNDKIIWGVDKKGLFSVKSAYHLAVNEENQFSASSSESSSS-NRIWKSIWLLKSRPKVKIH
        +V+D+++  G  W+  VI+  F   +   I  +      + D   W     G ++VKS Y +     N+ S+    S  S N I++ IW  ++ PK++  
Subjt:  RVADILNVDG-SWKTGVIQDSFIPSDADAILSMTKRNMAVNDKIIWGVDKKGLFSVKSAYHLAVNEENQFSASSSESSSS-NRIWKSIWLLKSRPKVKIH

Query:  LWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNEVVKASQII----WAI
        LWK L+N+LP    +  + +   + C  C    E V H+ ++C   R  W   +  +    +     +  L YW F     +P +  KASQ++    W +
Subjt:  LWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNEVVKASQII----WAI

Query:  WFKRNQL----KQSNSRITASRIIEDIEVLVDRMQREDSYQSIPPENHSSHGCWMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGSLIGAGGKKTSK
        W  RN+L    ++ N++    R  +D+E    R + E S  + P  N SS G W        K N  A W  D    G+GW++R+ +G +   G +   K
Subjt:  WFKRNQL----KQSNSRITASRIIEDIEVLVDRMQREDSYQSIPPENHSSHGCWMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGSLIGAGGKKTSK

Query:  KIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETI
              L+S+   E L  +     +    + + V+ ESD+  ++++LN + I
Subjt:  KIEINFLESLTIVEGLNQISTKFQAYPEIRDHEVVVESDAAEIVKLLNQETI


Sequences
CDS sequence
ATGAGGTTAGGGTGCTGGAAACCTATTAGAGCCAATGACTCTCTCAAAGGCAAAAGGGTGGCTGATATTCTCAATGTAGATGGCTCGTGGAAAACAGGAGTTATTCAAGA
CTCTTTCATTCCTAGTGATGCAGACGCCATTTTAAGCATGACTAAGAGGAACATGGCTGTTAATGACAAGATCATCTGGGGTGTTGACAAAAAGGGTCTATTTTCCGTGA
AAAGTGCTTATCACTTAGCAGTTAATGAAGAGAACCAATTTTCAGCCTCTAGCTCTGAATCCTCATCATCCAACAGAATTTGGAAATCAATTTGGCTCCTCAAGAGTAGG
CCGAAAGTGAAAATTCACTTATGGAAAGCTTTGAACAATGCTCTTCCTACTCTTGAAAACATTAAAAAGAAAGGAGTAGACACTAACACTTGTTGTTTTCTGTGCAGGTG
TAAGAATGAGAATGTGGAGCATATTTTTTGGAGATGCAAAGTGGTAAGGAAAATTTGGGGGAATTTTGCGCCCTCCCTAACTAAGGTTTATGATTTGTGCAGAGATGGGT
GGAAATGTTTGGATTACTGGGATTTTCTATACAAGACGCTCAGTCCTAATGAAGTTGTGAAAGCCAGTCAAATTATATGGGCTATTTGGTTTAAAAGAAATCAGTTGAAG
CAATCTAACAGCAGAATCACAGCAAGCAGAATCATCGAAGATATTGAAGTGTTGGTTGACAGAATGCAGCGGGAAGATTCTTACCAGTCTATTCCGCCGGAGAACCATTC
GAGTCATGGTTGTTGGATGTCGTCGAAGGAGAGTCAGTGGAAGATTAACATCGGCGCAGCCTGGTTTGACGATGCCGGAATTGGAGGTTTAGGGTGGATTGTGCGCGACT
CCGAAGGTTCTCTGATCGGAGCTGGAGGGAAGAAAACATCAAAGAAAATCGAGATCAATTTCTTAGAATCCCTTACTATTGTAGAAGGGCTCAATCAAATTTCAACAAAA
TTTCAAGCGTACCCGGAGATTCGAGATCACGAGGTGGTAGTCGAGTCGGACGCGGCTGAGATAGTGAAACTGCTTAATCAGGAAACGATCGACCTCTCAGAGGTCTCTGT
CGACATCGATGAGATCGTGGGTTGGCATTGCAATGCTTTGATCACCTTGATCAATGCTACTCTCTCGAAGACGGCTTATTCATATGTAATTGGATGTAAATCTTCCAAAG
AGGAACTTCACGCTTTGTTGAAATCTGAGGTTAAAATTCTAGAGCAGCAGAACAAATCACCTTCACTTTCACCTCTTAATCCAACTGCGATATTTTTGTCTGCTCCTCAA
GGACAGTTTGATTCGAATCGAGGTCGTGGAAGAGGGAGAAATTCAGGACCTTATGGATCTGGTCATGGACAATTTAATCAAGGTCGTGGGACAAGGCCACAGGATGAGTT
TTGTACAAGAGTAAGAATAAGGATGGATTATATCCCATATCCTCATATGTTTTCTCCTTTGAAGAATGATTTTGCAACTTCTGGTTTGCAATTATCTCAGTCTATATCTC
CTACATGTGCATCTGTTTTTGTGTCTCGTTTACCTAGTAATATGTTGTGGCATTTGCGTCTTGGACATCCGTCCCTTAATGTTTTGCACAAGTTGTTATTTGTTAGTTCT
ATTACGCATGACACTCAATTTACTTGTCGAGATTGTGAGAGCTGTTTAAAGGGTAAAGCTACTAAACTTCCCTTTGCAATGTCTACTTCCATTTCTACTAGACCTCTTGC
TTTATTGCATAGTGATGTATGGGAACCATCACCATCAGTTTCTGTTTCTGGATTCTGGTATTATATCAAATTTGTTGATGATTTCAGCAAGTTTACATGGATATTTCCTC
TGGTGTATAAGTCAGATGTCTCATCTATCATAAGACAGTTTGTTCCTTTTATTGAAAACCAACTATCTTGTTCTTTGAAATTCTTTCGATCGGATGGAGGGGGTGAGTAC
ATTAATCGTTCAGTTCATGAGTTTCTTTCTTCTAAAGGTGTTCTTCACCAACGGTCTTGCCCTCACAAGCCTGAACAAAATGGTGTTGCTGAACAAAAACATAGGTCTAT
TGTTGCTATTGCTCTTACTCTTATGTACCATGCATTTGTTCCATTAGAATTTTGGTATCATGCCTTCACAACTGCTATGTTTTTGTTGAATCGGTTGCCTTCAAGTGCTA
TTGGTTTCATGACTCCATTTCAGAAATTGTATGGTTATGCACCTGATTTGTCTCATCTTCGTATTTTTGGATGTGAGCAAAGTTGTGGATTTTGTGGAAAAAGTGCCCAA
AATGAAGTCATTCAAGCCCAATTTCCATGCTTGGGTCAAAAGGAATCAAAAGAGGTCAACTATGAAGAAAAAGGCCCAAAAGAAATGAAGTGGAGGCCCAAAATCAAGCC
CAAAATCGCAGCTGGAAATTCTGGAAATCGCTTAGCGTCAAGACGCTGTAAGGACAGCGTCGCGATGCTGTCCATTTCTTGGTCGGCAAGTCGATGA
mRNA sequence
ATGAGGTTAGGGTGCTGGAAACCTATTAGAGCCAATGACTCTCTCAAAGGCAAAAGGGTGGCTGATATTCTCAATGTAGATGGCTCGTGGAAAACAGGAGTTATTCAAGA
CTCTTTCATTCCTAGTGATGCAGACGCCATTTTAAGCATGACTAAGAGGAACATGGCTGTTAATGACAAGATCATCTGGGGTGTTGACAAAAAGGGTCTATTTTCCGTGA
AAAGTGCTTATCACTTAGCAGTTAATGAAGAGAACCAATTTTCAGCCTCTAGCTCTGAATCCTCATCATCCAACAGAATTTGGAAATCAATTTGGCTCCTCAAGAGTAGG
CCGAAAGTGAAAATTCACTTATGGAAAGCTTTGAACAATGCTCTTCCTACTCTTGAAAACATTAAAAAGAAAGGAGTAGACACTAACACTTGTTGTTTTCTGTGCAGGTG
TAAGAATGAGAATGTGGAGCATATTTTTTGGAGATGCAAAGTGGTAAGGAAAATTTGGGGGAATTTTGCGCCCTCCCTAACTAAGGTTTATGATTTGTGCAGAGATGGGT
GGAAATGTTTGGATTACTGGGATTTTCTATACAAGACGCTCAGTCCTAATGAAGTTGTGAAAGCCAGTCAAATTATATGGGCTATTTGGTTTAAAAGAAATCAGTTGAAG
CAATCTAACAGCAGAATCACAGCAAGCAGAATCATCGAAGATATTGAAGTGTTGGTTGACAGAATGCAGCGGGAAGATTCTTACCAGTCTATTCCGCCGGAGAACCATTC
GAGTCATGGTTGTTGGATGTCGTCGAAGGAGAGTCAGTGGAAGATTAACATCGGCGCAGCCTGGTTTGACGATGCCGGAATTGGAGGTTTAGGGTGGATTGTGCGCGACT
CCGAAGGTTCTCTGATCGGAGCTGGAGGGAAGAAAACATCAAAGAAAATCGAGATCAATTTCTTAGAATCCCTTACTATTGTAGAAGGGCTCAATCAAATTTCAACAAAA
TTTCAAGCGTACCCGGAGATTCGAGATCACGAGGTGGTAGTCGAGTCGGACGCGGCTGAGATAGTGAAACTGCTTAATCAGGAAACGATCGACCTCTCAGAGGTCTCTGT
CGACATCGATGAGATCGTGGGTTGGCATTGCAATGCTTTGATCACCTTGATCAATGCTACTCTCTCGAAGACGGCTTATTCATATGTAATTGGATGTAAATCTTCCAAAG
AGGAACTTCACGCTTTGTTGAAATCTGAGGTTAAAATTCTAGAGCAGCAGAACAAATCACCTTCACTTTCACCTCTTAATCCAACTGCGATATTTTTGTCTGCTCCTCAA
GGACAGTTTGATTCGAATCGAGGTCGTGGAAGAGGGAGAAATTCAGGACCTTATGGATCTGGTCATGGACAATTTAATCAAGGTCGTGGGACAAGGCCACAGGATGAGTT
TTGTACAAGAGTAAGAATAAGGATGGATTATATCCCATATCCTCATATGTTTTCTCCTTTGAAGAATGATTTTGCAACTTCTGGTTTGCAATTATCTCAGTCTATATCTC
CTACATGTGCATCTGTTTTTGTGTCTCGTTTACCTAGTAATATGTTGTGGCATTTGCGTCTTGGACATCCGTCCCTTAATGTTTTGCACAAGTTGTTATTTGTTAGTTCT
ATTACGCATGACACTCAATTTACTTGTCGAGATTGTGAGAGCTGTTTAAAGGGTAAAGCTACTAAACTTCCCTTTGCAATGTCTACTTCCATTTCTACTAGACCTCTTGC
TTTATTGCATAGTGATGTATGGGAACCATCACCATCAGTTTCTGTTTCTGGATTCTGGTATTATATCAAATTTGTTGATGATTTCAGCAAGTTTACATGGATATTTCCTC
TGGTGTATAAGTCAGATGTCTCATCTATCATAAGACAGTTTGTTCCTTTTATTGAAAACCAACTATCTTGTTCTTTGAAATTCTTTCGATCGGATGGAGGGGGTGAGTAC
ATTAATCGTTCAGTTCATGAGTTTCTTTCTTCTAAAGGTGTTCTTCACCAACGGTCTTGCCCTCACAAGCCTGAACAAAATGGTGTTGCTGAACAAAAACATAGGTCTAT
TGTTGCTATTGCTCTTACTCTTATGTACCATGCATTTGTTCCATTAGAATTTTGGTATCATGCCTTCACAACTGCTATGTTTTTGTTGAATCGGTTGCCTTCAAGTGCTA
TTGGTTTCATGACTCCATTTCAGAAATTGTATGGTTATGCACCTGATTTGTCTCATCTTCGTATTTTTGGATGTGAGCAAAGTTGTGGATTTTGTGGAAAAAGTGCCCAA
AATGAAGTCATTCAAGCCCAATTTCCATGCTTGGGTCAAAAGGAATCAAAAGAGGTCAACTATGAAGAAAAAGGCCCAAAAGAAATGAAGTGGAGGCCCAAAATCAAGCC
CAAAATCGCAGCTGGAAATTCTGGAAATCGCTTAGCGTCAAGACGCTGTAAGGACAGCGTCGCGATGCTGTCCATTTCTTGGTCGGCAAGTCGATGA
Protein sequence
MRLGCWKPIRANDSLKGKRVADILNVDGSWKTGVIQDSFIPSDADAILSMTKRNMAVNDKIIWGVDKKGLFSVKSAYHLAVNEENQFSASSSESSSSNRIWKSIWLLKSR
PKVKIHLWKALNNALPTLENIKKKGVDTNTCCFLCRCKNENVEHIFWRCKVVRKIWGNFAPSLTKVYDLCRDGWKCLDYWDFLYKTLSPNEVVKASQIIWAIWFKRNQLK
QSNSRITASRIIEDIEVLVDRMQREDSYQSIPPENHSSHGCWMSSKESQWKINIGAAWFDDAGIGGLGWIVRDSEGSLIGAGGKKTSKKIEINFLESLTIVEGLNQISTK
FQAYPEIRDHEVVVESDAAEIVKLLNQETIDLSEVSVDIDEIVGWHCNALITLINATLSKTAYSYVIGCKSSKEELHALLKSEVKILEQQNKSPSLSPLNPTAIFLSAPQ
GQFDSNRGRGRGRNSGPYGSGHGQFNQGRGTRPQDEFCTRVRIRMDYIPYPHMFSPLKNDFATSGLQLSQSISPTCASVFVSRLPSNMLWHLRLGHPSLNVLHKLLFVSS
ITHDTQFTCRDCESCLKGKATKLPFAMSTSISTRPLALLHSDVWEPSPSVSVSGFWYYIKFVDDFSKFTWIFPLVYKSDVSSIIRQFVPFIENQLSCSLKFFRSDGGGEY
INRSVHEFLSSKGVLHQRSCPHKPEQNGVAEQKHRSIVAIALTLMYHAFVPLEFWYHAFTTAMFLLNRLPSSAIGFMTPFQKLYGYAPDLSHLRIFGCEQSCGFCGKSAQ
NEVIQAQFPCLGQKESKEVNYEEKGPKEMKWRPKIKPKIAAGNSGNRLASRRCKDSVAMLSISWSASR
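
The protein sequence above should correspond to a translation of the CDS. A minimal sketch of that check in plain Python is given here, assuming the standard genetic code applies to this nuclear gene; `translate` is a hypothetical helper, and only the first 39 bases of the CDS are used to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 39 bases of the CDS sequence listed in this record.
print(translate("ATGAGGTTAGGGTGCTGGAAACCTATTAGAGCCAATGAC"))
```

This prints `MRLGCWKPIRAND`, which matches the start of the protein sequence. Passing the full CDS, with its line breaks, would allow the whole protein to be compared in the same way.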