CuGenDBv2

Lag0015335 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0015335
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Transposable element protein
Genome location: chr12:10629796..10638863
RNA-Seq Expression: Lag0015335
Synteny: Lag0015335
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
GO:0016779 - nucleotidyltransferase activity (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR005162 - Retrotransposon gag domain
IPR012337 - Ribonuclease H-like superfamily
IPR036397 - Ribonuclease H superfamily
IPR041588 - Integrase zinc-binding domain
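
The genome location above uses a `chromosome:start..end` notation. As a minimal sketch, the string can be split into its parts and the genomic span computed. The function name `parse_location` is hypothetical, and the length assumes 1-based inclusive coordinates, which the record itself does not state.

```python
import re


def parse_location(location):
    """Split a 'chrom:start..end' string into (chrom, start, end, span).

    Assumption: coordinates are 1-based and inclusive, so the span
    is end - start + 1. The source record does not confirm this.
    """
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    return chrom, start, end, end - start + 1


chrom, start, end, span = parse_location("chr12:10629796..10638863")
print(chrom, start, end, span)
```

Under that assumption the gene spans 9068 bp on chr12, which is longer than the 3063-nt CDS listed further down.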


Homology
GenBank top hits | e value | %identity | Alignment
PIM97577.1 DNA-directed DNA polymerase [Handroanthus impetiginosus] | 3.3e-165 | 58.84
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        AKPRLIRW+LLLQEFDLEI+D+KG+EN IADHLSRL+  +   E + I+D+FPDEQL A+   V  DVPWY DI N+L  G+ P D+  +QKKKF  D +
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
         +FWD+PF++KQ  D I+RRCV   E  +ILEQCH+SPYGGHF G RTA +IL  GFFWP LFKDAH F   CD CQR GN+  R EMPL  ILEVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG F PS GN++IL+AVDYVSKWV+A A   +D+K V  F++ +IF RFGTPRA++SD GTHF N     LL+KYG++H+I+TPYHPQ +GQ E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNREIK ILEK V  +RKDWS RLD+ALWAYRTAYKTP+GMSPYRLV+GKACHLP+ELEH  +WA++KLNFD+  AG  R+LQLNEL+EFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLR---------------------------------------SLSRDEKDGR-VIKVNGQRVKHYWG
        +YKEK K WH+KK+  R                                       ++  + K+ R   KVN QR+KHYWG
Subjt:  MYKEKTKLWHDKKLNLR---------------------------------------SLSRDEKDGR-VIKVNGQRVKHYWG

PIM97577.1 DNA-directed DNA polymerase [Handroanthus impetiginosus] | 6.2e-15 | 33.33
Query:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN
        GV +DALRL LF +SL   A  W  S    SI TW                                ET  EAW RF+++LR CP+H +P  IQ+ TFY+
Subjt:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN

Query:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK
        GL    +  +D   G + L+ T  E + +L  +  N  +    R T  K   V+EVD V+ + AK
Subjt:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK

PIM97577.1 DNA-directed DNA polymerase [Handroanthus impetiginosus] | 2.3e-163 | 45.65
Query:  PILIENDRTRAIRAYVVPMFNELNPGIARPQ-----------------IQAANFEMKPVMFQMLQTVGQFHGVPRDALRLTLFPYSLRDGAKSWLNSFAP
        P+++ N       A V P+ N   P   RPQ                 +  A+F+       + QT+G   G  +D ++L LF ++L+D A+ W  +  P
Subjt:  PILIENDRTRAIRAYVVPMFNELNPGIARPQ-----------------IQAANFEMKPVMFQMLQTVGQFHGVPRDALRLTLFPYSLRDGAKSWLNSFAP

Query:  GSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEI
        GSI TW E+   FL +Y+  NR +++R  I  F+Q   E F EA+ RFKELLRKCPHH +     ++ FY+GL       + A + G  L       +  
Subjt:  GSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEI

Query:  LERISINSCQ--WSDVRGTNKKVKSV--------------------------LEVDGVSTIR----AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLS
        LER S  S +   S  R  +   KSV                          L   GV  +     AKPRLIRWILLLQEFDLEI+DKKGSENV+ADHLS
Subjt:  LERISINSCQ--WSDVRGTNKKVKSV--------------------------LEVDGVSTIR----AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLS

Query:  RLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAKFFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQC
        R+   +   +   I+++FPDE + AV  K     PW+ + ANF V G  P     ++K++F   AK + WDEP ++K  +D IIRRC+   E K +LE  
Subjt:  RLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAKFFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQC

Query:  HSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIA
        H+S  GGHFSGQ+T  ++L CG +WPT+FKDA  F K C  CQ+ G++  R+EMP+  IL V++FDVWGIDFMG FP S GN +IL+AVDYVSKW++AIA
Subjt:  HSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIA

Query:  CHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRLDQALWAYRT
           +D K V +F+Q++IF+RFG PR ++SD G+HF N    KLL ++GI HRIATPYHPQ +GQ E+SNR+IK IL+K V P RKDWS +LD ALWAYRT
Subjt:  CHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRLDQALWAYRT

Query:  AYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAKMYKEKTKLWHDKKLNLRSLSRDEK
        AYKTP+G +PYRLVYGK CHLP EL H+  WA+K++N D   AG  R L L ELEE R  +YE+A  YK+KTK  HD KL L+     +K
Subjt:  AYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAKMYKEKTKLWHDKKLNLRSLSRDEK

XP_023874613.1 uncharacterized protein LOC111987139 [Quercus suber] | 1.6e-47 | 42.02
Query:  ENDRTRAIRAYVVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVG-----------------------QFHGVPRDALRLTLFPYSLRDGAKSWLNSF
        +N + R ++ YV P+ N+   GI R  I A NFE+KP +  M+Q                          + +GV  D +RL LFP+SLRD A+ WL S 
Subjt:  ENDRTRAIRAYVVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVG-----------------------QFHGVPRDALRLTLFPYSLRDGAKSWLNSF

Query:  APGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAY
         PGSI +W ++AEKFL+K+FPP + A+ RSEI  FRQ + E+  EAWER+K+L+R CP HGLP  +Q++ FYNGLNG T+ +VDA++GG L++KT + A 
Subjt:  APGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAY

Query:  EILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK
         +LE ++ N+ QW   R   KKV  + E++  + + A+
Subjt:  EILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK

XP_034212709.1 uncharacterized protein LOC117625210 [Prunus dulcis] | 2.1e-172 | 47.07
Query:  NNQAENPILIENDRTRAIRAYVVPMFNELNPGIARPQIQAANFEMKPVMFQMLQT------------VGQF---------HGVPRDALRLTLFPYSLRDG
        NN      LI   R + +R + +P   +    I  PQ+ A  FE+K  M  +L T            + QF           +  + +++ LFP+SL+D 
Subjt:  NNQAENPILIENDRTRAIRAYVVPMFNELNPGIARPQIQAANFEMKPVMFQMLQT------------VGQF---------HGVPRDALRLTLFPYSLRDG

Query:  AKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALL
        AKSWL S    SI TW+EL+ KFL K+FP  +  K R EI+GF Q E E F E WER+KE++  CPHH +   +QM++FY GL    + MVDA++G  L+
Subjt:  AKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALL

Query:  AKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIR------------AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLE
         KT DEA+ + E +S NS QWS  +G    +K+V+    +   R            AKPRLIRWILLLQEFDLEIKDKKGSENV+ADHLSRL   S+  E
Subjt:  AKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIR------------AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLE

Query:  QS-AISDSFPDEQLFAV-QVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAKFFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGH
         S  + +SFPDEQLF++  +  +  +PW+ DI N+L     P  +   Q+ K +  A+++FWD+P+++K C D +IRR                      
Subjt:  QS-AISDSFPDEQLFAV-QVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAKFFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGH

Query:  FSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKT
                                         CQR GNL  R++MPLT IL +++FDVWGIDFMG FP S G  +IL+AVDYVSKWV+AIA   +DAK 
Subjt:  FSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKT

Query:  VARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGM
        V  FL+ +IF RFGTPRA++SD G+HFVN     LL KYGI H++ATPYHPQ +GQ EISNREIK ILEK V+ +RKDWS RLD ALWAYRTAYKTP+GM
Subjt:  VARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGM

Query:  SPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAKMYKEKTKLWHDKKLNLRSLSRDEK
        SPYRLV+GK CHLP+ELEH+ +WA+K  NFD+  AG  R LQLNELEE R  +YENAK+YKEKTK +HDKK+  ++  + +K
Subjt:  SPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAKMYKEKTKLWHDKKLNLRSLSRDEK

XP_042757945.1 uncharacterized protein LOC111885853 [Lactuca sativa] | 6.8e-163 | 58
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        AK RLIRWILLLQEFDLE++DKKG ENV+ADHLSRLD  S+  +Q  I DSFPDE++    ++V  + PWY +I N+LV GV P    W QKKK   DAK
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
        F+FWDEP++++   D + RRC+   E K+ILE+CH+S YGGHF G++TA+R+LH GF+WP+LFKDA+ F K+CD CQR GN+G R EMPL+ I+EVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG F PS+G ++IL+AVDYVSKWV+A+AC ++DA+TV  FL+  IF+RFGTPRA++SDEGTHF N +L  +LAKY I+HR+AT YHPQ NG AE
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
         +N+++K ILEKVV+ SRKDW+ +LD  LWAYRTAY+T LG SPY+LV+GKACHLPLELEHK +WALK+LN D+  AG  RM QL ELEEFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLRSLSR----------------------------------------DEKDGRVIKVNGQRVKHYWG
        + KEK K WHDKK+  R  +                                         +EKDG    VNGQRVKHY+G
Subjt:  MYKEKTKLWHDKKLNLRSLSR----------------------------------------DEKDGRVIKVNGQRVKHYWG

XP_042757945.1 uncharacterized protein LOC111885853 [Lactuca sativa] | 6.0e-42 | 45.73
Query:  FNELNPGIARPQIQAANFEMKPVMFQMLQTVGQF------------------------HGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEK
        F   NP I    I A NFE KPVMFQM+   G F                         G+  DA RL LFPY+L+  A+ W  S    SI TW+EL EK
Subjt:  FNELNPGIARPQIQAANFEMKPVMFQMLQTVGQF------------------------HGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEK

Query:  FLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQW
        FL ++FPP RNA  R+ I  F+Q++ E+  E WER+  LLRKCP H LP  +++ETFY GL+  T+ +VD SAGGALL K+++E+ EIL+RI+ N+ QW
Subjt:  FLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQW

XP_042757945.1 uncharacterized protein LOC111885853 [Lactuca sativa] | 6.8e-163 | 58.33
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        AKPRLIRWILLLQEFDLE++DKKGSEN +ADHLSRL+    +     I ++FPDEQLFA ++K    +PWY DI NFL   V P D+ + Q+KKF HD K
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
        ++ WDEP ++K+C D IIRRCV   E + IL  CHSS YGGHF   RTA ++L  GFFWP++F+D++   K CD CQR GN+  R E+PL  ILEVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG FPPS G V+ILLAVDYVSKWV+AIA   +DAK V +FL  +IF RFGTPRA++SDEGTHF N +   LL+KYG++H+IA  YHPQ NGQAE
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        ISNREIK ILEK V+ +RKDW+ +LD ALWAYRTA+KTP+GMSPYRLV+GKACHLP+ELEHK +WA+KK N DL  AG  R+LQLNE++EFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKL-----------------------NLRS----------------LSRDEKDGRVIKVNGQRVKHYWG
        +YKE+TK WHDK++                        LRS                +   +K G + +VNGQR+KHY+G
Subjt:  MYKEKTKLWHDKKL-----------------------NLRS----------------LSRDEKDGRVIKVNGQRVKHYWG

TrEMBL top hits | e value | %identity | Alignment
A0A2G9FWY3 Reverse transcriptase | 1.6e-165 | 58.84
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        AKPRLIRW+LLLQEFDLEI+D+KG+EN IADHLSRL+  +   E + I+D+FPDEQL A+   V  DVPWY DI N+L  G+ P D+  +QKKKF  D +
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
         +FWD+PF++KQ  D I+RRCV   E  +ILEQCH+SPYGGHF G RTA +IL  GFFWP LFKDAH F   CD CQR GN+  R EMPL  ILEVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG F PS GN++IL+AVDYVSKWV+A A   +D+K V  F++ +IF RFGTPRA++SD GTHF N     LL+KYG++H+I+TPYHPQ +GQ E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNREIK ILEK V  +RKDWS RLD+ALWAYRTAYKTP+GMSPYRLV+GKACHLP+ELEH  +WA++KLNFD+  AG  R+LQLNEL+EFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLR---------------------------------------SLSRDEKDGR-VIKVNGQRVKHYWG
        +YKEK K WH+KK+  R                                       ++  + K+ R   KVN QR+KHYWG
Subjt:  MYKEKTKLWHDKKLNLR---------------------------------------SLSRDEKDGR-VIKVNGQRVKHYWG

A0A2G9FWY3 Reverse transcriptase | 3.0e-15 | 33.33
Query:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN
        GV +DALRL LF +SL   A  W  S    SI TW                                ET  EAW RF+++LR CP+H +P  IQ+ TFY+
Subjt:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN

Query:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK
        GL    +  +D   G + L+ T  E + +L  +  N  +    R T  K   V+EVD V+ + AK
Subjt:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK

A0A2G9FWY3 Reverse transcriptase | 1.2e-160 | 56.96
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        A P LI W+ LLQEFDLEI+D+KG+EN IADHLSRL+  + + E + I+D+F DEQL A+   V  DVPWY DI N+L  G+ P D+  +QKKK   D +
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
         +FWD+ F++KQ  D I+RRCV   E  +ILEQCH+SPYGGHF G RTA +IL  GFFWP LFKDA+ F   CD CQR GN+  R EMPL  ILEVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMGLF PS GN++IL+AVDYVSKWV+A+A   +D+K V  F++ +IF RFGTPRA++S+ GTHF N     LL+KYG++H+I+TPYHPQ +GQ E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNREIK ILEK V  +RKDWS RLD+ALWAYRTA+KTP+GMSPY+LV+GKACHLP+ELEH  +WA++KLNFD+  AG  R+LQLNEL+EFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLRSLS----------------------------------------RDEKDGRVIKVNGQRVKHYWG
        +YKEKTK WHDKK+  R                                            +E      KVN QR+KHYWG
Subjt:  MYKEKTKLWHDKKLNLRSLS----------------------------------------RDEKDGRVIKVNGQRVKHYWG

A0A2G9G6G2 Reverse transcriptase | 7.9e-157 | 56.13
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        A P LI W+LLLQEFDLEI+D+KG+EN IADHLSRL+  +   E + I+D+FPDEQL A+   V  +VPWY DI N+L  G+ P D+  +QKKK   D +
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
         +FW++PF+ KQ  D I+RRCV   E  +ILEQCH+SPYGGHF G RTA +IL  GFFWP LFKDAH F   CD CQR  N+  R EMPL  ILEVELFD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG F PS GN++IL+AVDYVSKWV+A A   +D+K V  F++ +IF RFGTPRA++SD  T+F N     LL+KYG++H+I TPYHPQ +G  E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNREIK ILEK V  +RKDWS RLD+ALWAYRTAYKTP+GMSPY L++GKACHLP+ELEH  +WA+ KLNFD+  AG  R+LQLNEL+EFR  +YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLRSLS----------------------------------------RDEKDGRVIKVNGQRVKHYWG
        +YKEKTK WHDKK+  R                                            +E      K+N +R+KHYWG
Subjt:  MYKEKTKLWHDKKLNLRSLS----------------------------------------RDEKDGRVIKVNGQRVKHYWG

A0A2G9G6G2 Reverse transcriptase | 3.9e-15 | 32.73
Query:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN
        GV +DALRL LF +SL   A  W  S    SI TW                                ET  EAW +F+++LR CP+H +P  IQ+ TFY+
Subjt:  GVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYN

Query:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK
        GL    +  +D   G + L+ T  E + +L  +  N  +    R T  K   V+EVD V+ + AK
Subjt:  GLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAK

A0A2G9HBV9 DNA-directed DNA polymerase | 1.7e-10 | 37.5
Query:  FRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVST
        FRQ   ET  EAW RF+++L  CP+H +P  IQ+ TFY+GL    +   D   G + L+ T  E + +L  +  N  +    R T  K   V+EVD V+ 
Subjt:  FRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVST

Query:  IRAK
        + AK
Subjt:  IRAK

A0A2G9HBV9 DNA-directed DNA polymerase | 1.7e-159 | 60.09
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        +KPRL+RWILLLQEFDLEI+DKKGSEN +ADHLSRL+      E+ AI D F DE + AV V      PW+ D AN++V    P D   +Q+KKF HD K
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
        F+ WDEPF+YK+  DG++RRCV   E +++L  CH S YGGHFSG RTA ++L  G FWPTLFKDA  + K+CD CQR GN+  R+EMP   ILEVE+FD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG FP S    +IL+AVDYVSKWV+AIA H +DA+ V  FL+ +IF+RFG PRAL+SDEGTHF+N  +  LL KY + HRIATPYHPQ +GQ E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNR+IK ILEK V+ SRKDWS +LD ALWAYRTA+KTP+GMSP+++VYGKACHLPLELEHK  WA K LNFDLS+AG  R+LQL+EL+EFR ++YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLRSLSRDEKDGRVIKVNGQRVKHYWG
        ++KEKTK WHD+K+      ++ ++G+++ +   R++ + G
Subjt:  MYKEKTKLWHDKKLNLRSLSRDEKDGRVIKVNGQRVKHYWG

A0A2K3NJZ5 Integrase catalytic domain-containing protein (Fragment) | 3.8e-159 | 60.09
Query:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK
        +KPRL+RWILLLQEFDLEI+DKKGSEN +ADHLSRL+      E+  I D F DE + AV V      PW+ D AN++V    P D   +Q+KKF HD K
Subjt:  AKPRLIRWILLLQEFDLEIKDKKGSENVIADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAK

Query:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD
        F+ WDEPF+YK+  DG++RRCV   E +++L  CH S YGGHFSG RTA ++L  G FWPTLFKDA  + K+CD CQR GN+  R+EMP   +LEVE+FD
Subjt:  FFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFD

Query:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE
        VWGIDFMG FP S    +IL+AVDYVSKWV+AIA   +DA+ V  FL+ +IF+RFG PRAL+SDEGTHF+N  +  LL KY + HRIATPYHPQ +GQ E
Subjt:  VWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAE

Query:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK
        +SNR+IK ILEK V+ SRKDWS +LD ALWAYRTA+KTP+GMSP+++VYGK+CHLPLELEHK  WA K LNFDLS+AG  R+LQL+EL+EFR F+YENAK
Subjt:  ISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAK

Query:  MYKEKTKLWHDKKLNLRSLSRDEKDGRVIKVNGQRVKHYWG
        ++KEKTK WHDKK+     +R+ ++G+++ +   R+K + G
Subjt:  MYKEKTKLWHDKKLNLRSLSRDEKDGRVIKVNGQRVKHYWG

SwissProt top hits | e value | %identity | Alignment
O93209 Pro-Pol polyprotein | 9.3e-22 | 24.17
Query:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDY
        E  +++++ H+  +    +G+   +  +   ++WP + KD   F   C+ C+    L  +   P   +   + FD + +D++G  PPS G V +L+ VD 
Subjt:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDY

Query:  VSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRL
         + +          +K   + L +H+      P+ L SD+G+ F +    +   +  IQ   +TPYHPQ++G+ E  N EIK +L K++      W   +
Subjt:  VSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRL

Query:  DQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFW
             A    +      +P++L++G  C+LP   +    W
Subjt:  DQALWAYRTAYKTPLGMSPYRLVYGKACHLPLELEHKTFW

P03359 Gag-Pol polyprotein | 3.0e-20 | 29.49
Query:  KEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDAC--------QRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFI
        KE +++ H      H   ++    +       P L         QC AC         R      R + P  Y         W +DF  + P   GN ++
Subjt:  KEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDAC--------QRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFI

Query:  LLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VHPSR
        L+ +D  S WV+A       A TV + +   I  RFG P+ L SD G  FV  +   L  + GI  ++   Y PQ++GQ E  NR IK  L K+ +    
Subjt:  LLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VHPSR

Query:  KDWSFRLDQALWAYRTAYKTP--LGMSPYRLVYG
        KDW   L  AL   R    TP   G++PY ++YG
Subjt:  KDWSFRLDQALWAYRTAYKTP--LGMSPYRLVYG

P10273 Gag-Pol polyprotein | 6.1e-21 | 28.94
Query:  AKEILEQCHSSPYGGHFSGQ--RTAMRILHCGFFWPTLFKDAHWFYKQCDACQR----RGNLGP----RDEMPLTYILEVELFDVWGIDFMGLFPPSNGN
        AKE++   H      H S +  +T +     GF+ P          + C AC +    +   GP    R   P T+         W +DF  + P   G 
Subjt:  AKEILEQCHSSPYGGHFSGQ--RTAMRILHCGFFWPTLFKDAHWFYKQCDACQR----RGNLGP----RDEMPLTYILEVELFDVWGIDFMGLFPPSNGN

Query:  VFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VH
         ++L+ +D  S W +A       AK VA+ L   IF R+G P+ L SD G  F++ +   +    GI  ++   Y PQ++GQ E  NR IK  L K+ + 
Subjt:  VFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VH

Query:  PSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYG
           KDW   L   L+  R     P G++P+ ++YG
Subjt:  PSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYG

P10394 Retrovirus-related Pol polyprotein from transposon 412 | 3.5e-21 | 25.32
Query:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVE-LFDVWGIDFMGLFPPS-NGNVFILLAV
        E + IL   H  P  G  +G    +  +   ++W  + K    + ++C  CQ +       + P+T     E  FD   +D +G  P S NGN + +  +
Subjt:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVE-LFDVWGIDFMGLFPPS-NGNVFILLAV

Query:  DYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSF
          ++K++ AI      AKTVA+ +      ++G  +  ++D GT + N+I+T L     I++  +T +H Q  G  E S+R +   +   +   + DW  
Subjt:  DYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSF

Query:  RLDQALWAYRTAYKTPLGMSPYRLVYGKACHLP
         L   ++ + T         PY LV+G+  +LP
Subjt:  RLDQALWAYRTAYKTPLGMSPYRLVYGKACHLP

P31792 Pol polyprotein (Fragment) | 3.0e-20 | 31.06
Query:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLG---------PRDEMPLTYILEVELFDVWGIDFMGLFPPSNGN
        EA  +++Q H+     H S Q+  + I    F  P            C  CQ + N G          R   P  Y         W IDF  + P   G 
Subjt:  EAKEILEQCHSSPYGGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLG---------PRDEMPLTYILEVELFDVWGIDFMGLFPPSNGN

Query:  VFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VH
         ++L+ VD  S WV+A    Q  A  VA+ +   IF RFG P+ + SD G  FV+ +   L    GI  ++   Y PQ++GQ E  NR IK  L K+ + 
Subjt:  VFILLAVDYVSKWVKAIACHQSDAKTVARFLQSHIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKV-VH

Query:  PSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYG
           KDW   L  AL   R       G++PY ++YG
Subjt:  PSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYG

Arabidopsis top hits | e value | %identity | Alignment
ATMG00750.1 GAG/POL/ENV polyprotein | 1.1e-17 | 66.07
Query:  ILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFM
        +L  GF+WPT FKDAH F   CDACQR+GN   R+EMP  +ILEVE+FDVWGI FM
Subjt:  ILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFM
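
The %identity values in the hit tables can be related to the alignment blocks: in the middle line of each block, a letter marks an identical residue, `+` marks a conservative substitution, and a space marks a mismatch or gap. The sketch below recomputes the value for the ATMG00750.1 alignment from its query and middle lines. The function name `percent_identity` is hypothetical, and the sketch assumes identity is the number of identical positions divided by the alignment length, which matches the reported figure here but is not stated on the page.

```python
def percent_identity(query, midline):
    """Percent identity from a BLAST-style query line and middle line.

    Assumption: identity = identical positions / alignment length,
    where an identical position is any letter in the middle line.
    """
    # Trailing spaces in the middle line are often lost, so pad it back.
    midline = midline.ljust(len(query))
    if len(midline) != len(query):
        raise ValueError("middle line is longer than the query line")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / len(query)


# Query and middle line of the ATMG00750.1 hit, copied from the alignment.
query = "ILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFM"
midline = "+L  GF+WPT FKDAH F   CDACQR+GN   R+EMP  +ILEVE+FDVWGI FM"

print(round(percent_identity(query, midline), 2))
```

For this block, 37 of 56 aligned positions are identical, which rounds to the 66.07 listed for the hit. Multi-block alignments would need the lines of all blocks concatenated before calling the function.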


Sequences
CDS sequence
ATGGACCAAGGAGGCAAAACCGGTAAGTGGGACGGGCCAAGACCAAAGGGGTCGGGTTTTTGGCCCGACCCCGTGCTCGGCCTCGGCCATGGGCCGAGGCCGAGCCCGTC
TGGTGCCGTTAGGTCCCCACTGCTTCCATCTGCCCCGGGAAGGGACAGAGATTCTATCCCTAAACACTATTTGACATTTTCTACTCTCTCTCCTCTTGCTCTTACTTTTC
CACTCCCTACCGTTCTGCTTGCTGACTTAAGCATCGGAGCCAGTGTGGCGAGCACCACACCGGTGCGCAGGTTTACTGTCTTGCAGGCCACGTTTTCCCCCTCATCTACA
AATTTACCGTTGGTGGCACGTGGAGGTCAGCTAGGCGAGGCATGGTGTTGGAACAACTTTAGAAGCATGTATGAAGTTGGAAGTTTTCGAGGGGCAATTTATCTCGACAT
GGGCCATAACACGTTGACAGAAATTGAATCTGCCGATACTAGGGGTGATCATCGGTCTGTCGGCGTCGATTTTAGGCTCAAACTGACGCCGACCACCAACGTGTCGGTTT
TCGGCGGTCGGTTTTCGTCGGTTGGTTTTCGTCGGTTTGGGATGAAGAAAGAAGAGAAAAAGAAAGAAGAGAAAAAGAGGAAAGAAGAGAAAAGGAAAAAAAAAAAGGTC
GCCGGCGGCTGGCGGCAGCGACGAGCGGTGGCCGCCGGCCGCCGGTCGCCGGACATAGGAAGAAGAAGAAGGAGGAAGGAGAAGATTATGCTGCTGAGCGACTGGAGGGA
GCAAATTCTATGCTGCAGCAAAACTAGGAACAGAAACTGCCACATCACAGCTCGTGTGATTTTGGTGCATGAGCGATCCGCCTGGGGTACGCAAAATCCACTGTTTGAGC
AAAATGAGCAGCGAAATAATCAGGCTGAGAATCCTATCTTGATAGAGAACGATAGGACCAGAGCCATTCGAGCGTATGTTGTCCCGATGTTTAATGAGTTGAATCCAGGG
ATTGCACGTCCCCAAATCCAAGCGGCGAATTTTGAAATGAAACCGGTAATGTTTCAGATGTTGCAAACCGTGGGGCAATTCCATGGAGTGCCTAGAGATGCTCTTAGATT
AACTTTGTTCCCGTATTCTCTTAGAGACGGAGCAAAGTCATGGTTAAACTCTTTTGCTCCAGGATCAATTAGGACGTGGGATGAGTTAGCTGAAAAATTTTTGAGTAAAT
ATTTCCCACCTAATAGAAATGCTAAATCAAGGAGTGAAATAGTAGGGTTTAGGCAACTTGAAGATGAGACTTTTAGTGAGGCTTGGGAGAGGTTTAAGGAGCTTTTACGA
AAGTGTCCCCACCATGGGTTACCTCATTGTATTCAAATGGAAACATTTTACAATGGTTTAAATGGAGTAACCCAAGGTATGGTCGATGCTTCGGCTGGAGGGGCCCTTTT
GGCAAAAACTTTTGATGAAGCCTATGAAATTTTAGAAAGAATATCTATTAATAGTTGTCAGTGGTCGGATGTTAGAGGCACGAATAAAAAGGTTAAGAGTGTATTAGAGG
TTGATGGTGTGTCCACCATTAGGGCAAAGCCTAGATTAATTCGTTGGATTTTACTATTGCAGGAATTTGACTTGGAGATAAAGGACAAGAAGGGATCAGAGAATGTCATT
GCAGATCATTTGTCTCGTCTTGATCCATCATCATCTTTGCTGGAGCAATCTGCCATTTCCGATTCTTTTCCAGATGAGCAGCTTTTTGCTGTTCAGGTAAAGGTAGTCAG
GGATGTCCCTTGGTATGTTGATATTGCCAACTTTTTAGTAAAGGGAGTCACTCCTGTTGACATGGATTGGAGGCAGAAGAAAAAGTTTAAGCATGATGCGAAATTTTTCT
TTTGGGATGAGCCATTTATGTATAAGCAATGCTCTGATGGTATTATTCGTAGGTGCGTTTCAGGTGCTGAAGCAAAGGAAATCCTGGAGCAATGTCACTCTTCGCCGTAT
GGAGGTCATTTCAGCGGTCAGAGGACGGCTATGAGGATTTTGCATTGCGGATTTTTCTGGCCTACGTTATTCAAGGATGCCCACTGGTTCTACAAGCAATGTGATGCTTG
TCAAAGGAGAGGAAATTTGGGGCCTAGAGATGAAATGCCTCTTACTTATATTTTAGAAGTTGAATTATTTGATGTATGGGGTATTGACTTTATGGGGCTATTTCCCCCTT
CTAATGGCAATGTTTTTATCTTATTGGCAGTTGATTACGTGTCCAAGTGGGTGAAGGCCATCGCATGCCATCAGAGTGATGCCAAGACAGTTGCAAGGTTTCTTCAATCG
CACATCTTTGCGCGGTTTGGGACACCTAGGGCTCTAGTGAGTGATGAGGGTACACATTTTGTTAATAATATCTTAACTAAGCTTTTAGCTAAGTATGGGATTCAGCATAG
GATAGCTACCCCTTATCACCCACAAGCAAATGGTCAAGCTGAAATTAGTAATAGGGAAATTAAAGCTATTTTAGAGAAAGTAGTCCATCCATCTAGAAAGGATTGGTCCT
TTAGGTTGGATCAGGCTCTTTGGGCTTATAGGACAGCTTATAAGACTCCTCTAGGTATGTCTCCCTATAGGTTAGTATATGGGAAAGCTTGCCATTTACCATTAGAGCTT
GAGCATAAAACATTTTGGGCTTTGAAAAAGTTAAATTTTGATCTGAGTCGTGCAGGAGCAATCAGAATGCTGCAGCTTAATGAGTTAGAAGAATTTCGTCAATTTTCTTA
CGAAAATGCGAAAATGTATAAGGAAAAGACTAAGCTGTGGCATGACAAAAAATTAAATCTAAGGAGTTTGTCAAGGGATGAAAAAGATGGGAGAGTGATCAAGGTGAATG
GACAACGTGTGAAGCATTATTGGGGTTCAGCAGATTGTTGCGGCAAAGATATGGCTGGAGCAAAATATTCCGAATTAAAAGGGTTTTCGTTGGAATTTATTATTTTTACC
GTTGGATTTGGTTTTGAATTCTCGCAGGTAAATATGCGTGCGTCATCTGATGAGGCCACGTGTCGCAGAGCAATTATCCGAAGGCTTCAATGA
mRNA sequence
ATGGACCAAGGAGGCAAAACCGGTAAGTGGGACGGGCCAAGACCAAAGGGGTCGGGTTTTTGGCCCGACCCCGTGCTCGGCCTCGGCCATGGGCCGAGGCCGAGCCCGTC
TGGTGCCGTTAGGTCCCCACTGCTTCCATCTGCCCCGGGAAGGGACAGAGATTCTATCCCTAAACACTATTTGACATTTTCTACTCTCTCTCCTCTTGCTCTTACTTTTC
CACTCCCTACCGTTCTGCTTGCTGACTTAAGCATCGGAGCCAGTGTGGCGAGCACCACACCGGTGCGCAGGTTTACTGTCTTGCAGGCCACGTTTTCCCCCTCATCTACA
AATTTACCGTTGGTGGCACGTGGAGGTCAGCTAGGCGAGGCATGGTGTTGGAACAACTTTAGAAGCATGTATGAAGTTGGAAGTTTTCGAGGGGCAATTTATCTCGACAT
GGGCCATAACACGTTGACAGAAATTGAATCTGCCGATACTAGGGGTGATCATCGGTCTGTCGGCGTCGATTTTAGGCTCAAACTGACGCCGACCACCAACGTGTCGGTTT
TCGGCGGTCGGTTTTCGTCGGTTGGTTTTCGTCGGTTTGGGATGAAGAAAGAAGAGAAAAAGAAAGAAGAGAAAAAGAGGAAAGAAGAGAAAAGGAAAAAAAAAAAGGTC
GCCGGCGGCTGGCGGCAGCGACGAGCGGTGGCCGCCGGCCGCCGGTCGCCGGACATAGGAAGAAGAAGAAGGAGGAAGGAGAAGATTATGCTGCTGAGCGACTGGAGGGA
GCAAATTCTATGCTGCAGCAAAACTAGGAACAGAAACTGCCACATCACAGCTCGTGTGATTTTGGTGCATGAGCGATCCGCCTGGGGTACGCAAAATCCACTGTTTGAGC
AAAATGAGCAGCGAAATAATCAGGCTGAGAATCCTATCTTGATAGAGAACGATAGGACCAGAGCCATTCGAGCGTATGTTGTCCCGATGTTTAATGAGTTGAATCCAGGG
ATTGCACGTCCCCAAATCCAAGCGGCGAATTTTGAAATGAAACCGGTAATGTTTCAGATGTTGCAAACCGTGGGGCAATTCCATGGAGTGCCTAGAGATGCTCTTAGATT
AACTTTGTTCCCGTATTCTCTTAGAGACGGAGCAAAGTCATGGTTAAACTCTTTTGCTCCAGGATCAATTAGGACGTGGGATGAGTTAGCTGAAAAATTTTTGAGTAAAT
ATTTCCCACCTAATAGAAATGCTAAATCAAGGAGTGAAATAGTAGGGTTTAGGCAACTTGAAGATGAGACTTTTAGTGAGGCTTGGGAGAGGTTTAAGGAGCTTTTACGA
AAGTGTCCCCACCATGGGTTACCTCATTGTATTCAAATGGAAACATTTTACAATGGTTTAAATGGAGTAACCCAAGGTATGGTCGATGCTTCGGCTGGAGGGGCCCTTTT
GGCAAAAACTTTTGATGAAGCCTATGAAATTTTAGAAAGAATATCTATTAATAGTTGTCAGTGGTCGGATGTTAGAGGCACGAATAAAAAGGTTAAGAGTGTATTAGAGG
TTGATGGTGTGTCCACCATTAGGGCAAAGCCTAGATTAATTCGTTGGATTTTACTATTGCAGGAATTTGACTTGGAGATAAAGGACAAGAAGGGATCAGAGAATGTCATT
GCAGATCATTTGTCTCGTCTTGATCCATCATCATCTTTGCTGGAGCAATCTGCCATTTCCGATTCTTTTCCAGATGAGCAGCTTTTTGCTGTTCAGGTAAAGGTAGTCAG
GGATGTCCCTTGGTATGTTGATATTGCCAACTTTTTAGTAAAGGGAGTCACTCCTGTTGACATGGATTGGAGGCAGAAGAAAAAGTTTAAGCATGATGCGAAATTTTTCT
TTTGGGATGAGCCATTTATGTATAAGCAATGCTCTGATGGTATTATTCGTAGGTGCGTTTCAGGTGCTGAAGCAAAGGAAATCCTGGAGCAATGTCACTCTTCGCCGTAT
GGAGGTCATTTCAGCGGTCAGAGGACGGCTATGAGGATTTTGCATTGCGGATTTTTCTGGCCTACGTTATTCAAGGATGCCCACTGGTTCTACAAGCAATGTGATGCTTG
TCAAAGGAGAGGAAATTTGGGGCCTAGAGATGAAATGCCTCTTACTTATATTTTAGAAGTTGAATTATTTGATGTATGGGGTATTGACTTTATGGGGCTATTTCCCCCTT
CTAATGGCAATGTTTTTATCTTATTGGCAGTTGATTACGTGTCCAAGTGGGTGAAGGCCATCGCATGCCATCAGAGTGATGCCAAGACAGTTGCAAGGTTTCTTCAATCG
CACATCTTTGCGCGGTTTGGGACACCTAGGGCTCTAGTGAGTGATGAGGGTACACATTTTGTTAATAATATCTTAACTAAGCTTTTAGCTAAGTATGGGATTCAGCATAG
GATAGCTACCCCTTATCACCCACAAGCAAATGGTCAAGCTGAAATTAGTAATAGGGAAATTAAAGCTATTTTAGAGAAAGTAGTCCATCCATCTAGAAAGGATTGGTCCT
TTAGGTTGGATCAGGCTCTTTGGGCTTATAGGACAGCTTATAAGACTCCTCTAGGTATGTCTCCCTATAGGTTAGTATATGGGAAAGCTTGCCATTTACCATTAGAGCTT
GAGCATAAAACATTTTGGGCTTTGAAAAAGTTAAATTTTGATCTGAGTCGTGCAGGAGCAATCAGAATGCTGCAGCTTAATGAGTTAGAAGAATTTCGTCAATTTTCTTA
CGAAAATGCGAAAATGTATAAGGAAAAGACTAAGCTGTGGCATGACAAAAAATTAAATCTAAGGAGTTTGTCAAGGGATGAAAAAGATGGGAGAGTGATCAAGGTGAATG
GACAACGTGTGAAGCATTATTGGGGTTCAGCAGATTGTTGCGGCAAAGATATGGCTGGAGCAAAATATTCCGAATTAAAAGGGTTTTCGTTGGAATTTATTATTTTTACC
GTTGGATTTGGTTTTGAATTCTCGCAGGTAAATATGCGTGCGTCATCTGATGAGGCCACGTGTCGCAGAGCAATTATCCGAAGGCTTCAATGA
Protein sequence
MDQGGKTGKWDGPRPKGSGFWPDPVLGLGHGPRPSPSGAVRSPLLPSAPGRDRDSIPKHYLTFSTLSPLALTFPLPTVLLADLSIGASVASTTPVRRFTVLQATFSPSST
NLPLVARGGQLGEAWCWNNFRSMYEVGSFRGAIYLDMGHNTLTEIESADTRGDHRSVGVDFRLKLTPTTNVSVFGGRFSSVGFRRFGMKKEEKKKEEKKRKEEKRKKKKV
AGGWRQRRAVAAGRRSPDIGRRRRRKEKIMLLSDWREQILCCSKTRNRNCHITARVILVHERSAWGTQNPLFEQNEQRNNQAENPILIENDRTRAIRAYVVPMFNELNPG
IARPQIQAANFEMKPVMFQMLQTVGQFHGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKSRSEIVGFRQLEDETFSEAWERFKELLR
KCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAKTFDEAYEILERISINSCQWSDVRGTNKKVKSVLEVDGVSTIRAKPRLIRWILLLQEFDLEIKDKKGSENVI
ADHLSRLDPSSSLLEQSAISDSFPDEQLFAVQVKVVRDVPWYVDIANFLVKGVTPVDMDWRQKKKFKHDAKFFFWDEPFMYKQCSDGIIRRCVSGAEAKEILEQCHSSPY
GGHFSGQRTAMRILHCGFFWPTLFKDAHWFYKQCDACQRRGNLGPRDEMPLTYILEVELFDVWGIDFMGLFPPSNGNVFILLAVDYVSKWVKAIACHQSDAKTVARFLQS
HIFARFGTPRALVSDEGTHFVNNILTKLLAKYGIQHRIATPYHPQANGQAEISNREIKAILEKVVHPSRKDWSFRLDQALWAYRTAYKTPLGMSPYRLVYGKACHLPLEL
EHKTFWALKKLNFDLSRAGAIRMLQLNELEEFRQFSYENAKMYKEKTKLWHDKKLNLRSLSRDEKDGRVIKVNGQRVKHYWGSADCCGKDMAGAKYSELKGFSLEFIIFT
VGFGFEFSQVNMRASSDEATCRRAIIRRLQ
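
The CDS and protein sequences above can be cross-checked by translating the CDS codon by codon. The following is a minimal sketch, assuming the standard nuclear genetic code (the record does not name a translation table). The names `CODON_TABLE` and `translate` are illustrative and do not come from the database.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS into one-letter protein code, stopping at a stop codon.

    Whitespace and line breaks are ignored, so a wrapped sequence can be
    pasted in directly. Codons with ambiguous bases become 'X'.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    protein = []
    for pos in range(0, len(cds), 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 and last 15 nucleotides of the CDS listed above.
print(translate("ATGGACCAAGGAGGCAAAACCGGTAAGTGG"))
print(translate("CGAAGGCTTCAATGA"))
```

The two fragments translate to `MDQGGKTGKW` and `RRLQ`, matching the start and end of the protein sequence. Passing the full CDS should reproduce the whole protein if the standard-code assumption holds.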