CuGenDBv2

CmoCh13G007680 (gene) of Cucurbita moschata (Rifu) v1 genome

Gene ID: CmoCh13G007680
Organism: Cucurbita moschata Rifu (Cucurbita moschata (Rifu) v1)
Description: Retrovirus-related Pol polyprotein from transposon RE1
Genome location: Cmo_Chr13:7434032..7437406
RNA-Seq Expression: CmoCh13G007680
Synteny: CmoCh13G007680
Gene Ontology terms:
  GO:0015074 - DNA integration (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
  GO:0016779 - nucleotidyltransferase activity (molecular function)
InterPro domains:
  IPR013103 - Reverse transcriptase, RNA-dependent DNA polymerase
  IPR043502 - DNA/RNA polymerase superfamily


Homology
GenBank top hits (e value, % identity, alignment)
CAN68148.1 hypothetical protein VITISV_035665 [Vitis vinifera]; e value: 1.3e-287; % identity: 50
Query:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG
        A  SQ  ++  S +++  F++EQ++++AQA+ A+NH  SGN D + N A                      GATDHIVS  S+ T+ K + ++ +NLPNG
Subjt:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG

Query:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFG-----------------------------
          + ++HTG +     L L +VLCVPSFNLNL+S SKL  +  CY+ F+PD C++QDL +GKMIGSGKQ G                             
Subjt:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFG-----------------------------

Query:  ----------------------------------------------------------------------------------------------------
                                                                                                            
Subjt:  ----------------------------------------------------------------------------------------------------

Query:  ----------------------------------------------------------------------------------------------------
                                                                                                            
Subjt:  ----------------------------------------------------------------------------------------------------

Query:  ------------------------------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPF-SSASQTSTLAPSTPV--------
                                                  GYP G KGYK+ D+Q+ K  +SRDV F E+ FPF SS+SQ+   +PS P+        
Subjt:  ------------------------------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPF-SSASQTSTLAPSTPV--------

Query:  --VPLHDPSYSNIHPPP-----SIPSPPT---------------PPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSS
           P+  P +S    PP      + SPP+                P P  PSP++ SSPP  P  P+N++   P    PLRRSTR  QPPAWH DY MS+
Subjt:  --VPLHDPSYSNIHPPP-----SIPSPPT---------------PPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSS

Query:  GANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDG
          NH ++ SS   GTRYPL  +LSF RFSP  RAFLAL+T+QTEP ++++A  DP W+QAM+ E+ ALERN+TW +VPLP GHK IGCRWVYKIKY+SDG
Subjt:  GANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDG

Query:  SVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNW
        ++ERYKARLVAKGYTQV GIDY ETFSPTAKLTTLRCLLTVAA+R W+ HQLDV NAFLHGNL EEVYM+ PPGLRRQGEN VCRL KS+YGLKQASRNW
Subjt:  SVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNW

Query:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT
        FS F+ T+++AGY QSKADYSLFTKS+G  FTA+LIYVDDILLTGNDL EI+ LKT LL++F IKDLG LKYFLGIEFSRS+KGIFMSQRKY LDILQDT
Subjt:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT

Query:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC
        GLTG +P+KFPMEQNLKL+  +GE L+DPS+YRRL+GRLIYLTVTRPDI YSVR LSQFM+ PRKPHWEAALRVLRYIKG+PGQGL LPSENNL L A+C
Subjt:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC

Query:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI
        DSDWGGCR SRRS+SG+C+FLG+S+ISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYIL+DL V L +PA L+CDNQAAL+IAANPVFHERTKHIEI
Subjt:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI

Query:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLG
        DCHIVREKLQAG+I+PCYVSTKMQLADVFTKALGR+QF+FL  KLG
Subjt:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLG

KAD4180152.1 hypothetical protein E3N88_28743 [Mikania micrantha]; e value: 3.0e-289; % identity: 55.69
Query:  TEEQSPNSIPNFSSEQLREIAQALSAINHHPS-GNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARVS
        TE  + N +  F+SEQL++++QALSA+N + S G  D   N AG   +  + +NS  S  WILDSGATDHI++  S +++ K++ +  +NLPNG +  ++
Subjt:  TEEQSPNSIPNFSSEQLREIAQALSAINHHPS-GNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARVS

Query:  HTGNISLSPNLQLNNVLC--------------------------------------------VPSFNLNLMSISKLTNNLKCYV----------------
        HTG +S SP +QL NVLC                                             PS N   +  S L +N  C V                
Subjt:  HTGNISLSPNLQLNNVLC--------------------------------------------VPSFNLNLMSISKLTNNLKCYV----------------

Query:  -TFYP-------------DSCV---------------MQDLATGKMIGSG--------------------------------------------KQFG--
          F P              SCV               + ++A      SG                                            + FG  
Subjt:  -TFYP-------------DSCV---------------MQDLATGKMIGSG--------------------------------------------KQFG--

Query:  ----------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSAS--QTSTLAPSTPVVPLHD----------PSYSNIHPPPSI
                              GYP G KGYKLYD+ S KFF+SRDVKF E  FPFSS S   +S+L P+      H+          P  +++ P  SI
Subjt:  ----------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSAS--QTSTLAPSTPVVPLHD----------PSYSNIHPPPSI

Query:  PSP-PTPPSPPI----PSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRA
        PS  P   SP +     S T+ S+  P+   PTN N  PP     LRRS+R KQ PAWH  Y M S ANH + ++S   GTRYPL  +LSFSRFSP+ R 
Subjt:  PSP-PTPPSPPI----PSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRA

Query:  FLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTT
        FL  IT+QTEP++YDEA+  P WQQAM  E+ AL+ N+TWSLVPLP+GHK IGCRWVYKIKYNSDG++ERYKARLVAKGYTQVEGIDY ETFSPTAKLTT
Subjt:  FLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTT

Query:  LRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAV
        LRCLLTVA AR WFTHQLDVQNAFLHG+L E VYM+ PPG  ++G+N VCRL+KSLYGLKQASRNWFS FS T+Q AGYTQSKADYSLFTK++G SFTA+
Subjt:  LRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAV

Query:  LIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRR
        LIYVDDILLT NDL EI+ LK  LL++F IKDLG+LKYFLGIEF RS+ GIFMSQRKYA+DILQD+GL G+RP+KFPMEQNLKL+LT+G+KL+DP+KYRR
Subjt:  LIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRR

Query:  LIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQT
        L+GRLIYLTVTRPDI YSVR LSQFMHEPRKPHW+AA+RVL+YIKGTPGQGLLLPS NNLRL+A+CDSDWGGCR +RRS+SG+C+FLGNSIISWKSKKQ 
Subjt:  LIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQT

Query:  NVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALG
        NVSRSSAEAEYRAMANTCLELTWLRY+LQDL VPLS P  LYCDN+AALHIAANPVFHERTKHIEIDCHIVR+K   G+I P Y+ T++QLAD+FTKALG
Subjt:  NVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALG

Query:  RQQFDFLKDKLGVIDIHSPT
        R QF+ L++KLGV D+H PT
Subjt:  RQQFDFLKDKLGVIDIHSPT

PNX93906.1 hypothetical protein L195_g017068 [Trifolium pratense]; e value: 1.6e-285; % identity: 61.6
Query:  NTEEQSPNSIPNFSSEQLREIAQALSAI-NHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARV
        N  E S +++   S+EQL+++A+A + I +++ +GN+++H N AGL   S  SINS  +  WILDSGAT+HI S  + +++ K++ +  +NLP G +A +
Subjt:  NTEEQSPNSIPNFSSEQLREIAQALSAI-NHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARV

Query:  SHT--------------GNISLSP------NLQLNNVLCVPSFNLNLMSISKLTN----NLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYK
        +                G   +SP      + Q++    +    L   S   L +       C+ T    +      A   +      F GYP G KGYK
Subjt:  SHT--------------GNISLSP------NLQLNNVLCVPSFNLNLMSISKLTN----NLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYK

Query:  LYDMQSHKFFISRDVKFCEDDFP-FSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDS-PTNSN-PIPPDTSAP-
        +YD ++  FF+SRDV+FCE DFP   + S+ ++++   P   L D   S I  P  +PS      PP P   TPS+  P  DS PT S+ P PP +  P 
Subjt:  LYDMQSHKFFISRDVKFCEDDFP-FSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDS-PTNSN-PIPPDTSAP-

Query:  LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPL
        +RRS R K PP WH+DY MS   N  +S  +  +GTRYPL HYLS+SR S T   FLA IT+  EP++YD+AV DP WQ AMN E+ AL++N+TW+LVPL
Subjt:  LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPL

Query:  PLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQG
        P GHK IGC+WVYKIKY SDG++ERYKARLVAKGYTQVEGIDY ETFSPTAK+TTLRCLLTVAA+R WF HQLDVQNAFLHG+L E VYM  PPGLRRQG
Subjt:  PLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQG

Query:  ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFS
        EN VCRL+KSLYGLKQASRNWFS FS  IQ AGY QSKADYSLFTK +GTSFTAVLIYVDDILLTGNDLEE++ LK  LL+ F IKDLG+LKYFLGIEFS
Subjt:  ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFS

Query:  RSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIK
        RS+KGIFMSQRKYALDILQD+GL GARPDKFPMEQNLKL+ T+G  L DP+KYRRL+GRLIYLTVTRPDI YSV+ LSQFMHEPRKPHW+AALRVLRYIK
Subjt:  RSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIK

Query:  GTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDN
        GTPGQG+L  + N+L L+A+CDSDWGGC  +RRS++GFCIFLGNS ISWKSKKQ  VSRSSAE+EYRAMANTCLELTWLR+ILQDL V  + P  L+CDN
Subjt:  GTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDN

Query:  QAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        QAALHIAANPVFHERTKHIEIDCHIVREKLQAG+I P YV T+ QLADVFTKALG+ QF  L++KLG+ DIHSPT
Subjt:  QAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

PNX93928.1 hypothetical protein L195_g017092, partial [Trifolium pratense]; e value: 9.4e-275; % identity: 71.78
Query:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQTSTLAPSTPVVPLHD--PSYSNIH---PPPSIPSPPTPPSPPIPSPTTPS--SPPPSPD
        F GYP G KGYK+YD ++  FF+SRDVKFCE +FP    +    L  S P     D  PS ++ H       IPS   P SP   +  T S  SP   P 
Subjt:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQTSTLAPSTPVVPLHD--PSYSNIH---PPPSIPSPPTPPSPPIPSPTTPS--SPPPSPD

Query:  SPTNSNPIPPDTSAP-LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMND
          T     PP    P +R+S R K PP WH DY MS+  N   S  + G+GTRYPL HYLS+SR S +  AFLA IT+  EP++YD+AV DPLWQ AMN 
Subjt:  SPTNSNPIPPDTSAP-LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMND

Query:  EIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNL
        E+ ALE+N+TWSLVPLP GHK IGC+WVYKIKY SDG++ERYKARLVAKGYTQVEGIDY ETFSPTAK+TTLRCLLTVAAAR WF HQLDVQNAFLHG+L
Subjt:  EIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNL

Query:  DEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFL
         E VYM  PPGLRRQGEN VCRL+KSLYGLKQASRNWFS FS  IQ AGY QSKADYSLFTKS+GTSFTAVLIYVDDILLTGNDL+E++ LK  LL++F 
Subjt:  DEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFL

Query:  IKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEP
        IKDLGNLKYFLGIEFSRS+KGIFMSQRKYALDILQD+GLTGARPDKFPMEQNLKL+ T+G  LNDP+KYRRL+GRLIYLTVTRPDI YSV+ LSQFMHEP
Subjt:  IKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEP

Query:  RKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQ
        RKPHW+AALRVLRYIKGTPGQGLL  S N+L L+A+CDSDWGGC  +RRS++GFC+FLGNS+ISWKSKKQ  VSRSSAE+EYRAMANTCLELTWLR+ILQ
Subjt:  RKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQ

Query:  DLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        DL V  + P  L+CDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGII P YV T+ QLADVFTKALG+ QF  L+ KLG+ DIHSPT
Subjt:  DLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

RVW71032.1 Retrovirus-related Pol polyprotein from transposon RE1 [Vitis vinifera]; e value: 1.8e-305; % identity: 64.55
Query:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG
        A  SQ  ++  S +++  F++EQ++++AQA+ A+NH  SGN D + NVAGL P + +S+NS+S++SWILD+GATDHIVS  S+ T+ K + ++ +NLPNG
Subjt:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG

Query:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYKLYDMQSHKFFISRDVKFCE
          + ++HTG +     L L +VLCVPSFNLNL+S SKL  +  C + F+PD C++QDL +GKMIGS             Y +  + +    +S    F  
Subjt:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYKLYDMQSHKFFISRDVKFCE

Query:  DDFPFSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGA
           P  S S T  L+   PV     PS +   P P  P     P P  PSP++ SSPP  P  P+ ++   P    PLRRSTR  QPPAWH DY MS+  
Subjt:  DDFPFSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGA

Query:  NHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSV
        NH ++ SS               SR             +QTEP ++++A  DP W+QAM+ E+ ALERN+TW +VPLP GHK IGCRWVYKIKY+SDG++
Subjt:  NHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSV

Query:  ERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFS
        ERYKARLVAKGYTQV GIDY ETFSPTAKLTTLRCLLTVAA+R W+ HQLDV NAFLHGNL EEVYM+ PPGLRRQGEN VCRL KS+YGLKQASRNWFS
Subjt:  ERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFS

Query:  IFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGL
         F+ T+++AGY QSKADYSLFTKS+G  FTA+LIYVDDILLTGNDL EI+ LKT LL++F IKDLG LKYFLGIEFSRS+KGIFMSQRKYALDILQDTGL
Subjt:  IFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGL

Query:  TGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDS
        TGA+P+KFPMEQNLKL+   GE L+DPS+YRRL+GRLIYLTVTRPDI YSVR LSQFM+ PRKPHWEAALRVLRYIKG+PGQGL LPSENNL L A+CDS
Subjt:  TGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDS

Query:  DWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDC
        DWGGCR SRRS+SG+C+FLG+S+ISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYIL+DL V L +PA L+CDNQAAL+IAANPVFHERTKHIEIDC
Subjt:  DWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDC

Query:  HIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        HIVREKLQAG+I+PCYVSTKMQLADVFTKALGR+QF+FL  KLGV+D+HSPT
Subjt:  HIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

TrEMBL top hits (e value, % identity, alignment)
A0A2K3MSX0 Integrase catalytic domain-containing protein (Fragment); e value: 4.6e-275; % identity: 71.78
Query:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQTSTLAPSTPVVPLHD--PSYSNIH---PPPSIPSPPTPPSPPIPSPTTPS--SPPPSPD
        F GYP G KGYK+YD ++  FF+SRDVKFCE +FP    +    L  S P     D  PS ++ H       IPS   P SP   +  T S  SP   P 
Subjt:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQTSTLAPSTPVVPLHD--PSYSNIH---PPPSIPSPPTPPSPPIPSPTTPS--SPPPSPD

Query:  SPTNSNPIPPDTSAP-LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMND
          T     PP    P +R+S R K PP WH DY MS+  N   S  + G+GTRYPL HYLS+SR S +  AFLA IT+  EP++YD+AV DPLWQ AMN 
Subjt:  SPTNSNPIPPDTSAP-LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMND

Query:  EIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNL
        E+ ALE+N+TWSLVPLP GHK IGC+WVYKIKY SDG++ERYKARLVAKGYTQVEGIDY ETFSPTAK+TTLRCLLTVAAAR WF HQLDVQNAFLHG+L
Subjt:  EIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNL

Query:  DEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFL
         E VYM  PPGLRRQGEN VCRL+KSLYGLKQASRNWFS FS  IQ AGY QSKADYSLFTKS+GTSFTAVLIYVDDILLTGNDL+E++ LK  LL++F 
Subjt:  DEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFL

Query:  IKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEP
        IKDLGNLKYFLGIEFSRS+KGIFMSQRKYALDILQD+GLTGARPDKFPMEQNLKL+ T+G  LNDP+KYRRL+GRLIYLTVTRPDI YSV+ LSQFMHEP
Subjt:  IKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEP

Query:  RKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQ
        RKPHW+AALRVLRYIKGTPGQGLL  S N+L L+A+CDSDWGGC  +RRS++GFC+FLGNS+ISWKSKKQ  VSRSSAE+EYRAMANTCLELTWLR+ILQ
Subjt:  RKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQ

Query:  DLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        DL V  + P  L+CDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGII P YV T+ QLADVFTKALG+ QF  L+ KLG+ DIHSPT
Subjt:  DLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

A0A2K3MT28 Uncharacterized protein; e value: 7.5e-286; % identity: 61.6
Query:  NTEEQSPNSIPNFSSEQLREIAQALSAI-NHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARV
        N  E S +++   S+EQL+++A+A + I +++ +GN+++H N AGL   S  SINS  +  WILDSGAT+HI S  + +++ K++ +  +NLP G +A +
Subjt:  NTEEQSPNSIPNFSSEQLREIAQALSAI-NHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARV

Query:  SHT--------------GNISLSP------NLQLNNVLCVPSFNLNLMSISKLTN----NLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYK
        +                G   +SP      + Q++    +    L   S   L +       C+ T    +      A   +      F GYP G KGYK
Subjt:  SHT--------------GNISLSP------NLQLNNVLCVPSFNLNLMSISKLTN----NLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYK

Query:  LYDMQSHKFFISRDVKFCEDDFP-FSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDS-PTNSN-PIPPDTSAP-
        +YD ++  FF+SRDV+FCE DFP   + S+ ++++   P   L D   S I  P  +PS      PP P   TPS+  P  DS PT S+ P PP +  P 
Subjt:  LYDMQSHKFFISRDVKFCEDDFP-FSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDS-PTNSN-PIPPDTSAP-

Query:  LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPL
        +RRS R K PP WH+DY MS   N  +S  +  +GTRYPL HYLS+SR S T   FLA IT+  EP++YD+AV DP WQ AMN E+ AL++N+TW+LVPL
Subjt:  LRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPL

Query:  PLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQG
        P GHK IGC+WVYKIKY SDG++ERYKARLVAKGYTQVEGIDY ETFSPTAK+TTLRCLLTVAA+R WF HQLDVQNAFLHG+L E VYM  PPGLRRQG
Subjt:  PLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQG

Query:  ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFS
        EN VCRL+KSLYGLKQASRNWFS FS  IQ AGY QSKADYSLFTK +GTSFTAVLIYVDDILLTGNDLEE++ LK  LL+ F IKDLG+LKYFLGIEFS
Subjt:  ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFS

Query:  RSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIK
        RS+KGIFMSQRKYALDILQD+GL GARPDKFPMEQNLKL+ T+G  L DP+KYRRL+GRLIYLTVTRPDI YSV+ LSQFMHEPRKPHW+AALRVLRYIK
Subjt:  RSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIK

Query:  GTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDN
        GTPGQG+L  + N+L L+A+CDSDWGGC  +RRS++GFCIFLGNS ISWKSKKQ  VSRSSAE+EYRAMANTCLELTWLR+ILQDL V  + P  L+CDN
Subjt:  GTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDN

Query:  QAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        QAALHIAANPVFHERTKHIEIDCHIVREKLQAG+I P YV T+ QLADVFTKALG+ QF  L++KLG+ DIHSPT
Subjt:  QAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

A0A438GFQ0 Retrovirus-related Pol polyprotein from transposon RE1; e value: 8.5e-306; % identity: 64.55
Query:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG
        A  SQ  ++  S +++  F++EQ++++AQA+ A+NH  SGN D + NVAGL P + +S+NS+S++SWILD+GATDHIVS  S+ T+ K + ++ +NLPNG
Subjt:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG

Query:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYKLYDMQSHKFFISRDVKFCE
          + ++HTG +     L L +VLCVPSFNLNL+S SKL  +  C + F+PD C++QDL +GKMIGS             Y +  + +    +S    F  
Subjt:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYKLYDMQSHKFFISRDVKFCE

Query:  DDFPFSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGA
           P  S S T  L+   PV     PS +   P P  P     P P  PSP++ SSPP  P  P+ ++   P    PLRRSTR  QPPAWH DY MS+  
Subjt:  DDFPFSSASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGA

Query:  NHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSV
        NH ++ SS               SR             +QTEP ++++A  DP W+QAM+ E+ ALERN+TW +VPLP GHK IGCRWVYKIKY+SDG++
Subjt:  NHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSV

Query:  ERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFS
        ERYKARLVAKGYTQV GIDY ETFSPTAKLTTLRCLLTVAA+R W+ HQLDV NAFLHGNL EEVYM+ PPGLRRQGEN VCRL KS+YGLKQASRNWFS
Subjt:  ERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFS

Query:  IFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGL
         F+ T+++AGY QSKADYSLFTKS+G  FTA+LIYVDDILLTGNDL EI+ LKT LL++F IKDLG LKYFLGIEFSRS+KGIFMSQRKYALDILQDTGL
Subjt:  IFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGL

Query:  TGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDS
        TGA+P+KFPMEQNLKL+   GE L+DPS+YRRL+GRLIYLTVTRPDI YSVR LSQFM+ PRKPHWEAALRVLRYIKG+PGQGL LPSENNL L A+CDS
Subjt:  TGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDS

Query:  DWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDC
        DWGGCR SRRS+SG+C+FLG+S+ISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYIL+DL V L +PA L+CDNQAAL+IAANPVFHERTKHIEIDC
Subjt:  DWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDC

Query:  HIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT
        HIVREKLQAG+I+PCYVSTKMQLADVFTKALGR+QF+FL  KLGV+D+HSPT
Subjt:  HIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT

A0A5N6N393 Integrase catalytic domain-containing protein; e value: 1.5e-289; % identity: 55.69
Query:  TEEQSPNSIPNFSSEQLREIAQALSAINHHPS-GNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARVS
        TE  + N +  F+SEQL++++QALSA+N + S G  D   N AG   +  + +NS  S  WILDSGATDHI++  S +++ K++ +  +NLPNG +  ++
Subjt:  TEEQSPNSIPNFSSEQLREIAQALSAINHHPS-GNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARVS

Query:  HTGNISLSPNLQLNNVLC--------------------------------------------VPSFNLNLMSISKLTNNLKCYV----------------
        HTG +S SP +QL NVLC                                             PS N   +  S L +N  C V                
Subjt:  HTGNISLSPNLQLNNVLC--------------------------------------------VPSFNLNLMSISKLTNNLKCYV----------------

Query:  -TFYP-------------DSCV---------------MQDLATGKMIGSG--------------------------------------------KQFG--
          F P              SCV               + ++A      SG                                            + FG  
Subjt:  -TFYP-------------DSCV---------------MQDLATGKMIGSG--------------------------------------------KQFG--

Query:  ----------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSAS--QTSTLAPSTPVVPLHD----------PSYSNIHPPPSI
                              GYP G KGYKLYD+ S KFF+SRDVKF E  FPFSS S   +S+L P+      H+          P  +++ P  SI
Subjt:  ----------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSAS--QTSTLAPSTPVVPLHD----------PSYSNIHPPPSI

Query:  PSP-PTPPSPPI----PSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRA
        PS  P   SP +     S T+ S+  P+   PTN N  PP     LRRS+R KQ PAWH  Y M S ANH + ++S   GTRYPL  +LSFSRFSP+ R 
Subjt:  PSP-PTPPSPPI----PSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRA

Query:  FLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTT
        FL  IT+QTEP++YDEA+  P WQQAM  E+ AL+ N+TWSLVPLP+GHK IGCRWVYKIKYNSDG++ERYKARLVAKGYTQVEGIDY ETFSPTAKLTT
Subjt:  FLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTT

Query:  LRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAV
        LRCLLTVA AR WFTHQLDVQNAFLHG+L E VYM+ PPG  ++G+N VCRL+KSLYGLKQASRNWFS FS T+Q AGYTQSKADYSLFTK++G SFTA+
Subjt:  LRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAV

Query:  LIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRR
        LIYVDDILLT NDL EI+ LK  LL++F IKDLG+LKYFLGIEF RS+ GIFMSQRKYA+DILQD+GL G+RP+KFPMEQNLKL+LT+G+KL+DP+KYRR
Subjt:  LIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRR

Query:  LIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQT
        L+GRLIYLTVTRPDI YSVR LSQFMHEPRKPHW+AA+RVL+YIKGTPGQGLLLPS NNLRL+A+CDSDWGGCR +RRS+SG+C+FLGNSIISWKSKKQ 
Subjt:  LIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQT

Query:  NVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALG
        NVSRSSAEAEYRAMANTCLELTWLRY+LQDL VPLS P  LYCDN+AALHIAANPVFHERTKHIEIDCHIVR+K   G+I P Y+ T++QLAD+FTKALG
Subjt:  NVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALG

Query:  RQQFDFLKDKLGVIDIHSPT
        R QF+ L++KLGV D+H PT
Subjt:  RQQFDFLKDKLGVIDIHSPT

A5BNR5 Integrase catalytic domain-containing protein; e value: 6.1e-288; % identity: 50
Query:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG
        A  SQ  ++  S +++  F++EQ++++AQA+ A+NH  SGN D + N A                      GATDHIVS  S+ T+ K + ++ +NLPNG
Subjt:  ADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNG

Query:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFG-----------------------------
          + ++HTG +     L L +VLCVPSFNLNL+S SKL  +  CY+ F+PD C++QDL +GKMIGSGKQ G                             
Subjt:  ETARVSHTGNISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFG-----------------------------

Query:  ----------------------------------------------------------------------------------------------------
                                                                                                            
Subjt:  ----------------------------------------------------------------------------------------------------

Query:  ----------------------------------------------------------------------------------------------------
                                                                                                            
Subjt:  ----------------------------------------------------------------------------------------------------

Query:  ------------------------------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPF-SSASQTSTLAPSTPV--------
                                                  GYP G KGYK+ D+Q+ K  +SRDV F E+ FPF SS+SQ+   +PS P+        
Subjt:  ------------------------------------------GYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPF-SSASQTSTLAPSTPV--------

Query:  --VPLHDPSYSNIHPPP-----SIPSPPT---------------PPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSS
           P+  P +S    PP      + SPP+                P P  PSP++ SSPP  P  P+N++   P    PLRRSTR  QPPAWH DY MS+
Subjt:  --VPLHDPSYSNIHPPP-----SIPSPPT---------------PPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSS

Query:  GANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDG
          NH ++ SS   GTRYPL  +LSF RFSP  RAFLAL+T+QTEP ++++A  DP W+QAM+ E+ ALERN+TW +VPLP GHK IGCRWVYKIKY+SDG
Subjt:  GANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDG

Query:  SVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNW
        ++ERYKARLVAKGYTQV GIDY ETFSPTAKLTTLRCLLTVAA+R W+ HQLDV NAFLHGNL EEVYM+ PPGLRRQGEN VCRL KS+YGLKQASRNW
Subjt:  SVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNW

Query:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT
        FS F+ T+++AGY QSKADYSLFTKS+G  FTA+LIYVDDILLTGNDL EI+ LKT LL++F IKDLG LKYFLGIEFSRS+KGIFMSQRKY LDILQDT
Subjt:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT

Query:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC
        GLTG +P+KFPMEQNLKL+  +GE L+DPS+YRRL+GRLIYLTVTRPDI YSVR LSQFM+ PRKPHWEAALRVLRYIKG+PGQGL LPSENNL L A+C
Subjt:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC

Query:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI
        DSDWGGCR SRRS+SG+C+FLG+S+ISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYIL+DL V L +PA L+CDNQAAL+IAANPVFHERTKHIEI
Subjt:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI

Query:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLG
        DCHIVREKLQAG+I+PCYVSTKMQLADVFTKALGR+QF+FL  KLG
Subjt:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLG

SwissProt top hits (e value, % identity, alignment)
P04146 Copia protein; e value: 1.9e-97; % identity: 37.72
Query:  WQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQN
        W++A+N E+ A + N+TW++   P     +  RWV+ +KYN  G+  RYKARLVA+G+TQ   IDY ETF+P A++++ R +L++        HQ+DV+ 
Subjt:  WQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQN

Query:  AFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKG--TSFTAVLIYVDDILLTGNDLEEIQYL
        AFL+G L EE+YM LP G+    +N VC+L+K++YGLKQA+R WF +F   ++   +  S  D  ++   KG       VL+YVDD+++   D+  +   
Subjt:  AFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKG--TSFTAVLIYVDDILLTGNDLEEIQYL

Query:  KTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLS-LTEGEKLNDPSKYRRLIGRLIYLTV-TRPDIAYS
        K  L++KF + DL  +K+F+GI        I++SQ  Y   IL    +        P+   +    L   E  N P   R LIG L+Y+ + TRPD+  +
Subjt:  KTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLS-LTEGEKLNDPSKYRRLIGRLIYLTV-TRPDIAYS

Query:  VRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLL----LPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGN-SIISWKSKKQTNVSRSSAEAEYRA
        V +LS++  +     W+   RVLRY+KGT    L+    L  EN  ++  Y DSDW G    R+S +G+   + + ++I W +K+Q +V+ SS EAEY A
Subjt:  VRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLL----LPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGN-SIISWKSKKQTNVSRSSAEAEYRA

Query:  MANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV
        +     E  WL+++L  +N+ L  P  +Y DNQ  + IA NP  H+R KHI+I  H  RE++Q  +I   Y+ T+ QLAD+FTK L   +F  L+DKLG+
Subjt:  MANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV

Query:  I
        +
Subjt:  I

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | 1.5e-89 | 33.91
Query:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFS---SASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTN
        F GY     GY+L+D    K   SRDV F E +   +   S    + + P+   +P      S  + P S  S     S     P          D    
Subjt:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFS---SASQTSTLAPSTPVVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTN

Query:  SNPIP---PDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQ---AMN
            P    +   PLRRS R +                            RYP   Y+              LI+   EP++  E +  P   Q   AM 
Subjt:  SNPIP---PDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQ---AMN

Query:  DEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGN
        +E+ +L++N T+ LV LP G + + C+WV+K+K + D  + RYKARLV KG+ Q +GID+ E FSP  K+T++R +L++AA+      QLDV+ AFLHG+
Subjt:  DEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGN

Query:  LDEEVYMSLPPGLRRQG-ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSL-FTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQ
        L+EE+YM  P G    G ++ VC+L+KSLYGLKQA R W+  F + +++  Y ++ +D  + F +    +F  +L+YVDD+L+ G D   I  LK  L +
Subjt:  LDEEVYMSLPPGLRRQG-ENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSL-FTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQ

Query:  KFLIKDLGNLKYFLGIEFSRSR--KGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSL----TEGEKLNDPSK--YRRLIGRLIYLTV-TRPDIAY
         F +KDLG  +  LG++  R R  + +++SQ KY   +L+   +  A+P   P+  +LKLS     T  E+  + +K  Y   +G L+Y  V TRPDIA+
Subjt:  KFLIKDLGNLKYFLGIEFSRSR--KGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSL----TEGEKLNDPSK--YRRLIGRLIYLTV-TRPDIAY

Query:  SVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANT
        +V ++S+F+  P K HWEA   +LRY++GT G  L     + + L+ Y D+D  G   +R+S +G+        ISW+SK Q  V+ S+ EAEY A   T
Subjt:  SVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANT

Query:  CLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV
          E+ WL+  LQ+L +   E  ++YCD+Q+A+ ++ N ++H RTKHI++  H +RE +    +K   +ST    AD+ TK + R +F+  K+ +G+
Subjt:  CLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV

P92519 Uncharacterized mitochondrial protein AtMg00810 | 1.2e-54 | 48.66
Query:  VLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYR
        +L+YVDDILLTG+    +  L   L   F +KDLG + YFLGI+      G+F+SQ KYA  IL + G+   +P   P+   L  S++   K  DPS +R
Subjt:  VLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYR

Query:  RLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQ
         ++G L YLT+TRPDI+Y+V ++ Q MHEP    ++   RVLRY+KGT   GL +   + L +QA+CDSDW GC ++RRS +GFC FLG +IISW +K+Q
Subjt:  RLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQ

Query:  TNVSRSSAEAEYRAMANTCLELTW
          VSRSS E EYRA+A T  ELTW
Subjt:  TNVSRSSAEAEYRAMANTCLELTW

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 | 1.0e-138 | 40.96
Query:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQT------------------STLAPSTPVVPLHDPSYSNIH---PPPSIPSPPTPPS---
        F GY      Y    +Q+ + +ISR V+F E+ FPFS+   T                  +TL   TPV+P   PS S+ H    PPS PS P   S   
Subjt:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQT------------------STLAPSTPVVPLHDPSYSNIH---PPPSIPSPPTPPS---

Query:  ------------PPIPSPTTPSSPPPSP-------------------DSPTNSNP--IPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSP---
                    P  P PT P    P P                   ++PTN +P  +    S P + S+ +  P         S     +     P   
Subjt:  ------------PPIPSPTTPSSPPPSP-------------------DSPTNSNP--IPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSP---

Query:  ---GTGTRYPLHHYLSFSR-----FSPTQRAFLAL-ITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAI-GCRWVYKIKYNSDGS
               + PL+ +   +R       P  +  LA+ + +++EP+T  +A+ D  W+ AM  EI A   NHTW LVP P  H  I GCRW++  KYNSDGS
Subjt:  ---GTGTRYPLHHYLSFSR-----FSPTQRAFLAL-ITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAI-GCRWVYKIKYNSDGS

Query:  VERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPG-LRRQGENTVCRLHKSLYGLKQASRNW
        + RYKARLVAKGY Q  G+DY ETFSP  K T++R +L VA  R W   QLDV NAFL G L ++VYMS PPG + +   N VC+L K+LYGLKQA R W
Subjt:  VERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPG-LRRQGENTVCRLHKSLYGLKQASRNW

Query:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT
        +      +   G+  S +D SLF   +G S   +L+YVDDIL+TGND   +     +L Q+F +KD   L YFLGIE  R   G+ +SQR+Y LD+L  T
Subjt:  FSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDT

Query:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC
         +  A+P   PM  + KLSL  G KL DP++YR ++G L YL  TRPDI+Y+V  LSQFMH P + H +A  R+LRY+ GTP  G+ L   N L L AY 
Subjt:  GLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYC

Query:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI
        D+DW G +    S +G+ ++LG+  ISW SKKQ  V RSS EAEYR++ANT  E+ W+  +L +L + L+ P ++YCDN  A ++ ANPVFH R KHI I
Subjt:  DSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEI

Query:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV
        D H +R ++Q+G ++  +VST  QLAD  TK L R  F     K+GV
Subjt:  DCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGV

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | 1.5e-137 | 40.63
Query:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFS------SASQ-----------TSTLAPSTPVV----PLHDPSYSNIHPPPSIPSP--------
        F GY      Y    + + + + SR V+F E  FPFS      S SQ           + T  P+TP+V    P   P       PPS PSP        
Subjt:  FGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFS------SASQ-----------TSTLAPSTPVV----PLHDPSYSNIHPPPSIPSP--------

Query:  ---------------PTPPSPPIPSPT-----------------TPSSPPPSPDSPTNSNPIP-----------PDTS-----APLRRSTRTKQ-PPAWH
                       PT PS   P PT                  P+   PSP+SP  ++P+P           P TS     +P   ST T   PP   
Subjt:  ---------------PTPPSPPIPSPT-----------------TPSSPPPSPDSPTNSNPIP-----------PDTS-----APLRRSTRTKQ-PPAWH

Query:  KDYEMSSGANHLTSSSSPGT----GTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAI-GC
            +   A    ++ S  T    G R P   Y           ++   + + +EP+T  +A+ D  W+QAM  EI A   NHTW LVP P     I GC
Subjt:  KDYEMSSGANHLTSSSSPGT----GTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAI-GC

Query:  RWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPG-LRRQGENTVCRLH
        RW++  K+NSDGS+ RYKARLVAKGY Q  G+DY ETFSP  K T++R +L VA  R W   QLDV NAFL G L +EVYMS PPG + +   + VCRL 
Subjt:  RWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPG-LRRQGENTVCRLH

Query:  KSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFM
        K++YGLKQA R W+    T +   G+  S +D SLF   +G S   +L+YVDDIL+TGND   +++   +L Q+F +K+  +L YFLGIE  R  +G+ +
Subjt:  KSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFM

Query:  SQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLL
        SQR+Y LD+L  T +  A+P   PM  + KL+L  G KL DP++YR ++G L YL  TRPD++Y+V  LSQ+MH P   HW A  RVLRY+ GTP  G+ 
Subjt:  SQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLL

Query:  LPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAA
        L   N L L AY D+DW G      S +G+ ++LG+  ISW SKKQ  V RSS EAEYR++ANT  EL W+  +L +L + LS P ++YCDN  A ++ A
Subjt:  LPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAA

Query:  NPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDI
        NPVFH R KHI +D H +R ++Q+G ++  +VST  QLAD  TK L R  F     K+GVI +
Subjt:  NPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDI

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | 4.0e-10 | 37.7
Query:  PISTLSINSA-SSNSWILDSGATDHIVSKSSVMT--EPKAAIMSAINLPNGETARVSHTGNISL---SPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYV
        P + L++NS  ++N+W+LDSGAT HI S  + ++  +P       + + +G T  ++HTG+ SL   S +L LN VL VP+ + NL+S+ +L N  +  V
Subjt:  PISTLSINSA-SSNSWILDSGATDHIVSKSSVMT--EPKAAIMSAINLPNGETARVSHTGNISL---SPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYV

Query:  TFYPDSCVMQDLATGKMIGSGK
         F+P S  ++DL TG  +  GK
Subjt:  TFYPDSCVMQDLATGKMIGSGK

Arabidopsis top hits | e value | % identity | Alignment
AT4G23160.1 cysteine-rich RLK (RECEPTOR-like protein kinase) 8 | 3.5e-150 | 49.82
Query:  TTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVG
        +T SS      S    N +P  +     R TR    PA+ +DY   S A+           T + +  +LS+ + SP   +FL  I    EP TY+EA  
Subjt:  TTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPTQRAFLALITSQTEPKTYDEAVG

Query:  DPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLD
          +W  AM+DEI A+E  HTW +  LP   K IGC+WVYKIKYNSDG++ERYKARLVAKGYTQ EGID+ ETFSP  KLT+++ +L ++A   +  HQLD
Subjt:  DPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTVAAARKWFTHQLD

Query:  VQNAFLHGNLDEEVYMSLPPG-LRRQGE----NTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDL
        + NAFL+G+LDEE+YM LPPG   RQG+    N VC L KS+YGLKQASR WF  FS T+   G+ QS +D++ F K   T F  VL+YVDDI++  N+ 
Subjt:  VQNAFLHGNLDEEVYMSLPPG-LRRQGE----NTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDL

Query:  EEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPD
          +  LK+ L   F ++DLG LKYFLG+E +RS  GI + QRKYALD+L +TGL G +P   PM+ ++  S   G    D   YRRLIGRL+YL +TR D
Subjt:  EEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPD

Query:  IAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAM
        I+++V  LSQF   PR  H +A +++L YIKGT GQGL   S+  ++LQ + D+ +  C+ +RRS +G+C+FLG S+ISWKSKKQ  VS+SSAEAEYRA+
Subjt:  IAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAM

Query:  ANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREK
        +    E+ WL    ++L +PLS+P LL+CDN AA+HIA N VFHERTKHIE DCH VRE+
Subjt:  ANTCLELTWLRYILQDLNVPLSEPALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREK

ATMG00240.1 Gag-Pol-related retrotransposon family protein | 5.4e-18 | 48.1
Query:  IYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFC
        +YLT+TRPD+ ++V  LSQF    R    +A  +VL Y+KGT GQGL   + ++L+L+A+ DSDW  C  +RRS++GFC
Subjt:  IYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFC

ATMG00810.1 DNA/RNA polymerases superfamily protein | 8.5e-56 | 48.66
Query:  VLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYR
        +L+YVDDILLTG+    +  L   L   F +KDLG + YFLGI+      G+F+SQ KYA  IL + G+   +P   P+   L  S++   K  DPS +R
Subjt:  VLIYVDDILLTGNDLEEIQYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYR

Query:  RLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQ
         ++G L YLT+TRPDI+Y+V ++ Q MHEP    ++   RVLRY+KGT   GL +   + L +QA+CDSDW GC ++RRS +GFC FLG +IISW +K+Q
Subjt:  RLIGRLIYLTVTRPDIAYSVRMLSQFMHEPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQ

Query:  TNVSRSSAEAEYRAMANTCLELTW
          VSRSS E EYRA+A T  ELTW
Subjt:  TNVSRSSAEAEYRAMANTCLELTW

ATMG00820.1 Reverse transcriptase (RNA-dependent DNA polymerase) | 3.5e-25 | 51.46
Query:  TSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLL
        T + EPK+   A+ DP W QAM +E+ AL RN TW LVP P+    +GC+WV+K K +SDG+++R KARLVAKG+ Q EGI + ET+SP  +  T+R +L
Subjt:  TSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLL

Query:  TVA
         VA
Subjt:  TVA


Sequences
CDS sequence
ATGGCCGATGTTTCACAGTTGAATACAGAAGAACAGTCACCTAATTCCATTCCAAATTTTTCTTCTGAGCAATTACGAGAGATAGCACAAGCCTTATCTGCAATCAATCA
TCACCCTTCTGGTAATTCTGACAATCACGTCAATGTTGCAGGTTTGTTTCCCATATCTACATTATCTATTAACTCTGCGAGTTCTAATTCATGGATTCTCGATAGTGGAG
CTACGGATCATATAGTATCAAAATCTTCTGTTATGACTGAACCAAAGGCTGCCATCATGTCTGCAATAAATTTGCCTAATGGAGAGACAGCACGTGTGTCACATACTGGC
AATATTTCCCTTAGCCCTAACCTTCAATTAAACAACGTTTTATGTGTGCCTTCATTCAATTTAAACCTAATGTCGATCAGCAAACTTACCAATAACTTGAAATGTTATGT
CACCTTCTATCCTGATTCTTGTGTTATGCAGGACTTGGCTACGGGGAAGATGATTGGCTCGGGTAAACAATTTGGAGGATATCCTTGTGGTCATAAAGGTTACAAGTTGT
ATGACATGCAATCTCACAAATTCTTTATCAGCCGTGATGTCAAATTTTGTGAAGATGATTTTCCTTTTTCATCAGCTTCACAAACTTCGACATTAGCTCCTTCGACTCCT
GTTGTACCACTTCATGATCCATCCTACTCAAACATCCATCCTCCACCTTCTATTCCTTCACCTCCTACTCCTCCTTCACCTCCTATCCCTTCACCTACTACTCCGTCGTC
TCCTCCACCTTCTCCAGATTCGCCCACTAATTCCAATCCTATCCCACCTGATACATCAGCTCCACTTCGACGTTCTACTCGTACTAAACAGCCTCCAGCTTGGCATAAGG
ATTATGAGATGTCTTCTGGAGCCAATCATTTAACCTCTAGCTCAAGTCCCGGCACTGGCACCAGGTATCCCCTTCATCATTACCTTTCATTCTCTCGTTTTTCTCCTACT
CAACGTGCTTTTCTAGCTCTTATTACATCCCAGACAGAACCTAAAACCTATGACGAGGCAGTTGGCGACCCGTTATGGCAGCAGGCTATGAATGATGAAATTGCAGCTTT
GGAACGTAATCATACATGGTCTCTCGTTCCTCTACCACTTGGTCATAAAGCTATTGGTTGTCGTTGGGTGTACAAAATTAAATACAACTCTGATGGTTCTGTTGAACGTT
ATAAAGCTCGACTAGTAGCAAAGGGATACACTCAGGTTGAAGGTATTGATTACACAGAAACATTTTCCCCTACAGCGAAACTTACTACACTTCGTTGCTTACTCACTGTT
GCTGCTGCTCGAAAATGGTTCACCCATCAGTTGGATGTTCAAAATGCCTTTCTCCATGGTAATCTAGACGAGGAAGTTTATATGTCTTTACCACCAGGTCTTCGCCGACA
GGGGGAGAATACAGTATGTCGGCTCCATAAATCTCTTTATGGATTAAAACAGGCTTCTCGCAATTGGTTCTCCATATTTTCTACAACTATACAAAATGCAGGCTACACTC
AGTCCAAAGCAGATTACTCTTTGTTTACTAAGAGTAAAGGTACTTCTTTCACTGCAGTTCTAATCTATGTTGATGATATTCTGTTGACAGGCAATGATCTCGAAGAAATT
CAATATCTCAAGACTAGTTTACTCCAGAAATTTCTTATCAAAGATTTAGGAAATTTGAAATATTTTCTAGGCATTGAATTTTCTCGATCTAGAAAAGGAATTTTTATGTC
TCAAAGGAAGTATGCTCTAGACATCCTTCAAGACACAGGTCTTACAGGAGCACGTCCAGACAAATTTCCTATGGAGCAAAATCTGAAACTTTCTTTAACTGAAGGAGAGA
AGTTGAATGATCCAAGTAAATACAGACGGTTGATTGGCAGATTAATATATTTGACCGTCACTAGGCCTGACATAGCTTATTCAGTTCGTATGCTTAGCCAATTTATGCAT
GAACCAAGAAAACCACATTGGGAGGCAGCTCTTCGAGTTCTGAGGTACATCAAAGGCACTCCTGGTCAAGGACTTCTACTGCCATCTGAAAACAATTTAAGATTACAGGC
ATATTGCGATTCTGACTGGGGTGGTTGTCGAACTTCCAGACGATCTATTTCTGGGTTCTGCATTTTCCTCGGAAATTCAATTATTTCTTGGAAGTCTAAAAAGCAAACTA
ATGTGTCCAGATCATCAGCAGAAGCCGAGTATCGAGCTATGGCAAATACTTGTTTAGAGTTAACTTGGTTAAGATACATTCTTCAAGACTTGAATGTTCCACTGTCCGAA
CCAGCATTATTATATTGTGATAATCAAGCAGCATTACATATAGCAGCCAATCCAGTTTTTCATGAACGTACGAAACACATTGAAATAGATTGTCATATAGTTCGAGAAAA
ATTACAAGCTGGAATCATCAAACCGTGTTATGTTTCGACCAAAATGCAATTGGCAGATGTTTTTACTAAAGCTTTGGGAAGACAGCAATTTGACTTTTTGAAGGACAAGT
TGGGTGTGATCGACATACACTCTCCAACTTGA
mRNA sequence
ATGGCCGATGTTTCACAGTTGAATACAGAAGAACAGTCACCTAATTCCATTCCAAATTTTTCTTCTGAGCAATTACGAGAGATAGCACAAGCCTTATCTGCAATCAATCA
TCACCCTTCTGGTAATTCTGACAATCACGTCAATGTTGCAGGTTTGTTTCCCATATCTACATTATCTATTAACTCTGCGAGTTCTAATTCATGGATTCTCGATAGTGGAG
CTACGGATCATATAGTATCAAAATCTTCTGTTATGACTGAACCAAAGGCTGCCATCATGTCTGCAATAAATTTGCCTAATGGAGAGACAGCACGTGTGTCACATACTGGC
AATATTTCCCTTAGCCCTAACCTTCAATTAAACAACGTTTTATGTGTGCCTTCATTCAATTTAAACCTAATGTCGATCAGCAAACTTACCAATAACTTGAAATGTTATGT
CACCTTCTATCCTGATTCTTGTGTTATGCAGGACTTGGCTACGGGGAAGATGATTGGCTCGGGTAAACAATTTGGAGGATATCCTTGTGGTCATAAAGGTTACAAGTTGT
ATGACATGCAATCTCACAAATTCTTTATCAGCCGTGATGTCAAATTTTGTGAAGATGATTTTCCTTTTTCATCAGCTTCACAAACTTCGACATTAGCTCCTTCGACTCCT
GTTGTACCACTTCATGATCCATCCTACTCAAACATCCATCCTCCACCTTCTATTCCTTCACCTCCTACTCCTCCTTCACCTCCTATCCCTTCACCTACTACTCCGTCGTC
TCCTCCACCTTCTCCAGATTCGCCCACTAATTCCAATCCTATCCCACCTGATACATCAGCTCCACTTCGACGTTCTACTCGTACTAAACAGCCTCCAGCTTGGCATAAGG
ATTATGAGATGTCTTCTGGAGCCAATCATTTAACCTCTAGCTCAAGTCCCGGCACTGGCACCAGGTATCCCCTTCATCATTACCTTTCATTCTCTCGTTTTTCTCCTACT
CAACGTGCTTTTCTAGCTCTTATTACATCCCAGACAGAACCTAAAACCTATGACGAGGCAGTTGGCGACCCGTTATGGCAGCAGGCTATGAATGATGAAATTGCAGCTTT
GGAACGTAATCATACATGGTCTCTCGTTCCTCTACCACTTGGTCATAAAGCTATTGGTTGTCGTTGGGTGTACAAAATTAAATACAACTCTGATGGTTCTGTTGAACGTT
ATAAAGCTCGACTAGTAGCAAAGGGATACACTCAGGTTGAAGGTATTGATTACACAGAAACATTTTCCCCTACAGCGAAACTTACTACACTTCGTTGCTTACTCACTGTT
GCTGCTGCTCGAAAATGGTTCACCCATCAGTTGGATGTTCAAAATGCCTTTCTCCATGGTAATCTAGACGAGGAAGTTTATATGTCTTTACCACCAGGTCTTCGCCGACA
GGGGGAGAATACAGTATGTCGGCTCCATAAATCTCTTTATGGATTAAAACAGGCTTCTCGCAATTGGTTCTCCATATTTTCTACAACTATACAAAATGCAGGCTACACTC
AGTCCAAAGCAGATTACTCTTTGTTTACTAAGAGTAAAGGTACTTCTTTCACTGCAGTTCTAATCTATGTTGATGATATTCTGTTGACAGGCAATGATCTCGAAGAAATT
CAATATCTCAAGACTAGTTTACTCCAGAAATTTCTTATCAAAGATTTAGGAAATTTGAAATATTTTCTAGGCATTGAATTTTCTCGATCTAGAAAAGGAATTTTTATGTC
TCAAAGGAAGTATGCTCTAGACATCCTTCAAGACACAGGTCTTACAGGAGCACGTCCAGACAAATTTCCTATGGAGCAAAATCTGAAACTTTCTTTAACTGAAGGAGAGA
AGTTGAATGATCCAAGTAAATACAGACGGTTGATTGGCAGATTAATATATTTGACCGTCACTAGGCCTGACATAGCTTATTCAGTTCGTATGCTTAGCCAATTTATGCAT
GAACCAAGAAAACCACATTGGGAGGCAGCTCTTCGAGTTCTGAGGTACATCAAAGGCACTCCTGGTCAAGGACTTCTACTGCCATCTGAAAACAATTTAAGATTACAGGC
ATATTGCGATTCTGACTGGGGTGGTTGTCGAACTTCCAGACGATCTATTTCTGGGTTCTGCATTTTCCTCGGAAATTCAATTATTTCTTGGAAGTCTAAAAAGCAAACTA
ATGTGTCCAGATCATCAGCAGAAGCCGAGTATCGAGCTATGGCAAATACTTGTTTAGAGTTAACTTGGTTAAGATACATTCTTCAAGACTTGAATGTTCCACTGTCCGAA
CCAGCATTATTATATTGTGATAATCAAGCAGCATTACATATAGCAGCCAATCCAGTTTTTCATGAACGTACGAAACACATTGAAATAGATTGTCATATAGTTCGAGAAAA
ATTACAAGCTGGAATCATCAAACCGTGTTATGTTTCGACCAAAATGCAATTGGCAGATGTTTTTACTAAAGCTTTGGGAAGACAGCAATTTGACTTTTTGAAGGACAAGT
TGGGTGTGATCGACATACACTCTCCAACTTGA
Protein sequence
MADVSQLNTEEQSPNSIPNFSSEQLREIAQALSAINHHPSGNSDNHVNVAGLFPISTLSINSASSNSWILDSGATDHIVSKSSVMTEPKAAIMSAINLPNGETARVSHTG
NISLSPNLQLNNVLCVPSFNLNLMSISKLTNNLKCYVTFYPDSCVMQDLATGKMIGSGKQFGGYPCGHKGYKLYDMQSHKFFISRDVKFCEDDFPFSSASQTSTLAPSTP
VVPLHDPSYSNIHPPPSIPSPPTPPSPPIPSPTTPSSPPPSPDSPTNSNPIPPDTSAPLRRSTRTKQPPAWHKDYEMSSGANHLTSSSSPGTGTRYPLHHYLSFSRFSPT
QRAFLALITSQTEPKTYDEAVGDPLWQQAMNDEIAALERNHTWSLVPLPLGHKAIGCRWVYKIKYNSDGSVERYKARLVAKGYTQVEGIDYTETFSPTAKLTTLRCLLTV
AAARKWFTHQLDVQNAFLHGNLDEEVYMSLPPGLRRQGENTVCRLHKSLYGLKQASRNWFSIFSTTIQNAGYTQSKADYSLFTKSKGTSFTAVLIYVDDILLTGNDLEEI
QYLKTSLLQKFLIKDLGNLKYFLGIEFSRSRKGIFMSQRKYALDILQDTGLTGARPDKFPMEQNLKLSLTEGEKLNDPSKYRRLIGRLIYLTVTRPDIAYSVRMLSQFMH
EPRKPHWEAALRVLRYIKGTPGQGLLLPSENNLRLQAYCDSDWGGCRTSRRSISGFCIFLGNSIISWKSKKQTNVSRSSAEAEYRAMANTCLELTWLRYILQDLNVPLSE
PALLYCDNQAALHIAANPVFHERTKHIEIDCHIVREKLQAGIIKPCYVSTKMQLADVFTKALGRQQFDFLKDKLGVIDIHSPT