CuGenDBv2

Clc11G08940 (gene) of Watermelon (cordophanus) v2 genome

Gene ID: Clc11G08940
Organism: Citrullus lanatus subsp. cordophanus (Watermelon (cordophanus) v2)
Description: Transposable element protein
Genome location: ClcChr11:11030844..11033129
RNA-Seq Expression: Clc11G08940
Synteny: Clc11G08940
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR012337 - Ribonuclease H-like superfamily
IPR021109 - Aspartic peptidase domain superfamily
IPR036397 - Ribonuclease H superfamily
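
The genome location above follows a `chromosome:start..end` pattern. As a minimal sketch, the string can be split into its parts and the gene span computed as below. The function names `parse_location` and `span_length` are hypothetical, and the span calculation assumes 1-based, inclusive coordinates, which the page itself does not state.

```python
import re
from typing import Tuple


def parse_location(location: str) -> Tuple[str, int, int]:
    """Split a 'chromosome:start..end' string into (chromosome, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_location("ClcChr11:11030844..11033129")
print(chrom, start, end, span_length(start, end))
```

If the database used 0-based, half-open coordinates instead, the `+ 1` in `span_length` would have to be dropped.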


Homology
GenBank top hits (columns: e value, % identity, alignment)
XP_012831341.1 PREDICTED: uncharacterized protein LOC105952343 [Erythranthe guttata] (e value: 4.3e-29; % identity: 28.87)
Query:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR
        +A+L+N ++++   G      P    +   ++  +T  D EQ    N   N     ++RG  +Q Q PTHYHP +RNHENFSYAN +N LQ      H+R
Subjt:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR

Query:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP--------
         +R      + H  E+ ++     + N+E QI QIA S+ TM KG FPS  + NPK+ C+ +T RSG +++  P   DE     +  T  +P        
Subjt:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP--------

Query:  ----------ILEDEP----------------------------------------------------KAVLEKE-------------------------
                     D P                                                    K VL K+                         
Subjt:  ----------ILEDEP----------------------------------------------------KAVLEKE-------------------------

Query:  ---------------------------------------KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGR
                                               KL +G +    +TLQ+ADRSL Y  GIVEDVLVKVDKFI P+DFVVL+M ED E PIILGR
Subjt:  ---------------------------------------KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGR

Query:  PFLVIRKAIID-----------GEYVVFNIYKSLSHHDEGRTCHAIDMIDHTISEHV--VKSCDRCQR--TDNISRQHELPMKPI
        PFL   KA+ID           GE VVFN+  +  H +    C  ID+I+  +S      + CD  +    ++I   +  P  PI
Subjt:  PFLVIRKAIID-----------GEYVVFNIYKSLSHHDEGRTCHAIDMIDHTISEHV--VKSCDRCQR--TDNISRQHELPMKPI

XP_012831341.1 PREDICTED: uncharacterized protein LOC105952343 [Erythranthe guttata] (e value: 8.0e-108; % identity: 53.6)
Query:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPI-TITLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK
        D +  E  V + + + ILE+ P     +E     ++  I T T   AD +     GI+ D L   +  KF+    F + D     E  +    P  VIR+
Subjt:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPI-TITLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK

Query:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI
         + + E   ++ + + S    HH E RT   +        T+   S   VK CDRCQRT N+S + ++P+  + EVELFDVWGIDFMGPFP SS+G LYI
Subjt:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI

Query:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK
        L+AVDYVSKWVEA+AT TNDARTVLKF HKNIF+RFGTP AIISDEGSHFCNKL  ++  K  + HKIA  Y+P+TNGLAELSNREIKQ+LEKTV  NRK
Subjt:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK

Query:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF
        DWALKLDDAL                 LV+ KACHLPVELEHRAYWA+KKLN D    G++RLL+LNEMEEFR  AYENAK+YKE T +WHDK+I  + F
Subjt:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF

Query:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
          G +VLLFNSRLRLFPGKL++RWSGPFV++
Subjt:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

XP_012833448.1 PREDICTED: uncharacterized protein LOC105954320 [Erythranthe guttata] (e value: 7.4e-13; % identity: 35.42)
Query:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR
        +A L+N ++++   G      P    +   ++  +T  D EQ    N   N     ++RG  +Q Q PTHYHP +RNHENFSYAN +N LQ      H+R
Subjt:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR

Query:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP
         +R      + H  E+ ++     + N+E QI QIA S+ TM KG FPS  + NPK+ C+ +T RSG +++  P   DE     V  T  +P
Subjt:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP

XP_012833448.1 PREDICTED: uncharacterized protein LOC105954320 [Erythranthe guttata] (e value: 8.0e-108; % identity: 53.6)
Query:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPI-TITLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK
        D +  E  V + + + ILE+ P     +E     ++  I T T   AD +     GI+ D L   +  KF+    F + D     E  +    P  VIR+
Subjt:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPI-TITLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK

Query:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI
         + + E   ++ + + S    HH E RT   +        T+   S   VK CDRCQRT N+S + ++P+  + EVELFDVWGIDFMGPFP SS+G LYI
Subjt:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI

Query:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK
        L+AVDYVSKWVEA+AT TNDARTVLKF HKNIF+RFGTP AIISDEGSHFCNKL  ++  K  + HKIA  Y+P+TNGLAELSNREIKQ+LEKTV  NRK
Subjt:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK

Query:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF
        DWALKLDDAL                 LV+ KACHLPVELEHRAYWA+KKLN D    G++RLL+LNEMEEFR  AYENAK+YKE T +WHDK+I  + F
Subjt:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF

Query:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
          G +VLLFNSRLRLFPGKL++RWSGPFV++
Subjt:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

XP_012833687.1 PREDICTED: uncharacterized protein LOC105954563 [Erythranthe guttata] (e value: 5.7e-13; % identity: 35.42)
Query:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR
        +A L+N ++++   G      P    +   ++  +T  D EQ    N   N     ++RG  +Q Q PTHYHP +RNHENFSYAN +N LQ      H+R
Subjt:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR

Query:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP
         +R      + H  E+ ++     + N+E QI QIA S+ TM KG FPS  + NPK+ C+ +T RSG +++  P   DE     V  T  +P
Subjt:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKP

XP_012833687.1 PREDICTED: uncharacterized protein LOC105954563 [Erythranthe guttata] (e value: 8.0e-108; % identity: 53.13)
Query:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPITI-TLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK
        D +  E  V + + + ILE+ P     +E     ++  I+  T   AD +     GI+ D L   +  KF+    F + D     E  +    P  VIR+
Subjt:  DDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPITI-TLQIADRSLAYLKGIVEDVLV--KVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRK

Query:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAIDMID------HTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI
         + + E   ++ + + S    HH E RT   +  +          S   VK CDRCQRT N+S + ++P+  + EVELFDVWGIDFMGPFP SS+G LYI
Subjt:  AIIDGEY--VVFNIYKSL--SHHDEGRTCHAIDMID------HTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYI

Query:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK
        L+AVDYVSKWVEA+AT  NDARTVLKF HKNIF+RFGTP AIISDEGSHFCNKL  ++  K  + HKIA  Y+P+TNGLAELSNREIKQ+LEKTV  NRK
Subjt:  LVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRK

Query:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF
        DWALKLDDAL                 LV+ KACHLPVELEHRAYWA+KKLN D   AG++RLL+LNEMEEFR  AYENAK+YKE T +WHDK+I  + F
Subjt:  DWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTF

Query:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
          G +VLLFNSRLRLFPGKL++RWSGPFV++
Subjt:  LLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

XP_012842899.1 PREDICTED: uncharacterized protein LOC105963074 [Erythranthe guttata] (e value: 2.5e-13; % identity: 34.95)
Query:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR
        +A+L+N +++L   G      P    +   ++  ST  D EQ    N   N     ++RG  +Q Q PTHYHP +RNHENFSYAN +N LQ      H+R
Subjt:  MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRD--NETINYVDRGHYRG--HQQQLPTHYHPNLRNHENFSYANNRNVLQV-----HKR

Query:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKPILEDEPKA
         +R      + H  E+ ++     + N+E QI QIA S+ TM KG FPS  + NPK+ C+ +T RSG +++  P   DE     V  T  +P +     +
Subjt:  VKR------KDHALERMVQSHGKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLS-TPLIDDEDEEQEVDETIQKPILEDEPKA

Query:  VLEKEK
          E  K
Subjt:  VLEKEK

XP_023874613.1 uncharacterized protein LOC111987139 [Quercus suber] (e value: 1.3e-110; % identity: 70.07)
Query:  VVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGS
        +VK+CDRCQR  NISR+ ELP+K ILEVELFDVWGIDFMGPFP  S G++YIL+AVDYVSKWVEA+AT TNDA+ VLKFLHKNIFTRFGTP AIISDEG+
Subjt:  VVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGS

Query:  HFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAI
        HFCNKLF++++ KY V HKIA  Y+P+TNG AE+SNREIK +LEKTV  NRKDWA KLDDAL                 LVF KACHLPVELEH+AYWA+
Subjt:  HFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAI

Query:  KKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK
        KK N+D + AGEKRLL+LNEM+EFR  AYENAK+YKE T +WHDK+I  + F  GQ+VLLFNSRL+LFPGKLR+RW+GP+ I K
Subjt:  KKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK

XP_023874613.1 uncharacterized protein LOC111987139 [Quercus suber] (e value: 5.4e-32; % identity: 27.66)
Query:  ALSKLIAGGQAQASPPSIASLPTLASEMSTQK---DLEQRDNETINYVDRGHYRGHQQQLPTHYHPNLRNHENFSYANNRNVLQ----------------
        ALS  +A    Q S  +   +P  A  ++       + +   E + Y++  +Y      +P +YHP LRNHENFSY N +NVLQ                
Subjt:  ALSKLIAGGQAQASPPSIASLPTLASEMSTQK---DLEQRDNETINYVDRGHYRGHQQQLPTHYHPNLRNHENFSYANNRNVLQ----------------

Query:  ----------VHKRVKRKDHALERMVQSH----GKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKL---------STPLIDDE-
                       K+ D  L+  +++H    G  + N+EVQI Q+AT++   Q+G FPS  + NPK++CK +TLRSG+++         +TP   +  
Subjt:  ----------VHKRVKRKDHALERMVQSH----GKAIHNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKL---------STPLIDDE-

Query:  ------DEEQEVDETIQK-------------PI-------------------------------------------------------------------
              +EE+ V++T+++             PI                                                                   
Subjt:  ------DEEQEVDETIQK-------------PI-------------------------------------------------------------------

Query:  LEDEPKAVLEKE------------------------------------------KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVV
        L +E  A+++K+                                          KL +GE++  TI+LQ+ADRS+ Y +GI+EDVLVKVDKFIFP DFVV
Subjt:  LEDEPKAVLEKE------------------------------------------KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVV

Query:  LDMEEDSEVPIILGRPFLVIRKAIID-----------GEYVVFNIYKSLSHHDEGRTCHAIDMIDHTISE
        LDMEED EVP+ILGRPFL   +A++D            E V FNIY+++   ++  TC  +D+I+  + E
Subjt:  LDMEEDSEVPIILGRPFLVIRKAIID-----------GEYVVFNIYKSLSHHDEGRTCHAIDMIDHTISE

XP_023874613.1 uncharacterized protein LOC111987139 [Quercus suber] (e value: 5.5e-109; % identity: 65.81)
Query:  HHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDA
        HH E RT   +        T+   S   VK CDRCQRT N+S + ++P+  + EVELFDVWGIDFMGPFP SS+G LYIL+AVDYVSKWVEA+AT  NDA
Subjt:  HHDEGRTCHAI---DMIDHTI---SEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDA

Query:  RTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------
        RTVLKF HKNIF+RFGTP AIISDEGSHFCNKLF ++  K  + HKIA  Y+P+TNGLAELSNREIKQ+LEKTV  NRKDWALKLDDAL           
Subjt:  RTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------

Query:  ------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLR
              LVF KACHLPVELEHRAYWA+KKLN D    G +RLL+LNEMEEFR  AYENAK+YKE T +WHDK+I  + F  G +VLLFNSRLRLFPGKL+
Subjt:  ------LVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLR

Query:  TRWSGPFVIV
        +RWSGPFV++
Subjt:  TRWSGPFVIV

TrEMBL top hits (columns: e value, % identity, alignment)
A0A4Y1QYH5 Transposable element protein (e value: 4.6e-101; % identity: 65.95)
Query:  CDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCN
        CDRCQR  NISR++ELP+K IL VELFDVWGIDFMGPFP SS GY YILVAVDYVSKWVEA+AT+TND + VLKFL  NIFTRFGTP A+ISD GSHFCN
Subjt:  CDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCN

Query:  KLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLN
        KLFE++M+KYN+ H+++T Y+P+T+G  E+SNREIKQ+LEK V   RKDWA KL+DAL                 LVF KACHLP+ELEH A+WAIKKLN
Subjt:  KLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLN

Query:  MDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
         D +KAG  R  +LNE+EE R ++YENAKLYKE T  +HD+ I  + F  G  VLLFNSRLRLFPGKL++RW GPF +V
Subjt:  MDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

A0A5H2XID6 Reverse transcriptase (e value: 4.6e-101; % identity: 65.95)
Query:  CDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCN
        CDRCQR  NISR++ELP+K IL VELFDVWGIDFMGPFP SS GY YILVAVDYVSKWVEA+AT+TND + VLKFL  NIFTRFGTP A+ISD GSHFCN
Subjt:  CDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCN

Query:  KLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLN
        KLFE++M+KYN+ H+++T Y+P+T+G  E+SNREIKQ+LEK V   RKDWA KL+DAL                 LVF KACHLP+ELEH A+WAIKKLN
Subjt:  KLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKKLN

Query:  MDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
         D +KAG  R  +LNE+EE R ++YENAKLYKE T  +HD+ I  + F  G  VLLFNSRLRLFPGKL++RW GPF +V
Subjt:  MDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

A0A5H2XID6 Reverse transcriptase (e value: 2.0e-11; % identity: 38.1)
Query:  EKLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEV----PIILGRPFLVIRKAIID-----------GEYVVFNIYKSL
        + L +  ++  +I L++AD S+ Y +GIVED+LV+V+  I P DFVV+DME++  V    PI+LGRPF+     II            GE V F ++ +L
Subjt:  EKLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEV----PIILGRPFLVIRKAIID-----------GEYVVFNIYKSL

Query:  SHHD-EGRTCHAIDMIDHTISEHVVK
        S       TC +ID++DH +S  +V+
Subjt:  SHHD-EGRTCHAIDMIDHTISEHVVK

A0A6P6G9R2 LOW QUALITY PROTEIN: uncharacterized protein LOC112492084 (e value: 3.6e-106; % identity: 42.15)
Query:  KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRKAIID----------------------------
        KL +G+V+P T TLQ+ADRS+   +GI+EDVLVKV+KFIFP DFV+LDMEED  +PIILGRPFL   +A+ID                            
Subjt:  KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRKAIID----------------------------

Query:  --------GEY-------VVFNIYKS-------------------------------------------------LSHHDEGRT----------------
                GE        +++ I  S                                                 ++  D+ +T                
Subjt:  --------GEY-------VVFNIYKS-------------------------------------------------LSHHDEGRT----------------

Query:  --CHAI------------DMI--------------------------------DHTISEH----------------------------------------
          C+A+            DM+                                DH+  ++                                        
Subjt:  --CHAI------------DMI--------------------------------DHTISEH----------------------------------------

Query:  --VVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDE
           V+ CDRCQRT NISR++E+P+K ILEVELFDVWGIDFMGPFP S  G  YILVAVDYVSKWVEA A  TNDA  V+KFL K IFTRFGTP AIISD 
Subjt:  --VVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDE

Query:  GSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYW
        G+HFCNK FES++ KY V HKIAT Y+P+T+G  E+SNREIK++LEKTV  +RKDW+LKLDDAL                 LVF K CHLP+ELEH+AYW
Subjt:  GSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYW

Query:  AIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK
        A K LN D E  G+ RLL+L+E+EEFR  A+ENAK++KE T RWHDK I  + F +GQ+VLL+NSRL+LFPGKLR+RWSGP+ IV+
Subjt:  AIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK

A0A6P6GGL5 LOW QUALITY PROTEIN: uncharacterized protein LOC112492878 (e value: 3.7e-103; % identity: 67.49)
Query:  VKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSH
        V+ CDRCQRT NISR++E+P+K ILEVELFDVWGIDFMGPFP SS G  YILVAVDYVSKWVEA    TNDAR V+KFL K IFTRFGTP AIISD G+H
Subjt:  VKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSH

Query:  FCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIK
        FCNK FES++ KY V HKIAT Y+P+T+G  E+SNREIK++LEKTV  +RKDW+LKLDDAL                 LVF K CHLPVELEH+AYWA K
Subjt:  FCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIK

Query:  KLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK
         LN D E  G+ RLL+L+E+EEFR  A+ENAK+YKE T RWHDK I  +TF +GQ+VLL+NSRL+LFPGKLR+RWSGP+ IV+
Subjt:  KLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK

A0A6P6GGL5 LOW QUALITY PROTEIN: uncharacterized protein LOC112492878 (e value: 8.2e-18; % identity: 47.15)
Query:  KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRKAIID-----------GEYVVFNI--YKSLSHH
        KL +G+V+P T+TLQ+ADRS+   +GI+EDVLVKV+KFIFP DFV+LDMEED  +PIILGRPFL   +A+ID            E V F I       + 
Subjt:  KLDIGEVQPITITLQIADRSLAYLKGIVEDVLVKVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRKAIID-----------GEYVVFNI--YKSLSHH

Query:  DEGRTCHAIDMIDHTISEHVVKS
        DE   C  +D  D  +++   +S
Subjt:  DEGRTCHAIDMIDHTISEHVVKS

A0A6P6GGL5 LOW QUALITY PROTEIN: uncharacterized protein LOC112492878 (e value: 7.0e-102; % identity: 63.64)
Query:  KSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHF
        ++CDRCQRT  I+++HE+P++ IL VELFDVWGIDFMGPFP  S+G+ YILVAVDYVSKWVEA+A  TNDA+ V+ F+ K+IFTRFGTP  +ISD G+HF
Subjt:  KSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHF

Query:  CNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKK
        CNKL ++++ KY V HK+AT Y+P+T+G  E+SNRE+KQ+LEKTV  NRKDW+ KL+DAL                 LV+ KACHLPVE+EH+AYWAIKK
Subjt:  CNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEKTVKINRKDWALKLDDAL-----------------LVFEKACHLPVELEHRAYWAIKK

Query:  LNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK-CPH
        LNM+ + AGEKRLL+LNE++EFR  AYENAKLYKE T +WHDK I  + F  GQ VLLFNSRL+LFPGKL++RW+GPFV+V   PH
Subjt:  LNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIVK-CPH

SwissProt top hits (columns: e value, % identity, alignment)
P03359 Gag-Pol polyprotein (e value: 5.1e-17; % identity: 27.55)
Query:  RTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNI
        RT   I  +   + E V   C  C  T+ ++   E   +   +      W +DF    P    G  Y+LV +D  S WVEA  T+T  A TV K + + I
Subjt:  RTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNI

Query:  FTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACHLPVELEHRAYWAI
          RFG P  + SD G  F  ++ + +  +  +N K+   Y P+++G  E  NR IK+ L K  ++   KDW   L  ALL   +A + P       Y  +
Subjt:  FTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACHLPVELEHRAYWAI

Query:  KKLNMDFEKAG----------EKRLLELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
                ++G                L  +E  R Q ++  K +YK  T            F +G +VL+   R    PG L  RW GP++++
Subjt:  KKLNMDFEKAG----------EKRLLELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

P10272 Gag-Pol polyprotein (e value: 2.1e-18; % identity: 29.75)
Query:  TISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAII
        T+ E V  +C  CQ+  N         K          W IDF    P  + GY Y+LV VD  S WVEA  TR   A  V K + + IF RFG P  I 
Subjt:  TISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAII

Query:  SDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEHRAYWAIKKLNMDF
        SD G  F +++ + + +   +N K+   Y P+++G  E  NR IK+ L K T++   KDW   L  ALL      +     P E+ +     +  L   F
Subjt:  SDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEHRAYWAIKKLNMDF

Query:  EKAGEKRLLE--LNEMEEFRAQAYEN-AKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
          +  K  L+  L  ++  +AQ +   A+LY+   ++       S  F +G  V +   R +     L  RW GP++++
Subjt:  EKAGEKRLLE--LNEMEEFRAQAYEN-AKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

P21414 Gag-Pol polyprotein (e value: 3.3e-16; % identity: 27.89)
Query:  RTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNI
        RT   I  +   + E V   C  C  T+ ++   E   +   +      W +DF    P    G  Y+LV +D  S WVEA  T+T  A  V K + + I
Subjt:  RTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNI

Query:  FTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACHLPVELEHRAYWAI
          RFG P  + SD G  F  ++ + +  +  +N K+   Y P+++G  E  NR IK+ L K  ++   KDW   L  ALL   +A + P       Y  +
Subjt:  FTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACHLPVELEHRAYWAI

Query:  KKLNMDFEKAGE-----KRLL-----ELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
                ++GE      R L      L  +E  R Q ++  K +YK  T            F +G +VL+   R    P  L  RW GP++++
Subjt:  KKLNMDFEKAGE-----KRLL-----ELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

P31792 Pol polyprotein (Fragment) (e value: 1.8e-17; % identity: 30.11)
Query:  TISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAII
        T+ E V  +C  CQ+  N         K          W IDF    P  + GY Y+LV VD  S WVEA  TR   A  V K + + IF RFG P  I 
Subjt:  TISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAII

Query:  SDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEHRAYWAIKKLNMDF
        SD G  F +++ + + +   +N K+   Y P+++G  E  NR IK+ L K T++   KDW   L  ALL      +     P E+ +     +  L   F
Subjt:  SDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEHRAYWAIKKLNMDF

Query:  EKAGEKRLLE--LNEMEEFRAQAYEN-AKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
          +  K  L+  L  ++  +AQ +   A+LY+      H +   S  F +G  V +   R +     L  RW GP++++
Subjt:  EKAGEKRLLE--LNEMEEFRAQAYEN-AKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

Q9TTC1 Gag-Pol polyprotein (e value: 1.1e-16; % identity: 28.14)
Query:  GRTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKN
        GRT   I  +   + E +   C  C  T+ ++   E P +          W +DF    P    G  Y+LV +D  S WVEA  T+T  A TV K + + 
Subjt:  GRTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKN

Query:  IFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEH-
        I  RFG P  + SD G  F  ++ + +  +  ++ K+   Y P+++G  E  NR IK+ L K  ++   KDW   L  ALL            P E+ H 
Subjt:  IFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK-TVKINRKDWALKLDDALLVFEKACH----LPVELEH-

Query:  -----RAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV
              A   +   N DF          L  +E  R Q ++  K  Y+  T            F +G RVL+   R     G L  RW GP++++
Subjt:  -----RAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAK-LYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKLRTRWSGPFVIV

Arabidopsis top hits (columns: e value, % identity, alignment)
ATMG00750.1 GAG/POL/ENV polyprotein (e value: 5.0e-07; % identity: 60.53)
Query:  VKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFM
        V SCD CQR  N ++++E+P   ILEVE+FDVWGI FM
Subjt:  VKSCDRCQRTDNISRQHELPMKPILEVELFDVWGIDFM


Sequences
CDS sequence
ATGGCCTCCCTTACTAATGCTCTTTCTAAATTGATTGCAGGGGGCCAAGCTCAAGCAAGTCCACCATCCATAGCATCTCTTCCCACCTTGGCATCGGAGATGTCTACACA
AAAGGACTTGGAACAAAGAGACAATGAGACGATCAATTATGTTGATCGAGGACACTATAGAGGCCACCAACAACAACTTCCAACTCATTACCATCCTAACTTGAGGAATC
ATGAAAACTTTTCTTATGCTAACAATAGAAATGTTTTGCAAGTTCATAAAAGAGTCAAGAGGAAGGACCACGCATTGGAGCGCATGGTACAAAGTCATGGCAAGGCTATT
CACAACATTGAGGTACAAATTAGCCAAATAGCCACTTCTCTTCAAACAATGCAAAAGGGTAAGTTTCCTAGTTGCCCCAAAAGGAATCCAAAGCAGGAGTGCAAGGTCGT
GACTTTGAGGAGTGGGAAAAAGTTGTCCACTCCCTTGATTGATGATGAGGATGAAGAGCAAGAGGTAGATGAGACCATCCAAAAGCCTATCTTAGAAGATGAACCCAAGG
CGGTCTTAGAAAAAGAGAAGCTTGACATTGGAGAAGTGCAACCTATCACTATCACATTACAAATAGCCGATAGATCCTTAGCTTATCTTAAAGGTATTGTTGAGGATGTA
TTAGTTAAAGTTGACAAATTTATCTTTCCTATAGATTTTGTAGTTTTGGACATGGAGGAGGACTCTGAGGTTCCTATCATTCTTGGGCGCCCATTCCTAGTAATTAGGAA
AGCTATCATAGATGGTGAATATGTTGTCTTTAATATTTATAAGTCCTTGAGTCACCATGATGAGGGTCGTACTTGCCATGCTATAGACATGATTGATCATACTATCTCTG
AGCATGTTGTCAAATCATGTGATAGGTGCCAACGTACTGACAATATTTCTAGACAACATGAGCTTCCAATGAAACCTATCTTAGAAGTGGAGCTCTTTGATGTCTGGGGT
ATTGACTTTATGGGGCCTTTTCCTATGTCTTCTGATGGCTACCTATATATTCTAGTTGCAGTTGATTATGTATCTAAATGGGTAGAAGCCATGGCTACTAGGACCAATGA
TGCTCGCACTGTTTTAAAATTCTTGCATAAAAACATCTTCACACGTTTTGGTACACCTGGAGCTATTATTAGTGATGAGGGTTCTCACTTTTGCAATAAATTATTTGAAT
CCATGATGCAAAAATATAATGTTAATCATAAAATTGCTACAATTTATTATCCTGAAACTAATGGTCTTGCTGAGTTATCTAATAGGGAAATCAAGCAAGTTTTGGAAAAG
ACTGTCAAGATCAATAGGAAGGATTGGGCCCTAAAGCTTGATGATGCATTGTTGGTGTTTGAAAAGGCTTGTCACTTACCCGTAGAGCTCGAGCATAGAGCTTATTGGGC
TATCAAGAAGTTGAACATGGATTTTGAGAAGGCCGGTGAGAAGCGCCTCTTGGAACTCAATGAGATGGAGGAGTTTCGTGCTCAAGCTTATGAGAATGCCAAACTTTATA
AGGAGTGCACTACCAGATGGCATGATAAGAAGATCAACTCACAGACCTTTCTTCTTGGACAAAGAGTATTACTTTTCAACTCACGTTTACGTTTGTTTCCAGGTAAGCTT
AGGACACGATGGTCGGGACCCTTTGTCATTGTCAAGTGTCCCCACATGGAGTCATGGAATTGCAAAGCGACGATGGGACAATCTTCAAAGTAA
mRNA sequence
ATGGCCTCCCTTACTAATGCTCTTTCTAAATTGATTGCAGGGGGCCAAGCTCAAGCAAGTCCACCATCCATAGCATCTCTTCCCACCTTGGCATCGGAGATGTCTACACA
AAAGGACTTGGAACAAAGAGACAATGAGACGATCAATTATGTTGATCGAGGACACTATAGAGGCCACCAACAACAACTTCCAACTCATTACCATCCTAACTTGAGGAATC
ATGAAAACTTTTCTTATGCTAACAATAGAAATGTTTTGCAAGTTCATAAAAGAGTCAAGAGGAAGGACCACGCATTGGAGCGCATGGTACAAAGTCATGGCAAGGCTATT
CACAACATTGAGGTACAAATTAGCCAAATAGCCACTTCTCTTCAAACAATGCAAAAGGGTAAGTTTCCTAGTTGCCCCAAAAGGAATCCAAAGCAGGAGTGCAAGGTCGT
GACTTTGAGGAGTGGGAAAAAGTTGTCCACTCCCTTGATTGATGATGAGGATGAAGAGCAAGAGGTAGATGAGACCATCCAAAAGCCTATCTTAGAAGATGAACCCAAGG
CGGTCTTAGAAAAAGAGAAGCTTGACATTGGAGAAGTGCAACCTATCACTATCACATTACAAATAGCCGATAGATCCTTAGCTTATCTTAAAGGTATTGTTGAGGATGTA
TTAGTTAAAGTTGACAAATTTATCTTTCCTATAGATTTTGTAGTTTTGGACATGGAGGAGGACTCTGAGGTTCCTATCATTCTTGGGCGCCCATTCCTAGTAATTAGGAA
AGCTATCATAGATGGTGAATATGTTGTCTTTAATATTTATAAGTCCTTGAGTCACCATGATGAGGGTCGTACTTGCCATGCTATAGACATGATTGATCATACTATCTCTG
AGCATGTTGTCAAATCATGTGATAGGTGCCAACGTACTGACAATATTTCTAGACAACATGAGCTTCCAATGAAACCTATCTTAGAAGTGGAGCTCTTTGATGTCTGGGGT
ATTGACTTTATGGGGCCTTTTCCTATGTCTTCTGATGGCTACCTATATATTCTAGTTGCAGTTGATTATGTATCTAAATGGGTAGAAGCCATGGCTACTAGGACCAATGA
TGCTCGCACTGTTTTAAAATTCTTGCATAAAAACATCTTCACACGTTTTGGTACACCTGGAGCTATTATTAGTGATGAGGGTTCTCACTTTTGCAATAAATTATTTGAAT
CCATGATGCAAAAATATAATGTTAATCATAAAATTGCTACAATTTATTATCCTGAAACTAATGGTCTTGCTGAGTTATCTAATAGGGAAATCAAGCAAGTTTTGGAAAAG
ACTGTCAAGATCAATAGGAAGGATTGGGCCCTAAAGCTTGATGATGCATTGTTGGTGTTTGAAAAGGCTTGTCACTTACCCGTAGAGCTCGAGCATAGAGCTTATTGGGC
TATCAAGAAGTTGAACATGGATTTTGAGAAGGCCGGTGAGAAGCGCCTCTTGGAACTCAATGAGATGGAGGAGTTTCGTGCTCAAGCTTATGAGAATGCCAAACTTTATA
AGGAGTGCACTACCAGATGGCATGATAAGAAGATCAACTCACAGACCTTTCTTCTTGGACAAAGAGTATTACTTTTCAACTCACGTTTACGTTTGTTTCCAGGTAAGCTT
AGGACACGATGGTCGGGACCCTTTGTCATTGTCAAGTGTCCCCACATGGAGTCATGGAATTGCAAAGCGACGATGGGACAATCTTCAAAGTAA
Protein sequence
MASLTNALSKLIAGGQAQASPPSIASLPTLASEMSTQKDLEQRDNETINYVDRGHYRGHQQQLPTHYHPNLRNHENFSYANNRNVLQVHKRVKRKDHALERMVQSHGKAI
HNIEVQISQIATSLQTMQKGKFPSCPKRNPKQECKVVTLRSGKKLSTPLIDDEDEEQEVDETIQKPILEDEPKAVLEKEKLDIGEVQPITITLQIADRSLAYLKGIVEDV
LVKVDKFIFPIDFVVLDMEEDSEVPIILGRPFLVIRKAIIDGEYVVFNIYKSLSHHDEGRTCHAIDMIDHTISEHVVKSCDRCQRTDNISRQHELPMKPILEVELFDVWG
IDFMGPFPMSSDGYLYILVAVDYVSKWVEAMATRTNDARTVLKFLHKNIFTRFGTPGAIISDEGSHFCNKLFESMMQKYNVNHKIATIYYPETNGLAELSNREIKQVLEK
TVKINRKDWALKLDDALLVFEKACHLPVELEHRAYWAIKKLNMDFEKAGEKRLLELNEMEEFRAQAYENAKLYKECTTRWHDKKINSQTFLLGQRVLLFNSRLRLFPGKL
RTRWSGPFVIVKCPHMESWNCKATMGQSSK
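
The CDS and protein sequences above should agree under the standard genetic code. The following is a minimal sketch of that check, not the database's own pipeline: it translates the final displayed line of the CDS and compares it with the final line of the protein sequence. The names `CODON_TABLE` and `translate` are hypothetical, and the sketch assumes the standard nuclear genetic code and an input that starts in frame.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and whitespace
    residues = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]  # KeyError on non-ACGT bases
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Last line of the CDS and last line of the protein sequence, as shown above.
cds_tail = (
    "AGGACACGATGGTCGGGACCCTTTGTCATTGTCAAGTGTCCCCACATGGAGTCATGGAATTGCAAAGCGACG"
    "ATGGGACAATCTTCAAAGTAA"
)
protein_tail = "RTRWSGPFVIVKCPHMESWNCKATMGQSSK"
print(translate(cds_tail) == protein_tail)
```

The same `translate` function can be applied to the full CDS by joining all of its lines. Ambiguous bases such as `N` are not handled in this sketch and would raise a `KeyError`.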