CuGenDBv2

Tan0002455 (gene) of Snake gourd v1 genome

Gene ID: Tan0002455
Organism: Trichosanthes anguina (Snake gourd v1)
Description: Gag/pol protein
Genome location: LG11:22221185..22223755
RNA-Seq Expression: Tan0002455
Synteny: Tan0002455
Gene Ontology terms:
  GO:0015074 - DNA integration (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
  IPR001584 - Integrase, catalytic core
  IPR012337 - Ribonuclease H-like superfamily
  IPR036397 - Ribonuclease H superfamily
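The genome location above uses a `seqid:start..end` notation. As a minimal sketch, it can be split into its parts and the span computed as follows. The names `Locus` and `parse_locus` are hypothetical helpers introduced for illustration, and the length calculation assumes the coordinates are 1-based and inclusive, which the record itself does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A genomic interval; coordinates are ASSUMED 1-based and inclusive."""
    seqid: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive span, so both end points count.
        return self.end - self.start + 1


def parse_locus(text: str) -> Locus:
    """Parse a 'seqid:start..end' string such as 'LG11:22221185..22223755'."""
    match = re.fullmatch(r"\s*([^:\s]+):(\d+)\.\.(\d+)\s*", text)
    if match is None:
        raise ValueError(f"not a seqid:start..end location: {text!r}")
    seqid, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        # Normalise reversed coordinates instead of returning a negative span.
        start, end = end, start
    return Locus(seqid, start, end)


locus = parse_locus("LG11:22221185..22223755")
print(locus.seqid, locus.start, locus.end, locus.length)
```

Under the inclusive-coordinate assumption, the locus spans 2571 positions on LG11.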


Homology
GenBank top hits (e-value, % identity, alignment)
KAA0025159.1 gag/pol protein [Cucumis melo var. makuwa]; e-value 7.8e-157; identity 54.88%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        MSS IIALLK + LTGENY TWKS LNMILV+ +L FVL E+CP  P + ASQSV+D YDRW KANDKA+++ILAS+S++L+KKH+ MV+AR+IM  L+ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP
        +FGQ S Q++ E+   V++ R    SS                         S+K QK+K  GK K P  A + KGK KVA K K FHCNVD HWK N P
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP

Query:  KYLVELKEK------------------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHS
        KYLV+ KE                                                                   +GKMTK+PFT KGYRAK+PLELIHS
Subjt:  KYLVELKEK------------------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHS

Query:  DLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQP---------N
        DLCG +NVKAR  +EYFISFID YSRYGYLYLM H SEAL+KFKEYKTEVEN L K IK LRSDRGGEY DLRFQDY+IEHGIQSQL+ P         N
Subjt:  DLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQP---------N

Query:  TPQQN-------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT---------
         P ++                          H+L TNPKKL+ RSRLCQFVGYPKETRG LF+DPQEN+V V TN TFLEED+MRN KPR+         
Subjt:  TPQQN-------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT---------

Query:  ---------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELV
                 GPSSRVDE   TS QS PSQ   MPRRSGRVVSQP+RYLGLTETQ VI  DGVEDPLSYKQAMNDVDK QW+K MDLEMESMYFN VWELV
Subjt:  ---------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELV

Query:  DQPEG
        D PEG
Subjt:  DQPEG

KAA0059226.1 gag/pol protein [Cucumis melo var. makuwa]; e-value 1.4e-161; identity 52.83%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        MSS IIALLK + LTGENY TWKS LNMILV+ +L FVL E+CP  P + ASQSV+DAYDRW KANDKA+++ILAS+S++L+KKH+ MV+AR+IM  L+ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP
        +FGQ S Q++ E+   V +S+ +   S                         S+K QK+K  GKGK P  A + KGK KVA K KCFHCNVD HWK N P
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP

Query:  KYLVELKEK-----------------------------------------------------------------------------KGKMTKQPFTGKGY
        KYLV+ KEK                                                                             +GKMTK+PFTGKGY
Subjt:  KYLVELKEK-----------------------------------------------------------------------------KGKMTKQPFTGKGY

Query:  RAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQ
        RAK+PLELIHSDLCGP+NVKAR  +EYFISFID YSRYGYLYLM H SEAL+KFKEYKTEVEN L K IK LRSDRGGEY DLRFQDY+IEHGIQSQL+ 
Subjt:  RAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQ

Query:  PNTPQQN-----------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVG
        P TPQQN                                                                       AH+L TNPKKLEPRSRLCQFVG
Subjt:  PNTPQQN-----------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVG

Query:  YPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTE
        YPKETRG LF+DPQEN+V V TN TFLEED+MRN KPR+                  GPSSRVDE   TS QS PSQ   MPRRSGRVVSQP+RYLGLTE
Subjt:  YPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTE

Query:  TQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
        TQVVI  DGVEDPLSYKQAMNDVDK QW+K MDLEMESMYFNSVWELVD PEG
Subjt:  TQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

KAA0060254.1 gag/pol protein [Cucumis melo var. makuwa]; e-value 5.6e-155; identity 49.07%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        M++  + +L ++ L G NY +WK+ +N++L++D+L+FVL E+CPQVPA +A+Q+V++ Y+RW K N+K + YILASLSEVLAKKH+ M++AREIM  LQ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA
        +FGQ S Q+ H++LKY+YN+RM EG+SV+EHVL++MVHFNV +MNGAVID+ SQ                                K  KKK GG+G KA
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA

Query:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH
          AA K   K K A KG CFH N +GHWKRN PKYL E K+ K                                  GKMTK+PFTGKG+RAK+PLEL+H
Subjt:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH

Query:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---
        SDLCGP+NVKAR E+EYFI+F D YSRYGY+YLM H SEAL+KFKEYK EVENAL KTIKT RSDRGGEY DL+FQ+YL+E  I SQL+ P TPQQN   
Subjt:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---

Query:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF
                                                                            AH+L  NPKKLEPRS+LC FVGYPK TRG  F
Subjt:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF

Query:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF
        YD ++NKV VLTN TFLE+D++R  KPR            T PS+RV EE         VG+S+++   Q    PRRSGRV + P RY+ LTET  VI  
Subjt:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF

Query:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
          +EDPL++K+AM DVDK +WIK M+LE+ESMYFNSVW+LVDQP+G
Subjt:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

TYJ97618.1 gag/pol protein [Cucumis melo var. makuwa]; e-value 5.0e-156; identity 49.23%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        M++  + +L ++ L G NY +WK+ +N++L++D+L+FVL E+CPQVPA +A+Q+V++ Y+RW K N+K + YILASLSEVLAKKH+ M++AREIM  LQ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA
        +FGQ S Q+ H++LKY+YN+RM EG+SV+EHVL++MVHFNV +MNGAVID+ SQ                                K  KKK GG+G KA
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA

Query:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH
          AA K   K K A KG CFH N +GHWKRN PKYL E K+ K                                  GKMTK+PFTGKG+RAK+PLEL+H
Subjt:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH

Query:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---
        SDLCGP+NVKAR E+EYFI+F D YSRYGY+YLM H SEAL+KFKEYK EVENAL KTIKT RSDRGGEY DL+FQ+YL+E  I SQL+ P TPQQN   
Subjt:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---

Query:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF
                                                                            AH+L  NPKKLEPRS+LC FVGYPK TRG  F
Subjt:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF

Query:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF
        YDP++NKV V TN TFLEED++R  KPR            T PS+RV EE         VG+S+++   Q    PRRSGRV + P RY+ LTET  VI  
Subjt:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF

Query:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
          +EDPL++K+AM DVDK +WIK M+LE+ESMYFNSVW+LVDQP+G
Subjt:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

TYK02840.1 gag/pol protein [Cucumis melo var. makuwa]; e-value 1.4e-161; identity 52.83%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        MSS IIALLK + LTGENY TWKS LNMILV+ +L FVL E+CP  P + ASQSV+DAYDRW KANDKA+++ILAS+S++L+KKH+ MV+AR+IM  L+ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP
        +FGQ S Q++ E+   V +S+ +   S                         S+K QK+K  GKGK P  A + KGK KVA K KCFHCNVD HWK N P
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP

Query:  KYLVELKEK-----------------------------------------------------------------------------KGKMTKQPFTGKGY
        KYLV+ KEK                                                                             +GKMTK+PFTGKGY
Subjt:  KYLVELKEK-----------------------------------------------------------------------------KGKMTKQPFTGKGY

Query:  RAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQ
        RAK+PLELIHSDLCGP+NVKAR  +EYFISFID YSRYGYLYLM H SEAL+KFKEYKTEVEN L K IK LRSDRGGEY DLRFQDY+IEHGIQSQL+ 
Subjt:  RAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQ

Query:  PNTPQQN-----------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVG
        P TPQQN                                                                       AH+L TNPKKLEPRSRLCQFVG
Subjt:  PNTPQQN-----------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVG

Query:  YPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTE
        YPKETRG LF+DPQEN+V V TN TFLEED+MRN KPR+                  GPSSRVDE   TS QS PSQ   MPRRSGRVVSQP+RYLGLTE
Subjt:  YPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTE

Query:  TQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
        TQVVI  DGVEDPLSYKQAMNDVDK QW+K MDLEMESMYFNSVWELVD PEG
Subjt:  TQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

TrEMBL top hits (e-value, % identity, alignment)
A0A5A7SIN2 Gag/pol protein; e-value 3.8e-157; identity 54.88%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        MSS IIALLK + LTGENY TWKS LNMILV+ +L FVL E+CP  P + ASQSV+D YDRW KANDKA+++ILAS+S++L+KKH+ MV+AR+IM  L+ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP
        +FGQ S Q++ E+   V++ R    SS                         S+K QK+K  GK K P  A + KGK KVA K K FHCNVD HWK N P
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP

Query:  KYLVELKEK------------------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHS
        KYLV+ KE                                                                   +GKMTK+PFT KGYRAK+PLELIHS
Subjt:  KYLVELKEK------------------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHS

Query:  DLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQP---------N
        DLCG +NVKAR  +EYFISFID YSRYGYLYLM H SEAL+KFKEYKTEVEN L K IK LRSDRGGEY DLRFQDY+IEHGIQSQL+ P         N
Subjt:  DLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQP---------N

Query:  TPQQN-------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT---------
         P ++                          H+L TNPKKL+ RSRLCQFVGYPKETRG LF+DPQEN+V V TN TFLEED+MRN KPR+         
Subjt:  TPQQN-------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT---------

Query:  ---------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELV
                 GPSSRVDE   TS QS PSQ   MPRRSGRVVSQP+RYLGLTETQ VI  DGVEDPLSYKQAMNDVDK QW+K MDLEMESMYFN VWELV
Subjt:  ---------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELV

Query:  DQPEG
        D PEG
Subjt:  DQPEG

A0A5A7UYX7 Gag/pol protein; e-value 2.7e-155; identity 49.07%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        M++  + +L ++ L G NY +WK+ +N++L++D+L+FVL E+CPQVPA +A+Q+V++ Y+RW K N+K + YILASLSEVLAKKH+ M++AREIM  LQ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA
        +FGQ S Q+ H++LKY+YN+RM EG+SV+EHVL++MVHFNV +MNGAVID+ SQ                                K  KKK GG+G KA
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA

Query:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH
          AA K   K K A KG CFH N +GHWKRN PKYL E K+ K                                  GKMTK+PFTGKG+RAK+PLEL+H
Subjt:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH

Query:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---
        SDLCGP+NVKAR E+EYFI+F D YSRYGY+YLM H SEAL+KFKEYK EVENAL KTIKT RSDRGGEY DL+FQ+YL+E  I SQL+ P TPQQN   
Subjt:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---

Query:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF
                                                                            AH+L  NPKKLEPRS+LC FVGYPK TRG  F
Subjt:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF

Query:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF
        YD ++NKV VLTN TFLE+D++R  KPR            T PS+RV EE         VG+S+++   Q    PRRSGRV + P RY+ LTET  VI  
Subjt:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF

Query:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
          +EDPL++K+AM DVDK +WIK M+LE+ESMYFNSVW+LVDQP+G
Subjt:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

A0A5A7VJG3 Gag/pol protein; e-value 2.9e-149; identity 47.15%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        MS  IIALLK + LTGENY TWKS LNMILV+ +LRFVL E+CP  P + ASQSV+DAYD W KANDKA ++ILAS+S++L+KKH+ MV+AR+IM  L+ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP
        +FGQ S Q++ E+    ++ R    SS                         S+K QK+K  GKG+ P  A +GKGK KV  KGKCFHCNVD HWK N P
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDP

Query:  KYLVELKEK-------------------------------------------------------------------------------------------
        KYLV+ KEK                                                                                           
Subjt:  KYLVELKEK-------------------------------------------------------------------------------------------

Query:  -----------------------------------------------------------------------------------KGKMTKQPFTGKGYRAK
                                                                                           +GKMTK+PFTGK YRAK
Subjt:  -----------------------------------------------------------------------------------KGKMTKQPFTGKGYRAK

Query:  KPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNT
        +PLELIHSDLCGP+NVKAR  +EYFISFID YSRYGYLYLM H  EAL+KFKEYKTEVEN L K IK LRSDRGGEY DLRFQDY+IEHGIQSQL+ P T
Subjt:  KPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNT

Query:  PQQNA-----------------------HMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------
        PQQN                           W  PKKLEPRSRLCQFVGYPKE RG LF+DPQEN+V V TNTTFLEED MR+ KPR+            
Subjt:  PQQNA-----------------------HMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------

Query:  ------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQP
               PSSRVDE   TS QS PSQ   MPRRSGR+VSQP RYLGLTETQVVI  DGVEDPLSYKQ MNDVDK+QW+K MDLE+ESMYFNSVWEL D  
Subjt:  ------GPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQP

Query:  EG
        EG
Subjt:  EG

A0A5D3BHG7 Gag/pol protein; e-value 2.4e-156; identity 49.23%
Query:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN
        M++  + +L ++ L G NY +WK+ +N++L++D+L+FVL E+CPQVPA +A+Q+V++ Y+RW K N+K + YILASLSEVLAKKH+ M++AREIM  LQ 
Subjt:  MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQN

Query:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA
        +FGQ S Q+ H++LKY+YN+RM EG+SV+EHVL++MVHFNV +MNGAVID+ SQ                                K  KKK GG+G KA
Subjt:  IFGQLSGQLRHESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQ--------------------------------KTQKKKIGGKG-KA

Query:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH
          AA K   K K A KG CFH N +GHWKRN PKYL E K+ K                                  GKMTK+PFTGKG+RAK+PLEL+H
Subjt:  PAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKK----------------------------------GKMTKQPFTGKGYRAKKPLELIH

Query:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---
        SDLCGP+NVKAR E+EYFI+F D YSRYGY+YLM H SEAL+KFKEYK EVENAL KTIKT RSDRGGEY DL+FQ+YL+E  I SQL+ P TPQQN   
Subjt:  SDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN---

Query:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF
                                                                            AH+L  NPKKLEPRS+LC FVGYPK TRG  F
Subjt:  --------------------------------------------------------------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLF

Query:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF
        YDP++NKV V TN TFLEED++R  KPR            T PS+RV EE         VG+S+++   Q    PRRSGRV + P RY+ LTET  VI  
Subjt:  YDPQENKVLVLTNTTFLEEDNMRNLKPR------------TGPSSRVDEE---------VGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILF

Query:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
          +EDPL++K+AM DVDK +WIK M+LE+ESMYFNSVW+LVDQP+G
Subjt:  DGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

A0A5D3DZX8 Gag/pol protein; e-value 8.4e-149; identity 53.21%
Query:  MILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQNIFGQLSGQLRHESLKYVYNSRMKEGSS
        MILV+ +LRFVL EKCP  P +  SQSV+DAY RW KANDKA ++ILAS+S++L+KKH+ MV AR+IM  L+ +FGQ S Q++ E+    ++ R    SS
Subjt:  MILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQNIFGQLSGQLRHESLKYVYNSRMKEGSS

Query:  VKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEK------------------
                                 S+K QK+K  GKGK P  A + KGK KV  KGKCFHC+VD HWK N PKYLV+ KEK                  
Subjt:  VKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEK------------------

Query:  -----------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEY
                                                                   +GKMTK+PFT KGYRAK+PLELIHSDLCGP+NVKAR  +EY
Subjt:  -----------------------------------------------------------KGKMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEY

Query:  FISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN--------------------
        FISFID YSRYGYLYLM H SEAL+KFKEYK EVEN L K IK LRSD+GGEY DLRFQDY+IEHGIQSQL+ P TPQQN                    
Subjt:  FISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN--------------------

Query:  -----------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSS
                         +H+L TNPKKL PRSRLCQFVGYPKETRG L +DPQEN+VLV TN TFLEED+ R+ KPR+                  GPSS
Subjt:  -----------------AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRT------------------GPSS

Query:  RVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG
        RVDE   TS QS PSQ   MPRRSGRVVSQP+RYLGLTETQVVI  DGVEDPLSYKQAMNDVDK QW+K +DLEMESMYFNSVWELVD PEG
Subjt:  RVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVILFDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG

SwissProt top hits (e-value, % identity, alignment)
P04146 Copia protein; e-value 1.3e-10; identity 29.51%
Query:  GKMTKQPFTGKGYRA--KKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRF
        GK  + PF     +   K+PL ++HSD+CGP+      +  YF+ F+D ++ Y   YL+ + S+    F+++  + E      +  L  D G EY     
Subjt:  GKMTKQPFTGKGYRA--KKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRF

Query:  QDYLIEHGIQSQLAQPNTPQQN
        + + ++ GI   L  P+TPQ N
Subjt:  QDYLIEHGIQSQLAQPNTPQQN

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94; e-value 9.6e-17; identity 31.54%
Query:  GKMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQD
        GK  +  F     R    L+L++SD+CGP+ +++    +YF++FID  SR  ++Y++    +  + F+++   VE   G+ +K LRSD GGEY    F++
Subjt:  GKMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQD

Query:  YLIEHGIQSQLAQPNTPQQNAHMLWTNPKKLEPRSRLCQFVGYPKETRG
        Y   HGI+ +   P TPQ N      N   +E    + +    PK   G
Subjt:  YLIEHGIQSQLAQPNTPQQNAHMLWTNPKKLEPRSRLCQFVGYPKETRG

Q12491 Transposon Ty2-B Gag-Pol polyprotein; e-value 1.7e-05; identity 28.68%
Query:  GKMTKQPFTGKGYRAK-----KPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSE--ALKKFKEYKTEVENALGKTIKTLRSDRGGEY
        GK TK     KG R K     +P + +H+D+ GPV+   +    YFISF D  +R+ ++Y +H   E   L  F      ++N     +  ++ DRG EY
Subjt:  GKMTKQPFTGKGYRAK-----KPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSE--ALKKFKEYKTEVENALGKTIKTLRSDRGGEY

Query:  KDLRFQDYLIEHGIQSQLAQPNTPQQNAH
         +     +    GI +      T    AH
Subjt:  KDLRFQDYLIEHGIQSQLAQPNTPQQNAH

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1; e-value 1.0e-10; identity 31.93%
Query:  KMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDY
        K  K PF+     + +PLE I+SD+     + +   Y Y++ F+D ++RY +LY +   S+  + F  +K  +EN     I T  SD GGE+  +   +Y
Subjt:  KMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDY

Query:  LIEHGIQSQLAQPNTPQQN
          +HGI    + P+TP+ N
Subjt:  LIEHGIQSQLAQPNTPQQN

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2; e-value 1.4e-12; identity 37.82%
Query:  KMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDY
        K  K PF+     + KPLE I+SD+     + +   Y Y++ F+D ++RY +LY +   S+    F  +K+ VEN     I TL SD GGE+  LR  DY
Subjt:  KMTKQPFTGKGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDY

Query:  LIEHGIQSQLAQPNTPQQN
        L +HGI    + P+TP+ N
Subjt:  LIEHGIQSQLAQPNTPQQN

Arabidopsis top hits (e-value, % identity, alignment)
No hits found
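
The alignments above carry an unlabeled middle row between each Query and Subjt pair. The following sketch counts identities and conservative substitutions from such a row. It assumes the common BLAST-style convention (a letter marks an identical residue, `+` a conservative substitution, a space a mismatch or gap); the page does not say which aligner produced the rows, and `midline_stats` is a hypothetical helper name.

```python
from typing import Tuple


def midline_stats(midline: str, aligned_length: int) -> Tuple[int, int, float]:
    """Return (identities, positives, percent identity) for one alignment row.

    ASSUMPTION: BLAST-style midline, where letters are identities, '+' marks
    conservative substitutions, and spaces are mismatches or gaps.
    """
    # Trailing spaces are often stripped in scraped text, so pad them back.
    padded = midline.ljust(aligned_length)
    if len(padded) != aligned_length:
        raise ValueError("midline is longer than the aligned row")
    identities = sum(ch.isalpha() for ch in padded)
    positives = identities + padded.count("+")
    percent_identity = round(100.0 * identities / aligned_length, 2)
    return identities, positives, percent_identity


# First ten columns of the first GenBank hit: query MSSLIIALLK, midline "MSS IIALLK".
print(midline_stats("MSS IIALLK", 10))
```

A figure computed this way covers only the row it is given. The % identity values listed for each hit describe the whole alignment, so per-row numbers will differ from them.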

Sequences
CDS sequence
ATGTCGTCCTTGATAATAGCCTTACTCAAAAGTGAATGTTTAACTGGCGAGAATTATACTACGTGGAAGTCCAACCTGAATATGATTCTGGTTGTTGACGAACTTCGATT
TGTACTAACTGAGAAATGTCCTCAGGTCCCTGCTCGAAGCGCTTCTCAATCTGTTAAGGATGCGTACGACCGTTGGATCAAGGCCAATGACAAGGCCAAGGTTTACATTT
TGGCTAGTCTTTCTGAAGTTCTGGCCAAAAAGCACAAGGGCATGGTCTCAGCTCGTGAGATCATGAGTCCGTTGCAGAATATATTTGGACAACTGTCTGGACAGCTGCGA
CACGAATCCCTCAAGTACGTTTATAACTCCCGTATGAAGGAGGGATCGTCGGTGAAAGAACATGTTCTCGATCTGATGGTCCACTTCAACGTGGAAGATATGAACGGCGC
GGTCATCGACAAGCAAAGTCAGAAGACCCAAAAGAAGAAGATAGGAGGGAAAGGGAAGGCACCTGCCGCTGCAGACAAAGGCAAGGGAAAACCCAAAGTTGCAGACAAAG
GAAAATGTTTCCACTGCAATGTGGACGGGCACTGGAAGCGAAACGACCCAAAATACCTTGTTGAGCTCAAAGAGAAGAAAGGTAAAATGACTAAGCAACCTTTTACTGGA
AAAGGTTATAGAGCCAAAAAACCCTTAGAACTTATACATTCGGATCTCTGTGGTCCGGTGAATGTTAAAGCTCGAGTAGAGTACGAATATTTCATCTCTTTCATAGATGT
TTATTCGAGGTATGGTTATCTATACCTAATGCATCATATGTCTGAAGCTCTTAAAAAGTTCAAAGAGTATAAGACTGAAGTAGAGAATGCATTAGGAAAAACCATAAAGA
CACTTCGATCCGATCGAGGTGGAGAGTATAAGGATCTAAGATTCCAGGACTATTTGATAGAACATGGAATCCAATCTCAACTCGCACAACCTAATACACCTCAGCAGAAT
GCACACATGCTATGGACAAACCCAAAGAAATTAGAACCTCGTTCAAGATTATGCCAATTTGTTGGCTATCCCAAAGAAACGAGAGGTTGTCTTTTCTATGACCCACAAGA
AAACAAGGTGCTTGTATTGACAAACACCACTTTCTTGGAGGAAGATAACATGAGAAACCTTAAACCGCGTACTGGACCTTCATCAAGAGTTGATGAAGAAGTTGGCACAT
CGAGTCAGTCTCGTCCTTCTCAATTGTGGGGAATGCCTCGACGCAGTGGGAGGGTTGTTTCCCAACCTGACCGCTACTTGGGTTTAACTGAAACTCAAGTTGTCATACTT
TTTGACGGTGTAGAGGATCCATTGTCTTATAAACAGGCAATGAATGACGTAGATAAACACCAATGGATCAAAGTCATGGACCTTGAAATGGAGTCAATGTACTTCAATTC
AGTTTGGGAACTTGTAGACCAGCCTGAAGGGTAA
mRNA sequence
ATGTCGTCCTTGATAATAGCCTTACTCAAAAGTGAATGTTTAACTGGCGAGAATTATACTACGTGGAAGTCCAACCTGAATATGATTCTGGTTGTTGACGAACTTCGATT
TGTACTAACTGAGAAATGTCCTCAGGTCCCTGCTCGAAGCGCTTCTCAATCTGTTAAGGATGCGTACGACCGTTGGATCAAGGCCAATGACAAGGCCAAGGTTTACATTT
TGGCTAGTCTTTCTGAAGTTCTGGCCAAAAAGCACAAGGGCATGGTCTCAGCTCGTGAGATCATGAGTCCGTTGCAGAATATATTTGGACAACTGTCTGGACAGCTGCGA
CACGAATCCCTCAAGTACGTTTATAACTCCCGTATGAAGGAGGGATCGTCGGTGAAAGAACATGTTCTCGATCTGATGGTCCACTTCAACGTGGAAGATATGAACGGCGC
GGTCATCGACAAGCAAAGTCAGAAGACCCAAAAGAAGAAGATAGGAGGGAAAGGGAAGGCACCTGCCGCTGCAGACAAAGGCAAGGGAAAACCCAAAGTTGCAGACAAAG
GAAAATGTTTCCACTGCAATGTGGACGGGCACTGGAAGCGAAACGACCCAAAATACCTTGTTGAGCTCAAAGAGAAGAAAGGTAAAATGACTAAGCAACCTTTTACTGGA
AAAGGTTATAGAGCCAAAAAACCCTTAGAACTTATACATTCGGATCTCTGTGGTCCGGTGAATGTTAAAGCTCGAGTAGAGTACGAATATTTCATCTCTTTCATAGATGT
TTATTCGAGGTATGGTTATCTATACCTAATGCATCATATGTCTGAAGCTCTTAAAAAGTTCAAAGAGTATAAGACTGAAGTAGAGAATGCATTAGGAAAAACCATAAAGA
CACTTCGATCCGATCGAGGTGGAGAGTATAAGGATCTAAGATTCCAGGACTATTTGATAGAACATGGAATCCAATCTCAACTCGCACAACCTAATACACCTCAGCAGAAT
GCACACATGCTATGGACAAACCCAAAGAAATTAGAACCTCGTTCAAGATTATGCCAATTTGTTGGCTATCCCAAAGAAACGAGAGGTTGTCTTTTCTATGACCCACAAGA
AAACAAGGTGCTTGTATTGACAAACACCACTTTCTTGGAGGAAGATAACATGAGAAACCTTAAACCGCGTACTGGACCTTCATCAAGAGTTGATGAAGAAGTTGGCACAT
CGAGTCAGTCTCGTCCTTCTCAATTGTGGGGAATGCCTCGACGCAGTGGGAGGGTTGTTTCCCAACCTGACCGCTACTTGGGTTTAACTGAAACTCAAGTTGTCATACTT
TTTGACGGTGTAGAGGATCCATTGTCTTATAAACAGGCAATGAATGACGTAGATAAACACCAATGGATCAAAGTCATGGACCTTGAAATGGAGTCAATGTACTTCAATTC
AGTTTGGGAACTTGTAGACCAGCCTGAAGGGTAA
Protein sequence
MSSLIIALLKSECLTGENYTTWKSNLNMILVVDELRFVLTEKCPQVPARSASQSVKDAYDRWIKANDKAKVYILASLSEVLAKKHKGMVSAREIMSPLQNIFGQLSGQLR
HESLKYVYNSRMKEGSSVKEHVLDLMVHFNVEDMNGAVIDKQSQKTQKKKIGGKGKAPAAADKGKGKPKVADKGKCFHCNVDGHWKRNDPKYLVELKEKKGKMTKQPFTG
KGYRAKKPLELIHSDLCGPVNVKARVEYEYFISFIDVYSRYGYLYLMHHMSEALKKFKEYKTEVENALGKTIKTLRSDRGGEYKDLRFQDYLIEHGIQSQLAQPNTPQQN
AHMLWTNPKKLEPRSRLCQFVGYPKETRGCLFYDPQENKVLVLTNTTFLEEDNMRNLKPRTGPSSRVDEEVGTSSQSRPSQLWGMPRRSGRVVSQPDRYLGLTETQVVIL
FDGVEDPLSYKQAMNDVDKHQWIKVMDLEMESMYFNSVWELVDQPEG