CuGenDBv2

Lag0031760 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0031760
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Gag/pol protein
Genome location: chr11:13909986..13916899
RNA-Seq Expression: Lag0031760
Synteny: Lag0031760
Gene Ontology terms:
  GO:0015074 - DNA integration (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
  IPR001878 - Zinc finger, CCHC-type
  IPR005162 - Retrotransposon gag domain
  IPR036875 - Zinc finger, CCHC-type superfamily
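
The genome location field above uses a `chromosome:start..end` notation. As a minimal sketch, the following splits such a string and reports the span length. It assumes the coordinates are 1-based and inclusive, which the record itself does not state; `parse_location` is a hypothetical helper, not part of any CuGenDB tool.

```python
import re


def parse_location(loc: str) -> tuple[str, int, int, int]:
    """Split 'chrom:start..end' into (chrom, start, end, length)."""
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    chrom, start, end = m.group(1), int(m.group(2)), int(m.group(3))
    # Assumption: 1-based inclusive coordinates, so both endpoints count.
    return chrom, start, end, end - start + 1


print(parse_location("chr11:13909986..13916899"))
# ('chr11', 13909986, 13916899, 6914)
```

Under that assumption the gene spans 6,914 bp on chr11.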


Homology
GenBank top hits (e value, % identity, alignment)
KAG6734747.1 hypothetical protein I3842_01G285500 [Carya illinoinensis] | e value: 2.0e-100 | identity: 40.64%
Query:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ
        P      S I+  PINANNFELK  LI M + + F G P +DPN HL  FL+IC TVK+NGV+ED IRLRLFPFSL+DKAR WL+SL  GSI +W  + +
Subjt:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ

Query:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW
         FLAK+FPPAKT +LR+EI  F+Q   E L+EAWERYK+L+R+CPQHG PDWLQ+Q+FYNGLN  T+TIVDAA+GGTL+SKT E A  LLE+MA+N+YQW
Subjt:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW

Query:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------
        PTER+  KK+ AGI+E++ +++L AQ+ +L++     +     QS E  A  S      E + EQV                                  
Subjt:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------

Query:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------
                         E K SLED + +F+ E++ R  K +  +  I T                     T+  QQ  A   N E++  E  K      
Subjt:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------

Query:  -EEVE-----------------EGSSSTEAEKLTSDPL----IP-------SPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK
         +E+E                 +     E E++ +D L    +P       +P +L   +P P++ +K+    QF KFLDIF  ++IN+PFA+ALEQMP 
Subjt:  -EEVE-----------------EGSSSTEAEKLTSDPL----IP-------SPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK

Query:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS
        YV+F+K+ +S+K++ ++ ET+ L+  CSA LQK +P KL DP SF+ PC    S    V    GAS
Subjt:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS

KAG7947748.1 hypothetical protein I3843_14G109500 [Carya illinoinensis] | e value: 2.4e-101 | identity: 40.81%
Query:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ
        P      S I+  PINANNFELK  LI M + + F G P +DPN HL  FL+IC TVK+NGV+ED IRLRLFPFSL+DKAR WL+SL  GSI +W  + +
Subjt:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ

Query:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW
         FLAK+FPPAKT +LR+EI  F+Q   E L+EAWERYK+L+R+CPQHG PDWLQ+Q+FYNGLN  T+TIVDAA+GGTL+SKT E A  LLE+MA+N+YQW
Subjt:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW

Query:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------
        PTER+  KK+ AGI+E++ +++L AQ+ +L++     +     QS E  A  S      E + EQV                                  
Subjt:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------

Query:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------
                         E K SLED + +F+ E++ R  K +  +  I T                     T+  QQ  A   N E++  E  K      
Subjt:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------

Query:  -EEVE-----------------EGSSSTEAEKLTSDPL----IP-------SPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK
         +E+E                 +     E E++ +D L    +P       +P +L   +P P++ +K+    QF KFLDIF  ++IN+PFA+ALEQMP 
Subjt:  -EEVE-----------------EGSSSTEAEKLTSDPL----IP-------SPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK

Query:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS
        YV+F+K+ +S+K++ ++ ET+ L+  CSA LQK +P KL DPGSF+ PC    S    V    GAS
Subjt:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS

KAG7990634.1 hypothetical protein I3843_02G035100 [Carya illinoinensis] | e value: 2.0e-100 | identity: 40.28%
Query:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ
        P      S I+  PINANNFELK  LI M + + F G P +DPN HL  FL+IC TVK+NGV+ED IRLRLFPFSL+DKAR WL+SL  GSI +W  + +
Subjt:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ

Query:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW
         FLAK+FPPAKT +LR+EI  F+Q   E L+EAWERYK+L+R+CPQHG PDWLQ+Q+FYNGLN  T+TIVDAA+GGTL+SKT E A  LLE+MA+N+YQW
Subjt:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW

Query:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------
        PTER+  KK+ AGI++++ +++L AQ+ +L++     +     QS E  A  S      E + EQV                                  
Subjt:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--QEETTTEQV----------------------------------

Query:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------
                         E K SLED + +F+ E++ R  K +  +  I T                     T+  QQ  A   N E++  E  K      
Subjt:  -----------------ENKPSLEDMVGAFIAESSKRTNKLEEAVIAINT-------------------TVTVTEQQLKASKLNWEIHEPEVTK------

Query:  -EEVE-----------------EGSSSTEAEKLTSDPL-----------IPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK
         +E+E                 +  +  E +++ +D L             +P +L   +P P++ +K+    QF KFLDIF  ++IN+PFA+ALEQMP 
Subjt:  -EEVE-----------------EGSSSTEAEKLTSDPL-----------IPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPK

Query:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS
        YV+F+K+ +S+K++ ++ ET+ L+  CSA LQK +P KL DPGSF+ PC    S    V    GAS
Subjt:  YVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGAS

XP_022843226.1 uncharacterized protein LOC111366761 [Olea europaea var. sylvestris] | e value: 3.5e-100 | identity: 42.12%
Query:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ
        P   D  SGI    I A NFELK GLI M + + F G   EDPN+HL SFL+IC TVK+NGV+EDAIRLRLF FSL+DKA+ W +SLP GSITTWD L Q
Subjt:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ

Query:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW
         FL KYFPP+K+ +LR EI  F+QL  E  +EAWER+K+LLR+CPQHG+  W+QI++FYNGLN  T+T+VDAAAGG L++KT E A  LL+D+ATNSYQW
Subjt:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW

Query:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQCQEET-----------------------------------------
        P+ERS  KK+ AG++E+D +++L AQ+ SL N  +  +  G+ Q+++S    S   +ET                                         
Subjt:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQCQEET-----------------------------------------

Query:  ----------TTEQVENKPSLEDMVGAFIAESSKRTNK-------LEEAVIAINTTVTVTEQQ--------------------------------LKASK
                   T+  + KP LED++G FI+E+  R NK       +E  V  I  T+   E Q                                L++ K
Subjt:  ----------TTEQVENKPSLEDMVGAFIAESSKRTNK-------LEEAVIAINTTVTVTEQQ--------------------------------LKASK

Query:  LNWE-------IHEPEV-TKEEVEEGSSSTEAE-----KLTSDPLIPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPKYVQF
        +  E       +  P+V   +E +     TEAE     K  S     +P +L   +P P++  KK +  QF KFL++F  ++IN+PFAE L QMP Y +F
Subjt:  LNWE-------IHEPEV-TKEEVEEGSSSTEAE-----KLTSDPLIPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPKYVQF

Query:  MKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCN
        +KE +S KKK ++ ETI LT  CS  LQK +P KL DPGSF+ PCN
Subjt:  MKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCN

XP_022860306.1 uncharacterized protein LOC111380876 [Olea europaea var. sylvestris] | e value: 1.0e-99 | identity: 41.76%
Query:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ
        P   D  SGI +  I ANNFELK GLI M + + F G   ED N+HL SFL+IC TVK+NGV+EDAIRLRLF FSL+DKA+ W +SLP GSITTWD L Q
Subjt:  PTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQ

Query:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW
         FL KYFPP+K+ +L +EI  F+QL  E  +EAWER+K+LLR+CPQHG+  W+QI++FYNGLN  T+T+VDAAAGG L++KT E A  LL+D+ATNSYQW
Subjt:  AFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQW

Query:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQCQEET-----------------------------------------
        P+ERS  KK+ AG +E+D +++L AQ+ SL N  +  +  G+ Q ++S   AS   +ET                                         
Subjt:  PTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQCQEET-----------------------------------------

Query:  ----------TTEQVENKPSLEDMVGAFIAESSKRTNKLEEAVIAINTTV-------------------TVTEQQLKASKLNWEIHEPE-----------
                   T+  + KP LED++G FI+E+  R NK E     I T V                   ++  QQ     ++ E++  E           
Subjt:  ----------TTEQVENKPSLEDMVGAFIAESSKRTNKLEEAVIAINTTV-------------------TVTEQQLKASKLNWEIHEPE-----------

Query:  -----------------VTKEEVEEGSSSTEAE-----KLTSDPLIPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPKYVQF
                         +  EE +     TEAE     K  S     +P +L   +P P++  KK +  QF KFL++F  ++IN+PFAEAL QMP Y +F
Subjt:  -----------------VTKEEVEEGSSSTEAE-----KLTSDPLIPSPTVL---VPKPKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPKYVQF

Query:  MKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCN
        +KE +S KKK ++ ETI LT  CS  LQK +P KL D GSF+ PCN
Subjt:  MKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCN

TrEMBL top hits (e value, % identity, alignment)
A0A5A7SMH8 Gag/pol protein | e value: 9.5e-96 | identity: 67.25%
Query:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM
        FVL ++CP  PA NA++ V++ Y+RW K N+K R YILA LSEVL+K+HE M+TAREIM+SLQEMFGQ SYQ+ HDALKY+Y+ RM EG S REHVL+MM
Subjt:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM

Query:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ
        V FNVAE NGAVIDE SQV+FILESL + FLQFR+ A+MNKI + LT+LLNELQ ++SL+K KGQ +G+ANV  S RKF +GS+SGTKS+ +SS  KK +
Subjt:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ

Query:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH
        KKKG +G KA   AAK+  KAK  A KG CFHCN +GHWKRNCPKYLAEKKK  +GK+DLLVLETCLVE+D+  WI+DSGATNH
Subjt:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH

A0A5A7TU93 Gag/pol protein | e value: 9.5e-96 | identity: 67.25%
Query:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM
        FVL ++CP  PA NA++ V++ Y+RW K N+K R YILA LSEVL+K+HE M+TAREIM+SLQEMFGQ SYQ+ HDALKY+Y+ RM EG S REHVL+MM
Subjt:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM

Query:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ
        V FNVAE NGAVIDE SQV+FILESL + FLQFR+ A+MNKI + LT+LLNELQ ++SL+K KGQ +G+ANV  S RKF +GS+SGTKS+ +SS  KK +
Subjt:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ

Query:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH
        KKKG +G KA   AAK+  KAK  A KG CFHCN +GHWKRNCPKYLAEKKK  +GK+DLLVLETCLVE+D+  WI+DSGATNH
Subjt:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH

A0A5A7TWB9 Gag/pol protein | e value: 9.5e-96 | identity: 67.25%
Query:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM
        FVL ++CP  PA NA++ V++ Y+RW K N+K R YILA LSEVL+K+HE M+TAREIM+SLQEMFGQ SYQ+ HDALKY+Y+ RM EG S REHVL+MM
Subjt:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM

Query:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ
        V FNVAE NGAVIDE SQV+FILESL + FLQFR+ A+MNKI + LT+LLNELQ ++SL+K KGQ +G+ANV  S RKF +GS+SGTKS+ +SS  KK +
Subjt:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ

Query:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH
        KKKG +G KA   AAK+  KAK  A KG CFHCN +GHWKRNCPKYLAEKKK  +GK+DLLVLETCLVE+D+  WI+DSGATNH
Subjt:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH

A0A5A7TZD7 Gag/pol protein | e value: 9.5e-96 | identity: 67.25%
Query:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM
        FVL ++CP  PA NA++ V++ Y+RW K N+K R YILA LSEVL+K+HE M+TAREIM+SLQEMFGQ SYQ+ HDALKY+Y+ RM EG S REHVL+MM
Subjt:  FVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMM

Query:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ
        V FNVAE NGAVIDE SQV+FILESL + FLQFR+ A+MNKI + LT+LLNELQ ++SL+K KGQ +G+ANV  S RKF +GS+SGTKS+ +SS  KK +
Subjt:  VQFNVAEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSS--KKTQ

Query:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH
        KKKG +G KA   AAK+  KAK  A KG CFHCN +GHWKRNCPKYLAEKKK  +GK+DLLVLETCLVE+D+  WI+DSGATNH
Subjt:  KKKGNKG-KAPSTAAKSKGKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNH

A0A6J1DU19 uncharacterized protein LOC111024361 | e value: 3.9e-97 | identity: 46.53%
Query:  EPTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALV
        +P FP+   GI+  PINANN ELK GLIQM R++ FRG  +EDPN+HL  FLD+CGTVK+NGV +DAIRLRLFP SLQDK                  +V
Subjt:  EPTFPDQQSGIVYAPINANNFELKTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALV

Query:  QAFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQ
        QAFL  +FPPAKT +LRTEIR+F++   EQLFE WERYKELLRKCPQHG  +WLQIQ+FYNGLN  T+TI+DAAAGGTLLS+T ENA  LL+DMA NS+Q
Subjt:  QAFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFEAWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQ

Query:  WPTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--------QEETTTEQVENKPSLEDMVGAFIAESSKRTNKLEEA
        WP+ERS  KK+ AG+YEID +SSL+AQ+ +L NA  K SG G++ S E  A             Q + T+   E K SLED++GAFI E   R +++E  
Subjt:  WPTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGSAQSIESAALASQC--------QEETTTEQVENKPSLEDMVGAFIAESSKRTNKLEEA

Query:  VIAI-----NTTVTVTEQQLKASKLNWEIH---------EPEVTKEE------VEEGSSSTEAEKLTSDPLIPSPTVLVPKPKKKKKKNYSTQFKKFLDI
        V  +       T ++   +++  ++   ++         + EV   E      +  G    E EK   +  + +      K +  K+   + Q  K    
Subjt:  VIAI-----NTTVTVTEQQLKASKLNWEIH---------EPEVTKEE------VEEGSSSTEAEKLTSDPLIPSPTVLVPKPKKKKKKNYSTQFKKFLDI

Query:  FMSLNIN-LPFAE-ALEQMPKYVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCS
         +S   N LP+ + ALEQMP YV+FMK+ ++ K+K +  ET+ LT  CSA LQ+ +P KL DPGSF+ PC    S
Subjt:  FMSLNIN-LPFAE-ALEQMPKYVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCS

SwissProt top hits (e value, % identity, alignment)
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | e value: 4.8e-04 | identity: 23.27%
Query:  WTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMMVQFNVAEAN-GAVIDEHSQVAFILE
        W   +++    I   LS+ +        TAR I   L+ ++   +        K +Y+  M EGT+   H L++        AN G  I+E  +   +L 
Subjt:  WTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMMVQFNVAEAN-GAVIDEHSQVAFILE

Query:  SLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSK-RKFEKGSSSGTKSVATSSKKTQKKKGNKGKAPSTAAKSKGKAKVMAD
        SL   +       +  K T  L  + + L L + + K K + +G A +   + R +++ S++  +S            G +GK     +K++ K++V   
Subjt:  SLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSK-RKFEKGSSSGTKSVATSSKKTQKKKGNKGKAPSTAAKSKGKAKVMAD

Query:  KGKCFHCNVDGHWKRNCP-------KYLAEKKKENEGKF----DLLVL-----ETCL-VEHDELTWILDSGATNH
           C++CN  GH+KR+CP       +   +K  +N        D +VL     E C+ +   E  W++D+ A++H
Subjt:  KGKCFHCNVDGHWKRNCP-------KYLAEKKKENEGKF----DLLVL-----ETCL-VEHDELTWILDSGATNH

Arabidopsis top hits (e value, % identity, alignment)
No hits found
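
The % identity values in the hit tables above follow the usual BLAST convention: identical positions divided by alignment length, gaps included. The sketch below shows that arithmetic on an aligned query line and its midline, where a letter in the midline marks an identical residue and `+` or a space marks a non-identical position. The function name and the toy strings are illustrative assumptions, and no value from this record is recomputed here.

```python
def percent_identity(query: str, midline: str) -> float:
    """Percent identity from one aligned query line and its BLAST midline."""
    if not query or len(query) != len(midline):
        raise ValueError("query and midline must be non-empty and equally long")
    # A midline letter that repeats the query residue marks an identical column.
    identical = sum(1 for q, m in zip(query, midline) if m.isalpha() and m == q)
    # Alignment length counts every column, gap columns included.
    return 100.0 * identical / len(query)


# Toy alignment (hypothetical data): 3 identical columns out of 6.
print(percent_identity("MKT-LV", "MK  +V"))  # 50.0
```

For a multi-block alignment such as those shown in this record, the identical counts and column counts would be summed over all blocks before dividing.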

Sequences
CDS sequence
ATGTCTTCCTCAATTTTCGTCTTAACGGATGATTGTCCTCCAGCCCCTGCTCGTAATGCATCCCAAGCAGTTAAGGATGCTTATGATCGCTGGACAAAGACCAATGATAA
GACTCGCGTCTATATCCTAGCCAGGTTATCTGAAGTTTTGTCCAAAAGGCATGAGGGCATGGTAACAGCGAGGGAGATTATGAACTCTCTCCAGGAGATGTTTGGACAAC
CGTCCTACCAACTCCACCATGACGCTCTCAAATACGTTTATAGCTGTCGCATGAAAGAGGGCACGTCTGCTCGGGAGCATGTCCTGGATATGATGGTCCAATTCAACGTG
GCAGAGGCAAACGGGGCGGTCATAGATGAGCATAGTCAAGTTGCATTCATCTTAGAATCTCTTTCGAAGCTTTTTCTACAGTTTCGAAACAAAGCAATGATGAATAAAAT
AACATTCAACCTGACTAGCCTCCTGAATGAGCTACAACTCTATCAGTCTCTTCTTAAGACCAAGGGACAGATAGAAGGAGATGCAAACGTTGTTCACTCTAAAAGAAAGT
TCGAGAAGGGTTCATCCTCTGGAACTAAATCTGTAGCCACTTCTTCAAAGAAAACTCAGAAGAAGAAAGGTAACAAGGGGAAAGCTCCCAGTACTGCTGCTAAAAGCAAG
GGAAAAGCCAAAGTTATGGCAGATAAGGGTAAGTGTTTCCACTGCAATGTAGATGGACATTGGAAGAGAAACTGCCCAAAGTACCTTGCTGAGAAAAAGAAGGAAAATGA
AGGTAAATTTGATTTACTTGTTTTAGAAACATGCCTGGTAGAACATGATGAGTTAACCTGGATACTTGATTCAGGAGCAACTAATCATAGAGAGAAAAACGACCAGAACA
AGGCAGAGAGCGGGGGGCTTGATTCGAGGAGCTTGGATCGAGAGCCGACTTTTCCTGATCAACAATCTGGGATAGTCTACGCGCCGATTAATGCAAATAATTTTGAGCTG
AAAACTGGCCTCATCCAGATGGCTAGAGATAGTGCATTTAGAGGATTCCCCTCTGAGGATCCTAATTCTCATTTAAAATCTTTTCTTGACATCTGTGGGACTGTGAAGTT
GAATGGTGTCTCTGAAGATGCCATACGCTTACGATTGTTTCCTTTCTCTTTGCAGGACAAAGCTAGAGACTGGCTCAGATCACTTCCATCTGGAAGCATAACTACGTGGG
ACGCGTTGGTTCAAGCTTTTCTTGCCAAATATTTCCCACCTGCAAAGACAGTCAAGCTTAGGACAGAGATTAGGACATTCCAGCAGCTAGGCGATGAACAACTATTTGAG
GCTTGGGAGCGCTACAAGGAGTTACTGAGGAAATGCCCTCAGCATGGATACCCAGATTGGCTGCAGATCCAGTTGTTTTATAATGGTCTAAATCCAAATACTAAAACCAT
TGTTGATGCAGCTGCAGGTGGGACTCTGTTGTCCAAGACTGTAGAGAATGCACGAACTTTACTTGAGGATATGGCCACGAACAGCTACCAGTGGCCAACTGAGCGGTCTG
CACCTAAGAAAATTGCAGCTGGGATTTATGAGATCGATAATGTAAGTTCGCTTCAAGCCCAGATGACCTCCCTTGCTAATGCTTTCATGAAATTCTCAGGTACAGGGAGT
GCACAATCTATTGAGTCAGCTGCTCTTGCATCTCAGTGTCAGGAGGAGACCACTACTGAACAGGTTGAAAATAAACCTTCTCTTGAAGATATGGTTGGAGCTTTTATTGC
AGAATCGAGCAAAAGGACCAACAAGCTGGAGGAAGCAGTGATTGCAATAAATACCACTGTGACGGTCACGGAGCAGCAATTAAAAGCATCGAAACTCAACTGGGAGATTC
ATGAGCCTGAAGTCACTAAGGAAGAAGTTGAAGAGGGTTCATCTTCAACCGAAGCTGAAAAACTCACTTCTGACCCTCTTATTCCTTCACCTACTGTTCTGGTTCCAAAG
CCCAAGAAAAAGAAGAAAAAGAATTACTCAACTCAATTCAAAAAGTTTCTTGATATTTTTATGAGTTTAAATATTAATTTACCATTTGCAGAGGCTTTGGAGCAGATGCC
CAAATATGTACAATTTATGAAGGAATGGCTTTCGAGGAAAAAGAAGGAAAAGAAAGTTGAGACAATATTCCTTACATCTACATGCAGTGCCCGACTTCAAAAGAATGTGC
CTGACAAACTTGCTGATCCAGGGAGTTTTTCTTTTCCCTGCAATTTTGATTGCTCTGAAGTTATGATGGTACGTTACAGAAAAGGTGCAAGCAAGAGCCCTCCGGTGGAG
CCCGTGACAATAGACCTCCTTGAAGCATATACGGTCACCACTGCTTGGTTTCGGCCATTCTCGCCGTGGGTGTGCTTGGAGGGTCCGAGTTCGCCGTGGGTTGCAAGTTT
TCGGGATTTTCAGCAAGTTTCCGGCGATTTCAGTGTTTCTCAGCTTGTTTTAGCCGTGGTGTTTTTTGGACTCTTCTGCTGTGTTGTTGAACTCTGTTGCTGCTGTTGTG
ACTTTATTTCTGAGGCGGACGGGCTATATTTGAAGTTTGTGGAGTGGTTGGTAGCTTGGGAGACATGCTGCCTCTTCTTCTCCTCCCCTCCACCACCTTTTGACCAAGAT
AGGTTCGTTAGTGCTGAAGCCTCAGCGAGGTTTGATAAATATGTGGAGGGTAGAAATTTTATTCCTGAGCGTGGTTTTAGCCCCGACCCCGAGATGCAACCAAACCTTGT
TAATAACATTGTTGCGCGGGGTTGGGGCGACTTTGTTCATCATCCTGCACCTGGGGTAGCAACTATAGTGAGGGAGTTCTATGCCAATATGGAAACCTCCTCTTCATCAT
CCTTTGTTCGAGGACACCGTGTCCTCTACGACCCCCTCACCATAAATAGGTTTTATAGTTTGCCCAACTTTGACAGGGATGAATATAGTACCTATCTTCATGGCCACTTG
GATGTAAACGAAGTCATTCAGACTATTTGTAGGCCAGGGGCAGAATGGATTATGACTGGCGCAGAGGTGGTGAGATTCAAAACCACTGATTTGTTTGTAGATTACAGAGC
GTGGCATACCTTCCTCTGTGCCAAGTTGATGCCTGTGGCGCATCTTAGCGATGTTACCAAGAGTCGTGCCATCCTGCTTTTCGCTATCGCCACAGGTCGCTCCGTCAATG
TCGGTCAGGTCATCCATCAGTCTATGAACCATATTCGCCGACGTTACACGACAGTGGGGCTCGGGCATCCTTCACTGATCACAGCCCTTTGCCGAGCTGCTGGTGTCGTG
TGGGACGCCCAAGAGGAGTTGGTTCATCCTGGAGCGCTAATCGACAAGAACTTCATCAGTCGCTACAGAGGACCTGGACCACAGGGCGCACAGCCACCGCTCCCTATCCA
TGCACCGCCACAGCATCATGAGCAGCACGAGCAGCCTGCAGAGCCTGAGCAGCAGGAGCAGGAGATTCCACATCCCTCCATCGAGGAGCAGCTGCAGCAGCTGCGTATGG
AGTTTCAGAGCCATCGCCTAGACTTCCAGACTCTCCAGCAGGGCATCCAGAGCCAGCAGCGAGAGCACCAGAGAGAGAGGCGTAGAGATCGTCGTCATTTCCTCTACTCC
ATGAGCATGCATGCCCACACCTATCAGTGTCAGGTAGCTATGAGTACGGGTCAGCCTTTACCGCCACCTTTACCACCGTACGAGTCGCCTGAGGACGAGGATGAGGACGC
GTGA
mRNA sequence
ATGTCTTCCTCAATTTTCGTCTTAACGGATGATTGTCCTCCAGCCCCTGCTCGTAATGCATCCCAAGCAGTTAAGGATGCTTATGATCGCTGGACAAAGACCAATGATAA
GACTCGCGTCTATATCCTAGCCAGGTTATCTGAAGTTTTGTCCAAAAGGCATGAGGGCATGGTAACAGCGAGGGAGATTATGAACTCTCTCCAGGAGATGTTTGGACAAC
CGTCCTACCAACTCCACCATGACGCTCTCAAATACGTTTATAGCTGTCGCATGAAAGAGGGCACGTCTGCTCGGGAGCATGTCCTGGATATGATGGTCCAATTCAACGTG
GCAGAGGCAAACGGGGCGGTCATAGATGAGCATAGTCAAGTTGCATTCATCTTAGAATCTCTTTCGAAGCTTTTTCTACAGTTTCGAAACAAAGCAATGATGAATAAAAT
AACATTCAACCTGACTAGCCTCCTGAATGAGCTACAACTCTATCAGTCTCTTCTTAAGACCAAGGGACAGATAGAAGGAGATGCAAACGTTGTTCACTCTAAAAGAAAGT
TCGAGAAGGGTTCATCCTCTGGAACTAAATCTGTAGCCACTTCTTCAAAGAAAACTCAGAAGAAGAAAGGTAACAAGGGGAAAGCTCCCAGTACTGCTGCTAAAAGCAAG
GGAAAAGCCAAAGTTATGGCAGATAAGGGTAAGTGTTTCCACTGCAATGTAGATGGACATTGGAAGAGAAACTGCCCAAAGTACCTTGCTGAGAAAAAGAAGGAAAATGA
AGGTAAATTTGATTTACTTGTTTTAGAAACATGCCTGGTAGAACATGATGAGTTAACCTGGATACTTGATTCAGGAGCAACTAATCATAGAGAGAAAAACGACCAGAACA
AGGCAGAGAGCGGGGGGCTTGATTCGAGGAGCTTGGATCGAGAGCCGACTTTTCCTGATCAACAATCTGGGATAGTCTACGCGCCGATTAATGCAAATAATTTTGAGCTG
AAAACTGGCCTCATCCAGATGGCTAGAGATAGTGCATTTAGAGGATTCCCCTCTGAGGATCCTAATTCTCATTTAAAATCTTTTCTTGACATCTGTGGGACTGTGAAGTT
GAATGGTGTCTCTGAAGATGCCATACGCTTACGATTGTTTCCTTTCTCTTTGCAGGACAAAGCTAGAGACTGGCTCAGATCACTTCCATCTGGAAGCATAACTACGTGGG
ACGCGTTGGTTCAAGCTTTTCTTGCCAAATATTTCCCACCTGCAAAGACAGTCAAGCTTAGGACAGAGATTAGGACATTCCAGCAGCTAGGCGATGAACAACTATTTGAG
GCTTGGGAGCGCTACAAGGAGTTACTGAGGAAATGCCCTCAGCATGGATACCCAGATTGGCTGCAGATCCAGTTGTTTTATAATGGTCTAAATCCAAATACTAAAACCAT
TGTTGATGCAGCTGCAGGTGGGACTCTGTTGTCCAAGACTGTAGAGAATGCACGAACTTTACTTGAGGATATGGCCACGAACAGCTACCAGTGGCCAACTGAGCGGTCTG
CACCTAAGAAAATTGCAGCTGGGATTTATGAGATCGATAATGTAAGTTCGCTTCAAGCCCAGATGACCTCCCTTGCTAATGCTTTCATGAAATTCTCAGGTACAGGGAGT
GCACAATCTATTGAGTCAGCTGCTCTTGCATCTCAGTGTCAGGAGGAGACCACTACTGAACAGGTTGAAAATAAACCTTCTCTTGAAGATATGGTTGGAGCTTTTATTGC
AGAATCGAGCAAAAGGACCAACAAGCTGGAGGAAGCAGTGATTGCAATAAATACCACTGTGACGGTCACGGAGCAGCAATTAAAAGCATCGAAACTCAACTGGGAGATTC
ATGAGCCTGAAGTCACTAAGGAAGAAGTTGAAGAGGGTTCATCTTCAACCGAAGCTGAAAAACTCACTTCTGACCCTCTTATTCCTTCACCTACTGTTCTGGTTCCAAAG
CCCAAGAAAAAGAAGAAAAAGAATTACTCAACTCAATTCAAAAAGTTTCTTGATATTTTTATGAGTTTAAATATTAATTTACCATTTGCAGAGGCTTTGGAGCAGATGCC
CAAATATGTACAATTTATGAAGGAATGGCTTTCGAGGAAAAAGAAGGAAAAGAAAGTTGAGACAATATTCCTTACATCTACATGCAGTGCCCGACTTCAAAAGAATGTGC
CTGACAAACTTGCTGATCCAGGGAGTTTTTCTTTTCCCTGCAATTTTGATTGCTCTGAAGTTATGATGGTACGTTACAGAAAAGGTGCAAGCAAGAGCCCTCCGGTGGAG
CCCGTGACAATAGACCTCCTTGAAGCATATACGGTCACCACTGCTTGGTTTCGGCCATTCTCGCCGTGGGTGTGCTTGGAGGGTCCGAGTTCGCCGTGGGTTGCAAGTTT
TCGGGATTTTCAGCAAGTTTCCGGCGATTTCAGTGTTTCTCAGCTTGTTTTAGCCGTGGTGTTTTTTGGACTCTTCTGCTGTGTTGTTGAACTCTGTTGCTGCTGTTGTG
ACTTTATTTCTGAGGCGGACGGGCTATATTTGAAGTTTGTGGAGTGGTTGGTAGCTTGGGAGACATGCTGCCTCTTCTTCTCCTCCCCTCCACCACCTTTTGACCAAGAT
AGGTTCGTTAGTGCTGAAGCCTCAGCGAGGTTTGATAAATATGTGGAGGGTAGAAATTTTATTCCTGAGCGTGGTTTTAGCCCCGACCCCGAGATGCAACCAAACCTTGT
TAATAACATTGTTGCGCGGGGTTGGGGCGACTTTGTTCATCATCCTGCACCTGGGGTAGCAACTATAGTGAGGGAGTTCTATGCCAATATGGAAACCTCCTCTTCATCAT
CCTTTGTTCGAGGACACCGTGTCCTCTACGACCCCCTCACCATAAATAGGTTTTATAGTTTGCCCAACTTTGACAGGGATGAATATAGTACCTATCTTCATGGCCACTTG
GATGTAAACGAAGTCATTCAGACTATTTGTAGGCCAGGGGCAGAATGGATTATGACTGGCGCAGAGGTGGTGAGATTCAAAACCACTGATTTGTTTGTAGATTACAGAGC
GTGGCATACCTTCCTCTGTGCCAAGTTGATGCCTGTGGCGCATCTTAGCGATGTTACCAAGAGTCGTGCCATCCTGCTTTTCGCTATCGCCACAGGTCGCTCCGTCAATG
TCGGTCAGGTCATCCATCAGTCTATGAACCATATTCGCCGACGTTACACGACAGTGGGGCTCGGGCATCCTTCACTGATCACAGCCCTTTGCCGAGCTGCTGGTGTCGTG
TGGGACGCCCAAGAGGAGTTGGTTCATCCTGGAGCGCTAATCGACAAGAACTTCATCAGTCGCTACAGAGGACCTGGACCACAGGGCGCACAGCCACCGCTCCCTATCCA
TGCACCGCCACAGCATCATGAGCAGCACGAGCAGCCTGCAGAGCCTGAGCAGCAGGAGCAGGAGATTCCACATCCCTCCATCGAGGAGCAGCTGCAGCAGCTGCGTATGG
AGTTTCAGAGCCATCGCCTAGACTTCCAGACTCTCCAGCAGGGCATCCAGAGCCAGCAGCGAGAGCACCAGAGAGAGAGGCGTAGAGATCGTCGTCATTTCCTCTACTCC
ATGAGCATGCATGCCCACACCTATCAGTGTCAGGTAGCTATGAGTACGGGTCAGCCTTTACCGCCACCTTTACCACCGTACGAGTCGCCTGAGGACGAGGATGAGGACGC
GTGA
Protein sequence
MSSSIFVLTDDCPPAPARNASQAVKDAYDRWTKTNDKTRVYILARLSEVLSKRHEGMVTAREIMNSLQEMFGQPSYQLHHDALKYVYSCRMKEGTSAREHVLDMMVQFNV
AEANGAVIDEHSQVAFILESLSKLFLQFRNKAMMNKITFNLTSLLNELQLYQSLLKTKGQIEGDANVVHSKRKFEKGSSSGTKSVATSSKKTQKKKGNKGKAPSTAAKSK
GKAKVMADKGKCFHCNVDGHWKRNCPKYLAEKKKENEGKFDLLVLETCLVEHDELTWILDSGATNHREKNDQNKAESGGLDSRSLDREPTFPDQQSGIVYAPINANNFEL
KTGLIQMARDSAFRGFPSEDPNSHLKSFLDICGTVKLNGVSEDAIRLRLFPFSLQDKARDWLRSLPSGSITTWDALVQAFLAKYFPPAKTVKLRTEIRTFQQLGDEQLFE
AWERYKELLRKCPQHGYPDWLQIQLFYNGLNPNTKTIVDAAAGGTLLSKTVENARTLLEDMATNSYQWPTERSAPKKIAAGIYEIDNVSSLQAQMTSLANAFMKFSGTGS
AQSIESAALASQCQEETTTEQVENKPSLEDMVGAFIAESSKRTNKLEEAVIAINTTVTVTEQQLKASKLNWEIHEPEVTKEEVEEGSSSTEAEKLTSDPLIPSPTVLVPK
PKKKKKKNYSTQFKKFLDIFMSLNINLPFAEALEQMPKYVQFMKEWLSRKKKEKKVETIFLTSTCSARLQKNVPDKLADPGSFSFPCNFDCSEVMMVRYRKGASKSPPVE
PVTIDLLEAYTVTTAWFRPFSPWVCLEGPSSPWVASFRDFQQVSGDFSVSQLVLAVVFFGLFCCVVELCCCCCDFISEADGLYLKFVEWLVAWETCCLFFSSPPPPFDQD
RFVSAEASARFDKYVEGRNFIPERGFSPDPEMQPNLVNNIVARGWGDFVHHPAPGVATIVREFYANMETSSSSSFVRGHRVLYDPLTINRFYSLPNFDRDEYSTYLHGHL
DVNEVIQTICRPGAEWIMTGAEVVRFKTTDLFVDYRAWHTFLCAKLMPVAHLSDVTKSRAILLFAIATGRSVNVGQVIHQSMNHIRRRYTTVGLGHPSLITALCRAAGVV
WDAQEELVHPGALIDKNFISRYRGPGPQGAQPPLPIHAPPQHHEQHEQPAEPEQQEQEIPHPSIEEQLQQLRMEFQSHRLDFQTLQQGIQSQQREHQRERRRDRRHFLYS
MSMHAHTYQCQVAMSTGQPLPPPLPPYESPEDEDEDA
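
The protein sequence above is the translation of the CDS. As a quick consistency check, the sketch below translates the first 42 nucleotides of the CDS and the last 12, using the standard genetic code. It assumes the standard nuclear code applies to this gene; `translate` is a hypothetical helper written for illustration, not a CuGenDB function.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# First 42 nt of the CDS shown in this record.
print(translate("ATGTCTTCCTCAATTTTCGTCTTAACGGATGATTGTCCTCCA"))  # MSSSIFVLTDDCPP
# Last 12 nt of the CDS, ending in the TGA stop codon.
print(translate("GAGGACGCGTGA"))  # EDA
```

Both results agree with the start (`MSSSIFVLTDDCPP...`) and end (`...EDA`) of the protein sequence listed above.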