CuGenDBv2

Tan0021788 (gene) of Snake gourd v1 genome

Gene ID: Tan0021788
Organism: Trichosanthes anguina (Snake gourd v1)
Description: Gag/pol protein
Genome location: LG09:44447612..44449193
RNA-Seq Expression: Tan0021788
Synteny: Tan0021788
Gene Ontology terms:
GO:0006807 - nitrogen compound metabolic process (biological process)
GO:0043170 - macromolecule metabolic process (biological process)
GO:0044238 - primary metabolic process (biological process)
GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
IPR012337 - Ribonuclease H-like superfamily
IPR025724 - GAG-pre-integrase domain
IPR036397 - Ribonuclease H superfamily
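
The genome location above can be split into a linkage group and coordinates in a few lines. This is a minimal sketch, not part of the database record: the helper name `parse_location` is hypothetical, and it assumes the coordinates are 1-based and inclusive, which the page does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split 'SEQID:START..END' into (seqid, start, end).

    Assumption: coordinates are 1-based and inclusive.
    """
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


seqid, start, end = parse_location("LG09:44447612..44449193")
span = end - start + 1  # inclusive span in base pairs
print(seqid, start, end, span)
```

Under the inclusive-coordinate assumption the gene spans 1582 bp on LG09.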


Homology
GenBank top hits | e value | % identity | alignment
KAA0035879.1 gag/pol protein [Cucumis melo var. makuwa] | 1.0e-151 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

KAA0044955.1 gag/pol protein [Cucumis melo var. makuwa] | 3.5e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SWQ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L+  ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH +LCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

KAA0048404.1 gag/pol protein [Cucumis melo var. makuwa] | 1.0e-151 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

KAA0054490.1 gag/pol protein [Cucumis melo var. makuwa] | 1.0e-151 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

TYK14550.1 gag/pol protein [Cucumis melo var. makuwa] | 1.0e-151 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

TrEMBL top hits | e value | % identity | alignment
A0A5A7SMH8 Gag/pol protein | 4.9e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

A0A5A7TU93 Gag/pol protein | 1.7e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SWQ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L+  ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH +LCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

A0A5A7TWB9 Gag/pol protein | 4.9e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

A0A5A7V4M1 Gag/pol protein | 4.9e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

A0A5D3CPJ6 Gag/pol protein | 4.9e-152 | 56.13
Query:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS
        + Y+RW +ANEKA+ YI+ S+S+VLAKKHE ++TA+EIM+SLQEMFGQ S Q++HD+LKY++NARM EG+SVREHVLNMM HFN+ EMNGA IDE+SQVS
Subjt:  DAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVS

Query:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK
        FILE+L +SFL   +   + N         L ++    SL+     +G  ++      FH           +  G K+ K+       K +   A   KK
Subjt:  FILETLQKSFLGPEQCQGIEN---------LRQMLPHMSLITGFDLRGRNSLL-----FH-----------AQKGKKRMKR------GKIDRVCAHKGKK

Query:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK
         K  AA+   F   + G      P  +  +++  QGK DLLV ETCLVE+ DSAWI+DS ATNHVCSSFQ I SW+ L  GE+T+RVG+  +VSA A+G 
Subjt:  VKDVAAERKVFP-PEWGRTLEEEPSQILGRERKNQGKADLLVTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGK

Query:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL
        ++L   ++++LL+N+Y+VP   RNL+S+ CLLEQ  S++FN NK FI +NG  ICSA LE++LYVL+  + K++LNTE+FKT  T+ K+ K+SPKENAHL
Subjt:  VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHL

Query:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
        WHLR G+INLNRIE+LVK+GLL+ELEENSL VCESC EGKMTKRPF+GKG+R KEPLELVH DLCGPMNVK RGG+EYF++F  DYSRYGY+YLM  KSE
Subjt:  WHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Query:  TLEKFKEYKTEVENLLGKSLKT
         LEKFKEYK EVEN L K++KT
Subjt:  TLEKFKEYKTEVENLLGKSLKT

SwissProt top hits | e value | % identity | alignment
P04146 Copia protein | 7.6e-17 | 23.9
Query:  DRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVSFIL
        D W +A   AK  II  +SD         +TA++I+E+L  ++ ++SL  +    K + + ++    S+  H        + +   GA I+E  ++S +L
Subjt:  DRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVSFIL

Query:  ETLQKSFLGPEQCQGIENLRQMLPHMSLITGF--------DLRGRNSLLFHAQKGKKRMKRGKIDRVCAHKGKKVKD-VAAERKVFPPEWGRTLEEEPSQ
         TL      P    GI    + L   +L   F        +++ +N    H    KK M     +    +K    K+ V   +K+F    G +  +    
Subjt:  ETLQKSFLGPEQCQGIENLRQMLPHMSLITGF--------DLRGRNSLLFHAQKGKKRMKRGKIDRVCAHKGKKVKD-VAAERKVFPPEWGRTLEEEPSQ

Query:  ILGRE---------------RKNQGKADLLVTETC-----LVES-------SDSAWILDSSATNHVCSSFQ-EIDSWQLLREGEVTLRVGSRELVSAAAI
          GRE                KN+     + T T      +V+         +  ++LDS A++H+ +      DS +++   ++ +     E + A   
Subjt:  ILGRE---------------RKNQGKADLLVTETC-----LVES-------SDSAWILDSSATNHVCSSFQ-EIDSWQLLREGEVTLRVGSRELVSAAAI

Query:  GKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENA
        G V+L    + I L+++        NL+S+  L E  +S+ F+ +   IS+NG           L V+K + + + +    F+      K      K N 
Subjt:  GKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENA

Query:  HLWHLREGYIN------LNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRT--KEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYG
         LWH R G+I+      + R        LLN L E S  +CE C  GK  + PF     +T  K PL +VH D+CGP+         YFV F+  ++ Y 
Subjt:  HLWHLREGYIN------LNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRT--KEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYG

Query:  YIYLMHKKSETLEKFKEYKTEVE
          YL+  KS+    F+++  + E
Subjt:  YIYLMHKKSETLEKFKEYKTEVE

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | 1.2e-17 | 21.93
Query:  WIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVSFILET
        W   +E+A   I + +SD +        TA+ I   L+ ++  ++L  +    K ++   M EG++   H+         +   G  I+E  +   +L +
Subjt:  WIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQVSFILET

Query:  LQKSFLGPEQCQGIENLRQMLPHMSLITGFDLRGRNSLLFHAQKGKKRMK---------------------------RGKI-----DRV-----CAHKGK
        L  S+         +NL   + H    T  +L+   S L   +K +K+ +                           RGK       RV     C   G 
Subjt:  LQKSFLGPEQCQGIENLRQMLPHMSLITGFDLRGRNSLLFHAQKGKKRMK---------------------------RGKI-----DRV-----CAHKGK

Query:  KVKDVAAERKVFPPEWGRTLEEEPSQILGRERKNQGKADLLVT--ETCL-VESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEV-TLRVGSRELVSAA
          +D    RK      G+  ++  + ++    +N     L +   E C+ +   +S W++D++A++H   +    D +     G+  T+++G+      A
Subjt:  KVKDVAAERKVFPPEWGRTLEEEPSQILGRERKNQGKADLLVT--ETCL-VESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEV-TLRVGSRELVSAA

Query:  AIGK--VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSP
         IG   +K + G   +L D  + VP    NL+S   L        F   K  +++   +I       +LY       +  LN                  
Subjt:  AIGK--VKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSP

Query:  KENAHLWHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYL
        + +  LWH R G+++   ++ L K  L++  +  ++  C+ C  GK  +  F     R    L+LV+ D+CGPM ++  GG +YFV+FI D SR  ++Y+
Subjt:  KENAHLWHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYL

Query:  MHKKSETLEKFKEYKTEVENLLGKSLKTL
        +  K +  + F+++   VE   G+ LK L
Subjt:  MHKKSETLEKFKEYKTEVENLLGKSLKTL

Q12337 Transposon Ty2-GR1 Gag-Pol polyprotein | 1.8e-10 | 23.57
Query:  LEEEPSQILGRERKNQGKADLLVTETCLVESSDSA---WILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMY
        L ++    LG+++K           T  ++S+D      ++DS A+  +  S   +         E+ +    ++ +   AIG +  +F           
Subjt:  LEEEPSQILGRERKNQGKADLLVTETCLVESSDSA---WILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMY

Query:  IVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEH-SLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEK
          P    +L+S+S L  Q I+  F  N   + R+   + +  ++H   Y L   S K ++ + + K T     K+K   K    L H   G+ N   I+K
Subjt:  IVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLICSASLEH-SLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEK

Query:  LVKSGLLNELEENSLS-------VCESCFEGKMTKRPFSGKGYRTK-----EPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE
         +K   +  L+E+ +         C  C  GK TK     KG R K     EP + +H D+ GP++  P+    YF+SF  + +R+ ++Y +H + E
Subjt:  LVKSGLLNELEENSLS-------VCESCFEGKMTKRPFSGKGYRTK-----EPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSE

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 | 7.4e-12 | 25
Query:  SDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLE-QCISVSFNGNKAFIS-
        S + W+LDS AT+H+ S F  +   Q    G+  + V     +  +  G   L      + L N+  VP   +NL+S+  L     +SV F      +  
Subjt:  SDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLE-QCISVSFNGNKAFIS-

Query:  -RNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELE-ENSLSVCESCFEGKMTKRPF
           G  +     +  LY     S + V    LF         A  S K     WH R G+   + +  ++ +  L+ L   +    C  C   K  K PF
Subjt:  -RNGNLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELE-ENSLSVCESCFEGKMTKRPF

Query:  SGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSETLEKFKEYKTEVEN
        S     +  PLE ++ D+     +     Y Y+V F+  ++RY ++Y + +KS+  E F  +K  +EN
Subjt:  SGKGYRTKEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSETLEKFKEYKTEVEN

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | 9.6e-12 | 23.7
Query:  WILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLI
        W+LDS AT+H+ S F  +   Q    G+  + +     +     G   L      + L+ +  VP   +NL+S+  L         N N+  +       
Subjt:  WILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNGNLI

Query:  CSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELE-ENSLSVCESCFEGKMTKRPFSGKGYRT
            L   + +L+    K  L      +++  +  A    K     WH R G+ +L  +  ++ +  L  L   + L  C  CF  K  K PFS     +
Subjt:  CSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELE-ENSLSVCESCFEGKMTKRPFSGKGYRT

Query:  KEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTL
         +PLE ++ D+     +     Y Y+V F+  ++RY ++Y + +KS+  + F  +K+ VEN     + TL
Subjt:  KEPLELVHYDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTL

Arabidopsis top hits | e value | % identity | alignment
ATMG00300.1 Gag-Pol-related retrotransposon family protein | 2.7e-09 | 35.96
Query:  TTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNV
        + ET       + K+   LWH R  +++   +E LVK G L+  + +SL  CE C  GK  +  FS   + TK PL+ VH DL G  +V
Subjt:  TTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVHYDLCGPMNV
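
The unlabeled middle row of each alignment above follows the usual BLAST convention: a letter marks an identical residue, `+` marks a substitution with a positive score, and a space marks a mismatch or gap. As a rough sketch of how the % identity column relates to that row, the hypothetical helper `midline_stats` below counts those symbols. The excerpt is only the first ten columns of the first GenBank alignment, so its percentage is not the reported 56.13.

```python
def midline_stats(midline: str) -> tuple[int, int, int]:
    """Return (identities, positives, aligned_columns) for a BLAST-style midline.

    Letters are identical residues; '+' is a positive-scoring substitution;
    spaces are mismatches or gap columns and only add to the column count.
    """
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives, len(midline)


# First ten midline columns of the KAA0035879.1 alignment (query: DAYDRWIRAN).
excerpt = "+ Y+RW +AN"
ident, pos, length = midline_stats(excerpt)
print(f"identities {ident}/{length} ({100 * ident / length:.1f}%)")
print(f"positives  {pos}/{length} ({100 * pos / length:.1f}%)")
```

When applying this to a full alignment, the leading label column must be stripped first so that only the aligned columns are counted.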


Sequences
CDS sequence
ATGTCGGGCTCGACAAATCGCGAAGTGTTTTGCGATGCATACGATCGATGGATCAGGGCCAATGAAAAGGCCAAGGTCTACATCATTGTCAGCATGTCTGATGTTTTGGC
AAAGAAGCATGAGTTGATCGTCACCGCCAAGGAGATCATGGAGTCCTTGCAGGAAATGTTTGGACAACAGTCCCTTCAGGTCCGGCATGACTCGCTCAAATACGTCTTCA
ATGCACGGATGAAAGAGGGGTCGTCTGTCCGTGAACATGTTCTAAACATGATGACCCACTTTAATCTGGTTGAGATGAACGGGGCTTCGATCGATGAGTCGAGCCAAGTC
AGTTTTATTTTGGAGACTCTCCAGAAGAGTTTCCTTGGGCCAGAGCAATGTCAAGGCATCGAGAATCTGAGGCAAATGTTGCCTCACATGAGTCTTATCACGGGGTTCGA
CCTGCGGGGACGAAACTCGTTGCTCTTTCACGCCCAAAAAGGGAAGAAGAGGATGAAGAGGGGTAAAATCGACCGAGTTTGCGCCCACAAGGGCAAAAAGGTCAAGGACG
TTGCAGCAGAAAGGAAAGTGTTTCCACCTGAATGGGGGCGCACACTAGAAGAGGAACCGTCCCAAATTCTCGGTCGAGAGAGGAAGAATCAAGGTAAAGCTGATTTACTT
GTGACAGAAACTTGTTTAGTGGAGAGTAGTGACTCTGCCTGGATATTGGATTCGAGCGCCACTAACCATGTTTGTTCTTCTTTTCAGGAGATTGATTCCTGGCAGCTGCT
GCGAGAGGGTGAGGTGACTCTACGGGTTGGATCCAGGGAGCTTGTCTCTGCTGCAGCGATCGGCAAGGTGAAGCTACATTTTGGCAGAAACTACATTTTGCTGGACAATA
TGTATATAGTTCCAGGGTTTACTAGAAACCTAGTTTCTATTTCCTGCCTTTTAGAACAGTGTATTTCTGTTTCATTCAATGGTAATAAAGCGTTTATTTCCAGAAATGGT
AATCTTATTTGTTCTGCTTCACTTGAGCATAGTCTGTATGTTTTGAAACCTAATTCGGTCAAAAGTGTTTTGAATACTGAATTGTTTAAAACTACAGAAACACGAACTAA
GAAAGCGAAAGTTTCTCCTAAAGAAAATGCCCATCTTTGGCATCTACGGGAAGGCTACATTAATCTCAATAGGATTGAGAAACTAGTGAAGAGTGGACTTCTAAACGAGT
TGGAAGAAAACTCTTTGTCGGTGTGTGAGTCATGCTTTGAGGGCAAGATGACCAAACGTCCTTTTAGTGGAAAAGGATATAGAACCAAGGAGCCTCTTGAGTTAGTACAT
TATGACCTCTGTGGTCCGATGAATGTTAAACCTCGGGGCGGTTATGAGTACTTTGTGTCTTTCATATACGACTACTCAAGGTATGGGTATATTTACCTAATGCACAAGAA
GTCTGAAACTCTTGAAAAGTTCAAGGAGTACAAGACTGAGGTTGAGAACCTCTTAGGTAAATCGCTTAAAACACTTTGA
mRNA sequence
ATGTCGGGCTCGACAAATCGCGAAGTGTTTTGCGATGCATACGATCGATGGATCAGGGCCAATGAAAAGGCCAAGGTCTACATCATTGTCAGCATGTCTGATGTTTTGGC
AAAGAAGCATGAGTTGATCGTCACCGCCAAGGAGATCATGGAGTCCTTGCAGGAAATGTTTGGACAACAGTCCCTTCAGGTCCGGCATGACTCGCTCAAATACGTCTTCA
ATGCACGGATGAAAGAGGGGTCGTCTGTCCGTGAACATGTTCTAAACATGATGACCCACTTTAATCTGGTTGAGATGAACGGGGCTTCGATCGATGAGTCGAGCCAAGTC
AGTTTTATTTTGGAGACTCTCCAGAAGAGTTTCCTTGGGCCAGAGCAATGTCAAGGCATCGAGAATCTGAGGCAAATGTTGCCTCACATGAGTCTTATCACGGGGTTCGA
CCTGCGGGGACGAAACTCGTTGCTCTTTCACGCCCAAAAAGGGAAGAAGAGGATGAAGAGGGGTAAAATCGACCGAGTTTGCGCCCACAAGGGCAAAAAGGTCAAGGACG
TTGCAGCAGAAAGGAAAGTGTTTCCACCTGAATGGGGGCGCACACTAGAAGAGGAACCGTCCCAAATTCTCGGTCGAGAGAGGAAGAATCAAGGTAAAGCTGATTTACTT
GTGACAGAAACTTGTTTAGTGGAGAGTAGTGACTCTGCCTGGATATTGGATTCGAGCGCCACTAACCATGTTTGTTCTTCTTTTCAGGAGATTGATTCCTGGCAGCTGCT
GCGAGAGGGTGAGGTGACTCTACGGGTTGGATCCAGGGAGCTTGTCTCTGCTGCAGCGATCGGCAAGGTGAAGCTACATTTTGGCAGAAACTACATTTTGCTGGACAATA
TGTATATAGTTCCAGGGTTTACTAGAAACCTAGTTTCTATTTCCTGCCTTTTAGAACAGTGTATTTCTGTTTCATTCAATGGTAATAAAGCGTTTATTTCCAGAAATGGT
AATCTTATTTGTTCTGCTTCACTTGAGCATAGTCTGTATGTTTTGAAACCTAATTCGGTCAAAAGTGTTTTGAATACTGAATTGTTTAAAACTACAGAAACACGAACTAA
GAAAGCGAAAGTTTCTCCTAAAGAAAATGCCCATCTTTGGCATCTACGGGAAGGCTACATTAATCTCAATAGGATTGAGAAACTAGTGAAGAGTGGACTTCTAAACGAGT
TGGAAGAAAACTCTTTGTCGGTGTGTGAGTCATGCTTTGAGGGCAAGATGACCAAACGTCCTTTTAGTGGAAAAGGATATAGAACCAAGGAGCCTCTTGAGTTAGTACAT
TATGACCTCTGTGGTCCGATGAATGTTAAACCTCGGGGCGGTTATGAGTACTTTGTGTCTTTCATATACGACTACTCAAGGTATGGGTATATTTACCTAATGCACAAGAA
GTCTGAAACTCTTGAAAAGTTCAAGGAGTACAAGACTGAGGTTGAGAACCTCTTAGGTAAATCGCTTAAAACACTTTGA
Protein sequence
MSGSTNREVFCDAYDRWIRANEKAKVYIIVSMSDVLAKKHELIVTAKEIMESLQEMFGQQSLQVRHDSLKYVFNARMKEGSSVREHVLNMMTHFNLVEMNGASIDESSQV
SFILETLQKSFLGPEQCQGIENLRQMLPHMSLITGFDLRGRNSLLFHAQKGKKRMKRGKIDRVCAHKGKKVKDVAAERKVFPPEWGRTLEEEPSQILGRERKNQGKADLL
VTETCLVESSDSAWILDSSATNHVCSSFQEIDSWQLLREGEVTLRVGSRELVSAAAIGKVKLHFGRNYILLDNMYIVPGFTRNLVSISCLLEQCISVSFNGNKAFISRNG
NLICSASLEHSLYVLKPNSVKSVLNTELFKTTETRTKKAKVSPKENAHLWHLREGYINLNRIEKLVKSGLLNELEENSLSVCESCFEGKMTKRPFSGKGYRTKEPLELVH
YDLCGPMNVKPRGGYEYFVSFIYDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTL
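
The CDS and protein sequences above can be cross-checked by translating the coding sequence. The sketch below assumes the standard genetic code (the page does not say which table was used) and, to stay short, translates only the first 60 nucleotides of the CDS and compares them with the first 20 residues of the protein. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the record was translated with this table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon; '*' marks a stop."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First 60 nt of the CDS and first 20 residues of the protein shown above.
cds_start = "ATGTCGGGCTCGACAAATCGCGAAGTGTTTTGCGATGCATACGATCGATGGATCAGGGCC"
protein_start = "MSGSTNREVFCDAYDRWIRA"
print(translate(cds_start) == protein_start)
```

Passing the full CDS to `translate` and dropping the final `*` should give a string that can be compared with the complete protein sequence in the same way.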