CuGenDBv2

Tan0004826 (gene) of Snake gourd v1 genome

Gene ID: Tan0004826
Organism: Trichosanthes anguina (Snake gourd v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: LG09:13834881..13835659
RNA-Seq Expression: Tan0004826
Synteny: Tan0004826
Gene Ontology terms:
GO:0006278 - RNA-dependent DNA biosynthetic process (biological process)
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
GO:0003964 - RNA-directed DNA polymerase activity (molecular function)
GO:0008194 - UDP-glycosyltransferase activity (molecular function)
GO:0008270 - zinc ion binding (molecular function)
InterPro domains: NA
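
The genome location field can be split into its parts programmatically. The following is a minimal sketch, assuming the `SEQID:START..END` layout shown in this record and assuming the coordinates are 1-based and inclusive (the record does not state the convention). The function names `parse_location` and `span_length` are illustrative, not part of any CuGenDB tool.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'SEQID:START..END' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:\s]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


seqid, start, end = parse_location("LG09:13834881..13835659")
print(seqid, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption the gene span is 779 bp on LG09.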


Homology
GenBank top hits (e value, % identity, alignment)
RVW19779.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Vitis vinifera]  e value: 2.8e-67  % identity: 60.17
Query:  MDASSSTNGVAPTMMGSTIIK--THVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLY
        M+ +  +N   P +     IK  TH EKP+KF  ++FKRWQQK++FYL+TLNL H+LKEECP   PE  T +   A +AW H +FLCRNYIL+GL D+LY
Subjt:  MDASSSTNGVAPTMMGSTIIK--THVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLY

Query:  NVYCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELS
        NVY +++ T+R LWEAL+KKYK +DA TKKF+VGKFLD+KMID+  V+NQ+EELQI+I+ + +EG+ I+E FQVA+  EKL PSWKDFK Y KHKRKELS
Subjt:  NVYCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELS

Query:  MENLVVKLRIEEDNRKRDKSWLE--VEARAH
        ME+L+V+LRIEEDNRK DKS  +  +EA+AH
Subjt:  MENLVVKLRIEEDNRKRDKSWLE--VEARAH

RVW83338.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Vitis vinifera]  e value: 1.6e-67  % identity: 63
Query:  TIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCNAYNTSRQLWEALDK
        T+  TH EKPEKF G  FKRWQQKM+FYL+TLNLA  L EECPI        E  AA  AW H DFLCRNY+L+GL +TLYNVYC +  T+++LW++LDK
Subjt:  TIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCNAYNTSRQLWEALDK

Query:  KYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLVVKLRIEEDNRKRDK
        KYK EDA  KKF+VGKFLD+KMID+K+V++Q++ELQ+I+ ++ S+G+ +S+ FQVAAV EKLPP WKDFK Y KHKRKE+++E L+V+LRIEEDNRK +K
Subjt:  KYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLVVKLRIEEDNRKRDK

XP_022147763.1 uncharacterized protein LOC111016620 [Momordica charantia]  e value: 2.1e-91  % identity: 75.33
Query:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV
        M A++STN  AP MMGSTI K+H EKPEKFKGENFKRWQQKM+FY +TLNLAH++KE CP T  E +TPETEAAKQAW+H DFLC NYILS + DTLYNV
Subjt:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV

Query:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME
        YCNA++TSRQLWEALDKKYKLEDA TKKFLVGKFLDYKM+D KLVVN +EELQIIISDLQSEGL I+EPFQV  V EKL P+W++FKCY KHK+KELS+E
Subjt:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME

Query:  NLVVKLRIEEDNRKRDKSWLEVEARAH
        NL VKLRI+E+N K DK   + EA+AH
Subjt:  NLVVKLRIEEDNRKRDKSWLEVEARAH

XP_022148559.1 uncharacterized protein LOC111017193 [Momordica charantia]  e value: 2.4e-87  % identity: 74.11
Query:  SSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCN
        +++T+  AP+MMGSTI+K+H EK EKFKGENFKRWQQKMIFY +TLNLAH+LKE CP T  E +T ETEA KQA +H +FLC NYILS L DTL+NVYCN
Subjt:  SSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCN

Query:  AYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLV
        A++TSRQLWEALDKKYKLEDA TKKFLV KFLDYK+ID KLV+NQ+EELQII SDLQSE L I+EPFQ+ AV EKLPP+W++FK Y KHKRKELSMENL 
Subjt:  AYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLV

Query:  VKLRIEEDNRKRDKSWLEVEARAH
        VKLRIEEDNRK DK   + EA+AH
Subjt:  VKLRIEEDNRKRDKSWLEVEARAH

XP_022156727.1 uncharacterized protein LOC111023572 [Momordica charantia]  e value: 3.4e-81  % identity: 73.73
Query:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV
        M A++STN  APTMMGSTIIK H EK EKF+G+NFK WQ KMIFYL+TLNLAH+L++ CP T  E + PETEAAKQAW+H DFL  NYIL+ L  TL NV
Subjt:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV

Query:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME
        YCNA++TSRQLW+ LDKKYKLED  TKKFLVGKFLDYKM++ KLVVNQ+EELQII SDLQSEGL I+E FQVAAV E LP  W++FKCY KHKRK+LSME
Subjt:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME

Query:  NLVVKLRIEEDNRKRDK
        NL VKLRIEED RK DK
Subjt:  NLVVKLRIEEDNRKRDK

TrEMBL top hits (e value, % identity, alignment)
A0A438C9B6 Retrovirus-related Pol polyprotein from transposon TNT 1-94  e value: 1.3e-67  % identity: 60.17
Query:  MDASSSTNGVAPTMMGSTIIK--THVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLY
        M+ +  +N   P +     IK  TH EKP+KF  ++FKRWQQK++FYL+TLNL H+LKEECP   PE  T +   A +AW H +FLCRNYIL+GL D+LY
Subjt:  MDASSSTNGVAPTMMGSTIIK--THVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLY

Query:  NVYCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELS
        NVY +++ T+R LWEAL+KKYK +DA TKKF+VGKFLD+KMID+  V+NQ+EELQI+I+ + +EG+ I+E FQVA+  EKL PSWKDFK Y KHKRKELS
Subjt:  NVYCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELS

Query:  MENLVVKLRIEEDNRKRDKSWLE--VEARAH
        ME+L+V+LRIEEDNRK DKS  +  +EA+AH
Subjt:  MENLVVKLRIEEDNRKRDKSWLE--VEARAH

A0A438HFY7 Retrovirus-related Pol polyprotein from transposon TNT 1-94  e value: 7.8e-68  % identity: 63
Query:  TIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCNAYNTSRQLWEALDK
        T+  TH EKPEKF G  FKRWQQKM+FYL+TLNLA  L EECPI        E  AA  AW H DFLCRNY+L+GL +TLYNVYC +  T+++LW++LDK
Subjt:  TIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCNAYNTSRQLWEALDK

Query:  KYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLVVKLRIEEDNRKRDK
        KYK EDA  KKF+VGKFLD+KMID+K+V++Q++ELQ+I+ ++ S+G+ +S+ FQVAAV EKLPP WKDFK Y KHKRKE+++E L+V+LRIEEDNRK +K
Subjt:  KYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLVVKLRIEEDNRKRDK

A0A6J1D271 uncharacterized protein LOC111016620  e value: 1.0e-91  % identity: 75.33
Query:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV
        M A++STN  AP MMGSTI K+H EKPEKFKGENFKRWQQKM+FY +TLNLAH++KE CP T  E +TPETEAAKQAW+H DFLC NYILS + DTLYNV
Subjt:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV

Query:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME
        YCNA++TSRQLWEALDKKYKLEDA TKKFLVGKFLDYKM+D KLVVN +EELQIIISDLQSEGL I+EPFQV  V EKL P+W++FKCY KHK+KELS+E
Subjt:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME

Query:  NLVVKLRIEEDNRKRDKSWLEVEARAH
        NL VKLRI+E+N K DK   + EA+AH
Subjt:  NLVVKLRIEEDNRKRDKSWLEVEARAH

A0A6J1D4C8 uncharacterized protein LOC111017193  e value: 1.2e-87  % identity: 74.11
Query:  SSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCN
        +++T+  AP+MMGSTI+K+H EK EKFKGENFKRWQQKMIFY +TLNLAH+LKE CP T  E +T ETEA KQA +H +FLC NYILS L DTL+NVYCN
Subjt:  SSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCN

Query:  AYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLV
        A++TSRQLWEALDKKYKLEDA TKKFLV KFLDYK+ID KLV+NQ+EELQII SDLQSE L I+EPFQ+ AV EKLPP+W++FK Y KHKRKELSMENL 
Subjt:  AYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLV

Query:  VKLRIEEDNRKRDKSWLEVEARAH
        VKLRIEEDNRK DK   + EA+AH
Subjt:  VKLRIEEDNRKRDKSWLEVEARAH

A0A6J1DSQ3 uncharacterized protein LOC111023572  e value: 1.6e-81  % identity: 73.73
Query:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV
        M A++STN  APTMMGSTIIK H EK EKF+G+NFK WQ KMIFYL+TLNLAH+L++ CP T  E + PETEAAKQAW+H DFL  NYIL+ L  TL NV
Subjt:  MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNV

Query:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME
        YCNA++TSRQLW+ LDKKYKLED  TKKFLVGKFLDYKM++ KLVVNQ+EELQII SDLQSEGL I+E FQVAAV E LP  W++FKCY KHKRK+LSME
Subjt:  YCNAYNTSRQLWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSME

Query:  NLVVKLRIEEDNRKRDK
        NL VKLRIEED RK DK
Subjt:  NLVVKLRIEEDNRKRDK

SwissProt top hits
No hits found
Arabidopsis top hits
No hits found
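
In the alignments above, the unlabeled middle line of each block marks identical residues with the residue letter and conservative substitutions with `+`. A rough sketch of how identities could be tallied from such a line follows. It assumes the midline is padded with spaces to the full block width, and it uses only the final block of the XP_022156727.1 alignment, so the percentage it prints covers that fragment alone and is not expected to match the whole-alignment value reported for the hit. The helper name `midline_stats` is hypothetical.

```python
def midline_stats(midline: str) -> tuple[int, int]:
    """Return (identical, positives) counted from a BLAST-style midline.

    Letters mark identical residues; '+' marks conservative substitutions.
    Positives include the identical positions.
    """
    identical = sum(1 for ch in midline if ch.isalpha())
    positives = identical + midline.count("+")
    return identical, positives


query = "NLVVKLRIEEDNRKRDK"    # last query block of the XP_022156727.1 alignment
midline = "NL VKLRIEED RK DK"  # matching midline, same width as the query block

identical, positives = midline_stats(midline)
percent_identity = 100.0 * identical / len(query)
print(identical, positives, round(percent_identity, 2))
```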

Sequences
CDS sequence
ATGGATGCAAGCTCCTCCACTAATGGTGTTGCTCCTACAATGATGGGATCAACCATCATCAAAACTCATGTTGAAAAACCAGAGAAATTCAAAGGAGAAAACTTCAAGAG
ATGGCAACAGAAGATGATCTTCTACCTCTCAACATTGAACCTTGCTCACCTCTTGAAGGAAGAATGTCCAATTACCCTACCAGAAGTTGTCACTCCTGAAACTGAAGCTG
CCAAACAAGCATGGATGCATATAGACTTCTTATGTCGCAATTATATACTAAGTGGTCTTCAAGACACCTTGTATAATGTCTACTGCAATGCTTATAATACATCAAGGCAA
TTGTGGGAGGCATTAGACAAGAAGTATAAGCTGGAAGATGCTGACACTAAGAAATTCCTTGTAGGAAAATTCTTAGATTATAAAATGATTGATGCCAAATTGGTAGTCAA
TCAGATGGAAGAATTGCAAATTATCATTAGTGATTTGCAAAGTGAAGGATTGGACATCAGTGAACCATTCCAAGTTGCTGCTGTGTTTGAGAAGTTGCCTCCTTCCTGGA
AGGACTTCAAATGCTATTTCAAACACAAGCGAAAGGAATTATCCATGGAGAATCTTGTTGTTAAACTCCGAATAGAAGAGGATAATAGAAAAAGAGATAAAAGTTGGCTA
GAAGTTGAAGCCAGAGCTCATGAATGGCGTCAACAAACACATTTGTAG
mRNA sequence
ATGGATGCAAGCTCCTCCACTAATGGTGTTGCTCCTACAATGATGGGATCAACCATCATCAAAACTCATGTTGAAAAACCAGAGAAATTCAAAGGAGAAAACTTCAAGAG
ATGGCAACAGAAGATGATCTTCTACCTCTCAACATTGAACCTTGCTCACCTCTTGAAGGAAGAATGTCCAATTACCCTACCAGAAGTTGTCACTCCTGAAACTGAAGCTG
CCAAACAAGCATGGATGCATATAGACTTCTTATGTCGCAATTATATACTAAGTGGTCTTCAAGACACCTTGTATAATGTCTACTGCAATGCTTATAATACATCAAGGCAA
TTGTGGGAGGCATTAGACAAGAAGTATAAGCTGGAAGATGCTGACACTAAGAAATTCCTTGTAGGAAAATTCTTAGATTATAAAATGATTGATGCCAAATTGGTAGTCAA
TCAGATGGAAGAATTGCAAATTATCATTAGTGATTTGCAAAGTGAAGGATTGGACATCAGTGAACCATTCCAAGTTGCTGCTGTGTTTGAGAAGTTGCCTCCTTCCTGGA
AGGACTTCAAATGCTATTTCAAACACAAGCGAAAGGAATTATCCATGGAGAATCTTGTTGTTAAACTCCGAATAGAAGAGGATAATAGAAAAAGAGATAAAAGTTGGCTA
GAAGTTGAAGCCAGAGCTCATGAATGGCGTCAACAAACACATTTGTAG
Protein sequence
MDASSSTNGVAPTMMGSTIIKTHVEKPEKFKGENFKRWQQKMIFYLSTLNLAHLLKEECPITLPEVVTPETEAAKQAWMHIDFLCRNYILSGLQDTLYNVYCNAYNTSRQ
LWEALDKKYKLEDADTKKFLVGKFLDYKMIDAKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVFEKLPPSWKDFKCYFKHKRKELSMENLVVKLRIEEDNRKRDKSWL
EVEARAHEWRQQTHL
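
The protein sequence can be cross-checked against the CDS by translating it. The sketch below assumes the standard genetic code (the record does not name a translation table) and stops at the first stop codon. `translate` is an illustrative helper, not a CuGenDB function. It is applied here to the first 30 nucleotides and to the final 48 nucleotides of the CDS listed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and whitespace
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


cds_start = "ATGGATGCAAGCTCCTCCACTAATGGTGTT"
cds_end = "GAAGTTGAAGCCAGAGCTCATGAATGGCGTCAACAAACACATTTGTAG"
print(translate(cds_start))  # begins the protein: MDASSSTNGV
print(translate(cds_end))    # ends the protein: EVEARAHEWRQQTHL
```

Both fragments agree with the start and end of the protein sequence listed above, with the terminal TAG read as the stop codon.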