CuGenDBv2

Tan0003340 (gene) of Snake gourd v1 genome

Gene ID: Tan0003340
Organism: Trichosanthes anguina (Snake gourd v1)
Description: Gag protease polyprotein
Genome location: LG02:34521624..34525155
RNA-Seq Expression: Tan0003340
Synteny: Tan0003340
Gene Ontology terms:
  GO:0009987 - cellular process (biological process)
  GO:0003824 - catalytic activity (molecular function)
InterPro domains: NA
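
The genome location above packs a sequence ID and two coordinates into one string. A minimal sketch of splitting it apart and computing the genomic span follows. The helper name `parse_location` is hypothetical, and the code assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re


def parse_location(location):
    """Split a 'SEQID:START..END' string into (seqid, start, end, span)."""
    match = re.fullmatch(r"([^:\s]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    seqid = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    # Assumption: 1-based inclusive coordinates, so the span is end - start + 1.
    return seqid, start, end, end - start + 1


seqid, start, end, span = parse_location("LG02:34521624..34525155")
print(seqid, start, end, span)
```

Under that assumption the gene spans 3532 bp on LG02. If the coordinates were 0-based or half-open, the span would differ by one.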


Homology
GenBank top hits (e value, % identity, alignment)
XP_022156662.1 uncharacterized protein LOC111023512 [Momordica charantia] | e value: 9.4e-19 | % identity: 52.81
Query:  VEAMQALVRATVATQLAQTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL
        +E +Q LV+ TV+ Q+ Q  Q +  +SIE +YL+DF+KYDPR FDG S DP +AE W+S +ETIFR   CLE ++V C  FML+DD +L
Subjt:  VEAMQALVRATVATQLAQTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL

XP_022938329.1 uncharacterized protein LOC111444463 [Cucurbita moschata] | e value: 1.4e-17 | % identity: 43.12
Query:  MPPRGRGKG-RGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDVSIE
        MPPR RG G RGR   +GR  GR  N   EN  E  QP+              P A P   Q  PN    +V+A+Q +++   A Q A      +  ++E
Subjt:  MPPRGRGKG-RGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDVSIE

Query:  LRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL
         +YL+DF++ DPR F G+S DPTVA++W+ SIET+F LTNC E  RV CA FMLR D  L
Subjt:  LRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL

XP_031737529.1 uncharacterized protein LOC116402422 [Cucumis sativus] | e value: 1.4e-17 | % identity: 42.26
Query:  MPP-----RGRGKGRGRGS-GRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQA-------LVRATVATQLA
        MPP     RG  +GRGRG+ GRGR AGR  N P E Q E   P + + H       +    +    Q    +MT + +  QA       +V    A   A
Subjt:  MPP-----RGRGKGRGRGS-GRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQA-------LVRATVATQLA

Query:  QTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
        Q  ++ N +S E ++L+DFRKYDP+ FDGS  DPT AE+W+SS+ETIF    C E  RV CAAF+LRD
Subjt:  QTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

XP_031742890.1 uncharacterized protein LOC116404512 [Cucumis sativus] | e value: 3.0e-17 | % identity: 40.33
Query:  MPPRG-------RGKGRGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQ------------------LGRQEDIPLATPTWGQTNPNVMTMMVEAMQ
        MPPRG       RG+GRG G GRGR AGR  N P E Q E   P + + H +                  + R +  P   P        V+  +  A  
Subjt:  MPPRG-------RGKGRGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQ------------------LGRQEDIPLATPTWGQTNPNVMTMMVEAMQ

Query:  ALVRATVATQLAQTGQV-QNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
        A          AQ  Q+  N +S E ++L+DFRKYDP+ FDGS  DPT AELW+SS+ETIF    C E  RV CAAF+LRD
Subjt:  ALVRATVATQLAQTGQV-QNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

XP_031744976.1 uncharacterized protein LOC116405198 [Cucumis sativus] | e value: 1.4e-17 | % identity: 42.26
Query:  MPP-----RGRGKGRGRGS-GRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQA-------LVRATVATQLA
        MPP     RG  +GRGRG+ GRGR AGR  N P E Q E   P + + H       +    +    Q    +MT + +  QA       +V    A   A
Subjt:  MPP-----RGRGKGRGRGS-GRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQA-------LVRATVATQLA

Query:  QTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
        Q  ++ N +S E ++L+DFRKYDP+ FDGS  DPT AE+W+SS+ETIF    C E  RV CAAF+LRD
Subjt:  QTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

TrEMBL top hits (e value, % identity, alignment)
A0A5A7T0M7 Reverse transcriptase | e value: 1.4e-15 | % identity: 40.88
Query:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQED--IPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDV
        MPPR RG  RG   GRGR AGRV    PE Q   +   P +++ H  L   E     L    W Q  P           A     V  Q+     V + +
Subjt:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQED--IPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDV

Query:  SIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
        S E ++L+DFRKY+P  FDGS  DPT A+LW+SS+ETIFR   C E ++V C  FML D
Subjt:  SIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

A0A5A7UP36 Reverse transcriptase | e value: 1.0e-15 | % identity: 41.51
Query:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQED--IPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDV
        MPPR RG  RG   G+GR AGRV    PE Q   +   P + + H  L   E     L    W Q  P           A   A VA Q+     V + +
Subjt:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQED--IPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDV

Query:  SIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
        S + ++L+DFRKY+P  FDGS  DPT A+LW+SS+ETIFR   C E ++V CA FML D
Subjt:  SIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

A0A5A7V810 Reverse transcriptase | e value: 1.4e-15 | % identity: 41.77
Query:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQND-VS
        MPPR RG  RG   GRGR AGRV    PE Q   +   P + + H  L   E          Q   +++  M E  Q    A V   +    QV +D +S
Subjt:  MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQ--EEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQND-VS

Query:  IELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD
         E ++L+DFRKY+P  FDGS  DPT A+LW+ S+ETIFR   C E ++V CA FML D
Subjt:  IELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRD

A0A6J1DSJ6 uncharacterized protein LOC111023512 | e value: 4.6e-19 | % identity: 52.81
Query:  VEAMQALVRATVATQLAQTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL
        +E +Q LV+ TV+ Q+ Q  Q +  +SIE +YL+DF+KYDPR FDG S DP +AE W+S +ETIFR   CLE ++V C  FML+DD +L
Subjt:  VEAMQALVRATVATQLAQTGQVQNDVSIELRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL

A0A6J1FDR9 uncharacterized protein LOC111444463 | e value: 6.6e-18 | % identity: 43.12
Query:  MPPRGRGKG-RGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDVSIE
        MPPR RG G RGR   +GR  GR  N   EN  E  QP+              P A P   Q  PN    +V+A+Q +++   A Q A      +  ++E
Subjt:  MPPRGRGKG-RGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDVSIE

Query:  LRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL
         +YL+DF++ DPR F G+S DPTVA++W+ SIET+F LTNC E  RV CA FMLR D  L
Subjt:  LRYLKDFRKYDPRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL

SwissProt top hits
No hits found

Arabidopsis top hits
No hits found

Sequences
CDS sequence
ATGCCGCCAAGAGGAAGAGGAAAAGGTCGAGGTAGGGGAAGTGGCAGAGGTAGGGTGGCAGGAAGAGTAGGTAACCTACCACCAGAGAACCAAGAGGAGATCCTCCAACC
TATATCGAACCTTCTGCATGATCAACTAGGTCGTCAAGAAGATATTCCTTTAGCAACCCCAACATGGGGACAGACTAATCCAAATGTTATGACCATGATGGTGGAGGCAA
TGCAGGCCTTAGTGCGAGCTACAGTGGCCACCCAGTTGGCTCAGACAGGTCAAGTACAAAATGATGTGTCAATTGAGCTCAGATACCTAAAAGATTTTAGGAAATATGAC
CCACGACCGTTTGATGGGTCTTCCCATGATCCTACAGTGGCGGAGTTATGGGTGTCCTCGATTGAAACAATTTTTAGATTGACGAATTGCTTGGAATACCGGAGGGTATC
TTGTGCAGCTTTTATGTTGAGAGATGATACCTATTTGTGA
mRNA sequence
TCTCCTTCTTCGACCAAGCAGCTGTCGACGCCTCTCCTTCTCCCCCATCCAGTCGTCACCTTCACATCGACAGCCCAAGTCTGCAACCCACCGAAGACCTGCAAGTCGTT
CGACCGTCAATTCCAACATTTGAACCCATGAGCAGCTGCATATCAGGAAGTGACGTCGGAATTGCACGTCGTTCTGTTCGTGGGTGTCGTTAATCATCATTTGACCGGAA
CGACTGGGCGTGCGAGAATCGAGCTACGCCTTCTCAGCCTCAGAACTCGCCGAAGTCGTCGACCTGCAAGTGTAGAGTCGAACGTCAGCCACCAGAAGTGTTATCATCGA
TTCCTTGACGTCACCTATTCAACTCTTGGTTCGAATTTTTGAGTGGGTTTTCAGCCGGTTTTTGAAGTCTAGTGATCGGTTCTGTGTAGATCGACGAGTGGGCATTCCAA
AACTAGGTTCCACGCGCTGAGGCGTACTATTGCGCCGCCACACACGCTGCCGCCGGCCGCCATCGTTTGCCGAAGCTTGTTCGATTCTTTTAGTTTCGCTCTAGAGTTCA
TTTTCAGCCTTCGTTAGTGGAATCCGTTCATTGGCCAGACGGGAGGACATCAAAGTGAGAACAGATTTTAGTACAGTGCAAAACTAAATCTAAGGTAAAGGCAAAGGCGT
ACAGGGCGAGTGACGAGACGAGTTAGAAGGCTATGGGACAAAGCCATGATTTCCCGCATTTATGTCTTAATAGTTATGATTAAACGATTTTGATTATGCATTAAGATAAT
AGAGAATCAAATTGAAGCCAAACTGATTAATGTTGTTTTTAAATGTTTTGATTTTTCTTAGGGATCCCTAAATGTAGTTTTGGTTGTGCTTTAAAGTTAAACTATTCTTA
TTATATTTTTGGCCTAAAATGCATGCATGAGTAGTAACGTCTCTAGGAGTCGAGTTATAAAAGTTAGGGGCATTATAGTTGGTACCAGAGCCCTAGGTTTGGGTTATGTA
GACTTGCTTACATCGTAAGCACTATTATCCCATGGCTAAGTAATCGATCCCAGTCACCGCAAGGTATGTCTTTGAATTTATAAATATGAATAACTGTTTTACCATGATTG
AATGTTATGTTATTGAGACTGATTTGAATGATTAAATGTGATCCCTACCTTATGGGCAGCAAAGGACTAGACTGAATTAGAGAATCCTCTTATGTTGTAGTACTGAGGAA
TATGCCGCCAAGAGGAAGAGGAAAAGGTCGAGGTAGGGGAAGTGGCAGAGGTAGGGTGGCAGGAAGAGTAGGTAACCTACCACCAGAGAACCAAGAGGAGATCCTCCAAC
CTATATCGAACCTTCTGCATGATCAACTAGGTCGTCAAGAAGATATTCCTTTAGCAACCCCAACATGGGGACAGACTAATCCAAATGTTATGACCATGATGGTGGAGGCA
ATGCAGGCCTTAGTGCGAGCTACAGTGGCCACCCAGTTGGCTCAGACAGGTCAAGTACAAAATGATGTGTCAATTGAGCTCAGATACCTAAAAGATTTTAGGAAATATGA
CCCACGACCGTTTGATGGGTCTTCCCATGATCCTACAGTGGCGGAGTTATGGGTGTCCTCGATTGAAACAATTTTTAGATTGACGAATTGCTTGGAATACCGGAGGGTAT
CTTGTGCAGCTTTTATGTTGAGAGATGATACCTATTTGTGATGGGAGTCCGCTCAAAGGACCATGGACACTAATGGAGAACTAATCACTTGGAATATGTTTAGAGAGGCA
TTCTGGCACAAATTTTACCCAACCACGACTCAGTATAGGAAGCAAGTTGAGTTCCTACAACTCTGTCAGAATAGAAGACCTATAGAGGATTATGAGAGAGAATTCACACA
ATTAATGCGTTTTGCTCCAGAACTGGTGGACACCGAGGCTAAGTAAGTAGAACAAATTGTTATGGGACTAGATGAGGGTATTCGAGGATTCATTCTGCACTTTCACCCCC
GGATTATGCTTCAGCAGTAAGAGTAGTTGAATTGATTGGTGTTCAGTCTCATAGTGTGCAACAGGAAATAGTTAACCCGAGTCAGCCACTTTCAGGCTATAAAAGGAAGT
GGGATTAGGAAGGTTCTGATCTCCAGCTTTATCAGCAACCTTCGAGATCATCGAACGATTCACATTCCACCCCTAGTCAGAGACAACCAGTTCGAACAGGTAAAGATGTG
GTGAGAAACCACGCTATAAGGAATGTGGAAGATATCATTGGGGCAAGTGTTTAGCTCGTTTTAGGGAATGTTTCAGATGCAAGAAAGAGGGGCACAGAGCAGAGAGTTGT
CTCAATCAAGGTACCATGGATGATCAACCTTCCTGATCGAATGGAGCTGGATCTTTTGAACAAACGACTGAACAAGGAAGAGCTTTTGCCAGTACAAGTCGAGACACTAG
CAACTTCGATCCAACGATCACAAGTACATTTCTCGTACTTAGATACTTTATTTAACTTCCCTCCAGGATGTATACATTTGTTGTTCTATGCATGTTTTAGAGTTAGGGTT
GCTAACTTTTGTCTGGTGGTTGCCACTTCGATGGGAGTGAGTTGTTTGGTTATTGAAATGATTAAAGTTTTAAGTTGA
Protein sequence
MPPRGRGKGRGRGSGRGRVAGRVGNLPPENQEEILQPISNLLHDQLGRQEDIPLATPTWGQTNPNVMTMMVEAMQALVRATVATQLAQTGQVQNDVSIELRYLKDFRKYD
PRPFDGSSHDPTVAELWVSSIETIFRLTNCLEYRRVSCAAFMLRDDTYL
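
The protein sequence should be the conceptual translation of the CDS listed earlier. A minimal sketch of that check follows, assuming the standard genetic code (NCBI translation table 1). The record does not say which table was used, and the `translate` helper is hypothetical. To keep the example short, only the first 21 and last 18 nucleotides of the CDS are translated.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 21 nt and last 18 nt of the CDS sequence shown in this record.
print(translate("ATGCCGCCAAGAGGAAGAGGA"))  # MPPRGRG
print(translate("GATGATACCTATTTGTGA"))     # DDTYL*
```

Both fragments agree with the start (MPPRGRG) and end (DDTYL) of the protein sequence above, with the final TGA codon read as the stop. Passing the full CDS, with or without its line breaks, to `translate` would extend the same check to the whole sequence.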