CuGenDBv2

Clc03G04620 (gene) of Watermelon (cordophanus) v2 genome

Gene ID: Clc03G04620
Organism: Citrullus lanatus subsp. cordophanus (Watermelon (cordophanus) v2)
Description: Retrovirus-related Pol polyprotein from transposon RE1
Genome location: ClcChr03:4507527..4508346
RNA-Seq Expression: Clc03G04620
Synteny: Clc03G04620
Gene Ontology terms: NA
InterPro domains: NA


Homology
GenBank top hits (e value, % identity, alignment)
KAA0055390.1 uncharacterized protein E6C27_scaffold80G002350 [Cucumis melo var. makuwa]; e value: 1.7e-27; % identity: 45.29
Query:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS
        M  +RGI  P  S         +  APP+PEEQQ                       +  I  P+  +K+ P S  QT+QP NP+ KI T   T S  PS
Subjt:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS

Query:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA
         S+LK G              K+  SPN SPK  + FG+     S+S + IIL KIY EIA HRQGNSSITSYF KL+ LWDQLAK+Y+D LA QYSSDA
Subjt:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA

Query:  IEMLREHMEREKVMQFLVGLNDS
        I ML EHMEREK+M  +   NDS
Subjt:  IEMLREHMEREKVMQFLVGLNDS

TYJ99320.1 hypothetical protein E5676_scaffold248G005340 [Cucumis melo var. makuwa]; e value: 2.7e-49; % identity: 53.26
Query:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS
        M  +RGI  P  S         +  APP+PEEQQ                       +  I  P+  +K+ P S  QT+QP NP+TKI T   T S  PS
Subjt:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS

Query:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA
         S+LK G              K+  SPN SPK  + FG+     S+S + IIL KIY EIA HRQGNSSITSYF KL+ LWDQLAK+Y+D LA QYSSDA
Subjt:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA

Query:  IEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQ
        I ML EHMEREKV+QFLVGLNDSYSS CSKIL MRPFPTVDTAFSVIIRKE RR+ + LSQ
Subjt:  IEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQ

XP_004154160.1 uncharacterized protein LOC101211799 [Cucumis sativus]; e value: 3.5e-25; % identity: 37.96
Query:  MPVRGIFSPLRSSERKLIAPPNPEEQQNLSIHLPKSIVKMAPSRR-QTEQPTNPNTKIPTTKSLTPSNSSLKGGFGYNSSRLSLNGS----RKYFD-SPN
        M  RGI SP RS  +K   PP          HLP S    AP    Q+E P+  + +  T  +  PS S+ KG  G + SR   +GS    +K  D SPN
Subjt:  MPVRGIFSPLRSSERKLIAPPNPEEQQNLSIHLPKSIVKMAPSRR-QTEQPTNPNTKIPTTKSLTPSNSSLKGGFGYNSSRLSLNGS----RKYFD-SPN

Query:  DSPKPYRSFG-------------------------------------TKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDN
         SP P ++                                        +++  S S +  +LFKIY  IASHRQGN+SI SYFK L+ LWD+ A S    
Subjt:  DSPKPYRSFG-------------------------------------TKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDN

Query:  LADQYSSDAIEMLREH-MEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQEVV
         + Q SS+   + +   MEREK+MQFL+GLNDSYSS+CS+ILL RP PTVD A+S+II +E+ R+S +  + ++
Subjt:  LADQYSSDAIEMLREH-MEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQEVV

XP_022137024.1 uncharacterized protein LOC111008588 [Momordica charantia]; e value: 1.2e-25; % identity: 50
Query:  YRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILL
        Y S G+     S   +   +F+IY +IASHRQ NSS+TSYF KL++LWD+L ++YSD++    S  A+E L  H+EREKVMQFL+GLN+SYS+IC +ILL
Subjt:  YRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILL

Query:  MRPFPTVDTAFSVIIRKERRRRSIILSQEV---VDEEKVSSQND
        ++PFPT++ A+S+IIR+E+R   +   + V   V E K   QND
Subjt:  MRPFPTVDTAFSVIIRKERRRRSIILSQEV---VDEEKVSSQND

XP_038895287.1 hybrid signal transduction histidine kinase L-like isoform X3 [Benincasa hispida]; e value: 2.1e-25; % identity: 38.31
Query:  SERKLIAPPNPEEQQNLSIHLPKSIVKMAPSRRQTEQPTNPNTKIPTTKSLTPSNSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKP-------------
        S R   A  NP  Q+ L  +    I+   PS  +T +P    T+ PT  S     + +    G + S LS  G  KYF S N SP               
Subjt:  SERKLIAPPNPEEQQNLSIHLPKSIVKMAPSRRQTEQPTNPNTKIPTTKSLTPSNSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKP-------------

Query:  -----------------------------YRSFG--TKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIE
                                     Y S G  TKE   S+S     +F+IY EIA HRQ NSSITSYF KLE LWD+LA ++  +L       A E
Subjt:  -----------------------------YRSFG--TKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIE

Query:  MLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQEV
         L E+MEREKVMQFLVGLNDSYS IC++ILL  PFPT++ A+S +IR+E+ R  ++  + V
Subjt:  MLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQEV

TrEMBL top hits (e value, % identity, alignment)
A0A5A7UJM8 Uncharacterized protein; e value: 8.1e-28; % identity: 45.29
Query:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS
        M  +RGI  P  S         +  APP+PEEQQ                       +  I  P+  +K+ P S  QT+QP NP+ KI T   T S  PS
Subjt:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS

Query:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA
         S+LK G              K+  SPN SPK  + FG+     S+S + IIL KIY EIA HRQGNSSITSYF KL+ LWDQLAK+Y+D LA QYSSDA
Subjt:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA

Query:  IEMLREHMEREKVMQFLVGLNDS
        I ML EHMEREK+M  +   NDS
Subjt:  IEMLREHMEREKVMQFLVGLNDS

A0A5D3BHR0 Uncharacterized protein; e value: 1.3e-49; % identity: 53.26
Query:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS
        M  +RGI  P  S         +  APP+PEEQQ                       +  I  P+  +K+ P S  QT+QP NP+TKI T   T S  PS
Subjt:  MMPVRGIFSPLRS------SERKLIAPPNPEEQQ-----------------------NLSIHLPKSIVKMAP-SRRQTEQPTNPNTKIPT---TKSLTPS

Query:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA
         S+LK G              K+  SPN SPK  + FG+     S+S + IIL KIY EIA HRQGNSSITSYF KL+ LWDQLAK+Y+D LA QYSSDA
Subjt:  NSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDA

Query:  IEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQ
        I ML EHMEREKV+QFLVGLNDSYSS CSKIL MRPFPTVDTAFSVIIRKE RR+ + LSQ
Subjt:  IEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTAFSVIIRKERRRRSIILSQ

A0A6J1C5Z8 uncharacterized protein LOC111008588; e value: 5.8e-26; % identity: 50
Query:  YRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILL
        Y S G+     S   +   +F+IY +IASHRQ NSS+TSYF KL++LWD+L ++YSD++    S  A+E L  H+EREKVMQFL+GLN+SYS+IC +ILL
Subjt:  YRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILL

Query:  MRPFPTVDTAFSVIIRKERRRRSIILSQEV---VDEEKVSSQND
        ++PFPT++ A+S+IIR+E+R   +   + V   V E K   QND
Subjt:  MRPFPTVDTAFSVIIRKERRRRSIILSQEV---VDEEKVSSQND

A0A6J1C7L7 uncharacterized protein LOC111008986; e value: 4.2e-24; % identity: 56.69
Query:  DSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSY
        +S ++S  PY    TKEE   +S ++ IL +IY +IASHRQGNSSITSYF KLE LW++L ++YSD       S   +   + +EREKVMQFLVGLNDSY
Subjt:  DSPNDSPKPYRSFGTKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSY

Query:  SSICSKILLMRPFPTVDTAFSVIIRKE
        S+ICS+ILL+RPFPTV+ A+S+II +E
Subjt:  SSICSKILLMRPFPTVDTAFSVIIRKE

A0A6J1GTG4 serine/arginine repetitive matrix protein 1-like; e value: 3.2e-24; % identity: 53.6
Query:  KEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPT
        +EE  S+  +   +F+IY EIASH QGNSSITSY  KL+ LWD+L ++Y D    + S  + E   E +EREKVMQFL+GLNDSYS+IC++IL M+PFPT
Subjt:  KEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPT

Query:  VDTAFSVIIRKERRRRSIILSQEVV
        V+ A   I+R+E +RR ++LS E+V
Subjt:  VDTAFSVIIRKERRRRSIILSQEVV

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). e value: 1.9e-05; % identity: 25.25
Query:  LFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAK--SYSDNLADQYSSDAIEMLREHMEREKVMQFLVG--LNDSYSSICSKILLMRPFPTVDTAFSVI
        ++++   +A+ RQG  S+  YF KL  +W +L++     +      + +  +   E  E+E+  +FL+G  LN  + ++ +KI+  +P P++  AF+++
Subjt:  LFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAK--SYSDNLADQYSSDAIEMLREHMEREKVMQFLVG--LNDSYSSICSKILLMRPFPTVDTAFSVI
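
The % identity values in the hit tables can be related to the alignment rows. In BLAST-style output the unlabeled middle row prints a letter where query and subject residues are identical, `+` for a conservative substitution, and a space for a mismatch or gap. Percent identity is then the number of letters in the middle row divided by the alignment length, gaps included. The sketch below illustrates that arithmetic only. The function name `percent_identity` and the toy alignment are hypothetical, and the caller is assumed to have already stripped the `Query:  ` label and the matching leading spaces of the middle row.

```python
from typing import Iterable, Tuple


def percent_identity(blocks: Iterable[Tuple[str, str]]) -> float:
    """Percent identity over one hit, given (query_row, middle_row) blocks.

    query_row  -- aligned query residues, '-' marks a gap
    middle_row -- letter = identical residue, '+' = similar, ' ' = other
    Both rows are assumed to start at the same alignment column.
    """
    identical = 0
    aln_len = 0
    for query_row, middle_row in blocks:
        aln_len += len(query_row)
        # Trailing spaces of the middle row are often lost in text dumps,
        # so only look at the columns that the query row actually covers.
        identical += sum(1 for ch in middle_row[:len(query_row)] if ch.isalpha())
    if aln_len == 0:
        raise ValueError("empty alignment")
    return 100.0 * identical / aln_len


# Toy alignment (hypothetical, not taken from this record).
print(percent_identity([("MKV-LAGT", "MK  +AG ")]))  # 4 identities / 8 columns
```

A hit whose alignment is wrapped over several Query/Subjt blocks is passed as several tuples, so that identities and columns are summed before dividing.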


Sequences
CDS sequence
ATGATGCCGGTGAGAGGCATTTTCAGTCCCCTGAGATCCTCAGAGCGAAAACTGATAGCGCCGCCCAATCCGGAGGAGCAACAAAATCTCAGTATTCACTTACCA
AAATCCATTGTAAAAATGGCCCCCTCACGCCGACAAACCGAACAACCCACAAATCCCAACACTAAAATCCCCACAACAAAATCACTCACGCCTTCAAACAGCAGT
TTAAAAGGTGGGTTTGGCTATAATTCGTCAAGATTATCCCTTAATGGGAGTCGTAAATATTTTGATTCTCCAAATGATTCTCCCAAGCCTTACCGGAGCTTCGGT
ACAAAAGAAGAATTTGGTTCTCGAAGCAAGTCTCAAATAATATTATTTAAAATTTACATGGAAATTGCTTCTCATCGTCAAGGAAACTCATCTATTACATCTTAC
TTCAAAAAGCTGGAGGTATTATGGGATCAACTTGCAAAAAGCTACAGTGATAATTTGGCAGATCAATATTCATCCGATGCAATTGAGATGCTGAGGGAGCATATG
GAAAGGGAAAAGGTAATGCAATTTCTTGTTGGACTAAATGATTCTTATTCCTCTATTTGTTCTAAAATACTTCTTATGAGGCCATTTCCAACGGTGGACACAGCT
TTTTCTGTAATAATTCGAAAAGAAAGACGAAGGAGATCGATTATTTTGTCACAAGAAGTTGTTGATGAGGAAAAAGTAAGTAGCCAAAACGATTAG
mRNA sequence
ATGATGCCGGTGAGAGGCATTTTCAGTCCCCTGAGATCCTCAGAGCGAAAACTGATAGCGCCGCCCAATCCGGAGGAGCAACAAAATCTCAGTATTCACTTACCA
AAATCCATTGTAAAAATGGCCCCCTCACGCCGACAAACCGAACAACCCACAAATCCCAACACTAAAATCCCCACAACAAAATCACTCACGCCTTCAAACAGCAGT
TTAAAAGGTGGGTTTGGCTATAATTCGTCAAGATTATCCCTTAATGGGAGTCGTAAATATTTTGATTCTCCAAATGATTCTCCCAAGCCTTACCGGAGCTTCGGT
ACAAAAGAAGAATTTGGTTCTCGAAGCAAGTCTCAAATAATATTATTTAAAATTTACATGGAAATTGCTTCTCATCGTCAAGGAAACTCATCTATTACATCTTAC
TTCAAAAAGCTGGAGGTATTATGGGATCAACTTGCAAAAAGCTACAGTGATAATTTGGCAGATCAATATTCATCCGATGCAATTGAGATGCTGAGGGAGCATATG
GAAAGGGAAAAGGTAATGCAATTTCTTGTTGGACTAAATGATTCTTATTCCTCTATTTGTTCTAAAATACTTCTTATGAGGCCATTTCCAACGGTGGACACAGCT
TTTTCTGTAATAATTCGAAAAGAAAGACGAAGGAGATCGATTATTTTGTCACAAGAAGTTGTTGATGAGGAAAAAGTAAGTAGCCAAAACGATTAG
Protein sequence
MMPVRGIFSPLRSSERKLIAPPNPEEQQNLSIHLPKSIVKMAPSRRQTEQPTNPNTKIPTTKSLTPSNSSLKGGFGYNSSRLSLNGSRKYFDSPNDSPKPYRSFG
TKEEFGSRSKSQIILFKIYMEIASHRQGNSSITSYFKKLEVLWDQLAKSYSDNLADQYSSDAIEMLREHMEREKVMQFLVGLNDSYSSICSKILLMRPFPTVDTA
FSVIIRKERRRRSIILSQEVVDEEKVSSQND
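
The CDS and protein sequences can be checked against each other by translating the CDS codon by codon. The following is a minimal sketch, assuming the standard nuclear genetic code (NCBI translation table 1). The helper name `translate` is hypothetical, and the example input is only the first line of the CDS shown above, not the full sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a CDS into protein, stopping at the first stop codon.

    Whitespace and line breaks are ignored. Ambiguous bases such as N are
    not handled and raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# First line of the CDS above (105 nt, 35 codons).
first_line = (
    "ATGATGCCGGTGAGAGGCATTTTCAGTCCCCTGAGATCCTCAGAGCGAAAACTGATAGCG"
    "CCGCCCAATCCGGAGGAGCAACAAAATCTCAGTATTCACTTACCA"
)
print(translate(first_line))  # MMPVRGIFSPLRSSERKLIAPPNPEEQQNLSIHLP
```

The printed peptide matches the first 35 residues of the protein sequence listed above. Passing the complete CDS (all seven lines) to the same function allows a comparison with the full protein.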