CuGenDBv2

CSPI03G20720 (gene) of Cucumber (PI 183967) v1 genome

Gene ID: CSPI03G20720
Organism: Cucumis sativus L. var. sativus cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: Chr3:16745740..16746695
RNA-Seq Expression: CSPI03G20720
Synteny: CSPI03G20720
Gene Ontology terms:
    GO:0015074 - DNA integration (biological process)
    GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
    IPR001584 - Integrase, catalytic core
    IPR012337 - Ribonuclease H-like superfamily
    IPR036397 - Ribonuclease H superfamily
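
The genome location in this record uses a `Chr:start..end` notation. The following is a minimal Python sketch of how such a string could be split into its parts and the locus span computed. It assumes the coordinates are 1-based and inclusive, which the record itself does not state; `Location` and `parse_location` are hypothetical names introduced only for illustration.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A genomic interval as written in the record (assumed 1-based, inclusive)."""
    chrom: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Inclusive coordinates: both end points count toward the length.
        return self.end - self.start + 1


def parse_location(text: str) -> Location:
    """Parse a string such as 'Chr3:16745740..16746695' (hypothetical helper)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    chrom, start, end = match.groups()
    return Location(chrom, int(start), int(end))


loc = parse_location("Chr3:16745740..16746695")
print(loc.chrom, loc.start, loc.end, loc.span)
```

Under the inclusive assumption, the span for this record works out to 956 bp.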


Homology
GenBank top hits (e value, % identity, alignment)
KAA0033341.1 putative gag-pol polyprotein, identical [Cucumis melo var. makuwa]; e value: 5.3e-70; % identity: 71.01
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQK A+V FIDSGCSN MT LKP+F EL EGEKLKVELGN KELQVE KGTVGIETHHGNRILTNVQY  DIGYNLLSVGQLMESGHSILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
                                              RKSE  EKFKHF AKVEKQSGMF+KSLRSDRGGEFLSNNFNHFC+E GIHREL TPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

KAA0061308.1 UBN2 domain-containing protein [Cucumis melo var. makuwa]; e value: 1.5e-69; % identity: 71.5
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQKTAEV FIDSGCSNHMT LKP+F EL EGEKLKVEL N K+LQVE KGTVGIET+ GNRILTNVQYVPDIGYNLLSV QLMESG+SILFDD 
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
               KS    +      +N  E  +L        +++    EKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFN+FC+EHGI RELTTPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

TYK21566.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa]; e value: 5.3e-70; % identity: 71.01
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQK A+V FIDSGCSN MT LKP+F EL EGEKLKVELGN KELQVE KGTVGIETHHGNRILTNVQY  DIGYNLLSVGQLMESGHSILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
                                              RKSE  EKFKHF AKVEKQSGMF+KSLRSDRGGEFLSNNFNHFC+E GIHREL TPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

TYK28117.1 putative gag-pol polyprotein, identical [Cucumis melo var. makuwa]; e value: 8.4e-68; % identity: 60.87
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQKT EV FIDSG  NHMT LKP+F EL EGEKLKVELGN KELQVE K T+GIETH+GNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDD 
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKS-------------VLSISVRDV-----------CMENKLE--------------NLYLLGKLGEPR--------KRKSETSEKFKHFKAKV
          L   +++             +  + V +V             +N  E              + Y L  + +          K +SET EKFKHFKAKV
Subjt:  IWLFGYRKS-------------VLSISVRDV-----------CMENKLE--------------NLYLLGKLGEPR--------KRKSETSEKFKHFKAKV

Query:  EKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        EKQSGMFIKS RSDRGG+FLSNNFNHFCEEHGIHRELTTPYT EQNG AERKN
Subjt:  EKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

XP_031739225.1 uncharacterized protein LOC101208246 [Cucumis sativus]; e value: 1.1e-83; % identity: 80.19
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        MAN+TSDQKTAEVCFIDSGCSNHMTGLKPIFNEL EGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
                                              RKSETSEKFKHFKAKVEKQSGMFIKSLRSD GGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        GKAERKN
Subjt:  GKAERKN

TrEMBL top hits (e value, % identity, alignment)
A0A5A7SV62 Putative gag-pol polyprotein, identical; e value: 2.5e-70; % identity: 71.01
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQK A+V FIDSGCSN MT LKP+F EL EGEKLKVELGN KELQVE KGTVGIETHHGNRILTNVQY  DIGYNLLSVGQLMESGHSILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
                                              RKSE  EKFKHF AKVEKQSGMF+KSLRSDRGGEFLSNNFNHFC+E GIHREL TPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

A0A5A7TSJ0 DUF4219 domain-containing protein/UBN2 domain-containing protein; e value: 1.5e-67; % identity: 70.85
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQKT EV FIDS CSNHMTGLK +F EL EGEKLKVEL N KELQVE KGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQL+ESG+SILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQ
                                               KS+T EKFKHFKAKV+KQSG+FIKSLRSDRGGEFL NNFNHFCEEHGIHRELTTPYTPEQ
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQ

A0A5A7V170 UBN2 domain-containing protein; e value: 7.4e-70; % identity: 71.5
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQKTAEV FIDSGCSNHMT LKP+F EL EGEKLKVEL N K+LQVE KGTVGIET+ GNRILTNVQYVPDIGYNLLSV QLMESG+SILFDD 
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
               KS    +      +N  E  +L        +++    EKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFN+FC+EHGI RELTTPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

A0A5D3DDC3 Retrovirus-related Pol polyprotein from transposon TNT 1-94; e value: 2.5e-70; % identity: 71.01
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQK A+V FIDSGCSN MT LKP+F EL EGEKLKVELGN KELQVE KGTVGIETHHGNRILTNVQY  DIGYNLLSVGQLMESGHSILFDDE
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN
                                              RKSE  EKFKHF AKVEKQSGMF+KSLRSDRGGEFLSNNFNHFC+E GIHREL TPYTPEQN
Subjt:  IWLFGYRKSVLSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQN

Query:  GKAERKN
        G AERKN
Subjt:  GKAERKN

A0A5D3DWC7 Putative gag-pol polyprotein, identical; e value: 4.1e-68; % identity: 60.87
Query:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE
        M NI SDQKT EV FIDSG  NHMT LKP+F EL EGEKLKVELGN KELQVE K T+GIETH+GNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDD 
Subjt:  MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDE

Query:  IWLFGYRKS-------------VLSISVRDV-----------CMENKLE--------------NLYLLGKLGEPR--------KRKSETSEKFKHFKAKV
          L   +++             +  + V +V             +N  E              + Y L  + +          K +SET EKFKHFKAKV
Subjt:  IWLFGYRKS-------------VLSISVRDV-----------CMENKLE--------------NLYLLGKLGEPR--------KRKSETSEKFKHFKAKV

Query:  EKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        EKQSGMFIKS RSDRGG+FLSNNFNHFCEEHGIHRELTTPYT EQNG AERKN
Subjt:  EKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

SwissProt top hits (e value, % identity, alignment)
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94; e value: 4.4e-11; % identity: 36.04
Query:  SVLSISVRDVCMENKLE----NLYLLGKLGEPR--------KRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYT
        ++L +   DVC   ++E    N Y +  + +          K K +  + F+ F A VE+++G  +K LRSD GGE+ S  F  +C  HGI  E T P T
Subjt:  SVLSISVRDVCMENKLE----NLYLLGKLGEPR--------KRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYT

Query:  PEQNGKAERKN
        P+ NG AER N
Subjt:  PEQNGKAERKN

P14350 Pro-Pol polyprotein; e value: 1.0e-04; % identity: 46.67
Query:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        K + SD+G  F S+ F  + +E GIH E +TPY P+   K ERKN
Subjt:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

P23074 Pro-Pol polyprotein; e value: 1.4e-04; % identity: 48.89
Query:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        K L SD+G  F S+ F  + +E GI  E +TPY P+ +GK ERKN
Subjt:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

Q87040 Pro-Pol polyprotein; e value: 1.6e-05; % identity: 48.89
Query:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        K + SD+G  F S+ F  + +E GIH E +TPY P+ +GK ERKN
Subjt:  KSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2; e value: 2.8e-05; % identity: 40.28
Query:  PRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
        P K+KS+  + F  FK+ VE +    I +L SD GGEF+      +  +HGI    + P+TPE NG +ERK+
Subjt:  PRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN

Arabidopsis top hits (e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGGCAAACATCACTAGTGATCAAAAGACAGCGGAGGTGTGTTTCATTGATAGCGGGTGTTCGAATCACATGACAGGCTTGAAGCCTATATTCAATGAGCTTAAAGAAGG
AGAAAAGTTGAAGGTGGAACTTGGAAACAACAAGGAGCTACAAGTAGAACGCAAAGGAACGGTGGGAATTGAAACTCACCATGGAAATAGAATTCTCACAAATGTTCAGT
ATGTGCCCGATATTGGATATAATTTGCTGAGTGTTGGACAGCTAATGGAGAGTGGGCATTCTATCTTGTTTGATGATGAGATATGGTTATTCGGTTATCGAAAATCAGTG
CTATCAATATCTGTGAGGGATGTGTGTATGGAAAACAAACTCGAAAATCTTTACCTATTGGGAAAGCTTGGAGAGCCTCGAAAAAGAAAATCAGAAACATCTGAGAAGTT
CAAGCATTTCAAGGCAAAGGTAGAAAAGCAAAGTGGCATGTTCATCAAATCTCTTCGCAGTGATAGAGGTGGAGAATTTTTGTCCAACAACTTCAACCATTTTTGTGAGG
AACATGGCATCCATAGGGAGTTGACAACACCTTACACTCCAGAGCAAAATGGGAAAGCTGAGAGGAAGAATTGA
mRNA sequence
ATGGCAAACATCACTAGTGATCAAAAGACAGCGGAGGTGTGTTTCATTGATAGCGGGTGTTCGAATCACATGACAGGCTTGAAGCCTATATTCAATGAGCTTAAAGAAGG
AGAAAAGTTGAAGGTGGAACTTGGAAACAACAAGGAGCTACAAGTAGAACGCAAAGGAACGGTGGGAATTGAAACTCACCATGGAAATAGAATTCTCACAAATGTTCAGT
ATGTGCCCGATATTGGATATAATTTGCTGAGTGTTGGACAGCTAATGGAGAGTGGGCATTCTATCTTGTTTGATGATGAGATATGGTTATTCGGTTATCGAAAATCAGTG
CTATCAATATCTGTGAGGGATGTGTGTATGGAAAACAAACTCGAAAATCTTTACCTATTGGGAAAGCTTGGAGAGCCTCGAAAAAGAAAATCAGAAACATCTGAGAAGTT
CAAGCATTTCAAGGCAAAGGTAGAAAAGCAAAGTGGCATGTTCATCAAATCTCTTCGCAGTGATAGAGGTGGAGAATTTTTGTCCAACAACTTCAACCATTTTTGTGAGG
AACATGGCATCCATAGGGAGTTGACAACACCTTACACTCCAGAGCAAAATGGGAAAGCTGAGAGGAAGAATTGA
Protein sequence
MANITSDQKTAEVCFIDSGCSNHMTGLKPIFNELKEGEKLKVELGNNKELQVERKGTVGIETHHGNRILTNVQYVPDIGYNLLSVGQLMESGHSILFDDEIWLFGYRKSV
LSISVRDVCMENKLENLYLLGKLGEPRKRKSETSEKFKHFKAKVEKQSGMFIKSLRSDRGGEFLSNNFNHFCEEHGIHRELTTPYTPEQNGKAERKN
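
The CDS and protein sequences listed above are related by translation. The following is a minimal Python sketch of that step, assuming the standard nuclear genetic code (the record does not name a translation table); `CODON_TABLE` and `translate` are hypothetical names introduced for illustration. Unknown or ambiguous codons are reported as `X`, and translation stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    # Remove line breaks and spaces so wrapped sequence blocks can be pasted in.
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unrecognised codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 13 codons of the CDS given in this record.
cds_start = "ATGGCAAACATCACTAGTGATCAAAAGACAGCGGAGGTG"
print(translate(cds_start))  # MANITSDQKTAEV
```

Pasting the full CDS block into `translate` and comparing the result with the listed protein sequence is one way to check that the two fields are consistent.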