CuGenDBv2

CmoCh17G005660 (gene) of Cucurbita moschata (Rifu) v1 genome

Gene ID: CmoCh17G005660
Organism: Cucurbita moschata Rifu (Cucurbita moschata (Rifu) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: Cmo_Chr17:5819839..5820975
RNA-Seq Expression: CmoCh17G005660
Synteny: CmoCh17G005660
Gene Ontology terms:
  GO:0006807 - nitrogen compound metabolic process (biological process)
  GO:0044238 - primary metabolic process (biological process)
  GO:0044260 - cellular macromolecule metabolic process (biological process)
  GO:0051716 - cellular response to stimulus (biological process)
  GO:0065007 - biological regulation (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
  GO:0016772 - transferase activity, transferring phosphorus-containing groups (molecular function)
InterPro domains: NA
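
The genome location string above can be split into a sequence ID and a coordinate pair. The following is a minimal sketch, not part of the database itself: the helper name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which the record does not state.

```python
import re

def parse_location(loc):
    """Split a 'seqid:start..end' string into its parts.

    Assumption: coordinates are 1-based and inclusive, so the
    span length is end - start + 1.
    """
    m = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognized location: {loc!r}")
    seqid = m.group(1)
    start, end = int(m.group(2)), int(m.group(3))
    return seqid, start, end, end - start + 1

seqid, start, end, span = parse_location("Cmo_Chr17:5819839..5820975")
print(seqid, start, end, span)
```

Under that assumption the gene spans 1137 bp on Cmo_Chr17.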


Homology
GenBank top hits | e value | % identity | Alignment
KAE8664704.1 hypothetical protein F3Y22_tig00112738pilonHSYRG00095 [Hibiscus syriacus] | 5.5e-48 | 42.96
Query:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM
        +  K+ IEKFD +DF FWKMQIED+LYQK LY+PL G  P+ M  E W L DRKALG+IRLTLSRN+AFNI K+KTT+ L+ ALS+MYEK SA NKV+LM
Subjt:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM

Query:  RRLFNLQMSE------------------------------------------------------------------------------GASFHSSSNKEL
        RRLFNL+M+E                                                                              GASFHS+  +E+
Subjt:  RRLFNLQMSE------------------------------------------------------------------------------GASFHSSSNKEL

Query:  FRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR
          N+ S  F  V+LA+++ L+I GK D+ +K P    WTLK V +IP LK+NLISIGQLD  GY+T      WKI +
Subjt:  FRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR

KAE8730942.1 Major allergen Pru ar 1 [Hibiscus syriacus] | 8.0e-47 | 47.37
Query:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM
        +   + IEKFD +DF FW M IED+LYQK LY+PL G  P+ M  E W L DR+ALG+IRLTLSRNVAFNI K+KT   L+ ALS+MYEK SA NKV+LM
Subjt:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM

Query:  RRL-----------------------------FNLQMSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGL
        R+L                              +  +  GASFHS+  +E+  N+ S  F KV++A+++ L+I GK D+ +K P    W LK V +IPGL
Subjt:  RRL-----------------------------FNLQMSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGL

Query:  KKNLISIGQLDSTGYATELGKSSWKIMR
        K+NLIS+GQLD  GY T      WKI++
Subjt:  KKNLISIGQLDSTGYATELGKSSWKIMR

KAF3636042.1 hypothetical protein FXO37_25670 [Capsicum annuum] | 3.9e-62 | 48.82
Query:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL
        ME SK+GIEKFD SDFSFWKMQIEDYLYQK L++ L G+ P++M  E+WKLKDR+ALG IRLTLSRNVAFNI K+KTTSDLLKALSNMYEKS AMNKVYL
Subjt:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL

Query:  MRRLFNLQMSEG----------------------------------------------------------------------------------------
        M RLFNLQ+S+                                                                                         
Subjt:  MRRLFNLQMSEG----------------------------------------------------------------------------------------

Query:  ---------ASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR
                 ASFHSS +KE+F+NFK + F KVYLANNK L I+GK DVCIKT A NQWTL+++ Y+P LK+NLIS+ QLDSTGY  E GK  WK+++
Subjt:  ---------ASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR

KAF3680274.1 putative 50S ribosomal protein L18-like [Capsicum annuum] | 2.6e-58 | 53.33
Query:  KMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLMRRLFNLQMSE--------
        +M+IEDYLYQK L+EPL G+ P+++  E WKLKDR+AL LI LTLSRNVAFNI+K+KTT DLLKALSNMYE  SA+NKVYLMRRLFNLQM E        
Subjt:  KMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLMRRLFNLQMSE--------

Query:  -----------------------------------------------------------------GASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIK
                                                                         GASFHSS +KELF+NFKS  F KVYLA+NK L IK
Subjt:  -----------------------------------------------------------------GASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIK

Query:  GKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMRV
        GK DVCIKTPAGNQWTL+DV YIPGLKKNLI +GQLDSTGYA E GK    ++ V
Subjt:  GKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMRV

VFQ62075.1 unnamed protein product [Cuscuta campestris] | 4.2e-56 | 43.98
Query:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL
        ME SK+GIEKFD SDF FWKMQIEDYLYQK L+EPL G+ PD+MT EQWKLKDR+ALG+I LTL++NVAFNI+K+ TT+ LLKALSNMYEK SAMNK  +
Subjt:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL

Query:  M---------------------------------------------------------RRLFNLQ-----------------------------------
        +                                                         R  F  Q                                   
Subjt:  M---------------------------------------------------------RRLFNLQ-----------------------------------

Query:  ----------------------------------------MSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSY
                                                +  GASFHSS +KE F+NFKS  F KVYLA+NK L I+GK DV IKTPAGNQWTLKDV Y
Subjt:  ----------------------------------------MSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSY

Query:  IPGLKKNLISIGQLDSTGYATELGKSSWKIMR
        IPGLKKNLISIGQLD+ GYA E GK SWKI++
Subjt:  IPGLKKNLISIGQLDSTGYATELGKSSWKIMR

TrEMBL top hits | e value | % identity | Alignment
A0A484KC47 CCHC-type domain-containing protein | 2.0e-56 | 43.98
Query:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL
        ME SK+GIEKFD SDF FWKMQIEDYLYQK L+EPL G+ PD+MT EQWKLKDR+ALG+I LTL++NVAFNI+K+ TT+ LLKALSNMYEK SAMNK  +
Subjt:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL

Query:  M---------------------------------------------------------RRLFNLQ-----------------------------------
        +                                                         R  F  Q                                   
Subjt:  M---------------------------------------------------------RRLFNLQ-----------------------------------

Query:  ----------------------------------------MSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSY
                                                +  GASFHSS +KE F+NFKS  F KVYLA+NK L I+GK DV IKTPAGNQWTLKDV Y
Subjt:  ----------------------------------------MSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSY

Query:  IPGLKKNLISIGQLDSTGYATELGKSSWKIMR
        IPGLKKNLISIGQLD+ GYA E GK SWKI++
Subjt:  IPGLKKNLISIGQLDSTGYATELGKSSWKIMR

A0A484MUU4 gag_pre-integrs domain-containing protein | 2.8e-45 | 47.01
Query:  MQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM------------------R
        MQIEDYLYQK L+EPL G+ PD+MT EQWKLKDR+ALG+IRLTL++NVAFNI+K+ TT+ L+KALSNMYEK  AMNK  ++                  R
Subjt:  MQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM------------------R

Query:  RLFNLQMSE----------------------------------------------------------GASFHSSSNKELFRNFKSRKFKKVYLANNKDLE
            L+  E                                                          GASFHSS +KELF+NFKS  F KVYLA+NK L 
Subjt:  RLFNLQMSE----------------------------------------------------------GASFHSSSNKELFRNFKSRKFKKVYLANNKDLE

Query:  IKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSS
        I+GK DV IKTP GNQWTLKD  YIPGLKKNLISIG ++    A  +  SS
Subjt:  IKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSS

A0A6A2Y6G9 Integrase catalytic domain-containing protein | 2.7e-48 | 42.96
Query:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM
        +  K+ IEKFD +DF FWKMQIED+LYQK LY+PL G  P+ M  E W L DRKALG+IRLTLSRN+AFNI K+KTT+ L+ ALS+MYEK SA NKV+LM
Subjt:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM

Query:  RRLFNLQMSE------------------------------------------------------------------------------GASFHSSSNKEL
        RRLFNL+M+E                                                                              GASFHS+  +E+
Subjt:  RRLFNLQMSE------------------------------------------------------------------------------GASFHSSSNKEL

Query:  FRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR
          N+ S  F  V+LA+++ L+I GK D+ +K P    WTLK V +IP LK+NLISIGQLD  GY+T      WKI +
Subjt:  FRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR

A0A6A3CTZ2 Major allergen Pru ar 1 | 3.9e-47 | 47.37
Query:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM
        +   + IEKFD +DF FW M IED+LYQK LY+PL G  P+ M  E W L DR+ALG+IRLTLSRNVAFNI K+KT   L+ ALS+MYEK SA NKV+LM
Subjt:  ESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLM

Query:  RRL-----------------------------FNLQMSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGL
        R+L                              +  +  GASFHS+  +E+  N+ S  F KV++A+++ L+I GK D+ +K P    W LK V +IPGL
Subjt:  RRL-----------------------------FNLQMSEGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGL

Query:  KKNLISIGQLDSTGYATELGKSSWKIMR
        K+NLIS+GQLD  GY T      WKI++
Subjt:  KKNLISIGQLDSTGYATELGKSSWKIMR

A0A803LL22 Uncharacterized protein | 1.2e-45 | 42.35
Query:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL
        ME  K+ IEKFD  DF FWKMQIEDYLYQK LY PL    P  M  E+WK+ DR+ALG+IRLTL+++VA+N+    TT  ++KALSNMYEK SAMNK+  
Subjt:  MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYL

Query:  -----------MRR----------------------------------------------------------------------LFNLQMSEGASFHSSS
                   +RR                                                                      + +  +  GASFHSS 
Subjt:  -----------MRR----------------------------------------------------------------------LFNLQMSEGASFHSSS

Query:  NKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR
         KE+F+NFKS KF KVYLA+++ LEI GK DV IKT +G+ W L+DV YIP L+KNLIS+GQLDS+GY T  G+ +WK+ +
Subjt:  NKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR

SwissProt top hits | e value | % identity | Alignment
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | 1.2e-10 | 36.59
Query:  MESSKIGIEKFDVSD-FSFWKMQIEDYLYQKCLYEPL--LGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNK
        M   K  + KF+  + FS W+ ++ D L Q+ L++ L      PDTM  E W   D +A   IRL LS +V  NII + T   +   L ++Y   +  NK
Subjt:  MESSKIGIEKFDVSD-FSFWKMQIEDYLYQKCLYEPL--LGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNK

Query:  VYLMRRLFNLQMSEGASFHSSSN
        +YL ++L+ L MSEG +F S  N
Subjt:  VYLMRRLFNLQMSEGASFHSSSN

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | 5.8e-08 | 35.23
Query:  ASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR
        AS H++  ++LF  + +  F  V + N    +I G  D+CIKT  G    LKDV ++P L+ NLIS   LD  GY +      W++ +
Subjt:  ASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMR

Arabidopsis top hits | e value | % identity | Alignment
AT3G20980.1 Gag-Pol-related retrotransposon family protein | 2.3e-04 | 24.52
Query:  DTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLMRRLFNLQMSEGASFHSSSNKELFRNF-KSRKFKKVYLANN
        D     ++  KD +AL +++ +L  +V    +   +  DL   L+ + +  +     Y       L +S   S H + + + F    +SRK K  +++ +
Subjt:  DTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLMRRLFNLQMSEGASFHSSSNKELFRNF-KSRKFKKVYLANN

Query:  KD----LEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATEL
        K       ++G  DV   T  GN+ T+K+V Y+PG++ N +S+ QL   G+   +
Subjt:  KD----LEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATEL

AT3G21000.1 Gag-Pol-related retrotransposon family protein | 1.8e-04 | 36
Query:  LEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELG
        L ++GK DV I+   G + T+++V ++PGL +N++S G++ S  Y+   G
Subjt:  LEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELG

AT3G29785.1 unknown protein | 2.9e-15 | 44.44
Query:  EKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKV
        +K D + +SF +M+IEDYLY K L++P LG   +TM+ + W +  R+ L +IRLT+S+N+A N+ K+K+   L+K LS++Y+K S  N V
Subjt:  EKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKV


Sequences
CDS sequence
ATGGAAAGTTCAAAGATTGGAATTGAGAAGTTTGATGTATCCGATTTCAGTTTCTGGAAGATGCAGATTGAAGATTATCTATACCAGAAATGTCTTTATGAACCCCTGTT
GGGATTGATGCCGGATACCATGACCACGGAGCAGTGGAAGCTCAAGGATCGAAAAGCCTTAGGGCTGATCCGGTTGACGCTATCCAGAAACGTGGCGTTCAATATCATCA
AGAAGAAGACTACGTCAGATTTGCTGAAGGCGCTGTCGAATATGTATGAAAAATCATCGGCTATGAACAAGGTGTATTTAATGCGGAGATTGTTCAATCTACAGATGTCT
GAAGGTGCATCTTTTCATTCTTCTTCAAATAAAGAGTTGTTCCGGAATTTCAAGTCTCGAAAGTTCAAGAAGGTGTATCTTGCCAACAACAAAGATTTGGAGATTAAAGG
AAAAGAGGATGTTTGCATAAAAACTCCGGCAGGAAATCAGTGGACATTAAAGGATGTCAGCTATATTCCTGGTCTCAAGAAGAACCTGATATCTATTGGTCAGTTGGACA
GCACAGGTTATGCAACAGAGTTAGGAAAGAGTTCATGGAAGATTATGAGGGTGCTACGGTAG
mRNA sequence
ATGGAAAGTTCAAAGATTGGAATTGAGAAGTTTGATGTATCCGATTTCAGTTTCTGGAAGATGCAGATTGAAGATTATCTATACCAGAAATGTCTTTATGAACCCCTGTT
GGGATTGATGCCGGATACCATGACCACGGAGCAGTGGAAGCTCAAGGATCGAAAAGCCTTAGGGCTGATCCGGTTGACGCTATCCAGAAACGTGGCGTTCAATATCATCA
AGAAGAAGACTACGTCAGATTTGCTGAAGGCGCTGTCGAATATGTATGAAAAATCATCGGCTATGAACAAGGTGTATTTAATGCGGAGATTGTTCAATCTACAGATGTCT
GAAGGTGCATCTTTTCATTCTTCTTCAAATAAAGAGTTGTTCCGGAATTTCAAGTCTCGAAAGTTCAAGAAGGTGTATCTTGCCAACAACAAAGATTTGGAGATTAAAGG
AAAAGAGGATGTTTGCATAAAAACTCCGGCAGGAAATCAGTGGACATTAAAGGATGTCAGCTATATTCCTGGTCTCAAGAAGAACCTGATATCTATTGGTCAGTTGGACA
GCACAGGTTATGCAACAGAGTTAGGAAAGAGTTCATGGAAGATTATGAGGGTGCTACGGTAG
Protein sequence
MESSKIGIEKFDVSDFSFWKMQIEDYLYQKCLYEPLLGLMPDTMTTEQWKLKDRKALGLIRLTLSRNVAFNIIKKKTTSDLLKALSNMYEKSSAMNKVYLMRRLFNLQMS
EGASFHSSSNKELFRNFKSRKFKKVYLANNKDLEIKGKEDVCIKTPAGNQWTLKDVSYIPGLKKNLISIGQLDSTGYATELGKSSWKIMRVLR
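
The relationship between the CDS and the protein sequence listed above can be checked by translating codons. The following is a minimal sketch that assumes the standard genetic code (the record does not name a translation table); the function name `translate` is hypothetical, and the two example inputs are the first 27 and last 12 nucleotides of the CDS.

```python
# Standard genetic code, built in TCAG order (an assumption: the record
# does not state which translation table was used).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Walk complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First nine codons of the CDS
print(translate("ATGGAAAGTTCAAAGATTGGAATTGAG"))  # MESSKIGIE
# Last four codons of the CDS, ending in the TAG stop codon
print(translate("GTGCTACGGTAG"))  # VLR
```

Both results agree with the start (MESSKIGIE) and end (VLR) of the protein sequence shown above.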