CuGenDBv2

Moc02g19360 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc02g19360
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Gag-protease polyprotein
Genome location: chr2:14397931..14398602
RNA-Seq Expression: Moc02g19360
Synteny: Moc02g19360
Gene Ontology terms:
GO:0006259 - DNA metabolic process (biological process)
GO:0016020 - membrane (cellular component)
GO:0004518 - nuclease activity (molecular function)
GO:0005488 - binding (molecular function)
GO:0008233 - peptidase activity (molecular function)
GO:0016779 - nucleotidyltransferase activity (molecular function)
InterPro domains: IPR005162 - Retrotransposon gag domain


Homology
GenBank top hits (e value, % identity, alignment)
XP_022156662.1 uncharacterized protein LOC111023512 [Momordica charantia] (e value: 3.9e-63; % identity: 76.33)
Query:  MDTLQTLVQTTLSNQIAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDV
        M+TLQTLVQTT+SNQ+ QLTQ+R SI I+AKYLR+FKKYD R FDGL VDP LAEAWLS +ETIFRYMRC EEQKVQC VFMLKDD  LWWES ER IDV
Subjt:  MDTLQTLVQTTLSNQIAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDV

Query:  SG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF
        SG  +TWLQF+EAFF QYYP IT ++KQ EFLNLKQ NRSVEE++REFTKLSRFA ELV+TEA K ++F
Subjt:  SG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF

XP_038880446.1 uncharacterized protein LOC120072105 [Benincasa hispida] (e value: 1.7e-37; % identity: 58.74)
Query:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMIT
        + + ++AK+LR+F+KYD R FD    DPT AE WLSSIETIFR+MRC EE K+QCAVFML D+V +WW SA++ ID  G   TW QF+E F+ +Y+    
Subjt:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMIT

Query:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKT
        R+ KQA+FLNL+Q   SVEE+E+EF KLSRFA ELV TEA +T
Subjt:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKT

XP_038883046.1 uncharacterized protein LOC120074107 [Benincasa hispida] (e value: 1.1e-38; % identity: 47.92)
Query:  IPSLMMDTLQTLVQTTLSNQIAQL------------------TQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQ
        + S +M  LQ L+++ +  Q AQ                    +D + + ++AK+LR+F+KYD R FDG   DPT A+ WLSSIETIFR+MRC EE K+Q
Subjt:  IPSLMMDTLQTLVQTTLSNQIAQL------------------TQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQ

Query:  CAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF
        C VFML  +V +WW S E+ ID  G   TW QF+E F+ +Y+   TR+ KQAEFLNLKQG  S+E++E+EF KLS F  ELV TEA +T++F
Subjt:  CAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF

XP_038887018.1 uncharacterized protein LOC120077183 [Benincasa hispida] (e value: 1.3e-37; % identity: 56.85)
Query:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMIT
        + + ++AK+LR+F+K+D R FDG   DPT A+ WLSSIETIF +MRC EE K+QCAVFML  +  +WW  AE+ ID SG   TW QF+E F+  Y+   T
Subjt:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYPMIT

Query:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF
        R+ KQ EFLNLKQ   SVEE+E+EF KLS F+LELV  EA +TK+F
Subjt:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF

XP_038891712.1 uncharacterized protein LOC120081110 [Benincasa hispida] (e value: 2.2e-37; % identity: 50.61)
Query:  QDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYP
        Q      ++AK+LR+FKKY+   F+G   DPT AE W+S IETIFRYM+C E+QKVQCAVFML D   +WW+ AER + V G  +TW QF+E F+ +Y+ 
Subjt:  QDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDVSG--ITWLQFEEAFFLQYYP

Query:  MITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF
           R+ KQ EFL L+QG+RSVEE+++EF  LSRFA ELV TEA + ++F    +   + + Q F
Subjt:  MITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF

TrEMBL top hits (e value, % identity, alignment)
A0A5A7T7E7 Ty3-gypsy retrotransposon protein (e value: 3.4e-36; % identity: 49.07)
Query:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT
        + + ++AK+LR+F+KY+    DG   DPT A+ WLSS+ETIFRYM+C E+QKVQCA+FML D    WWE+ ER +  DVS ITW QF+E+F+ +++P   
Subjt:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT

Query:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF
        R  K+ EFLNL+QG+ +VE+++ EF  LSRFA E++ TEA +  KF    R+  + L + F
Subjt:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF

A0A5A7VDB7 Gag-protease polyprotein (e value: 2.6e-36; % identity: 50.31)
Query:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT
        + +  +AK+LR+F+KY+   FDG   DPT A+ WLSS+ETIFRYM+C E+QKVQCAVFML D    WWE+ ER +  DVS ITW QF+E+F+ +++    
Subjt:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT

Query:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF
        R  K+ EFLNL+QG+ +VE+++ EF  LSRFA E++ETEA +  KF    R+  + L + F
Subjt:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF

A0A5A7VQH2 Reverse transcriptase (e value: 2.0e-36; % identity: 50.31)
Query:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT
        + +  +AK+LR+F+KY+   FDG   DPT A+ WLSS+ETIFRYM+C E+QKVQCAVFML D    WWE+ ER +  DVS ITW QF+E+F+ +++    
Subjt:  ESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFFLQYYPMIT

Query:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF
        R  K+ EFLNL+QG+ +VE+++ EF  LSRFALE++ TE  +  KF    R+  + L Q F
Subjt:  RFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF

A0A5D3BSM2 Reverse transcriptase (e value: 1.2e-36; % identity: 51.88)
Query:  IAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFF
        + Q+  D+ S   +AK+LR+F+KY+   FDG   DPT A+ WLSS+ETIFRYM+CSE+QKVQCAVFML D    WWE+AER +  DV  ITW QF+E+F+
Subjt:  IAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFI--DVSGITWLQFEEAFF

Query:  LQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRM
         +++    R  K+ EFLNL+QG+R VE+++ EF  LSRFA E++ TEA +  KF    R+
Subjt:  LQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRM

A0A6J1DSJ6 uncharacterized protein LOC111023512 (e value: 1.9e-63; % identity: 76.33)
Query:  MDTLQTLVQTTLSNQIAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDV
        M+TLQTLVQTT+SNQ+ QLTQ+R SI I+AKYLR+FKKYD R FDGL VDP LAEAWLS +ETIFRYMRC EEQKVQC VFMLKDD  LWWES ER IDV
Subjt:  MDTLQTLVQTTLSNQIAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQCAVFMLKDDVILWWESAERFIDV

Query:  SG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF
        SG  +TWLQF+EAFF QYYP IT ++KQ EFLNLKQ NRSVEE++REFTKLSRFA ELV+TEA K ++F
Subjt:  SG--ITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKF

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGGTTCCACTTTCTAGTCAGACCAAGAATCCAGCAGTGGGTCAAACTTTTGGACAGTCAGAGCCTACAATACCGTCCTTAATGATGGACACCTTACAGACACTT
GTTCAAACTACTCTATCTAACCAAATTGCTCAACTGACTCAAGATCGAGAGAGCATTGTAATAAAAGCTAAATATCTGCGAAATTTTAAGAAGTATGACCATCGA
TTTTTTGACGGACTATATGTAGATCCGACGTTGGCAGAGGCTTGGTTATCCTCAATAGAGACTATCTTTCGTTATATGAGGTGTTCGGAGGAGCAAAAAGTGCAG
TGTGCTGTCTTCATGCTAAAAGATGATGTCATTTTATGGTGGGAGTCTGCAGAAAGGTTTATCGATGTTAGTGGGATCACATGGTTGCAGTTTGAGGAGGCTTTC
TTCCTACAGTATTACCCAATGATCACTCGATTCAAGAAACAAGCAGAGTTTCTAAACCTAAAGCAAGGCAACAGATCAGTGGAGGAATTTGAGAGAGAATTCACA
AAATTGTCTCGTTTTGCCCTTGAGCTAGTAGAAACAGAGGCCAAGAAGACTAAAAAATTCTCATGGGCCTGTAGGATGAGATTCAAGGTTTTGTGGCAACTCTTT
CTCCACCAGATTATACTACAGCACTTCGAGCAGCTGCATTGA
mRNA sequence
ATGGTTCCACTTTCTAGTCAGACCAAGAATCCAGCAGTGGGTCAAACTTTTGGACAGTCAGAGCCTACAATACCGTCCTTAATGATGGACACCTTACAGACACTT
GTTCAAACTACTCTATCTAACCAAATTGCTCAACTGACTCAAGATCGAGAGAGCATTGTAATAAAAGCTAAATATCTGCGAAATTTTAAGAAGTATGACCATCGA
TTTTTTGACGGACTATATGTAGATCCGACGTTGGCAGAGGCTTGGTTATCCTCAATAGAGACTATCTTTCGTTATATGAGGTGTTCGGAGGAGCAAAAAGTGCAG
TGTGCTGTCTTCATGCTAAAAGATGATGTCATTTTATGGTGGGAGTCTGCAGAAAGGTTTATCGATGTTAGTGGGATCACATGGTTGCAGTTTGAGGAGGCTTTC
TTCCTACAGTATTACCCAATGATCACTCGATTCAAGAAACAAGCAGAGTTTCTAAACCTAAAGCAAGGCAACAGATCAGTGGAGGAATTTGAGAGAGAATTCACA
AAATTGTCTCGTTTTGCCCTTGAGCTAGTAGAAACAGAGGCCAAGAAGACTAAAAAATTCTCATGGGCCTGTAGGATGAGATTCAAGGTTTTGTGGCAACTCTTT
CTCCACCAGATTATACTACAGCACTTCGAGCAGCTGCATTGA
Protein sequence
MVPLSSQTKNPAVGQTFGQSEPTIPSLMMDTLQTLVQTTLSNQIAQLTQDRESIVIKAKYLRNFKKYDHRFFDGLYVDPTLAEAWLSSIETIFRYMRCSEEQKVQ
CAVFMLKDDVILWWESAERFIDVSGITWLQFEEAFFLQYYPMITRFKKQAEFLNLKQGNRSVEEFEREFTKLSRFALELVETEAKKTKKFSWACRMRFKVLWQLF
LHQIILQHFEQLH
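
Consistency check (illustrative): the genome coordinates, CDS and protein sequence in this record can be cross-checked with a few lines of Python. The sketch below is not part of the CuGenDB record. It assumes the standard genetic code and inclusive 1-based coordinates, and the helper names `translate` and `span_length` are hypothetical. For brevity it translates only the final line of the CDS, which starts on a codon boundary because each preceding CDS line is 105 nt long.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the record uses the standard nuclear code).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def span_length(location):
    """Length of a 'chr:start..end' interval, treating both ends as inclusive."""
    coords = location.split(":")[1]
    start, end = (int(x) for x in coords.split(".."))
    return end - start + 1


# Values copied from the record above.
locus_len = span_length("chr2:14397931..14398602")
cds_tail = "CTCCACCAGATTATACTACAGCACTTCGAGCAGCTGCATTGA"  # last CDS line
tail_protein = translate(cds_tail)

# Residues expected from a CDS of this length: codons minus the stop codon.
expected_residues = locus_len // 3 - 1

print(locus_len, expected_residues, tail_protein)
```

Under these assumptions the locus spans 672 nt, that is 224 codons, or 223 residues plus a stop codon. This agrees with the 223-residue protein sequence listed above, and the translated tail matches its final line. The CDS length equals the genomic span and the mRNA sequence is identical to the CDS, which is consistent with a single-exon gene model annotated without UTRs.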