CuGenDBv2

Lag0032021 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0032021
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Gag protease polyprotein
Genome location: chr11:22672915..22678844
RNA-Seq Expression: Lag0032021
Synteny: Lag0032021
Gene Ontology terms: NA
InterPro domains:
IPR005162 - Retrotransposon gag domain
IPR021109 - Aspartic peptidase domain superfamily
IPR043502 - DNA/RNA polymerase superfamily
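
The genome location field above uses a `chromosome:start..end` layout. A minimal Python sketch for splitting it into parts follows; the function name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which the record itself does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_location("chr11:22672915..22678844")
# Assumption: 1-based inclusive coordinates, so both endpoints are counted.
span = end - start + 1
print(chrom, start, end, span)
```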


Homology
GenBank top hits (e value, %identity, alignment)
XP_022156662.1 uncharacterized protein LOC111023512 [Momordica charantia] | e value: 4.9e-37 | %identity: 51.48
Query:  GREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGN-------
        G   DP++AE WLS METIFRYM+C ++QKV C VF+L DDAFLWWES ER ID SGGPV+WLQFKEAFF QYYPA+T +RKQ EF+ L Q N       
Subjt:  GREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGN-------

Query:  -----------DLWKNMRENLPGCLV------QGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPG
                   +L           ++      +G VA ++P DYA ALR A LID R ++GSQV   PG
Subjt:  -----------DLWKNMRENLPGCLV------QGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPG

XP_024948889.1 uncharacterized protein LOC112496050 [Citrus sinensis] | e value: 1.3e-29 | %identity: 31.88
Query:  ARPQGKLPSDTEHPRREGKEQVKAVTLRS-------------------------------------------------DVEPPYVPPPPYVP--------
        +R QG LPS+TE PRREGKE  K + L S                                                 +V+P +       P        
Subjt:  ARPQGKLPSDTEHPRREGKEQVKAVTLRS-------------------------------------------------DVEPPYVPPPPYVP--------

Query:  ----------------PLPFPQRQKPKNQD----------------------------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTT
                        P PFPQR + + QD                                  G +  GRALCDLGASINLMPLSV++ LG+ E RPTT
Subjt:  ----------------PLPFPQRQKPKNQD----------------------------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTT

Query:  VTLQLADRSITYPEDEMEDC-----SFIRILESTVI---------------------------------------------------ETAIQDSADKHSE
        VTLQLADRS  YPE ++ED       FI +++  V+                                                   E ++ D   K+  
Subjt:  VTLQLADRSITYPEDEMEDC-----SFIRILESTVI---------------------------------------------------ETAIQDSADKHSE

Query:  KHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL
          G      +GISPS CMHKI LE  S  S+E QRRLNP MKEVVKKE+IKWLD G+IY I+DS+W+
Subjt:  KHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL

XP_038874946.1 uncharacterized protein LOC120067458 [Benincasa hispida] | e value: 1.7e-29 | %identity: 55.24
Query:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMED-----------CSFIRI-LESTVIETAIQDSADKHSEKHG-----
        GGKE+G ALCDLGASINLMPL +++KL  G ARPTTVTLQLADRS+ +PE ++ED             FI +  E  + E A+      +++  G     
Subjt:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMED-----------CSFIRI-LESTVIETAIQDSADKHSEKHG-----

Query:  -EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIK
          GISPS+CMHKI LEEG   SIE+QRRLNPAMKEVVKKE++K
Subjt:  -EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIK

XP_038902502.1 uncharacterized protein LOC120089161 [Benincasa hispida] | e value: 9.2e-36 | %identity: 51.74
Query:  PPYVPPLPFPQRQKPKNQDGGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCS-----FIRILESTVIE-TAIQDSA
        P YV        +K +  +  +++G ALCDLGASINLMPLS+++KLGIG A+PT+VTLQLADR+I YP+ ++ED       FI   +  +++  A  +  
Subjt:  PPYVPPLPFPQRQKPKNQDGGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCS-----FIRILESTVIE-TAIQDSA

Query:  DKHSEKHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL
         KH    G       GI+PS+CMHKI LEEG  R IE QRRLNP +KEVVKKE++KWLDAG+IYPI+DS+W+
Subjt:  DKHSEKHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL

XP_038904339.1 uncharacterized protein LOC120090693 [Benincasa hispida] | e value: 4.3e-33 | %identity: 52.26
Query:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILE-STVIETAIQDSAD-------KHSEKHG------EGI
        GG ++G ALCDLGASINL PLS+++KLGI EA+PT+V LQLA+R+IT  + ++ED   +  +  +T +E   ++  D       +H    G       GI
Subjt:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILE-STVIETAIQDSAD-------KHSEKHG------EGI

Query:  SPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL
         P +CMH I LE+G   SIE QRRLNP MKEVVKKE+IKWLDA +IYPI+DS+WL
Subjt:  SPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL

TrEMBL top hits (e value, %identity, alignment)
A0A1S3Z8J0 uncharacterized protein LOC107784119 | e value: 5.3e-29 | %identity: 33.07
Query:  WRNHPNFSWGGQGN-------------------------------NCKPTKVNQPGFAKAQ-----------------------------------ARPQ
        WRNHPNFSWGG  N                               N K    NQ   A+ Q                                    RP 
Subjt:  WRNHPNFSWGGQGN-------------------------------NCKPTKVNQPGFAKAQ-----------------------------------ARPQ

Query:  GKLPSDTEHPRREGKEQVKAVTLR---------------SDVEPPYVPPPPYVP--------------PLPFPQRQKPKNQD------------------
        G LPSDTE        QV AVTLR               S +E   VP P  V               P PFPQR + KN D                  
Subjt:  GKLPSDTEHPRREGKEQVKAVTLR---------------SDVEPPYVPPPPYVP--------------PLPFPQRQKPKNQD------------------

Query:  -------------------------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILESTVIET
                                       G  ++GR LCDLGASINLM L V+++LG+G  RPTTV +QLADRS  YPE E +    +R+L       
Subjt:  -------------------------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILESTVIET

Query:  AIQDSADKHSEKHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADS
               +H    G      +GISP+F MHKI +EEG   S+E Q  LNP MKEVV+KEVIKWLD GI++PI+DS
Subjt:  AIQDSADKHSEKHG------EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADS

A0A1U8IME8 uncharacterized protein LOC107898276 | e value: 8.1e-30 | %identity: 46.91
Query:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILESTVIETAIQ------------DSADKHSEKHG-----
        G   +G+ L DLGASINLMP+ ++RKL IG ARPTT+TLQLADRS  +PE ++ED  +  + E+      I             +   K  +  G     
Subjt:  GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILESTVIETAIQ------------DSADKHSEKHG-----

Query:  -EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWLPKP
         +GI+   C+HKI LE+   +SIE QRRLNP MKEVVKKE++KWLD  IIYPI++S+W+  P
Subjt:  -EGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWLPKP

A0A5A7TIJ0 Ty3-gypsy retrotransposon protein | e value: 1.6e-25 | %identity: 41.86
Query:  VGREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGNDLWKNM
        VG  EDP  A+LWLSS+ETIFRYMKCP+DQKV C VF+L+D    WWE+ ER +    G ++W QFKE+F+ +++ A  R  K+ EF+ L Q +      
Subjt:  VGREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGNDLWKNM

Query:  RENLPGCLVQGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPGLQSGQKRKFDQANSNF-QRSQRSSGK
                + G+V A  P+ +A ALR A  +  +    S   +  G  SGQKRK +Q  ++  QR+ R SG+
Subjt:  RENLPGCLVQGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPGLQSGQKRKFDQANSNF-QRSQRSSGK

A0A6A2Y697 Reverse transcriptase domain-containing protein | e value: 1.5e-28 | %identity: 34.67
Query:  QGKLPSDTEHPRREGKEQVKAVTLRS-------------------------------------DVEPPYVPPPPY------VPPL---------PFPQRQ
        QG LPSDTE  +  GKE    +TLRS                                     D E   V  P Y      +P +         PFPQR 
Subjt:  QGKLPSDTEHPRREGKEQVKAVTLRS-------------------------------------DVEPPYVPPPPY------VPPL---------PFPQRQ

Query:  KPKNQD-----------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDC-----SFI-------------
        K  N +                 G   +G+ALCDLG+S+NLMP S++ KLGIG+ARPT+V LQLADRS   PE  +ED       F+             
Subjt:  KPKNQD-----------------GGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDC-----SFI-------------

Query:  ---------------RIL---ESTVIETAIQDSADKHSEKHGEGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL
                       RIL   E + +   +   A   +  + +GISP+ CMHKI LE+    SIE QRRLNP + +VV KE++KWLD G+IYPI++S+W+
Subjt:  ---------------RIL---ESTVIETAIQDSADKHSEKHGEGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWL

A0A6J1DSJ6 uncharacterized protein LOC111023512 | e value: 2.4e-37 | %identity: 51.48
Query:  GREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGN-------
        G   DP++AE WLS METIFRYM+C ++QKV C VF+L DDAFLWWES ER ID SGGPV+WLQFKEAFF QYYPA+T +RKQ EF+ L Q N       
Subjt:  GREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFFCQYYPALTRFRKQAEFMKLTQGN-------

Query:  -----------DLWKNMRENLPGCLV------QGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPG
                   +L           ++      +G VA ++P DYA ALR A LID R ++GSQV   PG
Subjt:  -----------DLWKNMRENLPGCLV------QGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPG

SwissProt top hits (e value, %identity, alignment)
No hits found
Arabidopsis top hits (e value, %identity, alignment)
No hits found
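
In the alignments above, the unlabeled middle line marks each column: a letter is an identical residue, `+` is a conservative substitution, and a space is a mismatch or gap. The Python sketch below shows one way to count those marks, assuming the usual BLAST convention that percent identity is identities divided by the number of alignment columns. The function names are hypothetical. The input is only the first 12 columns of the XP_022156662.1 alignment, so its result is not expected to match the full-alignment value reported for that hit.

```python
def midline_stats(midline: str) -> tuple[int, int]:
    """Return (identities, positives) counted from a BLAST-style midline."""
    identities = sum(ch.isalpha() for ch in midline)
    # Positives include identities plus conservative substitutions ('+').
    positives = identities + midline.count("+")
    return identities, positives


def percent_identity(query: str, midline: str) -> float:
    """Identities as a percentage of alignment columns (gaps included)."""
    if len(query) != len(midline):
        raise ValueError("query and midline must cover the same columns")
    identities, _ = midline_stats(midline)
    return 100.0 * identities / len(query)


# First 12 columns of the first GenBank alignment in this record.
query_fragment = "GREEDPIVAELW"
midline_fragment = "G   DP++AE W"
print(midline_stats(midline_fragment))
print(percent_identity(query_fragment, midline_fragment))
```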

Sequences
CDS sequence
ATGATTGCTAACGCTCTTAAGAATGTGACAGTGATTAGTCATCAGCAGCCACCAGCTATGGAGCCTACTGCAGTGGTGAACCAAGTGCAGAGGAAGCATGTGTCTATTGT
GGTGAAGACCACAACTACGAGTTTTGCCCCACAATCCAGCTTCTGTGTTTTTTGTTGGCGCAACCACCCCAACTTCTCATGGGGAGGTCAAGGAAATAACTGCAAGCCAA
CAAAAGTGAACCAGCCAGGATTTGCTAAAGCGCAGGCAAGGCCTCAAGGGAAACTTCCATCAGATACTGAACACCCTAGAAGGGAAGGTAAGGAGCAGGTAAAGGCAGTA
ACTCTTAGGAGTGATGTGGAACCACCTTATGTGCCGCCCCCACCTTATGTACCACCTCTACCTTTTCCACAAAGGCAAAAGCCTAAGAATCAGGATGGTGGAAAAGAATT
AGGTAGAGCACTCTGTGATTTAGGCGCAAGCATTAACCTTATGCCTCTTTCGGTCTATCGGAAGTTAGGTATTGGTGAAGCTAGGCCTACCACAGTCACACTCCAATTAG
CTGATAGGTCTATCACATATCCAGAGGACGAAATGGAGGATTGCTCCTTCATTAGGATTCTGGAGAGCACAGTTATTGAGACAGCAATACAGGATTCGGCTGATAAGCAT
TCGGAAAAGCATGGAGAGGGAATTAGCCCATCTTTTTGTATGCACAAAATCACTCTAGAGGAGGGATCCTTTAGGAGTATTGAGCAACAAAGAAGGCTTAACCCTGCAAT
GAAAGAGGTTGTTAAAAAGGAGGTAATTAAATGGTTGGATGCTGGGATCATTTATCCAATTGCCGATAGCAATTGGCTTCCCAAGCCAGCCGCAGCCTTCAACATCTCTC
GCCTCTTCTCACCGCCACCTATCACCTTGACGCCGGTTGTAGTCTCGTATGCTGCCATCCTTCTAGCGTCCAGTGTAGGTCGTCACCATCGTGTGGTTCCAGCTCTCCGT
GCGCGTCGCTCGTCGAACCGTCTCGCGCGTTCCGCCGCTGCTTTCGAGTTGCTGTCGCCGCGCACGTCTTGGAGTTCAGCAGTTTTCAGCCTTAATCAGTCGTTCTCGAG
TTCAAGCTTCGATACTCAGCCATTTACGCTTCTGTCCAGCGAAGTGACTTTAGTTTGGGTAAGTGAAGGAATTTTGTTTGAGAGGTTGTTGGTTGTGTGGTTTCGTTGGT
TGCTAATTCAGGTAAAGGCAAAGGCGTTCAAGGGATTGACAGAGCATGGTGGAATCATCGAGGAATGTTGTTTAGAATGTTTCATGCTTAGCAGAAAGAAAAATGCGCCA
CGTGGAAGAGGAACAGGGCGTCGTGGAAGACCAGCAAGGGCACGCGTGAATCCGGAGGCAGTTGATCCACCTGTGGAGCAGCCCAACGTCAACGAGAACCCACCGTTGGT
GCAACAAGTGGGTCGGGAGGAAGACCCTATTGTGGCTGAGTTGTGGTTATCATCCATGGAGACGATCTTTCGATACATGAAGTGCCCGAAGGATCAAAAGGTGCATTGTG
TGGTGTTTGTTCTATCCGATGACGCGTTCTTATGGTGGGAGTCAGCTGAGAGGTCGATCGACACGAGTGGAGGTCCAGTGTCTTGGTTACAATTCAAAGAGGCGTTCTTC
TGTCAGTACTATCCCGCTCTGACTCGCTTCAGAAAGCAGGCTGAATTCATGAAGCTGACACAGGGAAACGATCTGTGGAAGAATATGAGAGAGAATTTACCAGGCTGTCT
CGTTCAAGGCATTGTGGCAGCCGTCACACCATCTGATTATGCGGCAGCACTGCGAGCGGCCACTCTGATAGATAAACGTCCCAGTAATGGGTCTCAGGTGGCTTCTGATC
CTGGTCTTCAGTCAGGACAGAAGAGGAAGTTTGACCAGGCGAACTCAAATTTTCAAAGGTCTCAACGATCGTCGGGAAAGACACGACCCCCTCAGTCATAG
mRNA sequence
ATGATTGCTAACGCTCTTAAGAATGTGACAGTGATTAGTCATCAGCAGCCACCAGCTATGGAGCCTACTGCAGTGGTGAACCAAGTGCAGAGGAAGCATGTGTCTATTGT
GGTGAAGACCACAACTACGAGTTTTGCCCCACAATCCAGCTTCTGTGTTTTTTGTTGGCGCAACCACCCCAACTTCTCATGGGGAGGTCAAGGAAATAACTGCAAGCCAA
CAAAAGTGAACCAGCCAGGATTTGCTAAAGCGCAGGCAAGGCCTCAAGGGAAACTTCCATCAGATACTGAACACCCTAGAAGGGAAGGTAAGGAGCAGGTAAAGGCAGTA
ACTCTTAGGAGTGATGTGGAACCACCTTATGTGCCGCCCCCACCTTATGTACCACCTCTACCTTTTCCACAAAGGCAAAAGCCTAAGAATCAGGATGGTGGAAAAGAATT
AGGTAGAGCACTCTGTGATTTAGGCGCAAGCATTAACCTTATGCCTCTTTCGGTCTATCGGAAGTTAGGTATTGGTGAAGCTAGGCCTACCACAGTCACACTCCAATTAG
CTGATAGGTCTATCACATATCCAGAGGACGAAATGGAGGATTGCTCCTTCATTAGGATTCTGGAGAGCACAGTTATTGAGACAGCAATACAGGATTCGGCTGATAAGCAT
TCGGAAAAGCATGGAGAGGGAATTAGCCCATCTTTTTGTATGCACAAAATCACTCTAGAGGAGGGATCCTTTAGGAGTATTGAGCAACAAAGAAGGCTTAACCCTGCAAT
GAAAGAGGTTGTTAAAAAGGAGGTAATTAAATGGTTGGATGCTGGGATCATTTATCCAATTGCCGATAGCAATTGGCTTCCCAAGCCAGCCGCAGCCTTCAACATCTCTC
GCCTCTTCTCACCGCCACCTATCACCTTGACGCCGGTTGTAGTCTCGTATGCTGCCATCCTTCTAGCGTCCAGTGTAGGTCGTCACCATCGTGTGGTTCCAGCTCTCCGT
GCGCGTCGCTCGTCGAACCGTCTCGCGCGTTCCGCCGCTGCTTTCGAGTTGCTGTCGCCGCGCACGTCTTGGAGTTCAGCAGTTTTCAGCCTTAATCAGTCGTTCTCGAG
TTCAAGCTTCGATACTCAGCCATTTACGCTTCTGTCCAGCGAAGTGACTTTAGTTTGGGTAAGTGAAGGAATTTTGTTTGAGAGGTTGTTGGTTGTGTGGTTTCGTTGGT
TGCTAATTCAGGTAAAGGCAAAGGCGTTCAAGGGATTGACAGAGCATGGTGGAATCATCGAGGAATGTTGTTTAGAATGTTTCATGCTTAGCAGAAAGAAAAATGCGCCA
CGTGGAAGAGGAACAGGGCGTCGTGGAAGACCAGCAAGGGCACGCGTGAATCCGGAGGCAGTTGATCCACCTGTGGAGCAGCCCAACGTCAACGAGAACCCACCGTTGGT
GCAACAAGTGGGTCGGGAGGAAGACCCTATTGTGGCTGAGTTGTGGTTATCATCCATGGAGACGATCTTTCGATACATGAAGTGCCCGAAGGATCAAAAGGTGCATTGTG
TGGTGTTTGTTCTATCCGATGACGCGTTCTTATGGTGGGAGTCAGCTGAGAGGTCGATCGACACGAGTGGAGGTCCAGTGTCTTGGTTACAATTCAAAGAGGCGTTCTTC
TGTCAGTACTATCCCGCTCTGACTCGCTTCAGAAAGCAGGCTGAATTCATGAAGCTGACACAGGGAAACGATCTGTGGAAGAATATGAGAGAGAATTTACCAGGCTGTCT
CGTTCAAGGCATTGTGGCAGCCGTCACACCATCTGATTATGCGGCAGCACTGCGAGCGGCCACTCTGATAGATAAACGTCCCAGTAATGGGTCTCAGGTGGCTTCTGATC
CTGGTCTTCAGTCAGGACAGAAGAGGAAGTTTGACCAGGCGAACTCAAATTTTCAAAGGTCTCAACGATCGTCGGGAAAGACACGACCCCCTCAGTCATAG
Protein sequence
MIANALKNVTVISHQQPPAMEPTAVVNQVQRKHVSIVVKTTTTSFAPQSSFCVFCWRNHPNFSWGGQGNNCKPTKVNQPGFAKAQARPQGKLPSDTEHPRREGKEQVKAV
TLRSDVEPPYVPPPPYVPPLPFPQRQKPKNQDGGKELGRALCDLGASINLMPLSVYRKLGIGEARPTTVTLQLADRSITYPEDEMEDCSFIRILESTVIETAIQDSADKH
SEKHGEGISPSFCMHKITLEEGSFRSIEQQRRLNPAMKEVVKKEVIKWLDAGIIYPIADSNWLPKPAAAFNISRLFSPPPITLTPVVVSYAAILLASSVGRHHRVVPALR
ARRSSNRLARSAAAFELLSPRTSWSSAVFSLNQSFSSSSFDTQPFTLLSSEVTLVWVSEGILFERLLVVWFRWLLIQVKAKAFKGLTEHGGIIEECCLECFMLSRKKNAP
RGRGTGRRGRPARARVNPEAVDPPVEQPNVNENPPLVQQVGREEDPIVAELWLSSMETIFRYMKCPKDQKVHCVVFVLSDDAFLWWESAERSIDTSGGPVSWLQFKEAFF
CQYYPALTRFRKQAEFMKLTQGNDLWKNMRENLPGCLVQGIVAAVTPSDYAAALRAATLIDKRPSNGSQVASDPGLQSGQKRKFDQANSNFQRSQRSSGKTRPPQS
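
The protein sequence above can be cross-checked against the CDS by translating it codon by codon. The Python sketch below assumes the standard nuclear genetic code; the `translate` helper is hypothetical and is not part of any CuGenDB tooling. It is run here on the first 33 bases of the CDS only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard nuclear table applies to this gene).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 33 bases of the CDS listed in this record.
print(translate("ATGATTGCTAACGCTCTTAAGAATGTGACAGTG"))
```

Passing the full CDS, with its line breaks, to `translate` would allow a direct comparison with the listed protein sequence.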