CuGenDBv2

Lag0001225 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0001225
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Retrovirus-related Pol polyprotein from transposon opus
Genome location: chr4:27112894..27116814
RNA-Seq Expression: Lag0001225
Synteny: Lag0001225
Gene Ontology terms: NA
InterPro domains: IPR021109 - Aspartic peptidase domain superfamily


Homology
GenBank top hits
KAG9442207.1 hypothetical protein H6P81_018061 [Aristolochia fimbriata] (e value: 2.6e-11; % identity: 29.32)
Query:  NGNNFELKTAAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFSSI-----------GKSSNRTTKLEE
        N     +  AAGG +   T    + L+E+MA+N YQ+P  RS   +VA  +  +D+V ALQ Q+ +LA    K  +            G   N   + E 
Subjt:  NGNNFELKTAAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFSSI-----------GKSSNRTTKLEE

Query:  AVIFINTTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVSIRDNPVQLIHEGMAGKEAEGEEGASINVI
         +      +  H+  +K + +  R++       EE  T           ++      K     +IP    + + N V              + GASIN++
Subjt:  AVIFINTTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVSIRDNPVQLIHEGMAGKEAEGEEGASINVI

Query:  PLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        PLS+CKKL++GE+K T + LQ  D S  KP G+IE+VL++VG+F+ P D
Subjt:  PLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

XP_022157217.1 uncharacterized protein LOC111023979 [Momordica charantia] (e value: 1.5e-11; % identity: 62.07)
Query:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPIDLY
        G +IN  PLSLC+KL+IGEIK T + +QLVD+S   P+G+IENVLIKVG+F++P+D Y
Subjt:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPIDLY

XP_023876781.1 uncharacterized protein LOC111989228 [Quercus suber] (e value: 2.6e-16; % identity: 29.43)
Query:  AAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFSS-----------------IGKSSNRTTKLEEAVI
        AAGGAL+S T E A+ LLE++A+++YQWP+ R+ P+K A EV E+D++++L  QM+TL+   +   +                 I + S+  T+     +
Subjt:  AAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFSS-----------------IGKSSNRTTKLEEAVI

Query:  FINTTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPS-----------------PTVSIRDNPVQLIHEGMAG
          N TVT     +K I  +  R  +E++ T   + E  D    + +E  S+  E    E ++                    TV +      ++ + ++ 
Subjt:  FINTTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPS-----------------PTVSIRDNPVQLIHEGMAG

Query:  K-------------------EAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        K                    A  + GASIN +PLS  +KL +GE+K T + LQL D+S+  P G+IENVL+KV +F+  +D
Subjt:  K-------------------EAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

XP_023887557.1 uncharacterized protein LOC111999657 [Quercus suber] (e value: 1.1e-11; % identity: 29.22)
Query:  AAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFS----------------------SIGKSSNRTTKL
        AAGGAL S T E A+ LLE +A+N+YQWP+ R+ P+K A  V E+D++++L  QM+TL+    K +                       IG+ S+  TK 
Subjt:  AAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTLANAFLKFS----------------------SIGKSSNRTTKL

Query:  EEAVIFINTTVTG--HSATIKYIKTQLRRLEK-------EIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVS-----------IRD----
            +  NT      H+  I     Q     K       E  + EE E +E +      E   + +++K   +L I  P +            ++D    
Subjt:  EEAVIFINTTVTG--HSATIKYIKTQLRRLEK-------EIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVS-----------IRD----

Query:  -------NPVQLIHEGMA----------------------GK----EAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKV
                 V L  E  A                      GK     A  + GASIN++PLS+ +KL + E+K T + LQL D+S   P G+IE+VL+KV
Subjt:  -------NPVQLIHEGMA----------------------GK----EAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKV

Query:  GRFVIPID
         +F+ P D
Subjt:  GRFVIPID

XP_024022201.1 uncharacterized protein LOC112091842 [Morus notabilis] (e value: 1.2e-13; % identity: 28.54)
Query:  LVPLDPEIGRTIHRLRRETREPIQMTDPNPLEEPKPIRDYFQLVFQGQHSRIAYAP---INGNNFELKTAAGGALLSMTVENAHTLLEDMATNSYQWPSG
        L+PLD EI RT  R R+E RE +       ++  + + D    + +  ++     P   I  NNF      GGAL+  T + A+ LLEDMATN+YQWPS 
Subjt:  LVPLDPEIGRTIHRLRRETREPIQMTDPNPLEEPKPIRDYFQLVFQGQHSRIAYAP---INGNNFELKTAAGGALLSMTVENAHTLLEDMATNSYQWPSG

Query:  RSTPKKVAVEVFEIDNVSALQDQMSTL----------ANAFLKFSSIGKSSNRTTKLEEAVIFINTTVTGHSATIKYIKTQLRRLEKEIEETEE---PET
        RS PKK+A  + EID ++ L  Q+++L          ANA    S + +  N   +  E  +  N      SA I  +  Q+ ++     E ++   P T
Subjt:  RSTPKKVAVEVFEIDNVSALQDQMSTL----------ANAFLKFSSIGKSSNRTTKLEEAVIFINTTVTGHSATIKYIKTQLRRLEKEIEETEE---PET

Query:  EEYD------------TPTGEAEEDTSDEAEKPD--------PELSIPSP---------TVSIR-----DNPVQL----------IHEGMAGKEA-----
         E +            T     E +     EKPD        P  S  +P          V++R     + PV++          +H  +   EA     
Subjt:  EEYD------------TPTGEAEEDTSDEAEKPD--------PELSIPSP---------TVSIR-----DNPVQL----------IHEGMAGKEA-----

Query:  ----------------EGEE--------------------GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
                        E  E                    GASIN++PLS+ +KL +GE + T+V LQL D+S+  P G+IE+VL+KV +F+ P D
Subjt:  ----------------EGEE--------------------GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

TrEMBL top hits
A0A1U7V076 uncharacterized protein LOC104212287 (e value: 4.0e-10; % identity: 55.22)
Query:  GMAGKEAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        G+  ++A  + GASIN++P+S+ KKLD+GEIK T + LQ  DQS  KP GIIENVL++V +FV P+D
Subjt:  GMAGKEAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

A0A1U7WVS0 uncharacterized protein LOC104228077 (e value: 2.3e-10; % identity: 56.72)
Query:  GMAGKEAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        G+  ++A  + GASIN++P S+ KKLD+GEIK+T V LQ  DQS  KP GIIENVL++V +FV P+D
Subjt:  GMAGKEAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

A0A1U8BEX5 uncharacterized protein LOC104610168 (e value: 1.2e-09; % identity: 57.14)
Query:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        GASIN++PLS+ +KL +GE+K TL+ LQLVD S+ KP G++E+VL+KV +F+ P+D
Subjt:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

A0A5D2YWP4 Uncharacterized protein (e value: 1.5e-09; % identity: 28.12)
Query:  GGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTL------------ANAFLK------FSSIGKSSNRTTKLEEAVI-
        GG + + T E A+  +E+M+ N+YQW   R+ P KVA  VF +D V+ L +Q+  L             +  ++      F        +   LEE +  
Subjt:  GGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQMSTL------------ANAFLK------FSSIGKSSNRTTKLEEAVI-

Query:  FIN----------TTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVSIRDNPVQLIHEGMAGKEAEGEE
        FI+          T +    A+I+ ++TQ+ +L K I E  +      +T +   E++   E  K   E  +P+         +  +   +    A  + 
Subjt:  FIN----------TTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVSIRDNPVQLIHEGMAGKEAEGEE

Query:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID
        GASINV+P  + K+L +G+ K T + +QL D+++  P GIIE+VLIK+ +F+ P+D
Subjt:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPID

A0A6J1DTZ8 uncharacterized protein LOC111023979 (e value: 7.2e-12; % identity: 62.07)
Query:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPIDLY
        G +IN  PLSLC+KL+IGEIK T + +QLVD+S   P+G+IENVLIKVG+F++P+D Y
Subjt:  GASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPIDLY

SwissProt top hits
No hits found

Arabidopsis top hits
No hits found

Sequences
CDS sequence
ATGGCCCGACCCATATGGTCGACCTCGGCAAAAGGCCGAGGCCGACCATTCGGCCCGTTTGCGCGAGCCGAGCCCGGTGACCTCTTTTCGGTCCCTGATGCCCCGAATCG
CCCCGGTTCCGCCTGGTTCGTCCCGAAACACCACCGAATTCCTAAAAACCCTAGGAGGACAAACAGGCATCGGAGGCGGTGTGGCCTACACCACGCCAGTGTGCAGCGGT
TTTTGTTGGTCTTGCAGGTCACGTCTTCCCCAGTTTCTACAAATTCACTGTTGGTGTCACGTGAAGGTCAGGGATTTGGCCTAGTACCTTTGGATCCTGAGATAGGAAGA
ACAATTCATAGGCTTCGAAGGGAGACTAGAGAACCTATCCAAATGACCGATCCAAATCCACTTGAGGAGCCCAAGCCCATTAGAGATTATTTTCAGCTTGTGTTTCAAGG
GCAACATTCGAGGATTGCCTATGCCCCGATCAATGGCAACAACTTTGAGCTGAAGACCGCTGCAGGTGGGGCTTTGTTGTCCATGACCGTGGAAAATGCACATACTTTAT
TGGAGGATATGGCCACCAACAGCTACCAGTGGCCATCTGGGCGGTCTACACCAAAGAAGGTTGCAGTTGAGGTGTTTGAAATTGATAATGTAAGTGCTCTCCAAGACCAG
ATGTCTACCCTTGCTAATGCTTTCTTGAAGTTTTCAAGCATAGGGAAGTCTAGTAACAGGACAACTAAGCTAGAGGAGGCAGTTATTTTCATCAACACCACGGTGACTGG
CCACAGTGCAACCATAAAATACATCAAGACTCAGCTGAGACGGTTGGAGAAGGAGATTGAAGAGACTGAGGAGCCTGAGACTGAAGAATATGACACTCCTACTGGAGAAG
CTGAGGAGGACACATCAGATGAAGCTGAGAAGCCTGACCCTGAGCTTTCTATTCCTTCTCCCACAGTAAGCATTAGAGATAACCCCGTACAACTGATTCATGAAGGAATG
GCTGGCAAAGAAGCGGAAGGAGAAGAAGGTGCTAGCATTAATGTTATTCCCTTATCTTTATGCAAGAAATTAGACATAGGTGAGATTAAATCTACCCTTGTTAAACTGCA
ATTAGTTGATCAATCTGTAGTTAAACCATTTGGAATTATAGAAAATGTTTTAATCAAAGTAGGTAGATTTGTAATCCCTATTGATTTATATGATGGAAAACCCTTCAGTA
CCTGTCATACTAGGGAGACCATTCCTCACTACTGGATCATTGAAGAAAAAGAAGAGAAGAAGAAAGAAGAGAAAAAAATTCAGTCGCCGGCGGCGGCGGAGGCAGAGAGC
GGTGGCTGTCGGTCGTCGGACTTAGAAAGAAGTAGAAGGAGGAGGGAGAAGATGGAGAAGAAGAAGGAGGAGGAAGAAGATGGAGAAGAAGAAGAAGATGAGAGAGAATG
A
mRNA sequence
ATGGCCCGACCCATATGGTCGACCTCGGCAAAAGGCCGAGGCCGACCATTCGGCCCGTTTGCGCGAGCCGAGCCCGGTGACCTCTTTTCGGTCCCTGATGCCCCGAATCG
CCCCGGTTCCGCCTGGTTCGTCCCGAAACACCACCGAATTCCTAAAAACCCTAGGAGGACAAACAGGCATCGGAGGCGGTGTGGCCTACACCACGCCAGTGTGCAGCGGT
TTTTGTTGGTCTTGCAGGTCACGTCTTCCCCAGTTTCTACAAATTCACTGTTGGTGTCACGTGAAGGTCAGGGATTTGGCCTAGTACCTTTGGATCCTGAGATAGGAAGA
ACAATTCATAGGCTTCGAAGGGAGACTAGAGAACCTATCCAAATGACCGATCCAAATCCACTTGAGGAGCCCAAGCCCATTAGAGATTATTTTCAGCTTGTGTTTCAAGG
GCAACATTCGAGGATTGCCTATGCCCCGATCAATGGCAACAACTTTGAGCTGAAGACCGCTGCAGGTGGGGCTTTGTTGTCCATGACCGTGGAAAATGCACATACTTTAT
TGGAGGATATGGCCACCAACAGCTACCAGTGGCCATCTGGGCGGTCTACACCAAAGAAGGTTGCAGTTGAGGTGTTTGAAATTGATAATGTAAGTGCTCTCCAAGACCAG
ATGTCTACCCTTGCTAATGCTTTCTTGAAGTTTTCAAGCATAGGGAAGTCTAGTAACAGGACAACTAAGCTAGAGGAGGCAGTTATTTTCATCAACACCACGGTGACTGG
CCACAGTGCAACCATAAAATACATCAAGACTCAGCTGAGACGGTTGGAGAAGGAGATTGAAGAGACTGAGGAGCCTGAGACTGAAGAATATGACACTCCTACTGGAGAAG
CTGAGGAGGACACATCAGATGAAGCTGAGAAGCCTGACCCTGAGCTTTCTATTCCTTCTCCCACAGTAAGCATTAGAGATAACCCCGTACAACTGATTCATGAAGGAATG
GCTGGCAAAGAAGCGGAAGGAGAAGAAGGTGCTAGCATTAATGTTATTCCCTTATCTTTATGCAAGAAATTAGACATAGGTGAGATTAAATCTACCCTTGTTAAACTGCA
ATTAGTTGATCAATCTGTAGTTAAACCATTTGGAATTATAGAAAATGTTTTAATCAAAGTAGGTAGATTTGTAATCCCTATTGATTTATATGATGGAAAACCCTTCAGTA
CCTGTCATACTAGGGAGACCATTCCTCACTACTGGATCATTGAAGAAAAAGAAGAGAAGAAGAAAGAAGAGAAAAAAATTCAGTCGCCGGCGGCGGCGGAGGCAGAGAGC
GGTGGCTGTCGGTCGTCGGACTTAGAAAGAAGTAGAAGGAGGAGGGAGAAGATGGAGAAGAAGAAGGAGGAGGAAGAAGATGGAGAAGAAGAAGAAGATGAGAGAGAATG
A
Protein sequence
MARPIWSTSAKGRGRPFGPFARAEPGDLFSVPDAPNRPGSAWFVPKHHRIPKNPRRTNRHRRRCGLHHASVQRFLLVLQVTSSPVSTNSLLVSREGQGFGLVPLDPEIGR
TIHRLRRETREPIQMTDPNPLEEPKPIRDYFQLVFQGQHSRIAYAPINGNNFELKTAAGGALLSMTVENAHTLLEDMATNSYQWPSGRSTPKKVAVEVFEIDNVSALQDQ
MSTLANAFLKFSSIGKSSNRTTKLEEAVIFINTTVTGHSATIKYIKTQLRRLEKEIEETEEPETEEYDTPTGEAEEDTSDEAEKPDPELSIPSPTVSIRDNPVQLIHEGM
AGKEAEGEEGASINVIPLSLCKKLDIGEIKSTLVKLQLVDQSVVKPFGIIENVLIKVGRFVIPIDLYDGKPFSTCHTRETIPHYWIIEEKEEKKKEEKKIQSPAAAEAES
GGCRSSDLERSRRRREKMEKKKEEEEDGEEEEDERE