CuGenDBv2

Lag0041884 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0041884
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Integrase catalytic domain-containing protein
Genome location: chr13:30515523..30518143
RNA-Seq Expression: Lag0041884
Synteny: Lag0041884
Gene Ontology terms: GO:0003676 - nucleic acid binding (molecular function)
InterPro domains: IPR005162 - Retrotransposon gag domain
                  IPR012337 - Ribonuclease H-like superfamily
                  IPR036397 - Ribonuclease H superfamily
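The genome location above uses a `chromosome:start..end` form. As a minimal sketch, the snippet below splits such a string into its parts and computes the span length. The function names `parse_location` and `span_length` are hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which the page itself does not state.

```python
import re


def parse_location(location: str) -> tuple:
    """Split 'chr13:30515523..30518143' into (chromosome, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the interval, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_location("chr13:30515523..30518143")
print(chrom, start, end, span_length(start, end))
```

Under that coordinate assumption the gene spans 2621 bp on chr13. This is longer than the CDS listed under Sequences, which would be consistent with introns or other non-coding sequence within the locus.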


Homology
GenBank top hits (e value, % identity, alignment)
XP_022148562.1 uncharacterized protein LOC111017196 [Momordica charantia] (e value: 4.8e-78; % identity: 50.89)
Query:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK
        M I LTVKNK G VDGS+ +P      SWIICN+VV AWILNSLS E+SASV F +SAREIWLDLQ+R+QR+NR RIFQLRR++S L QDQ +++ YF K
Subjt:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK

Query:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQR-ASTTPSSSTSIPATALLVK--
        LK++W EL AYRP+CSC RC+CGGVK L  YFQTEYVM FLMGLNDSF+Q+R+ LLLM P PTI  AF L+AQE++QR  S    +++S  +TA  V   
Subjt:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQR-ASTTPSSSTSIPATALLVK--

Query:  --------TNSANPSSTQASNSNKKKERPFCTHC-----------------------NIQG-HIKVASSKSETPATSANASLSDQLSGLNVEQCQGLLTL
                +NS+  S+   SN  K+KE+  CTHC                       N +G H   A +KS    +   A++ D LS +N +QC GL  +
Subjt:  --------TNSANPSSTQASNSNKKKERPFCTHC-----------------------NIQG-HIKVASSKSETPATSANASLSDQLSGLNVEQCQGLLTL

Query:  LQSHLNKVKVDPSDASSNTHIAGQVLFRMIGKARIWQG
        LQSHL KVK      S  +H+AGQVLF        WQG
Subjt:  LQSHLNKVKVDPSDASSNTHIAGQVLFRMIGKARIWQG

XP_022154919.1 uncharacterized protein LOC111022065 [Momordica charantia] (e value: 2.3e-104; % identity: 50.72)
Query:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY
        S++++I LTVKNK+GFVDGS+ +PT     SWIICN+VV +WI NSLS ++SASV F +SA EIWLDL++R+QR+NR RIFQLRRE+SNL QDQ ++T Y
Subjt:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY

Query:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK
        F +LK++W+EL  YRP+CSC RCS GGVK +  ++Q EYVMAFLMGLN SF+Q+R+QLLLMEP PTI RAF+LVAQE++QR+ + P S TS  A+A+   
Subjt:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK

Query:  TNSANPSSTQASNSN-KKKERPFCTHCNIQGHI-------------------------KVASSKSETPATSANAS---LSDQLSGLNVEQCQGLLTLLQS
        +NS+N     +S S+ K+K++  CTHC I GH                            +S  +E P+ S +A+   +S+ L+ L  +QCQ LLTLLQS
Subjt:  TNSANPSSTQASNSN-KKKERPFCTHCNIQGHI-------------------------KVASSKSETPATSANAS---LSDQLSGLNVEQCQGLLTLLQS

Query:  HLNKVKVDPSDASSNTHIAGQVLFRMIGKARI-------------WQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYI
        HL   K    + S  +H+A     + I + R               +GV HQFSCVG PEQNSVVER+HQHLL++AR+L F+SR+P   WGE +LTAAY+
Subjt:  HLNKVKVDPSDASSNTHIAGQVLFRMIGKARI-------------WQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYI

Query:  ANWTPSRELNWQTPYYKL
         N TP+  L+W TPY +L
Subjt:  ANWTPSRELNWQTPYYKL

XP_038896588.1 uncharacterized protein LOC120084843 [Benincasa hispida] (e value: 1.7e-67; % identity: 54.61)
Query:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK
        M IGLTVKNK+GF++G + +P                         E+SAS+NF  SA+EIW DLQ+ YQRKNR R+FQL REISNL+Q+Q ++TTY+ K
Subjt:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK

Query:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRA--STTPSSSTSIPATALLVKT
        LK +WNEL +Y PSCSC +C+CGGVK L TYFQTEYVMAFLMGLNDS A +RSQLLLMEPEP+I RAFSLVAQE++Q+A  S+  ++ +S   TALLVK 
Subjt:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRA--STTPSSSTSIPATALLVKT

Query:  NSA-----NPSSTQASNSNKKKERPFCTHCNIQGHI---------KVASSKSETPATSANASLSDQLSGLNVEQCQGLLTLL
         S+      PS ++ +N NKKK++P CTHCNIQGH               K   P  SA A  SD LS L  EQCQGLL +L
Subjt:  NSA-----NPSSTQASNSNKKKERPFCTHCNIQGHI---------KVASSKSETPATSANASLSDQLSGLNVEQCQGLLTLL

XP_038904477.1 uncharacterized protein LOC120090845 [Benincasa hispida] (e value: 2.0e-84; % identity: 61.35)
Query:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK
        M IGLTVKNK+GF++G + +P+  L  SWIICN +VT WILNSLS E+SAS+NF +SA+EIW+DLQ+RYQRKNR R+FQLRRE SNL+Q+Q ++TTY+AK
Subjt:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK

Query:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRA--STTPSSSTSIPATALLVKT
        LK++WNEL +YRPSCSC +C+CGGVK L TYFQTEYV+AFLMGLNDS A +RSQLLLMEP+PTI RAFSLVAQE++Q+A  S+  ++ +S  ATALLVK 
Subjt:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRA--STTPSSSTSIPATALLVKT

Query:  NSAN-----PSSTQASNSNKKKERPFCTHCNIQGHI---------KVASSKSETPATSANASLSDQLSGLNVEQCQGLLTLL
         S+      PS +  +N NKKK+RP CTHC+IQGH               K   P  SA A  SD LS L  EQCQGLL +L
Subjt:  NSAN-----PSSTQASNSNKKKERPFCTHCNIQGHI---------KVASSKSETPATSANASLSDQLSGLNVEQCQGLLTLL

XP_038905564.1 uncharacterized protein LOC120091546 [Benincasa hispida] (e value: 1.8e-77; % identity: 53.97)
Query:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK
        M IGLT+KNK+GF+D S++ PT  + +SWI+CNSVVT WILNSLS E+SASV F + A++IW+DLQ++YQRKN    +Q+RRE+SNL Q Q ++TTY+AK
Subjt:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK

Query:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-----SSTSIPATALL
         K++WNEL +YRPSCSC RC+C GVK+LNTY QTE+VM FLMGLN+SF+Q+ +QLLLME EP+I +AFS V QE+EQR  T PS     S+ S    ALL
Subjt:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-----SSTSIPATALL

Query:  VKTNSANPSSTQA--SNSNKKKERPFCTHCNIQGHI------------------KVASSKSETPATSANASLSDQLSGLN--VEQCQGLLTLLQSHLNKV
        VK  S+N +S Q+  SN+NKKK+R   THCNI GH                   K +SSKSE+ ++++ +  SD  S  N   +Q QGLL + QSHL K 
Subjt:  VKTNSANPSSTQA--SNSNKKKERPFCTHCNIQGHI------------------KVASSKSETPATSANASLSDQLSGLN--VEQCQGLLTLLQSHLNKV

Query:  KVDPSDASSNTHIAG
        K++ ++ SS  HIAG
Subjt:  KVDPSDASSNTHIAG

TrEMBL top hits (e value, % identity, alignment)
A0A5J5B2C5 Uncharacterized protein (e value: 2.8e-63; % identity: 42.66)
Query:  SQAMIIGLTVKNKMGFVDGSLLQPTST---LRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTL
        S+AM+I L+VKNK+GFVDG + +P  T   L  SWI  N++V +WILNS+S E+SAS+ F   AREIWLDL+ R+Q++N  RIFQL+RE+ NL Q+Q ++
Subjt:  SQAMIIGLTVKNKMGFVDGSLLQPTST---LRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTL

Query:  TTYFAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-SSTSIPATA
        + YF K+K+IW EL+ YRP+CSC +C CGGVK LN Y QTEY+M+FLMGL+DSF+QV  QLLLM+  P I R FSL+ QE +QR +   S SS S    A
Subjt:  TTYFAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-SSTSIPATA

Query:  LLVKTNSANPSSTQASNS---------NKKKERPFCTHCNIQGH--------------IKVASSKSETPA---TSANASLSDQ-------LSGLNVEQCQ
         +VKT+ A    + + NS         N+K++RP+CTHC I GH               K  S+ +   A    S +   SDQ       +  LN  Q Q
Subjt:  LLVKTNSANPSSTQASNS---------NKKKERPFCTHCNIQGH--------------IKVASSKSETPA---TSANASLSDQ-------LSGLNVEQCQ

Query:  GLLTLLQSHL-NKVKVDPSDASSNTHIAGQVLFRMIGKARIWQGVTHQFSCVGHPEQNSVV
         L+++L +HL +  KV  +   S T+    +    +    +  GV      +  P+ NS +
Subjt:  GLLTLLQSHL-NKVKVDPSDASSNTHIAGQVLFRMIGKARIWQGVTHQFSCVGHPEQNSVV

A0A5J5BKC2 Uncharacterized protein (e value: 1.1e-62; % identity: 54.51)
Query:  SQAMIIGLTVKNKMGFVDGSLLQPTST---LRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTL
        S+AM+I L+VKNK+GFVDGS+L+P  T   L  SWI  N++V +WILNS+S E+SAS+ F  SAREIWLDL+ R+Q++NR RIFQL+RE+ NL Q+Q ++
Subjt:  SQAMIIGLTVKNKMGFVDGSLLQPTST---LRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTL

Query:  TTYFAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-SSTSIPATA
        + YF KLK+IW EL+ YR +CSC +CSCGGVK LN + Q EY+M+FLMGL+DSF+QVR QLLLM+P P I R FSL+ QE +QR + + S SS S    A
Subjt:  TTYFAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPS-SSTSIPATA

Query:  LLVKTNSANPSSTQASNS---------NKKKERPFCTHCNIQGH
          VKT+ A    + + NS         N+K++R +C HC I GH
Subjt:  LLVKTNSANPSSTQASNS---------NKKKERPFCTHCNIQGH

A0A6J1D5E3 uncharacterized protein LOC111017196 (e value: 2.3e-78; % identity: 50.89)
Query:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK
        M I LTVKNK G VDGS+ +P      SWIICN+VV AWILNSLS E+SASV F +SAREIWLDLQ+R+QR+NR RIFQLRR++S L QDQ +++ YF K
Subjt:  MIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAK

Query:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQR-ASTTPSSSTSIPATALLVK--
        LK++W EL AYRP+CSC RC+CGGVK L  YFQTEYVM FLMGLNDSF+Q+R+ LLLM P PTI  AF L+AQE++QR  S    +++S  +TA  V   
Subjt:  LKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQR-ASTTPSSSTSIPATALLVK--

Query:  --------TNSANPSSTQASNSNKKKERPFCTHC-----------------------NIQG-HIKVASSKSETPATSANASLSDQLSGLNVEQCQGLLTL
                +NS+  S+   SN  K+KE+  CTHC                       N +G H   A +KS    +   A++ D LS +N +QC GL  +
Subjt:  --------TNSANPSSTQASNSNKKKERPFCTHC-----------------------NIQG-HIKVASSKSETPATSANASLSDQLSGLNVEQCQGLLTL

Query:  LQSHLNKVKVDPSDASSNTHIAGQVLFRMIGKARIWQG
        LQSHL KVK      S  +H+AGQVLF        WQG
Subjt:  LQSHLNKVKVDPSDASSNTHIAGQVLFRMIGKARIWQG

A0A6J1DIP8 uncharacterized protein LOC111020399 (e value: 1.1e-64; % identity: 58.56)
Query:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY
        S++M+I LTVKNK+GFVDGS+++PT  L  SWIICN+VV +WILNSLS E+SAS+ F +SAREIWLDL++R++++NR RIFQLRR++SNL QDQ +++ Y
Subjt:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY

Query:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK
        F  LK++W EL +Y PSC+  RCSCGGVKE+  + Q E+VM FLMGLN+SF+Q+R QLLLMEPEPTI R FSLV+QE +QRA  T +S  ++P TAL+ +
Subjt:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK

Query:  TNSANPSSTQASNSNKKKERPF
        ++S++  S   SNS+    + F
Subjt:  TNSANPSSTQASNSNKKKERPF

A0A6J1DNP7 uncharacterized protein LOC111022065 (e value: 1.1e-104; % identity: 50.72)
Query:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY
        S++++I LTVKNK+GFVDGS+ +PT     SWIICN+VV +WI NSLS ++SASV F +SA EIWLDL++R+QR+NR RIFQLRRE+SNL QDQ ++T Y
Subjt:  SQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTY

Query:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK
        F +LK++W+EL  YRP+CSC RCS GGVK +  ++Q EYVMAFLMGLN SF+Q+R+QLLLMEP PTI RAF+LVAQE++QR+ + P S TS  A+A+   
Subjt:  FAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIPATALLVK

Query:  TNSANPSSTQASNSN-KKKERPFCTHCNIQGHI-------------------------KVASSKSETPATSANAS---LSDQLSGLNVEQCQGLLTLLQS
        +NS+N     +S S+ K+K++  CTHC I GH                            +S  +E P+ S +A+   +S+ L+ L  +QCQ LLTLLQS
Subjt:  TNSANPSSTQASNSN-KKKERPFCTHCNIQGHI-------------------------KVASSKSETPATSANAS---LSDQLSGLNVEQCQGLLTLLQS

Query:  HLNKVKVDPSDASSNTHIAGQVLFRMIGKARI-------------WQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYI
        HL   K    + S  +H+A     + I + R               +GV HQFSCVG PEQNSVVER+HQHLL++AR+L F+SR+P   WGE +LTAAY+
Subjt:  HLNKVKVDPSDASSNTHIAGQVLFRMIGKARI-------------WQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYI

Query:  ANWTPSRELNWQTPYYKL
         N TP+  L+W TPY +L
Subjt:  ANWTPSRELNWQTPYYKL

SwissProt top hits (e value, % identity, alignment)
P04146 Copia protein (e value: 9.7e-05; % identity: 28.85)
Query:  ASSNTHIAGQVLFRMIGKAR-----------IWQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSREL--NW
        A S  H   +V++  I   R           + +G+++  +    P+ N V ER  + + + AR ++  ++L    WGE++LTA Y+ N  PSR L  + 
Subjt:  ASSNTHIAGQVLFRMIGKAR-----------IWQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSREL--NW

Query:  QTPY
        +TPY
Subjt:  QTPY

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e value: 2.3e-06; % identity: 34.33)
Query:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTP
        G+ H+ +  G P+ N V ER ++ +++  R++L  ++LP   WGE++ TA Y+ N +PS  L ++ P
Subjt:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTP

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 (e value: 1.7e-04; % identity: 30.99)
Query:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTPYYKL
        G++H  S    PE N + ER+H+H+++    LL  + +P   W  +   A Y+ N  P+  L  ++P+ KL
Subjt:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTPYYKL

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 (e value: 9.7e-05; % identity: 32.39)
Query:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTPYYKL
        G++H  S    PE N + ER+H+H++++   LL  + +P   W  +   A Y+ N  P+  L  Q+P+ KL
Subjt:  GVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTPYYKL

Arabidopsis top hits (e value, % identity, alignment)
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). (e value: 3.3e-24; % identity: 34.68)
Query:  LTVKNKMGFVDGSLLQPT--STLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAKLK
        L V  K GF+DG+L +P   S L + W  CN++V  W++NS+++++  SV + E+A ++W DL++ +      +I+QLRR ++ L Q   ++  YF KL 
Subjt:  LTVKNKMGFVDGSLLQPT--STLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQLRREISNLAQDQQTLTTYFAKLK

Query:  SIWNELTAYR--PSCSCRRCSCGGVKELNTYFQTEYVMAFLMG--LNDSFAQVRSQLLLMEPEPTIQRAFSLV
         +W EL+ Y   P C C  C+C   K      + E    FLMG  LN  F  V ++++  +P P++  AF++V
Subjt:  SIWNELTAYR--PSCSCRRCSCGGVKELNTYFQTEYVMAFLMG--LNDSFAQVRSQLLLMEPEPTIQRAFSLV
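In the BLAST-style alignments above, the unlabelled middle line shows a letter where query and subject residues are identical, a `+` for a conservative substitution, and a space otherwise. A rough sketch of how a percent-identity figure can be derived from such a midline follows. The helper `percent_identity` is hypothetical, and it assumes identity is counted as identical columns divided by the total number of aligned columns (gaps included). The values reported on this page are computed over each full alignment and may be rounded differently.

```python
def percent_identity(midline: str, aligned_length: int) -> float:
    """Percent identity from a BLAST-style midline.

    Letters in the midline mark identical residues; '+' and spaces do not
    count. `aligned_length` is the number of alignment columns, gaps included.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / aligned_length


# First ten columns of the first GenBank alignment: query 'MIIGLTVKNK'
# against midline 'M I LTVKNK' (8 identical residues out of 10 columns).
print(percent_identity("M I LTVKNK", 10))
```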


Sequences
CDS sequence
ATGGTTGACGACACTGTGAACGAGACGGCAACCATTCCCTCCGATTCATCTGTTCCAGTATCTTCTTCCACGATTCAACCCAGTCAAGCAATGATCATAGGCCTAACCGT
GAAGAACAAAATGGGATTTGTTGATGGTTCTCTGCTACAACCTACTAGCACATTGAGACGATCTTGGATTATTTGTAATAGCGTGGTAACCGCCTGGATCTTGAATTCCC
TCTCCAATGAAGTTTCCGCCAGCGTCAATTTCGTTGAATCTGCTCGCGAAATCTGGCTTGATCTCCAACAGAGGTACCAGAGAAAGAATCGGTCGCGTATCTTCCAATTG
CGGCGCGAAATCTCAAATCTCGCTCAAGATCAACAGACTCTTACTACCTATTTCGCCAAACTCAAATCTATATGGAATGAACTCACCGCTTATCGACCGTCATGTTCCTG
CAGACGGTGTTCTTGTGGGGGAGTGAAGGAGTTAAACACATATTTTCAGACCGAATACGTGATGGCCTTCTTAATGGGCCTTAACGACTCCTTTGCTCAGGTTCGCTCCC
AACTCCTACTAATGGAACCCGAACCCACCATCCAACGAGCTTTCTCCCTTGTCGCACAAGAACTGGAGCAAAGGGCTTCGACAACACCCTCTTCCTCTACATCTATCCCT
GCTACTGCTCTTCTGGTCAAGACCAACTCTGCTAACCCAAGTTCTACTCAAGCCTCGAATTCGAACAAAAAGAAGGAGAGGCCTTTCTGCACCCATTGCAATATTCAAGG
CCATATTAAGGTCGCTTCATCCAAGTCAGAGACACCCGCAACATCAGCAAATGCGTCTCTCAGCGATCAGTTGTCTGGTTTGAATGTCGAGCAATGTCAAGGACTGCTCA
CCTTACTACAGTCTCATCTTAATAAAGTCAAGGTTGATCCTTCTGATGCTTCGAGCAATACACACATCGCAGGACAAGTACTCTTTAGGATGATTGGCAAAGCTAGGATA
TGGCAAGGTGTTACTCATCAATTTTCTTGTGTGGGTCATCCCGAACAGAACTCAGTGGTTGAGAGACGACATCAACACTTACTCGACATTGCTCGTGCCTTACTTTTTAA
GTCTCGGTTGCCTATTCAGTTATGGGGAGAGTCCATTTTAACAGCTGCATATATTGCTAACTGGACTCCTTCACGCGAGCTCAACTGGCAAACTCCTTATTACAAACTTA
ATCAACAAAACTCAGAAGAAGAAGTTGAACCAGAGTACGATCTGGACGACCCATTTCTCGACTCGCAACCTATGTGA
mRNA sequence
ATGGTTGACGACACTGTGAACGAGACGGCAACCATTCCCTCCGATTCATCTGTTCCAGTATCTTCTTCCACGATTCAACCCAGTCAAGCAATGATCATAGGCCTAACCGT
GAAGAACAAAATGGGATTTGTTGATGGTTCTCTGCTACAACCTACTAGCACATTGAGACGATCTTGGATTATTTGTAATAGCGTGGTAACCGCCTGGATCTTGAATTCCC
TCTCCAATGAAGTTTCCGCCAGCGTCAATTTCGTTGAATCTGCTCGCGAAATCTGGCTTGATCTCCAACAGAGGTACCAGAGAAAGAATCGGTCGCGTATCTTCCAATTG
CGGCGCGAAATCTCAAATCTCGCTCAAGATCAACAGACTCTTACTACCTATTTCGCCAAACTCAAATCTATATGGAATGAACTCACCGCTTATCGACCGTCATGTTCCTG
CAGACGGTGTTCTTGTGGGGGAGTGAAGGAGTTAAACACATATTTTCAGACCGAATACGTGATGGCCTTCTTAATGGGCCTTAACGACTCCTTTGCTCAGGTTCGCTCCC
AACTCCTACTAATGGAACCCGAACCCACCATCCAACGAGCTTTCTCCCTTGTCGCACAAGAACTGGAGCAAAGGGCTTCGACAACACCCTCTTCCTCTACATCTATCCCT
GCTACTGCTCTTCTGGTCAAGACCAACTCTGCTAACCCAAGTTCTACTCAAGCCTCGAATTCGAACAAAAAGAAGGAGAGGCCTTTCTGCACCCATTGCAATATTCAAGG
CCATATTAAGGTCGCTTCATCCAAGTCAGAGACACCCGCAACATCAGCAAATGCGTCTCTCAGCGATCAGTTGTCTGGTTTGAATGTCGAGCAATGTCAAGGACTGCTCA
CCTTACTACAGTCTCATCTTAATAAAGTCAAGGTTGATCCTTCTGATGCTTCGAGCAATACACACATCGCAGGACAAGTACTCTTTAGGATGATTGGCAAAGCTAGGATA
TGGCAAGGTGTTACTCATCAATTTTCTTGTGTGGGTCATCCCGAACAGAACTCAGTGGTTGAGAGACGACATCAACACTTACTCGACATTGCTCGTGCCTTACTTTTTAA
GTCTCGGTTGCCTATTCAGTTATGGGGAGAGTCCATTTTAACAGCTGCATATATTGCTAACTGGACTCCTTCACGCGAGCTCAACTGGCAAACTCCTTATTACAAACTTA
ATCAACAAAACTCAGAAGAAGAAGTTGAACCAGAGTACGATCTGGACGACCCATTTCTCGACTCGCAACCTATGTGA
Protein sequence
MVDDTVNETATIPSDSSVPVSSSTIQPSQAMIIGLTVKNKMGFVDGSLLQPTSTLRRSWIICNSVVTAWILNSLSNEVSASVNFVESAREIWLDLQQRYQRKNRSRIFQL
RREISNLAQDQQTLTTYFAKLKSIWNELTAYRPSCSCRRCSCGGVKELNTYFQTEYVMAFLMGLNDSFAQVRSQLLLMEPEPTIQRAFSLVAQELEQRASTTPSSSTSIP
ATALLVKTNSANPSSTQASNSNKKKERPFCTHCNIQGHIKVASSKSETPATSANASLSDQLSGLNVEQCQGLLTLLQSHLNKVKVDPSDASSNTHIAGQVLFRMIGKARI
WQGVTHQFSCVGHPEQNSVVERRHQHLLDIARALLFKSRLPIQLWGESILTAAYIANWTPSRELNWQTPYYKLNQQNSEEEVEPEYDLDDPFLDSQPM
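The protein sequence above corresponds to a codon-by-codon translation of the CDS. As a small sketch, the code below translates a nucleotide string using the standard genetic code and stops at the first stop codon. The function name `translate` is hypothetical, and the choice of the standard nuclear code is an assumption, since the page does not say which translation table was used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    cds = cds.upper().replace("\n", "").replace(" ", "")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 27 nt and last 15 nt of the CDS listed above.
print(translate("ATGGTTGACGACACTGTGAACGAGACG"))  # MVDDTVNET
print(translate("TCGCAACCTATGTGA"))              # SQPM
```

The two fragments give `MVDDTVNET` and `SQPM`, matching the start and end of the protein sequence shown on this page.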