CuGenDBv2

Moc03g05740 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc03g05740
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: chr3:4238904..4247490
RNA-Seq Expression: Moc03g05740
Synteny: Moc03g05740
Gene Ontology terms: GO:0016021 - integral component of membrane (cellular component)
InterPro domains: NA


Homology
GenBank top hits (e-value, % identity, alignment)
XP_022144034.1 uncharacterized protein LOC111013826 [Momordica charantia]; e-value 1.2e-111; identity 94.86%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFAS

Query:  SVKRKSKGRAHGLK
         VKRKSKGRAH L+
Subjt:  SVKRKSKGRAHGLK

XP_022158122.1 uncharacterized protein LOC111024680 [Momordica charantia]; e-value 1.0e-102; identity 97.4%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSEL
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSEL

XP_022158650.1 uncharacterized protein LOC111025108 [Momordica charantia]; e-value 1.9e-101; identity 95.36%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGA GIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAM
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVR IESSRPNSEL M
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAM

XP_022159063.1 uncharacterized protein LOC111025502, partial [Momordica charantia]; e-value 2.3e-163; identity 94.14%
Query:  MSFSFSSNLGSDLARRLEFELEEVENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR
        MS S SSNL SDLARRLE +LEE+EN R SDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGF IPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR
Subjt:  MSFSFSSNLGSDLARRLEFELEEVENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR

Query:  LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASG
        LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AEL  VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVRKWFYASG
Subjt:  LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASG

Query:  EWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFASSVKRKSK
        EWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IESSRPNSELAMVCGFAS VKRKSK
Subjt:  EWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFASSVKRKSK

Query:  GRAHGLK
        GRAH L+
Subjt:  GRAHGLK

XP_022159252.1 uncharacterized protein LOC111025665 [Momordica charantia]; e-value 4.5e-87; identity 41.98%
Query:  MYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNP
        M ARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+FF+VPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTD+LLLESGLLDYNP
Subjt:  MYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNP

Query:  AVRLIESSRPNSELAMVCGFASSVKRKSKGRAHGLKDHLGVQ---------------------------------------------------INQ----
         VRLIE+SRPNSELAMVCGF  SVKRKSKGRAH LK  +G +                                                   +N+    
Subjt:  AVRLIESSRPNSELAMVCGFASSVKRKSKGRAHGLKDHLGVQ---------------------------------------------------INQ----

Query:  -------KKQKDGKTFMGGARRLGSLQKNWF------------SSNFAFNETGLPMRFGGSNRCIRVE--------------------------------
               KK+K   +   GAR  G+L  +              +SN        P   G  ++  R+                                 
Subjt:  -------KKQKDGKTFMGGARRLGSLQKNWF------------SSNFAFNETGLPMRFGGSNRCIRVE--------------------------------

Query:  ------------------------------------------------EVFHYQFEHDLRVVFLEERA------------------RDTPPCLKEKGEAL
                                                        EV   + E D +V  L++                    ++    LKEK +  
Subjt:  ------------------------------------------------EVFHYQFEHDLRVVFLEERA------------------RDTPPCLKEKGEAL

Query:  N--PEKNSPNNQ---SLETAKERLSNGVLLEESFRQHPDFDGFAKDFSDAGFKFFMKGIASDMPDLQIDLSGLKRRYAEKWASGPGGTLGPQALVDQYVR
            EK++   +    L+  KERL+NG LLEESFRQHPDFDGFAKDFSDAGFKF MKGIA+DMP LQIDL+GLK++Y+EKWASGP GT  PQ+LVD+YVR
Subjt:  N--PEKNSPNNQ---SLETAKERLSNGVLLEESFRQHPDFDGFAKDFSDAGFKFFMKGIASDMPDLQIDLSGLKRRYAEKWASGPGGTLGPQALVDQYVR

Query:  DLDSDYSDPEED--------QVGSTQEGAP--QAGS
        +LDSDYSD EE+        +VG+TQE  P  Q GS
Subjt:  DLDSDYSDPEED--------QVGSTQEGAP--QAGS

TrEMBL top hits (e-value, % identity, alignment)
A0A6J1CR42 uncharacterized protein LOC111013826; e-value 5.7e-112; identity 94.86%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFAS

Query:  SVKRKSKGRAHGLK
         VKRKSKGRAH L+
Subjt:  SVKRKSKGRAHGLK

A0A6J1DWD2 uncharacterized protein LOC111024680; e-value 4.8e-103; identity 97.4%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSEL
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSEL

A0A6J1DWF1 uncharacterized protein LOC111025108; e-value 9.1e-102; identity 95.36%
Query:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR
        MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AELL VDQLLACFEAKRIAKKPGRFYM ARKGA GIVKGPTSIKGWVR
Subjt:  MFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAM
        KWFYASGEWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVR IESSRPNSEL M
Subjt:  KWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAM

A0A6J1DXS5 uncharacterized protein LOC111025502; e-value 1.1e-163; identity 94.14%
Query:  MSFSFSSNLGSDLARRLEFELEEVENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR
        MS S SSNL SDLARRLE +LEE+EN R SDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGF IPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR
Subjt:  MSFSFSSNLGSDLARRLEFELEEVENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYGLR

Query:  LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASG
        LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSE+AEL  VDQLLACFEAKRIAKKPGRFYM ARKGAGGIVKGPTSIKGWVRKWFYASG
Subjt:  LPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASG

Query:  EWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFASSVKRKSK
        EWLAKDESGRSFF+VPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVR IESSRPNSELAMVCGFAS VKRKSK
Subjt:  EWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFASSVKRKSK

Query:  GRAHGLK
        GRAH L+
Subjt:  GRAHGLK

A0A6J1DZB3 uncharacterized protein LOC111025665; e-value 2.2e-87; identity 41.98%
Query:  MYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNP
        M ARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+FF+VPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTD+LLLESGLLDYNP
Subjt:  MYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNP

Query:  AVRLIESSRPNSELAMVCGFASSVKRKSKGRAHGLKDHLGVQ---------------------------------------------------INQ----
         VRLIE+SRPNSELAMVCGF  SVKRKSKGRAH LK  +G +                                                   +N+    
Subjt:  AVRLIESSRPNSELAMVCGFASSVKRKSKGRAHGLKDHLGVQ---------------------------------------------------INQ----

Query:  -------KKQKDGKTFMGGARRLGSLQKNWF------------SSNFAFNETGLPMRFGGSNRCIRVE--------------------------------
               KK+K   +   GAR  G+L  +              +SN        P   G  ++  R+                                 
Subjt:  -------KKQKDGKTFMGGARRLGSLQKNWF------------SSNFAFNETGLPMRFGGSNRCIRVE--------------------------------

Query:  ------------------------------------------------EVFHYQFEHDLRVVFLEERA------------------RDTPPCLKEKGEAL
                                                        EV   + E D +V  L++                    ++    LKEK +  
Subjt:  ------------------------------------------------EVFHYQFEHDLRVVFLEERA------------------RDTPPCLKEKGEAL

Query:  N--PEKNSPNNQ---SLETAKERLSNGVLLEESFRQHPDFDGFAKDFSDAGFKFFMKGIASDMPDLQIDLSGLKRRYAEKWASGPGGTLGPQALVDQYVR
            EK++   +    L+  KERL+NG LLEESFRQHPDFDGFAKDFSDAGFKF MKGIA+DMP LQIDL+GLK++Y+EKWASGP GT  PQ+LVD+YVR
Subjt:  N--PEKNSPNNQ---SLETAKERLSNGVLLEESFRQHPDFDGFAKDFSDAGFKFFMKGIASDMPDLQIDLSGLKRRYAEKWASGPGGTLGPQALVDQYVR

Query:  DLDSDYSDPEED--------QVGSTQEGAP--QAGS
        +LDSDYSD EE+        +VG+TQE  P  Q GS
Subjt:  DLDSDYSDPEED--------QVGSTQEGAP--QAGS

SwissProt top hits (e-value, % identity, alignment)
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94; e-value 7.4e-16; identity 28.4%
Query:  ETSVNSHINGLTDMLNKLEGMSIKIDEEVMAMRLLTSLPDIWETMKTAVSN---SLRD------SNDYGLVRMWNVSVSKVKGIEDVCLKTIEGAELVLR
        ETS   + +    M+   + + + I+EE   M L  S P+    + TA S+    +RD      + D+G V+M N S SK+ GI D+C+KT  G  LVL+
Subjt:  ETSVNSHINGLTDMLNKLEGMSIKIDEEVMAMRLLTSLPDIWETMKTAVSN---SLRD------SNDYGLVRMWNVSVSKVKGIEDVCLKTIEGAELVLR

Query:  DVRYIPRFILNLLFARKLDDDRYNSEFVEGCWKLKRESKIVATDHKRSSIYVSEFGVAKGSLRQRMHKIATDGELVKLHKRISALKGKS----SISSVAT
        DVR++P   +NL+    LD D Y S F    W+L + S ++A    R ++Y +   + +G L     +I+ D      HKR+  +  K     +  S+ +
Subjt:  DVRYIPRFILNLLFARKLDDDRYNSEFVEGCWKLKRESKIVATDHKRSSIYVSEFGVAKGSLRQRMHKIATDGELVKLHKRISALKGKS----SISSVAT

Query:  YLGGST-QP------NVERKSSFRVSEQ---NQIELGEAEQCRSFQVSGL
        Y  G+T +P        + + SF+ S +   N ++L  ++ C   ++  +
Subjt:  YLGGST-QP------NVERKSSFRVSEQ---NQIELGEAEQCRSFQVSGL

Q9LEX8 Uncharacterized protein At3g60930, chloroplastic; e-value 2.6e-05; identity 26.73%
Query:  SRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYG--LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDA
        S   E  L  L+  F +   + LR+P   ERAD+PP G+ TLY + F YG  L LP+   V E++    +A +Q+       + +L  L  +  R  E  
Subjt:  SRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVTLYFKMFEYG--LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDA

Query:  ELLGVDQLLACFEAKRIAK-KPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFG----NLVSIRPVPELTQASFDTLKY
          + +  L    E +R+ K +  R+Y+   KG   I   P+  + +   +F+ + E    ++       V TR+G     L  + P+P+   ++F  L  
Subjt:  ELLGVDQLLACFEAKRIAK-KPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFNVPTRFG----NLVSIRPVPELTQASFDTLKY

Query:  YK
         K
Subjt:  YK

Arabidopsis top hits (e-value, % identity, alignment)
AT2G15420.1 myosin heavy chain-related; e-value 6.4e-07; identity 31.34%
Query:  PENILLRLPEEGERADNPPEGWVTLYFKMF-EYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIA
        P  I L  P+  +R   PPEG++ LY   F   GL  PL  F+ E+  R  +A +Q+          LAIL       +E    +  D         R+ 
Subjt:  PENILLRLPEEGERADNPPEGWVTLYFKMF-EYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIA

Query:  KKPGRFYMYARKGAGGIVKGPTS-IKGWVRKWFY
        + PG +Y  A K    IV G  S I GW R++F+
Subjt:  KKPGRFYMYARKGAGGIVKGPTS-IKGWVRKWFY


Sequences
CDS sequence
ATGGGTGTATCTAGCATCGTGGCGAAAGAGACAACAACGAAAGAACTGATGAAGATCTTGCAAGACAGGTATGAGAAACCTTCTGCCAACACCAAAATACTTCTCTGGAC
GAAGTATTTCAATATCCACATGGAGGAGGAAACCTCGGTGAATTCCCACATTAATGGGCTCACTGACATGTTGAACAAACTGGAAGGGATGAGTATCAAGATCGATGAGG
AAGTGATGGCTATGAGGCTGCTGACATCTTTGCCTGACATTTGGGAGACGATGAAAACCGCGGTGTCGAATTCGTTGAGGGATAGTAATGATTATGGCTTAGTAAGAATG
TGGAATGTGAGTGTCTCCAAGGTGAAAGGAATCGAAGATGTTTGTTTGAAGACAATTGAAGGGGCCGAGTTGGTGTTGCGAGATGTCAGGTATATTCCTAGATTCATATT
GAATTTATTATTTGCAAGGAAGCTAGACGATGATCGCTACAACAGTGAGTTTGTTGAGGGTTGCTGGAAGCTCAAGAGGGAATCCAAGATAGTGGCGACAGACCACAAGA
GATCTTCTATTTATGTGTCAGAGTTTGGGGTTGCCAAGGGTTCACTAAGACAAAGAATGCACAAAATAGCTACAGATGGCGAGTTGGTGAAGTTGCATAAGCGAATTAGT
GCATTAAAGGGTAAGAGCTCTATTTCTAGTGTGGCGACATACTTGGGTGGGAGTACCCAGCCTAATGTAGAGAGGAAATCTTCTTTCAGAGTCTCGGAGCAGAATCAGAT
AGAGCTTGGGGAGGCCGAGCAGTGTAGATCATTTCAGGTCTCAGGGTTAACTGTGCCGATTTTATGCCGTTTTGAGAGCAGTTTTGCACTAGTTTTGAGGCAGAGGCTCA
GGGTATTGGCAGTGGCATTAGGACAGATCAAGATCGACATATTCGGGGATATGCACAACAATGTGTTCCCGATTGTAGCTCGAACTCGGCCTCCGGACCGACCTGAACAC
TTGGGCGGACCTGCACAAAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGTCTCGGGACCGATGGTTACGCCCGATAATCACGTCGCCGACAATTGCTCACATCAGCC
CCTACCGAGCTCCCCGGTAGGGTATTCCCTTCCCCAAACATTGGCCCCCTCTCTGTCTGGTCCGATCTCGACCTGGCAGAGAAGTGCATTCGACTTGCTTTGGACACGTG
GCGACTTCCTATTCGTGGGAAAATACAACCGTCGCGGAAGATTTATCGTCGGAATATTCAAATATTCCGACGCTTCGGATCTCAGGGAGGATCCTAGCCTCTCGTTGATT
ACACGTGCAGCTCGAACCCTTGGTAGGTCGGTCTCTTCCCTCTCTCTTTCGAACGTAGTTGCCATGTCGTTCTCTTTTAGCAGCAACTTAGGGTCCGATTTAGCTCGTAG
GTTAGAGTTCGAGCTCGAGGAGGTAGAAAACTTTAGATTCTCCGATGACGGGGAGGATAGTGACGCCTCCACTTCAGGTCAAGGTTTGGAATACCCTTCTAGGATACCTG
AGCACTACCTCGGATCCCTTCGTAGGGGGTTCACTATCCCTGAGAACATCCTCCTCAGGCTTCCGGAGGAGGGGGAGAGAGCTGACAATCCTCCAGAGGGATGGGTCACT
CTATACTTCAAAATGTTTGAGTACGGCCTCAGACTTCCCCTTCACCCTTTTGTCCAAGAATTTCTCTTCCGGACTGGATTGGCTCCGGCTCAAGTGGCCCCCAATGGGTG
GGGTGTCATTTTCGCTTTGGCCATTCTTTTTTGGCTACGAGCTCGGGATAGTGAGGACGCCGAGCTGTTGGGCGTAGACCAGCTCCTCGCGTGCTTCGAAGCGAAAAGGA
TAGCTAAGAAGCCCGGTCGGTTCTATATGTACGCAAGGAAAGGCGCAGGCGGTATAGTTAAGGGGCCGACCTCCATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCC
GGGGAATGGCTCGCAAAGGACGAGTCAGGTCGTTCCTTCTTTAACGTCCCCACTAGGTTTGGGAACCTAGTTTCAATCCGACCAGTCCCCGAGCTTACGCAAGCCTCCTT
CGACACGCTGAAATACTACAAGGAGCGCTTTCCGAGGGGTAGGAAGGTCGGAACCCTGGTGACCGACGAGCTGCTGCTTGAGTCCGGGCTGCTAGATTACAACCCTGCAG
TTCGTCTCATTGAATCCTCAAGGCCGAACTCTGAACTTGCCATGGTTTGCGGATTTGCAAGCAGCGTGAAGCGCAAGTCCAAGGGCCGAGCCCATGGACTTAAAGACCAT
TTGGGAGTGCAAATTAATCAAAAGAAGCAAAAAGACGGAAAAACCTTCATGGGAGGCGCCAGGCGCCTGGGAAGCCTGCAGAAAAACTGGTTTTCTTCCAACTTTGCCTT
TAATGAAACGGGTCTTCCAATGCGTTTTGGTGGTTCCAACCGATGCATACGTGTAGAAGAAGTGTTCCACTATCAGTTTGAGCACGATTTGAGAGTTGTGTTTTTGGAGG
AAAGAGCACGTGACACTCCCCCTTGCTTGAAAGAGAAGGGCGAAGCTCTCAATCCCGAGAAAAACTCTCCCAACAACCAGTCACTAGAGACGGCGAAGGAGCGCCTCAGC
AATGGAGTCCTATTGGAGGAATCGTTTAGGCAACATCCTGACTTCGATGGATTTGCCAAAGACTTTTCTGACGCGGGTTTCAAGTTCTTCATGAAGGGCATTGCTTCCGA
CATGCCCGACCTTCAGATCGATCTCAGCGGTCTGAAAAGGAGGTATGCCGAGAAGTGGGCGTCTGGGCCTGGCGGCACCCTTGGCCCCCAAGCGTTGGTGGATCAGTATG
TCAGAGATCTGGACTCTGACTACTCCGATCCCGAAGAGGACCAGGTCGGCTCCACTCAAGAGGGCGCTCCTCAAGCAGGCTCTTAG
mRNA sequence
ATGGGTGTATCTAGCATCGTGGCGAAAGAGACAACAACGAAAGAACTGATGAAGATCTTGCAAGACAGGTATGAGAAACCTTCTGCCAACACCAAAATACTTCTCTGGAC
GAAGTATTTCAATATCCACATGGAGGAGGAAACCTCGGTGAATTCCCACATTAATGGGCTCACTGACATGTTGAACAAACTGGAAGGGATGAGTATCAAGATCGATGAGG
AAGTGATGGCTATGAGGCTGCTGACATCTTTGCCTGACATTTGGGAGACGATGAAAACCGCGGTGTCGAATTCGTTGAGGGATAGTAATGATTATGGCTTAGTAAGAATG
TGGAATGTGAGTGTCTCCAAGGTGAAAGGAATCGAAGATGTTTGTTTGAAGACAATTGAAGGGGCCGAGTTGGTGTTGCGAGATGTCAGGTATATTCCTAGATTCATATT
GAATTTATTATTTGCAAGGAAGCTAGACGATGATCGCTACAACAGTGAGTTTGTTGAGGGTTGCTGGAAGCTCAAGAGGGAATCCAAGATAGTGGCGACAGACCACAAGA
GATCTTCTATTTATGTGTCAGAGTTTGGGGTTGCCAAGGGTTCACTAAGACAAAGAATGCACAAAATAGCTACAGATGGCGAGTTGGTGAAGTTGCATAAGCGAATTAGT
GCATTAAAGGGTAAGAGCTCTATTTCTAGTGTGGCGACATACTTGGGTGGGAGTACCCAGCCTAATGTAGAGAGGAAATCTTCTTTCAGAGTCTCGGAGCAGAATCAGAT
AGAGCTTGGGGAGGCCGAGCAGTGTAGATCATTTCAGGTCTCAGGGTTAACTGTGCCGATTTTATGCCGTTTTGAGAGCAGTTTTGCACTAGTTTTGAGGCAGAGGCTCA
GGGTATTGGCAGTGGCATTAGGACAGATCAAGATCGACATATTCGGGGATATGCACAACAATGTGTTCCCGATTGTAGCTCGAACTCGGCCTCCGGACCGACCTGAACAC
TTGGGCGGACCTGCACAAAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGTCTCGGGACCGATGGTTACGCCCGATAATCACGTCGCCGACAATTGCTCACATCAGCC
CCTACCGAGCTCCCCGGTAGGGTATTCCCTTCCCCAAACATTGGCCCCCTCTCTGTCTGGTCCGATCTCGACCTGGCAGAGAAGTGCATTCGACTTGCTTTGGACACGTG
GCGACTTCCTATTCGTGGGAAAATACAACCGTCGCGGAAGATTTATCGTCGGAATATTCAAATATTCCGACGCTTCGGATCTCAGGGAGGATCCTAGCCTCTCGTTGATT
ACACGTGCAGCTCGAACCCTTGGTAGGTCGGTCTCTTCCCTCTCTCTTTCGAACGTAGTTGCCATGTCGTTCTCTTTTAGCAGCAACTTAGGGTCCGATTTAGCTCGTAG
GTTAGAGTTCGAGCTCGAGGAGGTAGAAAACTTTAGATTCTCCGATGACGGGGAGGATAGTGACGCCTCCACTTCAGGTCAAGGTTTGGAATACCCTTCTAGGATACCTG
AGCACTACCTCGGATCCCTTCGTAGGGGGTTCACTATCCCTGAGAACATCCTCCTCAGGCTTCCGGAGGAGGGGGAGAGAGCTGACAATCCTCCAGAGGGATGGGTCACT
CTATACTTCAAAATGTTTGAGTACGGCCTCAGACTTCCCCTTCACCCTTTTGTCCAAGAATTTCTCTTCCGGACTGGATTGGCTCCGGCTCAAGTGGCCCCCAATGGGTG
GGGTGTCATTTTCGCTTTGGCCATTCTTTTTTGGCTACGAGCTCGGGATAGTGAGGACGCCGAGCTGTTGGGCGTAGACCAGCTCCTCGCGTGCTTCGAAGCGAAAAGGA
TAGCTAAGAAGCCCGGTCGGTTCTATATGTACGCAAGGAAAGGCGCAGGCGGTATAGTTAAGGGGCCGACCTCCATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCC
GGGGAATGGCTCGCAAAGGACGAGTCAGGTCGTTCCTTCTTTAACGTCCCCACTAGGTTTGGGAACCTAGTTTCAATCCGACCAGTCCCCGAGCTTACGCAAGCCTCCTT
CGACACGCTGAAATACTACAAGGAGCGCTTTCCGAGGGGTAGGAAGGTCGGAACCCTGGTGACCGACGAGCTGCTGCTTGAGTCCGGGCTGCTAGATTACAACCCTGCAG
TTCGTCTCATTGAATCCTCAAGGCCGAACTCTGAACTTGCCATGGTTTGCGGATTTGCAAGCAGCGTGAAGCGCAAGTCCAAGGGCCGAGCCCATGGACTTAAAGACCAT
TTGGGAGTGCAAATTAATCAAAAGAAGCAAAAAGACGGAAAAACCTTCATGGGAGGCGCCAGGCGCCTGGGAAGCCTGCAGAAAAACTGGTTTTCTTCCAACTTTGCCTT
TAATGAAACGGGTCTTCCAATGCGTTTTGGTGGTTCCAACCGATGCATACGTGTAGAAGAAGTGTTCCACTATCAGTTTGAGCACGATTTGAGAGTTGTGTTTTTGGAGG
AAAGAGCACGTGACACTCCCCCTTGCTTGAAAGAGAAGGGCGAAGCTCTCAATCCCGAGAAAAACTCTCCCAACAACCAGTCACTAGAGACGGCGAAGGAGCGCCTCAGC
AATGGAGTCCTATTGGAGGAATCGTTTAGGCAACATCCTGACTTCGATGGATTTGCCAAAGACTTTTCTGACGCGGGTTTCAAGTTCTTCATGAAGGGCATTGCTTCCGA
CATGCCCGACCTTCAGATCGATCTCAGCGGTCTGAAAAGGAGGTATGCCGAGAAGTGGGCGTCTGGGCCTGGCGGCACCCTTGGCCCCCAAGCGTTGGTGGATCAGTATG
TCAGAGATCTGGACTCTGACTACTCCGATCCCGAAGAGGACCAGGTCGGCTCCACTCAAGAGGGCGCTCCTCAAGCAGGCTCTTAG
Protein sequence
MGVSSIVAKETTTKELMKILQDRYEKPSANTKILLWTKYFNIHMEEETSVNSHINGLTDMLNKLEGMSIKIDEEVMAMRLLTSLPDIWETMKTAVSNSLRDSNDYGLVRM
WNVSVSKVKGIEDVCLKTIEGAELVLRDVRYIPRFILNLLFARKLDDDRYNSEFVEGCWKLKRESKIVATDHKRSSIYVSEFGVAKGSLRQRMHKIATDGELVKLHKRIS
ALKGKSSISSVATYLGGSTQPNVERKSSFRVSEQNQIELGEAEQCRSFQVSGLTVPILCRFESSFALVLRQRLRVLAVALGQIKIDIFGDMHNNVFPIVARTRPPDRPEH
LGGPAQKGEHSDDQVSIVSGPMVTPDNHVADNCSHQPLPSSPVGYSLPQTLAPSLSGPISTWQRSAFDLLWTRGDFLFVGKYNRRGRFIVGIFKYSDASDLREDPSLSLI
TRAARTLGRSVSSLSLSNVVAMSFSFSSNLGSDLARRLEFELEEVENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFTIPENILLRLPEEGERADNPPEGWVT
LYFKMFEYGLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEDAELLGVDQLLACFEAKRIAKKPGRFYMYARKGAGGIVKGPTSIKGWVRKWFYAS
GEWLAKDESGRSFFNVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDELLLESGLLDYNPAVRLIESSRPNSELAMVCGFASSVKRKSKGRAHGLKDH
LGVQINQKKQKDGKTFMGGARRLGSLQKNWFSSNFAFNETGLPMRFGGSNRCIRVEEVFHYQFEHDLRVVFLEERARDTPPCLKEKGEALNPEKNSPNNQSLETAKERLS
NGVLLEESFRQHPDFDGFAKDFSDAGFKFFMKGIASDMPDLQIDLSGLKRRYAEKWASGPGGTLGPQALVDQYVRDLDSDYSDPEEDQVGSTQEGAPQAGS