CuGenDBv2

Moc03g01880 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc03g01880
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Ulp1-like peptidase
Genome location: chr3:1409407..1412514
RNA-Seq Expression: Moc03g01880
Synteny: Moc03g01880
Gene Ontology terms:
    GO:0006508 - proteolysis (biological process)
    GO:0008234 - cysteine-type peptidase activity (molecular function)
InterPro domains:
    IPR003653 - Ulp1 protease family, C-terminal catalytic domain
    IPR015410 - Domain of unknown function DUF1985
    IPR038765 - Papain-like cysteine peptidase superfamily
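
The Genome location field above can be split into a chromosome name and two coordinates. The following is a minimal sketch: the function name `parse_location` is hypothetical, and treating the coordinates as 1-based and inclusive is an assumption that this record does not state.

```python
import re

def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

chrom, start, end = parse_location("chr3:1409407..1412514")
# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the gene spans 3108 bp on chr3.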


Homology
GenBank top hits (e value, % identity, alignment)
XP_022146372.1 uncharacterized protein LOC111015600 [Momordica charantia] (e value: 3.8e-99; % identity: 73.96)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        M VVFNGPLIHHLLL EVEEPRQDVISFDLF KRVSFG+REFDLITGLSH+M +V+NHIPGRRLRARYFKDSVRV C ELEKIFLED+F DD DVVKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  IDT  +GVVD WEAFCN DWSSM+ DRTI SLK+ LKDKL AYQQKA ADPTH+ETYSLYGFPY              MR +  
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRFLQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPDRAAVSNPP
        L+ +          SKVKEHL+ATDA EQHMV VILPPE RVIPDPPAVPDRA VPDRA V +PP
Subjt:  LSDDAIPRLLRFLQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPDRAAVSNPP

XP_022153201.1 uncharacterized protein LOC111020757 [Momordica charantia] (e value: 1.6e-161; % identity: 72.5)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        +DVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFG+REFDLITGLSHRM +VDNHIPGRRLRARYFKD VRV C ELEKIFLEDVF DD DVVKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  IDTALLGVVD WE FCNYDWSSM+ DRTI SLK+ALKDKL  YQQKA ADP+H+ETYSLYGFPYAFQVWAYETI T        
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRF------------------LQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPD------RAAVSNPPTDVERGPLEDP
        LSDDAIPRLLR+                   +SKVKEHL+ATDA EQHMV VILPPE RVIPDPPAVPDRA VPD      RAAV +PP DVE GPLEDP
Subjt:  LSDDAIPRLLRF------------------LQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPD------RAAVSNPPTDVERGPLEDP

Query:  VVEAHALDEVGPSANDGEALEKRSKRNKFKNRISRRLKRLDDRVGAIEDTLGDFGVALKGIQRYLKKLTK----------------------DQRSDESL
        VV+AHA+DE  PSANDGE LEKR K+NKFK RISRRLKRLD+ VGAIED LGDFGVALKGIQ YLKKL K                      DQR DES 
Subjt:  VVEAHALDEVGPSANDGEALEKRSKRNKFKNRISRRLKRLDDRVGAIEDTLGDFGVALKGIQRYLKKLTK----------------------DQRSDESL

Query:  KPDGGRKSMDEDVRPDEDPETDE------ESTSGHGPNNM
        KPDGGRKSMDED R DED  TDE      E TSGHG + +
Subjt:  KPDGGRKSMDEDVRPDEDPETDE------ESTSGHGPNNM

XP_022155158.1 uncharacterized protein LOC111022300 [Momordica charantia] (e value: 3.9e-67; % identity: 80)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        MDVVFNGPLIHHLLLREVEEPRQD+ISFDLFGKRVSFG+REFDLITGLS+RM +VDN IPGRRLRARYFKDSVRV C ELEKIF+E VF DD D VKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQ
        VYF+ELAMM KERK  ID  LLGVVD WE FCN+DWSS++ +RT+ SLK+A+ DKLPAYQ
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQ

XP_022155476.1 uncharacterized protein LOC111022607 [Momordica charantia] (e value: 1.2e-103; % identity: 68.92)
Query:  SGHGPNNMDEDPKKKDDDPMITEEDDGTITDEDEDPNQ------------------------MTMDL---RQEPDVQPDTQPTRRRVRCPYKDWAPDAIV
        SGHGPN++DEDPK++D+DPMI EEDDG ITD DEDPNQ                        +  DL   RQEPD QPDTQPTRRRVR PYKDWAPDAIV
Subjt:  SGHGPNNMDEDPKKKDDDPMITEEDDGTITDEDEDPNQ------------------------MTMDL---RQEPDVQPDTQPTRRRVRCPYKDWAPDAIV

Query:  K-------GQSDLQHAPTGRGLRKRHYSWKLKDIYTPTGQRRITVDAYDPACLIPPQLDGQFQTWMDDPNIDGRTRFTVAGLQGKEWYPDPLNPTVQLKD
        K        ++DLQHAPTGRGLRK HYSWKLK IYTPTG+RRITVDAYDPAC IPPQLDGQFQTWMDD +IDGRTR T AGLQGKEWY D L+PTVQLKD
Subjt:  K-------GQSDLQHAPTGRGLRKRHYSWKLKDIYTPTGQRRITVDAYDPACLIPPQLDGQFQTWMDDPNIDGRTRFTVAGLQGKEWYPDPLNPTVQLKD

Query:  EVVDALVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKLG------------------------SNQNVPWSDADIVYTPINLGGNH
        EVVDALVLFTAKKLEKC++LCRKKFAIGDVLLSTLLNRTDGPYAAMK G                        S+QNV W+DADIVYTPIN+GGNH
Subjt:  EVVDALVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKLG------------------------SNQNVPWSDADIVYTPINLGGNH

XP_022157020.1 uncharacterized protein LOC111023847 [Momordica charantia] (e value: 6.2e-81; % identity: 70.28)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        M+VVFNGPL+HHLLLREVEEP+ D+ISF+LFG RVSFG+REFDLITGL H M +VD  +  RRLR  YF+D   V C ELEKIFLE  F++D D VK++I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  +DT+LLG+VD WE FCNYDWSSM+ +RT+ SLK+ALKDK+  Y+QK   D +H+ETYSLY FPYAFQVWAYETI TLS RVA R
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRF
        L+DDAIPRLLR+
Subjt:  LSDDAIPRLLRF
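
In the alignment blocks above, the unlabeled middle line marks identical residues with their letter, conservative substitutions with `+`, and mismatches or gaps with a space. As a rough sketch, the percent identity of a single block can be estimated from that line alone. The function name `midline_identity` is hypothetical, and the per-hit % identity reported on this page is presumably computed over the whole alignment, so a single block need not reproduce it.

```python
def midline_identity(query: str, midline: str) -> float:
    """Percent of alignment columns whose midline character is a residue letter."""
    columns = len(query)  # alignment columns in this block, gaps included
    if columns == 0:
        raise ValueError("empty alignment block")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / columns

# Last block of the XP_022157020.1 alignment shown above.
block_identity = midline_identity("LSDDAIPRLLRF", "L+DDAIPRLLR+")
print(round(block_identity, 2))
```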

TrEMBL top hits (e value, % identity, alignment)
A0A6J1CZE8 uncharacterized protein LOC111015600 (e value: 1.9e-99; % identity: 73.96)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        M VVFNGPLIHHLLL EVEEPRQDVISFDLF KRVSFG+REFDLITGLSH+M +V+NHIPGRRLRARYFKDSVRV C ELEKIFLED+F DD DVVKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  IDT  +GVVD WEAFCN DWSSM+ DRTI SLK+ LKDKL AYQQKA ADPTH+ETYSLYGFPY              MR +  
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRFLQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPDRAAVSNPP
        L+ +          SKVKEHL+ATDA EQHMV VILPPE RVIPDPPAVPDRA VPDRA V +PP
Subjt:  LSDDAIPRLLRFLQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPDRAAVSNPP

A0A6J1DJX9 uncharacterized protein LOC111020757 (e value: 7.7e-162; % identity: 72.5)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        +DVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFG+REFDLITGLSHRM +VDNHIPGRRLRARYFKD VRV C ELEKIFLEDVF DD DVVKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  IDTALLGVVD WE FCNYDWSSM+ DRTI SLK+ALKDKL  YQQKA ADP+H+ETYSLYGFPYAFQVWAYETI T        
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRF------------------LQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPD------RAAVSNPPTDVERGPLEDP
        LSDDAIPRLLR+                   +SKVKEHL+ATDA EQHMV VILPPE RVIPDPPAVPDRA VPD      RAAV +PP DVE GPLEDP
Subjt:  LSDDAIPRLLRF------------------LQSKVKEHLVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPD------RAAVSNPPTDVERGPLEDP

Query:  VVEAHALDEVGPSANDGEALEKRSKRNKFKNRISRRLKRLDDRVGAIEDTLGDFGVALKGIQRYLKKLTK----------------------DQRSDESL
        VV+AHA+DE  PSANDGE LEKR K+NKFK RISRRLKRLD+ VGAIED LGDFGVALKGIQ YLKKL K                      DQR DES 
Subjt:  VVEAHALDEVGPSANDGEALEKRSKRNKFKNRISRRLKRLDDRVGAIEDTLGDFGVALKGIQRYLKKLTK----------------------DQRSDESL

Query:  KPDGGRKSMDEDVRPDEDPETDE------ESTSGHGPNNM
        KPDGGRKSMDED R DED  TDE      E TSGHG + +
Subjt:  KPDGGRKSMDEDVRPDEDPETDE------ESTSGHGPNNM

A0A6J1DM82 uncharacterized protein LOC111022300 (e value: 1.9e-67; % identity: 80)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        MDVVFNGPLIHHLLLREVEEPRQD+ISFDLFGKRVSFG+REFDLITGLS+RM +VDN IPGRRLRARYFKDSVRV C ELEKIF+E VF DD D VKV I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQ
        VYF+ELAMM KERK  ID  LLGVVD WE FCN+DWSS++ +RT+ SLK+A+ DKLPAYQ
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQ

A0A6J1DRS0 uncharacterized protein LOC111022607 (e value: 5.6e-104; % identity: 68.92)
Query:  SGHGPNNMDEDPKKKDDDPMITEEDDGTITDEDEDPNQ------------------------MTMDL---RQEPDVQPDTQPTRRRVRCPYKDWAPDAIV
        SGHGPN++DEDPK++D+DPMI EEDDG ITD DEDPNQ                        +  DL   RQEPD QPDTQPTRRRVR PYKDWAPDAIV
Subjt:  SGHGPNNMDEDPKKKDDDPMITEEDDGTITDEDEDPNQ------------------------MTMDL---RQEPDVQPDTQPTRRRVRCPYKDWAPDAIV

Query:  K-------GQSDLQHAPTGRGLRKRHYSWKLKDIYTPTGQRRITVDAYDPACLIPPQLDGQFQTWMDDPNIDGRTRFTVAGLQGKEWYPDPLNPTVQLKD
        K        ++DLQHAPTGRGLRK HYSWKLK IYTPTG+RRITVDAYDPAC IPPQLDGQFQTWMDD +IDGRTR T AGLQGKEWY D L+PTVQLKD
Subjt:  K-------GQSDLQHAPTGRGLRKRHYSWKLKDIYTPTGQRRITVDAYDPACLIPPQLDGQFQTWMDDPNIDGRTRFTVAGLQGKEWYPDPLNPTVQLKD

Query:  EVVDALVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKLG------------------------SNQNVPWSDADIVYTPINLGGNH
        EVVDALVLFTAKKLEKC++LCRKKFAIGDVLLSTLLNRTDGPYAAMK G                        S+QNV W+DADIVYTPIN+GGNH
Subjt:  EVVDALVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKLG------------------------SNQNVPWSDADIVYTPINLGGNH

A0A6J1DRZ7 uncharacterized protein LOC111023847 (e value: 3.0e-81; % identity: 70.28)
Query:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI
        M+VVFNGPL+HHLLLREVEEP+ D+ISF+LFG RVSFG+REFDLITGL H M +VD  +  RRLR  YF+D   V C ELEKIFLE  F++D D VK++I
Subjt:  MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSI

Query:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR
        VYFIELAMM KERK  +DT+LLG+VD WE FCNYDWSSM+ +RT+ SLK+ALKDK+  Y+QK   D +H+ETYSLY FPYAFQVWAYETI TLS RVA R
Subjt:  VYFIELAMMRKERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATR

Query:  LSDDAIPRLLRF
        L+DDAIPRLLR+
Subjt:  LSDDAIPRLLRF

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT4G08430.1 Ulp1 protease family protein (e value: 7.7e-05; % identity: 25.9)
Query:  QNVPW-SDADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVLHHGGIFVV----RLDLPVVPWRVHRVRTPQQGSAT
        +N  W  D D +Y  + + GNHWV L IDL +  + V DS+ S T    +  +   + T++PA+L     F+     R     + W+    + P+   A 
Subjt:  QNVPW-SDADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVLHHGGIFVV----RLDLPVVPWRVHRVRTPQQGSAT

Query:  DCRIFVYAFSSTMLSGQN-STLTQDNIVFFRHQYVVQMW
        DC I+   +   +  G++   L  +N+     +  V+M+
Subjt:  DCRIFVYAFSSTMLSGQN-STLTQDNIVFFRHQYVVQMW

AT5G28235.1 Ulp1 protease family protein (e value: 6.5e-04; % identity: 36.21)
Query:  DADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVL
        D D +Y  + + GNHWV L IDL +  + V DS+ S T    +  +   + T++PA+L
Subjt:  DADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVL

AT5G45570.1 Ulp1 protease family protein (e value: 5.9e-05; % identity: 25.76)
Query:  DADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVLHHGGIFVV----RLDLPVVPWRVHRVRTPQQGSATDCRIFVY
        D D +Y  + + GNHWV L IDL    + V DS+ S T    +  +   + T++PA+L     F+     R     + W+    + P+     DC I+  
Subjt:  DADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAVLHHGGIFVV----RLDLPVVPWRVHRVRTPQQGSATDCRIFVY

Query:  AFSSTMLSGQN-STLTQDNIVFFRHQYVVQMW
         +   +  G++   L  +N+   R +  V+M+
Subjt:  AFSSTMLSGQN-STLTQDNIVFFRHQYVVQMW


Sequences
CDS sequence
ATGGACGTTGTTTTCAACGGTCCATTGATCCATCACCTATTGTTGAGAGAGGTTGAAGAGCCTAGGCAGGACGTCATTAGCTTTGACCTGTTTGGGAAGAGGGTATCTTT
TGGTAGGCGAGAGTTCGACCTAATCACCGGGCTCAGTCATAGGATGAAGAAGGTAGATAATCATATTCCCGGACGAAGACTTAGAGCACGTTACTTTAAAGACAGTGTTA
GGGTTAATTGTAGGGAGCTGGAAAAGATTTTTTTGGAGGACGTTTTTGACGACGACGGGGATGTTGTGAAGGTTAGCATAGTTTACTTCATAGAACTTGCCATGATGAGG
AAGGAGAGGAAGCATCTCATTGATACAGCCCTGTTAGGTGTTGTGGATTGGTGGGAGGCGTTTTGCAACTATGACTGGAGTTCGATGGTTTTGGATAGGACGATTTTGAG
TCTCAAGGATGCTCTGAAGGATAAACTACCGGCCTATCAACAGAAGGCAAGAGCCGACCCCACACACATTGAGACTTATAGTCTGTACGGGTTTCCGTATGCATTTCAGG
TATGGGCGTATGAGACGATCTTGACGTTGAGTATGCGAGTAGCAACGAGGTTGAGCGATGACGCCATTCCTCGACTTCTCAGGTTTTTGCAGTCCAAGGTTAAGGAACAC
TTAGTGGCGACGGATGCTGCGGAACAACACATGGTCTGTGTCATTCTTCCGCCAGAAGCCCGTGTTATACCTGATCCGCCTGCTGTACCTGATCGGGCTGCTGTACCTGA
TCGGGCTGCTGTATCTAATCCGCCTACTGATGTGGAAAGGGGTCCTCTAGAGGATCCGGTAGTAGAGGCCCATGCGTTAGACGAGGTTGGACCCAGTGCAAATGATGGTG
AAGCGCTAGAGAAGAGGTCGAAGAGGAATAAATTCAAGAATAGGATCAGCAGACGGTTGAAGAGGCTCGATGACCGTGTTGGTGCTATCGAGGACACACTAGGTGACTTT
GGAGTTGCCCTGAAAGGTATTCAGAGGTACCTGAAGAAACTGACGAAGGATCAAAGGTCTGATGAGTCCTTGAAGCCAGATGGAGGTCGGAAGAGTATGGACGAGGACGT
GAGGCCTGATGAGGACCCGGAGACTGACGAGGAATCGACATCGGGGCATGGTCCGAATAATATGGACGAGGATCCGAAGAAAAAGGACGATGATCCAATGATTACGGAGG
AGGACGATGGTACGATAACAGATGAGGACGAGGATCCAAATCAGATGACCATGGACCTCAGACAAGAGCCGGATGTTCAACCAGATACGCAACCCACGAGGAGACGAGTT
AGGTGTCCCTATAAGGACTGGGCACCAGACGCTATCGTTAAGGGCCAATCTGACCTTCAGCATGCCCCAACTGGTAGGGGGCTACGCAAGCGCCATTATTCGTGGAAACT
GAAGGATATATACACACCAACCGGCCAACGTAGGATCACCGTGGATGCATACGATCCAGCATGTCTCATTCCTCCACAGCTAGATGGTCAGTTCCAGACATGGATGGATG
ACCCGAACATCGATGGGCGAACTCGGTTTACTGTAGCTGGCTTACAAGGGAAGGAATGGTATCCCGATCCACTGAACCCTACTGTCCAATTGAAGGACGAAGTAGTTGAT
GCTCTCGTCCTGTTTACGGCAAAAAAGTTGGAGAAGTGTCTCCATCTGTGTCGCAAAAAGTTTGCGATAGGCGACGTGCTGCTTTCGACTCTGCTGAACCGAACAGACGG
TCCGTATGCGGCCATGAAACTAGGTTCGAACCAGAACGTGCCTTGGAGTGATGCAGACATTGTGTACACCCCGATCAACTTAGGTGGGAACCACTGGGTGATGCTCGGAA
TTGATCTTGTGGAAGGCGACTTAACCGTATGTGATTCACTCGAATCGGCCACTCCACTGGATTCACTCGAGAAGGAGCTGAAGCTCATTTGTACGATCCTACCTGCAGTG
CTGCATCATGGCGGGATATTTGTAGTACGACTGGACCTGCCAGTGGTACCATGGAGGGTGCATCGGGTTCGTACACCTCAACAAGGTAGCGCGACTGATTGTAGGATTTT
TGTATACGCTTTTTCGAGTACGATGTTATCGGGTCAAAACTCCACTTTGACCCAAGATAATATTGTATTTTTTAGGCACCAGTACGTTGTACAGATGTGGGCGCGCCGTC
CCATTTTTTGA
mRNA sequence
ATGGACGTTGTTTTCAACGGTCCATTGATCCATCACCTATTGTTGAGAGAGGTTGAAGAGCCTAGGCAGGACGTCATTAGCTTTGACCTGTTTGGGAAGAGGGTATCTTT
TGGTAGGCGAGAGTTCGACCTAATCACCGGGCTCAGTCATAGGATGAAGAAGGTAGATAATCATATTCCCGGACGAAGACTTAGAGCACGTTACTTTAAAGACAGTGTTA
GGGTTAATTGTAGGGAGCTGGAAAAGATTTTTTTGGAGGACGTTTTTGACGACGACGGGGATGTTGTGAAGGTTAGCATAGTTTACTTCATAGAACTTGCCATGATGAGG
AAGGAGAGGAAGCATCTCATTGATACAGCCCTGTTAGGTGTTGTGGATTGGTGGGAGGCGTTTTGCAACTATGACTGGAGTTCGATGGTTTTGGATAGGACGATTTTGAG
TCTCAAGGATGCTCTGAAGGATAAACTACCGGCCTATCAACAGAAGGCAAGAGCCGACCCCACACACATTGAGACTTATAGTCTGTACGGGTTTCCGTATGCATTTCAGG
TATGGGCGTATGAGACGATCTTGACGTTGAGTATGCGAGTAGCAACGAGGTTGAGCGATGACGCCATTCCTCGACTTCTCAGGTTTTTGCAGTCCAAGGTTAAGGAACAC
TTAGTGGCGACGGATGCTGCGGAACAACACATGGTCTGTGTCATTCTTCCGCCAGAAGCCCGTGTTATACCTGATCCGCCTGCTGTACCTGATCGGGCTGCTGTACCTGA
TCGGGCTGCTGTATCTAATCCGCCTACTGATGTGGAAAGGGGTCCTCTAGAGGATCCGGTAGTAGAGGCCCATGCGTTAGACGAGGTTGGACCCAGTGCAAATGATGGTG
AAGCGCTAGAGAAGAGGTCGAAGAGGAATAAATTCAAGAATAGGATCAGCAGACGGTTGAAGAGGCTCGATGACCGTGTTGGTGCTATCGAGGACACACTAGGTGACTTT
GGAGTTGCCCTGAAAGGTATTCAGAGGTACCTGAAGAAACTGACGAAGGATCAAAGGTCTGATGAGTCCTTGAAGCCAGATGGAGGTCGGAAGAGTATGGACGAGGACGT
GAGGCCTGATGAGGACCCGGAGACTGACGAGGAATCGACATCGGGGCATGGTCCGAATAATATGGACGAGGATCCGAAGAAAAAGGACGATGATCCAATGATTACGGAGG
AGGACGATGGTACGATAACAGATGAGGACGAGGATCCAAATCAGATGACCATGGACCTCAGACAAGAGCCGGATGTTCAACCAGATACGCAACCCACGAGGAGACGAGTT
AGGTGTCCCTATAAGGACTGGGCACCAGACGCTATCGTTAAGGGCCAATCTGACCTTCAGCATGCCCCAACTGGTAGGGGGCTACGCAAGCGCCATTATTCGTGGAAACT
GAAGGATATATACACACCAACCGGCCAACGTAGGATCACCGTGGATGCATACGATCCAGCATGTCTCATTCCTCCACAGCTAGATGGTCAGTTCCAGACATGGATGGATG
ACCCGAACATCGATGGGCGAACTCGGTTTACTGTAGCTGGCTTACAAGGGAAGGAATGGTATCCCGATCCACTGAACCCTACTGTCCAATTGAAGGACGAAGTAGTTGAT
GCTCTCGTCCTGTTTACGGCAAAAAAGTTGGAGAAGTGTCTCCATCTGTGTCGCAAAAAGTTTGCGATAGGCGACGTGCTGCTTTCGACTCTGCTGAACCGAACAGACGG
TCCGTATGCGGCCATGAAACTAGGTTCGAACCAGAACGTGCCTTGGAGTGATGCAGACATTGTGTACACCCCGATCAACTTAGGTGGGAACCACTGGGTGATGCTCGGAA
TTGATCTTGTGGAAGGCGACTTAACCGTATGTGATTCACTCGAATCGGCCACTCCACTGGATTCACTCGAGAAGGAGCTGAAGCTCATTTGTACGATCCTACCTGCAGTG
CTGCATCATGGCGGGATATTTGTAGTACGACTGGACCTGCCAGTGGTACCATGGAGGGTGCATCGGGTTCGTACACCTCAACAAGGTAGCGCGACTGATTGTAGGATTTT
TGTATACGCTTTTTCGAGTACGATGTTATCGGGTCAAAACTCCACTTTGACCCAAGATAATATTGTATTTTTTAGGCACCAGTACGTTGTACAGATGTGGGCGCGCCGTC
CCATTTTTTGA
Protein sequence
MDVVFNGPLIHHLLLREVEEPRQDVISFDLFGKRVSFGRREFDLITGLSHRMKKVDNHIPGRRLRARYFKDSVRVNCRELEKIFLEDVFDDDGDVVKVSIVYFIELAMMR
KERKHLIDTALLGVVDWWEAFCNYDWSSMVLDRTILSLKDALKDKLPAYQQKARADPTHIETYSLYGFPYAFQVWAYETILTLSMRVATRLSDDAIPRLLRFLQSKVKEH
LVATDAAEQHMVCVILPPEARVIPDPPAVPDRAAVPDRAAVSNPPTDVERGPLEDPVVEAHALDEVGPSANDGEALEKRSKRNKFKNRISRRLKRLDDRVGAIEDTLGDF
GVALKGIQRYLKKLTKDQRSDESLKPDGGRKSMDEDVRPDEDPETDEESTSGHGPNNMDEDPKKKDDDPMITEEDDGTITDEDEDPNQMTMDLRQEPDVQPDTQPTRRRV
RCPYKDWAPDAIVKGQSDLQHAPTGRGLRKRHYSWKLKDIYTPTGQRRITVDAYDPACLIPPQLDGQFQTWMDDPNIDGRTRFTVAGLQGKEWYPDPLNPTVQLKDEVVD
ALVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKLGSNQNVPWSDADIVYTPINLGGNHWVMLGIDLVEGDLTVCDSLESATPLDSLEKELKLICTILPAV
LHHGGIFVVRLDLPVVPWRVHRVRTPQQGSATDCRIFVYAFSSTMLSGQNSTLTQDNIVFFRHQYVVQMWARRPIF
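
The protein sequence above can be checked against the CDS sequence by translating it codon by codon. The sketch below assumes the standard nuclear genetic code, which this record does not state explicitly. The names `CODON_TABLE` and `translate` are illustrative and not part of any database tooling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and other whitespace
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First and last 27 nucleotides of the CDS sequence listed above.
print(translate("ATGGACGTTGTTTTCAACGGTCCATTG"))
print(translate("ATGTGGGCGCGCCGTCCCATTTTTTGA"))
```

The two fragments translate to MDVVFNGPL and MWARRPIF, which match the start and end of the protein sequence above. Passing the full CDS, with its line breaks left in, to `translate` allows the same comparison for the whole protein.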