CuGenDBv2

Moc02g01010 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc02g01010
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Ulp1-like peptidase
Genome location: chr2:814451..818189
RNA-Seq Expression: Moc02g01010
Synteny: Moc02g01010
Gene Ontology terms:
GO:0006508 - proteolysis (biological process)
GO:0008234 - cysteine-type peptidase activity (molecular function)
InterPro domains:
IPR003653 - Ulp1 protease family, C-terminal catalytic domain
IPR038765 - Papain-like cysteine peptidase superfamily


Homology
GenBank top hits
XP_022148137.1 uncharacterized protein LOC111016890 [Momordica charantia] | e value: 8.2e-85 | % identity: 94.97
Query:  MKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIF
        MKP VLSTRINYPW EENTIWRYVHGRQS+HNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQT TPLDELEKELKPMCTILPTLLHHGGIF
Subjt:  MKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIF

Query:  SVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQY
        SVRPDLPVVPWRVRRVRVPQQSS TDCGIF VRYFEYDATGSNM+TLTQ NIVYFRRQY
Subjt:  SVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQY

XP_022153201.1 uncharacterized protein LOC111020757 [Momordica charantia] | e value: 2.5e-110 | % identity: 56.33
Query:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK
        +REVEEPRQD+ISFDLFGKRVSFGKREFDLITGLSHRM R                                 VF DDED VKV IVYF+ELAMMGKERK
Subjt:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK

Query:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC
        QFIDT LLGVVDR E+FCN+DWSS+IF+RT+WSLKNALKDKL  YQQKA  DP+H ETYSLYGFPYA+QVWAYETISTLS        DDAIPRLLRWSC
Subjt:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC

Query:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAEAEHMVRIMRPSEARAIPAPLAVPDQPAVPDQPAVPDPAVVPAPVAVRNPSADLERGAEERM---GPALD
         YS GF  L  +VFDNT SKVKE+L++T+A+ +HMVR++ P E R IP P AVPD+  VPD PA P+ A VP      +P AD+E G  E       A+D
Subjt:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAEAEHMVRIMRPSEARAIPAPLAVPDQPAVPDQPAVPDPAVVPAPVAVRNPSADLERGAEERM---GPALD

Query:  DAGPSGNDSEALQKRLKRKKFKNKISRRLKRLDDRVGAIEATLTGFEATLTGFGVALKGIQRYLKKMSKSM------------------DEDRRPEAVPK
        +A PS ND E L+KRLK+ KFK +ISRRLKRLD+ VGAI       E  L  FGVALKGIQ YLKK++K                      D+RP+  PK
Subjt:  DAGPSGNDSEALQKRLKRKKFKNKISRRLKRLDDRVGAIEATLTGFEATLTGFGVALKGIQRYLKKMSKSM------------------DEDRRPEAVPK

Query:  TDEYQTMDDNPKSMDEDPKNMDEDPMFMVEDQGTITERDNAS
               D   KSMDED ++ DED      D+   TE++  S
Subjt:  TDEYQTMDDNPKSMDEDPKNMDEDPMFMVEDQGTITERDNAS

XP_022155476.1 uncharacterized protein LOC111022607 [Momordica charantia] | e value: 2.4e-108 | % identity: 68.03
Query:  PKSMDEDPKNMDEDPMFMVEDQGTIT-------------------ERDNASDAYPDRPVGLFQDAIVGMQEPDVASDTQPVSRRVRRPYKDWAPDAVIK-
        P S+DEDPK  D DPM M ED G IT                   E D+  D  P   V + QD  VG QEPD   DTQP  RRVRRPYKDWAPDA++K 
Subjt:  PKSMDEDPKNMDEDPMFMVEDQGTIT-------------------ERDNASDAYPDRPVGLFQDAIVGMQEPDVASDTQPVSRRVRRPYKDWAPDAVIK-

Query:  ------DEYDLQQAPTGRGLRKRHYSWKLKDIYTPTGQRGITVDRYDPVCPIPPQLDDKFQRWMDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEV
              DE DLQ APTGRGLRK HYSWKLK IYTPTG+R ITVD YDP CPIPPQLD +FQ WMDD   DGR RSTA G Q KEWYRDLLDP+V+LKDEV
Subjt:  ------DEYDLQQAPTGRGLRKRHYSWKLKDIYTPTGQRGITVDRYDPVCPIPPQLDDKFQRWMDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEV

Query:  LDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNH
        +D LVLFTAKKLEKC++LCRKKFAIGDVLLSTLLNRTDGPYAAMKP VLSTRI YP  +ENTI+RYV GRQS+ NV W+DADIVYTP+N+GGNH
Subjt:  LDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNH

XP_022157020.1 uncharacterized protein LOC111023847 [Momordica charantia] | e value: 4.7e-72 | % identity: 62.34
Query:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK
        +REVEEP+ D+ISF+LFG RVSFGKREFDLITGL H M R                                  F++DEDAVK+ IVYF+ELAMMGKERK
Subjt:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK

Query:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC
          +DT+LLG+VDR E+FCN+DWSS+IFERTLWSLKNALKDK+  Y+QK   D +H ETYSLY FPYA+QVWAYETISTLS RVA RL DDAIPRLLRWSC
Subjt:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC

Query:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAE
        TYSR F  L+R+VF+N  SKV   L +T+ E
Subjt:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAE

XP_022158807.1 uncharacterized protein LOC111025273 [Momordica charantia] | e value: 1.5e-94 | % identity: 65.2
Query:  MDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEVLDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTI
        MDDP TD   RST+ G + K W+  LLDP  +L DE +D L++ TA+K+EKC HL R +FAIGDVLLS LL RTDGPYAAMKP VL ++  Y W +E TI
Subjt:  MDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEVLDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTI

Query:  WRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIFSVRPDLPVVPWRVRRVRVPQ
        +RYV GRQS+++  WS+ADIVYT MN+GGNHWVM+GIDLV+GD+TVWDSLQ  TPL++LEK LKPMCTI+P +LH  GI ++RP+LP+VPWRVRR  VPQ
Subjt:  WRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIFSVRPDLPVVPWRVRRVRVPQ

Query:  QSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQYTVQMWARRPIF
        Q+  TDC IFCVR+FEYD  GS ++TL Q NI  FRRQY VQMWARRP F
Subjt:  QSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQYTVQMWARRPIF

TrEMBL top hits
A0A6J1D492 uncharacterized protein LOC111016890 | e value: 4.0e-85 | % identity: 94.97
Query:  MKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIF
        MKP VLSTRINYPW EENTIWRYVHGRQS+HNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQT TPLDELEKELKPMCTILPTLLHHGGIF
Subjt:  MKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIF

Query:  SVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQY
        SVRPDLPVVPWRVRRVRVPQQSS TDCGIF VRYFEYDATGSNM+TLTQ NIVYFRRQY
Subjt:  SVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQY

A0A6J1DJX9 uncharacterized protein LOC111020757 | e value: 1.2e-110 | % identity: 56.33
Query:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK
        +REVEEPRQD+ISFDLFGKRVSFGKREFDLITGLSHRM R                                 VF DDED VKV IVYF+ELAMMGKERK
Subjt:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK

Query:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC
        QFIDT LLGVVDR E+FCN+DWSS+IF+RT+WSLKNALKDKL  YQQKA  DP+H ETYSLYGFPYA+QVWAYETISTLS        DDAIPRLLRWSC
Subjt:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC

Query:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAEAEHMVRIMRPSEARAIPAPLAVPDQPAVPDQPAVPDPAVVPAPVAVRNPSADLERGAEERM---GPALD
         YS GF  L  +VFDNT SKVKE+L++T+A+ +HMVR++ P E R IP P AVPD+  VPD PA P+ A VP      +P AD+E G  E       A+D
Subjt:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAEAEHMVRIMRPSEARAIPAPLAVPDQPAVPDQPAVPDPAVVPAPVAVRNPSADLERGAEERM---GPALD

Query:  DAGPSGNDSEALQKRLKRKKFKNKISRRLKRLDDRVGAIEATLTGFEATLTGFGVALKGIQRYLKKMSKSM------------------DEDRRPEAVPK
        +A PS ND E L+KRLK+ KFK +ISRRLKRLD+ VGAI       E  L  FGVALKGIQ YLKK++K                      D+RP+  PK
Subjt:  DAGPSGNDSEALQKRLKRKKFKNKISRRLKRLDDRVGAIEATLTGFEATLTGFGVALKGIQRYLKKMSKSM------------------DEDRRPEAVPK

Query:  TDEYQTMDDNPKSMDEDPKNMDEDPMFMVEDQGTITERDNAS
               D   KSMDED ++ DED      D+   TE++  S
Subjt:  TDEYQTMDDNPKSMDEDPKNMDEDPMFMVEDQGTITERDNAS

A0A6J1DRS0 uncharacterized protein LOC111022607 | e value: 1.1e-108 | % identity: 68.03
Query:  PKSMDEDPKNMDEDPMFMVEDQGTIT-------------------ERDNASDAYPDRPVGLFQDAIVGMQEPDVASDTQPVSRRVRRPYKDWAPDAVIK-
        P S+DEDPK  D DPM M ED G IT                   E D+  D  P   V + QD  VG QEPD   DTQP  RRVRRPYKDWAPDA++K 
Subjt:  PKSMDEDPKNMDEDPMFMVEDQGTIT-------------------ERDNASDAYPDRPVGLFQDAIVGMQEPDVASDTQPVSRRVRRPYKDWAPDAVIK-

Query:  ------DEYDLQQAPTGRGLRKRHYSWKLKDIYTPTGQRGITVDRYDPVCPIPPQLDDKFQRWMDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEV
              DE DLQ APTGRGLRK HYSWKLK IYTPTG+R ITVD YDP CPIPPQLD +FQ WMDD   DGR RSTA G Q KEWYRDLLDP+V+LKDEV
Subjt:  ------DEYDLQQAPTGRGLRKRHYSWKLKDIYTPTGQRGITVDRYDPVCPIPPQLDDKFQRWMDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEV

Query:  LDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNH
        +D LVLFTAKKLEKC++LCRKKFAIGDVLLSTLLNRTDGPYAAMKP VLSTRI YP  +ENTI+RYV GRQS+ NV W+DADIVYTP+N+GGNH
Subjt:  LDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNH

A0A6J1DRZ7 uncharacterized protein LOC111023847 | e value: 2.3e-72 | % identity: 62.34
Query:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK
        +REVEEP+ D+ISF+LFG RVSFGKREFDLITGL H M R                                  F++DEDAVK+ IVYF+ELAMMGKERK
Subjt:  IREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRA--------------------------------VFDDDEDAVKVVIVYFVELAMMGKERK

Query:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC
          +DT+LLG+VDR E+FCN+DWSS+IFERTLWSLKNALKDK+  Y+QK   D +H ETYSLY FPYA+QVWAYETISTLS RVA RL DDAIPRLLRWSC
Subjt:  QFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNALKDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSC

Query:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAE
        TYSR F  L+R+VF+N  SKV   L +T+ E
Subjt:  TYSRGFLTLQRDVFDNTMSKVKEYLVSTNAE

A0A6J1DY60 uncharacterized protein LOC111025273 | e value: 7.2e-95 | % identity: 65.2
Query:  MDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEVLDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTI
        MDDP TD   RST+ G + K W+  LLDP  +L DE +D L++ TA+K+EKC HL R +FAIGDVLLS LL RTDGPYAAMKP VL ++  Y W +E TI
Subjt:  MDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEVLDGLVLFTAKKLEKCLHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTI

Query:  WRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIFSVRPDLPVVPWRVRRVRVPQ
        +RYV GRQS+++  WS+ADIVYT MN+GGNHWVM+GIDLV+GD+TVWDSLQ  TPL++LEK LKPMCTI+P +LH  GI ++RP+LP+VPWRVRR  VPQ
Subjt:  WRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHGGIFSVRPDLPVVPWRVRRVRVPQ

Query:  QSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQYTVQMWARRPIF
        Q+  TDC IFCVR+FEYD  GS ++TL Q NI  FRRQY VQMWARRP F
Subjt:  QSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQYTVQMWARRPIF

SwissProt top hits
No hits found

Arabidopsis top hits
AT4G08430.1 Ulp1 protease family protein | e value: 3.5e-09 | % identity: 29.46
Query:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHG-GIFSVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYF
        D D +Y  + V GNHWV L IDL +  I V+DS+ + T   E+  +   + T++P +L         R     + W+ R  ++P+   A DC I+ ++Y 
Subjt:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHG-GIFSVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYF

Query:  EYDATGSNMNTLTQYNIVYFRRQYTVQMW
        E  A G + + L   N+     +  V+M+
Subjt:  EYDATGSNMNTLTQYNIVYFRRQYTVQMW

AT5G28235.1 Ulp1 protease family protein | e value: 4.4e-04 | % identity: 36.21
Query:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLL
        D D +Y  + V GNHWV L IDL +  + V+DS+ + T   E+  +   + T++P +L
Subjt:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLL

AT5G45570.1 Ulp1 protease family protein | e value: 1.6e-09 | % identity: 28.68
Query:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHG-GIFSVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYF
        D D +Y  + V GNHWV L IDL    + V+DS+ + T   E+  +   + T++P +L         R     + W+ R  ++P+     DC I+ ++Y 
Subjt:  DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLLHHG-GIFSVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYF

Query:  EYDATGSNMNTLTQYNIVYFRRQYTVQMW
        E  A G + + L   N+   R +  V+M+
Subjt:  EYDATGSNMNTLTQYNIVYFRRQYTVQMW
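
The % identity values listed with each hit can be checked against the alignment text. The sketch below is not part of the database record: it assumes the usual BLAST convention that percent identity is the number of identical positions (letters on the middle line) divided by the alignment length (the length of the Query line, gaps included). The function name `percent_identity` is hypothetical, and the example strings are copied from the single-block AT5G28235.1 alignment above.

```python
def percent_identity(query: str, midline: str) -> float:
    """Percent identity of one BLAST-style pairwise alignment.

    query   -- the aligned Query line (gap characters '-' included)
    midline -- the middle line: a letter marks an identical residue,
               '+' a conservative substitution, a space a mismatch or gap

    Assumption: identity = identical positions / alignment length.
    """
    if not query:
        raise ValueError("empty query line")
    # Only letters on the middle line are exact matches; '+' is not counted.
    identities = sum(1 for ch in midline if ch.isalpha())
    return round(100.0 * identities / len(query), 2)


# AT5G28235.1 alignment, copied from the record (one block, 58 columns).
query = "DADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKELKPMCTILPTLL"
midline = "D D +Y  + V GNHWV L IDL +  + V+DS+ + T   E+  +   + T++P +L"

print(percent_identity(query, midline))  # 21 identities over 58 columns
```

For hits whose alignment spans several blocks, the Query lines and the middle lines would each need to be concatenated across blocks before calling the function.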


Sequences
CDS sequence
ATGAAAGGCAGAAAATGGGGTTTCATTCTCTGCCTTTCATCCCGGACCGTTTCGGTCCGGGATAAGGTCCGGGATGGAAGGCATAATGCATTATGTTGTGAAAGCCTTCC
ATCCCGAACCTTATCCTGGACCGTTTCGTTCCGGGATGAAAGGCAGAAAATGAAACCCCATTCATCCCGGAACGAAACGGTCCAGGATAAGGTCCGGGATAGAAGGCTTT
CACAGCATAATGCATTCTGCCTTCCATCCCGGACCTTATCCCGGATCGAAACGGTCCGGGATGAAAGGCAGAGAATGAAACCCCATTTTCTGCCTTTCATCCCGGAGGCG
AAATGGTTCGGGATACGAGAGGTTGAAGAGCCTAGGCAGGACATCATTAGTTTCGACCTGTTTGGGAAAAGGGTCTCTTTTGGTAAGCGGGAGTTTGACCTAATCACCGG
CCTCAGTCATAGGATGATTAGGGCAGTTTTTGACGATGATGAGGATGCTGTCAAAGTTGTCATAGTTTACTTCGTCGAGCTTGCCATGATGGGGAAGGAGAGGAAGCAGT
TTATAGATACGACCCTTTTAGGGGTTGTGGATAGGTCGGAGCTGTTCTGCAATCACGACTGGAGTTCGTTGATTTTCGAAAGAACACTTTGGAGCCTGAAGAATGCCCTG
AAGGATAAACTACCGGCGTACCAACAGAAGGCGAGAAATGACCCCACACACCAAGAGACTTATAGTCTCTACGGGTTTCCATACGCATATCAGGTATGGGCTTACGAGAC
GATATCGACGTTGAGTCTGCGCGTAGCCACGAGGCTGGGCGACGACGCCATTCCTCGACTCCTTAGGTGGTCGTGCACTTATTCTCGTGGGTTTCTTACTCTGCAGAGAG
ATGTGTTCGATAACACGATGTCCAAGGTTAAGGAATACTTGGTTTCGACTAATGCTGAGGCAGAACACATGGTCCGTATCATGCGTCCATCGGAAGCCCGCGCTATACCT
GCCCCGCTGGCTGTACCTGACCAGCCTGCAGTACCTGACCAGCCTGCAGTACCTGACCCGGCTGTTGTACCTGCCCCGGTTGCAGTACGTAACCCGTCTGCAGATTTGGA
AAGGGGTGCTGAGGAAAGAATGGGTCCTGCATTAGACGATGCTGGACCCAGTGGAAATGACAGCGAAGCCCTACAGAAGAGGTTGAAACGGAAAAAATTCAAAAATAAGA
TCAGTAGAAGGTTGAAGAGGCTCGATGACCGAGTTGGTGCTATCGAGGCCACACTGACTGGCTTCGAGGCCACACTGACTGGCTTCGGGGTTGCCCTGAAAGGCATCCAG
AGATACCTTAAAAAAATGTCGAAGAGTATGGACGAGGACCGGAGGCCGGAAGCGGTCCCTAAGACTGACGAGTATCAGACCATGGACGATAATCCGAAGAGTATGGACGA
GGATCCGAAGAATATGGACGAGGATCCGATGTTTATGGTTGAAGACCAGGGTACGATAACGGAGCGGGACAATGCATCGGATGCTTACCCCGATCGTCCTGTCGGTTTGT
TTCAGGATGCCATTGTTGGAATGCAAGAGCCGGACGTTGCATCAGATACGCAACCCGTCAGCCGACGCGTTAGGCGTCCCTATAAGGACTGGGCACCGGACGCAGTCATT
AAGGACGAATATGACCTTCAGCAGGCCCCAACTGGGCGTGGGCTACGCAAAAGGCATTACTCTTGGAAGCTGAAGGATATATACACACCAACCGGTCAGCGTGGGATCAC
CGTGGATAGATACGACCCAGTATGTCCCATTCCACCGCAGTTGGACGATAAGTTCCAGAGATGGATGGATGACCCGAAGACGGATGGGAGATTGCGGTCCACTGCAACTG
GTTTCCAAAAGAAGGAATGGTATCGCGATCTATTGGACCCTAGTGTTGAATTGAAGGACGAAGTACTTGATGGTCTCGTCCTGTTTACAGCGAAAAAGTTGGAGAAGTGT
CTCCATCTATGTCGCAAGAAGTTTGCGATAGGCGACGTACTTCTTTCGACTCTGCTGAATCGGACAGACGGTCCATATGCGGCCATGAAGCCGAGTGTATTGTCCACTAG
GATCAACTACCCTTGGTGCGAGGAGAATACAATCTGGCGATATGTCCACGGTAGGCAGTCGGAACACAACGTGCCCTGGAGTGATGCAGACATCGTGTACACCCCCATGA
ACGTAGGCGGGAACCACTGGGTGATGCTCGGGATCGACCTTGTACAGGGCGACATAACCGTATGGGATTCACTCCAAACGACCACTCCACTGGATGAACTTGAGAAGGAG
TTGAAGCCCATGTGTACAATCCTACCTACGCTACTGCATCATGGCGGGATATTTTCAGTTCGACCCGATTTGCCAGTGGTGCCGTGGAGGGTACGTCGGGTTCGCGTACC
ACAGCAGAGTAGTGCGACTGATTGCGGGATTTTCTGTGTCCGGTATTTCGAGTACGATGCCACCGGGTCAAATATGAACACTTTAACCCAATATAATATTGTATATTTTA
GGCGTCAGTACACTGTACAGATGTGGGCGCGTCGTCCCATTTTTTGA
mRNA sequence
ATGAAAGGCAGAAAATGGGGTTTCATTCTCTGCCTTTCATCCCGGACCGTTTCGGTCCGGGATAAGGTCCGGGATGGAAGGCATAATGCATTATGTTGTGAAAGCCTTCC
ATCCCGAACCTTATCCTGGACCGTTTCGTTCCGGGATGAAAGGCAGAAAATGAAACCCCATTCATCCCGGAACGAAACGGTCCAGGATAAGGTCCGGGATAGAAGGCTTT
CACAGCATAATGCATTCTGCCTTCCATCCCGGACCTTATCCCGGATCGAAACGGTCCGGGATGAAAGGCAGAGAATGAAACCCCATTTTCTGCCTTTCATCCCGGAGGCG
AAATGGTTCGGGATACGAGAGGTTGAAGAGCCTAGGCAGGACATCATTAGTTTCGACCTGTTTGGGAAAAGGGTCTCTTTTGGTAAGCGGGAGTTTGACCTAATCACCGG
CCTCAGTCATAGGATGATTAGGGCAGTTTTTGACGATGATGAGGATGCTGTCAAAGTTGTCATAGTTTACTTCGTCGAGCTTGCCATGATGGGGAAGGAGAGGAAGCAGT
TTATAGATACGACCCTTTTAGGGGTTGTGGATAGGTCGGAGCTGTTCTGCAATCACGACTGGAGTTCGTTGATTTTCGAAAGAACACTTTGGAGCCTGAAGAATGCCCTG
AAGGATAAACTACCGGCGTACCAACAGAAGGCGAGAAATGACCCCACACACCAAGAGACTTATAGTCTCTACGGGTTTCCATACGCATATCAGGTATGGGCTTACGAGAC
GATATCGACGTTGAGTCTGCGCGTAGCCACGAGGCTGGGCGACGACGCCATTCCTCGACTCCTTAGGTGGTCGTGCACTTATTCTCGTGGGTTTCTTACTCTGCAGAGAG
ATGTGTTCGATAACACGATGTCCAAGGTTAAGGAATACTTGGTTTCGACTAATGCTGAGGCAGAACACATGGTCCGTATCATGCGTCCATCGGAAGCCCGCGCTATACCT
GCCCCGCTGGCTGTACCTGACCAGCCTGCAGTACCTGACCAGCCTGCAGTACCTGACCCGGCTGTTGTACCTGCCCCGGTTGCAGTACGTAACCCGTCTGCAGATTTGGA
AAGGGGTGCTGAGGAAAGAATGGGTCCTGCATTAGACGATGCTGGACCCAGTGGAAATGACAGCGAAGCCCTACAGAAGAGGTTGAAACGGAAAAAATTCAAAAATAAGA
TCAGTAGAAGGTTGAAGAGGCTCGATGACCGAGTTGGTGCTATCGAGGCCACACTGACTGGCTTCGAGGCCACACTGACTGGCTTCGGGGTTGCCCTGAAAGGCATCCAG
AGATACCTTAAAAAAATGTCGAAGAGTATGGACGAGGACCGGAGGCCGGAAGCGGTCCCTAAGACTGACGAGTATCAGACCATGGACGATAATCCGAAGAGTATGGACGA
GGATCCGAAGAATATGGACGAGGATCCGATGTTTATGGTTGAAGACCAGGGTACGATAACGGAGCGGGACAATGCATCGGATGCTTACCCCGATCGTCCTGTCGGTTTGT
TTCAGGATGCCATTGTTGGAATGCAAGAGCCGGACGTTGCATCAGATACGCAACCCGTCAGCCGACGCGTTAGGCGTCCCTATAAGGACTGGGCACCGGACGCAGTCATT
AAGGACGAATATGACCTTCAGCAGGCCCCAACTGGGCGTGGGCTACGCAAAAGGCATTACTCTTGGAAGCTGAAGGATATATACACACCAACCGGTCAGCGTGGGATCAC
CGTGGATAGATACGACCCAGTATGTCCCATTCCACCGCAGTTGGACGATAAGTTCCAGAGATGGATGGATGACCCGAAGACGGATGGGAGATTGCGGTCCACTGCAACTG
GTTTCCAAAAGAAGGAATGGTATCGCGATCTATTGGACCCTAGTGTTGAATTGAAGGACGAAGTACTTGATGGTCTCGTCCTGTTTACAGCGAAAAAGTTGGAGAAGTGT
CTCCATCTATGTCGCAAGAAGTTTGCGATAGGCGACGTACTTCTTTCGACTCTGCTGAATCGGACAGACGGTCCATATGCGGCCATGAAGCCGAGTGTATTGTCCACTAG
GATCAACTACCCTTGGTGCGAGGAGAATACAATCTGGCGATATGTCCACGGTAGGCAGTCGGAACACAACGTGCCCTGGAGTGATGCAGACATCGTGTACACCCCCATGA
ACGTAGGCGGGAACCACTGGGTGATGCTCGGGATCGACCTTGTACAGGGCGACATAACCGTATGGGATTCACTCCAAACGACCACTCCACTGGATGAACTTGAGAAGGAG
TTGAAGCCCATGTGTACAATCCTACCTACGCTACTGCATCATGGCGGGATATTTTCAGTTCGACCCGATTTGCCAGTGGTGCCGTGGAGGGTACGTCGGGTTCGCGTACC
ACAGCAGAGTAGTGCGACTGATTGCGGGATTTTCTGTGTCCGGTATTTCGAGTACGATGCCACCGGGTCAAATATGAACACTTTAACCCAATATAATATTGTATATTTTA
GGCGTCAGTACACTGTACAGATGTGGGCGCGTCGTCCCATTTTTTGA
Protein sequence
MKGRKWGFILCLSSRTVSVRDKVRDGRHNALCCESLPSRTLSWTVSFRDERQKMKPHSSRNETVQDKVRDRRLSQHNAFCLPSRTLSRIETVRDERQRMKPHFLPFIPEA
KWFGIREVEEPRQDIISFDLFGKRVSFGKREFDLITGLSHRMIRAVFDDDEDAVKVVIVYFVELAMMGKERKQFIDTTLLGVVDRSELFCNHDWSSLIFERTLWSLKNAL
KDKLPAYQQKARNDPTHQETYSLYGFPYAYQVWAYETISTLSLRVATRLGDDAIPRLLRWSCTYSRGFLTLQRDVFDNTMSKVKEYLVSTNAEAEHMVRIMRPSEARAIP
APLAVPDQPAVPDQPAVPDPAVVPAPVAVRNPSADLERGAEERMGPALDDAGPSGNDSEALQKRLKRKKFKNKISRRLKRLDDRVGAIEATLTGFEATLTGFGVALKGIQ
RYLKKMSKSMDEDRRPEAVPKTDEYQTMDDNPKSMDEDPKNMDEDPMFMVEDQGTITERDNASDAYPDRPVGLFQDAIVGMQEPDVASDTQPVSRRVRRPYKDWAPDAVI
KDEYDLQQAPTGRGLRKRHYSWKLKDIYTPTGQRGITVDRYDPVCPIPPQLDDKFQRWMDDPKTDGRLRSTATGFQKKEWYRDLLDPSVELKDEVLDGLVLFTAKKLEKC
LHLCRKKFAIGDVLLSTLLNRTDGPYAAMKPSVLSTRINYPWCEENTIWRYVHGRQSEHNVPWSDADIVYTPMNVGGNHWVMLGIDLVQGDITVWDSLQTTTPLDELEKE
LKPMCTILPTLLHHGGIFSVRPDLPVVPWRVRRVRVPQQSSATDCGIFCVRYFEYDATGSNMNTLTQYNIVYFRRQYTVQMWARRPIF
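
The protein sequence above appears to be the conceptual translation of the CDS. The sketch below shows one way to check that relationship. It assumes the standard genetic code (NCBI translation table 1), which the record itself does not state, and the names `CODON_TABLE` and `translate` are hypothetical helpers rather than anything provided by CuGenDB.

```python
# Standard genetic code (NCBI translation table 1), with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS into protein, stopping at the first stop codon.

    Whitespace and line breaks are removed first, so the wrapped sequence
    can be pasted as-is. Codons containing characters other than A/C/G/T
    are translated as 'X'.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First eight codons and last seven codons of the CDS listed above.
print(translate("ATGAAAGGCAGAAAATGGGGTTTC"))
print(translate("GCGCGTCGTCCCATTTTTTGA"))
```

Passing the full CDS text to `translate` and comparing the result with the protein sequence (line breaks removed) would confirm whether the two entries agree under this assumption.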