CuGenDBv2

Moc04g08960 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc04g08960
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Ulp1-like peptidase
Genome location: chr4:6606071..6615240
RNA-Seq Expression: Moc04g08960
Synteny: Moc04g08960
Gene Ontology terms:
  GO:0006508 - proteolysis (biological process)
  GO:0008234 - cysteine-type peptidase activity (molecular function)
InterPro domains:
  IPR003653 - Ulp1 protease family, C-terminal catalytic domain
  IPR015410 - Domain of unknown function DUF1985
  IPR029472 - Retrotransposon Copia-like, N-terminal
  IPR038765 - Papain-like cysteine peptidase superfamily
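
The genome location above is written as a chromosome name followed by start and end coordinates separated by `..`. The following is a minimal sketch of how such a string could be split into its parts; the function name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which the record itself does not state.

```python
import re

def parse_location(location):
    """Split a 'chrom:start..end' string into (chrom, start, end).

    Hypothetical helper for illustration; not part of any database API.
    """
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)

chrom, start, end = parse_location("chr4:6606071..6615240")

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the locus covers 9170 bp. This is the genomic span, which may include introns and therefore need not equal the CDS length.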


Homology
GenBank top hits (e value, % identity, alignment)
XP_022154561.1 uncharacterized protein LOC111021802 [Momordica charantia] (e value: 6.5e-107; identity: 59.56%)
Query:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ
        MFRKT F HLLDVDLVFNG L+HNILLREVE+ST ++ISFNLF R++SF R +F LISGLKY    VR++T  HRL  LYFND++DLVLSD E +Y AA+
Subjt:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ

Query:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR
        F+DD+D VKV IVYMV + LL RER VKFD TLLGIVDDWE+CCNY+WASLSFEKTI SL+RGP KM+KDG  RKSYSLYGF W   VWAY+ ISSLS R
Subjt:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR

Query:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQPQSRVCEGPQEPNVGQGDAVGPSAVREG-----
        VANK   D VP I +WR  HSTAWHVLDR+IF ST GRT+ L+ +D ETSF+  S        DD   +             +GD  GPSAVREG     
Subjt:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQPQSRVCEGPQEPNVGQGDAVGPSAVREG-----

Query:  --RGIGTNIV---------DSKGQNNARTSSMRLRKVEKHLKNIDKRIGERMSGMEAELKAIKKYL
          RG    +V         ++KG+N    S+ RL++VEK LK++DKR+ ERM  +EAELK+IKK+L
Subjt:  --RGIGTNIV---------DSKGQNNARTSSMRLRKVEKHLKNIDKRIGERMSGMEAELKAIKKYL

XP_022156465.1 uncharacterized protein LOC111023353 [Momordica charantia] (e value: 1.3e-83; identity: 59.53%)
Query:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ
        MFRKT+FGHLLDVDLVFNGPL+HNILLREVEDST ++ISFNLFGR+VSFGRREFDLISGL YD + VRK TH H+LR LYFNDR++ VLSD   LY AA 
Subjt:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ

Query:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR
        F+DDFD +KVSI+YMVE+VLL RE T+KFDQ LLG+VDDWE+CCN+D ASLSF+KTI SL RGP  MAKD G RKSYSLYGF W   VW YE        
Subjt:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR

Query:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQP----QSRVCEGPQEPNVGQGDAVGPSAVREG
                                             RT+RLEA+DAET+FMR +      + DD +      S V EG Q P+VG+GD  GPSAVREG
Subjt:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQP----QSRVCEGPQEPNVGQGDAVGPSAVREG

XP_022158083.1 uncharacterized protein LOC111024651 [Momordica charantia] (e value: 2.5e-95; identity: 95.72%)
Query:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL
        MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKL PRQLAMFRKT+F HLLDVDLVFNGPLVHNILLREVEDST DSISFNLFGRKVSFGRREFDLISGL
Subjt:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL

Query:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYD
        KYDG+LVRKDTHVHRLRALYFNDRSDLVLSDLE+LYEAAQFQDDFDAVKVSIVYMVEMVLL RERTVKFDQTLLGIVDDWE+CCNYD
Subjt:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYD

XP_022158660.1 uncharacterized protein LOC111025123 [Momordica charantia] (e value: 8.2e-86; identity: 93.3%)
Query:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL
        MELRPKIDPAIYASA VSCLSHLAKTVTAIKGKL PRQL+MFRKT+FGHLL+VDLVFNG LVHNILLREVEDST DSISFNLFGRKVSFGRREFDLISGL
Subjt:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL

Query:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDD
        KYDG+LV+KDTHVHRLRALYFNDR DLVLSDLE+LYEAAQFQDDFDAVKVSIVYMVEMVLL RERTVKFDQTLLGIVDD
Subjt:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDD

XP_022158744.1 uncharacterized protein LOC111025209 [Momordica charantia] (e value: 4.8e-86; identity: 62.13%)
Query:  PKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDG
        PKIDPA YASA ++CLSH+AKT   IK KL P+QLAMFRKT+F HLLDVDLVFNGPL+                     G KVSFGRREFD+ISGLKY  
Subjt:  PKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDG

Query:  NLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGP
        + VRK T+  R   LYFN+ +DL+LS+LE +Y + +F+DD DAVKV +VY VE+VLL RER+ KFD  LLGIVDDWE CCN+DWA LSF+KTI SL+RG 
Subjt:  NLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGP

Query:  AKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGRVANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTT
        +  +K+GG RKSYSLYGF W   VWAYE+ISSLSG +     +D VPRILQWR  HSTA+H+L REIFRS+T
Subjt:  AKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGRVANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTT

TrEMBL top hits (e value, % identity, alignment)
A0A6J1DP34 uncharacterized protein LOC111021802 (e value: 3.1e-107; identity: 59.56%)
Query:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ
        MFRKT F HLLDVDLVFNG L+HNILLREVE+ST ++ISFNLF R++SF R +F LISGLKY    VR++T  HRL  LYFND++DLVLSD E +Y AA+
Subjt:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ

Query:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR
        F+DD+D VKV IVYMV + LL RER VKFD TLLGIVDDWE+CCNY+WASLSFEKTI SL+RGP KM+KDG  RKSYSLYGF W   VWAY+ ISSLS R
Subjt:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR

Query:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQPQSRVCEGPQEPNVGQGDAVGPSAVREG-----
        VANK   D VP I +WR  HSTAWHVLDR+IF ST GRT+ L+ +D ETSF+  S        DD   +             +GD  GPSAVREG     
Subjt:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQPQSRVCEGPQEPNVGQGDAVGPSAVREG-----

Query:  --RGIGTNIV---------DSKGQNNARTSSMRLRKVEKHLKNIDKRIGERMSGMEAELKAIKKYL
          RG    +V         ++KG+N    S+ RL++VEK LK++DKR+ ERM  +EAELK+IKK+L
Subjt:  --RGIGTNIV---------DSKGQNNARTSSMRLRKVEKHLKNIDKRIGERMSGMEAELKAIKKYL

A0A6J1DQC8 uncharacterized protein LOC111023353 (e value: 6.4e-84; identity: 59.53%)
Query:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ
        MFRKT+FGHLLDVDLVFNGPL+HNILLREVEDST ++ISFNLFGR+VSFGRREFDLISGL YD + VRK TH H+LR LYFNDR++ VLSD   LY AA 
Subjt:  MFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQ

Query:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR
        F+DDFD +KVSI+YMVE+VLL RE T+KFDQ LLG+VDDWE+CCN+D ASLSF+KTI SL RGP  MAKD G RKSYSLYGF W   VW YE        
Subjt:  FQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGR

Query:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQP----QSRVCEGPQEPNVGQGDAVGPSAVREG
                                             RT+RLEA+DAET+FMR +      + DD +      S V EG Q P+VG+GD  GPSAVREG
Subjt:  VANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSC---WGQGDDAQP----QSRVCEGPQEPNVGQGDAVGPSAVREG

A0A6J1DV44 uncharacterized protein LOC111024651 (e value: 1.2e-95; identity: 95.72%)
Query:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL
        MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKL PRQLAMFRKT+F HLLDVDLVFNGPLVHNILLREVEDST DSISFNLFGRKVSFGRREFDLISGL
Subjt:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL

Query:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYD
        KYDG+LVRKDTHVHRLRALYFNDRSDLVLSDLE+LYEAAQFQDDFDAVKVSIVYMVEMVLL RERTVKFDQTLLGIVDDWE+CCNYD
Subjt:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYD

A0A6J1DWG2 uncharacterized protein LOC111025123 (e value: 4.0e-86; identity: 93.3%)
Query:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL
        MELRPKIDPAIYASA VSCLSHLAKTVTAIKGKL PRQL+MFRKT+FGHLL+VDLVFNG LVHNILLREVEDST DSISFNLFGRKVSFGRREFDLISGL
Subjt:  MELRPKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGL

Query:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDD
        KYDG+LV+KDTHVHRLRALYFNDR DLVLSDLE+LYEAAQFQDDFDAVKVSIVYMVEMVLL RERTVKFDQTLLGIVDD
Subjt:  KYDGNLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDD

A0A6J1E0A9 uncharacterized protein LOC111025209 (e value: 2.3e-86; identity: 62.13%)
Query:  PKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDG
        PKIDPA YASA ++CLSH+AKT   IK KL P+QLAMFRKT+F HLLDVDLVFNGPL+                     G KVSFGRREFD+ISGLKY  
Subjt:  PKIDPAIYASANVSCLSHLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDG

Query:  NLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGP
        + VRK T+  R   LYFN+ +DL+LS+LE +Y + +F+DD DAVKV +VY VE+VLL RER+ KFD  LLGIVDDWE CCN+DWA LSF+KTI SL+RG 
Subjt:  NLVRKDTHVHRLRALYFNDRSDLVLSDLENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGP

Query:  AKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGRVANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTT
        +  +K+GG RKSYSLYGF W   VWAYE+ISSLSG +     +D VPRILQWR  HSTA+H+L REIFRS+T
Subjt:  AKMAKDGGFRKSYSLYGFSW---VWAYEVISSLSGRVANKFFEDAVPRILQWRCGHSTAWHVLDREIFRSTT

SwissProt top hits (e value, % identity, alignment)
Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 (e value: 7.4e-05; identity: 20.1%)
Query:  LHSPIFLLSNICNLMSIRLDSSNYVLWKFQLTAILKAHKLFGFIEGTTVQPQQFLITTTESSSVTSINPMFEEWIVKDQ----AFMTLINVTLSPAA---
        L++   L  N+ N+   +L S+NY++W  Q+ A+   ++L GF++G+T  P      T  + +   +NP +  W  +D+    A +  I++++ PA    
Subjt:  LHSPIFLLSNICNLMSIRLDSSNYVLWKFQLTAILKAHKLFGFIEGTTVQPQQFLITTTESSSVTSINPMFEEWIVKDQ----AFMTLINVTLSPAA---

Query:  -----------------------------LAYVKPGESISDYVKRIKELKDKLANVSVVINDEALLIYALNGLPAEYNAFRTSIRTRPQAISFEEFMFSY
                                       + K  ++I DY++ +    D+LA +   ++ +  +   L  LP EY      I  +    +  E     
Subjt:  -----------------------------LAYVKPGESISDYVKRIKELKDKLANVSVVINDEALLIYALNGLPAEYNAFRTSIRTRPQAISFEEFMFSY

Query:  CLKNQQLISRINARIFFLNLRLCSLLRS---------------------------HSSEIIFLEN-----PLLLNIKL------AAMVASQNHNFVSGI-
             ++++  +A +  +     S   +                             S   F  N     P L   ++      +A   SQ  +F+S + 
Subjt:  CLKNQQLISRINARIFFLNLRLCSLLRS---------------------------HSSEIIFLEN-----PLLLNIKL------AAMVASQNHNFVSGI-

Query:  -----------------------SSTAWLTDSGCNAHVTSDLSHLSNASKYNGEDQVSVGSGQSLPVTHSSCGTVSEPFSARP
                               SS  WL DSG   H+TSD ++LS    Y G D V V  G ++P++H+  G+ S    +RP
Subjt:  -----------------------SSTAWLTDSGCNAHVTSDLSHLSNASKYNGEDQVSVGSGQSLPVTHSSCGTVSEPFSARP

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 (e value: 2.8e-04; identity: 22.15%)
Query:  IFLLSNICNL-MS--IRLDSSNYVLWKFQLTAILKAHKLFGFIEGTTVQPQQFLITTTESSSVTSINPMFEEWIVKDQ----AFMTLINVTLSPAA----
        + + +NI N+ MS   +L S+NY++W  Q+ A+   ++L GF++G+T  P      T  + +V  +NP +  W  +D+    A +  I++++ PA     
Subjt:  IFLLSNICNL-MS--IRLDSSNYVLWKFQLTAILKAHKLFGFIEGTTVQPQQFLITTTESSSVTSINPMFEEWIVKDQ----AFMTLINVTLSPAA----

Query:  ----------LAYVKPGESISDYVKRIKELKDKLANVSVVINDEALLIYALNGLPAEYNAFRTSIRTRPQAISFEEFMFSYCLKNQQLISRINARIFFLN
                    Y  P       ++ I    D+LA +   ++ +  +   L  LP +Y      I  +    S  E       +  +L++  +A +  + 
Subjt:  ----------LAYVKPGESISDYVKRIKELKDKLANVSVVINDEALLIYALNGLPAEYNAFRTSIRTRPQAISFEEFMFSYCLKNQQLISRINARIFFLN

Query:  LRLCSLLRSH--------------------------SSEIIFLEN----PLLLNIKL------AAMVASQNHNFV------------------------S
          + +   ++                          SS     +N    P L   ++      +A    Q H F                         S
Subjt:  LRLCSLLRSH--------------------------SSEIIFLEN----PLLLNIKL------AAMVASQNHNFV------------------------S

Query:  GISSTAWLTDSGCNAHVTSDLSHLSNASKYNGEDQVSVGSGQSLPVTHSSCGTVSEPFSARPTAAMGPFDGDNSPLLTATATFRNGCSSFKFTYHDCHGY
          ++  WL DSG   H+TSD ++LS    Y G D V +  G ++P+TH+  G+ S P S+R          D + +L      +N  S     Y  C+  
Subjt:  GISSTAWLTDSGCNAHVTSDLSHLSNASKYNGEDQVSVGSGQSLPVTHSSCGTVSEPFSARPTAAMGPFDGDNSPLLTATATFRNGCSSFKFTYHDCHGY

Query:  NSTYTCYVSGSSINAAMDVGLLGTAIPLNPLDTAVNSRQYP-SGTAANSEVAVPIGPKGTS-----IG-PSIGIVNGPLVISMELRPKIDPA
            T  VS     A+  V  L T +PL    T     ++P + + A S  A P      S     +G PS+ I+N   VIS    P ++P+
Subjt:  NSTYTCYVSGSSINAAMDVGLLGTAIPLNPLDTAVNSRQYP-SGTAANSEVAVPIGPKGTS-----IG-PSIGIVNGPLVISMELRPKIDPA

Arabidopsis top hits (e value, % identity, alignment)
No hits found
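
The alignment blocks in the Homology section use the usual BLAST pairwise layout, where the unlabeled middle line shows a residue letter at identical positions, `+` at conservative substitutions, and a space at mismatches or gaps. As a rough sketch of how identity could be tallied from that middle line, the snippet below counts those symbols for the first 23 columns of the XP_022154561.1 alignment. The function name `midline_stats` is hypothetical, and the result for this fragment is not expected to match the whole-hit identity reported on the page.

```python
def midline_stats(midline):
    """Return (aligned columns, identities, positives) for a BLAST midline.

    Hypothetical helper for illustration. Letters mark identical residues,
    '+' marks conservative substitutions, spaces mark mismatches or gaps.
    """
    aligned = len(midline)
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return aligned, identities, positives

# First 23 alignment columns of the XP_022154561.1 hit, copied from the record.
query   = "MFRKTVFGHLLDVDLVFNGPLVH"
midline = "MFRKT F HLLDVDLVFNG L+H"

aligned, identities, positives = midline_stats(midline)
percent_identity = 100.0 * identities / aligned
print(aligned, identities, positives, round(percent_identity, 2))
```

Counting over every block of a hit, rather than one fragment, would be needed to approach the percentage shown in the hit header, and the exact denominator the database uses is not stated in the record.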

Sequences
CDS sequence
ATGACATCGTCGTCCTCCTTGAGTGGTGATTTATCACAGACTGCTAGTTCTTCGATTGGTAATCTACATTCACCAATCTTTCTGTTGTCAAATATCTGCAACCTTATGTC
AATTCGATTAGATTCATCAAATTACGTCTTGTGGAAGTTCCAACTCACCGCAATTCTCAAGGCACACAAATTGTTTGGGTTTATTGAGGGAACAACAGTCCAACCTCAAC
AATTTCTGATCACAACCACAGAATCTTCTTCTGTGACTTCAATCAATCCAATGTTTGAAGAATGGATCGTCAAAGATCAGGCTTTCATGACTCTGATAAATGTCACGCTC
TCACCTGCTGCTTTGGCCTATGTGAAGCCAGGGGAATCCATCAGTGACTATGTCAAGCGTATCAAAGAACTCAAAGACAAGCTTGCAAATGTATCGGTTGTTATCAATGA
TGAGGCTTTATTGATTTATGCCCTAAATGGCCTCCCTGCAGAGTACAATGCTTTTCGCACATCAATTCGTACTCGTCCTCAAGCCATTTCATTTGAAGAATTCATGTTCT
CCTACTGTCTAAAGAATCAGCAATTAATAAGCAGGATAAACGCGAGGATATTCTTCCTCAACCTTCGGCTATGCTCGCTTCTCAGATCTCACAGCAGCGAAATAATTTTT
CTCGAAAATCCTCTACTTCTGAACATCAAGCTTGCAGCAATGGTAGCCTCTCAAAACCATAATTTTGTCTCTGGAATTTCTTCTACCGCTTGGCTTACTGACTCTGGCTG
TAACGCTCATGTTACATCCGATTTGAGTCATCTCTCCAACGCCTCTAAATATAATGGTGAAGATCAAGTTTCTGTCGGCAGTGGGCAGTCCCTTCCTGTTACTCACTCAA
GTTGTGGTACTGTTTCTGAACCATTTTCTGCTAGGCCTACTGCTGCTATGGGTCCATTTGATGGTGATAATTCACCTTTGCTCACTGCTACTGCTACGTTCAGAAATGGG
TGTAGTAGCTTTAAATTCACCTACCATGACTGTCACGGGTACAACAGTACCTACACCTGCTATGTCTCTGGTTCTTCAATTAATGCTGCTATGGATGTAGGACTTTTGGG
TACAGCAATTCCTCTTAATCCTTTGGATACAGCAGTCAATTCAAGGCAATATCCTTCGGGTACAGCAGCAAATTCAGAAGTTGCAGTACCCATTGGTCCAAAAGGGACAT
CTATTGGTCCTTCTATTGGTATTGTAAATGGTCCTTTGGTTATAAGTATGGAATTGAGACCGAAAATTGACCCTGCAATCTATGCATCTGCAAACGTGTCCTGTTTATCG
CATCTAGCGAAGACAGTGACTGCTATTAAGGGAAAATTGGTCCCTAGACAGCTAGCTATGTTTAGGAAAACCGTATTCGGTCATTTGCTGGACGTGGACCTCGTTTTTAA
CGGGCCATTGGTACACAATATATTACTTAGAGAGGTTGAGGATAGTACGGCGGACAGTATTAGTTTCAACCTGTTTGGGAGAAAGGTGTCGTTCGGACGGAGGGAATTTG
ACCTTATTAGTGGCCTTAAGTATGACGGGAACCTAGTTAGGAAAGATACTCATGTTCATAGACTTAGGGCTCTGTACTTTAACGATAGGTCTGACCTTGTGTTGAGTGAT
TTAGAAAACCTATATGAAGCCGCCCAGTTTCAGGATGACTTCGATGCGGTTAAGGTATCCATTGTTTACATGGTCGAAATGGTCTTGCTAAGGAGGGAGAGAACTGTGAA
GTTCGACCAGACGCTGTTAGGAATAGTGGATGACTGGGAGATCTGCTGCAACTACGACTGGGCGTCCCTATCGTTCGAGAAGACGATAGGTAGTCTTCGTCGTGGCCCAG
CCAAGATGGCAAAGGATGGAGGGTTCAGGAAATCATATAGCCTGTATGGTTTCTCCTGGGTGTGGGCGTACGAGGTGATATCTTCCCTATCTGGTCGGGTCGCAAATAAA
TTTTTTGAGGATGCAGTGCCACGTATCCTCCAATGGAGGTGTGGCCATTCGACTGCATGGCATGTGCTCGATCGGGAGATTTTTCGGTCTACAACGGGAAGAACGCAAAG
ATTAGAGGCAAGTGATGCTGAGACGAGCTTCATGAGAGGATCCTGCTGGGGTCAAGGCGATGATGCTCAACCGCAGTCACGTGTATGTGAGGGCCCACAGGAGCCTAATG
TGGGCCAAGGTGATGCTGTTGGACCATCAGCTGTGCGTGAGGGACGAGGCATCGGTACGAACATTGTCGATTCTAAAGGTCAGAATAACGCTCGCACATCTAGCATGCGC
TTGAGGAAGGTTGAGAAACACTTGAAGAACATAGACAAGCGTATTGGCGAGCGTATGTCTGGCATGGAGGCTGAATTGAAAGCAATCAAGAAGTATTTGAGGCGACTTGC
TAAGGATGACATGAGGAGAAGAAAAGGTACCGACTCGGGCGGTGGTGCTGGCCCGAGAGATGGTGATGAGCTGGGAGATGGTACAGATCCAAGGGATCGTGGTGAGCCGA
GGGATGGTGGTGGATCGGGAGAGAGCACCGAGGCGGATACCGGTATTGAGCTGGGAGATACTATCGAGCAGGATGCCGGTGCTGGGTCGGTTGATTGTACCGCGCCTAAT
GTTGGTATTGCAGATGGGTCGGGAGATGTGGGCGCCGGCGGTGATGACACGGTGGTCGATACTGTAAAGGAACATTACATGGACAATTTCACGGATTCCGACATGGAGGA
AACGAGGGAGGTTAAATCACACATGGACGGAGATGAGGTCGTGTGCTATCGCAGCCAGTTGCACCACCGAATCCGCGACGTGGCTCTCGGAAGAGGAAGGCACCGTGGAT
CGTTTGTGGTCATGGAGGACGGGAGGAAAAAGAAGGCTATCCAGTATGATCCGCTAGTACGGATCCCTCCTGAGCAGGTCACCAAGTTCCACAATTGGATGAGTAGCCCT
ATCACGAAGCATGCGAAGAGGAAATCGTGCTATGGTAAGAAGGACAAGACGTGGTTTCGTGACCTTTTAACGTCGGACAAGTGGCTGATCAGCGAGGTAATCGATTCGCT
GGTCATGTTCACACGGAACAAGCTGGAGCAGCGTCCCGACTTGTGCTCTAGGAGGTTTACGACTGGTGACATATTGCTTGCGAACTTCTTCCGCCAGACCGATGGTATAT
ACCAGAGATTGATCGCCCCCAATGCCGTTCCAGCAATAGTGTCGGCCGAATATGACTGGGAGAGTCGATACAAAACGATCATGAGCTATGTCGACGGCACTCACACAAAC
TATGGGACACGGTGGCTGGACCTCATGTTGTATACTTGTCATACAACATCGGTGGAAATCATTGGATCATGGTTTGCATTCGACATGCAAGAAAGTGAGATCATCGTATG
GGATTCTATGATGGCAATCACAACACCGGCTACTCTGGAGGAGGAGTTGAAACCGATGAGCGTCATTCTCCCAGCGTTGATGTGTAGAGCCAGAGTTAGGGTTTTGAGGC
CTACCATACCCACTGTACCATGGCGCATCCGTCGAGTAACCGGGGCTCCACAGCAGACTGGTTCAGGTGATTGTGGTATTTCCTGTGTTAAGTTTTTTGAGTACGATGTA
ACGGGTTCCAATTTCGCGACTCTAACTCAAGAGAGGATCCCATTTTTTAGGGAGAAACTCACCATTGAAGTATGGGCGAACCGGTGTATTTTTTGA
mRNA sequence
ATGACATCGTCGTCCTCCTTGAGTGGTGATTTATCACAGACTGCTAGTTCTTCGATTGGTAATCTACATTCACCAATCTTTCTGTTGTCAAATATCTGCAACCTTATGTC
AATTCGATTAGATTCATCAAATTACGTCTTGTGGAAGTTCCAACTCACCGCAATTCTCAAGGCACACAAATTGTTTGGGTTTATTGAGGGAACAACAGTCCAACCTCAAC
AATTTCTGATCACAACCACAGAATCTTCTTCTGTGACTTCAATCAATCCAATGTTTGAAGAATGGATCGTCAAAGATCAGGCTTTCATGACTCTGATAAATGTCACGCTC
TCACCTGCTGCTTTGGCCTATGTGAAGCCAGGGGAATCCATCAGTGACTATGTCAAGCGTATCAAAGAACTCAAAGACAAGCTTGCAAATGTATCGGTTGTTATCAATGA
TGAGGCTTTATTGATTTATGCCCTAAATGGCCTCCCTGCAGAGTACAATGCTTTTCGCACATCAATTCGTACTCGTCCTCAAGCCATTTCATTTGAAGAATTCATGTTCT
CCTACTGTCTAAAGAATCAGCAATTAATAAGCAGGATAAACGCGAGGATATTCTTCCTCAACCTTCGGCTATGCTCGCTTCTCAGATCTCACAGCAGCGAAATAATTTTT
CTCGAAAATCCTCTACTTCTGAACATCAAGCTTGCAGCAATGGTAGCCTCTCAAAACCATAATTTTGTCTCTGGAATTTCTTCTACCGCTTGGCTTACTGACTCTGGCTG
TAACGCTCATGTTACATCCGATTTGAGTCATCTCTCCAACGCCTCTAAATATAATGGTGAAGATCAAGTTTCTGTCGGCAGTGGGCAGTCCCTTCCTGTTACTCACTCAA
GTTGTGGTACTGTTTCTGAACCATTTTCTGCTAGGCCTACTGCTGCTATGGGTCCATTTGATGGTGATAATTCACCTTTGCTCACTGCTACTGCTACGTTCAGAAATGGG
TGTAGTAGCTTTAAATTCACCTACCATGACTGTCACGGGTACAACAGTACCTACACCTGCTATGTCTCTGGTTCTTCAATTAATGCTGCTATGGATGTAGGACTTTTGGG
TACAGCAATTCCTCTTAATCCTTTGGATACAGCAGTCAATTCAAGGCAATATCCTTCGGGTACAGCAGCAAATTCAGAAGTTGCAGTACCCATTGGTCCAAAAGGGACAT
CTATTGGTCCTTCTATTGGTATTGTAAATGGTCCTTTGGTTATAAGTATGGAATTGAGACCGAAAATTGACCCTGCAATCTATGCATCTGCAAACGTGTCCTGTTTATCG
CATCTAGCGAAGACAGTGACTGCTATTAAGGGAAAATTGGTCCCTAGACAGCTAGCTATGTTTAGGAAAACCGTATTCGGTCATTTGCTGGACGTGGACCTCGTTTTTAA
CGGGCCATTGGTACACAATATATTACTTAGAGAGGTTGAGGATAGTACGGCGGACAGTATTAGTTTCAACCTGTTTGGGAGAAAGGTGTCGTTCGGACGGAGGGAATTTG
ACCTTATTAGTGGCCTTAAGTATGACGGGAACCTAGTTAGGAAAGATACTCATGTTCATAGACTTAGGGCTCTGTACTTTAACGATAGGTCTGACCTTGTGTTGAGTGAT
TTAGAAAACCTATATGAAGCCGCCCAGTTTCAGGATGACTTCGATGCGGTTAAGGTATCCATTGTTTACATGGTCGAAATGGTCTTGCTAAGGAGGGAGAGAACTGTGAA
GTTCGACCAGACGCTGTTAGGAATAGTGGATGACTGGGAGATCTGCTGCAACTACGACTGGGCGTCCCTATCGTTCGAGAAGACGATAGGTAGTCTTCGTCGTGGCCCAG
CCAAGATGGCAAAGGATGGAGGGTTCAGGAAATCATATAGCCTGTATGGTTTCTCCTGGGTGTGGGCGTACGAGGTGATATCTTCCCTATCTGGTCGGGTCGCAAATAAA
TTTTTTGAGGATGCAGTGCCACGTATCCTCCAATGGAGGTGTGGCCATTCGACTGCATGGCATGTGCTCGATCGGGAGATTTTTCGGTCTACAACGGGAAGAACGCAAAG
ATTAGAGGCAAGTGATGCTGAGACGAGCTTCATGAGAGGATCCTGCTGGGGTCAAGGCGATGATGCTCAACCGCAGTCACGTGTATGTGAGGGCCCACAGGAGCCTAATG
TGGGCCAAGGTGATGCTGTTGGACCATCAGCTGTGCGTGAGGGACGAGGCATCGGTACGAACATTGTCGATTCTAAAGGTCAGAATAACGCTCGCACATCTAGCATGCGC
TTGAGGAAGGTTGAGAAACACTTGAAGAACATAGACAAGCGTATTGGCGAGCGTATGTCTGGCATGGAGGCTGAATTGAAAGCAATCAAGAAGTATTTGAGGCGACTTGC
TAAGGATGACATGAGGAGAAGAAAAGGTACCGACTCGGGCGGTGGTGCTGGCCCGAGAGATGGTGATGAGCTGGGAGATGGTACAGATCCAAGGGATCGTGGTGAGCCGA
GGGATGGTGGTGGATCGGGAGAGAGCACCGAGGCGGATACCGGTATTGAGCTGGGAGATACTATCGAGCAGGATGCCGGTGCTGGGTCGGTTGATTGTACCGCGCCTAAT
GTTGGTATTGCAGATGGGTCGGGAGATGTGGGCGCCGGCGGTGATGACACGGTGGTCGATACTGTAAAGGAACATTACATGGACAATTTCACGGATTCCGACATGGAGGA
AACGAGGGAGGTTAAATCACACATGGACGGAGATGAGGTCGTGTGCTATCGCAGCCAGTTGCACCACCGAATCCGCGACGTGGCTCTCGGAAGAGGAAGGCACCGTGGAT
CGTTTGTGGTCATGGAGGACGGGAGGAAAAAGAAGGCTATCCAGTATGATCCGCTAGTACGGATCCCTCCTGAGCAGGTCACCAAGTTCCACAATTGGATGAGTAGCCCT
ATCACGAAGCATGCGAAGAGGAAATCGTGCTATGGTAAGAAGGACAAGACGTGGTTTCGTGACCTTTTAACGTCGGACAAGTGGCTGATCAGCGAGGTAATCGATTCGCT
GGTCATGTTCACACGGAACAAGCTGGAGCAGCGTCCCGACTTGTGCTCTAGGAGGTTTACGACTGGTGACATATTGCTTGCGAACTTCTTCCGCCAGACCGATGGTATAT
ACCAGAGATTGATCGCCCCCAATGCCGTTCCAGCAATAGTGTCGGCCGAATATGACTGGGAGAGTCGATACAAAACGATCATGAGCTATGTCGACGGCACTCACACAAAC
TATGGGACACGGTGGCTGGACCTCATGTTGTATACTTGTCATACAACATCGGTGGAAATCATTGGATCATGGTTTGCATTCGACATGCAAGAAAGTGAGATCATCGTATG
GGATTCTATGATGGCAATCACAACACCGGCTACTCTGGAGGAGGAGTTGAAACCGATGAGCGTCATTCTCCCAGCGTTGATGTGTAGAGCCAGAGTTAGGGTTTTGAGGC
CTACCATACCCACTGTACCATGGCGCATCCGTCGAGTAACCGGGGCTCCACAGCAGACTGGTTCAGGTGATTGTGGTATTTCCTGTGTTAAGTTTTTTGAGTACGATGTA
ACGGGTTCCAATTTCGCGACTCTAACTCAAGAGAGGATCCCATTTTTTAGGGAGAAACTCACCATTGAAGTATGGGCGAACCGGTGTATTTTTTGA
Protein sequence
MTSSSSLSGDLSQTASSSIGNLHSPIFLLSNICNLMSIRLDSSNYVLWKFQLTAILKAHKLFGFIEGTTVQPQQFLITTTESSSVTSINPMFEEWIVKDQAFMTLINVTL
SPAALAYVKPGESISDYVKRIKELKDKLANVSVVINDEALLIYALNGLPAEYNAFRTSIRTRPQAISFEEFMFSYCLKNQQLISRINARIFFLNLRLCSLLRSHSSEIIF
LENPLLLNIKLAAMVASQNHNFVSGISSTAWLTDSGCNAHVTSDLSHLSNASKYNGEDQVSVGSGQSLPVTHSSCGTVSEPFSARPTAAMGPFDGDNSPLLTATATFRNG
CSSFKFTYHDCHGYNSTYTCYVSGSSINAAMDVGLLGTAIPLNPLDTAVNSRQYPSGTAANSEVAVPIGPKGTSIGPSIGIVNGPLVISMELRPKIDPAIYASANVSCLS
HLAKTVTAIKGKLVPRQLAMFRKTVFGHLLDVDLVFNGPLVHNILLREVEDSTADSISFNLFGRKVSFGRREFDLISGLKYDGNLVRKDTHVHRLRALYFNDRSDLVLSD
LENLYEAAQFQDDFDAVKVSIVYMVEMVLLRRERTVKFDQTLLGIVDDWEICCNYDWASLSFEKTIGSLRRGPAKMAKDGGFRKSYSLYGFSWVWAYEVISSLSGRVANK
FFEDAVPRILQWRCGHSTAWHVLDREIFRSTTGRTQRLEASDAETSFMRGSCWGQGDDAQPQSRVCEGPQEPNVGQGDAVGPSAVREGRGIGTNIVDSKGQNNARTSSMR
LRKVEKHLKNIDKRIGERMSGMEAELKAIKKYLRRLAKDDMRRRKGTDSGGGAGPRDGDELGDGTDPRDRGEPRDGGGSGESTEADTGIELGDTIEQDAGAGSVDCTAPN
VGIADGSGDVGAGGDDTVVDTVKEHYMDNFTDSDMEETREVKSHMDGDEVVCYRSQLHHRIRDVALGRGRHRGSFVVMEDGRKKKAIQYDPLVRIPPEQVTKFHNWMSSP
ITKHAKRKSCYGKKDKTWFRDLLTSDKWLISEVIDSLVMFTRNKLEQRPDLCSRRFTTGDILLANFFRQTDGIYQRLIAPNAVPAIVSAEYDWESRYKTIMSYVDGTHTN
YGTRWLDLMLYTCHTTSVEIIGSWFAFDMQESEIIVWDSMMAITTPATLEEELKPMSVILPALMCRARVRVLRPTIPTVPWRIRRVTGAPQQTGSGDCGISCVKFFEYDV
TGSNFATLTQERIPFFREKLTIEVWANRCIF
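
The protein sequence above corresponds to the CDS read codon by codon. The sketch below illustrates that relationship with a small translator built on the standard genetic code (NCBI translation table 1). Using table 1 is an assumption, since the record does not say which table applies, and `translate` is a hypothetical helper rather than a database or library function.

```python
from itertools import product

# Standard genetic code (NCBI table 1), amino acids listed with codon bases
# ordered T, C, A, G at each position. Assumed here; the record does not
# name the translation table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)

# First 27 and last 15 nucleotides of the CDS, copied from the record.
print(translate("ATGACATCGTCGTCCTCCTTGAGTGGT"))
print(translate("CGGTGTATTTTTTGA"))
```

The two fragments translate to the opening and closing residues of the listed protein (`MTSSSSLSG` and `RCIF`), which is consistent with the CDS and protein entries describing the same gene model.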