CuGenDBv2

CmoCh02G001910 (gene) of Cucurbita moschata (Rifu) v1 genome

Gene ID: CmoCh02G001910
Organism: Cucurbita moschata Rifu (Cucurbita moschata (Rifu) v1)
Description: LOW QUALITY PROTEIN: KDEL-tailed cysteine endopeptidase CEP2
Genome location: Cmo_Chr02:907645..914469
RNA-Seq Expression: CmoCh02G001910
Synteny: CmoCh02G001910
Gene Ontology terms:
GO:0006412 - translation (biological process)
GO:0006508 - proteolysis (biological process)
GO:0005739 - mitochondrion (cellular component)
GO:0015934 - large ribosomal subunit (cellular component)
GO:0003735 - structural constituent of ribosome (molecular function)
GO:0008234 - cysteine-type peptidase activity (molecular function)
InterPro domains:
IPR000668 - Peptidase C1A, papain C-terminal
IPR005996 - Ribosomal protein L30, bacterial-type
IPR016082 - Ribosomal protein L30, ferredoxin-like fold domain
IPR025660 - Cysteine peptidase, histidine active site
IPR036919 - Ribosomal protein L30, ferredoxin-like fold domain superfamily
IPR038765 - Papain-like cysteine peptidase superfamily
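
The genome location field above has the form `chromosome:start..end`. As a minimal sketch, assuming the coordinates are 1-based and inclusive (the record itself does not state this), the string can be split into its parts and the length of the gene span derived. The helper name `parse_location` is hypothetical and not part of any CuGenDB tooling.

```python
import re


def parse_location(loc: str):
    """Split 'chrom:start..end' into (chrom, start, end, span_length).

    Assumption: coordinates are 1-based and inclusive, so the span
    length is end - start + 1.
    """
    match = re.fullmatch(r"(\S+):(\d+)\.\.(\d+)", loc.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    return chrom, start, end, end - start + 1


chrom, start, end, span = parse_location("Cmo_Chr02:907645..914469")
print(chrom, start, end, span)
```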


Homology
GenBank top hits | e value | % identity | Alignment
KAG6604875.1 hypothetical protein SDJN03_02192, partial [Cucurbita argyrosperma subsp. sororia] | e value: 4.6e-47 | % identity: 100
Query:  ILEVLDMNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPL
        ILEVLDMNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPL
Subjt:  ILEVLDMNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPL

Query:  V
        V
Subjt:  V

KAG6604877.1 Thiol protease 102, partial [Cucurbita argyrosperma subsp. sororia] | e value: 9.6e-69 | % identity: 70.05
Query:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ
        LNGSSSNTLLNGASEPKLQL+IGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ
Subjt:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ

Query:  GCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNRL
        GCNGGF+                K +   KK   T   + P I      +A+ N+  +          +++     + NCGKELNHGVAVVGYGEASNRL
Subjt:  GCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNRL

Query:  VKNSWGTDGVNLVTREF
        VKNSWGTDGVNLVTREF
Subjt:  VKNSWGTDGVNLVTREF

KAG7034988.1 KDEL-tailed cysteine endopeptidase CEP2, partial [Cucurbita argyrosperma subsp. argyrosperma] | e value: 4.3e-85 | % identity: 80.65
Query:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ
        LNGSSSNTLLNGASEPKLQL+IGASRKA+ESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVD        
Subjt:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ

Query:  GCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNRL
                                       RDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKE NHGVAVVGYGEASNRL
Subjt:  GCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNRL

Query:  VKNSWGTDGVNLVTREF
        VKNSWGTDGVNLVTREF
Subjt:  VKNSWGTDGVNLVTREF

XP_022947302.1 LOW QUALITY PROTEIN: KDEL-tailed cysteine endopeptidase CEP2 [Cucurbita moschata] | e value: 1.1e-93 | % identity: 86.36
Query:  FFEFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD
        F EFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD
Subjt:  FFEFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD

Query:  VISGNQGCNGGFISSK----KLDSLQKENIHTLKLYAINKK--RDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV
        VISGNQGCNGGF+       K   L  E  H       NK+  R    TEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV
Subjt:  VISGNQGCNGGFISSK----KLDSLQKENIHTLKLYAINKK--RDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV

Query:  AVVGYGEASNRLVKNSWGTD
        AVVGYGEASNRLVKNSWGTD
Subjt:  AVVGYGEASNRLVKNSWGTD

XP_023533790.1 LOW QUALITY PROTEIN: KDEL-tailed cysteine endopeptidase CEP2 [Cucurbita pepo subsp. pepo] | e value: 5.6e-77 | % identity: 79.44
Query:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ
        LNGSSSNTLLNGASEPKLQLI GASRKAIESLTDQPCLANKKLHKSSPNKAW      RSSSEQNQA+AAVE ISKIKTGTLVSQSEQELVD DVISGNQ
Subjt:  LNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQ

Query:  GCNGGFISSKKLDSLQKENIHTLK----LYAI---NKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGY
        GCNGGF+  K    ++K  + T +    + AI    K R  TMTEKVPVIDEKSIKDAVANQP SVAIHTGGYDFQFYSGGVFS NCGKELNHGVAVVGY
Subjt:  GCNGGFISSKKLDSLQKENIHTLK----LYAI---NKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGY

Query:  GEASNR---LVKNS
        GEASN+   LVKNS
Subjt:  GEASNR---LVKNS

TrEMBL top hits | e value | % identity | Alignment
A0A6J1CI25 uncharacterized protein LOC111011134 | e value: 2.2e-42 | % identity: 95.79
Query:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
        MNAFKAFKANVPIAWSPNLYITLVRG+PGTRRLHRRTLEALRL KCNRTVMRWNTPTVRGMIQQVKRLVVVETEEM+KARKQKVEQHK+LRPPLV
Subjt:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV

A0A6J1FUA4 uncharacterized protein LOC111448848 | e value: 8.3e-42 | % identity: 93.68
Query:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
        MNAFKAFKANVPI WSPNLYITLVRG+PGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGM+QQVKRLVVVETEEM+KARKQKVEQH +LRPPLV
Subjt:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV

A0A6J1G629 LOW QUALITY PROTEIN: KDEL-tailed cysteine endopeptidase CEP2 | e value: 5.5e-94 | % identity: 86.36
Query:  FFEFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD
        F EFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD
Subjt:  FFEFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAAVEGISKIKTGTLVSQSEQELVDRD

Query:  VISGNQGCNGGFISSK----KLDSLQKENIHTLKLYAINKK--RDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV
        VISGNQGCNGGF+       K   L  E  H       NK+  R    TEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV
Subjt:  VISGNQGCNGGFISSK----KLDSLQKENIHTLKLYAINKK--RDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGV

Query:  AVVGYGEASNRLVKNSWGTD
        AVVGYGEASNRLVKNSWGTD
Subjt:  AVVGYGEASNRLVKNSWGTD

A0A6J1G641 uncharacterized protein LOC111451121 | e value: 3.0e-44 | % identity: 100
Query:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
        MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
Subjt:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV

A0A6J1I4X2 uncharacterized protein LOC111469902 | e value: 2.6e-43 | % identity: 96.84
Query:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
        MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGM+QQVKRLVVVETEEMFKARKQK+EQHK+LRPPLV
Subjt:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV

SwissProt top hits | e value | % identity | Alignment
O65039 Vignain | e value: 6.1e-26 | % identity: 49.36
Query:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLY----AINKKRDTTMT----EKVPVIDEKSIKDAVANQP
        AVEGI++IKT  LVS SEQELVD D    NQGCNGG     F   K+   +  E  +  + Y     ++K+    ++    E VP  DE ++  AVANQP
Subjt:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLY----AINKKRDTTMT----EKVPVIDEKSIKDAVANQP

Query:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD
        VSVAI  GG DFQFYS GVF+G+CG EL+HGVA+VGYG   +      VKNSWG +
Subjt:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD

P12412 Vignain | e value: 3.3e-27 | % identity: 51.28
Query:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAI---------NKKRDTTMT----EKVPVIDEKSIKDAVANQP
        AVEGI++IKT  LVS SEQELVD D    NQGCNGG + S      QK  I T   Y           +K  D  ++    E VPV DE ++  AVANQP
Subjt:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAI---------NKKRDTTMT----EKVPVIDEKSIKDAVANQP

Query:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYG---EASNR-LVKNSWGTD
        VSVAI  GG DFQFYS GVF+G+C  +LNHGVA+VGYG   + +N  +V+NSWG +
Subjt:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYG---EASNR-LVKNSWGTD

P25803 Vignain | e value: 2.8e-26 | % identity: 50
Query:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYA---------INKKRDTTMT----EKVPVIDEKSIKDAVANQP
        AVEGI++IKT  LV+ SEQELVD D    NQGCNGG + S      QK  I T   Y           +K  D  ++    E VP  DE ++  AVANQP
Subjt:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYA---------INKKRDTTMT----EKVPVIDEKSIKDAVANQP

Query:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYG---EASNR-LVKNSWGTD
        VSVAI  GG DFQFYS GVF+G+C  +LNHGVA+VGYG   + +N  +V+NSWG +
Subjt:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYG---EASNR-LVKNSWGTD

P43156 Thiol protease SEN102 | e value: 7.2e-27 | % identity: 47.13
Query:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMT-------------EKVPVIDEKSIKDAVANQ
        A+VEGI++IKTG LVS SEQELVD D  S N+GCNGG +     + +QK  I T   Y   ++  T  +             + VP  +E ++  AVANQ
Subjt:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMT-------------EKVPVIDEKSIKDAVANQ

Query:  PVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD
        P+SV+I   GY FQFYS GVF+G CG EL+HGVA+VGYG   +     +VKNSWG +
Subjt:  PVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD

Q9STL4 KDEL-tailed cysteine endopeptidase CEP2 | e value: 1.6e-26 | % identity: 50
Query:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLYAINKKRDTTMT----------EKVPVIDEKSIKDAVA
        AAVEGI+KIKT  LVS SEQELVD D    N+GCNGG     F   KK   +  E+ +  +   I+ K D +            E VP  DE ++  AVA
Subjt:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLYAINKKRDTTMT----------EKVPVIDEKSIKDAVA

Query:  NQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD
        NQPVSVAI  G  DFQFYS GVF+G+CG ELNHGVA VGYG    +   +V+NSWG +
Subjt:  NQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD

Arabidopsis top hits | e value | % identity | Alignment
AT1G20850.1 xylem cysteine peptidase 2 | e value: 2.2e-26 | % identity: 47.4
Query:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSK-----KLDSLQKENIHTLKL----YAINKKRDTTMT----EKVPVIDEKSIKDAVANQ
        AAVEGI+KI TG L + SEQEL+D D  + N GCNGG +        K   L+KE  +   +      + K    T+T    + VP  DEKS+  A+A+Q
Subjt:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSK-----KLDSLQKENIHTLKL----YAINKKRDTTMT----EKVPVIDEKSIKDAVANQ

Query:  PVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWG
        P+SVAI   G +FQFYSGGVF G CG +L+HGVA VGYG +      +VKNSWG
Subjt:  PVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWG

AT3G19390.1 Granulin repeat cysteine protease family protein | e value: 3.7e-26 | % identity: 45.91
Query:  ARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAI----------NKKRDTTMT----EKVPVIDEKSIKDAV
        A  AVEGI++IKTG L+S SEQELVD D  S N GC GG +       ++   I T + Y            +KK    +T    E VP  DEKS+K A+
Subjt:  ARAAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAI----------NKKRDTTMT----EKVPVIDEKSIKDAV

Query:  ANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD
        ANQP+SVAI  GG  FQ Y+ GVF+G CG  L+HGV  VGYG    +   +V+NSWG++
Subjt:  ANQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD

AT3G48340.1 Cysteine proteinases superfamily protein | e value: 1.1e-27 | % identity: 50
Query:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLYAINKKRDTTMT----------EKVPVIDEKSIKDAVA
        AAVEGI+KIKT  LVS SEQELVD D    N+GCNGG     F   KK   +  E+ +  +   I+ K D +            E VP  DE ++  AVA
Subjt:  AAVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLKLYAINKKRDTTMT----------EKVPVIDEKSIKDAVA

Query:  NQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD
        NQPVSVAI  G  DFQFYS GVF+G+CG ELNHGVA VGYG    +   +V+NSWG +
Subjt:  NQPVSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR---LVKNSWGTD

AT5G50260.1 Cysteine proteinases superfamily protein | e value: 1.3e-26 | % identity: 50
Query:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLK----LYAINKKRDTTMT----EKVPVIDEKSIKDAVANQP
        AVEGI++I+T  L S SEQELVD D  + NQGCNGG     F   K+   L  E ++  K        NK+    ++    E VP   E  +  AVANQP
Subjt:  AVEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGG-----FISSKKLDSLQKENIHTLK----LYAINKKRDTTMT----EKVPVIDEKSIKDAVANQP

Query:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD
        VSVAI  GG DFQFYS GVF+G CG ELNHGVAVVGYG   +     +VKNSWG +
Subjt:  VSVAIHTGGYDFQFYSGGVFSGNCGKELNHGVAVVGYGEASNR----LVKNSWGTD

AT5G55140.1 ribosomal protein L30 family protein | e value: 1.7e-31 | % identity: 68.42
Query:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV
        M+ F+AFKA VPI WS +LYITLVRG+PGTR+LHRRTLEA+ LR+C+RTV+  N  ++RGMI QVKR+VVVETEEM+ ARK+    HK+LRPPLV
Subjt:  MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV

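The % identity values reported with the hits can be checked against the alignment midlines. In the midline, a letter marks an identical residue, `+` a conservative substitution, and a space a mismatch. The sketch below is illustrative only: the function name `percent_identity` is hypothetical, and it assumes identity is computed as identical positions divided by alignment length, which may differ from how the database derives its figure. The input strings are the Query and midline rows of the A0A6J1CI25 TrEMBL hit.

```python
def percent_identity(query: str, midline: str) -> float:
    """Percent identity from a BLAST-style midline.

    Assumption: letters in the midline are identities; '+' and ' '
    are not. The denominator is the aligned length.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / len(query)


query = (
    "MNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTV"
    "MRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLV"
)
midline = (
    "MNAFKAFKANVPIAWSPNLYITLVRG+PGTRRLHRRTLEALRL KCNRTV"
    "MRWNTPTVRGMIQQVKRLVVVETEEM+KARKQKVEQHK+LRPPLV"
)
print(round(percent_identity(query, midline), 2))
```

Note that a midline whose leading or trailing spaces were lost in extraction would fail the length check, so such rows would need re-padding before use.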

Sequences
CDS sequence
ATGGGAATTTTGGAGGTCTTAGACATGAATGCCTTCAAGGCCTTCAAAGCTAATGTTCCAATTGCATGGAGTCCTAATCTCTATATCACATTAGTGAGAGGCATTCCTGG
AACAAGGAGACTTCACAGACGTACTTTAGAGGCCTTACGTCTTCGGAAATGCAACCGAACTGTCATGCGCTGGAACACACCAACTGTTCGAGGAATGATTCAACAGGTAA
AGCGATTAGTAGTCGTTGAAACAGAAGAAATGTTTAAAGCCCGTAAACAGAAGGTGGAACAGCACAAATCTCTTCGACCTCCATTAGTTCCATTGCTTGTTGGTTACGTT
TTCTACGATCGTCACCAACAAACTACTTGCAAAAAATCCCATGGAAAGGGTGCTAAGGAAAAGTCCTGTGCTCATGGATTTCATTCGTTCTGGTGCTTCTCTGATGAAAA
ACTCGAGTTGTCCAACATATGCAAATGCCTCTCCAGCACCAACCAAGAAGAACTGAGGATGGCGATGAATAGAAAGGCTGACATAGAACCAGCAGGAATCACAAACTTGC
CAACTTTTCTATCCATAAATGATGCTTGCTCTACTGTGAATTCGTCTGAGAGTAGATCGTCCAAAAGAGGATGCCAGTGGACCATATGGGTAAAAGCTTCAGAACCATTT
TCACTTCCTCGACTTCGGTCACGCATAGAGCATGGCAAGTTGCCCACCATTAGCCGCAATGCATTGATGAAGTTGCCTTGTGGAGGAGATATCACAATGAGGAGGTCTCA
TGCCAGGGATGCCATTCTTAATGGCTCAGAACTTACTACAGCAGTTATGGAAGCAAAGGTTGCAACAGTGAGGAACCGTCCCAACTTGGCATCAGCTAAGAAGCCACCAA
GAAGGCCAAGAAGATTAAGGGCTCCAAGAAAATTAGTCACAATAGTTGCAGACTTAGCTGAAGAGAGATGCAGTTCTCTCTTCTATAGAAAGAGAGAGAGAAAGGGAAGG
AGGGAGGGATTGTTTTTGTGTGCTAAGGCTGTGCTTAGTTTCATGATTTATTGGAAAGCAAGAAGCTCAAAAGTGATGGAAAAACACAAAAAAGACTTTCCCTTCTTTGA
ATTTCATCACCTAAATGGCTCCTCATCAAACACATTATTGAATGGTGCATCTGAGCCTAAGCTTCAACTCATCATCGGTGCCTCAAGGAAGGCGATAGAGTCGTTAACAG
ATCAACCCTGTCTAGCCAACAAGAAACTGCACAAGTCATCACCCAATAAGGCGTGGCTCCACCCAAGGCGGAAGCGATCTTCTAGTGAGCAAAATCAGGCCAGAGCAGCT
GTGGAAGGCATTAGCAAAATAAAAACAGGCACATTGGTCTCTCAATCAGAACAAGAGCTTGTCGACCGTGATGTCATCTCGGGGAACCAGGGATGCAATGGTGGCTTCAT
TTCATCAAAAAAACTGGACTCACTACAGAAAGAGAACATCCATACATTGAAGCTATATGCAATAAACAAAAAGCGAGATACCACTATGACAGAAAAAGTACCTGTAATTG
ATGAGAAAAGCATAAAAGATGCAGTTGCTAACCAGCCAGTCTCTGTAGCAATTCATACAGGGGGATATGATTTCCAGTTCTATTCTGGTGGAGTTTTCTCAGGGAATTGT
GGAAAGGAACTCAATCATGGAGTGGCAGTAGTTGGGTATGGGGAAGCTAGCAATAGGCTTGTCAAGAATTCATGGGGCACTGACGGGGTGAATCTGGTTACACGAGAATT
CTGA
mRNA sequence
ATGGGAATTTTGGAGGTCTTAGACATGAATGCCTTCAAGGCCTTCAAAGCTAATGTTCCAATTGCATGGAGTCCTAATCTCTATATCACATTAGTGAGAGGCATTCCTGG
AACAAGGAGACTTCACAGACGTACTTTAGAGGCCTTACGTCTTCGGAAATGCAACCGAACTGTCATGCGCTGGAACACACCAACTGTTCGAGGAATGATTCAACAGGTAA
AGCGATTAGTAGTCGTTGAAACAGAAGAAATGTTTAAAGCCCGTAAACAGAAGGTGGAACAGCACAAATCTCTTCGACCTCCATTAGTTCCATTGCTTGTTGGTTACGTT
TTCTACGATCGTCACCAACAAACTACTTGCAAAAAATCCCATGGAAAGGGTGCTAAGGAAAAGTCCTGTGCTCATGGATTTCATTCGTTCTGGTGCTTCTCTGATGAAAA
ACTCGAGTTGTCCAACATATGCAAATGCCTCTCCAGCACCAACCAAGAAGAACTGAGGATGGCGATGAATAGAAAGGCTGACATAGAACCAGCAGGAATCACAAACTTGC
CAACTTTTCTATCCATAAATGATGCTTGCTCTACTGTGAATTCGTCTGAGAGTAGATCGTCCAAAAGAGGATGCCAGTGGACCATATGGGTAAAAGCTTCAGAACCATTT
TCACTTCCTCGACTTCGGTCACGCATAGAGCATGGCAAGTTGCCCACCATTAGCCGCAATGCATTGATGAAGTTGCCTTGTGGAGGAGATATCACAATGAGGAGGTCTCA
TGCCAGGGATGCCATTCTTAATGGCTCAGAACTTACTACAGCAGTTATGGAAGCAAAGGTTGCAACAGTGAGGAACCGTCCCAACTTGGCATCAGCTAAGAAGCCACCAA
GAAGGCCAAGAAGATTAAGGGCTCCAAGAAAATTAGTCACAATAGTTGCAGACTTAGCTGAAGAGAGATGCAGTTCTCTCTTCTATAGAAAGAGAGAGAGAAAGGGAAGG
AGGGAGGGATTGTTTTTGTGTGCTAAGGCTGTGCTTAGTTTCATGATTTATTGGAAAGCAAGAAGCTCAAAAGTGATGGAAAAACACAAAAAAGACTTTCCCTTCTTTGA
ATTTCATCACCTAAATGGCTCCTCATCAAACACATTATTGAATGGTGCATCTGAGCCTAAGCTTCAACTCATCATCGGTGCCTCAAGGAAGGCGATAGAGTCGTTAACAG
ATCAACCCTGTCTAGCCAACAAGAAACTGCACAAGTCATCACCCAATAAGGCGTGGCTCCACCCAAGGCGGAAGCGATCTTCTAGTGAGCAAAATCAGGCCAGAGCAGCT
GTGGAAGGCATTAGCAAAATAAAAACAGGCACATTGGTCTCTCAATCAGAACAAGAGCTTGTCGACCGTGATGTCATCTCGGGGAACCAGGGATGCAATGGTGGCTTCAT
TTCATCAAAAAAACTGGACTCACTACAGAAAGAGAACATCCATACATTGAAGCTATATGCAATAAACAAAAAGCGAGATACCACTATGACAGAAAAAGTACCTGTAATTG
ATGAGAAAAGCATAAAAGATGCAGTTGCTAACCAGCCAGTCTCTGTAGCAATTCATACAGGGGGATATGATTTCCAGTTCTATTCTGGTGGAGTTTTCTCAGGGAATTGT
GGAAAGGAACTCAATCATGGAGTGGCAGTAGTTGGGTATGGGGAAGCTAGCAATAGGCTTGTCAAGAATTCATGGGGCACTGACGGGGTGAATCTGGTTACACGAGAATT
CTGA
Protein sequence
MGILEVLDMNAFKAFKANVPIAWSPNLYITLVRGIPGTRRLHRRTLEALRLRKCNRTVMRWNTPTVRGMIQQVKRLVVVETEEMFKARKQKVEQHKSLRPPLVPLLVGYV
FYDRHQQTTCKKSHGKGAKEKSCAHGFHSFWCFSDEKLELSNICKCLSSTNQEELRMAMNRKADIEPAGITNLPTFLSINDACSTVNSSESRSSKRGCQWTIWVKASEPF
SLPRLRSRIEHGKLPTISRNALMKLPCGGDITMRRSHARDAILNGSELTTAVMEAKVATVRNRPNLASAKKPPRRPRRLRAPRKLVTIVADLAEERCSSLFYRKRERKGR
REGLFLCAKAVLSFMIYWKARSSKVMEKHKKDFPFFEFHHLNGSSSNTLLNGASEPKLQLIIGASRKAIESLTDQPCLANKKLHKSSPNKAWLHPRRKRSSSEQNQARAA
VEGISKIKTGTLVSQSEQELVDRDVISGNQGCNGGFISSKKLDSLQKENIHTLKLYAINKKRDTTMTEKVPVIDEKSIKDAVANQPVSVAIHTGGYDFQFYSGGVFSGNC
GKELNHGVAVVGYGEASNRLVKNSWGTDGVNLVTREF
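
The protein sequence can be related back to the CDS by translating codons in frame. This is a minimal sketch, assuming the standard genetic code and a CDS that starts in frame 0. The names `CODON_TABLE` and `translate` are illustrative and not part of the database. Translation stops at the first stop codon, and unrecognised codons are written as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence until the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First nine codons of the CDS sequence listed above.
print(translate("ATGGGAATTTTGGAGGTCTTAGACATG"))
```

Pasting the full CDS into `translate` should, under the same assumptions, give a sequence that can be compared with the listed protein.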