CuGenDBv2

Moc08g34730 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc08g34730
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: BEST Arabidopsis thaliana protein match is: 3'-5' exonuclease domain-containing protein / K homology domain-containing protein / KH domain-containing protein.
Genome location: chr8:25379448..25389439
RNA-Seq Expression: Moc08g34730
Synteny: Moc08g34730
Gene Ontology terms: NA
InterPro domains: IPR029472 - Retrotransposon Copia-like, N-terminal
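
As a minimal sketch, the genome location string in this record can be split into chromosome, start, and end. The `Locus` type and `parse_locus` helper are hypothetical names introduced here for illustration, and the length calculation assumes the coordinates are 1-based and inclusive, which the record itself does not state.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical container for a 'chrom:start..end' location string."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


def parse_locus(text: str) -> Locus:
    """Parse a location written as 'chrom:start..end'."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return Locus(match.group(1), int(match.group(2)), int(match.group(3)))


locus = parse_locus("chr8:25379448..25389439")
print(locus.chrom, locus.start, locus.end, locus.length)
```

Under the inclusive-coordinate assumption, the locus spans 9992 positions on chr8.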


Homology
GenBank top hits    e value    % identity    Alignment
KAG6581290.1 hypothetical protein SDJN03_21292, partial [Cucurbita argyrosperma subsp. sororia]    2.0e-123    87.82
Query:  MGVESNSAAPPPP----SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDD
        MGVESNS  PPPP    SSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGDPL ENLMDSPARSESMLYLRDEMS QYSPMSEDSDD
Subjt:  MGVESNSAAPPPP----SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDD

Query:  CRFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPG
        CRFCETSTNLFP QSD SVPTSPVSPYRYQRPFS +TPST T N SLGC+T PV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP G
Subjt:  CRFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPG

Query:  PSSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        PSSMELPYCSM EPGPNIEAEER C+F KSLV+ERVYQL ECS+M VSEPEYN+QK CKDLNR MKDSESG
Subjt:  PSSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

XP_022155254.1 uncharacterized protein LOC111022394 isoform X1 [Momordica charantia]    1.5e-147    100
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME
        ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME
Subjt:  ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME

Query:  LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES
        LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES
Subjt:  LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES

XP_022934215.1 uncharacterized protein LOC111441454 isoform X1 [Cucurbita moschata]    1.5e-123    88.15
Query:  MGVESNSAAPPPP---SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDC
        MGVESNS  PPPP   SSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGDPL ENLMDSPARSESMLYLRDEMS QYSPMSEDSDDC
Subjt:  MGVESNSAAPPPP---SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDC

Query:  RFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGP
        RFCETSTNLFP QSD SVPTSPVSPYRYQRPFS +TPST T N SLGC+T PV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GP
Subjt:  RFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGP

Query:  SSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        SSMELPYCSM EPGPNIEAEER C+F KSLV+ERVYQL ECS+M VSEPEYN+QK CKDLNR MKDSESG
Subjt:  SSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

XP_022984066.1 uncharacterized protein LOC111482488 isoform X1 [Cucurbita maxima]    1.0e-124    89.14
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSA PPPPSSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGD L ENLMDSPARSESMLYLRDEMS QYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM
        ETSTNLFP QSD SVPTSPVSPYRYQRPFS +TPST+T N SLGC+TSPV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GPSSM
Subjt:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM

Query:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        ELPYCSM EPGPNIEAEER C+F KSLV+ERVYQL ECS M VSEPEYN+QK CKDLNR MKD ESG
Subjt:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

XP_038904570.1 uncharacterized protein LOC120090942 [Benincasa hispida]    2.3e-124    89.14
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSA  PPPSSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGDPL ENLMDSPARSESMLY R+EMSWQYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM
        ETSTNLFP QSD SVPTSPVSPYRYQRPFS VTPST T N SLGCSTSPV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GPSSM
Subjt:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM

Query:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        ELPYCSM EPGPNIEAEERPC+  KSLV+ER +QLEECS+M VSEPEYN++K CKDLNR+MKDSESG
Subjt:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

TrEMBL top hits    e value    % identity    Alignment
A0A0A0KWN0 Uncharacterized protein    1.2e-123    88.52
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSA PPP SSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGDPL ENLMDSPARSESMLY RDEMSWQYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCS-TSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSS
        ETSTNLFP QSD SVPTSPVSPYRYQRPFS V PST T N SLGCS TSPV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSS
Subjt:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCS-TSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSS

Query:  MELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTM--RVSEPEYNQQKPCKDLNRNMKDSESG
        MELPYCSM EPGPNIEAE+RPC+  KSLV+ERVYQLEECS+M   VSE EYN+QK CKDLNR+MKDS SG
Subjt:  MELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTM--RVSEPEYNQQKPCKDLNRNMKDSESG

A0A5A7TRC2 Uncharacterized protein    8.0e-123    87.78
Query:  MGVESNSAAPPPP-SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRF
        MGVESNSA PPPP SSSSTPSPS KRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVG+PL ENLMDSPARSESMLY RDEMSWQYSPMSEDSDDCRF
Subjt:  MGVESNSAAPPPP-SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRF

Query:  CETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSS
        CETSTNLFP QSD SVPTSPVSPYRYQRPFS + PS  T N SLGCSTSPV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GPSS
Subjt:  CETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSS

Query:  MELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTM--RVSEPEYNQQKPCKDLNRNMKDSESG
        MELPYCSM EPGPNIEAE+RPC+  KSLV+ERVYQLEECS+M   VSE EYN+QK CKDLNR+MKDS+SG
Subjt:  MELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTM--RVSEPEYNQQKPCKDLNRNMKDSESG

A0A6J1DPQ3 uncharacterized protein LOC111022394 isoform X1    7.2e-148    100
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME
        ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME
Subjt:  ETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSME

Query:  LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES
        LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES
Subjt:  LPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES

A0A6J1F722 uncharacterized protein LOC111441454 isoform X1    7.3e-124    88.15
Query:  MGVESNSAAPPPP---SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDC
        MGVESNS  PPPP   SSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGDPL ENLMDSPARSESMLYLRDEMS QYSPMSEDSDDC
Subjt:  MGVESNSAAPPPP---SSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDC

Query:  RFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGP
        RFCETSTNLFP QSD SVPTSPVSPYRYQRPFS +TPST T N SLGC+T PV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GP
Subjt:  RFCETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGP

Query:  SSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        SSMELPYCSM EPGPNIEAEER C+F KSLV+ERVYQL ECS+M VSEPEYN+QK CKDLNR MKDSESG
Subjt:  SSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

A0A6J1J464 uncharacterized protein LOC111482488 isoform X1    5.0e-125    89.14
Query:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC
        MGVESNSA PPPPSSSSTPSPS KRARDP+DEVYLDNFHSHKRYLSEIMASSLNGLTVGD L ENLMDSPARSESMLYLRDEMS QYSPMSEDSDDCRFC
Subjt:  MGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFC

Query:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM
        ETSTNLFP QSD SVPTSPVSPYRYQRPFS +TPST+T N SLGC+TSPV  LQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQP GPSSM
Subjt:  ETSTNLFPLQSD-SVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQMRAQPPGPSSM

Query:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG
        ELPYCSM EPGPNIEAEER C+F KSLV+ERVYQL ECS M VSEPEYN+QK CKDLNR MKD ESG
Subjt:  ELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESG

SwissProt top hits    e value    % identity    Alignment
No hits found
Arabidopsis top hits    e value    % identity    Alignment
AT2G25920.1 BEST Arabidopsis thaliana protein match is: 3'-5' exonuclease domain-containing protein / K homology domain-containing protein / KH domain-containing protein (TAIR:AT2G25910.2)    8.1e-59    53.31
Query:  GVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFCE
        G      A  PP   S  SP  KR RDPEDEVYLDN  S KRYLSEIMA SLNGLTVGD LP N+++SPARSES LY RD++S QYSPMSEDSD+ RFCE
Subjt:  GVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLPENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFCE

Query:  TST---NLFPLQSDSVPTSPVSPYRYQRPFSTVT---PSTSTNNNSLGCSTSPVPGL------QPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQM
          T   +    Q +S PTSPVSPYRYQRP ++     PS +  ++S  C  S +         Q  QRGSD+EGRFPSSPSDICHS DLRR ALLRSVQM
Subjt:  TST---NLFPLQSDSVPTSPVSPYRYQRPFSTVT---PSTSTNNNSLGCSTSPVPGL------QPHQRGSDSEGRFPSSPSDICHSADLRRAALLRSVQM

Query:  RAQPPGPSSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQL-EECSTMRVSEPEYNQQKPCKDLNRNM
        R QP G SS   P         NI+ EER C  +KS+  +R Y   E+     VS    ++ K CK L+  +
Subjt:  RAQPPGPSSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQL-EECSTMRVSEPEYNQQKPCKDLNRNM
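
The middle line of each alignment block marks identical residues with a letter, conservative substitutions with `+`, and mismatches or gaps with a space. A rough sketch of counting identities from that midline follows. The helper name `alignment_identity` is hypothetical, and the convention of dividing by the number of alignment columns (gaps included) is an assumption about how the reported % identity is defined. The example uses only the first 13 columns of the first GenBank alignment, so it does not reproduce any full-length value in the tables.

```python
from typing import Tuple


def alignment_identity(query: str, midline: str) -> Tuple[int, int]:
    """Return (identical columns, total columns) for one alignment block.

    Assumption: a letter in the midline marks an identical residue;
    '+' and spaces mark non-identical columns.
    """
    # Trailing spaces are often stripped from the midline, so pad it back.
    padded = midline.ljust(len(query))[: len(query)]
    identical = sum(1 for ch in padded if ch.isalpha())
    return identical, len(query)


# First 13 alignment columns of the KAG6581290.1 hit.
query = "MGVESNSAAPPPP"
midline = "MGVESNS  PPPP"
identical, length = alignment_identity(query, midline)
print(f"{identical}/{length} identical = {100 * identical / length:.2f}%")
```

To score a whole hit, the counts from every block of that hit would be summed before dividing.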


Sequences
CDS sequence
ATGGATCTCCGCCTTGTGCGGTGGTTCATCAAGAGACTGTCCCACCCAATTAGCTCGATCCCGACCCCAACTCGTCCCCGCCCACATTATGCAGCAGAGCGGCAATTCTC
ATTCCAGGAGACCCCAAAATTCAGCCGGAATTTGCACCGCCGGCCGCCGTCAGCCGCGGCGCGTCTCCACCTGCCGCGGCTCCCGTGCATAAACCCGAAGGTTGTTGAAT
TTGATCGGGCGTTGGAGAGGCGTGTTGTGGTTGTTGATGAGGAGGAGGCCGGAGGGGAGGGGCGGTGGAGATTGGGCGGCGGCGCGTGTAATTATGAGGAGGAAGAAGGT
GATATTGAGATTGATTTGAGAGCGGAGAAGTTTATAATTGCTGTCAATTATGGAGGAATATTTACTGCGAGCTACAAGGCTCACGCCACCATGGACACCGAAGGTTCTTC
CTCCTCAAATCTCAATACCCTAAATGCAATGCCAATAGCCACTGCAACTAACAACACCATATCATCAAATCCCTTTGGCAACCCACTTGGTACAGTATTGGCAGTCAAGT
TGGATGAAAAGAATTATCTTCTCTGGAAATTCATGATTACTGCTGCTCTTCGTGGACAGAAGCTTGATGGCTACGTTATGGGAACAATTGCTCAACCTCCAGAAATGATT
CAAGGTACCGGTGCAAATGCCACCACACTCATTGCAAATCCTGCGTTTGATTCATGGTCTACCACAGATCAATCGCTCCTAGCCTGGTTGTATGGATCCATGACGCCATC
TGTGGCTTGTGACATCCTCAATCTACGCACATCTAGAGATCAACTCGCTCACAGTCTTGCTCTAGCAGGCGAACCGGTAAATGAAAATTCTCTCATCACTAATGTTCTCA
CGGGTCTTGATGCAGAATATTTACCGGTAGCTTGCCAAATCAATGGAAAAGAAAATATGACATGGCAAGAGATGCATGCCACGTTGCTAGCTTTTGAAAACACACTCATT
CATCTGAATGTGGACAAGCAATCCGGGAGAACCATACTGGAGGGGAGGCTTAGTGAAGGGCTCTATCAGCTGGATCTTCCAAAGCCTAAAGCACATTTTTCTGCTTCAAA
TAAAGTTATCAATTTTCGTCCAACTGCCTTGATGGGCGTCGAATCGAACTCTGCGGCACCGCCACCACCATCGTCGTCTTCTACACCTTCTCCGAGCGCGAAGCGAGCCA
GAGATCCCGAAGATGAAGTTTATCTCGACAATTTCCACTCTCATAAGCGCTACCTCAGTGAGATAATGGCTTCTAGTTTGAATGGATTGACGGTTGGGGACCCCCTCCCT
GAGAATCTCATGGATTCTCCTGCAAGGTCGGAATCCATGCTTTATCTAAGGGATGAAATGTCCTGGCAATATTCTCCTATGTCTGAAGATTCAGATGACTGCCGTTTTTG
TGAGACATCCACAAACTTATTTCCTTTGCAGTCTGACAGTGTACCTACCAGTCCAGTCTCGCCATATCGATATCAAAGACCGTTCAGCACGGTGACTCCTTCAACAAGTA
CTAATAATAATTCACTTGGATGTTCTACTAGTCCCGTCCCTGGCTTGCAACCACATCAACGTGGATCAGATTCTGAGGGCCGTTTCCCGTCATCTCCCAGCGACATATGC
CACTCGGCAGACTTGAGAAGGGCTGCGCTCTTGCGTTCTGTTCAAATGAGAGCACAACCTCCTGGTCCATCATCTATGGAGTTGCCATATTGCTCAATGACTGAGCCTGG
ACCTAATATAGAAGCTGAGGAGCGGCCATGTGCTTTCACAAAATCGTTAGTCAATGAAAGAGTATATCAACTTGAGGAATGCTCCACAATGAGAGTGTCCGAACCCGAAT
ATAACCAACAGAAACCGTGCAAGGACTTGAACAGGAATATGAAGGATAGTGAATCCGGAGAGTCGTAA
mRNA sequence
ATGGATCTCCGCCTTGTGCGGTGGTTCATCAAGAGACTGTCCCACCCAATTAGCTCGATCCCGACCCCAACTCGTCCCCGCCCACATTATGCAGCAGAGCGGCAATTCTC
ATTCCAGGAGACCCCAAAATTCAGCCGGAATTTGCACCGCCGGCCGCCGTCAGCCGCGGCGCGTCTCCACCTGCCGCGGCTCCCGTGCATAAACCCGAAGGTTGTTGAAT
TTGATCGGGCGTTGGAGAGGCGTGTTGTGGTTGTTGATGAGGAGGAGGCCGGAGGGGAGGGGCGGTGGAGATTGGGCGGCGGCGCGTGTAATTATGAGGAGGAAGAAGGT
GATATTGAGATTGATTTGAGAGCGGAGAAGTTTATAATTGCTGTCAATTATGGAGGAATATTTACTGCGAGCTACAAGGCTCACGCCACCATGGACACCGAAGGTTCTTC
CTCCTCAAATCTCAATACCCTAAATGCAATGCCAATAGCCACTGCAACTAACAACACCATATCATCAAATCCCTTTGGCAACCCACTTGGTACAGTATTGGCAGTCAAGT
TGGATGAAAAGAATTATCTTCTCTGGAAATTCATGATTACTGCTGCTCTTCGTGGACAGAAGCTTGATGGCTACGTTATGGGAACAATTGCTCAACCTCCAGAAATGATT
CAAGGTACCGGTGCAAATGCCACCACACTCATTGCAAATCCTGCGTTTGATTCATGGTCTACCACAGATCAATCGCTCCTAGCCTGGTTGTATGGATCCATGACGCCATC
TGTGGCTTGTGACATCCTCAATCTACGCACATCTAGAGATCAACTCGCTCACAGTCTTGCTCTAGCAGGCGAACCGGTAAATGAAAATTCTCTCATCACTAATGTTCTCA
CGGGTCTTGATGCAGAATATTTACCGGTAGCTTGCCAAATCAATGGAAAAGAAAATATGACATGGCAAGAGATGCATGCCACGTTGCTAGCTTTTGAAAACACACTCATT
CATCTGAATGTGGACAAGCAATCCGGGAGAACCATACTGGAGGGGAGGCTTAGTGAAGGGCTCTATCAGCTGGATCTTCCAAAGCCTAAAGCACATTTTTCTGCTTCAAA
TAAAGTTATCAATTTTCGTCCAACTGCCTTGATGGGCGTCGAATCGAACTCTGCGGCACCGCCACCACCATCGTCGTCTTCTACACCTTCTCCGAGCGCGAAGCGAGCCA
GAGATCCCGAAGATGAAGTTTATCTCGACAATTTCCACTCTCATAAGCGCTACCTCAGTGAGATAATGGCTTCTAGTTTGAATGGATTGACGGTTGGGGACCCCCTCCCT
GAGAATCTCATGGATTCTCCTGCAAGGTCGGAATCCATGCTTTATCTAAGGGATGAAATGTCCTGGCAATATTCTCCTATGTCTGAAGATTCAGATGACTGCCGTTTTTG
TGAGACATCCACAAACTTATTTCCTTTGCAGTCTGACAGTGTACCTACCAGTCCAGTCTCGCCATATCGATATCAAAGACCGTTCAGCACGGTGACTCCTTCAACAAGTA
CTAATAATAATTCACTTGGATGTTCTACTAGTCCCGTCCCTGGCTTGCAACCACATCAACGTGGATCAGATTCTGAGGGCCGTTTCCCGTCATCTCCCAGCGACATATGC
CACTCGGCAGACTTGAGAAGGGCTGCGCTCTTGCGTTCTGTTCAAATGAGAGCACAACCTCCTGGTCCATCATCTATGGAGTTGCCATATTGCTCAATGACTGAGCCTGG
ACCTAATATAGAAGCTGAGGAGCGGCCATGTGCTTTCACAAAATCGTTAGTCAATGAAAGAGTATATCAACTTGAGGAATGCTCCACAATGAGAGTGTCCGAACCCGAAT
ATAACCAACAGAAACCGTGCAAGGACTTGAACAGGAATATGAAGGATAGTGAATCCGGAGAGTCGTAA
Protein sequence
MDLRLVRWFIKRLSHPISSIPTPTRPRPHYAAERQFSFQETPKFSRNLHRRPPSAAARLHLPRLPCINPKVVEFDRALERRVVVVDEEEAGGEGRWRLGGGACNYEEEEG
DIEIDLRAEKFIIAVNYGGIFTASYKAHATMDTEGSSSSNLNTLNAMPIATATNNTISSNPFGNPLGTVLAVKLDEKNYLLWKFMITAALRGQKLDGYVMGTIAQPPEMI
QGTGANATTLIANPAFDSWSTTDQSLLAWLYGSMTPSVACDILNLRTSRDQLAHSLALAGEPVNENSLITNVLTGLDAEYLPVACQINGKENMTWQEMHATLLAFENTLI
HLNVDKQSGRTILEGRLSEGLYQLDLPKPKAHFSASNKVINFRPTALMGVESNSAAPPPPSSSSTPSPSAKRARDPEDEVYLDNFHSHKRYLSEIMASSLNGLTVGDPLP
ENLMDSPARSESMLYLRDEMSWQYSPMSEDSDDCRFCETSTNLFPLQSDSVPTSPVSPYRYQRPFSTVTPSTSTNNNSLGCSTSPVPGLQPHQRGSDSEGRFPSSPSDIC
HSADLRRAALLRSVQMRAQPPGPSSMELPYCSMTEPGPNIEAEERPCAFTKSLVNERVYQLEECSTMRVSEPEYNQQKPCKDLNRNMKDSESGES
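
The CDS and protein sequences in this record can be cross-checked by translating the CDS. The sketch below assumes the standard genetic code and reading frame 1. The `translate` helper is a hypothetical name, and only two short fragments are shown: the first eight codons of the CDS and its final twelve nucleotides.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First eight codons of the CDS, and its last twelve nucleotides.
print(translate("ATGGATCTCCGCCTTGTGCGGTGG"))
print(translate("GGAGAGTCGTAA"))
```

The two fragments translate to `MDLRLVRW` and `GES`, matching the start and end of the protein sequence listed above.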