CuGenDBv2

Moc03g18630 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc03g18630
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: DNA-directed DNA polymerase
Genome location: chr3:12327678..12344928
RNA-Seq Expression: Moc03g18630
Synteny: Moc03g18630
Gene Ontology terms: NA
InterPro domains: IPR021109 - Aspartic peptidase domain superfamily
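The genome location above is written as a `seqid:start..end` string. A minimal sketch of reading such a string and computing the span is shown below. It assumes the coordinates are 1-based and inclusive, which this record does not state. The names `Region` and `parse_location` are hypothetical helpers for illustration and are not part of any CuGenDB interface.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A genomic interval; coordinates assumed 1-based and inclusive."""
    seqid: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive span, under the 1-based inclusive assumption.
        return self.end - self.start + 1


# Matches strings of the form "chr3:12327678..12344928".
_LOCATION = re.compile(r"^(?P<seqid>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_location(text: str) -> Region:
    """Parse a 'seqid:start..end' location string into a Region."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Region(match["seqid"], start, end)


region = parse_location("chr3:12327678..12344928")
print(region.seqid, region.start, region.end, region.length)
```

Under that assumption the gene region spans 17251 bp on chr3. This is longer than the CDS listed under Sequences, as expected for a genomic interval.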


Homology
GenBank top hits | e value | % identity | Alignment
XP_022147186.1 uncharacterized protein LOC111016198 [Momordica charantia] | 1.1e-58 | 61.29
Query:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL
        +VGH LCDL   INL+PL VYQKLGI +ARPTTVTLQL DRSITHP+GK  + LVQVDKFIFP DFIIL YE + EI IILGRPFLSTGRALIDV NGEL
Subjt:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL

Query:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQL-------------------------------------------EQAPFQPS
        TMRVNDQ +TF +FNSIKFP D++ECSLLR+ DDL ++EMQ EELL+QL                                           EQA  Q S
Subjt:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQL-------------------------------------------EQAPFQPS

Query:  VEKAPKLEFKVLPTHLK
        VEKA KLE KVLPTHLK
Subjt:  VEKAPKLEFKVLPTHLK

XP_022156989.1 uncharacterized protein LOC111023818 [Momordica charantia] | 1.7e-64 | 74.05
Query:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL
        +VGHALCDL  SINLMPLSVYQKLGI +ARP TVTLQL DRSIT+ +GKI + LVQVDKFIFP DFIIL YEAD EI IILGRPFLSTGRALIDV NGEL
Subjt:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL

Query:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLE-----------QAPFQPSVEKAPKLEFKVLPTHLK
        T+RVNDQ +T S+FNSIK+P DV+ECS LRI DDL++ E+Q EELLNQLE           QAP QPSV KAPKLE KVLP+HLK
Subjt:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLE-----------QAPFQPSVEKAPKLEFKVLPTHLK

XP_022158490.1 uncharacterized protein LOC111024970 [Momordica charantia] | 5.1e-80 | 35.86
Query:  DSPTRLLNLVVERNNGGEVGVAAAAPHLNVILLADDGERAIKTYVAPTLHGFHPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLV
        D   RL N VVE  N GEV V AAAP LNVILL DDGER I+ Y AP +HGFHPVIA P IEAERFELKSIMFQMLQTVGQ FG PSED HLH RYFL V
Subjt:  DSPTRLLNLVVERNNGGEVGVAAAAPHLNVILLADDGERAIKTYVAPTLHGFHPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLV

Query:  SDSFKMQGVSKEALLLKLFPY-SNVRYRGDINNFQQRHGESVSESWEE-AVDILESIYASNYH-WSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVK
        SDSF MQ VSKEAL LKLFPY  + + R  +N+      ES++ SW + A      I++      SD KAVN+R+ H A DNEAM AL D+I NLTNMVK
Subjt:  SDSFKMQGVSKEALLLKLFPY-SNVRYRGDINNFQQRHGESVSESWEE-AVDILESIYASNYH-WSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVK

Query:  NMNTATTSSTSPGTRETVENMPIDEDKVNYARALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLY
        NMNTATTSS SPG    +   P++++   Y    Q++   Q  ++ P  ++   P  N              SM+T M++ +  N        +A +   
Subjt:  NMNTATTSSTSPGTRETVENMPIDEDKVNYARALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLY

Query:  PEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQCLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVY
         +   + +R+ +  + +   + + +P          R  T  +    +P+  ++++                   T+       D   R  +A    +V 
Subjt:  PEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQCLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVY

Query:  ILASISDVLAKKHESMVTAKGTSVREHVLNLMVQFNVAKVNGAVIDERNQEGEVN--VVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTA
        I            E+ V  K        + + V++    +    + ++ Q+ + +  +  LKQ H                                   
Subjt:  ILASISDVLAKKHESMVTAKGTSVREHVLNLMVQFNVAKVNGAVIDERNQEGEVN--VVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTA

Query:  TAATKKGKAKGISSWRQLDAGEITLKVKTGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRV----PRCSGRVVAQPDRY
                                           + + L E  ++  N    + D  +   R+             +E  +      C+  +  +P + 
Subjt:  TAATKKGKAKGISSWRQLDAGEITLKVKTGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRV----PRCSGRVVAQPDRY

Query:  IFRCGILSL-------HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRP
        +   G  ++       +VG+ALCDL  SINLMPLS+ +KL I KARPTT+TLQL DRSITHP+GKI + LVQVDKFIFP DFIIL YEAD EI IILGRP
Subjt:  IFRCGILSL-------HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRP

Query:  FLSTGRALIDV
        FL TGRALIDV
Subjt:  FLSTGRALIDV

XP_030497888.1 uncharacterized protein LOC115713544 [Cannabis sativa] | 1.9e-50 | 29.86
Query:  HPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLVSDSFKMQGVSKEALLLKLFPYS---------NVRYRGDINNFQ---------
        +P I +PKI+A  FELK +MFQMLQTVGQ  G P+ED HLH   FL VS+SFK++GVS+EAL LKLFP+S         N      + N+          
Subjt:  HPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLVSDSFKMQGVSKEALLLKLFPYS---------NVRYRGDINNFQ---------

Query:  -----QRHGESVSESWEEAVDILESIYASNYHWSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVKNMNTATTSSTSPGTRETVENMPIDEDKVNYAR
               +G  +S+S+ EA +ILE I ++NY WS   A   R      + +A+ AL  ++ ++TN++KNMN                             
Subjt:  -----QRHGESVSESWEEAVDILESIYASNYHWSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVKNMNTATTSSTSPGTRETVENMPIDEDKVNYAR

Query:  ALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLYPEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQ
                 G  +QP                A A    E+S          EN      +H  +             SW G          +    S  Q
Subjt:  ALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLYPEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQ

Query:  CLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVYILASISDVLAKKHESMVTAKGTSVREHVLNLM
          G +  ++ L   Q+P+              P  P   + + TS+     RD Y  +  A  +++V  L ++   L +    + +    ++     N  
Subjt:  CLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVYILASISDVLAKKHESMVTAKGTSVREHVLNLM

Query:  VQFNVAKVNGAVIDERNQEGEVNVVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTATAA------TKKGKAKGISSWRQLDAGEITLKVK
                       R+ +     V L+     S    +S  ++ GSK     +  G+ +K  PT  AA      T  G+        Q       L+ K
Subjt:  VQFNVAKVNGAVIDERNQEGEVNVVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTATAA------TKKGKAKGISSWRQLDAGEITLKVK

Query:  TGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRVPRCSGRVVAQPDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQK
                         +      R +D              L+Q H    + +P     +   P    F   IL+      L + + ++ L       K
Subjt:  TGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRVPRCSGRVVAQPDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQK

Query:  LGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDV
        LGI +ARPTTVTLQL DRS+ HP+GKI +  VQVDKFIFP DFIIL YEAD E+ IILGRPFL+TGR LIDVQNGELTMRVNDQ +TF+VFN+++FP ++
Subjt:  LGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDV

Query:  DECSLLRIVDDLLTKEMQNE
        +ECS L ++D ++ +    E
Subjt:  DECSLLRIVDDLLTKEMQNE

XP_030504924.1 uncharacterized protein LOC115719886 [Cannabis sativa] | 4.2e-50 | 49.13
Query:  PDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLS
        P  +   C I    VG ALCDL  SINLMP+S+++KLGI +ARPTTVTLQL DRS+ HP+GKI + LVQVDKFIFP DFIIL YEAD ++ IILGRPFL+
Subjt:  PDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLS

Query:  TGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNE--------ELLNQLE----------------------QAPFQ--
        TGR LIDVQNGELTMRVNDQ +TF+VFN+++FP +++ECS + ++D ++ ++   E           ++LE                      + PF+  
Subjt:  TGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNE--------ELLNQLE----------------------QAPFQ--

Query:  -----------PSVEKAPKLEFKVLPTHLK
                   PS+++ PKLE K LP+HLK
Subjt:  -----------PSVEKAPKLEFKVLPTHLK

TrEMBL top hits | e value | % identity | Alignment
A0A6J1CS22 uncharacterized protein LOC111013805 | 2.2e-44 | 56.4
Query:  PDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLS
        P  +     I    +G  LCD+  SIN+MPLS+Y KLGI +ARPTTVTLQL DRSITHP+GKI +  VQV+KF FP DFIIL Y+A  E+ IILGRPFL+
Subjt:  PDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLS

Query:  TGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLEQAPFQPSVE
        TGRAL+DV  GELTM V DQ + FSV NS+KF  + +ECS+L+I+D+ L +E++ E +L +LE    +  VE
Subjt:  TGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLEQAPFQPSVE

A0A6J1D1L0 uncharacterized protein LOC111016198 | 5.3e-59 | 61.29
Query:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL
        +VGH LCDL   INL+PL VYQKLGI +ARPTTVTLQL DRSITHP+GK  + LVQVDKFIFP DFIIL YE + EI IILGRPFLSTGRALIDV NGEL
Subjt:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL

Query:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQL-------------------------------------------EQAPFQPS
        TMRVNDQ +TF +FNSIKFP D++ECSLLR+ DDL ++EMQ EELL+QL                                           EQA  Q S
Subjt:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQL-------------------------------------------EQAPFQPS

Query:  VEKAPKLEFKVLPTHLK
        VEKA KLE KVLPTHLK
Subjt:  VEKAPKLEFKVLPTHLK

A0A6J1DUG5 uncharacterized protein LOC111024456 | 4.4e-45 | 60.76
Query:  VGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGELT
        +G ALCDL  SINLMPLS+Y KLGI +ARP T+TL+L DRSI HP GKI + LVQVDKFIFP DFIIL YE D E+ IILGRPFL TGRAL+DV  GELT
Subjt:  VGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGELT

Query:  MRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLEQAPFQPSVE
        MRV DQ + FS+  S+KFP + +EC +L++ D+ L KE++ E +L  LE    +  V+
Subjt:  MRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLEQAPFQPSVE

A0A6J1DV77 uncharacterized protein LOC111023818 | 8.5e-65 | 74.05
Query:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL
        +VGHALCDL  SINLMPLSVYQKLGI +ARP TVTLQL DRSIT+ +GKI + LVQVDKFIFP DFIIL YEAD EI IILGRPFLSTGRALIDV NGEL
Subjt:  HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGEL

Query:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLE-----------QAPFQPSVEKAPKLEFKVLPTHLK
        T+RVNDQ +T S+FNSIK+P DV+ECS LRI DDL++ E+Q EELLNQLE           QAP QPSV KAPKLE KVLP+HLK
Subjt:  TMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLE-----------QAPFQPSVEKAPKLEFKVLPTHLK

A0A6J1DVZ9 uncharacterized protein LOC111024970 | 2.5e-80 | 35.86
Query:  DSPTRLLNLVVERNNGGEVGVAAAAPHLNVILLADDGERAIKTYVAPTLHGFHPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLV
        D   RL N VVE  N GEV V AAAP LNVILL DDGER I+ Y AP +HGFHPVIA P IEAERFELKSIMFQMLQTVGQ FG PSED HLH RYFL V
Subjt:  DSPTRLLNLVVERNNGGEVGVAAAAPHLNVILLADDGERAIKTYVAPTLHGFHPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLV

Query:  SDSFKMQGVSKEALLLKLFPY-SNVRYRGDINNFQQRHGESVSESWEE-AVDILESIYASNYH-WSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVK
        SDSF MQ VSKEAL LKLFPY  + + R  +N+      ES++ SW + A      I++      SD KAVN+R+ H A DNEAM AL D+I NLTNMVK
Subjt:  SDSFKMQGVSKEALLLKLFPY-SNVRYRGDINNFQQRHGESVSESWEE-AVDILESIYASNYH-WSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVK

Query:  NMNTATTSSTSPGTRETVENMPIDEDKVNYARALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLY
        NMNTATTSS SPG    +   P++++   Y    Q++   Q  ++ P  ++   P  N              SM+T M++ +  N        +A +   
Subjt:  NMNTATTSSTSPGTRETVENMPIDEDKVNYARALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLY

Query:  PEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQCLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVY
         +   + +R+ +  + +   + + +P          R  T  +    +P+  ++++                   T+       D   R  +A    +V 
Subjt:  PEESVSSYRSWQGVVVKEYQRSEEKPEKSAGQCLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVY

Query:  ILASISDVLAKKHESMVTAKGTSVREHVLNLMVQFNVAKVNGAVIDERNQEGEVN--VVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTA
        I            E+ V  K        + + V++    +    + ++ Q+ + +  +  LKQ H                                   
Subjt:  ILASISDVLAKKHESMVTAKGTSVREHVLNLMVQFNVAKVNGAVIDERNQEGEVN--VVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTA

Query:  TAATKKGKAKGISSWRQLDAGEITLKVKTGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRV----PRCSGRVVAQPDRY
                                           + + L E  ++  N    + D  +   R+             +E  +      C+  +  +P + 
Subjt:  TAATKKGKAKGISSWRQLDAGEITLKVKTGAVVSVVTVVLNEISDEATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRV----PRCSGRVVAQPDRY

Query:  IFRCGILSL-------HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRP
        +   G  ++       +VG+ALCDL  SINLMPLS+ +KL I KARPTT+TLQL DRSITHP+GKI + LVQVDKFIFP DFIIL YEAD EI IILGRP
Subjt:  IFRCGILSL-------HVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIVEFLVQVDKFIFPTDFIILYYEADNEISIILGRP

Query:  FLSTGRALIDV
        FL TGRALIDV
Subjt:  FLSTGRALIDV

SwissProt top hits | e value | % identity | Alignment
No hits found
Arabidopsis top hits | e value | % identity | Alignment
No hits found

Sequences
CDS sequence
ATGCTAACCCAATGGATGAACTACCAAGACTCCCCAACAAGACTCCTCAATCTAGTAGTTGAAAGAAATAATGGAGGAGAAGTGGGTGTAGCAGCAGCTGCTCCCCATCT
TAACGTCATTTTGTTGGCAGATGATGGAGAAAGGGCTATCAAAACCTATGTTGCGCCTACACTTCATGGTTTTCATCCAGTTATAGCGAAGCCAAAAATAGAAGCTGAAA
GGTTTGAGTTGAAATCTATTATGTTCCAGATGCTCCAAACAGTGGGTCAGCTTTTTGGAAAGCCGTCTGAAGACCTCCATTTGCACTGGAGGTACTTTCTGTTGGTAAGC
GATTCTTTCAAGATGCAAGGAGTATCTAAGGAGGCATTGCTTTTGAAATTGTTCCCCTACTCAAACGTCAGGTATCGAGGAGACATTAATAATTTTCAACAGAGGCATGG
AGAATCGGTCAGTGAGTCGTGGGAAGAGGCAGTTGACATCTTGGAAAGTATTTATGCTAGTAACTACCACTGGTCAGATCCCAAAGCAGTGAATGACAGGAGCACTCACG
TGGCTACTGATAATGAGGCAATGCTTGCATTGAAGGATGAGATTGTCAACCTAACCAACATGGTAAAGAACATGAACACTGCCACAACATCATCAACTAGCCCTGGAACT
AGGGAAACTGTAGAGAACATGCCTATAGATGAGGATAAAGTTAATTATGCTAGAGCATTGCAGGAGCAAGAACTACACCAAGGACCAATAGTACAACCGGAGATAGAAAG
GCAACCTGAACCTGAGATAAATAGAAGAAAAAGAGCTGCAGAAGCAGAGCCAGCTGAAGAAGTATCTATGAAGACACCAATGCAGAAAAAAGTTAAGGAAAACATTCTCA
AACGATGGGAGACCCATGAAGCTTCACTATTCCTGTATCCAGAGGAGTCTGTAAGTTCATATCGTTCGTGGCAAGGAGTTGTTGTGAAAGAATACCAAAGAAGTGAGGAA
AAACCAGAGAAATCGGCTGGACAGTGCCTCGGGGCTAGGGATGGCACCTATGTGCTGCCAGAGCTACAGAAACCCAAGAGCTCGAGCCTGCAGCAATGGGGGCAGCGCCT
AGGCACTGTCCCGTGGAACCCTGGGTCCGGGCGTCGGGCTCCCACGTCTAATGCAGCCCGAGCCAGTCGGGATGCTTATAACAGACGGATCCAGGCTAATGACAAGGCCA
AGGTCTACATCTTGGCAAGCATATCTGATGTGCTGGCCAAAAAGCATGAGAGCATGGTGACTGCAAAGGGAACATCAGTGCGAGAACATGTTCTCAATCTGATGGTCCAG
TTTAATGTGGCTAAAGTGAACGGCGCTGTCATAGACGAGAGGAATCAGGAAGGGGAGGTAAACGTTGTCACCTTAAAACAGTTCCATCGAGGTTCGACCTCAAGAACAAA
ATCTGCATTTTCGTCTTTTGGAAGTAAGACTTTCAAGAAGAAGAAGAACAGTGGTAAGGGGGTGAAAGCTAACCCTACTGCTACTGCTGCTACCAAGAAGGGCAAGGCCA
AAGGAATTAGTTCCTGGAGGCAGCTTGATGCTGGGGAGATAACTCTCAAAGTCAAAACGGGAGCTGTCGTCTCAGTAGTCACAGTTGTGTTAAATGAGATTTCCGATGAA
GCTACAAATACATCAACAAGAGTTGTTGATAAAGCTAGCACTTCAACAAGAGTTGTTGATGGCCCTGGTACATTACGTCAGTCACATCCATCTCAAGAGTTGAGAGTGCC
TCGGTGTAGTGGGAGGGTTGTGGCACAACCTGATCGTTACATCTTCCGCTGTGGGATCTTATCCCTTCATGTGGGTCATGCACTATGTGACTTAAGGCTAAGTATAAACC
TCATGCCTCTGTCGGTATATCAGAAATTGGGAATTGCGAAGGCACGCCCTACCACGGTGACCTTGCAGTTGCCTGATAGGTCAATCACACATCCGAAGGGTAAGATAGTA
GAATTTTTAGTGCAAGTGGACAAATTCATCTTCCCAACTGACTTCATCATATTATACTACGAAGCTGACAATGAAATTTCAATCATTTTGGGAAGGCCTTTCCTCTCCAC
TGGCAGAGCTTTAATAGATGTACAAAATGGAGAATTAACGATGCGAGTAAATGACCAACATATGACATTTTCTGTTTTTAATTCTATTAAGTTTCCTACTGATGTGGATG
AATGTTCTCTGTTAAGGATTGTAGATGACTTGCTTACGAAGGAGATGCAGAATGAGGAGCTATTAAATCAGTTAGAGCAAGCACCGTTTCAACCATCTGTGGAAAAGGCT
CCCAAGTTGGAGTTTAAAGTCCTCCCAACTCACCTGAAAACAACAACCAGAACAATAGTTGCAGCAAGAGGTAGTTCCTTCATCCCCAAAGAAGAAAAAACCAGCAACAA
TGGGGCGAAAGCTTACCAAGGTGAAACCACGTCAGTGCTCATTGAAGAAGAAGAAAGTGTTGCCCCAGAAGAAGAAGTGCCCGTAGAAGAGCAACCAGTGGAACATGAAG
AAGTCCTGCCTGAGGAAAGGAACGAAGATGAAGTTCCAACCTTCTCTCCTGAAACAGTGCAGGTGGTGGAGGAGTTTGTTCGAATTCTATCAGAAAATGAGCAGGTGCTC
ATGGGTTTCAAACAATGGGGCTTCGGCAAGCAAGGTGTACCAAGTGGAGCAAAGGATGCAGAGGAAGAAGAAGATAATGAGGAGCTCCAGAGTGACAAAGAAGGGGAAGA
AGTAGTCCCTGTGGCCGTAAGGAAATCATCTAGAAAAAGAAGGGCACCACCCAAGCAACCAGCTGGCAAGAAAAACAAGCTAGAGTACGAGAGGTTCGTTAATGCTACAA
CATCTGAATGGTTTGAAGAATTGAAAGATAGAAATCTGCTAGGCGAGAGAGGCTTCCCAACTGATTCCCCCATGCCAGACTTTGTGGTCGACACCGTTCAGAAGCATGAA
GCTAGATGGCAGCTTTCAACAAGGGGAGCCCGAACCTTCCTCCATGCCTACCTCAAGCCTGAAGCAAACGTATGGCACCACCTGCTGAAGAGCAAGCTCATGCCAACTAC
ACATGATGCCTCCATCACCAAGGAAAGAGTTCTACTCCTGTTCTGCATCATGGAAGGCCTTAGCATAAACGTCGGCAAGTTAATTGAGAGGGAGATTGCTAGCGTGAAGA
AAAGGAAGTACGGCAATCTATTTTTTCCAAATCTCATCACGGAGCTGTGTATTCTCTCGCGTGTTAAGGTGGAGGAAGCAGATCAAGTGCTCAAGGATAAGGGAGTGATT
AATGAAGCTGCCATTGAGAGATTGAAAGTAGTCGATCATAGCAAGAGAGTGGGGAACATTACCTTCGCCATCAAAGAAGTTACCACCAATCAGGCCAGCATGTTGGAGCA
ACTTAACTCCATTATGGGAGTGGTAGCTGCAACAGAAATAAGGCAGCAGGCTTACTTACGCTATGCAAAAGAAAGAAACATAGTGATCCGTAAGGCCCTAATGACGAAAT
TCTCACATCCATTCCTTCCATTCCCTATATTTCCTAAAGAGCAAATTGTTGAAGAAGAAGGTGATGATGCCGGCGCGTCTCAAACTGCCAGCAGCGCTATTGTGATGAAT
AAGAAGGATAACAAGGAGAGTTGA
mRNA sequence
ATGCTAACCCAATGGATGAACTACCAAGACTCCCCAACAAGACTCCTCAATCTAGTAGTTGAAAGAAATAATGGAGGAGAAGTGGGTGTAGCAGCAGCTGCTCCCCATCT
TAACGTCATTTTGTTGGCAGATGATGGAGAAAGGGCTATCAAAACCTATGTTGCGCCTACACTTCATGGTTTTCATCCAGTTATAGCGAAGCCAAAAATAGAAGCTGAAA
GGTTTGAGTTGAAATCTATTATGTTCCAGATGCTCCAAACAGTGGGTCAGCTTTTTGGAAAGCCGTCTGAAGACCTCCATTTGCACTGGAGGTACTTTCTGTTGGTAAGC
GATTCTTTCAAGATGCAAGGAGTATCTAAGGAGGCATTGCTTTTGAAATTGTTCCCCTACTCAAACGTCAGGTATCGAGGAGACATTAATAATTTTCAACAGAGGCATGG
AGAATCGGTCAGTGAGTCGTGGGAAGAGGCAGTTGACATCTTGGAAAGTATTTATGCTAGTAACTACCACTGGTCAGATCCCAAAGCAGTGAATGACAGGAGCACTCACG
TGGCTACTGATAATGAGGCAATGCTTGCATTGAAGGATGAGATTGTCAACCTAACCAACATGGTAAAGAACATGAACACTGCCACAACATCATCAACTAGCCCTGGAACT
AGGGAAACTGTAGAGAACATGCCTATAGATGAGGATAAAGTTAATTATGCTAGAGCATTGCAGGAGCAAGAACTACACCAAGGACCAATAGTACAACCGGAGATAGAAAG
GCAACCTGAACCTGAGATAAATAGAAGAAAAAGAGCTGCAGAAGCAGAGCCAGCTGAAGAAGTATCTATGAAGACACCAATGCAGAAAAAAGTTAAGGAAAACATTCTCA
AACGATGGGAGACCCATGAAGCTTCACTATTCCTGTATCCAGAGGAGTCTGTAAGTTCATATCGTTCGTGGCAAGGAGTTGTTGTGAAAGAATACCAAAGAAGTGAGGAA
AAACCAGAGAAATCGGCTGGACAGTGCCTCGGGGCTAGGGATGGCACCTATGTGCTGCCAGAGCTACAGAAACCCAAGAGCTCGAGCCTGCAGCAATGGGGGCAGCGCCT
AGGCACTGTCCCGTGGAACCCTGGGTCCGGGCGTCGGGCTCCCACGTCTAATGCAGCCCGAGCCAGTCGGGATGCTTATAACAGACGGATCCAGGCTAATGACAAGGCCA
AGGTCTACATCTTGGCAAGCATATCTGATGTGCTGGCCAAAAAGCATGAGAGCATGGTGACTGCAAAGGGAACATCAGTGCGAGAACATGTTCTCAATCTGATGGTCCAG
TTTAATGTGGCTAAAGTGAACGGCGCTGTCATAGACGAGAGGAATCAGGAAGGGGAGGTAAACGTTGTCACCTTAAAACAGTTCCATCGAGGTTCGACCTCAAGAACAAA
ATCTGCATTTTCGTCTTTTGGAAGTAAGACTTTCAAGAAGAAGAAGAACAGTGGTAAGGGGGTGAAAGCTAACCCTACTGCTACTGCTGCTACCAAGAAGGGCAAGGCCA
AAGGAATTAGTTCCTGGAGGCAGCTTGATGCTGGGGAGATAACTCTCAAAGTCAAAACGGGAGCTGTCGTCTCAGTAGTCACAGTTGTGTTAAATGAGATTTCCGATGAA
GCTACAAATACATCAACAAGAGTTGTTGATAAAGCTAGCACTTCAACAAGAGTTGTTGATGGCCCTGGTACATTACGTCAGTCACATCCATCTCAAGAGTTGAGAGTGCC
TCGGTGTAGTGGGAGGGTTGTGGCACAACCTGATCGTTACATCTTCCGCTGTGGGATCTTATCCCTTCATGTGGGTCATGCACTATGTGACTTAAGGCTAAGTATAAACC
TCATGCCTCTGTCGGTATATCAGAAATTGGGAATTGCGAAGGCACGCCCTACCACGGTGACCTTGCAGTTGCCTGATAGGTCAATCACACATCCGAAGGGTAAGATAGTA
GAATTTTTAGTGCAAGTGGACAAATTCATCTTCCCAACTGACTTCATCATATTATACTACGAAGCTGACAATGAAATTTCAATCATTTTGGGAAGGCCTTTCCTCTCCAC
TGGCAGAGCTTTAATAGATGTACAAAATGGAGAATTAACGATGCGAGTAAATGACCAACATATGACATTTTCTGTTTTTAATTCTATTAAGTTTCCTACTGATGTGGATG
AATGTTCTCTGTTAAGGATTGTAGATGACTTGCTTACGAAGGAGATGCAGAATGAGGAGCTATTAAATCAGTTAGAGCAAGCACCGTTTCAACCATCTGTGGAAAAGGCT
CCCAAGTTGGAGTTTAAAGTCCTCCCAACTCACCTGAAAACAACAACCAGAACAATAGTTGCAGCAAGAGGTAGTTCCTTCATCCCCAAAGAAGAAAAAACCAGCAACAA
TGGGGCGAAAGCTTACCAAGGTGAAACCACGTCAGTGCTCATTGAAGAAGAAGAAAGTGTTGCCCCAGAAGAAGAAGTGCCCGTAGAAGAGCAACCAGTGGAACATGAAG
AAGTCCTGCCTGAGGAAAGGAACGAAGATGAAGTTCCAACCTTCTCTCCTGAAACAGTGCAGGTGGTGGAGGAGTTTGTTCGAATTCTATCAGAAAATGAGCAGGTGCTC
ATGGGTTTCAAACAATGGGGCTTCGGCAAGCAAGGTGTACCAAGTGGAGCAAAGGATGCAGAGGAAGAAGAAGATAATGAGGAGCTCCAGAGTGACAAAGAAGGGGAAGA
AGTAGTCCCTGTGGCCGTAAGGAAATCATCTAGAAAAAGAAGGGCACCACCCAAGCAACCAGCTGGCAAGAAAAACAAGCTAGAGTACGAGAGGTTCGTTAATGCTACAA
CATCTGAATGGTTTGAAGAATTGAAAGATAGAAATCTGCTAGGCGAGAGAGGCTTCCCAACTGATTCCCCCATGCCAGACTTTGTGGTCGACACCGTTCAGAAGCATGAA
GCTAGATGGCAGCTTTCAACAAGGGGAGCCCGAACCTTCCTCCATGCCTACCTCAAGCCTGAAGCAAACGTATGGCACCACCTGCTGAAGAGCAAGCTCATGCCAACTAC
ACATGATGCCTCCATCACCAAGGAAAGAGTTCTACTCCTGTTCTGCATCATGGAAGGCCTTAGCATAAACGTCGGCAAGTTAATTGAGAGGGAGATTGCTAGCGTGAAGA
AAAGGAAGTACGGCAATCTATTTTTTCCAAATCTCATCACGGAGCTGTGTATTCTCTCGCGTGTTAAGGTGGAGGAAGCAGATCAAGTGCTCAAGGATAAGGGAGTGATT
AATGAAGCTGCCATTGAGAGATTGAAAGTAGTCGATCATAGCAAGAGAGTGGGGAACATTACCTTCGCCATCAAAGAAGTTACCACCAATCAGGCCAGCATGTTGGAGCA
ACTTAACTCCATTATGGGAGTGGTAGCTGCAACAGAAATAAGGCAGCAGGCTTACTTACGCTATGCAAAAGAAAGAAACATAGTGATCCGTAAGGCCCTAATGACGAAAT
TCTCACATCCATTCCTTCCATTCCCTATATTTCCTAAAGAGCAAATTGTTGAAGAAGAAGGTGATGATGCCGGCGCGTCTCAAACTGCCAGCAGCGCTATTGTGATGAAT
AAGAAGGATAACAAGGAGAGTTGA
Protein sequence
MLTQWMNYQDSPTRLLNLVVERNNGGEVGVAAAAPHLNVILLADDGERAIKTYVAPTLHGFHPVIAKPKIEAERFELKSIMFQMLQTVGQLFGKPSEDLHLHWRYFLLVS
DSFKMQGVSKEALLLKLFPYSNVRYRGDINNFQQRHGESVSESWEEAVDILESIYASNYHWSDPKAVNDRSTHVATDNEAMLALKDEIVNLTNMVKNMNTATTSSTSPGT
RETVENMPIDEDKVNYARALQEQELHQGPIVQPEIERQPEPEINRRKRAAEAEPAEEVSMKTPMQKKVKENILKRWETHEASLFLYPEESVSSYRSWQGVVVKEYQRSEE
KPEKSAGQCLGARDGTYVLPELQKPKSSSLQQWGQRLGTVPWNPGSGRRAPTSNAARASRDAYNRRIQANDKAKVYILASISDVLAKKHESMVTAKGTSVREHVLNLMVQ
FNVAKVNGAVIDERNQEGEVNVVTLKQFHRGSTSRTKSAFSSFGSKTFKKKKNSGKGVKANPTATAATKKGKAKGISSWRQLDAGEITLKVKTGAVVSVVTVVLNEISDE
ATNTSTRVVDKASTSTRVVDGPGTLRQSHPSQELRVPRCSGRVVAQPDRYIFRCGILSLHVGHALCDLRLSINLMPLSVYQKLGIAKARPTTVTLQLPDRSITHPKGKIV
EFLVQVDKFIFPTDFIILYYEADNEISIILGRPFLSTGRALIDVQNGELTMRVNDQHMTFSVFNSIKFPTDVDECSLLRIVDDLLTKEMQNEELLNQLEQAPFQPSVEKA
PKLEFKVLPTHLKTTTRTIVAARGSSFIPKEEKTSNNGAKAYQGETTSVLIEEEESVAPEEEVPVEEQPVEHEEVLPEERNEDEVPTFSPETVQVVEEFVRILSENEQVL
MGFKQWGFGKQGVPSGAKDAEEEEDNEELQSDKEGEEVVPVAVRKSSRKRRAPPKQPAGKKNKLEYERFVNATTSEWFEELKDRNLLGERGFPTDSPMPDFVVDTVQKHE
ARWQLSTRGARTFLHAYLKPEANVWHHLLKSKLMPTTHDASITKERVLLLFCIMEGLSINVGKLIEREIASVKKRKYGNLFFPNLITELCILSRVKVEEADQVLKDKGVI
NEAAIERLKVVDHSKRVGNITFAIKEVTTNQASMLEQLNSIMGVVAATEIRQQAYLRYAKERNIVIRKALMTKFSHPFLPFPIFPKEQIVEEEGDDAGASQTASSAIVMN
KKDNKES