CuGenDBv2

Moc04g26230 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc04g26230
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Retrotran_gag_3 domain-containing protein
Genome location: chr4:19092757..19102673
RNA-Seq Expression: Moc04g26230
Synteny: Moc04g26230
Gene Ontology terms:
    GO:0003676 - nucleic acid binding (molecular function)
    GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
    IPR001878 - Zinc finger, CCHC-type
    IPR025836 - Zinc knuckle CX2CX4HX4C
    IPR029472 - Retrotransposon Copia-like, N-terminal
    IPR036691 - Endonuclease/exonuclease/phosphatase superfamily
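The genome location above uses a `chromosome:start..end` notation. As a minimal sketch of how such a string might be split into usable coordinates, the Python below parses it and computes the span length. The function name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

def parse_location(loc):
    """Split 'chrom:start..end' into (chrom, start, end) with integer coordinates."""
    m = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognised location: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

chrom, start, end = parse_location("chr4:19092757..19102673")
span = end - start + 1  # assumption: 1-based, inclusive coordinates
print(chrom, start, end, span)
```

Under that assumption the gene spans 9917 bp on chr4.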


Homology
GenBank top hits (e value, % identity, alignment)
KAA0049700.1 T4.5 [Cucumis melo var. makuwa] (e value: 5.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

XP_008448007.1 PREDICTED: uncharacterized protein LOC103490319 isoform X2 [Cucumis melo] (e value: 5.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

XP_008448008.1 PREDICTED: uncharacterized protein LOC103490319 isoform X3 [Cucumis melo] (e value: 5.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

XP_022150845.1 uncharacterized protein LOC111018892 [Momordica charantia] (e value: 1.6e-52; % identity: 57.97)
Query:  PRVRFYYFLFLLAVKSLLAVSFHGESNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL---NDDSTQP----
        PRV F+        +S +       +N  KDL SPIFLLSNI N VSIRLDS++F+LWKFQLT+ILKAHKLFGF+DGS   PS FL   ++  +QP    
Subjt:  PRVRFYYFLFLLAVKSLLAVSFHGESNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL---NDDSTQP----

Query:  -----NPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE
             NP +EDWIAKD ALMTLINATLS   L +VV    S++VW  L KHYSS+SR NVVNLK++LQSIVKK+ E+ID YV+R+KE+KDK AN+S+ I 
Subjt:  -----NPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE

Query:  DEDLIIY
        DE L+IY
Subjt:  DEDLIIY

XP_022158689.1 uncharacterized protein LOC111025150 [Momordica charantia] (e value: 4.9e-57; % identity: 68.97)
Query:  KDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL----NDDSTQP---NPAYEDWIAKDHALMTLINATLSTTTLTF
        KDL+SPIFLLSNI N VS+RLDSSNFVLWKFQLT+ILKAHKL+GF+DGST KP+ FL    +  S+ P   NPA+ +WIAKDHALMTL+NA LS++ L +
Subjt:  KDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL----NDDSTQP---NPAYEDWIAKDHALMTLINATLSTTTLTF

Query:  VVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIYT
        VVGC +SQ+VW TLVKHYSS SR NVVNLK++LQSI KK   +ID YVQR+KELKDKLAN+ V++++EDL+IYT
Subjt:  VVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIYT

TrEMBL top hits (e value, % identity, alignment)
A0A1S3BI58 uncharacterized protein LOC103490319 isoform X2 (e value: 2.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

A0A1S4DWT9 uncharacterized protein LOC103490319 isoform X1 (e value: 2.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

A0A5D3CLI6 T4.5 (e value: 2.8e-50; % identity: 62.29)
Query:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL
        S+  KD  SPIFLLSNI N +S+RLDS+NFVLWKFQLT+ILKAHKL+GF+DG+   P    N  ST     Q NP+YEDWIAKD ALMT+INATLS   L
Subjt:  SNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-----QPNPAYEDWIAKDHALMTLINATLSTTTL

Query:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY
         +VVG  +S++VW  L K YSS SR NVVNLK++LQ+I KK  E+ID Y++R+KE+KDKLAN+S  I +EDL+IY
Subjt:  TFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIY

A0A6J1D9L6 uncharacterized protein LOC111018892 (e value: 7.8e-53; % identity: 57.97)
Query:  PRVRFYYFLFLLAVKSLLAVSFHGESNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL---NDDSTQP----
        PRV F+        +S +       +N  KDL SPIFLLSNI N VSIRLDS++F+LWKFQLT+ILKAHKLFGF+DGS   PS FL   ++  +QP    
Subjt:  PRVRFYYFLFLLAVKSLLAVSFHGESNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL---NDDSTQP----

Query:  -----NPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE
             NP +EDWIAKD ALMTLINATLS   L +VV    S++VW  L KHYSS+SR NVVNLK++LQSIVKK+ E+ID YV+R+KE+KDK AN+S+ I 
Subjt:  -----NPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE

Query:  DEDLIIY
        DE L+IY
Subjt:  DEDLIIY

A0A6J1E049 uncharacterized protein LOC111025150 (e value: 2.4e-57; % identity: 68.97)
Query:  KDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL----NDDSTQP---NPAYEDWIAKDHALMTLINATLSTTTLTF
        KDL+SPIFLLSNI N VS+RLDSSNFVLWKFQLT+ILKAHKL+GF+DGST KP+ FL    +  S+ P   NPA+ +WIAKDHALMTL+NA LS++ L +
Subjt:  KDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFL----NDDSTQP---NPAYEDWIAKDHALMTLINATLSTTTLTF

Query:  VVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIYT
        VVGC +SQ+VW TLVKHYSS SR NVVNLK++LQSI KK   +ID YVQR+KELKDKLAN+ V++++EDL+IYT
Subjt:  VVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIYT

SwissProt top hits (e value, % identity, alignment)
Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 (e value: 3.6e-10; % identity: 27.4)
Query:  RLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-QPNPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLN
        +L S+N+++W  Q+ ++   ++L GF+DGST  P   +  D+  + NP Y  W  +D  + + +   +S +    V     + ++W TL K Y++ S  +
Subjt:  RLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-QPNPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLN

Query:  VVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLI
        V  L+T L+    K ++TID Y+Q +    D+LA +   ++ ++ +
Subjt:  VVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLI

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 (e value: 8.2e-07; % identity: 28.57)
Query:  RLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-QPNPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLN
        +L S+N+++W  Q+ ++   ++L GF+DGST  P   +  D+  + NP Y  W  +D  + + I   +S +    V     + ++W TL K Y++ S  +
Subjt:  RLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDST-QPNPAYEDWIAKDHALMTLINATLSTTTLTFVVGCANSQEVWSTLVKHYSSDSRLN

Query:  VVNLK
        V  L+
Subjt:  VVNLK

Arabidopsis top hits (e value, % identity, alignment)
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). (e value: 1.6e-05; % identity: 21.21)
Query:  SKDLASPIFLLSNIR-----NFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDSTQPNPAYEDWIAKDHALMTLINATLSTTTLTFV
        + D  SP +L  +I      +   +  D  N+V WK +  S L+  K FGF+DG+  KP  F        +P Y+ W   +  +M  +  +++   L  V
Subjt:  SKDLASPIFLLSNIR-----NFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDSTQPNPAYEDWIAKDHALMTLINATLSTTTLTFV

Query:  VGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE
        +    + ++W  L + +     L +  L+  L ++ ++  +++++Y  ++ ++  +L+  + + E
Subjt:  VGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIE

AT2G13450.1 unknown protein (e value: 1.3e-04; % identity: 29.47)
Query:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC
        +G P+L      +  +A++LG ++  ++F D         A++R+R++  I   LR  L+I  DSG+      QYE+L   C  C R+ H    C
Subjt:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC

AT2G16676.1 unknown protein (e value: 1.2e-05; % identity: 32.63)
Query:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC
        +G P+L    C + V    LG  LGQ+  +D +       AF+R+R++ +I   +R   +I  DSG+      QYE+L   C  C RL H    C
Subjt:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC

AT2G17920.1 nucleic acid binding;zinc ion binding (e value: 7.9e-05; % identity: 28.42)
Query:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC
        +G P L      +  +A ++GA++     +D +   +   A++R+RV+V I   LR   +I  +SG+      QYE+L   C  C R  H    C
Subjt:  KGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC

AT5G36228.1 nucleic acid binding;zinc ion binding (e value: 1.3e-04; % identity: 28.95)
Query:  LGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC
        + + LG+V  +D N        F+R++V++D  +PLR   +++  S +      +YEKL   C  C R+ H +  C
Subjt:  LGALLGQVEFVDCNGVDNWVGAFLRLRVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLREC
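In the alignments above, the unlabelled middle line follows the usual BLAST convention: a letter marks an identical residue, `+` marks a conservative substitution, and a space marks a mismatch or gap. As a rough sketch, the Python below estimates percent identity for one alignment block from its query line and midline. The helper name `identity_from_midline` is hypothetical. The result covers only the block passed in, so it need not equal the % identity reported for a full hit.

```python
def identity_from_midline(query, midline):
    """Return percent identity over the aligned columns of one block.

    Assumes BLAST-style output: a letter in the midline marks an identical
    residue, '+' a conservative substitution, and a space a mismatch or gap.
    """
    # Trailing spaces on the midline are often stripped, so pad it back out.
    midline = midline.ljust(len(query))[:len(query)]
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / len(query)

# Final block of the Q9ZT94 alignment: query "VVNLK", midline "V  L+".
print(identity_from_midline("VVNLK", "V  L+"))
```

Both arguments should have their `Query:` prefix and leading indentation removed first, so that the columns line up.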


Sequences
CDS sequence
ATGGGTGCTGAATACAGGGCCTTGTGTTTTGATAATGAGGTGCATAATGTTCCTTTTAAATGTATGACTCAGGATATGGCTGTTAAGTTGGTGGTCTTCTTGGACCGATT
GTTGTTGTGGATTGTGTGGTGTCCCATTCAGTATGAAAAGCTTCTCGACTTCTGTTACTCTTGTGGGAGGATTGGTCACTCCCTTCGGGAATGTGTAACAGCAGTACCAC
CATCGGAGTCATCCGCTCGGTTACAATATGGAGAACGGCTGCGCGCGTCAGTTTTCAAGAAGGCTGAGGGTCGTCGAGGTGGTGCCAGATCCTCATCTTCCTCAGCCGAA
GGAGGGTGGGACAAGGATGACGAAGTGTGTGGAGGGAGCGGTGGATTGGCGGGTGAGTGCCGACGAGGCCCCCCAGTGTTTGCATCCGTTGCGCCGGTGGTTAAAGCACG
TCTCTTCCCAATGCGGCACCCAGTTGATTTTGTTAATGGGAATGCGTTACCTCCGAAAGTGGGGGGCGGGGGTGTAGTATCTGGTTTTCTCTCTAATTTATCGGACTCTT
CTCTGCGTCCTAGGGCTGCTGGCCCAATTCGGAAAAGCAAATGGAAACCGGCTGCGAGGGATCAACTTACGACTTCGAACTCATCTGAGGTTCCTGAGGAACCGATTATG
GGTAAGCGTGCCAGCAGCTCTTATTTGGCCATTTCTAGGGCTCCGGATTTTAAGAAGGCACGATCTAGTGATGTCGGATCTTTCGATTCAATTGTAGCTCATGGTTCTGC
AGTGGTATCGTTTCCATCTTTGTTGGCAGAGGCTGGGATGATGTCCCGTGGCTTGTTGGTGGGGATTTTAATGAAATTATGTATCATCATGAGAAGTTTGGGGGCGTTTA
AGCCGAACCGGGCTCTGTCGGATTTTCATTTGGCTATCGATGATTGTGGTTTACAGAATATGGGTTTCCAAGTTGATCGTTTCACGTGGATTAATAGGCGTGACAAAGGG
GAGATTATTTTTGAACGGTTGGATCGCTGTTTTTGTACGACGGCATGGTGGCTCCTGTTTCCGTTTGTTCGATCCAGCATTTGGACTACAATTTTTCGGACCACCGCCCT
TTGGCTTGTTATTTTGGGACAAGGCCGTTTGGAGGCCCGCCTTGTCGTTCATCTATACTCCGTTTTGAAGATGCTTGGTTGTCCCAAGAGGATAGTAAATCCATCGTTTT
TCACAGCTGGAATTCTGTCTCGGAGATTAGCTCACCAGAGAGTTTTGAGTGCGGAATCCCGTCTTGAGTCTCTGTTATTGGAGGAGAAAGTGTACTGGAAACAACGGTTC
CGTGAGTGTTGGCTGAAATGTGGCGACCAAAATACAAAGTGGTTCCACAATCGCGCTTCTTTTCGACGGCAGGTCAATACCATTGAGAGCTTAGAAGATTCCATTGGCCG
ATGGGTACAGGGACCAACGAAAATCCAAGACAATATGAACCATCGGTTACTTCGACCGTTTACAGTGGACGAAGTTGTGCAGGCTCTTAAACAAATTCATCCAGCTAAGG
CCCCCGGACCGGACGGTCTTCCGGGGTTTTTTTACGGGAATTTTTTAGAGGTCGTGGGTATGAATGTTATCAGTAATTGTTTACAGGTTGTAGTTCATTTATTTGGCGGA
GCCTTATTTGGGGACGTGGACTTTTACATTCGGACCTCCGTTGGTGGGTGGGCAATGCCGAAAATATTTGGATCTATCAAGATAGGTGGCTGCCTACGACGAGTTTCCTC
TGAGTACAATCTGTTCCGTCCTTGCCAGCGCACAGTCAAGGTGGCTCAGCATGCAGATATCCACCGGTCTCCTTCGTCTTCCTCTTCGGATGGTGTGGGTGCTCTTTTAC
CTTCCTTAAAGGTGACTGATGTGGCAAACCTCCTTCGTGATGTGAAGGAGAAGCCCAGCGATGTCTTTATCCCCCCTCCCCTCCGTCGTTGTGGTCATCCCGTTGTACGG
TTGTGTGGTGCCCTCCGGTTGCTCCGGTTTGTCCCAAATGTAGCAGATGTTGATATGGCCGAGAGCTTGGCTACTCAAGTAGGTTCTTGGGTTGTAGATACGGTTGCTTT
GGGGTGGTTATGTATGGAGGAGATCGTTGATTTATGGCGAGGGCTTACCTTAACGGGCGAAGAGGAGACAACTTTTCGAGTTGATAAGGGTTGCCCTGTGCTATCGGCGG
ATCGGTGTATGTCCAAAGTGATGGCCAACCAATTAGGGGCATTACTGGGTCAGGTTGAATTCGTGGATTGCAATGGGGTGGATAATTGGGTTGGTGCTTTTTTACGGCTC
CGAGTGAAGGTGGACATCGGACAACCGCTTCGACGGGGTCTCAAGATCCAGTTGGACTCGGGGAAGGAGTTGTGGTGCCTCATCCAGTATGAGAAGCTACCTGACTTTTG
CTATCCGTGTGGGCGATTGGGGCATACGTTGCGGGAGTGTGGCGATCCGGTGGGTGATGCGGGGGTTGAACGGGGGAGTCCGGGAACGGTGGAGGTGCTGTCGGGGGTTC
CGATAGTTGTGGGCGCGGTGTCGGAGGTTGGACCTGGGGTTTCTTCGCAGCCGGTGGTGGGTGAGTCGACGGTAGTGGCGTTGGTTTCGTCGGGCAGAGGATCCCCACCA
CCACGTTCTCCTATTTTAGCTGACAACGATGCACAGGTTGATGGTGGGGGGTTGCCTTTGTCAGTTACGTGTTCTCCAGGTGGTGGGAAGGATTTGGGCTGCATATTGCA
CTGTGACAGTGAAGGGCAGAATGGTGCTCCTGTTTCTAAAAAGCTTTGTTCTGAGGTTTCTGATGTTGGTGATTTTGCTGGACGGCTCGGGGGTCATGGCAATATCTGTT
CTTCTTCTCCTCTGGCAGCGGCTGGTCGAAGCGGTGGGTTAGCGCTTTTGTGGCATTGTGCATTGAATTTCAGATTGATTTCTTTCTCTGCCTCTCATATTAACGGCTGG
GTGGACCATGGTGGCACTCAGTGGCGTCTCACGGGTTTCTACGGCAGTCCTGTTGAACATTCGCGCCCCGCCACCTCGACTCTATTGAAACACCTCCGTGGGAGTGATGA
TGTACCATGGCTTATCAGGGGAGATTTTAATGAGATTATTCACCATCATGAGAAAACTTGGGGGACGCACAAAGCCGGGCAGGCCATTGAGGTTCAAATTGCAAACATGT
CGGTTGTACCTCGGAAATACATTAATAAAATTCCTCGGTTTGAGGATACTTCTATTGACCCACGTCTCGATCCAAAACCACGTGTCCGCTTCTATTACTTCTTGTTTCTT
CTTGCGGTGAAGTCTCTTCTTGCAGTGAGTTTTCATGGCGAATCCAATATCTCGAAGGATCTTGCTTCTCCAATATTTCTTCTGTCCAATATACGCAATTTTGTCTCTAT
TAGACTCGATTCATCCAATTTTGTTCTATGGAAGTTTCAGTTAACTTCGATTCTAAAAGCGCACAAGTTATTCGGCTTTGTTGATGGTTCAACCAAGAAGCCATCTCTAT
TTCTGAACGATGATTCTACTCAACCTAATCCAGCGTATGAAGATTGGATCGCCAAGGATCATGCCCTAATGACCTTGATCAATGCAACACTTTCTACTACTACTTTAACT
TTTGTTGTTGGCTGTGCAAATTCTCAGGAAGTGTGGTCTACGCTTGTGAAGCATTACTCATCGGATTCCAGATTGAATGTTGTGAATCTCAAAACAAATCTCCAATCGAT
TGTAAAGAAGTCGTCCGAGACCATTGATCAGTATGTTCAACGCGTCAAAGAACTTAAAGATAAGCTTGCTAACATCTCGGTTGTTATTGAGGATGAGGATCTTATCATTT
ATACATGA
mRNA sequence
ATGGGTGCTGAATACAGGGCCTTGTGTTTTGATAATGAGGTGCATAATGTTCCTTTTAAATGTATGACTCAGGATATGGCTGTTAAGTTGGTGGTCTTCTTGGACCGATT
GTTGTTGTGGATTGTGTGGTGTCCCATTCAGTATGAAAAGCTTCTCGACTTCTGTTACTCTTGTGGGAGGATTGGTCACTCCCTTCGGGAATGTGTAACAGCAGTACCAC
CATCGGAGTCATCCGCTCGGTTACAATATGGAGAACGGCTGCGCGCGTCAGTTTTCAAGAAGGCTGAGGGTCGTCGAGGTGGTGCCAGATCCTCATCTTCCTCAGCCGAA
GGAGGGTGGGACAAGGATGACGAAGTGTGTGGAGGGAGCGGTGGATTGGCGGGTGAGTGCCGACGAGGCCCCCCAGTGTTTGCATCCGTTGCGCCGGTGGTTAAAGCACG
TCTCTTCCCAATGCGGCACCCAGTTGATTTTGTTAATGGGAATGCGTTACCTCCGAAAGTGGGGGGCGGGGGTGTAGTATCTGGTTTTCTCTCTAATTTATCGGACTCTT
CTCTGCGTCCTAGGGCTGCTGGCCCAATTCGGAAAAGCAAATGGAAACCGGCTGCGAGGGATCAACTTACGACTTCGAACTCATCTGAGGTTCCTGAGGAACCGATTATG
GGTAAGCGTGCCAGCAGCTCTTATTTGGCCATTTCTAGGGCTCCGGATTTTAAGAAGGCACGATCTAGTGATGTCGGATCTTTCGATTCAATTGTAGCTCATGGTTCTGC
AGTGGTATCGTTTCCATCTTTGTTGGCAGAGGCTGGGATGATGTCCCGTGGCTTGTTGGTGGGGATTTTAATGAAATTATGTATCATCATGAGAAGTTTGGGGGCGTTTA
AGCCGAACCGGGCTCTGTCGGATTTTCATTTGGCTATCGATGATTGTGGTTTACAGAATATGGGTTTCCAAGTTGATCGTTTCACGTGGATTAATAGGCGTGACAAAGGG
GAGATTATTTTTGAACGGTTGGATCGCTGTTTTTGTACGACGGCATGGTGGCTCCTGTTTCCGTTTGTTCGATCCAGCATTTGGACTACAATTTTTCGGACCACCGCCCT
TTGGCTTGTTATTTTGGGACAAGGCCGTTTGGAGGCCCGCCTTGTCGTTCATCTATACTCCGTTTTGAAGATGCTTGGTTGTCCCAAGAGGATAGTAAATCCATCGTTTT
TCACAGCTGGAATTCTGTCTCGGAGATTAGCTCACCAGAGAGTTTTGAGTGCGGAATCCCGTCTTGAGTCTCTGTTATTGGAGGAGAAAGTGTACTGGAAACAACGGTTC
CGTGAGTGTTGGCTGAAATGTGGCGACCAAAATACAAAGTGGTTCCACAATCGCGCTTCTTTTCGACGGCAGGTCAATACCATTGAGAGCTTAGAAGATTCCATTGGCCG
ATGGGTACAGGGACCAACGAAAATCCAAGACAATATGAACCATCGGTTACTTCGACCGTTTACAGTGGACGAAGTTGTGCAGGCTCTTAAACAAATTCATCCAGCTAAGG
CCCCCGGACCGGACGGTCTTCCGGGGTTTTTTTACGGGAATTTTTTAGAGGTCGTGGGTATGAATGTTATCAGTAATTGTTTACAGGTTGTAGTTCATTTATTTGGCGGA
GCCTTATTTGGGGACGTGGACTTTTACATTCGGACCTCCGTTGGTGGGTGGGCAATGCCGAAAATATTTGGATCTATCAAGATAGGTGGCTGCCTACGACGAGTTTCCTC
TGAGTACAATCTGTTCCGTCCTTGCCAGCGCACAGTCAAGGTGGCTCAGCATGCAGATATCCACCGGTCTCCTTCGTCTTCCTCTTCGGATGGTGTGGGTGCTCTTTTAC
CTTCCTTAAAGGTGACTGATGTGGCAAACCTCCTTCGTGATGTGAAGGAGAAGCCCAGCGATGTCTTTATCCCCCCTCCCCTCCGTCGTTGTGGTCATCCCGTTGTACGG
TTGTGTGGTGCCCTCCGGTTGCTCCGGTTTGTCCCAAATGTAGCAGATGTTGATATGGCCGAGAGCTTGGCTACTCAAGTAGGTTCTTGGGTTGTAGATACGGTTGCTTT
GGGGTGGTTATGTATGGAGGAGATCGTTGATTTATGGCGAGGGCTTACCTTAACGGGCGAAGAGGAGACAACTTTTCGAGTTGATAAGGGTTGCCCTGTGCTATCGGCGG
ATCGGTGTATGTCCAAAGTGATGGCCAACCAATTAGGGGCATTACTGGGTCAGGTTGAATTCGTGGATTGCAATGGGGTGGATAATTGGGTTGGTGCTTTTTTACGGCTC
CGAGTGAAGGTGGACATCGGACAACCGCTTCGACGGGGTCTCAAGATCCAGTTGGACTCGGGGAAGGAGTTGTGGTGCCTCATCCAGTATGAGAAGCTACCTGACTTTTG
CTATCCGTGTGGGCGATTGGGGCATACGTTGCGGGAGTGTGGCGATCCGGTGGGTGATGCGGGGGTTGAACGGGGGAGTCCGGGAACGGTGGAGGTGCTGTCGGGGGTTC
CGATAGTTGTGGGCGCGGTGTCGGAGGTTGGACCTGGGGTTTCTTCGCAGCCGGTGGTGGGTGAGTCGACGGTAGTGGCGTTGGTTTCGTCGGGCAGAGGATCCCCACCA
CCACGTTCTCCTATTTTAGCTGACAACGATGCACAGGTTGATGGTGGGGGGTTGCCTTTGTCAGTTACGTGTTCTCCAGGTGGTGGGAAGGATTTGGGCTGCATATTGCA
CTGTGACAGTGAAGGGCAGAATGGTGCTCCTGTTTCTAAAAAGCTTTGTTCTGAGGTTTCTGATGTTGGTGATTTTGCTGGACGGCTCGGGGGTCATGGCAATATCTGTT
CTTCTTCTCCTCTGGCAGCGGCTGGTCGAAGCGGTGGGTTAGCGCTTTTGTGGCATTGTGCATTGAATTTCAGATTGATTTCTTTCTCTGCCTCTCATATTAACGGCTGG
GTGGACCATGGTGGCACTCAGTGGCGTCTCACGGGTTTCTACGGCAGTCCTGTTGAACATTCGCGCCCCGCCACCTCGACTCTATTGAAACACCTCCGTGGGAGTGATGA
TGTACCATGGCTTATCAGGGGAGATTTTAATGAGATTATTCACCATCATGAGAAAACTTGGGGGACGCACAAAGCCGGGCAGGCCATTGAGGTTCAAATTGCAAACATGT
CGGTTGTACCTCGGAAATACATTAATAAAATTCCTCGGTTTGAGGATACTTCTATTGACCCACGTCTCGATCCAAAACCACGTGTCCGCTTCTATTACTTCTTGTTTCTT
CTTGCGGTGAAGTCTCTTCTTGCAGTGAGTTTTCATGGCGAATCCAATATCTCGAAGGATCTTGCTTCTCCAATATTTCTTCTGTCCAATATACGCAATTTTGTCTCTAT
TAGACTCGATTCATCCAATTTTGTTCTATGGAAGTTTCAGTTAACTTCGATTCTAAAAGCGCACAAGTTATTCGGCTTTGTTGATGGTTCAACCAAGAAGCCATCTCTAT
TTCTGAACGATGATTCTACTCAACCTAATCCAGCGTATGAAGATTGGATCGCCAAGGATCATGCCCTAATGACCTTGATCAATGCAACACTTTCTACTACTACTTTAACT
TTTGTTGTTGGCTGTGCAAATTCTCAGGAAGTGTGGTCTACGCTTGTGAAGCATTACTCATCGGATTCCAGATTGAATGTTGTGAATCTCAAAACAAATCTCCAATCGAT
TGTAAAGAAGTCGTCCGAGACCATTGATCAGTATGTTCAACGCGTCAAAGAACTTAAAGATAAGCTTGCTAACATCTCGGTTGTTATTGAGGATGAGGATCTTATCATTT
ATACATGA
Protein sequence
MGAEYRALCFDNEVHNVPFKCMTQDMAVKLVVFLDRLLLWIVWCPIQYEKLLDFCYSCGRIGHSLRECVTAVPPSESSARLQYGERLRASVFKKAEGRRGGARSSSSSAE
GGWDKDDEVCGGSGGLAGECRRGPPVFASVAPVVKARLFPMRHPVDFVNGNALPPKVGGGGVVSGFLSNLSDSSLRPRAAGPIRKSKWKPAARDQLTTSNSSEVPEEPIM
GKRASSSYLAISRAPDFKKARSSDVGSFDSIVAHGSAVVSFPSLLAEAGMMSRGLLVGILMKLCIIMRSLGAFKPNRALSDFHLAIDDCGLQNMGFQVDRFTWINRRDKG
EIIFERLDRCFCTTAWWLLFPFVRSSIWTTIFRTTALWLVILGQGRLEARLVVHLYSVLKMLGCPKRIVNPSFFTAGILSRRLAHQRVLSAESRLESLLLEEKVYWKQRF
RECWLKCGDQNTKWFHNRASFRRQVNTIESLEDSIGRWVQGPTKIQDNMNHRLLRPFTVDEVVQALKQIHPAKAPGPDGLPGFFYGNFLEVVGMNVISNCLQVVVHLFGG
ALFGDVDFYIRTSVGGWAMPKIFGSIKIGGCLRRVSSEYNLFRPCQRTVKVAQHADIHRSPSSSSSDGVGALLPSLKVTDVANLLRDVKEKPSDVFIPPPLRRCGHPVVR
LCGALRLLRFVPNVADVDMAESLATQVGSWVVDTVALGWLCMEEIVDLWRGLTLTGEEETTFRVDKGCPVLSADRCMSKVMANQLGALLGQVEFVDCNGVDNWVGAFLRL
RVKVDIGQPLRRGLKIQLDSGKELWCLIQYEKLPDFCYPCGRLGHTLRECGDPVGDAGVERGSPGTVEVLSGVPIVVGAVSEVGPGVSSQPVVGESTVVALVSSGRGSPP
PRSPILADNDAQVDGGGLPLSVTCSPGGGKDLGCILHCDSEGQNGAPVSKKLCSEVSDVGDFAGRLGGHGNICSSSPLAAAGRSGGLALLWHCALNFRLISFSASHINGW
VDHGGTQWRLTGFYGSPVEHSRPATSTLLKHLRGSDDVPWLIRGDFNEIIHHHEKTWGTHKAGQAIEVQIANMSVVPRKYINKIPRFEDTSIDPRLDPKPRVRFYYFLFL
LAVKSLLAVSFHGESNISKDLASPIFLLSNIRNFVSIRLDSSNFVLWKFQLTSILKAHKLFGFVDGSTKKPSLFLNDDSTQPNPAYEDWIAKDHALMTLINATLSTTTLT
FVVGCANSQEVWSTLVKHYSSDSRLNVVNLKTNLQSIVKKSSETIDQYVQRVKELKDKLANISVVIEDEDLIIYT
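
The protein sequence can be checked against the CDS by translating it codon by codon. The Python sketch below does this under the assumption that the standard genetic code applies. The names `CODON_TABLE` and `translate` are illustrative and not part of any CuGenDB tooling. It is run here only on the first 21 and the last 12 nucleotides of the CDS listed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a CDS in frame 1, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # raises KeyError on non-ACGT symbols
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGGTGCTGAATACAGGGCC"))  # start of the CDS
print(translate("ATTTATACATGA"))           # end of the CDS, including the stop codon
```

The two fragments translate to `MGAEYRA` and `IYT`, which match the beginning and end of the protein sequence listed above.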