CuGenDBv2

Spg028411 (gene) of Sponge gourd (cylindrica) v1 genome

Gene ID: Spg028411
Organism: Luffa cylindrica (Sponge gourd (cylindrica) v1)
Description: CCHC-type domain-containing protein
Genome location: scaffold7:17566896..17578486
RNA-Seq Expression: Spg028411
Synteny: Spg028411
Gene Ontology terms:
    GO:0003676 - nucleic acid binding (molecular function)
    GO:0004523 - RNA-DNA hybrid ribonuclease activity (molecular function)
    GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
    IPR001878 - Zinc finger, CCHC-type
    IPR002156 - Ribonuclease H domain
    IPR025836 - Zinc knuckle CX2CX4HX4C
    IPR036691 - Endonuclease/exonuclease/phosphatase superfamily
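
The genome location field above can be split into its parts programmatically. The following is a minimal sketch, assuming the string follows a `<sequence>:<start>..<end>` layout with 1-based, inclusive coordinates; that convention is common but is not stated on this page, and the function names `parse_location` and `span_length` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumption: "<sequence>:<start>..<end>", 1-based and inclusive.
LOCATION_RE = re.compile(r"^(?P<seqid>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_location(text):
    """Split a location string into (sequence id, start, end)."""
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return match["seqid"], int(match["start"]), int(match["end"])


def span_length(start, end):
    """Length of the interval in bases, counting both end points."""
    return abs(end - start) + 1


seqid, start, end = parse_location("scaffold7:17566896..17578486")
print(seqid, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption, the gene spans 11591 bases on scaffold7.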


Homology

GenBank top hits    e value    % identity    Alignment
KAF4381998.1 hypothetical protein G4B88_006630 [Cannabis sativa]    1.1e-17    33.13
Query:  LKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQ
        + E   G D     + +  N +D C L DLGF G   TW N+R GG  + ERLDR   +  WHD++    V   D+  SDHRPI   L       R   +
Subjt:  LKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQ

Query:  RIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE
        R  RF+  WL  P  Q+++  SW     D  L     L ++   C   +  W +SK G+ P+ + E
Subjt:  RIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE

KAF4383622.1 hypothetical protein F8388_014122 [Cannabis sativa]    1.1e-17    33.13
Query:  LKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQ
        + E   G D     + +  N +D C L DLGF G   TW N+R GG  + ERLDR   +  WHD++    V   D+  SDHRPI   L       R   +
Subjt:  LKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQ

Query:  RIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE
        R  RF+  WL  P  Q+++  SW     D  L     L ++   C   +  W +SK G+ P+ + E
Subjt:  RIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE

TXG57064.1 hypothetical protein EZV62_018377 [Acer yangbiense]    8.8e-20    31.76
Query:  RLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSG
        R+KE   G +   + +     VID C L+DLGF G K+TW NRR G   + ER+DR L+   W D++    V +L Y  SDHRP+ L  +      R++ 
Subjt:  RLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSG

Query:  QRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRR---CMCSVAGWGRSKMGNFPQHISE
        ++  +F+  WL +    +++R++W +      L VP  L ++ R+   C   ++ W  +K G+  + I E
Subjt:  QRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRR---CMCSVAGWGRSKMGNFPQHISE

XP_022150918.1 uncharacterized protein LOC111018954 [Momordica charantia]    1.9e-19    26.65
Query:  EDRLIWHFEKHGMFSVKSGYKLAYSLASQVC---PSSSDSDR---------------------WQTCF-------------------------AVEDGLH
        EDRLIW++EK G++SV+SGYK+A  L +  C   PSSS S+                      W+ C                            ED +H
Subjt:  EDRLIWHFEKHGMFSVKSGYKLAYSLASQVC---PSSSDSDR---------------------WQTCF-------------------------AVEDGLH

Query:  LFWKCFVTKEMWLCSKFSRL---------YQSLYHLDFVD---VIWALKE---------------KLGLLDFELVTIFCQ--RESCYSSQPHPIRQVEQG
        LFW C   + +W+ SKF +L         ++SL   DF +   VIW L                 K+G+   E    +    RE+  +     +    + 
Subjt:  LFWKCFVTKEMWLCSKFSRL---------YQSLYHLDFVD---VIWALKE---------------KLGLLDFELVTIFCQ--RESCYSSQPHPIRQVEQG

Query:  AWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVEGWALVKGVELALQM----------EVGLLMDDVRRLLHPCVSGK
         W PP    +K+NTDAS        G G ++ +  G V+ AA   L    SVD+ E  A V+G++LA ++          E G ++   +      +   
Subjt:  AWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVEGWALVKGVELALQM----------EVGLLMDDVRRLLHPCVSGK

Query:  VLFTPRQGNKVAH---APACLTFSYSDCVWLEEWPSEISAVLTCDATDK
          F  R+GNK AH     A L   +S  +W+E+WP E+ + L  +  ++
Subjt:  VLFTPRQGNKVAH---APACLTFSYSDCVWLEEWPSEISAVLTCDATDK

XP_022150918.1 uncharacterized protein LOC111018954 [Momordica charantia]    1.4e-09    36.3
Query:  DGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADSDGQFLFGDWLRVVPFLRVVANAPEE---GSGQSDSQGGRGTGMSSRGRGWGQMGPFEVVG
        DG  LWC L+YE+L DFCY CGR+GHS RE  +       +G   +G WLR     + + N  EE     G+S     RG+G   RG  W + G  E  G
Subjt:  DGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADSDGQFLFGDWLRVVPFLRVVANAPEE---GSGQSDSQGGRGTGMSSRGRGWGQMGPFEVVG

Query:  EERPDSHLVSGLVADPVVEPRTESELASETLVPLV
        E+    H       D   +P      A +  VPLV
Subjt:  EERPDSHLVSGLVADPVVEPRTESELASETLVPLV

XP_022150918.1 uncharacterized protein LOC111018954 [Momordica charantia]    8.2e-18    23.46
Query:  ERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLS-PQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSV
        ERLD  + +  WH  + +  +N+LD++ SDH+ I++VL        +++  +   F   WL +P    ++ D+W  +  +   +  L+  +    C   +
Subjt:  ERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLS-PQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSV

Query:  AGW-GRSKMGNFPQHISEDRL-------------IWHFEKHGMFSVKSGYKLAYSLASQVCPSSSDSDR----WQTCFAVE---DGLHLFWKCFVTKEMW
        + W  ++  G+FPQ I +D L              W     G +SVK+GY +A +  +   PSSS++      W + ++++      H  ++    K + 
Subjt:  AGW-GRSKMGNFPQHISEDRL-------------IWHFEKHGMFSVKSGYKLAYSLASQVCPSSSDSDR----WQTCFAVE---DGLHLFWKCFVTKEMW

Query:  LCSKFSRLYQSLYHLDFVDVIWALKEK-LGLLDFELVTIFCQRESCYSSQPHPIRQVEQGAWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLL
            F R  + +    FV        K +     E + ++ + +  + ++P P     +  W PP     KLN D +I  + G+ G G ++RD++G V+ 
Subjt:  LCSKFSRLYQSLYHLDFVDVIWALKEK-LGLLDFELVTIFCQRESCYSSQPHPIRQVEQGAWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLL

Query:  AACLALPRCWSVDLVEGWALVKGV
        A            + EGWAL++G+
Subjt:  AACLALPRCWSVDLVEGWALVKGV

TrEMBL top hits    e value    % identity    Alignment
A0A2N9IXK4 RNase H domain-containing protein    4.2e-20    23.84
Query:  GDGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADS--DGQFLFGDWLRVVPFL-------RVVANAPEEGSGQSDSQGGRGTGMSSRGRGWGQM
        GD   +  S +YE+L +FCY CG I H  ++CS      D+    +  +G WLR  P L        V           + S     TG     +   Q 
Subjt:  GDGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADS--DGQFLFGDWLRVVPFL-------RVVANAPEEGSGQSDSQGGRGTGMSSRGRGWGQM

Query:  GPFEVVGEERPDSHLVSGLVADPVVEPR-------TESELASETLVPLVTTHTVLNDVCLTSVDKGKVVASENSKTYMIDVN----------VVPVKKS-
           ++ G+  P   + +    DP++  +       TE++  S+ L   +          +  +++  + +S  S    ++V           VVP ++  
Subjt:  GPFEVVGEERPDSHLVSGLVADPVVEPR-------TESELASETLVPLVTTHTVLNDVCLTSVDKGKVVASENSKTYMIDVN----------VVPVKKS-

Query:  ------WKWLARASLKDITNQLSTPTVSRHKRQA----------QGHPPDEVGSASKRLKEVGSGVDLCDVD--------------------VMDMANVI
              WK  A  S+K  ++      +   +  +          + H   E  S  + L    S    C  D                    + D  + I
Subjt:  ------WKWLARASLKDITNQLSTPTVSRHKRQA----------QGHPPDEVGSASKRLKEVGSGVDLCDVD--------------------VMDMANVI

Query:  DSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDS
        D CG  DLGF G   TW N R G  T++ERLDR L++ +W  ++    V +L    SDH PI    SP PS   RS  RI RF+E WL  P  ++ +  +
Subjt:  DSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDS

Query:  WGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHI-SEDRLIWHFEKHGMFSVKSGYKLAYSLASQV
        W    + +G T    + +  R C  S+  W R   GN    +  + +++   E   M     G+  A++L  +V
Subjt:  WGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHI-SEDRLIWHFEKHGMFSVKSGYKLAYSLASQV

A0A5C7HJN1 Uncharacterized protein    4.2e-20    31.76
Query:  RLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSG
        R+KE   G +   + +     VID C L+DLGF G K+TW NRR G   + ER+DR L+   W D++    V +L Y  SDHRP+ L  +      R++ 
Subjt:  RLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSG

Query:  QRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRR---CMCSVAGWGRSKMGNFPQHISE
        ++  +F+  WL +    +++R++W +      L VP  L ++ R+   C   ++ W  +K G+  + I E
Subjt:  QRIARFDETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRR---CMCSVAGWGRSKMGNFPQHISE

A0A6J1DAR4 uncharacterized protein LOC111018954    9.4e-20    26.65
Query:  EDRLIWHFEKHGMFSVKSGYKLAYSLASQVC---PSSSDSDR---------------------WQTCF-------------------------AVEDGLH
        EDRLIW++EK G++SV+SGYK+A  L +  C   PSSS S+                      W+ C                            ED +H
Subjt:  EDRLIWHFEKHGMFSVKSGYKLAYSLASQVC---PSSSDSDR---------------------WQTCF-------------------------AVEDGLH

Query:  LFWKCFVTKEMWLCSKFSRL---------YQSLYHLDFVD---VIWALKE---------------KLGLLDFELVTIFCQ--RESCYSSQPHPIRQVEQG
        LFW C   + +W+ SKF +L         ++SL   DF +   VIW L                 K+G+   E    +    RE+  +     +    + 
Subjt:  LFWKCFVTKEMWLCSKFSRL---------YQSLYHLDFVD---VIWALKE---------------KLGLLDFELVTIFCQ--RESCYSSQPHPIRQVEQG

Query:  AWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVEGWALVKGVELALQM----------EVGLLMDDVRRLLHPCVSGK
         W PP    +K+NTDAS        G G ++ +  G V+ AA   L    SVD+ E  A V+G++LA ++          E G ++   +      +   
Subjt:  AWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVEGWALVKGVELALQM----------EVGLLMDDVRRLLHPCVSGK

Query:  VLFTPRQGNKVAH---APACLTFSYSDCVWLEEWPSEISAVLTCDATDK
          F  R+GNK AH     A L   +S  +W+E+WP E+ + L  +  ++
Subjt:  VLFTPRQGNKVAH---APACLTFSYSDCVWLEEWPSEISAVLTCDATDK

A0A6J1DAR4 uncharacterized protein LOC111018954    6.8e-10    36.3
Query:  DGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADSDGQFLFGDWLRVVPFLRVVANAPEE---GSGQSDSQGGRGTGMSSRGRGWGQMGPFEVVG
        DG  LWC L+YE+L DFCY CGR+GHS RE  +       +G   +G WLR     + + N  EE     G+S     RG+G   RG  W + G  E  G
Subjt:  DGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADSDGQFLFGDWLRVVPFLRVVANAPEE---GSGQSDSQGGRGTGMSSRGRGWGQMGPFEVVG

Query:  EERPDSHLVSGLVADPVVEPRTESELASETLVPLV
        E+    H       D   +P      A +  VPLV
Subjt:  EERPDSHLVSGLVADPVVEPRTESELASETLVPLV

A0A6J1DAR4 uncharacterized protein LOC111018954    2.7e-19    24.38
Query:  LWCSLQYERLLDFCYRCGRIGHSHRECS-----EEGEGADSDGQFLFGDWLRV------------VPFLRVVAN-APEEGSGQSDSQGGRGTGMSSRGRG
        LW  L+YERL DFCY CGR+GH  +EC+     EEG+GA       FG WL              + F+ V+   + E+  G S S+     G S+    
Subjt:  LWCSLQYERLLDFCYRCGRIGHSHRECS-----EEGEGADSDGQFLFGDWLRV------------VPFLRVVAN-APEEGSGQSDSQGGRGTGMSSRGRG

Query:  WGQMGPFEVVGEERPDSHL---------------VSGLVADPVVEPRTESELASETLVP----------------------------LVTTHTVLNDVCL
               E+V     D HL                S    D  +E   ++E AS  L P                            L+       D+ +
Subjt:  WGQMGPFEVVGEERPDSHL---------------VSGLVADPVVEPRTESELASETLVP----------------------------LVTTHTVLNDVCL

Query:  TSVDK----GKVVASENSKTYMIDVNVVPVKKSWKWLARASLKDITNQLSTPTVSRHKRQAQGHPPDEVG--SASKRLKEVGSGVDLCDVDVMDMANVID
        T  +K      V +S  ++ +M+ V   P  KS + L    LK +++    P +              VG  +  K   E   G         ++   + 
Subjt:  TSVDK----GKVVASENSKTYMIDVNVVPVKKSWKWLARASLKDITNQLSTPTVSRHKRQAQGHPPDEVG--SASKRLKEVGSGVDLCDVDVMDMANVID

Query:  SCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSW
         C L+DL F GN  TW N+R G   I +RLDR +++V W  ++    V +L    SDH PI +    + S    SG +  +F + W       +++R++W
Subjt:  SCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSW

Query:  GMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE
          ++  +G +   IL     +    +  W ++  G   + I E
Subjt:  GMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISE

A0A7N2LCX2 Uncharacterized protein    2.7e-19    25.93
Query:  LSSKPVNADAFHRVMLSVWSVYRSTRIEPLEDNIFV-IRAIGQVVEVFGEGQVK--GDGFSLWCSLQYERLLDFCYRCGRIGHSHRECSE--EGEGADSD
        L  K +NA A  ++   + +V  ST  +  E   FV IR    V      G++   G    +W S +YERLL+ CY C    H +++C +    EG  + 
Subjt:  LSSKPVNADAFHRVMLSVWSVYRSTRIEPLEDNIFV-IRAIGQVVEVFGEGQVK--GDGFSLWCSLQYERLLDFCYRCGRIGHSHRECSE--EGEGADSD

Query:  GQFLFGDWLRVVPFL---RVVANAPEEGSGQSDSQGGRGTGMSSRGRGWGQMGPFEVV--------GEER----PDSHLVSGLVADPVVEPRTESELASE
         Q LFG  LR  PF    + V + P     +  S        S+  +   +MG   +         GEER    P   ++S +  +      T+    S 
Subjt:  GQFLFGDWLRVVPFL---RVVANAPEEGSGQSDSQGGRGTGMSSRGRGWGQMGPFEVV--------GEER----PDSHLVSGLVADPVVEPRTESELASE

Query:  TLVPLVTTHTVLNDVCLTSVDK-----------GKVVASE--NSKTYMIDVNVVPVKK-SWKWL-------------ARASLKDITNQLSTPTVSRHKRQ
        +   L+ T +   +V  T +DK              V  E  +S  Y ID  +  V +  W++              A  SL  + +   TP +      
Subjt:  TLVPLVTTHTVLNDVCLTSVDK-----------GKVVASE--NSKTYMIDVNVVPVKK-SWKWL-------------ARASLKDITNQLSTPTVSRHKRQ

Query:  AQGHPPDEVGSASKRLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPI-
              +++G A+K+ ++           +    +VID CG LDLGFVGN+ TW      G +I+ERLDR L++  W   +    V +L     DH P+ 
Subjt:  AQGHPPDEVGSASKRLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPI-

Query:  --ELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSW---GMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISEDR-LIWHFEKHGM
           L L P P       +RI RF+E WL+  +  ++V  SW    M  +D       IL  V   C   +A W  +  GN  + + + + L+   E   +
Subjt:  --ELVLSPQPSCWRRSGQRIARFDETWLVQPYLQQLVRDSW---GMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISEDR-LIWHFEKHGM

Query:  FSVKSGYKL
         S ++G  L
Subjt:  FSVKSGYKL

SwissProt top hits    e value    % identity    Alignment
No hits found

Arabidopsis top hits    e value    % identity    Alignment
AT2G34320.1 Polynucleotidyl transferase, ribonuclease H-like superfamily protein    2.6e-06    33.33
Query:  WTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVE----GWALVKG---------VELALQMEVGLL------------M
        W  P     K NTDA+ + E    G G +LR+ SG VL     ALPR  +V   E     WA++            E   Q  V LL            +
Subjt:  WTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVLLAACLALPRCWSVDLVE----GWALVKG---------VELALQMEVGLL------------M

Query:  DDVRRLLHPCVSGKVLFTPRQGNKVAHAPACLTFSYSD
        +D+++LLH     K  FTPR GNKVA   A  + S+S+
Subjt:  DDVRRLLHPCVSGKVLFTPRQGNKVAHAPACLTFSYSD
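
In the alignment blocks above, the unlabelled middle line marks identical residues with a letter and conservative substitutions with `+`. As a rough sketch of how that line relates to an identity figure, the snippet below counts both kinds of position for one aligned segment. It assumes the usual BLAST-style convention that percent identity is identities divided by alignment length; `midline_stats` is a hypothetical helper, and a figure computed on a single segment will generally differ from the per-hit values listed in the tables, which presumably cover the whole alignment.

```python
def midline_stats(query, midline):
    """Count identities and positives for one aligned segment.

    query   -- the aligned query segment (gaps, if any, written as '-')
    midline -- the match line printed between Query and Subjt
    """
    # Trailing spaces of the match line are often trimmed; pad it back.
    midline = midline.ljust(len(query))
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    length = len(query)
    return {
        "length": length,
        "identities": identities,
        "positives": positives,
        "pct_identity": round(100.0 * identities / length, 2),
    }


# First ten columns of the first GenBank hit shown above.
print(midline_stats("LKEVGSGVDL", "+ E   G D"))
```

The counts are taken from the match line rather than by comparing the Query and Subjt rows, because the match line is sufficient on its own to recover identities and positives.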


Sequences

CDS sequence
ATGGCTCTTTCTCAAGGTTTTCGACGGCGTTCCTGTGTTTTGCCCAAGGTTGCGGTTTCTTTGTCGGCCTTCTCTCAAGATCTTCGACGGCGGTTCTATGACTCTCAAGG
TTATGATTCTTCGACGGCGGTTTCTGAAATTTTCGTGGTTCTTGACGGTGGTTCCGGTGACTCGCAAAGTGATCTCAACGGTGAGTTTCAGGGATCTCGACGGTGTGTTT
CAGTGGTCCTCGACGTGGTCCTCGACGATGAGATATGTTCCTTCGACGGTGATTTCTGTGATTTGTTCGACAGTGATTGCTTGATGTTCTATGGTTCTGTTTTGTGTGGT
TTATGTGGTTTTGTGTTGTCAGTGGTTTTGTGTAGTTCTGTGGTTCTGTGTGGTTCGCAGTGTGCTCTTATGGAATATCTTGTGATCCAATGGGAAAACATGGCGTTATC
AGAGGTAGAGACAACGGTGGTCCCTGTGTCAGCAGACATCCTCCTGTTGGATGAAACGACGGTCCAACTTTGTGCCATCGGAAAGGTTCTCTCATCCAAACCTGTGAATG
CTGATGCCTTTCACAGAGTCATGTTATCGGTTTGGAGTGTTTATCGTTCTACCCGGATTGAGCCATTGGAGGATAATATCTTCGTGATTCGTGCGATTGGTCAAGTGGTT
GAAGTATTTGGGGAAGGTCAAGTTAAAGGAGATGGGTTTTCTCTTTGGTGCTCGCTGCAATATGAACGTTTGTTAGATTTTTGTTATAGATGTGGGCGTATTGGCCATTC
GCATAGGGAGTGCTCTGAGGAGGGGGAAGGTGCAGATTCTGATGGTCAGTTTCTGTTTGGTGACTGGTTGCGGGTTGTTCCATTTCTGCGTGTTGTTGCTAATGCTCCGG
AAGAGGGTAGTGGGCAATCGGATAGCCAGGGGGGTCGGGGAACGGGCATGTCTAGTAGAGGGAGAGGGTGGGGTCAAATGGGTCCTTTTGAGGTTGTGGGTGAAGAGAGG
CCTGATTCGCATCTGGTGTCTGGCCTGGTGGCTGATCCGGTTGTTGAACCGAGGACTGAATCTGAGTTGGCTTCGGAAACCTTGGTTCCTTTAGTTACTACCCATACAGT
CCTTAATGATGTTTGTTTGACTTCTGTGGATAAAGGTAAGGTTGTGGCTAGTGAGAACTCCAAGACGTATATGATTGATGTGAACGTTGTTCCGGTCAAGAAGAGTTGGA
AGTGGCTGGCCAGAGCCTCTTTGAAGGACATTACCAATCAGTTATCCACTCCCACTGTAAGTAGGCACAAGCGTCAAGCTCAGGGGCACCCGCCTGATGAGGTTGGGTCG
GCTTCCAAACGGTTGAAGGAGGTGGGGTCTGGTGTTGATTTGTGTGATGTTGATGTGATGGATATGGCGAATGTGATCGATTCATGTGGACTTCTTGATTTGGGTTTTGT
GGGGAACAAACTCACATGGCGCAACAGACGGCCTGGTGGTGGAACGATTTATGAGCGATTGGATAGGTGTTTGAGCTCTGTCACTTGGCATGATATCTATCTGAATTGTG
TGGTTAACTACCTTGACTACTACCAGTCGGATCATCGTCCGATTGAACTGGTTCTTTCCCCGCAACCTAGTTGTTGGAGACGGTCGGGTCAGCGAATTGCTCGGTTTGAT
GAGACTTGGCTTGTGCAGCCATATTTGCAGCAGCTGGTTAGGGATTCATGGGGGATGAGCGAGAAGGATTATGGTTTGACGGTGCCTCTGATTTTGGCTAATGTATCCAG
GAGGTGTATGTGCTCGGTGGCTGGTTGGGGTCGTTCAAAGATGGGGAACTTCCCTCAGCACATTAGTGAGGATCGGCTTATTTGGCATTTTGAGAAGCATGGTATGTTTT
CTGTCAAGAGTGGCTATAAGCTAGCCTACTCCTTGGCATCTCAAGTGTGTCCGTCTTCTTCTGACTCTGATCGCTGGCAGACTTGCTTTGCTGTGGAGGACGGTCTCCAT
CTTTTCTGGAAGTGCTTTGTGACTAAAGAAATGTGGCTCTGTTCAAAATTTTCTCGTCTTTATCAGTCACTATATCATTTGGATTTTGTTGATGTCATCTGGGCATTGAA
GGAGAAGTTGGGTCTACTAGACTTTGAACTTGTGACAATATTTTGCCAGCGTGAGTCTTGCTACAGTTCGCAACCTCACCCCATTCGACAAGTAGAGCAGGGTGCATGGA
CTCCTCCGGTGGCCAGTGAGTTTAAACTCAACACCGATGCTTCTATCAGGCCTGAAACGGGTGAAGTGGGGGGAGGTTGTGTTCTTCGTGATGCATCTGGTGCAGTGCTC
CTGGCGGCGTGCTTGGCCTTGCCTAGGTGCTGGAGTGTGGATCTCGTTGAAGGCTGGGCATTGGTGAAGGGCGTGGAGTTAGCGTTACAGATGGAGGTTGGCCTGCTGAT
GGATGATGTCCGACGACTTCTCCATCCTTGTGTTAGTGGCAAGGTTCTTTTTACACCACGACAGGGGAACAAAGTGGCTCATGCTCCAGCTTGTTTGACCTTCTCTTATT
CTGATTGTGTTTGGCTTGAAGAGTGGCCTAGTGAAATCTCTGCTGTGTTAACTTGTGATGCCACGGATAAGGCTGGGTACCTTATCCTGGTGACACTATGTGATACGGCC
CACCTTGTATCCGATACAGATGCAATGATCCAAAGCATCCATATAGGAGACATGCGGGTGGGGGTATCCTATGCAATGAGGTGGGGTGGTAGAATGTGCATGCTAGTGGT
TTTATGCTGGGTCTGGATGCATCTGAGCACCAGAATACCTTATGCATGTTACGCCCTCTCTTTTGTTTTGGTTGAAGTGAAGCTTACTCTCAAGACAAGACCTAGAATTC
TCTCTAGCCTCCCTCTTAGAGAAAGACTCCTACAAGTCTTTTGCCTCCTAGACCTAGAGTCATACCGGTGTAACCTCTGTGGTTATTGTGTCATTCAAGAAGAATTTTCC
AGCGATCACAAGACAAAGGGGGCTGCTGCGTTTTGTTCGTTGGAGCATCGTTGGCGACGAACGGTCAAGTCTACAACGGAGGGGGAAGATAGGTGGAGGGTCCCTATAAA
TCATTCCCAGACTATTAGAGGCGACTTTAGGAGAGGCCTAAGTTCTGAATTCGGGACTGATTTTCGTGTTGATAATGAGTTAGCCCGGAGCAAAACCAGTAGTTAG

mRNA sequence
ATGGCTCTTTCTCAAGGTTTTCGACGGCGTTCCTGTGTTTTGCCCAAGGTTGCGGTTTCTTTGTCGGCCTTCTCTCAAGATCTTCGACGGCGGTTCTATGACTCTCAAGG
TTATGATTCTTCGACGGCGGTTTCTGAAATTTTCGTGGTTCTTGACGGTGGTTCCGGTGACTCGCAAAGTGATCTCAACGGTGAGTTTCAGGGATCTCGACGGTGTGTTT
CAGTGGTCCTCGACGTGGTCCTCGACGATGAGATATGTTCCTTCGACGGTGATTTCTGTGATTTGTTCGACAGTGATTGCTTGATGTTCTATGGTTCTGTTTTGTGTGGT
TTATGTGGTTTTGTGTTGTCAGTGGTTTTGTGTAGTTCTGTGGTTCTGTGTGGTTCGCAGTGTGCTCTTATGGAATATCTTGTGATCCAATGGGAAAACATGGCGTTATC
AGAGGTAGAGACAACGGTGGTCCCTGTGTCAGCAGACATCCTCCTGTTGGATGAAACGACGGTCCAACTTTGTGCCATCGGAAAGGTTCTCTCATCCAAACCTGTGAATG
CTGATGCCTTTCACAGAGTCATGTTATCGGTTTGGAGTGTTTATCGTTCTACCCGGATTGAGCCATTGGAGGATAATATCTTCGTGATTCGTGCGATTGGTCAAGTGGTT
GAAGTATTTGGGGAAGGTCAAGTTAAAGGAGATGGGTTTTCTCTTTGGTGCTCGCTGCAATATGAACGTTTGTTAGATTTTTGTTATAGATGTGGGCGTATTGGCCATTC
GCATAGGGAGTGCTCTGAGGAGGGGGAAGGTGCAGATTCTGATGGTCAGTTTCTGTTTGGTGACTGGTTGCGGGTTGTTCCATTTCTGCGTGTTGTTGCTAATGCTCCGG
AAGAGGGTAGTGGGCAATCGGATAGCCAGGGGGGTCGGGGAACGGGCATGTCTAGTAGAGGGAGAGGGTGGGGTCAAATGGGTCCTTTTGAGGTTGTGGGTGAAGAGAGG
CCTGATTCGCATCTGGTGTCTGGCCTGGTGGCTGATCCGGTTGTTGAACCGAGGACTGAATCTGAGTTGGCTTCGGAAACCTTGGTTCCTTTAGTTACTACCCATACAGT
CCTTAATGATGTTTGTTTGACTTCTGTGGATAAAGGTAAGGTTGTGGCTAGTGAGAACTCCAAGACGTATATGATTGATGTGAACGTTGTTCCGGTCAAGAAGAGTTGGA
AGTGGCTGGCCAGAGCCTCTTTGAAGGACATTACCAATCAGTTATCCACTCCCACTGTAAGTAGGCACAAGCGTCAAGCTCAGGGGCACCCGCCTGATGAGGTTGGGTCG
GCTTCCAAACGGTTGAAGGAGGTGGGGTCTGGTGTTGATTTGTGTGATGTTGATGTGATGGATATGGCGAATGTGATCGATTCATGTGGACTTCTTGATTTGGGTTTTGT
GGGGAACAAACTCACATGGCGCAACAGACGGCCTGGTGGTGGAACGATTTATGAGCGATTGGATAGGTGTTTGAGCTCTGTCACTTGGCATGATATCTATCTGAATTGTG
TGGTTAACTACCTTGACTACTACCAGTCGGATCATCGTCCGATTGAACTGGTTCTTTCCCCGCAACCTAGTTGTTGGAGACGGTCGGGTCAGCGAATTGCTCGGTTTGAT
GAGACTTGGCTTGTGCAGCCATATTTGCAGCAGCTGGTTAGGGATTCATGGGGGATGAGCGAGAAGGATTATGGTTTGACGGTGCCTCTGATTTTGGCTAATGTATCCAG
GAGGTGTATGTGCTCGGTGGCTGGTTGGGGTCGTTCAAAGATGGGGAACTTCCCTCAGCACATTAGTGAGGATCGGCTTATTTGGCATTTTGAGAAGCATGGTATGTTTT
CTGTCAAGAGTGGCTATAAGCTAGCCTACTCCTTGGCATCTCAAGTGTGTCCGTCTTCTTCTGACTCTGATCGCTGGCAGACTTGCTTTGCTGTGGAGGACGGTCTCCAT
CTTTTCTGGAAGTGCTTTGTGACTAAAGAAATGTGGCTCTGTTCAAAATTTTCTCGTCTTTATCAGTCACTATATCATTTGGATTTTGTTGATGTCATCTGGGCATTGAA
GGAGAAGTTGGGTCTACTAGACTTTGAACTTGTGACAATATTTTGCCAGCGTGAGTCTTGCTACAGTTCGCAACCTCACCCCATTCGACAAGTAGAGCAGGGTGCATGGA
CTCCTCCGGTGGCCAGTGAGTTTAAACTCAACACCGATGCTTCTATCAGGCCTGAAACGGGTGAAGTGGGGGGAGGTTGTGTTCTTCGTGATGCATCTGGTGCAGTGCTC
CTGGCGGCGTGCTTGGCCTTGCCTAGGTGCTGGAGTGTGGATCTCGTTGAAGGCTGGGCATTGGTGAAGGGCGTGGAGTTAGCGTTACAGATGGAGGTTGGCCTGCTGAT
GGATGATGTCCGACGACTTCTCCATCCTTGTGTTAGTGGCAAGGTTCTTTTTACACCACGACAGGGGAACAAAGTGGCTCATGCTCCAGCTTGTTTGACCTTCTCTTATT
CTGATTGTGTTTGGCTTGAAGAGTGGCCTAGTGAAATCTCTGCTGTGTTAACTTGTGATGCCACGGATAAGGCTGGGTACCTTATCCTGGTGACACTATGTGATACGGCC
CACCTTGTATCCGATACAGATGCAATGATCCAAAGCATCCATATAGGAGACATGCGGGTGGGGGTATCCTATGCAATGAGGTGGGGTGGTAGAATGTGCATGCTAGTGGT
TTTATGCTGGGTCTGGATGCATCTGAGCACCAGAATACCTTATGCATGTTACGCCCTCTCTTTTGTTTTGGTTGAAGTGAAGCTTACTCTCAAGACAAGACCTAGAATTC
TCTCTAGCCTCCCTCTTAGAGAAAGACTCCTACAAGTCTTTTGCCTCCTAGACCTAGAGTCATACCGGTGTAACCTCTGTGGTTATTGTGTCATTCAAGAAGAATTTTCC
AGCGATCACAAGACAAAGGGGGCTGCTGCGTTTTGTTCGTTGGAGCATCGTTGGCGACGAACGGTCAAGTCTACAACGGAGGGGGAAGATAGGTGGAGGGTCCCTATAAA
TCATTCCCAGACTATTAGAGGCGACTTTAGGAGAGGCCTAAGTTCTGAATTCGGGACTGATTTTCGTGTTGATAATGAGTTAGCCCGGAGCAAAACCAGTAGTTAG

Protein sequence
MALSQGFRRRSCVLPKVAVSLSAFSQDLRRRFYDSQGYDSSTAVSEIFVVLDGGSGDSQSDLNGEFQGSRRCVSVVLDVVLDDEICSFDGDFCDLFDSDCLMFYGSVLCG
LCGFVLSVVLCSSVVLCGSQCALMEYLVIQWENMALSEVETTVVPVSADILLLDETTVQLCAIGKVLSSKPVNADAFHRVMLSVWSVYRSTRIEPLEDNIFVIRAIGQVV
EVFGEGQVKGDGFSLWCSLQYERLLDFCYRCGRIGHSHRECSEEGEGADSDGQFLFGDWLRVVPFLRVVANAPEEGSGQSDSQGGRGTGMSSRGRGWGQMGPFEVVGEER
PDSHLVSGLVADPVVEPRTESELASETLVPLVTTHTVLNDVCLTSVDKGKVVASENSKTYMIDVNVVPVKKSWKWLARASLKDITNQLSTPTVSRHKRQAQGHPPDEVGS
ASKRLKEVGSGVDLCDVDVMDMANVIDSCGLLDLGFVGNKLTWRNRRPGGGTIYERLDRCLSSVTWHDIYLNCVVNYLDYYQSDHRPIELVLSPQPSCWRRSGQRIARFD
ETWLVQPYLQQLVRDSWGMSEKDYGLTVPLILANVSRRCMCSVAGWGRSKMGNFPQHISEDRLIWHFEKHGMFSVKSGYKLAYSLASQVCPSSSDSDRWQTCFAVEDGLH
LFWKCFVTKEMWLCSKFSRLYQSLYHLDFVDVIWALKEKLGLLDFELVTIFCQRESCYSSQPHPIRQVEQGAWTPPVASEFKLNTDASIRPETGEVGGGCVLRDASGAVL
LAACLALPRCWSVDLVEGWALVKGVELALQMEVGLLMDDVRRLLHPCVSGKVLFTPRQGNKVAHAPACLTFSYSDCVWLEEWPSEISAVLTCDATDKAGYLILVTLCDTA
HLVSDTDAMIQSIHIGDMRVGVSYAMRWGGRMCMLVVLCWVWMHLSTRIPYACYALSFVLVEVKLTLKTRPRILSSLPLRERLLQVFCLLDLESYRCNLCGYCVIQEEFS
SDHKTKGAAAFCSLEHRWRRTVKSTTEGEDRWRVPINHSQTIRGDFRRGLSSEFGTDFRVDNELARSKTSS
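
The relationship between the CDS and the protein sequence listed above can be checked with a short translation routine. This is a minimal sketch, assuming the standard nuclear genetic code applies to this gene; `translate` is a hypothetical helper, and codons that contain anything other than A, C, G, or T are reported as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence in frame 1; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    usable = len(cds) - len(cds) % 3    # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First six codons of the CDS above.
print(translate("ATGGCTCTTTCTCAAGGT"))
```

The six codons translate to MALSQG, which matches the start of the protein sequence. Pasting the full CDS, line breaks included, into `translate` gives the whole translation, ending in `*` for the final TAG stop codon.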