CuGenDBv2

Lag0033161 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0033161
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Transposon TX1 uncharacterized 149 kDa protein
Genome location: chr11:41316156..41323885
RNA-Seq Expression: Lag0033161
Synteny: Lag0033161
Gene Ontology terms: NA
InterPro domains:
IPR000477 - Reverse transcriptase domain
IPR036691 - Endonuclease/exonuclease/phosphatase superfamily
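The genome location field packs the chromosome, start and end into a single token. A minimal Python sketch of splitting it apart follows. The function name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re

def parse_location(location):
    """Split a 'chrom:start..end' string into (chromosome, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)

chrom, start, end = parse_location("chr11:41316156..41323885")
# Assumption: 1-based inclusive coordinates, so the span is end - start + 1 bases.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the gene spans 7730 bases on chr11.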


Homology
GenBank top hits
KAG7011323.1 NHL repeat-containing protein 2 [Cucurbita argyrosperma subsp. argyrosperma] (e value: 1.4e-97; % identity: 81.78)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ+HH +AVSSL  SV+PS+VSEG+++RIL++GRHLLRFSTT ELQCESSPAND+LSFIKST DESEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DG KGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+TLS AEKS MIQFIMREYVSFPILLS KI EM  GLC
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

XP_022963868.1 uncharacterized protein LOC111464050 isoform X1 [Cucurbita moschata] (e value: 5.7e-96; % identity: 80.89)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ HH +AVSSL  SVAPS+VSEG+++RIL++GRHLLRFSTT ELQCESSP ND+LSFIKST D+SEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+T S AEKS MIQFIMREYVSFPILLS KI EM  G C
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

XP_022963870.1 uncharacterized protein LOC111464050 isoform X3 [Cucurbita moschata] (e value: 5.7e-96; % identity: 80.89)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ HH +AVSSL  SVAPS+VSEG+++RIL++GRHLLRFSTT ELQCESSP ND+LSFIKST D+SEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+T S AEKS MIQFIMREYVSFPILLS KI EM  G C
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

XP_022967190.1 uncharacterized protein LOC111466804 isoform X2 [Cucurbita maxima] (e value: 1.3e-95; % identity: 81.33)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  A NMAFRFRRL++IS+SLPQFYSGYYHQ+HH +AVSSL  SVA S+VSEG+D+RILD+G HL RFSTTTELQC+SSPANDILSFIKST DESEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE D IYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+TLS  EKS MIQFIMREYVSFPILLS KIFEM  GLC
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKD+SNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

XP_023511527.1 uncharacterized protein LOC111776330 [Cucurbita pepo subsp. pepo] (e value: 4.2e-99; % identity: 83.11)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ+HH +AVSSL  SVAPS+VSEG+++RIL++GRHLLRFSTTTELQCESSPAND+LSFIKST DESEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+TLS AEKS MIQFIMREYVSFPILLS KIFEM   LC
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

TrEMBL top hits
A0A1S3CQE6 uncharacterized protein LOC103503064 isoform X1 (e value: 8.7e-90; % identity: 79.28)
Query:  MAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEGPNHCWLNTS
        MAFRFRRLK+ISRSLPQ YSGYYHQ+HH Y VSSL LSVAP HVSEGID+R+ DNGRH  RFSTTTELQCESSP NDI SFI ST DESEGPNH WLNTS
Subjt:  MAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEGPNHCWLNTS

Query:  DGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLCYIISKDFSN
        +GNKGI E DG+YLILANQFLE TSSDS+ LVENVKFLQ RFP LHVIGFQC STLS AEKS MIQFIMREY+SFPILLS KIFE+A   C IISKD SN
Subjt:  DGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLCYIISKDFSN

Query:  PLLLGEKDMDLSILRKGNDGNH
        PLL+ E+DMDLSIL K  +  H
Subjt:  PLLLGEKDMDLSILRKGNDGNH

A0A6J1HJ63 uncharacterized protein LOC111464050 isoform X1 (e value: 2.8e-96; % identity: 80.89)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ HH +AVSSL  SVAPS+VSEG+++RIL++GRHLLRFSTT ELQCESSP ND+LSFIKST D+SEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+T S AEKS MIQFIMREYVSFPILLS KI EM  G C
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

A0A6J1HJ76 uncharacterized protein LOC111464050 isoform X3 (e value: 2.8e-96; % identity: 80.89)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  ATNMAFRFRRL++IS+SLPQFYSGYYHQ HH +AVSSL  SVAPS+VSEG+++RIL++GRHLLRFSTT ELQCESSP ND+LSFIKST D+SEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE DGIYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+T S AEKS MIQFIMREYVSFPILLS KI EM  G C
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKDFSNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

A0A6J1HTQ7 uncharacterized protein LOC111466804 isoform X2 (e value: 6.2e-96; % identity: 81.33)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  A NMAFRFRRL++IS+SLPQFYSGYYHQ+HH +AVSSL  SVA S+VSEG+D+RILD+G HL RFSTTTELQC+SSPANDILSFIKST DESEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE D IYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+TLS  EKS MIQFIMREYVSFPILLS KIFEM  GLC
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKD+SNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

A0A6J1HW28 uncharacterized protein LOC111466804 isoform X1 (e value: 6.2e-96; % identity: 81.33)
Query:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG
        MSFR  A NMAFRFRRL++IS+SLPQFYSGYYHQ+HH +AVSSL  SVA S+VSEG+D+RILD+G HL RFSTTTELQC+SSPANDILSFIKST DESEG
Subjt:  MSFR--ATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTELQCESSPANDILSFIKSTSDESEG

Query:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC
        PNH WLN  DGNKGISE D IYLILA+QFLE TSSDSVVLVENVKFLQHRFPQLHVIG QCS+TLS  EKS MIQFIMREYVSFPILLS KIFEM  GLC
Subjt:  PNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPILLSKKIFEMAPGLC

Query:  YIISKDFSNPLLLGEKDMDLSILRK
        YIISKD+SNPLL+ E+D DL++LRK
Subjt:  YIISKDFSNPLLLGEKDMDLSILRK

SwissProt top hits
O00370 LINE-1 retrotransposable element ORF2 protein (e value: 7.5e-14; % identity: 39.02)
Query:  LERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKK-KVASKVSDFRPISLVTMLYKSLSKVLA
        L RP    EI   I+ L  +KSPGPDGFT EF +     +   L K+FQ     GI+     E  I LIPK  +  +K  +FRPISL+ +  K L+K+LA
Subjt:  LERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKK-KVASKVSDFRPISLVTMLYKSLSKVLA

Query:  ERLRKVLPDTIDKAQMAFVEGRQ
         R+++ +   I   Q+ F+ G Q
Subjt:  ERLRKVLPDTIDKAQMAFVEGRQ

P08548 LINE-1 reverse transcriptase homolog (e value: 1.1e-12; % identity: 35.43)
Query:  QANWLERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKK-KVASKVSDFRPISLVTMLYKSLS
        +   L RP    EI   I  L  +KSPGPDGFT EF +     +   L  +FQ     GI+     E  I LIPK  K  ++  ++RPISL+ +  K L+
Subjt:  QANWLERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKK-KVASKVSDFRPISLVTMLYKSLS

Query:  KVLAERLRKVLPDTIDKAQMAFVEGRQ
        K+L  R+++ +   I   Q+ F+ G Q
Subjt:  KVLAERLRKVLPDTIDKAQMAFVEGRQ

P11369 LINE-1 retrotransposable element ORF2 protein (e value: 3.7e-13; % identity: 36.36)
Query:  NQANWLERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFD----NGIVSRRMNETYICLIPK-KKVASKVSDFRPISLVTML
        +Q + L  P   +EI   I+ L  +KSPGPDGF+ EF    + T K DL  +  + F      G +     E  I LIPK +K  +K+ +FRPISL+ + 
Subjt:  NQANWLERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFD----NGIVSRRMNETYICLIPK-KKVASKVSDFRPISLVTML

Query:  YKSLSKVLAERLRKVLPDTIDKAQMAFVEGRQ
         K L+K+LA R+++ +   I   Q+ F+ G Q
Subjt:  YKSLSKVLAERLRKVLPDTIDKAQMAFVEGRQ

P14381 Transposon TX1 uncharacterized 149 kDa protein (e value: 1.5e-14; % identity: 32.41)
Query:  LERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKKKVASKVSDFRPISLVTMLYKSLSKVLAE
        LE P  ++E+ +A+  +   KSPG DG T EF +  W+T+  D  +V  E F  G +        + L+PKK     + ++RP+SL++  YK ++K ++ 
Subjt:  LERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKKKVASKVSDFRPISLVTMLYKSLSKVLAE

Query:  RLRKVLPDTIDKAQMAFVEGRQIMDAILIATETVDDYRPRGKILA
        RL+ VL + I   Q   V GR I D + +  + +   R  G  LA
Subjt:  RLRKVLPDTIDKAQMAFVEGRQIMDAILIATETVDDYRPRGKILA

Arabidopsis top hits
AT1G43760.1 DNAse I-like superfamily protein (e value: 1.9e-12; % identity: 39.77)
Query:  EEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKKKVASKVSDFRPISLVTMLYKSLS
        +EI  A+  +   K+PGPD FT EF   SW  +K+      +EFF  G + +R N T I LIPK     ++S FRP+S  T++YK ++
Subjt:  EEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNETYICLIPKKKVASKVSDFRPISLVTMLYKSLS

AT3G07060.1 NHL domain-containing protein (e value: 8.2e-24; % identity: 41.33)
Query:  SSPANDILSFIKSTSDESEGPNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAA-EKSNMIQFIMR
        SSP  D+LSFIK++ D+ EGP+H WLN   GNK + ++ G Y++LA   L+ T SD     E +K LQ R P +  +G   S     A +++ + + I++
Subjt:  SSPANDILSFIKSTSDESEGPNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAA-EKSNMIQFIMR

Query:  EYVSFPILLSKKIFEMAPG-LCYIISKDFSNPLLLGEKDMDLSILRKGND
        EY++FP+LLS+K F    G + YI+ KDF NPL+  EKD+D++ + K  D
Subjt:  EYVSFPILLSKKIFEMAPG-LCYIISKDFSNPLLLGEKDMDLSILRKGND
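In the alignment blocks above, the unlabeled middle line marks an identical residue with its letter and a conservative substitution with `+`, while a blank marks a mismatch or gap. As a rough sketch, the identities in one block can be counted from that middle line. The function name `block_identity` is hypothetical, and the result describes a single block only, not the % identity reported for the whole hit.

```python
def block_identity(query, midline):
    """Return (identities, positives, aligned_length) for one alignment block."""
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    # Letters on the middle line are identical residues; '+' marks a
    # conservative substitution; spaces are mismatches or gaps.
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives, len(query)

# Last block of the KAG7011323.1 alignment, copied from the GenBank hits.
query = "YIISKDFSNPLLLGEKDMDLSILRK"
midline = "YIISKDFSNPLL+ E+D DL++LRK"
identities, positives, length = block_identity(query, midline)
print(f"{identities}/{length} identical, {positives}/{length} positive")
```

For this block the count is 19 identical and 23 positive positions out of 25.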


Sequences
CDS sequence
ATGCTGAAAGTTGAAGGGGCAGATACAGAATGGAGGGTCCGGATTGGCAGGTGGGGCAGAGGTCCGCCGTCGACGCCGCCTGCAGGCTATGAAGTTCCCGTCGACTTGAT
GTCATTTCGTGCAACGAATATGGCTTTCAGGTTCCGGCGACTCAAACAAATCTCAAGGTCTTTGCCTCAATTCTACTCCGGATATTATCATCAGTATCACCATAGTTATG
CTGTTAGCTCATTGGCACTGTCCGTCGCTCCATCTCATGTATCTGAAGGAATCGATAAAAGGATTTTAGACAATGGACGCCACCTTCTGCGGTTTTCTACAACAACAGAG
CTGCAATGTGAGTCGTCTCCTGCAAATGACATCTTATCCTTCATTAAGTCAACCTCAGACGAGTCTGAAGGCCCTAACCACTGTTGGTTGAATACATCTGATGGAAATAA
AGGAATTTCAGAAAATGATGGAATCTACTTAATTCTTGCCAATCAATTTCTAGAGACAACAAGCTCTGATTCGGTTGTTTTGGTTGAAAATGTTAAGTTCCTTCAGCACA
GGTTTCCTCAGCTTCATGTGATTGGGTTTCAGTGTTCCAGTACTCTATCTGCCGCTGAAAAAAGTAACATGATCCAATTTATAATGAGGGAATATGTTTCCTTTCCCATT
TTGTTATCCAAGAAGATTTTTGAGATGGCGCCGGGGCTCTGTTATATTATCTCTAAGGACTTCAGTAATCCTTTGCTTCTCGGTGAGAAAGACATGGACCTTAGCATTCT
TCGGAAAGGGAACGATGGAAATCACCCCAATCCGAGGGATCATGATGAATTAGTTAGGGCCGAAGATTCATACAGAGGAGAAATGGGGAGACAAAGTCCAGAAGAGTCAG
AAGCAAACCAAATCTTGGGAAGCAGCCTTGATATTCCTATTCTTAAGGAAACGGTTGAGATTCTTTGGCAGAATGGCTTGTGCATCAGGCCGATCCCTAAGAAAACTGGG
GGTGGAGGAACTAAAGGAAAAAACAGCAAAAATCAAGTCAAAAGGGAGATAAAAGGATTAATCAACTCATGGGAGAAGCCAGTAGAAGAAAAGAGGAATTCCTCTCAGGG
CAAAGATGAATCTAAGCTTGCTTTTGTTAACAGATCTCTTGTTAAGCAGATTTGGAGCTCCCGTTTCATAGGGTGGGCCTCTCTAGACTCGGTAGGTGCTTCGGGAGGTA
TTCTAGTGTTATGGAAAGAGAAGGAAGTCAAAGCAGTTGATGTGGTTGTGGGTTGCCATTCGGTTTCAGTTCTGTTTTTGCTAGCTAATTCAAATTCTTTTTGGGTGACA
GGAGTTTATGGGCCCAATAGCTCTAAAGATGCGAATGGTAAATTTACATGGTCAGATAAGAGGGAAACCCCAGTGTTTACCAAGATTGATAGATTCTTTGTCTCTGCGGG
GTGGTCTGATCTCTTTTCGAATGTCCAGGTTTCTCATATGCAAAGAATAACTTCTGATCATTTTCCCATTTTGCTTCAAGCAGGTGACTTCTCATGGGGTCCAACTCCTT
TCAGATTTGAGAATGCTTGGCTTTCACACCCTGACTTTAAAAAGAACGTTGAAGGCTGGTGGAAGGAGTTTGAGGAGGAAGGATGGGCTGGGTTACAGTTCATGGCCAAG
CTAAAGACTCTTAAAGAGAAGCTAAAACAGTGGACAAGGAGGTCTTTGGCGATATTAGGGAGGAGAAAGCTATGGGGCTCAAAGGTTTACTATGGAGGGGGTAGACTGGA
AACCAATCTCGACAACCAAGCTAATTGGCTGGAGCGACCTTTTGAAATGGAGGAAATTTTTAAAGCTATTTCCCAACTAGGAGCTCAGAAATCCCCGGGCCCGGACGGTT
TTACAGGCGAATTCTTGAAAAATTCTTGGAACACTATTAAAAATGATTTGTTTAAGGTGTTCCAGGAGTTTTTTGACAATGGTATCGTCAGTAGAAGGATGAATGAAACC
TACATATGTTTGATCCCCAAAAAAAAGGTGGCTAGCAAGGTTAGTGACTTCCGCCCCATCAGTTTAGTGACTATGTTATACAAATCTCTTTCTAAGGTGTTGGCTGAGAG
GCTTAGAAAGGTTCTTCCGGACACGATAGACAAGGCTCAAATGGCTTTTGTCGAGGGGAGGCAAATTATGGATGCTATTCTTATAGCGACCGAGACAGTGGATGACTACC
GGCCAAGAGGGAAAATCCTAGCGTCTAGGGTATTCGTCAAGGAGACCGCTATCTCCTTTCTTTTTACTATTGTAGCTGACTCTATGAGTCGATTCATTCAGTTGTGCAGC
AAGAAAGGCTTTATTCGGGCTTTATGGTTGGGAGGGAGGATGTGGAGGTCACCCACATTCAATTTGCAGATGATACCCTTCTGTTTAGTGAAGGCAGTTTGCAGTGTATC
CATAATTGGAAGACTTTTCTGA
mRNA sequence
ATGCTGAAAGTTGAAGGGGCAGATACAGAATGGAGGGTCCGGATTGGCAGGTGGGGCAGAGGTCCGCCGTCGACGCCGCCTGCAGGCTATGAAGTTCCCGTCGACTTGAT
GTCATTTCGTGCAACGAATATGGCTTTCAGGTTCCGGCGACTCAAACAAATCTCAAGGTCTTTGCCTCAATTCTACTCCGGATATTATCATCAGTATCACCATAGTTATG
CTGTTAGCTCATTGGCACTGTCCGTCGCTCCATCTCATGTATCTGAAGGAATCGATAAAAGGATTTTAGACAATGGACGCCACCTTCTGCGGTTTTCTACAACAACAGAG
CTGCAATGTGAGTCGTCTCCTGCAAATGACATCTTATCCTTCATTAAGTCAACCTCAGACGAGTCTGAAGGCCCTAACCACTGTTGGTTGAATACATCTGATGGAAATAA
AGGAATTTCAGAAAATGATGGAATCTACTTAATTCTTGCCAATCAATTTCTAGAGACAACAAGCTCTGATTCGGTTGTTTTGGTTGAAAATGTTAAGTTCCTTCAGCACA
GGTTTCCTCAGCTTCATGTGATTGGGTTTCAGTGTTCCAGTACTCTATCTGCCGCTGAAAAAAGTAACATGATCCAATTTATAATGAGGGAATATGTTTCCTTTCCCATT
TTGTTATCCAAGAAGATTTTTGAGATGGCGCCGGGGCTCTGTTATATTATCTCTAAGGACTTCAGTAATCCTTTGCTTCTCGGTGAGAAAGACATGGACCTTAGCATTCT
TCGGAAAGGGAACGATGGAAATCACCCCAATCCGAGGGATCATGATGAATTAGTTAGGGCCGAAGATTCATACAGAGGAGAAATGGGGAGACAAAGTCCAGAAGAGTCAG
AAGCAAACCAAATCTTGGGAAGCAGCCTTGATATTCCTATTCTTAAGGAAACGGTTGAGATTCTTTGGCAGAATGGCTTGTGCATCAGGCCGATCCCTAAGAAAACTGGG
GGTGGAGGAACTAAAGGAAAAAACAGCAAAAATCAAGTCAAAAGGGAGATAAAAGGATTAATCAACTCATGGGAGAAGCCAGTAGAAGAAAAGAGGAATTCCTCTCAGGG
CAAAGATGAATCTAAGCTTGCTTTTGTTAACAGATCTCTTGTTAAGCAGATTTGGAGCTCCCGTTTCATAGGGTGGGCCTCTCTAGACTCGGTAGGTGCTTCGGGAGGTA
TTCTAGTGTTATGGAAAGAGAAGGAAGTCAAAGCAGTTGATGTGGTTGTGGGTTGCCATTCGGTTTCAGTTCTGTTTTTGCTAGCTAATTCAAATTCTTTTTGGGTGACA
GGAGTTTATGGGCCCAATAGCTCTAAAGATGCGAATGGTAAATTTACATGGTCAGATAAGAGGGAAACCCCAGTGTTTACCAAGATTGATAGATTCTTTGTCTCTGCGGG
GTGGTCTGATCTCTTTTCGAATGTCCAGGTTTCTCATATGCAAAGAATAACTTCTGATCATTTTCCCATTTTGCTTCAAGCAGGTGACTTCTCATGGGGTCCAACTCCTT
TCAGATTTGAGAATGCTTGGCTTTCACACCCTGACTTTAAAAAGAACGTTGAAGGCTGGTGGAAGGAGTTTGAGGAGGAAGGATGGGCTGGGTTACAGTTCATGGCCAAG
CTAAAGACTCTTAAAGAGAAGCTAAAACAGTGGACAAGGAGGTCTTTGGCGATATTAGGGAGGAGAAAGCTATGGGGCTCAAAGGTTTACTATGGAGGGGGTAGACTGGA
AACCAATCTCGACAACCAAGCTAATTGGCTGGAGCGACCTTTTGAAATGGAGGAAATTTTTAAAGCTATTTCCCAACTAGGAGCTCAGAAATCCCCGGGCCCGGACGGTT
TTACAGGCGAATTCTTGAAAAATTCTTGGAACACTATTAAAAATGATTTGTTTAAGGTGTTCCAGGAGTTTTTTGACAATGGTATCGTCAGTAGAAGGATGAATGAAACC
TACATATGTTTGATCCCCAAAAAAAAGGTGGCTAGCAAGGTTAGTGACTTCCGCCCCATCAGTTTAGTGACTATGTTATACAAATCTCTTTCTAAGGTGTTGGCTGAGAG
GCTTAGAAAGGTTCTTCCGGACACGATAGACAAGGCTCAAATGGCTTTTGTCGAGGGGAGGCAAATTATGGATGCTATTCTTATAGCGACCGAGACAGTGGATGACTACC
GGCCAAGAGGGAAAATCCTAGCGTCTAGGGTATTCGTCAAGGAGACCGCTATCTCCTTTCTTTTTACTATTGTAGCTGACTCTATGAGTCGATTCATTCAGTTGTGCAGC
AAGAAAGGCTTTATTCGGGCTTTATGGTTGGGAGGGAGGATGTGGAGGTCACCCACATTCAATTTGCAGATGATACCCTTCTGTTTAGTGAAGGCAGTTTGCAGTGTATC
CATAATTGGAAGACTTTTCTGA
Protein sequence
MLKVEGADTEWRVRIGRWGRGPPSTPPAGYEVPVDLMSFRATNMAFRFRRLKQISRSLPQFYSGYYHQYHHSYAVSSLALSVAPSHVSEGIDKRILDNGRHLLRFSTTTE
LQCESSPANDILSFIKSTSDESEGPNHCWLNTSDGNKGISENDGIYLILANQFLETTSSDSVVLVENVKFLQHRFPQLHVIGFQCSSTLSAAEKSNMIQFIMREYVSFPI
LLSKKIFEMAPGLCYIISKDFSNPLLLGEKDMDLSILRKGNDGNHPNPRDHDELVRAEDSYRGEMGRQSPEESEANQILGSSLDIPILKETVEILWQNGLCIRPIPKKTG
GGGTKGKNSKNQVKREIKGLINSWEKPVEEKRNSSQGKDESKLAFVNRSLVKQIWSSRFIGWASLDSVGASGGILVLWKEKEVKAVDVVVGCHSVSVLFLLANSNSFWVT
GVYGPNSSKDANGKFTWSDKRETPVFTKIDRFFVSAGWSDLFSNVQVSHMQRITSDHFPILLQAGDFSWGPTPFRFENAWLSHPDFKKNVEGWWKEFEEEGWAGLQFMAK
LKTLKEKLKQWTRRSLAILGRRKLWGSKVYYGGGRLETNLDNQANWLERPFEMEEIFKAISQLGAQKSPGPDGFTGEFLKNSWNTIKNDLFKVFQEFFDNGIVSRRMNET
YICLIPKKKVASKVSDFRPISLVTMLYKSLSKVLAERLRKVLPDTIDKAQMAFVEGRQIMDAILIATETVDDYRPRGKILASRVFVKETAISFLFTIVADSMSRFIQLCS
KKGFIRALWLGGRMWRSPTFNLQMIPFCLVKAVCSVSIIGRLF
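The protein sequence corresponds to a translation of the CDS. A minimal sketch that translates the first 33 bases of the CDS with the standard genetic code is shown below. The `translate` helper is hypothetical and written only for illustration; it assumes an uppercase DNA string that is in frame from its first base and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 33 bases of the CDS sequence listed above.
cds_start = "ATGCTGAAAGTTGAAGGGGCAGATACAGAATGG"
print(translate(cds_start))
```

The result, MLKVEGADTEW, matches the first 11 residues of the protein sequence listed above.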