CuGenDBv2

Lag0032408 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0032408
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Gag protease polyprotein
Genome location: chr11:32067466..32077161
RNA-Seq Expression: Lag0032408
Synteny: Lag0032408
Gene Ontology terms:
    GO:0015074 - DNA integration (biological process)
    GO:0003676 - nucleic acid binding (molecular function)
    GO:0008270 - zinc ion binding (molecular function)
InterPro domains: IPR005162 - Retrotransposon gag domain
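The genome location field above is written as chromosome:start..end. The following is a minimal sketch of splitting such a string into its parts and computing the length of the locus. It assumes the coordinates are 1-based and inclusive, which this page does not state; the function names `parse_location` and `span_length` are illustrative and are not part of any CuGenDB tool.

```python
import re


def parse_location(loc):
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def span_length(start, end):
    """Length of the interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


if __name__ == "__main__":
    chrom, start, end = parse_location("chr11:32067466..32077161")
    print(chrom, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption, the locus for Lag0032408 spans 9696 bp on chr11.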


Homology
GenBank top hits (e value, % identity, alignment)
XP_022156662.1 uncharacterized protein LOC111023512 [Momordica charantia]  e value: 2.2e-53  % identity: 54.93
Query:  LEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAER
        +E LQ L+    +NQ +Q+ Q  N+G +  +  A+ L+DFKKYDP +F   +VDP  AEAW + +E IFRYM C E Q+V C VF+L+D+A LWW S ER
Subjt:  LEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAER

Query:  SIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAA
         IDVS GPVTWLQFKEAFFQ+YY AI  YRK+ EFL L Q  +SVEEY+ EFT+LSRFAPE VDTEA K +RFI  LKD+ +  V  L P DYA ALR A
Subjt:  SIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAA

Query:  TFMGMPTVNATPV
          +   + + + V
Subjt:  TFMGMPTVNATPV

XP_022938329.1 uncharacterized protein LOC111444463 [Cucurbita moschata]  e value: 2.6e-46  % identity: 38.34
Query:  RGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAPPAPAQTVTLTL-EALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSR
        +GRGRGRG  GR     EQ  PP             PA P  A   T  L +ALQ ++ N++ANQ +  +         ST  A+ L+DFK+ DP TF  
Subjt:  RGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAPPAPAQTVTLTL-EALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSR

Query:  KTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYEL
         + DPT A+ W   IE +F    CPEA +V CA F+LR +A LWW++    I    G V+W++FK AF ++YY   +  RK++EF  LSQ   SV+ Y  
Subjt:  KTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYEL

Query:  EFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFM-GMPTVNATPVAKESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQ
        EF++L RFAPE V+T+ +  +RF+ GL   +++ V A+ PA YA ALRAA  M G+ + + T        + GQKR+HE                     
Subjt:  EFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFM-GMPTVNATPVAKESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQ

Query:  HDKQEGNGGDKPKCNTCGRHHWGQCL
            E     KPKCN CG++HW +CL
Subjt:  HDKQEGNGGDKPKCNTCGRHHWGQCL

XP_031741726.1 uncharacterized protein LOC116403920 [Cucumis sativus]  e value: 4.7e-48  % identity: 35.78
Query:  IKKSGRYSWYQS--------------------------------PVNGSVDCLHRRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLP-----------
        ++K GRYSWYQS                                P  G V    RRG  RGRGRG GGRG        P   Q E+ +P           
Subjt:  IKKSGRYSWYQS--------------------------------PVNGSVDCLHRRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLP-----------

Query:  ------------------DEQAPA-------PPAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEA
                          ++QAPA       PP PA          Q L     A Q  Q+  N+       +A A+ L+DF+KYDP TF     DPT+A
Subjt:  ------------------DEQAPA-------PPAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEA

Query:  EAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRF
        E W + +E IF YM CPE  +V CA F+LRD  ++WWR+  R +      +TW QFK  F+ K++SA +   K +EFL L QG  +VEEY+ EF  LSRF
Subjt:  EAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRF

Query:  APEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNG
        APE V  E  +  RF+ GL+D+++  V AL P   A ALR A  M +      P + +   ++GQKRK EQ    + QR+    +  R   Q     G  
Subjt:  APEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNG

Query:  GD----KPKCNTCGRHHWGQCL
        GD    KP CNTCG+ H G+CL
Subjt:  GD----KPKCNTCGRHHWGQCL

XP_031742890.1 uncharacterized protein LOC116404512 [Cucumis sativus]  e value: 4.7e-48  % identity: 35.78
Query:  IKKSGRYSWYQS--------------------------------PVNGSVDCLHRRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLP-----------
        ++K GRYSWYQS                                P  G V    RRG  RGRGRG GGRG        P   Q E+ +P           
Subjt:  IKKSGRYSWYQS--------------------------------PVNGSVDCLHRRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLP-----------

Query:  ------------------DEQAPA-------PPAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEA
                          ++QAPA       PP PA          Q L     A Q  Q+  N+       +A A+ L+DF+KYDP TF     DPT+A
Subjt:  ------------------DEQAPA-------PPAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEA

Query:  EAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRF
        E W + +E IF YM CPE  +V CA F+LRD  ++WWR+  R +      +TW QFK  F+ K++SA +   K +EFL L QG  +VEEY+ EF  LSRF
Subjt:  EAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRF

Query:  APEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNG
        APE V  E  +  RF+ GL+D+++  V AL P   A ALR A  M +      P + +   ++GQKRK EQ    + QR+    +  R   Q     G  
Subjt:  APEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNG

Query:  GD----KPKCNTCGRHHWGQCL
        GD    KP CNTCG+ H G+CL
Subjt:  GD----KPKCNTCGRHHWGQCL

XP_038891712.1 uncharacterized protein LOC120081110 [Benincasa hispida]  e value: 1.4e-52  % identity: 45.86
Query:  QASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQF
        Q    N  +NQ   G +  A+ L+DFKKY+P TF+    DPT AE W + IE IFRYM CPE Q+V CAVF+L D A +WW+ AER + V   PVTW QF
Subjt:  QASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQF

Query:  KEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVA
        KE F+ KY+SA + Y K+REFL L QG +SVEEY+ EF  LSRFAPE V TEA + +RFI GLK+ ++ +V A  P  +  ALR A  +   + +   + 
Subjt:  KEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVA

Query:  KESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQHDKQEGNGGD----KPKCNTCGRHHWGQCLA
          +  ++GQKRK +Q        Q    S R      +Q    G     +P C++CGR HWGQCLA
Subjt:  KESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQHDKQEGNGGD----KPKCNTCGRHHWGQCLA

TrEMBL top hits (e value, % identity, alignment)
A0A5A7SQU8 Reverse transcriptase  e value: 1.8e-45  % identity: 39.15
Query:  DCLHRRGRGRGRGRGRG-GRGLALPE------QVDP----------PMDQYEEDL----PDEQAPAPPAPAQTVTLTLEALQALLNNVSANQASQVNQNE
        + L RRG  RG   GRG G G   PE        DP           M+Q   DL     ++Q P PPAPA           A     S   A QV  ++
Subjt:  DCLHRRGRGRGRGRGRG-GRGLALPE------QVDP----------PMDQYEEDL----PDEQAPAPPAPAQTVTLTLEALQALLNNVSANQASQVNQNE

Query:  NQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYY
               +A A+ L+DF+KY+P TF     DPT A+ W + +E IFRYM CPE Q+V CAVF+L D    WW + ER +      +TW QFKE+F+ K++
Subjt:  NQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYY

Query:  SAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGM-PTVNATPVAKESELNAG
        SA +   K +EFL L QG+ +VE+Y+ EF  LSRFAPE + TEA +  +F+ GL+ D+Q +V A  PA +A ALR A  + +    N++  A      +G
Subjt:  SAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGM-PTVNATPVAKESELNAG

Query:  QKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNGG-DKPKCNTCGRHHWGQCL
        QKRK EQ    + QR+  S    R+  Q   + G     KP C TCG+HH G+CL
Subjt:  QKRKHEQTATNL-QRSQPSSESSRQKTQHDKQEGNGG-DKPKCNTCGRHHWGQCL

A0A5A7T6R9 Reverse transcriptase  e value: 1.3e-46  % identity: 39.94
Query:  RRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAP-PAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPT
        RRG  RG   GRGGRG     +V P M + ++      APAP PAPA        A Q + + +SA                    A+ L+DF+KY+P T
Subjt:  RRGRGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAP-PAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPT

Query:  FSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEE
        F     DPT A+ W + +E IFRYM CPE Q+V CAVF+L D    WW + ER +      +TW QFKE+F+ K++SA +   K +EFL L QG+ +VE+
Subjt:  FSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEE

Query:  YELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGM-PTVNATPVAKESELNAGQKRKHEQTATNLQ----RSQPSSE
        Y+ EF  LSRFAPE + TEA +  +F+ GL+ D+Q +V A  PA +A ALR A  + +    N++  A      +GQKRK EQ    +     RS     
Subjt:  YELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGM-PTVNATPVAKESELNAGQKRKHEQTATNLQ----RSQPSSE

Query:  SSRQKTQHDKQEGNGGDKPKCNTCGRHHWGQCL
        S +QK     +   G  KP C TCG+HH G+CL
Subjt:  SSRQKTQHDKQEGNGGDKPKCNTCGRHHWGQCL

A0A5A7UAA8 Reverse transcriptase  e value: 8.1e-46  % identity: 39.58
Query:  RRGRGRGRGRGRGGRGLALPE-QVDPPMDQYEEDLPDEQAPAP-PAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPP
        RRG  RG   GRGGRG      Q + P        P   APAP P PA    L   A Q + + +SA                    A+ L+DF+KY+P 
Subjt:  RRGRGRGRGRGRGGRGLALPE-QVDPPMDQYEEDLPDEQAPAP-PAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPP

Query:  TFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVE
        TF     DPT A+ W + +E IFRYM CPE Q+V CAVF+L D    WW + ER +      +TW QFKE+F+ K++SA +   K +EFL L QG+ +VE
Subjt:  TFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVE

Query:  EYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSR
        +Y+ EF  LSRFAPE + TEA +  +F+ GL+ D+Q +V A  PA +A ALR A  + +  +  +         +GQKRK EQ    + QR+  S    R
Subjt:  EYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNL-QRSQPSSESSR

Query:  QKTQHDKQEGNGG-DKPKCNTCGRHHWGQCL
           Q   + G     KP C TCG+HH G+CL
Subjt:  QKTQHDKQEGNGG-DKPKCNTCGRHHWGQCL

A0A6J1DSJ6 uncharacterized protein LOC111023512  e value: 1.1e-53  % identity: 54.93
Query:  LEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAER
        +E LQ L+    +NQ +Q+ Q  N+G +  +  A+ L+DFKKYDP +F   +VDP  AEAW + +E IFRYM C E Q+V C VF+L+D+A LWW S ER
Subjt:  LEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAER

Query:  SIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAA
         IDVS GPVTWLQFKEAFFQ+YY AI  YRK+ EFL L Q  +SVEEY+ EFT+LSRFAPE VDTEA K +RFI  LKD+ +  V  L P DYA ALR A
Subjt:  SIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAA

Query:  TFMGMPTVNATPV
          +   + + + V
Subjt:  TFMGMPTVNATPV

A0A6J1FDR9 uncharacterized protein LOC111444463  e value: 1.3e-46  % identity: 38.34
Query:  RGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAPPAPAQTVTLTL-EALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSR
        +GRGRGRG  GR     EQ  PP             PA P  A   T  L +ALQ ++ N++ANQ +  +         ST  A+ L+DFK+ DP TF  
Subjt:  RGRGRGRGRGGRGLALPEQVDPPMDQYEEDLPDEQAPAPPAPAQTVTLTL-EALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSR

Query:  KTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYEL
         + DPT A+ W   IE +F    CPEA +V CA F+LR +A LWW++    I    G V+W++FK AF ++YY   +  RK++EF  LSQ   SV+ Y  
Subjt:  KTVDPTEAEAWTAKIEEIFRYMGCPEAQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYEL

Query:  EFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFM-GMPTVNATPVAKESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQ
        EF++L RFAPE V+T+ +  +RF+ GL   +++ V A+ PA YA ALRAA  M G+ + + T        + GQKR+HE                     
Subjt:  EFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVGALGPADYAAALRAATFM-GMPTVNATPVAKESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQ

Query:  HDKQEGNGGDKPKCNTCGRHHWGQCL
            E     KPKCN CG++HW +CL
Subjt:  HDKQEGNGGDKPKCNTCGRHHWGQCL

SwissProt top hits: No hits found
Arabidopsis top hits: No hits found

Sequences
CDS sequence
ATGGGCCCTACCCTCTTTATGGCACGAGAGGGACTTCTGTTTGTTGGTTGGACCACAAACAGGTTGTTCATTAGAGGAGCACTGGTACTTAAGGACCAAGAGTTAGCCCA
GGGCCCAAGACTTCATCTTCTCCACATCTCCTCTTTCTTCCTCCTTGCCTCAGCCGCACAACACAGCCAACTGTGTTTTTTTTTCTTCTTCTCGGCCGGCGACCAGCACG
CACCCATCAACCTGTCGGCAGCAACGACCTCTCCGGCGAGCAAGGTCCGCGGCGGCGCACGCGTTCAACAAGAACCAGTCAGCTGCAGCGTTTTCCGATCCAACGGGTTC
CTTCGACGCACACAGCAGGCGGCGGTCTTCCACGGTGTGGGTCGTCTTCTCCGACCAGCCCGAGCTCTAGCAGCGGCAGGGTATCGCGGCGTTCGTTCCGGCGGAGTCGA
GGGCGGCAGTTTCGTTTTCTTCTTCTTCGCAGTTCAGGCAGCAACGTTTCGAGCAGTTCTGTCAAAGGCCAGCTACTTTATCGCGTGGGTCTTCTTCCTTGGACGTTTTC
CAGCGGCTGTAAGCTCGTTTATAGGTGATTTTTGGATTGGCATTAGAAGTTTATTGTCGGATTGGAGCGTTTTTGGAGACCTCGGACAACAAGTGTATTTTTGCTTTGGC
CTTGAAGTTTACGTGGTGATTAGTGTGGTTCAAGAGCCTCTAAGCCAAGGTAAAGGCAAGGGCACAGGAAAGGCGATCGATGAGAATCCGATAGGCAGCCATGAGAGGAC
TCGAGTTTCATATTTTGTTTTCTTCCACTTAAGAACTCTGTTTTATCTTATTGCATTGGCCTCCAAAGCATGCATTGTCGTAACGGTCCCTTTAAGACTTATTAAGAAGT
CGGGTCGTTACAGTTGGTATCAGAGCCCAGTTAATGGTTCTGTAGACTGCTTACATCGTCGAGGACGTGGACGCGGACGTGGTCGCGGTAGGGGTGGCAGAGGTCTGGCA
CTCCCAGAGCAGGTGGATCCCCCTATGGATCAGTATGAGGAAGATCTCCCTGATGAGCAAGCTCCTGCACCTCCAGCACCAGCGCAGACTGTCACTCTGACTTTAGAGGC
TCTTCAGGCGCTTCTAAATAATGTTTCTGCCAATCAGGCGTCGCAAGTGAACCAAAATGAGAATCAGGGTCAGGTAGGCTCAACTGCGGGAGCTCGGTGTCTTAAGGATT
TTAAGAAATATGACCCACCGACGTTTAGCAGAAAGACTGTGGATCCCACAGAGGCCGAAGCTTGGACTGCCAAAATTGAAGAGATTTTTCGCTATATGGGATGCCCTGAG
GCACAACAGGTGTCGTGCGCGGTTTTTGTGCTAAGGGATAATGCTCTGTTGTGGTGGAGATCGGCTGAGAGGTCCATCGATGTGAGTTCTGGTCCGGTCACTTGGTTGCA
GTTCAAGGAGGCGTTCTTCCAGAAATATTACTCGGCCATCATCAGTTACAGAAAGGAGAGGGAATTCCTAACCTTGTCACAAGGGGAAAAGTCAGTGGAAGAATACGAGC
TGGAATTCACCCGACTGTCCCGGTTCGCTCCTGAAGCGGTGGATACGGAGGCAAAGAAGACAAAGAGGTTCATCGCTGGCCTCAAAGATGACGTACAGAGGGTTGTCGGA
GCCCTTGGCCCAGCGGACTACGCAGCGGCCCTTCGAGCGGCCACCTTCATGGGCATGCCAACTGTTAATGCAACTCCAGTAGCCAAGGAGTCGGAGCTCAACGCAGGACA
GAAGAGGAAACACGAGCAGACAGCTACCAACCTCCAGCGATCTCAACCCTCATCTGAAAGTTCTAGACAGAAAACTCAGCATGACAAACAAGAGGGCAATGGAGGTGACA
AACCCAAATGCAACACTTGTGGAAGACATCATTGGGGTCAGTGCTTGGCGAGGAAAGCCTTCACGCAGCCCAAGACTTCATCTTCTCCACATCTCCTCTTTCTTCCTCCT
TGCCTCAGCCGCACAACACAGCCAACTGTGTTTTTTTTCTTCTTCTCGGCCGGCGACCAGCACGCACCCATCAACCTGTCGGCAGCGACGACCTCTCTGGCGAGCAAGGT
CCGCGGCGGCGCACGCGTTCAACAAGAACCAGTCAGCTGCAGCGTTTTCCGATCCAACGGGTTCCTTCGACGCACACAGCAGGCGGCGGTCTTCCACGGTGTGGGTCGTC
TTCTCCGACCAGCCCGAGCTCTAGCAGCGGCAGGGTATCGCGGTGTTCGTTCCGGCGGAGTCGAGGGCGGCAGTTTTGTTTTCTTCTTCTTCGCAGTTCAGGCAGCGACG
TTTCGAGCAGTTCTGTCAAAGGCCAGCTACTTTATCGCGTGGGTCTTCTTCCTTGGACGTTTTCCAGCGGCTGTAAGCTCGTTTATAGGTTCTCTTAGCATTAGAAGTTT
ATTGTCGGATTGGAGCGTTTTTGGAGACCTCGGACAACAAGTGTATTTTTGCTTTGGCCTTGAAGTTTACGTGGTGATTAGTGTGGTTCAAGAGCCTCTAAGCCAAGGAG
GATACCGGTGTTTCTTTCTGGTGGTGTCTCACTTCAAATCACAGCAAAGAAGGATCTTGAAGGTTGCTGGAATTTTAATTCAAAGGGTAGAATTTGGCGAGGTGATCAAG
AATTTCTATAACGGTAAGCGGTTCGGCGGTCTGGTTCGGTTGAGGCGCGGTTCGGTCGTCTGGCGAGCGGTTCGGTCGTCTGGCGAGCGGTTCGTTGACTGGCTTAGGCG
GTTCGTTGACCGGCTTAGTGCTGGTTCAAGCAGTTTTGACCAATTTTTGGTGGTTTTGGCACGTTTTGGAGCGGTTTTGGTCCAGTTCGGGGCTGCAGAAGCAATGTCTT
CCTCCATAATTTCCTTGCTTAAAAATGAACAACTTACTGGCGAAAACTTTCCTCAATGGAAATCGAACTTAAATACCATACTCGTGGTTGAGGATTTAAGGTTCGTCTTA
ACGGAGGAATGTCCTCCCGTTCCCCCTCGCACTGCCACTCAGGCAGTTAAAGACGCTCATGAACGCTGGACTAAGGCCAACGAAAAGGTCAAAGTCTACATATTGGTCAG
CTTATCTGAAGTATTGGCCAAGCGTTATGAAAACGTGGAAACTGCCAGGGAGATTATGAATTCCCTGCAGGAGATGTTTGGACTTCCGTCCTACCAGCTCCACCACGACG
CCTTGAAGAACGTCTTCAATGCCAAGATGTTAGAAGGTCAGTCTGTTCGGGAACATGTCCTGGACATGATTAACCAATTCAATATAGCTGAGGCAAATGGCAGGGTAGTC
TGCTACGCAGCAGGTTGCGTTCATCTTTCACTTGCTCCTAGCGAGCTATCTGTCATTCAGGACGAACGCGAGCATGAATAA
mRNA sequence
ATGGGCCCTACCCTCTTTATGGCACGAGAGGGACTTCTGTTTGTTGGTTGGACCACAAACAGGTTGTTCATTAGAGGAGCACTGGTACTTAAGGACCAAGAGTTAGCCCA
GGGCCCAAGACTTCATCTTCTCCACATCTCCTCTTTCTTCCTCCTTGCCTCAGCCGCACAACACAGCCAACTGTGTTTTTTTTTCTTCTTCTCGGCCGGCGACCAGCACG
CACCCATCAACCTGTCGGCAGCAACGACCTCTCCGGCGAGCAAGGTCCGCGGCGGCGCACGCGTTCAACAAGAACCAGTCAGCTGCAGCGTTTTCCGATCCAACGGGTTC
CTTCGACGCACACAGCAGGCGGCGGTCTTCCACGGTGTGGGTCGTCTTCTCCGACCAGCCCGAGCTCTAGCAGCGGCAGGGTATCGCGGCGTTCGTTCCGGCGGAGTCGA
GGGCGGCAGTTTCGTTTTCTTCTTCTTCGCAGTTCAGGCAGCAACGTTTCGAGCAGTTCTGTCAAAGGCCAGCTACTTTATCGCGTGGGTCTTCTTCCTTGGACGTTTTC
CAGCGGCTGTAAGCTCGTTTATAGGTGATTTTTGGATTGGCATTAGAAGTTTATTGTCGGATTGGAGCGTTTTTGGAGACCTCGGACAACAAGTGTATTTTTGCTTTGGC
CTTGAAGTTTACGTGGTGATTAGTGTGGTTCAAGAGCCTCTAAGCCAAGGTAAAGGCAAGGGCACAGGAAAGGCGATCGATGAGAATCCGATAGGCAGCCATGAGAGGAC
TCGAGTTTCATATTTTGTTTTCTTCCACTTAAGAACTCTGTTTTATCTTATTGCATTGGCCTCCAAAGCATGCATTGTCGTAACGGTCCCTTTAAGACTTATTAAGAAGT
CGGGTCGTTACAGTTGGTATCAGAGCCCAGTTAATGGTTCTGTAGACTGCTTACATCGTCGAGGACGTGGACGCGGACGTGGTCGCGGTAGGGGTGGCAGAGGTCTGGCA
CTCCCAGAGCAGGTGGATCCCCCTATGGATCAGTATGAGGAAGATCTCCCTGATGAGCAAGCTCCTGCACCTCCAGCACCAGCGCAGACTGTCACTCTGACTTTAGAGGC
TCTTCAGGCGCTTCTAAATAATGTTTCTGCCAATCAGGCGTCGCAAGTGAACCAAAATGAGAATCAGGGTCAGGTAGGCTCAACTGCGGGAGCTCGGTGTCTTAAGGATT
TTAAGAAATATGACCCACCGACGTTTAGCAGAAAGACTGTGGATCCCACAGAGGCCGAAGCTTGGACTGCCAAAATTGAAGAGATTTTTCGCTATATGGGATGCCCTGAG
GCACAACAGGTGTCGTGCGCGGTTTTTGTGCTAAGGGATAATGCTCTGTTGTGGTGGAGATCGGCTGAGAGGTCCATCGATGTGAGTTCTGGTCCGGTCACTTGGTTGCA
GTTCAAGGAGGCGTTCTTCCAGAAATATTACTCGGCCATCATCAGTTACAGAAAGGAGAGGGAATTCCTAACCTTGTCACAAGGGGAAAAGTCAGTGGAAGAATACGAGC
TGGAATTCACCCGACTGTCCCGGTTCGCTCCTGAAGCGGTGGATACGGAGGCAAAGAAGACAAAGAGGTTCATCGCTGGCCTCAAAGATGACGTACAGAGGGTTGTCGGA
GCCCTTGGCCCAGCGGACTACGCAGCGGCCCTTCGAGCGGCCACCTTCATGGGCATGCCAACTGTTAATGCAACTCCAGTAGCCAAGGAGTCGGAGCTCAACGCAGGACA
GAAGAGGAAACACGAGCAGACAGCTACCAACCTCCAGCGATCTCAACCCTCATCTGAAAGTTCTAGACAGAAAACTCAGCATGACAAACAAGAGGGCAATGGAGGTGACA
AACCCAAATGCAACACTTGTGGAAGACATCATTGGGGTCAGTGCTTGGCGAGGAAAGCCTTCACGCAGCCCAAGACTTCATCTTCTCCACATCTCCTCTTTCTTCCTCCT
TGCCTCAGCCGCACAACACAGCCAACTGTGTTTTTTTTCTTCTTCTCGGCCGGCGACCAGCACGCACCCATCAACCTGTCGGCAGCGACGACCTCTCTGGCGAGCAAGGT
CCGCGGCGGCGCACGCGTTCAACAAGAACCAGTCAGCTGCAGCGTTTTCCGATCCAACGGGTTCCTTCGACGCACACAGCAGGCGGCGGTCTTCCACGGTGTGGGTCGTC
TTCTCCGACCAGCCCGAGCTCTAGCAGCGGCAGGGTATCGCGGTGTTCGTTCCGGCGGAGTCGAGGGCGGCAGTTTTGTTTTCTTCTTCTTCGCAGTTCAGGCAGCGACG
TTTCGAGCAGTTCTGTCAAAGGCCAGCTACTTTATCGCGTGGGTCTTCTTCCTTGGACGTTTTCCAGCGGCTGTAAGCTCGTTTATAGGTTCTCTTAGCATTAGAAGTTT
ATTGTCGGATTGGAGCGTTTTTGGAGACCTCGGACAACAAGTGTATTTTTGCTTTGGCCTTGAAGTTTACGTGGTGATTAGTGTGGTTCAAGAGCCTCTAAGCCAAGGAG
GATACCGGTGTTTCTTTCTGGTGGTGTCTCACTTCAAATCACAGCAAAGAAGGATCTTGAAGGTTGCTGGAATTTTAATTCAAAGGGTAGAATTTGGCGAGGTGATCAAG
AATTTCTATAACGGTAAGCGGTTCGGCGGTCTGGTTCGGTTGAGGCGCGGTTCGGTCGTCTGGCGAGCGGTTCGGTCGTCTGGCGAGCGGTTCGTTGACTGGCTTAGGCG
GTTCGTTGACCGGCTTAGTGCTGGTTCAAGCAGTTTTGACCAATTTTTGGTGGTTTTGGCACGTTTTGGAGCGGTTTTGGTCCAGTTCGGGGCTGCAGAAGCAATGTCTT
CCTCCATAATTTCCTTGCTTAAAAATGAACAACTTACTGGCGAAAACTTTCCTCAATGGAAATCGAACTTAAATACCATACTCGTGGTTGAGGATTTAAGGTTCGTCTTA
ACGGAGGAATGTCCTCCCGTTCCCCCTCGCACTGCCACTCAGGCAGTTAAAGACGCTCATGAACGCTGGACTAAGGCCAACGAAAAGGTCAAAGTCTACATATTGGTCAG
CTTATCTGAAGTATTGGCCAAGCGTTATGAAAACGTGGAAACTGCCAGGGAGATTATGAATTCCCTGCAGGAGATGTTTGGACTTCCGTCCTACCAGCTCCACCACGACG
CCTTGAAGAACGTCTTCAATGCCAAGATGTTAGAAGGTCAGTCTGTTCGGGAACATGTCCTGGACATGATTAACCAATTCAATATAGCTGAGGCAAATGGCAGGGTAGTC
TGCTACGCAGCAGGTTGCGTTCATCTTTCACTTGCTCCTAGCGAGCTATCTGTCATTCAGGACGAACGCGAGCATGAATAA
Protein sequence
MGPTLFMAREGLLFVGWTTNRLFIRGALVLKDQELAQGPRLHLLHISSFFLLASAAQHSQLCFFFFFSAGDQHAPINLSAATTSPASKVRGGARVQQEPVSCSVFRSNGF
LRRTQQAAVFHGVGRLLRPARALAAAGYRGVRSGGVEGGSFVFFFFAVQAATFRAVLSKASYFIAWVFFLGRFPAAVSSFIGDFWIGIRSLLSDWSVFGDLGQQVYFCFG
LEVYVVISVVQEPLSQGKGKGTGKAIDENPIGSHERTRVSYFVFFHLRTLFYLIALASKACIVVTVPLRLIKKSGRYSWYQSPVNGSVDCLHRRGRGRGRGRGRGGRGLA
LPEQVDPPMDQYEEDLPDEQAPAPPAPAQTVTLTLEALQALLNNVSANQASQVNQNENQGQVGSTAGARCLKDFKKYDPPTFSRKTVDPTEAEAWTAKIEEIFRYMGCPE
AQQVSCAVFVLRDNALLWWRSAERSIDVSSGPVTWLQFKEAFFQKYYSAIISYRKEREFLTLSQGEKSVEEYELEFTRLSRFAPEAVDTEAKKTKRFIAGLKDDVQRVVG
ALGPADYAAALRAATFMGMPTVNATPVAKESELNAGQKRKHEQTATNLQRSQPSSESSRQKTQHDKQEGNGGDKPKCNTCGRHHWGQCLARKAFTQPKTSSSPHLLFLPP
CLSRTTQPTVFFFFFSAGDQHAPINLSAATTSLASKVRGGARVQQEPVSCSVFRSNGFLRRTQQAAVFHGVGRLLRPARALAAAGYRGVRSGGVEGGSFVFFFFAVQAAT
FRAVLSKASYFIAWVFFLGRFPAAVSSFIGSLSIRSLLSDWSVFGDLGQQVYFCFGLEVYVVISVVQEPLSQGGYRCFFLVVSHFKSQQRRILKVAGILIQRVEFGEVIK
NFYNGKRFGGLVRLRRGSVVWRAVRSSGERFVDWLRRFVDRLSAGSSSFDQFLVVLARFGAVLVQFGAAEAMSSSIISLLKNEQLTGENFPQWKSNLNTILVVEDLRFVL
TEECPPVPPRTATQAVKDAHERWTKANEKVKVYILVSLSEVLAKRYENVETAREIMNSLQEMFGLPSYQLHHDALKNVFNAKMLEGQSVREHVLDMINQFNIAEANGRVV
CYAAGCVHLSLAPSELSVIQDEREHE
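
The protein sequence above corresponds to a codon-by-codon translation of the CDS sequence. The following is a minimal sketch of that translation, assuming the standard genetic code (NCBI translation table 1); the page does not say which table was used, and the name `translate` is illustrative rather than taken from any CuGenDB tool.

```python
from itertools import product

# Standard genetic code (NCBI table 1, assumed), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a CDS string, stopping at the first stop codon.

    Whitespace and line breaks are ignored; codons containing characters
    other than A, C, G, T are reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First ten codons of the CDS sequence listed on this page.
    print(translate("ATGGGCCCTACCCTCTTTATGGCACGAGAG"))
```

Applied to the first ten codons of the CDS, the sketch gives MGPTLFMARE, which matches the start of the protein sequence listed above.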