CuGenDBv2

Lag0010873 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0010873
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Retrotrans_gag domain-containing protein
Genome location: chr1:8351033..8368336
RNA-Seq Expression: Lag0010873
Synteny: Lag0010873
Gene Ontology terms: NA
InterPro domains:
    IPR005162 - Retrotransposon gag domain
    IPR029480 - Transposase-associated domain
    IPR043128 - Reverse transcriptase/Diguanylate cyclase domain
    IPR043502 - DNA/RNA polymerase superfamily
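The genome location field packs a chromosome name and a coordinate range into one string. The following is a minimal sketch of how such a string could be split and the locus length computed; `parse_location` is a hypothetical helper, and the length assumes the coordinates are 1-based and inclusive, which this page does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into its three parts."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_location("chr1:8351033..8368336")
# Assumption: 1-based, inclusive coordinates, so both endpoints count.
span = end - start + 1
print(chrom, start, end, span)  # chr1 8351033 8368336 17304
```

Under that assumption the locus covers 17,304 bp, which is longer than the CDS listed further down the page.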


Homology
GenBank top hits (e value, % identity, alignment)
ERM93404.1 hypothetical protein AMTR_s04947p00003620 [Amborella trichopoda] (e value: 1.9e-82; identity: 71.92%)
Query:  QAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRD
        Q  NPI++A+DR RAIR YA PMFNELNPGI RP+IQA  FE+KPVMFQMLQTVGQF G+ +EDPHLHL+SFL VSDSF IQGV  + LRL LFP+SLRD
Subjt:  QAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRD

Query:  GAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGAL
         A+SWLN+  P S+  W++LAEKFL KYFPP RNAK RSEI+ F+QLEDE+ S+AWERFKELLRKCPHHG+PHCIQMETFYNGLN  ++ ++DASA GA+
Subjt:  GAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGAL

Query:  LAK
        L+K
Subjt:  LAK

XP_022926214.1 uncharacterized protein LOC111433394 [Cucurbita moschata] (e value: 5.1e-80; identity: 64.29%)
Query:  RPQNRLLQQN-PLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDP
        + + ++ +QN    E   Q N + ENP ++AN             DR RAIRAYA P   ELNP I RP++QA  FE+KPVMFQMLQT+GQFHGL SEDP
Subjt:  RPQNRLLQQN-PLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDP

Query:  HLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRK
        HLHLKSFLGVSDSF  Q V +D +RL+LFPYSLRDGAKSWLN+ A G+I +W+ L EKFL KYFPP RNA+ R+EIV F+Q ED+T SEAWERFKE+LRK
Subjt:  HLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRK

Query:  CPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAK
        CPHHGLPHCIQMETFYNGLN  T+ +VDASA GA+L+K
Subjt:  CPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAK

XP_022960432.1 uncharacterized protein LOC111461168 [Cucurbita moschata] (e value: 3.8e-83; identity: 67.24%)
Query:  LLQQNPLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKS
        + +QN   +Q  Q N + ENP+++AN             DR RAIRAYA P  +ELNP I RP++QA  FE+KPVMFQMLQT+GQFHGL SEDPHLHLKS
Subjt:  LLQQNPLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKS

Query:  FLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGL
        FLGVSDSF  QGV +D +RL+LFPYSLRDGAKSWLN+ AP +I +W+ LAEKFL KYFPP RNA+ R+EIV F+Q EDET SEAWERFKE+LRKCPHHGL
Subjt:  FLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGL

Query:  PHCIQMETFYNGLNGVTQGMVDASAGGALLAK
        PHCIQMETFYNGLN  T+ +VDASA GA+L+K
Subjt:  PHCIQMETFYNGLNGVTQGMVDASAGGALLAK

XP_030497803.1 uncharacterized protein LOC115713460 [Cannabis sativa] (e value: 4.7e-81; identity: 70.73%)
Query:  NNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSL
        +N+A NPI +A+DRTRAIR YA PMFNELNPGI RP+IQA +FE+KPVMFQMLQTVGQF G  +EDPHLH++SFL VSDSF +QGV  +ALRL LFP+SL
Subjt:  NNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSL

Query:  RDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGG
        RD A++WLN+  P S+  W++LAEKFL KYFPP RNAK RSEI+ F+Q EDET S+AWERFKELLRKCPHHG+PHCIQ+ETFYNGLN  ++ ++DASA G
Subjt:  RDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGG

Query:  ALLAK
        A+L+K
Subjt:  ALLAK

XP_030508936.1 uncharacterized protein LOC115723589 [Cannabis sativa] (e value: 2.3e-80; identity: 70.24%)
Query:  NNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSL
        +N+A NPI +A+DR RAIR YA PMFNELNPGI RP+IQA +FE+KPVMFQMLQTVGQF G  +EDPHLH++SFL VSDSF +QGV  +ALRL LFP+SL
Subjt:  NNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSL

Query:  RDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGG
        RD A++WLN+  P S+  W++LAEKFL KYFPP RNAK RSEI+ F+QLEDET S+AWERFKELLRKCPHHG+PHCIQ+ETFYNGLN   + ++DA A G
Subjt:  RDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGG

Query:  ALLAK
        A+L+K
Subjt:  ALLAK

TrEMBL top hits (e value, % identity, alignment)
A0A6J1EEI2 uncharacterized protein LOC111433394 (e value: 2.5e-80; identity: 64.29%)
Query:  RPQNRLLQQN-PLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDP
        + + ++ +QN    E   Q N + ENP ++AN             DR RAIRAYA P   ELNP I RP++QA  FE+KPVMFQMLQT+GQFHGL SEDP
Subjt:  RPQNRLLQQN-PLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDP

Query:  HLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRK
        HLHLKSFLGVSDSF  Q V +D +RL+LFPYSLRDGAKSWLN+ A G+I +W+ L EKFL KYFPP RNA+ R+EIV F+Q ED+T SEAWERFKE+LRK
Subjt:  HLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRK

Query:  CPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAK
        CPHHGLPHCIQMETFYNGLN  T+ +VDASA GA+L+K
Subjt:  CPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGALLAK

A0A6J1EQ90 uncharacterized protein LOC111436411 (e value: 1.0e-78; identity: 67.4%)
Query:  QQNPLFEQNEQQNNQ---AENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGV-------S
        Q N  FE      NQ     NPI +A+DR RAIRAYA P   ELNP I RP+IQ   FE+KPVMFQMLQT+GQFHGL  EDPHLHLKSFLGV       S
Subjt:  QQNPLFEQNEQQNNQ---AENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGV-------S

Query:  DSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQ
        DSF  QGV +D +RL+LFPY LRDGAKSWLN+ APG+I +W+ LAE FL KYFPP RNA+ ++EIV F+Q EDET SEA ERFKE+LRKCPHHGLPHCIQ
Subjt:  DSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQ

Query:  METFYNGLNGVTQGMVDASAGGALLAK
        METFYNGLN VT+ +VDASA GA+L+K
Subjt:  METFYNGLNGVTQGMVDASAGGALLAK

A0A6J1G7Q6 uncharacterized protein LOC111451598 (e value: 1.2e-77; identity: 69.42%)
Query:  QNNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYS
        Q     N I +A+DR RAIRAYA P   ELNP I RP++QA  FE+KPVMFQMLQT+GQFHGLSS+DPHLHLKSFLGVSDSF  QGV +D +RL+ F YS
Subjt:  QNNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYS

Query:  LRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAG
        LRDGAKSWLN  A G I +W+ LAEKFL KYFPP R+A+ R+EIV F++ E+ET SEAWERFKE LRKCPHHGLPHCIQ+ETFYNGLN  T+ +VDASA 
Subjt:  LRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAG

Query:  GALLAK
        G +L+K
Subjt:  GALLAK

A0A6J1H7E4 uncharacterized protein LOC111461168 (e value: 1.8e-83; identity: 67.24%)
Query:  LLQQNPLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKS
        + +QN   +Q  Q N + ENP+++AN             DR RAIRAYA P  +ELNP I RP++QA  FE+KPVMFQMLQT+GQFHGL SEDPHLHLKS
Subjt:  LLQQNPLFEQNEQQNNQAENPILIAN-------------DRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKS

Query:  FLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGL
        FLGVSDSF  QGV +D +RL+LFPYSLRDGAKSWLN+ AP +I +W+ LAEKFL KYFPP RNA+ R+EIV F+Q EDET SEAWERFKE+LRKCPHHGL
Subjt:  FLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGL

Query:  PHCIQMETFYNGLNGVTQGMVDASAGGALLAK
        PHCIQMETFYNGLN  T+ +VDASA GA+L+K
Subjt:  PHCIQMETFYNGLNGVTQGMVDASAGGALLAK

U5CUI2 Retrotrans_gag domain-containing protein (e value: 9.2e-83; identity: 71.92%)
Query:  QAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRD
        Q  NPI++A+DR RAIR YA PMFNELNPGI RP+IQA  FE+KPVMFQMLQTVGQF G+ +EDPHLHL+SFL VSDSF IQGV  + LRL LFP+SLRD
Subjt:  QAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLKSFLGVSDSFVIQGVPRDALRLTLFPYSLRD

Query:  GAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGAL
         A+SWLN+  P S+  W++LAEKFL KYFPP RNAK RSEI+ F+QLEDE+ S+AWERFKELLRKCPHHG+PHCIQMETFYNGLN  ++ ++DASA GA+
Subjt:  GAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETFYNGLNGVTQGMVDASAGGAL

Query:  LAK
        L+K
Subjt:  LAK

SwissProt top hits (e value, % identity, alignment)
P20825 Retrovirus-related Pol polyprotein from transposon 297 (e value: 8.2e-04; identity: 58.97%)
Query:  MHEEGVAKTAFRTHEGHYEFPVMSFGLTNALANFQSLMN
        M EE ++KTAF T  GHYE+  M FGL NA A FQ  MN
Subjt:  MHEEGVAKTAFRTHEGHYEFPVMSFGLTNALANFQSLMN
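The % identity figure for a hit can be reproduced from the middle line of its alignment. The sketch below assumes the usual BLAST pairwise convention, in which a letter on the middle line marks an identical residue, `+` marks a conservative substitution, and a space marks a mismatch. `percent_identity` is a hypothetical helper, and the aligned length is taken to be the query length because this alignment shows no gaps.

```python
def percent_identity(midline: str, aligned_length: int) -> float:
    """Percent of aligned columns whose midline character is a letter."""
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / aligned_length


# Query and middle line copied from the P20825 alignment above.
query = "MHEEGVAKTAFRTHEGHYEFPVMSFGLTNALANFQSLMN"
midline = "M EE ++KTAF T  GHYE+  M FGL NA A FQ  MN"

print(round(percent_identity(midline, len(query)), 2))  # 58.97
```

Here 23 of 39 columns are identical, which matches the 58.97 reported for this hit.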

Arabidopsis top hits (e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGCACGAAGAAGGTGTAGCCAAGACAGCCTTCCGAACGCATGAGGGGCACTATGAATTCCCCGTAATGTCATTTGGGTTAACCAACGCTTTGGCGAATTTTCAATCTCT
CATGAATCAGTCCTCTTCAACATTGGAGAAGTCCAAGAGTCCTGGGAGAATTAACATTCTTTTGTGGATTATGCTCAATGGGAATCTCAATACTTCAGACGTTCTACGGA
AGAAGAACCCATCTCATTGCATATTGCCATTCGTTTGTATTTTATGTCTTCAGGATGAAGATTCTCTCAACCATATTTCTTTAATTGCTCTTATGCCAGACTTGTTTACT
GTTTTGAAAGACCAACGAGCCCTCCGCCACTTGTTGGAGCAAAGGGAAATACAGTCGGAATACCAGAAGTGGTTAACCAAGCTCCTAGGCTATGACTTCGAAATAAAATA
CCATTCGAGCCTCCTCAACAAGGCTGTTGATGCCCTGTCCAGAGTCACACATGAAAGTGAGCTGAGGGCTTTGACTGTTCCGCCGTTGGTGGATGTGGAAATTGTGCAGA
GGGAAGTTGAGCAAGATCTCGGGTTGGTCAAGATCTGTCAAGAGTTGGGAGTTGTATTGGAAGGGGATGAAGAATCAGATCGAGAAATATGTGGAACAGTGCACTGTTCG
CCAGAAAAATGTGTTAGATTTGTTGGGTTTTATGCCCTAAAACTCGTAGATAGTGAATCTGGTGTTACGGACACTCGTGAAGGACTAACTAGTCAATATTGGTCTATATC
CGTGGACACAGAAAATATGTCTGCAGTGAGAAGAGTGCAACTGAGAAATTTCCAGCGATCACAAGAAAAGGAAGCTGCTGCGTTTTCGTTCGTGGAGCGTCGTTGGCGAA
GAACGGTCAAGTCTACAACGAAGACCCCACAGGCAGAGATCCGTAGCGAAGCGCCGATGTCAGAAACTTGCTCCATTGTATTATGGGCCTTTCAAGGTCTTACAGAAGAT
CGGGTCATTTTCTCTCTCCCTGATACTTACCCATCTTCTCAGTTTTCTCCGCCAGCTCTCCCACACAAAAATCTCCGCTCAGCTCTTGTCTGTCGTTCGCCGCCAGACCA
CCCGTCGTTCTCGCCGTCGTTCTCCGCCGTTCTCACCATTGTTCGCCGCCAACAAGTCTCCGCCAGCCTCGTTGTCTTTTCCTCCCCGTGGTTCGAAATCTTGTCGTGGG
TGTTCTCGCGCCGGCGTTTTGGTGGTGTTCTCTCCGCCGTCTCGTCTTCGTTTCTCCTCCACAATCTCGCCGTCGTTCTCGTGAGTGTTGTCCCTGGTGTCCTCAGCTCC
AAATCCGTGGGGTTTTCTCCAAAGAACCGTGGGAAATTTTGTTGTTCATTCTTCTTCCCGCTCCATTACAAGAATCGAATGTCACCAGAATATGGTCTAGGGGTTGAGAT
GTTCATTGAAAATGCACTACGACATTCAAGTAGGTCAAATGTCATTTGTTGTCCATGTTTGCGATGTGTGAATGCAAAGTCTCAAGGTGCAAATACAGTAAGAGATCACT
TATTCATTTATGGTATTGATGAGAGCTATAAAACATGGATTTGGAATGGTGAATATCTACCGATTAAGAGAAATAATGCAAACGTCTCTTCTAGTGGAAAAGAAGAAGTT
GAGGATGAAGATGGAGATCTTGATGACATCATTGAGAATGTGGCAAGCAAAGAACTCACTCCCACAATATCGCTCGAACATAATTTGGAAACTTATAAAGCATATACGCA
AGAAGAAATTGACGAGGTTCGGATAGAATGGGCAAGATTTGTGTATAGAGGATTATTTTTGGATGACCTGTGGAGGGACGGATGGTCCCATTGTATGTTGTGTAAACTTC
TGCTGGATGTTGTTTATGGAGGGATGGGTGGTCCCATATGCATGATTGGACTTGATCAAGCCTCAAAAGCTTTAGTCAACGCATCTGCTAATGGATCTTTCCCAAAGAAG
TCAGCAAATGAAGCACATGCGATTCTAGATACCATTGCAACAAATAACCAAAATTGGGGAGAGAATGAAATTACCATTGCAAAGAATCCATTCAAAGCTTTGGACACGAA
AGCAAATTCTACCATGCAAGCACAAATCAATGTCATTCACAATATGATGAAGAACTTGACCATGGGGAATCAAGCGAATATTGCTCCCGCGAATGCTATCTCTACTCCTT
GTGATATTTTTAATGAAAAACATATAACAGATACTTGTCCATTAAATCCCACTTCTGTTTTCTATGTAGGACCACAAGGAAATCAGAATCGACAGTTAGCTTATGAATTT
AGGAATAGACCACAAGGAGCATTGCCAAGCAAAACAGAAAATCCTTATTGTGAAGGGAAGGAGCAAGGTAAAGCAGTGACCTTGAGGAGCGAATTAGACCACTTAACCGG
AAAAGTCAACAAGCTTAAGAACCGTATGGTGATTAATTCAACCAATGCATCAAAATTAATTCCTAATCGGTCTTTTCACCTCCTCATCATCCCTGAAATTCGAGATGGCT
CAAAAACACTCCAAATAGATCGAATTAGCACCAATTTGTGTGGTTTGTTCCCAATGCCTCCATTTCTCAAATCACCTATTGTGAAACAGACAAGGAAGTACCGATCATTT
TATGTCGACCCTTTCCTAGCGACAGGGAGAACCTTGATTGACGTACAAAAGGGAGAATTGACCATGACGATCAATGATCAACAAGTTACTTTCAATGTCATGAATGCACT
CAAGTTTCCAGGTGAATTTGAGGAGTGTTTTGTAGTTGATGAGATTGATGAACTTTCCTTATCTACTTTGAAGGAACTTATGAAAGGTTTTATCAAGCCACAAAAAGGCT
ATAGGTTGGACCATTGCAGATATACAGGTGATAACTGCCCAAAAGATTATGCTGCTGGGCGACTTGAGGAAGCAAATTCTGTGCTGCAGCAAAACTGGGAACAGAACTGC
CACATCACAGCTCGTGTGAGCTTGGTGCATGAGCGATCCGCCTGGGGTAAGGTCCGCAGACCCCAGAATCGTTTGCTGCAGCAAAACCCGCTGTTTGAACAAAATGAGCA
GCAAAATAATCAGGCTGAGAATCCTATCTTGATAGCGAACGATAGGACCAGAGCCATTCGAGCGTATGCTGTCCCAATGTTTAATGAGTTGAATCCAGGGATTGCACGTC
CCCAAATCCAAGCGGCAAATTTTGAAATGAAACCGGTAATGTTTCAGATGTTGCAAACCGTGGGGCAATTCCATGGTTTGTCATCTGAAGACCCTCATTTACATCTTAAG
TCTTTTCTAGGAGTTAGTGATTCTTTTGTAATTCAAGGAGTGCCTAGAGATGCTCTTAGATTAACTTTGTTCCCGTATTCTCTTAGAGATGGAGCAAAGTCATGGTTGAA
CTCTTTTGCTCCAGGATCAATTAGGACATGGGATGAGTTAGCTGAAAAATTTTTGAGTAAATATTTCCCACCTAATAGAAATGCTAAATTAAGGAGTGAAATAGTAGGGT
TTAGGCAACTTGAGGATGAGACTTTTAGTGAGGCTTGGGAAAGGTTTAAGGAGCTTTTGCGAAAGTGTCCCCACCATGGTTTACCTCATTGTATTCAAATGGAAACATTT
TACAATGGTTTAAATGGAGTAACCCAAGGTATGGTCGATGCTTCGGCTGGAGGGGCCCTTTTGGCAAAACTTTTGATGAAGCTATGA
mRNA sequence
ATGCACGAAGAAGGTGTAGCCAAGACAGCCTTCCGAACGCATGAGGGGCACTATGAATTCCCCGTAATGTCATTTGGGTTAACCAACGCTTTGGCGAATTTTCAATCTCT
CATGAATCAGTCCTCTTCAACATTGGAGAAGTCCAAGAGTCCTGGGAGAATTAACATTCTTTTGTGGATTATGCTCAATGGGAATCTCAATACTTCAGACGTTCTACGGA
AGAAGAACCCATCTCATTGCATATTGCCATTCGTTTGTATTTTATGTCTTCAGGATGAAGATTCTCTCAACCATATTTCTTTAATTGCTCTTATGCCAGACTTGTTTACT
GTTTTGAAAGACCAACGAGCCCTCCGCCACTTGTTGGAGCAAAGGGAAATACAGTCGGAATACCAGAAGTGGTTAACCAAGCTCCTAGGCTATGACTTCGAAATAAAATA
CCATTCGAGCCTCCTCAACAAGGCTGTTGATGCCCTGTCCAGAGTCACACATGAAAGTGAGCTGAGGGCTTTGACTGTTCCGCCGTTGGTGGATGTGGAAATTGTGCAGA
GGGAAGTTGAGCAAGATCTCGGGTTGGTCAAGATCTGTCAAGAGTTGGGAGTTGTATTGGAAGGGGATGAAGAATCAGATCGAGAAATATGTGGAACAGTGCACTGTTCG
CCAGAAAAATGTGTTAGATTTGTTGGGTTTTATGCCCTAAAACTCGTAGATAGTGAATCTGGTGTTACGGACACTCGTGAAGGACTAACTAGTCAATATTGGTCTATATC
CGTGGACACAGAAAATATGTCTGCAGTGAGAAGAGTGCAACTGAGAAATTTCCAGCGATCACAAGAAAAGGAAGCTGCTGCGTTTTCGTTCGTGGAGCGTCGTTGGCGAA
GAACGGTCAAGTCTACAACGAAGACCCCACAGGCAGAGATCCGTAGCGAAGCGCCGATGTCAGAAACTTGCTCCATTGTATTATGGGCCTTTCAAGGTCTTACAGAAGAT
CGGGTCATTTTCTCTCTCCCTGATACTTACCCATCTTCTCAGTTTTCTCCGCCAGCTCTCCCACACAAAAATCTCCGCTCAGCTCTTGTCTGTCGTTCGCCGCCAGACCA
CCCGTCGTTCTCGCCGTCGTTCTCCGCCGTTCTCACCATTGTTCGCCGCCAACAAGTCTCCGCCAGCCTCGTTGTCTTTTCCTCCCCGTGGTTCGAAATCTTGTCGTGGG
TGTTCTCGCGCCGGCGTTTTGGTGGTGTTCTCTCCGCCGTCTCGTCTTCGTTTCTCCTCCACAATCTCGCCGTCGTTCTCGTGAGTGTTGTCCCTGGTGTCCTCAGCTCC
AAATCCGTGGGGTTTTCTCCAAAGAACCGTGGGAAATTTTGTTGTTCATTCTTCTTCCCGCTCCATTACAAGAATCGAATGTCACCAGAATATGGTCTAGGGGTTGAGAT
GTTCATTGAAAATGCACTACGACATTCAAGTAGGTCAAATGTCATTTGTTGTCCATGTTTGCGATGTGTGAATGCAAAGTCTCAAGGTGCAAATACAGTAAGAGATCACT
TATTCATTTATGGTATTGATGAGAGCTATAAAACATGGATTTGGAATGGTGAATATCTACCGATTAAGAGAAATAATGCAAACGTCTCTTCTAGTGGAAAAGAAGAAGTT
GAGGATGAAGATGGAGATCTTGATGACATCATTGAGAATGTGGCAAGCAAAGAACTCACTCCCACAATATCGCTCGAACATAATTTGGAAACTTATAAAGCATATACGCA
AGAAGAAATTGACGAGGTTCGGATAGAATGGGCAAGATTTGTGTATAGAGGATTATTTTTGGATGACCTGTGGAGGGACGGATGGTCCCATTGTATGTTGTGTAAACTTC
TGCTGGATGTTGTTTATGGAGGGATGGGTGGTCCCATATGCATGATTGGACTTGATCAAGCCTCAAAAGCTTTAGTCAACGCATCTGCTAATGGATCTTTCCCAAAGAAG
TCAGCAAATGAAGCACATGCGATTCTAGATACCATTGCAACAAATAACCAAAATTGGGGAGAGAATGAAATTACCATTGCAAAGAATCCATTCAAAGCTTTGGACACGAA
AGCAAATTCTACCATGCAAGCACAAATCAATGTCATTCACAATATGATGAAGAACTTGACCATGGGGAATCAAGCGAATATTGCTCCCGCGAATGCTATCTCTACTCCTT
GTGATATTTTTAATGAAAAACATATAACAGATACTTGTCCATTAAATCCCACTTCTGTTTTCTATGTAGGACCACAAGGAAATCAGAATCGACAGTTAGCTTATGAATTT
AGGAATAGACCACAAGGAGCATTGCCAAGCAAAACAGAAAATCCTTATTGTGAAGGGAAGGAGCAAGGTAAAGCAGTGACCTTGAGGAGCGAATTAGACCACTTAACCGG
AAAAGTCAACAAGCTTAAGAACCGTATGGTGATTAATTCAACCAATGCATCAAAATTAATTCCTAATCGGTCTTTTCACCTCCTCATCATCCCTGAAATTCGAGATGGCT
CAAAAACACTCCAAATAGATCGAATTAGCACCAATTTGTGTGGTTTGTTCCCAATGCCTCCATTTCTCAAATCACCTATTGTGAAACAGACAAGGAAGTACCGATCATTT
TATGTCGACCCTTTCCTAGCGACAGGGAGAACCTTGATTGACGTACAAAAGGGAGAATTGACCATGACGATCAATGATCAACAAGTTACTTTCAATGTCATGAATGCACT
CAAGTTTCCAGGTGAATTTGAGGAGTGTTTTGTAGTTGATGAGATTGATGAACTTTCCTTATCTACTTTGAAGGAACTTATGAAAGGTTTTATCAAGCCACAAAAAGGCT
ATAGGTTGGACCATTGCAGATATACAGGTGATAACTGCCCAAAAGATTATGCTGCTGGGCGACTTGAGGAAGCAAATTCTGTGCTGCAGCAAAACTGGGAACAGAACTGC
CACATCACAGCTCGTGTGAGCTTGGTGCATGAGCGATCCGCCTGGGGTAAGGTCCGCAGACCCCAGAATCGTTTGCTGCAGCAAAACCCGCTGTTTGAACAAAATGAGCA
GCAAAATAATCAGGCTGAGAATCCTATCTTGATAGCGAACGATAGGACCAGAGCCATTCGAGCGTATGCTGTCCCAATGTTTAATGAGTTGAATCCAGGGATTGCACGTC
CCCAAATCCAAGCGGCAAATTTTGAAATGAAACCGGTAATGTTTCAGATGTTGCAAACCGTGGGGCAATTCCATGGTTTGTCATCTGAAGACCCTCATTTACATCTTAAG
TCTTTTCTAGGAGTTAGTGATTCTTTTGTAATTCAAGGAGTGCCTAGAGATGCTCTTAGATTAACTTTGTTCCCGTATTCTCTTAGAGATGGAGCAAAGTCATGGTTGAA
CTCTTTTGCTCCAGGATCAATTAGGACATGGGATGAGTTAGCTGAAAAATTTTTGAGTAAATATTTCCCACCTAATAGAAATGCTAAATTAAGGAGTGAAATAGTAGGGT
TTAGGCAACTTGAGGATGAGACTTTTAGTGAGGCTTGGGAAAGGTTTAAGGAGCTTTTGCGAAAGTGTCCCCACCATGGTTTACCTCATTGTATTCAAATGGAAACATTT
TACAATGGTTTAAATGGAGTAACCCAAGGTATGGTCGATGCTTCGGCTGGAGGGGCCCTTTTGGCAAAACTTTTGATGAAGCTATGA
Protein sequence
MHEEGVAKTAFRTHEGHYEFPVMSFGLTNALANFQSLMNQSSSTLEKSKSPGRINILLWIMLNGNLNTSDVLRKKNPSHCILPFVCILCLQDEDSLNHISLIALMPDLFT
VLKDQRALRHLLEQREIQSEYQKWLTKLLGYDFEIKYHSSLLNKAVDALSRVTHESELRALTVPPLVDVEIVQREVEQDLGLVKICQELGVVLEGDEESDREICGTVHCS
PEKCVRFVGFYALKLVDSESGVTDTREGLTSQYWSISVDTENMSAVRRVQLRNFQRSQEKEAAAFSFVERRWRRTVKSTTKTPQAEIRSEAPMSETCSIVLWAFQGLTED
RVIFSLPDTYPSSQFSPPALPHKNLRSALVCRSPPDHPSFSPSFSAVLTIVRRQQVSASLVVFSSPWFEILSWVFSRRRFGGVLSAVSSSFLLHNLAVVLVSVVPGVLSS
KSVGFSPKNRGKFCCSFFFPLHYKNRMSPEYGLGVEMFIENALRHSSRSNVICCPCLRCVNAKSQGANTVRDHLFIYGIDESYKTWIWNGEYLPIKRNNANVSSSGKEEV
EDEDGDLDDIIENVASKELTPTISLEHNLETYKAYTQEEIDEVRIEWARFVYRGLFLDDLWRDGWSHCMLCKLLLDVVYGGMGGPICMIGLDQASKALVNASANGSFPKK
SANEAHAILDTIATNNQNWGENEITIAKNPFKALDTKANSTMQAQINVIHNMMKNLTMGNQANIAPANAISTPCDIFNEKHITDTCPLNPTSVFYVGPQGNQNRQLAYEF
RNRPQGALPSKTENPYCEGKEQGKAVTLRSELDHLTGKVNKLKNRMVINSTNASKLIPNRSFHLLIIPEIRDGSKTLQIDRISTNLCGLFPMPPFLKSPIVKQTRKYRSF
YVDPFLATGRTLIDVQKGELTMTINDQQVTFNVMNALKFPGEFEECFVVDEIDELSLSTLKELMKGFIKPQKGYRLDHCRYTGDNCPKDYAAGRLEEANSVLQQNWEQNC
HITARVSLVHERSAWGKVRRPQNRLLQQNPLFEQNEQQNNQAENPILIANDRTRAIRAYAVPMFNELNPGIARPQIQAANFEMKPVMFQMLQTVGQFHGLSSEDPHLHLK
SFLGVSDSFVIQGVPRDALRLTLFPYSLRDGAKSWLNSFAPGSIRTWDELAEKFLSKYFPPNRNAKLRSEIVGFRQLEDETFSEAWERFKELLRKCPHHGLPHCIQMETF
YNGLNGVTQGMVDASAGGALLAKLLMKL
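As a consistency check between the CDS and protein sequences listed above, the CDS can be translated codon by codon. The following is a minimal sketch that assumes the standard genetic code (NCBI translation table 1); the page does not say which table was used, and `translate` is a hypothetical helper. Only the first and last few codons of the CDS are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons ordered over the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


print(translate("ATGCACGAAGAAGGTGTAGCC"))  # MHEEGVA (start of the protein)
print(translate("CTTTTGATGAAGCTATGA"))     # LLMKL (end of the protein, then stop)
```

Both fragments agree with the corresponding ends of the protein sequence, so passing the full CDS to the same function would be expected to reproduce the whole protein.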