CuGenDBv2

CSPI03G20730 (gene) of Cucumber (PI 183967) v1 genome

Gene ID: CSPI03G20730
Organism: Cucumis sativus L. var. sativus cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: Chr3:16746729..16747616
RNA-Seq Expression: CSPI03G20730
Synteny: CSPI03G20730
Gene Ontology terms: GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
GO:0008270 - zinc ion binding (molecular function)
InterPro domains: IPR025314 - Domain of unknown function DUF4219
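
The genome location can be split into chromosome, start, and end. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with the 888-nt CDS listed under Sequences. The function name parse_location is hypothetical and is not part of any CuGenDB tool.

```python
import re

def parse_location(loc: str):
    """Parse a 'Chr:start..end' string into (chrom, start, end, span).

    Assumes 1-based inclusive coordinates, so span = end - start + 1.
    """
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc.strip())
    if m is None:
        raise ValueError(f"unrecognized location string: {loc!r}")
    chrom, start, end = m.group(1), int(m.group(2)), int(m.group(3))
    return chrom, start, end, end - start + 1

# Location string copied from the record above.
chrom, start, end, span = parse_location("Chr3:16746729..16747616")
print(chrom, start, end, span)
```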


Homology
GenBank top hits (columns: e value, % identity, alignment)
KAA0055915.1 copia protein [Cucumis melo var. makuwa] | e value: 9.0e-132 | % identity: 83.39
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        M NNGN MGTTQPLI IFKGEGYEFWS+RMKTLL SQDLWDLVE  Y DPDDEGKL+E R KD KALVI+QQAVHD+ FSRI   TTSK+AWLILQKAF+
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
        GD RVLVVKLQSL+++FETLMMKN ESIA+FLSRATTIISQMQTYGE ITDQTIVEKVLRSLT KFD VVAAIEESKDLSTFTFIELMGSLQAHESRIN 
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        SME+N+EKAF+VKDVVPKYN+SD VMT+G+G GGYR +GRGT KGC QNEE+ QF VQSSNKANIQCYH KKFGHVKADCWYKNQRANF+ +NEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

KAE8650579.1 hypothetical protein Csa_010963 [Cucumis sativus] | e value: 1.5e-150 | % identity: 99.29
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
        GDLRVLVVKLQSLR+EFETLMMKNRESIANFLSRATTIISQMQTYGE ITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADC
        SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADC
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADC

TYK27735.1 putative gag-pol polyprotein, identical [Cucumis melo var. makuwa] | e value: 5.5e-121 | % identity: 86.47
Query:  MKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIA
        +KTLLRSQDLWDLVE  Y DPDDEGKLRE R+KDSKALVIIQQAVHDS FSRI T TTSK+AWLILQKAF+GD RVL+VKLQSLR++FETLMMKN ESIA
Subjt:  MKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIA

Query:  NFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRG
        +FLSRATTIISQMQTYGE I DQTIVEKVLRSLT KFD VVAAIEESK+L TFTFIELMGSL+AHESRINRSMERNEEKAFQVKD VPKYN+SDRVMTRG
Subjt:  NFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRG

Query:  RGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        RGRGGYRG+G GTEKGC +NE + QF VQSSNKANIQCYH KKFGHVKADCWYKNQRANF+AENEA
Subjt:  RGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

TYK28117.1 putative gag-pol polyprotein, identical [Cucumis melo var. makuwa] | e value: 6.5e-114 | % identity: 75.59
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        M NNGN MGT QPLI IFKGEGYEFWS+RMKTLL SQDLWDLVE  Y DPDDEGKL+E R KDSKALVIIQQAVHD+ FSRI   TT             
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
                      ++FETLMMKN ESIA+FLSRATTIISQMQTYGE ITDQTIVEKVLRSLT KFD VV AIEESKDLSTFTFIELMGSLQAHESRIN 
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        SME+NEEKAF+VKDVVPKYN+SD VMT+G+G GGYR +GRGT KGC QNEE+ QF VQSSNKANIQCYH KKFGHVKADCWYKN RANF+ +NEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

XP_031738054.1 uncharacterized protein LOC116402652 [Cucumis sativus] | e value: 3.5e-160 | % identity: 99.32
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
        GDLRVLVVKLQSLR+EFETLMMKNRESIANFLSRATTIISQMQTYGE ITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

TrEMBL top hits (columns: e value, % identity, alignment)
A0A5A7UDE3 DUF4219 domain-containing protein/UBN2 domain-containing protein | e value: 3.4e-108 | % identity: 72.88
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        M +N NAMG+TQPLI IFKGEGYEFWS+R KTLLRSQ LWDLVE  YADP+DEGKL+E R+KDSK LVIIQQAVHD+ FSRI   TTSK+ WLILQKA +
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
        GD RVLV+                                  QTYGE I DQTIVEKVLRSLT KFD VVAAIEESKDLSTFTFIELMGSLQAHESRINR
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        S+E NEEKAFQVKDVVPKYN+SDRVMTRGRGRG Y G+GRGT KG  QNEE+ QF VQSSNKANIQCYH KKFGHVK DCWYKN RANF+AENEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

A0A5A7UQM0 Copia protein | e value: 4.3e-132 | % identity: 83.39
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        M NNGN MGTTQPLI IFKGEGYEFWS+RMKTLL SQDLWDLVE  Y DPDDEGKL+E R KD KALVI+QQAVHD+ FSRI   TTSK+AWLILQKAF+
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
        GD RVLVVKLQSL+++FETLMMKN ESIA+FLSRATTIISQMQTYGE ITDQTIVEKVLRSLT KFD VVAAIEESKDLSTFTFIELMGSLQAHESRIN 
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        SME+N+EKAF+VKDVVPKYN+SD VMT+G+G GGYR +GRGT KGC QNEE+ QF VQSSNKANIQCYH KKFGHVKADCWYKNQRANF+ +NEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

A0A5D3CL10 UBN2 domain-containing protein | e value: 2.4e-106 | % identity: 84.27
Query:  DLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIIS
        +LVE  YADPDDEGKLR  ++KDSK LVIIQQAVHDS FS+I   TTSK+AWLILQK F+GD RVLVVKLQSLR++FETLMMKN ESIA+FLSRATTIIS
Subjt:  DLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIIS

Query:  QMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGR
        QMQTY E I D TIVEKVLRSLT KFD VVA IEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVV KYN+SDRV TRGRGRGGYRG+G 
Subjt:  QMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGR

Query:  GTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRAN
        G EKGC QNEE+ QF VQSSNKANIQCYH KKFGHVKADCWYKNQRAN
Subjt:  GTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRAN

A0A5D3DWC7 Putative gag-pol polyprotein, identical | e value: 3.1e-114 | % identity: 75.59
Query:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR
        M NNGN MGT QPLI IFKGEGYEFWS+RMKTLL SQDLWDLVE  Y DPDDEGKL+E R KDSKALVIIQQAVHD+ FSRI   TT             
Subjt:  MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFR

Query:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR
                      ++FETLMMKN ESIA+FLSRATTIISQMQTYGE ITDQTIVEKVLRSLT KFD VV AIEESKDLSTFTFIELMGSLQAHESRIN 
Subjt:  GDLRVLVVKLQSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINR

Query:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        SME+NEEKAF+VKDVVPKYN+SD VMT+G+G GGYR +GRGT KGC QNEE+ QF VQSSNKANIQCYH KKFGHVKADCWYKN RANF+ +NEA
Subjt:  SMERNEEKAFQVKDVVPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

A0A5D3DWP2 Putative gag-pol polyprotein, identical | e value: 2.6e-121 | % identity: 86.47
Query:  MKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIA
        +KTLLRSQDLWDLVE  Y DPDDEGKLRE R+KDSKALVIIQQAVHDS FSRI T TTSK+AWLILQKAF+GD RVL+VKLQSLR++FETLMMKN ESIA
Subjt:  MKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQEFETLMMKNRESIA

Query:  NFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRG
        +FLSRATTIISQMQTYGE I DQTIVEKVLRSLT KFD VVAAIEESK+L TFTFIELMGSL+AHESRINRSMERNEEKAFQVKD VPKYN+SDRVMTRG
Subjt:  NFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYNNSDRVMTRG

Query:  RGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
        RGRGGYRG+G GTEKGC +NE + QF VQSSNKANIQCYH KKFGHVKADCWYKNQRANF+AENEA
Subjt:  RGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA

SwissProt top hits (columns: e value, % identity, alignment)
P04146 Copia protein | e value: 4.3e-04 | % identity: 22.14
Query:  FKGEGYEFWSMRMKTLLRSQDLWDLVEHNYA-DPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQE
        F GE Y  W  R++ LL  QD+  +V+     + DD  K  E+  K +     I + + DS  +   +  T+++    L   +    R  +    +LR+ 
Subjt:  FKGEGYEFWSMRMKTLLRSQDLWDLVEHNYA-DPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKLQSLRQE

Query:  FETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEE-SKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDV
          +L + +  S+ +       +IS++   G  I +   +  +L +L   +D ++ AIE  S++  T  F++    L   E +I    + N+     +  +
Subjt:  FETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEE-SKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDV

Query:  VPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCW-YKNQRANFSAENE
        V   NN+ +                      K    K +   + ++K  ++C+H  + GH+K DC+ YK    N + ENE
Subjt:  VPKYNNSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCW-YKNQRANFSAENE

Arabidopsis top hits (columns: e value, % identity, alignment)
AT1G48720.1 unknown protein | e value: 7.1e-10 | % identity: 34.21
Query:  YEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGK--------LREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKE
        Y+ WS+RMK +L + D+W++VE  + +P++EG         LR+ R++D KAL +I Q + +  F ++   T++K+
Subjt:  YEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGK--------LREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKE
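
The % identity values in the hit lists can be checked against the alignment midline. The sketch below assumes that % identity is the number of identical columns (letters on the midline; '+' marks a conservative substitution and a space marks a mismatch or gap) divided by the total number of alignment columns, gaps included. The function name percent_identity is hypothetical. The example strings are the single alignment segment of the Arabidopsis hit AT1G48720.1 shown above.

```python
def percent_identity(query: str, midline: str) -> float:
    """Percent identity of one BLAST-style alignment segment.

    Assumption: identical positions are shown as letters on the midline,
    and the denominator is the alignment length (len(query), gaps included).
    """
    if not query:
        raise ValueError("empty alignment")
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / len(query)

# Query and midline of the AT1G48720.1 alignment from this record.
query = "YEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGK--------LREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKE"
midline = "Y+ WS+RMK +L + D+W++VE  + +P++EG         LR+ R++D KAL +I Q + +  F ++   T++K+"
print(f"{percent_identity(query, midline):.2f}")
```

For hits that span several alignment blocks, the same count would be summed over all blocks before dividing.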


Sequences
CDS sequence
ATGGACAACAATGGCAATGCTATGGGTACAACACAACCACTCATTCTAATCTTCAAAGGAGAAGGCTACGAGTTTTGGAGTATGCGTATGAAGACTCTTCTCAGATCTCA
AGACTTATGGGACTTAGTAGAACACAACTATGCGGATCCTGACGACGAAGGCAAGTTGCGGGAGAAGAGGAGAAAGGACTCTAAGGCGTTAGTGATTATTCAACAAGCAG
TCCATGACAGTGGTTTTTCGCGGATTGGTACAACAACAACGTCAAAAGAAGCGTGGCTGATTTTGCAAAAGGCATTTCGAGGAGATTTAAGAGTACTTGTGGTAAAATTG
CAATCACTTAGACAAGAATTTGAGACCTTGATGATGAAAAATAGAGAATCAATTGCTAATTTTTTGTCACGGGCAACGACAATTATTAGTCAGATGCAAACATACGGCGA
GATGATTACAGATCAGACTATAGTGGAGAAAGTATTGAGAAGTTTGACTCTAAAGTTCGATCAAGTTGTGGCCGCAATAGAAGAATCAAAGGATCTGTCCACTTTCACAT
TTATTGAATTAATGGGATCTCTTCAAGCACATGAGTCAAGAATCAATAGATCGATGGAAAGAAACGAAGAAAAAGCGTTTCAGGTAAAGGATGTAGTTCCAAAGTATAAT
AACAGTGATCGTGTGATGACTCGAGGCAGAGGAAGAGGAGGATATCGTGGTCAAGGTCGTGGAACTGAAAAAGGATGCAAACAAAATGAAGAAAAAGGGCAGTTCAGAGT
GCAATCAAGCAACAAAGCTAATATTCAATGCTACCATGGCAAGAAGTTTGGTCATGTAAAGGCAGACTGCTGGTACAAAAATCAGCGAGCCAATTTTTCAGCAGAGAATG
AAGCATAA
mRNA sequence
ATGGACAACAATGGCAATGCTATGGGTACAACACAACCACTCATTCTAATCTTCAAAGGAGAAGGCTACGAGTTTTGGAGTATGCGTATGAAGACTCTTCTCAGATCTCA
AGACTTATGGGACTTAGTAGAACACAACTATGCGGATCCTGACGACGAAGGCAAGTTGCGGGAGAAGAGGAGAAAGGACTCTAAGGCGTTAGTGATTATTCAACAAGCAG
TCCATGACAGTGGTTTTTCGCGGATTGGTACAACAACAACGTCAAAAGAAGCGTGGCTGATTTTGCAAAAGGCATTTCGAGGAGATTTAAGAGTACTTGTGGTAAAATTG
CAATCACTTAGACAAGAATTTGAGACCTTGATGATGAAAAATAGAGAATCAATTGCTAATTTTTTGTCACGGGCAACGACAATTATTAGTCAGATGCAAACATACGGCGA
GATGATTACAGATCAGACTATAGTGGAGAAAGTATTGAGAAGTTTGACTCTAAAGTTCGATCAAGTTGTGGCCGCAATAGAAGAATCAAAGGATCTGTCCACTTTCACAT
TTATTGAATTAATGGGATCTCTTCAAGCACATGAGTCAAGAATCAATAGATCGATGGAAAGAAACGAAGAAAAAGCGTTTCAGGTAAAGGATGTAGTTCCAAAGTATAAT
AACAGTGATCGTGTGATGACTCGAGGCAGAGGAAGAGGAGGATATCGTGGTCAAGGTCGTGGAACTGAAAAAGGATGCAAACAAAATGAAGAAAAAGGGCAGTTCAGAGT
GCAATCAAGCAACAAAGCTAATATTCAATGCTACCATGGCAAGAAGTTTGGTCATGTAAAGGCAGACTGCTGGTACAAAAATCAGCGAGCCAATTTTTCAGCAGAGAATG
AAGCATAA
Protein sequence
MDNNGNAMGTTQPLILIFKGEGYEFWSMRMKTLLRSQDLWDLVEHNYADPDDEGKLREKRRKDSKALVIIQQAVHDSGFSRIGTTTTSKEAWLILQKAFRGDLRVLVVKL
QSLRQEFETLMMKNRESIANFLSRATTIISQMQTYGEMITDQTIVEKVLRSLTLKFDQVVAAIEESKDLSTFTFIELMGSLQAHESRINRSMERNEEKAFQVKDVVPKYN
NSDRVMTRGRGRGGYRGQGRGTEKGCKQNEEKGQFRVQSSNKANIQCYHGKKFGHVKADCWYKNQRANFSAENEA
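
The protein sequence corresponds to a translation of the CDS. The sketch below assumes the standard genetic code (NCBI translation table 1) and reading frame 1, and stops at the first stop codon. The function name translate and the table construction are illustrative and are not taken from CuGenDB. The two fragments used in the example are the first 24 and the last 18 nucleotides of the CDS above.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds: str) -> str:
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGACAACAATGGCAATGCTATG"))  # start of the CDS
print(translate("GCAGAGAATGAAGCATAA"))        # end of the CDS, including the stop codon
```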