CuGenDBv2

CSPI01G25250 (gene) of Cucumber (PI 183967) v1 genome

Gene ID: CSPI01G25250
Organism: Cucumis sativus L. var. sativus cv. PI 183967 (Cucumber (PI 183967) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: Chr1:20709717..20714731
RNA-Seq Expression: CSPI01G25250
Synteny: CSPI01G25250
Gene Ontology terms:
  GO:0006260 - DNA replication (biological process)
  GO:0032508 - DNA duplex unwinding (biological process)
  GO:0003697 - single-stranded DNA binding (molecular function)
  GO:0005524 - ATP binding (molecular function)
  GO:0043139 - 5'-3' DNA helicase activity (molecular function)
InterPro domains:
  IPR027417 - P-loop containing nucleoside triphosphate hydrolase
  IPR029472 - Retrotransposon Copia-like, N-terminal
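
The genome location field above uses a `chromosome:start..end` layout. As a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, not something the record states), the span of the gene on the chromosome can be computed as follows. The helper name `parse_location` is hypothetical.

```python
import re

def parse_location(loc):
    """Split a 'Chr:start..end' string into (chromosome, start, end)."""
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

chrom, start, end = parse_location("Chr1:20709717..20714731")

# Assumption: 1-based inclusive coordinates, so both end points are counted.
span = end - start + 1
print(chrom, start, end, span)  # Chr1 20709717 20714731 5015
```

Under a 0-based, half-open convention the span would be one base shorter, so the convention is worth confirming before using the number.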


Homology
GenBank top hits (e value, % identity, alignment)
XP_022152756.1 uncharacterized protein LOC111020399 [Momordica charantia] | e value: 4.4e-48 | % identity: 37.59
Query:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE
        Y NP+FLHH+DNTSLVL ++ LT ENY SW+++M I L+VKNK+GFVD +I + TGDLL +W I NN+VISWILNS+SK +SASILFS S R IW+DLKE
Subjt:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE

Query:  RFQKKNVPRIFQLKRSLATLEQNQDTI-----------------------------GVNDRFPPNVISH---GLMGLNENISQARAQLLLMDPLPSTSRA
        RF+K+N PRIFQL+R L+ L Q+Q ++                             GV +        H    LMGLNE+ SQ R QLLLM+P P+ +R 
Subjt:  RFQKKNVPRIFQLKRSLATLEQNQDTI-----------------------------GVNDRFPPNVISH---GLMGLNENISQARAQLLLMDPLPSTSRA

Query:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNASNQPTTSHKA
        FSL+ QE QQR+I   +ST+P          ++ T L  R S++S     ++S S  ++ S Q   +T          NLL                   
Subjt:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNASNQPTTSHKA

Query:  GTSYSFPLWIIDPGASTHISCCKSHFASIQPCSTSIRLPNKQVFEVSALTRDLLVDVSFSTNGCVTQDKFTLKKIGNAELLYGLYVFKLGNTLDLQSTIC
                                                     VSALT D  V V F+ N C+ QDK + K IG AE  +GLY+  +     L   +C
Subjt:  GTSYSFPLWIIDPGASTHISCCKSHFASIQPCSTSIRLPNKQVFEVSALTRDLLVDVSFSTNGCVTQDKFTLKKIGNAELLYGLYVFKLGNTLDLQSTIC

Query:  --ALQNW
          A   W
Subjt:  --ALQNW

XP_022154973.1 uncharacterized protein LOC111022117 [Momordica charantia] | e value: 1.8e-46 | % identity: 37.72
Query:  LHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIRN-NIVISWILNSVSKPVSASILFSGSTRAIWIDLKERFQKKN
        +HHND ++LVL ++ LT  NYVSW+++MTI LS+KNK+GF++ ++ K  GDLLP WIRN ++VI+W LNSVSKP+SAS++F+ ST  IW+DLK+RFQ +N
Subjt:  LHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIRN-NIVISWILNSVSKPVSASILFSGSTRAIWIDLKERFQKKN

Query:  VPRIFQLKRSLATLEQNQDTIGVN-------------------------------DRFPP-NVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSLLLQ
         P+IFQL+R LATL Q+Q ++ +                                ++F     +   LMGLNE+ +  RAQ+LLMDP PS  +AFSL+ Q
Subjt:  VPRIFQLKRSLATLEQNQDTIGVN-------------------------------DRFPP-NVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSLLLQ

Query:  EEQQRSIGSFSSTAPTMAFAI---------------------------------------------------------SVTSSMDTLLDIRPSNNSIATQ
        EEQQR I  FS+ +P +  A+                                                         S+TSS+ +     P++NS +T 
Subjt:  EEQQRSIGSFSSTAPTMAFAI---------------------------------------------------------SVTSSMDTLLDIRPSNNSIATQ

Query:  NNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNAS
           S SQ  T ++  L +  T++A  QC N+LN LQSQ+NA+
Subjt:  NNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNAS

XP_031736904.1 uncharacterized protein LOC105434586 isoform X1 [Cucumis sativus] | e value: 1.7e-60 | % identity: 53.38
Query:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR
        T F+QNQGYLNP+FLHHNDNT+LVL TEQLTEENYVSW++AMTIGLSVKNKIGFVD TI + TGDLLP WIR NNIVISWILNSVSKP+SA+ILFS   R
Subjt:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR

Query:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ
         IW++LKERFQKKN PRIFQLKRSLATL QNQD+IG +          + P++ S       + I+Q    L + + +    + + +         +++Q
Subjt:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ

Query:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM
        QR+  +                            NS+  QN+E   QGTT+  SN  +NN  TAEALIQCQNLLNQLQ Q+
Subjt:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM

XP_031736905.1 uncharacterized protein LOC105434586 isoform X2 [Cucumis sativus] | e value: 1.7e-60 | % identity: 53.38
Query:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR
        T F+QNQGYLNP+FLHHNDNT+LVL TEQLTEENYVSW++AMTIGLSVKNKIGFVD TI + TGDLLP WIR NNIVISWILNSVSKP+SA+ILFS   R
Subjt:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR

Query:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ
         IW++LKERFQKKN PRIFQLKRSLATL QNQD+IG +          + P++ S       + I+Q    L + + +    + + +         +++Q
Subjt:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ

Query:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM
        QR+  +                            NS+  QN+E   QGTT+  SN  +NN  TAEALIQCQNLLNQLQ Q+
Subjt:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM

XP_031736906.1 uncharacterized protein LOC105434586 isoform X3 [Cucumis sativus] | e value: 1.7e-60 | % identity: 53.38
Query:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR
        T F+QNQGYLNP+FLHHNDNT+LVL TEQLTEENYVSW++AMTIGLSVKNKIGFVD TI + TGDLLP WIR NNIVISWILNSVSKP+SA+ILFS   R
Subjt:  TLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTR

Query:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ
         IW++LKERFQKKN PRIFQLKRSLATL QNQD+IG +          + P++ S       + I+Q    L + + +    + + +         +++Q
Subjt:  AIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDR--------FPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSL-------LLQEEQ

Query:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM
        QR+  +                            NS+  QN+E   QGTT+  SN  +NN  TAEALIQCQNLLNQLQ Q+
Subjt:  QRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQ--SNQILNNTSTAEALIQCQNLLNQLQSQM

TrEMBL top hits (e value, % identity, alignment)
A0A5J5BKC2 Uncharacterized protein | e value: 6.4e-45 | % identity: 44.19
Query:  NPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTG---DLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDLK
        NP++LHH+D+   +L ++QLT ENY +W++AM I LSVKNK+GFVD +I +  G   +LL +WIR NNIVISWILNSVSK +SASI+F+ S R IW+DL+
Subjt:  NPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTG---DLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDLK

Query:  ERFQKKNVPRIFQLKRSLATLEQNQDTIGV--------------------------------NDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSR
        +RFQ++N PRIFQLKR L  L Q Q ++ +                                ND      I   LMGL+++ SQ R QLLLMDP+P  +R
Subjt:  ERFQKKNVPRIFQLKRSLATLEQNQDTIGV--------------------------------NDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSR

Query:  AFSLLLQEEQQR---SIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQ
         FSL++QEEQQR   S    S++  TMAFA+          D+  S  S  +QN+++ S  +   NQ
Subjt:  AFSLLLQEEQQR---SIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQ

A0A6J1DIP8 uncharacterized protein LOC111020399 | e value: 2.1e-48 | % identity: 37.59
Query:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE
        Y NP+FLHH+DNTSLVL ++ LT ENY SW+++M I L+VKNK+GFVD +I + TGDLL +W I NN+VISWILNS+SK +SASILFS S R IW+DLKE
Subjt:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE

Query:  RFQKKNVPRIFQLKRSLATLEQNQDTI-----------------------------GVNDRFPPNVISH---GLMGLNENISQARAQLLLMDPLPSTSRA
        RF+K+N PRIFQL+R L+ L Q+Q ++                             GV +        H    LMGLNE+ SQ R QLLLM+P P+ +R 
Subjt:  RFQKKNVPRIFQLKRSLATLEQNQDTI-----------------------------GVNDRFPPNVISH---GLMGLNENISQARAQLLLMDPLPSTSRA

Query:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNASNQPTTSHKA
        FSL+ QE QQR+I   +ST+P          ++ T L  R S++S     ++S S  ++ S Q   +T          NLL                   
Subjt:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNASNQPTTSHKA

Query:  GTSYSFPLWIIDPGASTHISCCKSHFASIQPCSTSIRLPNKQVFEVSALTRDLLVDVSFSTNGCVTQDKFTLKKIGNAELLYGLYVFKLGNTLDLQSTIC
                                                     VSALT D  V V F+ N C+ QDK + K IG AE  +GLY+  +     L   +C
Subjt:  GTSYSFPLWIIDPGASTHISCCKSHFASIQPCSTSIRLPNKQVFEVSALTRDLLVDVSFSTNGCVTQDKFTLKKIGNAELLYGLYVFKLGNTLDLQSTIC

Query:  --ALQNW
          A   W
Subjt:  --ALQNW

A0A6J1DLQ9 uncharacterized protein LOC111022117 | e value: 8.9e-47 | % identity: 37.72
Query:  LHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIRN-NIVISWILNSVSKPVSASILFSGSTRAIWIDLKERFQKKN
        +HHND ++LVL ++ LT  NYVSW+++MTI LS+KNK+GF++ ++ K  GDLLP WIRN ++VI+W LNSVSKP+SAS++F+ ST  IW+DLK+RFQ +N
Subjt:  LHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIRN-NIVISWILNSVSKPVSASILFSGSTRAIWIDLKERFQKKN

Query:  VPRIFQLKRSLATLEQNQDTIGVN-------------------------------DRFPP-NVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSLLLQ
         P+IFQL+R LATL Q+Q ++ +                                ++F     +   LMGLNE+ +  RAQ+LLMDP PS  +AFSL+ Q
Subjt:  VPRIFQLKRSLATLEQNQDTIGVN-------------------------------DRFPP-NVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSLLLQ

Query:  EEQQRSIGSFSSTAPTMAFAI---------------------------------------------------------SVTSSMDTLLDIRPSNNSIATQ
        EEQQR I  FS+ +P +  A+                                                         S+TSS+ +     P++NS +T 
Subjt:  EEQQRSIGSFSSTAPTMAFAI---------------------------------------------------------SVTSSMDTLLDIRPSNNSIATQ

Query:  NNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNAS
           S SQ  T ++  L +  T++A  QC N+LN LQSQ+NA+
Subjt:  NNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNAS

A0A6J1DNP7 uncharacterized protein LOC111022065 | e value: 1.1e-44 | % identity: 36.08
Query:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE
        + NP+FLHH+DNTSLVL ++ LT+ENY SW++++ I L+VKNKIGFVD +I++ T   L +W I NN+VISWI NS+SK +SAS+LFS S   IW+DLKE
Subjt:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAW-IRNNIVISWILNSVSKPVSASILFSGSTRAIWIDLKE

Query:  RFQKKNVPRIFQLKRSLATLEQNQDTI--------------------------------GVNDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRA
        RFQ++N PRIFQL+R L+ L Q+Q ++                                 +   +    +   LMGLN + SQ RAQLLLM+P P+ +RA
Subjt:  RFQKKNVPRIFQLKRSLATLEQNQDTI--------------------------------GVNDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRA

Query:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDI-------------------------------------------RPSNNSIATQNNESRSQG
        F+L+ QE QQRSI   S T+PT +   + ++S ++ L+                                              S+N+ ++++ E+ S+ 
Subjt:  FSLLLQEEQQRSIGSFSSTAPTMAFAISVTSSMDTLLDI-------------------------------------------RPSNNSIATQNNESRSQG

Query:  TTQSNQILNNTSTAEALIQCQNLLNQLQSQMN-----ASNQPTTSHKAGTSY
         + +   ++N+       QCQ LL  LQS +      + N   TSH A T +
Subjt:  TTQSNQILNNTSTAEALIQCQNLLNQLQSQMN-----ASNQPTTSHKAGTSY

A0A7J0FKC9 Haloacid dehalogenase-like hydrolase (HAD) superfamily protein | e value: 3.2e-44 | % identity: 45.35
Query:  NPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTG---DLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDLK
        +P+FLHH+D   LVL ++ LT +NY SW +AM I LSVKNK+GF+D +ITK  G   +LL +WIR NN+VISWILNSVSK +SASI+FS S   IWIDLK
Subjt:  NPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTG---DLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDLK

Query:  ERFQKKNVPRIFQLKRSLATLEQNQDTIGV--------------------------------NDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSR
        +RFQ+ N PRIFQL+R L    Q+Q  + V                                N  +    I   LM L+ + +Q R QLLLMDPLP  ++
Subjt:  ERFQKKNVPRIFQLKRSLATLEQNQDTIGV--------------------------------NDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSR

Query:  AFSLLLQEEQQRSIG----SFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNES
         FSL+ QEE QR IG    S S++A TMAFAI    ++    D   S+NS   + N++
Subjt:  AFSLLLQEEQQRSIG----SFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNES

SwissProt top hits (e value, % identity, alignment)
B5X582 Twinkle homolog protein, chloroplastic/mitochondrial | e value: 2.2e-26 | % identity: 68.42
Query:  LQNWSGCPPNMHDISGNAHFINKCDNGIVIHRNRDPESGPIDLVQVCVRKVRNKVAGTIGEAYLAYNRVTGEFFDA
        LQ+W G  PN++DISG+AHFINKCDNGI++HRNRD  +GP+DLVQ+ VRKVRNKVAG IG+AYL Y+R TG + D+
Subjt:  LQNWSGCPPNMHDISGNAHFINKCDNGIVIHRNRDPESGPIDLVQVCVRKVRNKVAGTIGEAYLAYNRVTGEFFDA

Arabidopsis top hits (e value, % identity, alignment)
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). | e value: 3.5e-11 | % identity: 31.01
Query:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRT--GDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDL
        Y  P  +HH  + S+   ++   E+NYV+W       L V  K GF+D T+ K      L   W + N +V+ W++NS++  +  S++++ +   +W DL
Subjt:  YLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRT--GDLLPAWIR-NNIVISWILNSVSKPVSASILFSGSTRAIWIDL

Query:  KERFQKKNVPRIFQLKRSLATLEQNQDTI
        +  F      +I+QL+R LATL Q  D++
Subjt:  KERFQKKNVPRIFQLKRSLATLEQNQDTI

AT1G30680.1 toprim domain-containing protein | e value: 1.6e-27 | % identity: 68.42
Query:  LQNWSGCPPNMHDISGNAHFINKCDNGIVIHRNRDPESGPIDLVQVCVRKVRNKVAGTIGEAYLAYNRVTGEFFDA
        LQ+W G  PN++DISG+AHFINKCDNGI++HRNRD  +GP+DLVQ+ VRKVRNKVAG IG+AYL Y+R TG + D+
Subjt:  LQNWSGCPPNMHDISGNAHFINKCDNGIVIHRNRDPESGPIDLVQVCVRKVRNKVAGTIGEAYLAYNRVTGEFFDA


Sequences
CDS sequence
ATGAAAAAATGGGGTGATAAGAAGCGACATTCTCTTGAGTTTCGAGCAGGAGATCAAGTCCTCATCAAGCTGAAAACAGATCAAATTCGGTTTAGAGGGCGCAAAGATCA
GCACCTTGTCAGAAAATATGAGGGGTTTGGGGAAGGTGTTGTTGCCTACATGGATGAAAATTCACCCAGTAATTCATGTGAGCAACTTAAAACCCTACCATCAAGACCCC
GACGACAAGCAGCACGACATATTGCTCGACCATGTATCAACCTGAAGCAGAAAGAAGATAAAGAAGTTGAAGAGATCCTTGCAGATCGAAGCGTGGAAGCAGAAGATCGA
AGAGTTCTAGCTTCGTCAGTCGATAGGGACGTCAGCTGTTTAAGTGGGGGAGAATGTTATGAGCATGCTTGTCCAAGGCGTTGTTTAACCTGCCGGAAACCCAAGAAAAC
TACACTTTTCAATCAAAACCAAGGATATCTGAATCCTCACTTCCTTCATCACAATGATAATACAAGTCTGGTGTTAGCAACGGAGCAATTGACAGAAGAGAATTATGTTT
CTTGGACTCAAGCGATGACCATTGGTCTTTCAGTGAAGAACAAGATTGGGTTCGTCGACGAGACTATTACCAAACGAACCGGTGATCTCCTTCCGGCTTGGATTAGAAAT
AACATCGTTATTTCTTGGATCCTAAACTCAGTCTCTAAACCTGTCTCAGCAAGTATTTTGTTTTCAGGTTCAACCAGAGCAATATGGATTGACCTCAAAGAAAGATTTCA
GAAGAAGAACGTGCCAAGGATTTTTCAGTTGAAGCGATCCCTTGCAACCTTGGAACAAAACCAAGATACCATTGGTGTAAATGACAGATTTCCTCCAAATGTAATATCTC
ATGGACTTATGGGATTGAATGAGAACATCTCTCAAGCCCGGGCTCAACTTCTCCTCATGGATCCTCTTCCATCAACAAGCCGAGCTTTCTCTCTTCTTCTTCAAGAGGAA
CAACAAAGATCAATTGGATCTTTTTCTTCTACAGCACCAACGATGGCCTTTGCAATTAGTGTTACAAGCTCCATGGATACCCTCCTGGATATAAGACCAAGCAACAACAG
CATAGCAACTCAGAACAATGAAAGTCGTTCTCAAGGTACCACACAGAGTAATCAGATATTGAATAATACCAGCACTGCAGAAGCTTTGATCCAGTGTCAAAACCTCCTCA
ACCAGCTTCAGTCTCAAATGAATGCTTCCAACCAACCAACTACCTCACATAAAGCAGGTACTTCTTATTCATTTCCCCTGTGGATAATTGATCCTGGAGCATCCACTCAC
ATTTCTTGTTGCAAGTCCCATTTTGCATCCATTCAACCATGCTCAACATCCATCCGTTTACCTAATAAACAAGTTTTTGAAGTGAGCGCATTAACAAGAGATCTACTCGT
CGATGTTAGTTTCTCTACTAATGGTTGTGTAACTCAGGACAAGTTCACTTTGAAGAAGATTGGCAATGCTGAACTTTTATATGGTCTATATGTCTTCAAATTGGGAAACA
CTCTTGATCTGCAGTCTACCATATGTGCTTTGCAGAATTGGTCTGGATGTCCACCTAATATGCATGATATAAGTGGAAATGCACACTTCATAAATAAATGTGATAATGGA
ATTGTCATTCATCGTAATAGGGATCCTGAAAGTGGTCCTATTGATCTCGTACAGGTATGTGTACGAAAAGTGAGAAATAAGGTTGCAGGAACAATTGGGGAAGCTTATTT
GGCATATAATAGGGTAACCGGAGAATTCTTCGATGCTGCTGGGGATATGAAACTTAAGAAACCATCATCTTGA
mRNA sequence
ATGAAAAAATGGGGTGATAAGAAGCGACATTCTCTTGAGTTTCGAGCAGGAGATCAAGTCCTCATCAAGCTGAAAACAGATCAAATTCGGTTTAGAGGGCGCAAAGATCA
GCACCTTGTCAGAAAATATGAGGGGTTTGGGGAAGGTGTTGTTGCCTACATGGATGAAAATTCACCCAGTAATTCATGTGAGCAACTTAAAACCCTACCATCAAGACCCC
GACGACAAGCAGCACGACATATTGCTCGACCATGTATCAACCTGAAGCAGAAAGAAGATAAAGAAGTTGAAGAGATCCTTGCAGATCGAAGCGTGGAAGCAGAAGATCGA
AGAGTTCTAGCTTCGTCAGTCGATAGGGACGTCAGCTGTTTAAGTGGGGGAGAATGTTATGAGCATGCTTGTCCAAGGCGTTGTTTAACCTGCCGGAAACCCAAGAAAAC
TACACTTTTCAATCAAAACCAAGGATATCTGAATCCTCACTTCCTTCATCACAATGATAATACAAGTCTGGTGTTAGCAACGGAGCAATTGACAGAAGAGAATTATGTTT
CTTGGACTCAAGCGATGACCATTGGTCTTTCAGTGAAGAACAAGATTGGGTTCGTCGACGAGACTATTACCAAACGAACCGGTGATCTCCTTCCGGCTTGGATTAGAAAT
AACATCGTTATTTCTTGGATCCTAAACTCAGTCTCTAAACCTGTCTCAGCAAGTATTTTGTTTTCAGGTTCAACCAGAGCAATATGGATTGACCTCAAAGAAAGATTTCA
GAAGAAGAACGTGCCAAGGATTTTTCAGTTGAAGCGATCCCTTGCAACCTTGGAACAAAACCAAGATACCATTGGTGTAAATGACAGATTTCCTCCAAATGTAATATCTC
ATGGACTTATGGGATTGAATGAGAACATCTCTCAAGCCCGGGCTCAACTTCTCCTCATGGATCCTCTTCCATCAACAAGCCGAGCTTTCTCTCTTCTTCTTCAAGAGGAA
CAACAAAGATCAATTGGATCTTTTTCTTCTACAGCACCAACGATGGCCTTTGCAATTAGTGTTACAAGCTCCATGGATACCCTCCTGGATATAAGACCAAGCAACAACAG
CATAGCAACTCAGAACAATGAAAGTCGTTCTCAAGGTACCACACAGAGTAATCAGATATTGAATAATACCAGCACTGCAGAAGCTTTGATCCAGTGTCAAAACCTCCTCA
ACCAGCTTCAGTCTCAAATGAATGCTTCCAACCAACCAACTACCTCACATAAAGCAGGTACTTCTTATTCATTTCCCCTGTGGATAATTGATCCTGGAGCATCCACTCAC
ATTTCTTGTTGCAAGTCCCATTTTGCATCCATTCAACCATGCTCAACATCCATCCGTTTACCTAATAAACAAGTTTTTGAAGTGAGCGCATTAACAAGAGATCTACTCGT
CGATGTTAGTTTCTCTACTAATGGTTGTGTAACTCAGGACAAGTTCACTTTGAAGAAGATTGGCAATGCTGAACTTTTATATGGTCTATATGTCTTCAAATTGGGAAACA
CTCTTGATCTGCAGTCTACCATATGTGCTTTGCAGAATTGGTCTGGATGTCCACCTAATATGCATGATATAAGTGGAAATGCACACTTCATAAATAAATGTGATAATGGA
ATTGTCATTCATCGTAATAGGGATCCTGAAAGTGGTCCTATTGATCTCGTACAGGTATGTGTACGAAAAGTGAGAAATAAGGTTGCAGGAACAATTGGGGAAGCTTATTT
GGCATATAATAGGGTAACCGGAGAATTCTTCGATGCTGCTGGGGATATGAAACTTAAGAAACCATCATCTTGAGAGGAGGTATGGTGTTGAAGGCCTTCACAAGTATTGT
TTAAGGTCATTCATCCACTGCAATACTGTGAATTGTCGATAAATGTCGACTGCTAATTCATTCGTACAATTCACATTTATTTAGCAATGTCATTTTTGTTTTGGGTGATC
CATAGCAAGAAGAAATTATTAAGTTGATGGATGTGTATAATTATTAAGTTGATGGATGTGTA
Protein sequence
MKKWGDKKRHSLEFRAGDQVLIKLKTDQIRFRGRKDQHLVRKYEGFGEGVVAYMDENSPSNSCEQLKTLPSRPRRQAARHIARPCINLKQKEDKEVEEILADRSVEAEDR
RVLASSVDRDVSCLSGGECYEHACPRRCLTCRKPKKTTLFNQNQGYLNPHFLHHNDNTSLVLATEQLTEENYVSWTQAMTIGLSVKNKIGFVDETITKRTGDLLPAWIRN
NIVISWILNSVSKPVSASILFSGSTRAIWIDLKERFQKKNVPRIFQLKRSLATLEQNQDTIGVNDRFPPNVISHGLMGLNENISQARAQLLLMDPLPSTSRAFSLLLQEE
QQRSIGSFSSTAPTMAFAISVTSSMDTLLDIRPSNNSIATQNNESRSQGTTQSNQILNNTSTAEALIQCQNLLNQLQSQMNASNQPTTSHKAGTSYSFPLWIIDPGASTH
ISCCKSHFASIQPCSTSIRLPNKQVFEVSALTRDLLVDVSFSTNGCVTQDKFTLKKIGNAELLYGLYVFKLGNTLDLQSTICALQNWSGCPPNMHDISGNAHFINKCDNG
IVIHRNRDPESGPIDLVQVCVRKVRNKVAGTIGEAYLAYNRVTGEFFDAAGDMKLKKPSS
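
The CDS and protein records above can be cross-checked by translating the CDS. The sketch below assumes the standard genetic code (NCBI translation table 1), which the record does not state explicitly. To stay short it translates only the first 39 and last 18 nucleotides of the CDS, copied from the record. The function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1), with codons
# enumerated in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

cds_start = "ATGAAAAAATGGGGTGATAAGAAGCGACATTCTCTTGAG"  # first 39 nt of the CDS
cds_end = "AAGAAACCATCATCTTGA"                        # last 18 nt of the CDS

print(translate(cds_start))  # MKKWGDKKRHSLE
print(translate(cds_end))    # KKPSS
```

Both fragments agree with the start (`MKKWGDKKRHSLE...`) and end (`...KKPSS`) of the protein sequence listed above. Translating the full CDS in the same way would check the whole record.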