CuGenDBv2

Clc01G09220 (gene) of Watermelon (cordophanus) v2 genome

Gene ID: Clc01G09220
Organism: Citrullus lanatus subsp. cordophanus (Watermelon (cordophanus) v2)
Description: Transposon TX1 uncharacterized 149 kDa protein
Genome location: ClcChr01:10143870..10146127
RNA-Seq Expression: Clc01G09220
Synteny: Clc01G09220
Gene Ontology terms:
    GO:0006281 - DNA repair (biological process)
    GO:0004518 - nuclease activity (molecular function)
InterPro domains:
    IPR004808 - AP endonuclease 1
    IPR036691 - Endonuclease/exonuclease/phosphatase superfamily
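The genome location field uses a `seqid:start..end` form. As a minimal sketch, assuming the coordinates are 1-based and inclusive (the record does not state this), the field can be split and the genomic span computed as follows. The function name `parse_location` is hypothetical and is not part of any CuGenDB tooling.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


seqid, start, end = parse_location("ClcChr01:10143870..10146127")
# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(seqid, start, end, span)
```

Under that assumption the gene spans 2258 bp on ClcChr01.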


Homology
GenBank top hits (columns: hit, e-value, % identity, alignment)
RVW13148.1 Transposon TX1 uncharacterized 149 kDa protein [Vitis vinifera] (e-value: 5.8e-35; identity: 28.68%)
Query:  WISFYPKITAFMKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEI-----
        W+     +   MKI+ WN+R LG ++KR  +K  L+  N D+V++QE+KK+  DR F+ S+W+ ++  W  + A G SGG+L +WD   +   E+     
Subjt:  WISFYPKITAFMKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEI-----

Query:  ---IETEAD-CGGIIFSF-----RTLCRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWA
           ++   D CG +  S          R       F  +        +TS  DH+P++ +   F WGP PFRF N WL++ +               GW 
Subjt:  ---IETEAD-CGGIIFSF-----RTLCRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWA

Query:  SFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAF
              +L+ +K KLK+W        K +++ +L ++   D      +E E      LL    R S K EL  L   +E +  QK K+ W+K GD N+ F
Subjt:  SFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAF

Query:  FHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR
        +H+    ++ +  I EL N++ L  K+   I  +IL ++  LY+   G       L+W+ +S E    L + F+  EI +
Subjt:  FHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR

XP_038884535.1 DEAD-box ATP-dependent RNA helicase FANCM isoform X1 [Benincasa hispida] (e-value: 1.8e-68; identity: 59.73%)
Query:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK
        DHFPLLFEA AF+WGP+PFRFCNSWL+NK+CCR+IE S  I GQQ WA F +Y++LR +K  +K+WL+ +E+++K REE LLKEI+++D + +  LEN  
Subjt:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK

Query:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV
               E+++R SLKA+L+SLY+ +ER+LIQK+KLNWL LGDENT+FFHRFLAAK+RKNLI+EL N+Q L TKSF EIE+ IL F+S+LY+   G RS+
Subjt:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV

Query:  PLNLEWAVVSREQNKGLVASFSSSEI
        PLN+ W+ VS E N  L+A FS++EI
Subjt:  PLNLEWAVVSREQNKGLVASFSSSEI

XP_038884536.1 DEAD-box ATP-dependent RNA helicase FANCM isoform X2 [Benincasa hispida] (e-value: 1.8e-68; identity: 59.73%)
Query:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK
        DHFPLLFEA AF+WGP+PFRFCNSWL+NK+CCR+IE S  I GQQ WA F +Y++LR +K  +K+WL+ +E+++K REE LLKEI+++D + +  LEN  
Subjt:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK

Query:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV
               E+++R SLKA+L+SLY+ +ER+LIQK+KLNWL LGDENT+FFHRFLAAK+RKNLI+EL N+Q L TKSF EIE+ IL F+S+LY+   G RS+
Subjt:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV

Query:  PLNLEWAVVSREQNKGLVASFSSSEI
        PLN+ W+ VS E N  L+A FS++EI
Subjt:  PLNLEWAVVSREQNKGLVASFSSSEI

XP_038884537.1 DEAD-box ATP-dependent RNA helicase FANCM isoform X3 [Benincasa hispida] (e-value: 1.8e-68; identity: 59.73%)
Query:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK
        DHFPLLFEA AF+WGP+PFRFCNSWL+NK+CCR+IE S  I GQQ WA F +Y++LR +K  +K+WL+ +E+++K REE LLKEI+++D + +  LEN  
Subjt:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK

Query:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV
               E+++R SLKA+L+SLY+ +ER+LIQK+KLNWL LGDENT+FFHRFLAAK+RKNLI+EL N+Q L TKSF EIE+ IL F+S+LY+   G RS+
Subjt:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV

Query:  PLNLEWAVVSREQNKGLVASFSSSEI
        PLN+ W+ VS E N  L+A FS++EI
Subjt:  PLNLEWAVVSREQNKGLVASFSSSEI

XP_038904301.1 uncharacterized protein LOC120090656 [Benincasa hispida] (e-value: 8.1e-45; identity: 50%)
Query:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK
        DHFPL  EA AF+WGP+ FRFCNSWL NK+ C+LIE+SL+      WA+  +   LR  K  LKKW   + +  K +EE LL E++++D  + V++ ++ 
Subjt:  DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEK

Query:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV
        R       D+   SLKA+L++LY+++E++LIQK KL WLK GDENT+FFHRFL+ +KRKNL ++L+NDQ L T+   +IE  IL FYS LYS S G R++
Subjt:  RHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSV

Query:  PLNL
        PL L
Subjt:  PLNL

TrEMBL top hits (columns: hit, e-value, % identity, alignment)
A0A438BQB2 Transposon TX1 uncharacterized 149 kDa protein (e-value: 2.8e-35; identity: 28.68%)
Query:  WISFYPKITAFMKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEI-----
        W+     +   MKI+ WN+R LG ++KR  +K  L+  N D+V++QE+KK+  DR F+ S+W+ ++  W  + A G SGG+L +WD   +   E+     
Subjt:  WISFYPKITAFMKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEI-----

Query:  ---IETEAD-CGGIIFSF-----RTLCRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWA
           ++   D CG +  S          R       F  +        +TS  DH+P++ +   F WGP PFRF N WL++ +               GW 
Subjt:  ---IETEAD-CGGIIFSF-----RTLCRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWA

Query:  SFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAF
              +L+ +K KLK+W        K +++ +L ++   D      +E E      LL    R S K EL  L   +E +  QK K+ W+K GD N+ F
Subjt:  SFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAF

Query:  FHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR
        +H+    ++ +  I EL N++ L  K+   I  +IL ++  LY+   G       L+W+ +S E    L + F+  EI +
Subjt:  FHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR

A0A438HFR2 Transposon TX1 uncharacterized 149 kDa protein (e-value: 1.1e-34; identity: 28.15%)
Query:  KIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEII-------------ETE
        KI+ WN R LG + KR  ++R L   N D+V+LQE+K++ +DR  + SIW  K + W  + A G SGG++ LWD  K +  E +             E E
Subjt:  KIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEII-------------ETE

Query:  A-----------------------DCGGIIF-------SFRTL---------CRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFC
        +                       D  G+ F        F  +          R  V+ RRF    + S A  + ++ DH P+  E   F WGP PFRF 
Subjt:  A-----------------------DCGGIIF-------SFRTL---------CRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFC

Query:  NSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSL
        N WL + +         +    +GW       KL+ +K KLK+W +    + + R++++L ++ + D      +E E      L+ + I    + EL  L
Subjt:  NSWLENKDCCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSL

Query:  YRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFS
           +E    QK+++ W+K GD N+ FFHR    ++ +  I  LI+++  T  +   I  +I+ F+ +LYS   G       ++WA +S E    L   FS
Subjt:  YRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFS

Query:  SSEIR
          E+R
Subjt:  SSEIR

A0A438IJB1 Transposon TX1 uncharacterized 149 kDa protein (e-value: 1.4e-34; identity: 28.31%)
Query:  KIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIE------IIETEADCGGI-
        KI+ WN R LG K KR  ++R L   N +IV+LQE+K++ +DR F+ S+W+ + + W  + A G SGG++ LWD  K    E       +  + + G + 
Subjt:  KIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIE------IIETEADCGGI-

Query:  ---------IFSFRTLCRAMVHRR--RFQHHKKNSGAGSKTSAD-------DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWAS
                  F++  +    + +R  RF    +     S++  +       DH P+  E    KWGP PFRF N WL + +         +    +GW  
Subjt:  ---------IFSFRTLCRAMVHRR--RFQHHKKNSGAGSKTSAD-------DHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDGQQGWAS

Query:  FIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFF
             KL+ +K+KLK+W      + K R++ +L ++ + D      +E E      L+ +  RT  + EL  +   +E    QK+++ W+K GD N+ FF
Subjt:  FIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGDENTAFF

Query:  HRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIR
        HR    ++ +  I  LI+++  T  +  +I  +I+ F+ +LYS   G       ++W  +S E    L  SF+  E+R
Subjt:  HRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIR

A0A803P465 Uncharacterized protein (e-value: 2.9e-32; identity: 30.73%)
Query:  MKIVLWNIRW---LGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLL-----------------------YLW
        +K + +NI +    GDK KR AIK  L K+N D+V+LQE KK   DR FI SIW S+   W  + A GRSG L+                         W
Subjt:  MKIVLWNIRW---LGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLL-----------------------YLW

Query:  DEGKISAIEIIETEADCGGIIFSFRTLCRAMVHRRRFQHHKKNSGAGSKTS----ADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDG
        DE  ++ + II  +  C G  F+           RR Q  K NS + +K        DH P++ ++    WGP+PFRF N WLE+    +  E   +   
Subjt:  DEGKISAIEIIETEADCGGIIFSFRTLCRAMVHRRRFQHHKKNSGAGSKTS----ADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEIDG

Query:  QQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGD
          GW      +KLR +K  + +W      NK     +++K   +R   +   LE       SL+E+  R ++K E   L   +ER +  K+K  W K GD
Subjt:  QQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLGD

Query:  ENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIR
         N+ FFH  L A+K +N IS +  +     +   EI  +I++F+SSLY+      +    ++W  +  +  + L   F  SE+R
Subjt:  ENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIR

A0A803PZR9 Uncharacterized protein (e-value: 1.9e-36; identity: 31.61%)
Query:  MKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDE----------GKISAIEIIETEAD
        MKI+ WNIR  GDK KR AIK  + K+N D+V+LQE KK   DR FI +IW S+   W +  A GRSGG L +WD           G+ S   +I+ E  
Subjt:  MKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFIKSIWSSKDIGWSFVEAKGRSGGLLYLWDE----------GKISAIEIIETEAD

Query:  CGGIIFSFRTLCRAMVHRRRFQH---HKKNSGA------------------GSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEID
             F     C   +    +      K+  GA                   S ++  +H P++ ++   KWG +PFRF N WLENK   +L E      
Subjt:  CGGIIFSFRTLCRAMVHRRRFQH---HKKNSGA------------------GSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKDCCRLIERSLEID

Query:  GQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLG
           GW       KLR ++  +KKW      N K  +      +E+R  EI+ +LE       SL ++  R  +K +       +ERN+  K+K  W+K G
Subjt:  GQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLKLG

Query:  DENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR
        D N+ FFH  L A+K KN IS +  +         +I S++++F+S LY+      +    +EW  +S    + L   F   EI+R
Subjt:  DENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRR

SwissProt top hits (columns: hit, e-value, % identity, alignment)
No hits found

Arabidopsis top hits (columns: hit, e-value, % identity, alignment)
AT1G43760.1 DNAse I-like superfamily protein (e-value: 7.1e-07; identity: 35.8%)
Query:  AELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGI
        A L S YR       QK+++ WL+ GD NT FFH+ + A + KNLI  L  D  +  ++ T+++  I+A+Y+ L    + I
Subjt:  AELMSLYRIDERNLIQKNKLNWLKLGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGI
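In the alignment blocks above, the unlabelled middle line follows the usual BLAST-style convention: a letter marks an identical residue, `+` a conservative substitution, and a space a mismatch or gap. A minimal sketch of counting identities in a single block from the query line and the midline is shown here. The function name `midline_identity` is hypothetical, and the reported % identity values were computed by the aligner over each whole alignment, so a per-block figure will generally differ from them.

```python
def midline_identity(query: str, midline: str) -> tuple[int, int, float]:
    """Return (identical, aligned, percent) for one alignment block.

    Letters on the midline count as identities; '+' and spaces do not.
    """
    # Trailing spaces on the midline are often lost in copied text, so pad it.
    midline = midline.ljust(len(query))
    if len(midline) != len(query):
        raise ValueError("midline is longer than the query line")
    identical = sum(1 for ch in midline if ch.isalpha())
    aligned = len(query)
    return identical, aligned, 100.0 * identical / aligned


# Last block of the XP_038904301.1 alignment: query 'PLNL', midline 'PL L'.
print(midline_identity("PLNL", "PL L"))
```

For that four-residue block the sketch counts three identities out of four aligned positions.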


Sequences
CDS sequence
ATGCGAAGGAAGCCGAACAGTCCCTCCCACTTTTCGCGTGCCACAAGAAATTCAAAATGGCAAGGAGAGTTATTGCCTGGCGAAAAGCTCAAAAGTTTGGAAGGTGAAAC
ACCCCCTGAGCAGCAGCCGATTCAAGCTCTATCCATTGAAGCACCTACTCGAGCATTAAAGGCCACAAATTCTGATGAAATTAATTGTAAGGATTCAATGAGACACGAAG
ACGAAAAGTCACCTACTCAAGGGACGATAGTGGGGGTTAGCCTTCCCTCATCAAGCATCTCTCTCTCAAACTCGGCGCCAAATCGAGCAGGAAGGAAGCTTTTTTCTTCA
GCATCCAAGAGACGGGCCCATCTATTGAAGCCATATCCTAAACATTTTGCCAGAGAAAGAACTTCCTTCCTCAACTCATGGTCAATAATAGCAAACCTGGACCTCCTTGA
AGCTTATTGCATCAAATTCCAACTCCTAGGCTTATTTCCAAGCAGGTCAGATCCCCCAACTTTTAATTGCAGCTCCTTTAACCACTCTGAATTTAGAATTCTTGGTTCTA
ACATTAACTTTGTGCGAGGCCTACTAAGCATCAATCAAGTTACCAAAAGCTCCAGACCAATCGATTCAAATGAAGAGTCGGTAATTAGTGCAAGTAGTGAGGAAATTGAA
GAATTCGAAGGCGACGAAAATCAGGGGAAGGGCGATTCATCGGAGGATTATGGAAAAGACTTAGGCCAGTTGTTCCAAGAAGATAACATTCAAGTCACAGACATTTCAAA
GTATAATTCTAGGTTAAAGTGGAATGGAAGATTGAAAATCAGACGTTGGATATCCTTCTACCCAAAAATAACAGCATTCATGAAGATTGTATTGTGGAATATAAGATGGC
TTGGAGATAAATCAAAGAGAATGGCAATCAAAAGACTCCTGAAAAAGTTAAATTCGGATATTGTTTTATTGCAAGAATCAAAGAAAGACCGTTTTGACCGTATCTTCATT
AAAAGCATCTGGAGCTCAAAAGATATTGGCTGGTCCTTTGTAGAAGCAAAGGGAAGATCTGGAGGGTTATTATATTTGTGGGATGAAGGCAAGATTTCTGCGATTGAAAT
TATAGAGACAGAAGCCGATTGTGGAGGAATTATCTTCTCTTTCCGAACACTGTGTAGAGCCATGGTGCATAGGAGGAGATTTCAACATCACAAGAAGAATTCAGGAGCGG
GTAGCAAGACAAGCGCGGATGATCACTTCCCACTCTTGTTTGAAGCCGAGGCTTTCAAATGGGGGCCAGCCCCTTTTAGATTTTGCAACAGCTGGTTGGAAAATAAGGAT
TGCTGCAGACTCATTGAAAGATCACTGGAAATCGATGGACAGCAAGGTTGGGCTAGTTTCATTATATATGCCAAGCTCAGGAATCTGAAAATTAAGTTAAAGAAATGGCT
CTCAAACTATGAAAGGAATAAGAAAAGCAGGGAAGAATACTTATTGAAGGAAATTGAAAAAAGGGATGGCGAAATAGAGGTTGAATTAGAAAATGAAAAAAGACATGAAG
CTTCATTGTTGGAGGATAATATAAGAACTTCCCTAAAGGCTGAATTAATGTCCCTCTACCGAATAGATGAAAGAAACTTGATCCAGAAAAACAAACTGAATTGGCTAAAA
TTGGGAGACGAAAATACAGCATTCTTCCACAGATTCCTTGCAGCAAAAAAAAGGAAAAACTTGATTTCTGAGCTGATCAATGATCAAAGATTGACGACCAAATCTTTCAC
GGAAATAGAATCTCAAATCCTAGCATTTTATTCATCTCTTTACTCAGTTTCAGCAGGGATCAGATCTGTCCCTCTAAATTTAGAGTGGGCGGTGGTCTCAAGGGAGCAAA
ACAAGGGGCTGGTAGCTAGCTTTTCCTCAAGTGAAATCAGAAGGCAGTGA
mRNA sequence
ATGCGAAGGAAGCCGAACAGTCCCTCCCACTTTTCGCGTGCCACAAGAAATTCAAAATGGCAAGGAGAGTTATTGCCTGGCGAAAAGCTCAAAAGTTTGGAAGGTGAAAC
ACCCCCTGAGCAGCAGCCGATTCAAGCTCTATCCATTGAAGCACCTACTCGAGCATTAAAGGCCACAAATTCTGATGAAATTAATTGTAAGGATTCAATGAGACACGAAG
ACGAAAAGTCACCTACTCAAGGGACGATAGTGGGGGTTAGCCTTCCCTCATCAAGCATCTCTCTCTCAAACTCGGCGCCAAATCGAGCAGGAAGGAAGCTTTTTTCTTCA
GCATCCAAGAGACGGGCCCATCTATTGAAGCCATATCCTAAACATTTTGCCAGAGAAAGAACTTCCTTCCTCAACTCATGGTCAATAATAGCAAACCTGGACCTCCTTGA
AGCTTATTGCATCAAATTCCAACTCCTAGGCTTATTTCCAAGCAGGTCAGATCCCCCAACTTTTAATTGCAGCTCCTTTAACCACTCTGAATTTAGAATTCTTGGTTCTA
ACATTAACTTTGTGCGAGGCCTACTAAGCATCAATCAAGTTACCAAAAGCTCCAGACCAATCGATTCAAATGAAGAGTCGGTAATTAGTGCAAGTAGTGAGGAAATTGAA
GAATTCGAAGGCGACGAAAATCAGGGGAAGGGCGATTCATCGGAGGATTATGGAAAAGACTTAGGCCAGTTGTTCCAAGAAGATAACATTCAAGTCACAGACATTTCAAA
GTATAATTCTAGGTTAAAGTGGAATGGAAGATTGAAAATCAGACGTTGGATATCCTTCTACCCAAAAATAACAGCATTCATGAAGATTGTATTGTGGAATATAAGATGGC
TTGGAGATAAATCAAAGAGAATGGCAATCAAAAGACTCCTGAAAAAGTTAAATTCGGATATTGTTTTATTGCAAGAATCAAAGAAAGACCGTTTTGACCGTATCTTCATT
AAAAGCATCTGGAGCTCAAAAGATATTGGCTGGTCCTTTGTAGAAGCAAAGGGAAGATCTGGAGGGTTATTATATTTGTGGGATGAAGGCAAGATTTCTGCGATTGAAAT
TATAGAGACAGAAGCCGATTGTGGAGGAATTATCTTCTCTTTCCGAACACTGTGTAGAGCCATGGTGCATAGGAGGAGATTTCAACATCACAAGAAGAATTCAGGAGCGG
GTAGCAAGACAAGCGCGGATGATCACTTCCCACTCTTGTTTGAAGCCGAGGCTTTCAAATGGGGGCCAGCCCCTTTTAGATTTTGCAACAGCTGGTTGGAAAATAAGGAT
TGCTGCAGACTCATTGAAAGATCACTGGAAATCGATGGACAGCAAGGTTGGGCTAGTTTCATTATATATGCCAAGCTCAGGAATCTGAAAATTAAGTTAAAGAAATGGCT
CTCAAACTATGAAAGGAATAAGAAAAGCAGGGAAGAATACTTATTGAAGGAAATTGAAAAAAGGGATGGCGAAATAGAGGTTGAATTAGAAAATGAAAAAAGACATGAAG
CTTCATTGTTGGAGGATAATATAAGAACTTCCCTAAAGGCTGAATTAATGTCCCTCTACCGAATAGATGAAAGAAACTTGATCCAGAAAAACAAACTGAATTGGCTAAAA
TTGGGAGACGAAAATACAGCATTCTTCCACAGATTCCTTGCAGCAAAAAAAAGGAAAAACTTGATTTCTGAGCTGATCAATGATCAAAGATTGACGACCAAATCTTTCAC
GGAAATAGAATCTCAAATCCTAGCATTTTATTCATCTCTTTACTCAGTTTCAGCAGGGATCAGATCTGTCCCTCTAAATTTAGAGTGGGCGGTGGTCTCAAGGGAGCAAA
ACAAGGGGCTGGTAGCTAGCTTTTCCTCAAGTGAAATCAGAAGGCAGTGA
Protein sequence
MRRKPNSPSHFSRATRNSKWQGELLPGEKLKSLEGETPPEQQPIQALSIEAPTRALKATNSDEINCKDSMRHEDEKSPTQGTIVGVSLPSSSISLSNSAPNRAGRKLFSS
ASKRRAHLLKPYPKHFARERTSFLNSWSIIANLDLLEAYCIKFQLLGLFPSRSDPPTFNCSSFNHSEFRILGSNINFVRGLLSINQVTKSSRPIDSNEESVISASSEEIE
EFEGDENQGKGDSSEDYGKDLGQLFQEDNIQVTDISKYNSRLKWNGRLKIRRWISFYPKITAFMKIVLWNIRWLGDKSKRMAIKRLLKKLNSDIVLLQESKKDRFDRIFI
KSIWSSKDIGWSFVEAKGRSGGLLYLWDEGKISAIEIIETEADCGGIIFSFRTLCRAMVHRRRFQHHKKNSGAGSKTSADDHFPLLFEAEAFKWGPAPFRFCNSWLENKD
CCRLIERSLEIDGQQGWASFIIYAKLRNLKIKLKKWLSNYERNKKSREEYLLKEIEKRDGEIEVELENEKRHEASLLEDNIRTSLKAELMSLYRIDERNLIQKNKLNWLK
LGDENTAFFHRFLAAKKRKNLISELINDQRLTTKSFTEIESQILAFYSSLYSVSAGIRSVPLNLEWAVVSREQNKGLVASFSSSEIRRQ
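
The protein sequence above can be checked against the CDS by translating it. The following is a minimal sketch, assuming the standard nuclear genetic code (NCBI translation table 1), which the record does not state explicitly. The `translate` helper is hypothetical and written with the standard library only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 1; stop codons become '*', unknown codons 'X'."""
    cds = "".join(cds.split()).upper()  # drop line breaks from wrapped sequence
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


# First 30 and last 12 nucleotides of the CDS listed above.
print(translate("ATGCGAAGGAAGCCGAACAGTCCCTCCCAC"))
print(translate("AGAAGGCAGTGA"))
```

The two fragments translate to the start (`MRRKPNSPSH`) and end (`RRQ` followed by a stop codon) of the listed protein sequence. Passing the full CDS, with or without line breaks, should reproduce the whole protein plus a trailing `*`.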