CuGenDBv2

Lag0005966 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0005966
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: WD repeat-containing protein 43
Genome location: chr6:34724588..34729903
RNA-Seq Expression: Lag0005966
Synteny: Lag0005966
Gene Ontology terms:
GO:0005730 - nucleolus (cellular component)
GO:0005515 - protein binding (molecular function)
InterPro domains:
IPR007148 - Small-subunit processome, Utp12
IPR011047 - Quinoprotein alcohol dehydrogenase-like superfamily
IPR015943 - WD40/YVTN repeat-like-containing domain superfamily
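
The genome location field above uses a `chromosome:start..end` layout. As a minimal sketch, the snippet below splits such a string and derives the span length. It assumes the coordinates are 1-based and inclusive, which the record itself does not state; `Location` and `parse_location` are hypothetical helper names, not part of any CuGenDB tool.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A genomic interval; coordinates ASSUMED 1-based and inclusive."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive interval, so both end points count.
        return self.end - self.start + 1


def parse_location(text: str) -> Location:
    """Parse a 'chrom:start..end' string into a Location."""
    match = re.fullmatch(r"(\S+):(\d+)\.\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return Location(chrom, start, end)


loc = parse_location("chr6:34724588..34729903")
print(loc.chrom, loc.length)  # prints: chr6 5316
```

Under that inclusive-coordinate assumption the gene spans 5316 bp on chr6.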


Homology
GenBank top hits | e value | % identity | Alignment
KAA0058775.1 WD repeat-containing protein 43 [Cucumis melo var. makuwa] | 1.7e-224 | 74.16
Query:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEMDT
        IWST DGSLLAEWKD +GK D GYSCMACC + KKRK+S CV+A+GTNNGDVLAVNAS+GE KWVS GCH GGVIGLSFAN+GRR+HTVGSNGMA EMDT
Subjt:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEMDT

Query:  ETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GPVLSMKH
        ETG+IIKEFKASKKSISSSAFSL                                                                     GPVLSM H
Subjt:  ETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GPVLSMKH

Query:  PPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHSVL
        PPFVS+CRN+ N+ED++VVLSVSVSG AYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR+SV+ASRI  + DNEVS+LVTHGS+DLPQHS+L
Subjt:  PPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHSVL

Query:  NIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNEEE
        +IGY+ KED +TAHE KTLQQND  SEQGPH  EQ V  PKSKKSKKKRAASDLDS TAGDVSDVGNG+ASDV+FNDDLNEP+MGEKLASLNL DQNE+ 
Subjt:  NIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNEEE

Query:  DREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQES
         REQE P +P IPPSADSVQVLLKQALHADD ALLLECLYTKDDKVISKSIAQLNSSDVLKLLHS+IS IQSRGAILVCALPWLRGLLLQHAS+IMSQES
Subjt:  DREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQES

Query:  SLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE
        SLLALNSLYQLIE+RISTFQSALLLSSSLDFLYTGVLDEE ++NDAIVPIIYEE+DSD+ E+GDEMETDEDD E +EVE AF+DLSAGEVDDDMSE
Subjt:  SLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE

XP_008461080.1 PREDICTED: WD repeat-containing protein 43 [Cucumis melo] | 2.2e-224 | 73.42
Query:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM
        S  +  IWST DGSLLAEWKD +GK D GYSCMACC + KKRK+S C++A+GTNNGDVLAVNAS+GE KWVS GCH GGVIGLSFAN+GRR+HTVGSNGM
Subjt:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM

Query:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP
        A EMDTETG+IIKEFKASKKSISSSAFSL                                                                     GP
Subjt:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP

Query:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL
        VLSM HPPFVS+CRN+ N+ED++VVLSVSVSG AYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR+SV+AS+I  + DNEVS+LVTHGS+DL
Subjt:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL

Query:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE
        PQHS+L+IGY+ KED +TAHE KTLQQND  SEQGPH  EQ V TPKSKKSKKKRAASDLDS TAGDVSDVGNG+ASDV+FNDDLNEP+MGEKLASLNL 
Subjt:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE

Query:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR
        DQNE+  REQE P +P IPPSADSVQVLLKQALHADD ALLLECLYTKDDKVISKSIAQLNSSDVLKLLHS+IS IQSRGAILVCALPWLRGLLLQHAS+
Subjt:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR

Query:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM
        IMSQESSLLALNSLYQLIE+RISTFQSALLLSSSLDFLYTGVLDEE ++NDAIVPIIYEE+DSD+ E+GDEMETDEDD E +EVE AF+DLSAGEVDDDM
Subjt:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM

Query:  SE
        SE
Subjt:  SE

XP_022150139.1 WD repeat-containing protein 43 [Momordica charantia] | 2.5e-231 | 77.14
Query:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKK--RKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEM
        IWS  DGSLLAEWKD EGKTD+GYSC+ACCFV KK  +K SSCVIA+GT+NGDVLAVNASSGETKWVS GCH+GGVIGLSFANKGRR+ TVGSNGMA EM
Subjt:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKK--RKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEM

Query:  DTETGSIIKEFKASKKSISSSAF---------------------------------------------------------------------SLGPVLSM
        DTETG+IIKEFKASKKSISSSAF                                                                     S GPVLSM
Subjt:  DTETGSIIKEFKASKKSISSSAF---------------------------------------------------------------------SLGPVLSM

Query:  KHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHS
        KHPPFVS+C+NI NEED+IVVLSVSVSGVAY+W+LK LSEDE  P KVTVK ND QSAEENHGSAKKNRISVIASRI G  DNEVS+LVTHGSMD PQ S
Subjt:  KHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHS

Query:  VLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNE
        + NIGYS KED +TAHEKKTLQQNDD S QGPH  EQAVTTPKSKKSKKKRAASD+DSLTAGDVS VGNG+ASDVLFNDD+NEPTMGEKLASLNL DQ+E
Subjt:  VLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNE

Query:  EEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ
        +E  EQ EP +PAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ
Subjt:  EEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ

Query:  ESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYE-EDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDD
        ESSLLALNSLYQLIESRISTFQSA+LLSSSLDFLYTGVLDEEVD+NDAIVPIIYE EDDSDD+ESGDEMETDED+EE EE E AF DLSAGEVDD
Subjt:  ESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYE-EDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDD

XP_022964606.1 WD repeat-containing protein 43 isoform X1 [Cucurbita moschata] | 5.5e-223 | 73.93
Query:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN
        L S  +  IW+TSDGSLLAEWKDP+GKTD GYSC+ACCFV KKRKNSSC+IA+GTN GDVL VNASSGETKWVS GCHLGGVIGLSFA+KGRR+HTVGSN
Subjt:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN

Query:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------
        G+AF+M+ ETGSII EFKASKKSISSSAFSL                                                                     
Subjt:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------

Query:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM
        GPVLSMKHPPFVS+CRNI N ED+IVVLSVSVSGVAYLWKLKFLSED+  PTKVTVK N+ +SAEENHGSAKKNRISV++S IQGL DNEVS+LVTHGSM
Subjt:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM

Query:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN
        DLPQH+VLNIGY AKED + A EK           +GPH  +QAVT+PKSKKSKKKRAASDLDS  AGDVSDVGN + S+VLFNDDLNEPTMG+KLASLN
Subjt:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN

Query:  LEDQNEEEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHA
        LE+QNE+E+ EQ EP +PAIPPSADSVQVLLKQAL ADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLH+LISIIQSRGAILVCA+PWLRGLLLQHA
Subjt:  LEDQNEEEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHA

Query:  SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKE-SGDEMETDEDDEENEEVEV-AFEDLSAGEV
        SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEE +ENDAIVPIIYEEDDSDDKE SGDEMET   DEE   VEV AF+DLSAGEV
Subjt:  SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKE-SGDEMETDEDDEENEEVEV-AFEDLSAGEV

Query:  DDDMSE
        DDDMSE
Subjt:  DDDMSE

XP_038899613.1 WD repeat-containing protein 43 [Benincasa hispida] | 9.7e-236 | 75.46
Query:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN
        L S  +  IWST DGSLLAEWKDP+GK D+GYSCMACCF  KKRKNS CV+A+GTN+GDVLAVNAS+GE KWVS GCHLGGVIGLSFANKGRR+H VGSN
Subjt:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN

Query:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------
        G   EMDTETG+IIKEFKASKKSISSS+FSL                                                                     
Subjt:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------

Query:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM
        GPVLSMKHPPFVS+CRN+ N+EDN+VVLSVSVSGVAYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR+SVIASRI G+ DNEVS+LVTHGSM
Subjt:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM

Query:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN
        DLPQHS+ +IGYS KED +TA   KTLQQNDD SEQGPH  EQ V TPKSKK KKKRAASDLDSLT GD+SDVGNG+ASDV+FNDDLNEP+MGEKLASLN
Subjt:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN

Query:  LEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHAS
        L DQNE+E RE+EP +PAIPPSADSVQVLLKQALHA+DR LLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLIS IQSRGAILVC LPWLRGLLLQHAS
Subjt:  LEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHAS

Query:  RIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDD
        +IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLY+GVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDE DEE +EVE AF+DLSAGEVDDD
Subjt:  RIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDD

Query:  MSE
        MSE
Subjt:  MSE
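
The alignments above follow the usual BLAST text layout, in which the middle line repeats a residue where query and subject are identical, shows `+` for a conservative substitution, and is blank otherwise. As a rough sketch under that assumption, the function below tallies identities and positives from a query line and its midline once the `Query:` prefix and the matching leading spaces have been removed. `midline_stats` is a hypothetical helper, and the percentages it gives for a fragment will not in general equal the whole-alignment %identity listed for each hit.

```python
def midline_stats(query: str, midline: str) -> dict:
    """Count identities and positives from a BLAST-style midline.

    ASSUMED convention: a letter in the midline marks an identical
    residue, '+' marks a conservative substitution, and a space marks
    a mismatch or a gap column.
    """
    # Trailing blanks are often lost in copied text, so pad the midline.
    midline = midline.ljust(len(query))
    if len(midline) != len(query):
        raise ValueError("midline is longer than the query line")
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    length = len(query)
    return {
        "identities": identities,
        "positives": positives,
        "length": length,
        "pct_identity": round(100.0 * identities / length, 2) if length else 0.0,
    }


# First ten alignment columns of the first GenBank hit.
print(midline_stats("IWSTSDGSLL", "IWST DGSLL"))
```

For a full alignment, the per-block counts would be summed over all blocks before computing a percentage.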

TrEMBL top hits | e value | % identity | Alignment
A0A0A0KBW9 Utp12 domain-containing protein | 1.8e-219 | 72.26
Query:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM
        S  +  IWST DGSLLAEWKD +GK D GYSCMACCF+ KKRK+S CV+A+GTN+GDVLAVNAS+GE KWVS GCH GGVIGLSFANKG R+ TVGSNGM
Subjt:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM

Query:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP
        A EMDTETG+IIKEFKASKKSISSSAFSL                                                                     GP
Subjt:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP

Query:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL
        +LSMKHPPFVS+CRN+ N+ED++VVLSVSVSG AYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR SV+ASRI G+ DNEVS+LVTHGS+DL
Subjt:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL

Query:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE
        PQH++L+IGY+ KED +TAHE KTLQQND  SEQGPH  EQ V TPKSKKSKKKRAAS+LDSLTAGDVSDVGNG+ SDVLFNDDLNEP+MGEKLASLNL 
Subjt:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE

Query:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR
        DQN++  REQE P +P IPPSADSVQVLLKQALHADDRALLLECLYTKD KVISKSIAQLNSSDVL LLH+LIS IQSRGAILVCALPWLR L+LQHAS+
Subjt:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR

Query:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM
        IMSQESSLLALNSLYQLIESR STFQSALLLSSSLDFLYT VLD+E ++ND IVPIIYEE+DSD+ E+GDEMET+EDD E +EVE AF+DLSAGEVDDDM
Subjt:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM

Query:  SE
        SE
Subjt:  SE

A0A1S3CDX9 WD repeat-containing protein 43 | 1.1e-224 | 73.42
Query:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM
        S  +  IWST DGSLLAEWKD +GK D GYSCMACC + KKRK+S C++A+GTNNGDVLAVNAS+GE KWVS GCH GGVIGLSFAN+GRR+HTVGSNGM
Subjt:  SFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGM

Query:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP
        A EMDTETG+IIKEFKASKKSISSSAFSL                                                                     GP
Subjt:  AFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GP

Query:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL
        VLSM HPPFVS+CRN+ N+ED++VVLSVSVSG AYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR+SV+AS+I  + DNEVS+LVTHGS+DL
Subjt:  VLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDL

Query:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE
        PQHS+L+IGY+ KED +TAHE KTLQQND  SEQGPH  EQ V TPKSKKSKKKRAASDLDS TAGDVSDVGNG+ASDV+FNDDLNEP+MGEKLASLNL 
Subjt:  PQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLE

Query:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR
        DQNE+  REQE P +P IPPSADSVQVLLKQALHADD ALLLECLYTKDDKVISKSIAQLNSSDVLKLLHS+IS IQSRGAILVCALPWLRGLLLQHAS+
Subjt:  DQNEEEDREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASR

Query:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM
        IMSQESSLLALNSLYQLIE+RISTFQSALLLSSSLDFLYTGVLDEE ++NDAIVPIIYEE+DSD+ E+GDEMETDEDD E +EVE AF+DLSAGEVDDDM
Subjt:  IMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDM

Query:  SE
        SE
Subjt:  SE

A0A5A7UYX1 WD repeat-containing protein 43 | 8.3e-225 | 74.16
Query:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEMDT
        IWST DGSLLAEWKD +GK D GYSCMACC + KKRK+S CV+A+GTNNGDVLAVNAS+GE KWVS GCH GGVIGLSFAN+GRR+HTVGSNGMA EMDT
Subjt:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEMDT

Query:  ETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GPVLSMKH
        ETG+IIKEFKASKKSISSSAFSL                                                                     GPVLSM H
Subjt:  ETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------GPVLSMKH

Query:  PPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHSVL
        PPFVS+CRN+ N+ED++VVLSVSVSG AYLWKLK LSEDE  PTKV+VK ND QSAEENHGSAKKNR+SV+ASRI  + DNEVS+LVTHGS+DLPQHS+L
Subjt:  PPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHSVL

Query:  NIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNEEE
        +IGY+ KED +TAHE KTLQQND  SEQGPH  EQ V  PKSKKSKKKRAASDLDS TAGDVSDVGNG+ASDV+FNDDLNEP+MGEKLASLNL DQNE+ 
Subjt:  NIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNEEE

Query:  DREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQES
         REQE P +P IPPSADSVQVLLKQALHADD ALLLECLYTKDDKVISKSIAQLNSSDVLKLLHS+IS IQSRGAILVCALPWLRGLLLQHAS+IMSQES
Subjt:  DREQE-PFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQES

Query:  SLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE
        SLLALNSLYQLIE+RISTFQSALLLSSSLDFLYTGVLDEE ++NDAIVPIIYEE+DSD+ E+GDEMETDEDD E +EVE AF+DLSAGEVDDDMSE
Subjt:  SLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE

A0A6J1D938 WD repeat-containing protein 43 | 1.2e-231 | 77.14
Query:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKK--RKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEM
        IWS  DGSLLAEWKD EGKTD+GYSC+ACCFV KK  +K SSCVIA+GT+NGDVLAVNASSGETKWVS GCH+GGVIGLSFANKGRR+ TVGSNGMA EM
Subjt:  IWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKK--RKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEM

Query:  DTETGSIIKEFKASKKSISSSAF---------------------------------------------------------------------SLGPVLSM
        DTETG+IIKEFKASKKSISSSAF                                                                     S GPVLSM
Subjt:  DTETGSIIKEFKASKKSISSSAF---------------------------------------------------------------------SLGPVLSM

Query:  KHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHS
        KHPPFVS+C+NI NEED+IVVLSVSVSGVAY+W+LK LSEDE  P KVTVK ND QSAEENHGSAKKNRISVIASRI G  DNEVS+LVTHGSMD PQ S
Subjt:  KHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHS

Query:  VLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNE
        + NIGYS KED +TAHEKKTLQQNDD S QGPH  EQAVTTPKSKKSKKKRAASD+DSLTAGDVS VGNG+ASDVLFNDD+NEPTMGEKLASLNL DQ+E
Subjt:  VLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNE

Query:  EEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ
        +E  EQ EP +PAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ
Subjt:  EEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQ

Query:  ESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYE-EDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDD
        ESSLLALNSLYQLIESRISTFQSA+LLSSSLDFLYTGVLDEEVD+NDAIVPIIYE EDDSDD+ESGDEMETDED+EE EE E AF DLSAGEVDD
Subjt:  ESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYE-EDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDD

A0A6J1HJF3 WD repeat-containing protein 43 isoform X1 | 2.7e-223 | 73.93
Query:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN
        L S  +  IW+TSDGSLLAEWKDP+GKTD GYSC+ACCFV KKRKNSSC+IA+GTN GDVL VNASSGETKWVS GCHLGGVIGLSFA+KGRR+HTVGSN
Subjt:  LPSFASFPIWSTSDGSLLAEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSN

Query:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------
        G+AF+M+ ETGSII EFKASKKSISSSAFSL                                                                     
Subjt:  GMAFEMDTETGSIIKEFKASKKSISSSAFSL---------------------------------------------------------------------

Query:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM
        GPVLSMKHPPFVS+CRNI N ED+IVVLSVSVSGVAYLWKLKFLSED+  PTKVTVK N+ +SAEENHGSAKKNRISV++S IQGL DNEVS+LVTHGSM
Subjt:  GPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSM

Query:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN
        DLPQH+VLNIGY AKED + A EK           +GPH  +QAVT+PKSKKSKKKRAASDLDS  AGDVSDVGN + S+VLFNDDLNEPTMG+KLASLN
Subjt:  DLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLN

Query:  LEDQNEEEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHA
        LE+QNE+E+ EQ EP +PAIPPSADSVQVLLKQAL ADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLH+LISIIQSRGAILVCA+PWLRGLLLQHA
Subjt:  LEDQNEEEDREQ-EPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHA

Query:  SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKE-SGDEMETDEDDEENEEVEV-AFEDLSAGEV
        SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEE +ENDAIVPIIYEEDDSDDKE SGDEMET   DEE   VEV AF+DLSAGEV
Subjt:  SRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKE-SGDEMETDEDDEENEEVEV-AFEDLSAGEV

Query:  DDDMSE
        DDDMSE
Subjt:  DDDMSE

SwissProt top hits | e value | % identity | Alignment
Q15061 WD repeat-containing protein 43 | 2.3e-06 | 28.64
Query:  NEPTMGEKLASLNLEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCA
        NE ++ E+L +++++   + ++  Q           +S  VLL Q L ++D  +L + L T++  +I K++ ++    ++ LL  L   +Q      V  
Subjt:  NEPTMGEKLASLNLEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCA

Query:  LPWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAI----VPIIYEEDDSDDKESGDEMETDEDDEENE
        + WL+ +L  HAS + +    +  L +LYQL+ESR+ TFQ    L   L  L T V   E  +          ++YEE+ S++ ES DE+  D+D E+N 
Subjt:  LPWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAI----VPIIYEEDDSDDKESGDEMETDEDDEENE

Query:  EVEVAFEDLSAGEVDDDMSE
        + +   E+ S  E D+D+ E
Subjt:  EVEVAFEDLSAGEVDDDMSE

Q6ZQL4 WD repeat-containing protein 43 | 2.1e-07 | 28.09
Query:  EPTMGEKLASLNLEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCAL
        E T+ E+L +++L+ +  ++D +            +S  VLL Q L ++D  +L + L TK+  +I +++ ++    V+ LL  L   +Q         +
Subjt:  EPTMGEKLASLNLEDQNEEEDREQEPFIPAIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCAL

Query:  PWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDEN----DAIVPIIYEEDDSD--------DKESGDEMETD
         WL+ +L  HAS + +    +  L +LYQL+ESR+ TFQ    L   L  L T V   E  +          ++YEE+ S+        +K+S D  + D
Subjt:  PWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDEN----DAIVPIIYEEDDSD--------DKESGDEMETD

Query:  ED---------DEENEEVEVAFEDLSAGEVDDDMS
        ED         DE+NEE +   ED    E D ++S
Subjt:  ED---------DEENEEVEVAFEDLSAGEVDDDMS

Arabidopsis top hits | e value | % identity | Alignment
AT1G15420.1 CONTAINS InterPro DOMAIN/s: Small-subunit processome, Utp12 (InterPro:IPR007148); Has 764 Blast hits to 656 proteins in 193 species: Archae - 0; Bacteria - 42; Metazoa - 237; Fungi - 154; Plants - 85; Viruses - 23; Other Eukaryotes - 223 (source: NCBI BLink). | 7.3e-48 | 48.36
Query:  SSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNL---EDQNEEEDREQEPFIPAIPPSADSVQVL
        SS+    + +  +   K KK  KKRA  + D  +  D     + +   VL +D LNEPT+G+KL SL+L   E  N EE           PP+A SV VL
Subjt:  SSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNL---EDQNEEEDREQEPFIPAIPPSADSVQVL

Query:  LKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSA
        L+QALHADDR+LLL+CLY +D++VI+ S+A+LNS++VLKLL++L+ I+QSRGAIL C +PW++ LLL H+S IMSQESSLLALN++YQLIESR+ST  +A
Subjt:  LKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQSA

Query:  LLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSD-DKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE
        + +SS LD L    LDEE DE     P+IYE+ DSD D+E G E   + D+E ++  + A + ++  E  DDMS+
Subjt:  LLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSD-DKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE

AT5G11240.1 transducin family protein / WD-40 repeat family protein | 3.4e-13 | 23.47
Query:  IWSTSDGSLLAEWKD-------------PEGKTDIGYSCMACCFVE--KKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRR
        IW T  G +  E+ D              +G   + Y+CM    +E  KKRK  + V+ +GT  GDVLA++ +SG+ KW    CH GGV  +S + K   
Subjt:  IWSTSDGSLLAEWKD-------------PEGKTDIGYSCMACCFVE--KKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRR

Query:  VHTVGSNGMAFEMDTETGSIIKEFKASKKSISS----------------------------------------SAF------------------------
        +++ G++GM  ++D  +G++I++FKAS K++SS                                         AF                        
Subjt:  VHTVGSNGMAFEMDTETGSIIKEFKASKKSISS----------------------------------------SAF------------------------

Query:  ----SLGPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSE-DEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSI
            S   VL+++HPP   D     NE+  + VL++S  GV Y W    + E   A PTKV +      +A+ +    K +   + A+++QG        
Subjt:  ----SLGPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSE-DEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSI

Query:  LVTHGSMDLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDD----SSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNE
        ++  GS     H+ +  G   K     + +K  LQ  +D    +S+ G  +    +T   SK SK++   + + +L      D        +    DL+E
Subjt:  LVTHGSMDLPQHSVLNIGYSAKEDTDTAHEKKTLQQNDD----SSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNE

Query:  PTMGEKLASLNLEDQNE---EEDREQEPFIPAIPPSADSVQVLLKQALHAD-DRALLLECLYTK---DDKVISKSIAQLNSSDVLKLLHSLISIIQSRGA
            +K   L+  D++    ++         ++     S+ +L     H +   A +++    K     K +  ++  +  S   K L +L ++ Q+R  
Subjt:  PTMGEKLASLNLEDQNE---EEDREQEPFIPAIPPSADSVQVLLKQALHAD-DRALLLECLYTK---DDKVISKSIAQLNSSDVLKLLHSLISIIQSRGA

Query:  ILVCALPWLRGLLLQHASRIMSQE-SSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEE
             LPW+  +++ H+  IMSQE  +   LN+L ++ +SR +  Q  L LS  L  L T  +++       I     E D+S+D+E  +++E     E 
Subjt:  ILVCALPWLRGLLLQHASRIMSQE-SSLLALNSLYQLIESRISTFQSALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEE

Query:  NEEVEVAFEDLSAGEVDDDMSE
        + E +++ +D    + DD + E
Subjt:  NEEVEVAFEDLSAGEVDDDMSE


Sequences
CDS sequence
ATGGAACCGAGATCGGTCGCAGGTCATCTCGTCTCGCTCGAGGCAAGCACGCCGCAACTTCACAAGCTAGGCCGTTTCACCACGTCAAAAAACGAGCAACAAAACAACCA
CCACAAGCGTCGTGTAGGCAAGCCCACCTCTACCACTCGACTCGACACCCCACTCTCTCTCCCAAGCCAGATTGAGCCGTCAGCCTCCTTCCCCGTGGTTGCCGTGAACG
CCAGCAGCCACCATACTTTTCCCTTCCTCTCCTACCACCAGCAGCACACCACGACGCCGGCGCCGACCTCCTCCGGCGACGTGAGCAACAAGGGCGGCGTGGTAGTGGGT
TTTCGGCTAGCAGTACAGCAGCGGCTTCTAGGCGCGAGTTTCGGTGGCTGTGACTCTGGGTTGCGTAGACAGCATTGTTGGTGGTCGTTTTCTGCAGTTTCCCGGTGTTT
TAAGGTGAGTTCTCGCAATGGTTCTGGTAGCGGGTCTTCAAGTCTACGTGTCGTTGAGCATCTCGTGCAGTGGTGGGCTAAAGGCTTGGAGGCTAAGAGCGTTGTCGTTG
GGTCAAGCGGCTTATCAATTGAGAAGCAGGCTGCAGCTGCAAAGTGCCGTCCATCGCTTCCAAGTTTTGCTAGTTTTCCAATTTGGAGTACTAGTGATGGGAGTTTACTG
GCAGAATGGAAGGATCCTGAGGGAAAAACTGATATTGGTTATTCCTGTATGGCCTGCTGTTTTGTGGAGAAGAAGCGTAAAAACAGTTCTTGTGTAATTGCTATGGGTAC
CAACAATGGGGATGTGTTGGCTGTAAATGCTTCAAGCGGTGAGACAAAGTGGGTTTCTGTTGGTTGCCATCTTGGTGGAGTTATTGGCCTTTCTTTTGCGAACAAAGGCC
GTAGAGTGCATACGGTTGGAAGTAATGGAATGGCGTTTGAGATGGACACTGAAACAGGAAGCATTATCAAGGAGTTCAAAGCTTCTAAAAAATCAATCTCCTCTTCAGCC
TTTTCACTTGGTCCGGTTCTGTCGATGAAGCATCCTCCATTTGTTTCGGATTGCAGAAATATTTGCAATGAAGAAGATAACATAGTTGTCTTGTCAGTATCAGTATCAGG
TGTAGCTTATTTATGGAAATTAAAGTTCCTATCAGAAGACGAGGCTGGTCCAACTAAAGTCACTGTTAAAGGTAATGACTTCCAATCAGCCGAGGAAAACCATGGAAGTG
CTAAGAAAAATCGAATTTCTGTCATTGCTTCCAGAATACAAGGTTTAAGAGACAATGAAGTGTCAATTCTTGTTACTCATGGCTCCATGGACCTACCACAACATAGTGTT
CTTAATATTGGTTATTCCGCAAAGGAAGATACAGATACTGCGCATGAGAAAAAGACTCTCCAACAAAATGATGATTCTTCCGAGCAAGGTCCCCATGTGACTGAACAGGC
AGTTACTACACCTAAAAGTAAGAAAAGCAAAAAGAAAAGAGCAGCATCTGATCTTGATAGTCTGACAGCTGGAGATGTCAGTGACGTTGGCAATGGCAATGCTTCCGATG
TTCTATTTAACGATGATTTAAATGAGCCAACCATGGGAGAGAAACTTGCAAGTTTGAATTTGGAAGACCAGAATGAAGAGGAAGATCGTGAACAAGAACCATTCATCCCT
GCGATACCACCAAGTGCGGACTCTGTTCAGGTTTTGCTTAAGCAAGCACTACATGCAGACGATCGTGCCCTTTTGCTAGAATGCTTATATACCAAGGATGATAAGGTTAT
TTCAAAATCAATAGCACAATTAAATTCATCTGATGTTCTCAAGCTTTTACACTCTCTGATATCCATTATCCAATCGAGAGGGGCCATTCTTGTATGCGCCCTCCCTTGGC
TGAGAGGTTTACTTCTCCAACATGCTAGTAGAATAATGTCTCAAGAATCTTCTTTACTCGCCTTGAATTCTCTATATCAGCTCATTGAGTCTAGAATTTCAACTTTCCAA
TCCGCTCTACTACTATCGAGTAGCTTAGACTTCCTTTACACAGGGGTCCTTGACGAGGAGGTGGACGAAAATGATGCCATTGTGCCGATTATTTACGAGGAGGACGATAG
CGACGATAAGGAATCAGGAGACGAAATGGAAACTGATGAAGATGATGAAGAGAATGAAGAAGTAGAAGTAGCCTTTGAGGATCTTAGTGCTGGTGAAGTGGATGATGACA
TGAGTGAATGA
mRNA sequence
ATGGAACCGAGATCGGTCGCAGGTCATCTCGTCTCGCTCGAGGCAAGCACGCCGCAACTTCACAAGCTAGGCCGTTTCACCACGTCAAAAAACGAGCAACAAAACAACCA
CCACAAGCGTCGTGTAGGCAAGCCCACCTCTACCACTCGACTCGACACCCCACTCTCTCTCCCAAGCCAGATTGAGCCGTCAGCCTCCTTCCCCGTGGTTGCCGTGAACG
CCAGCAGCCACCATACTTTTCCCTTCCTCTCCTACCACCAGCAGCACACCACGACGCCGGCGCCGACCTCCTCCGGCGACGTGAGCAACAAGGGCGGCGTGGTAGTGGGT
TTTCGGCTAGCAGTACAGCAGCGGCTTCTAGGCGCGAGTTTCGGTGGCTGTGACTCTGGGTTGCGTAGACAGCATTGTTGGTGGTCGTTTTCTGCAGTTTCCCGGTGTTT
TAAGGTGAGTTCTCGCAATGGTTCTGGTAGCGGGTCTTCAAGTCTACGTGTCGTTGAGCATCTCGTGCAGTGGTGGGCTAAAGGCTTGGAGGCTAAGAGCGTTGTCGTTG
GGTCAAGCGGCTTATCAATTGAGAAGCAGGCTGCAGCTGCAAAGTGCCGTCCATCGCTTCCAAGTTTTGCTAGTTTTCCAATTTGGAGTACTAGTGATGGGAGTTTACTG
GCAGAATGGAAGGATCCTGAGGGAAAAACTGATATTGGTTATTCCTGTATGGCCTGCTGTTTTGTGGAGAAGAAGCGTAAAAACAGTTCTTGTGTAATTGCTATGGGTAC
CAACAATGGGGATGTGTTGGCTGTAAATGCTTCAAGCGGTGAGACAAAGTGGGTTTCTGTTGGTTGCCATCTTGGTGGAGTTATTGGCCTTTCTTTTGCGAACAAAGGCC
GTAGAGTGCATACGGTTGGAAGTAATGGAATGGCGTTTGAGATGGACACTGAAACAGGAAGCATTATCAAGGAGTTCAAAGCTTCTAAAAAATCAATCTCCTCTTCAGCC
TTTTCACTTGGTCCGGTTCTGTCGATGAAGCATCCTCCATTTGTTTCGGATTGCAGAAATATTTGCAATGAAGAAGATAACATAGTTGTCTTGTCAGTATCAGTATCAGG
TGTAGCTTATTTATGGAAATTAAAGTTCCTATCAGAAGACGAGGCTGGTCCAACTAAAGTCACTGTTAAAGGTAATGACTTCCAATCAGCCGAGGAAAACCATGGAAGTG
CTAAGAAAAATCGAATTTCTGTCATTGCTTCCAGAATACAAGGTTTAAGAGACAATGAAGTGTCAATTCTTGTTACTCATGGCTCCATGGACCTACCACAACATAGTGTT
CTTAATATTGGTTATTCCGCAAAGGAAGATACAGATACTGCGCATGAGAAAAAGACTCTCCAACAAAATGATGATTCTTCCGAGCAAGGTCCCCATGTGACTGAACAGGC
AGTTACTACACCTAAAAGTAAGAAAAGCAAAAAGAAAAGAGCAGCATCTGATCTTGATAGTCTGACAGCTGGAGATGTCAGTGACGTTGGCAATGGCAATGCTTCCGATG
TTCTATTTAACGATGATTTAAATGAGCCAACCATGGGAGAGAAACTTGCAAGTTTGAATTTGGAAGACCAGAATGAAGAGGAAGATCGTGAACAAGAACCATTCATCCCT
GCGATACCACCAAGTGCGGACTCTGTTCAGGTTTTGCTTAAGCAAGCACTACATGCAGACGATCGTGCCCTTTTGCTAGAATGCTTATATACCAAGGATGATAAGGTTAT
TTCAAAATCAATAGCACAATTAAATTCATCTGATGTTCTCAAGCTTTTACACTCTCTGATATCCATTATCCAATCGAGAGGGGCCATTCTTGTATGCGCCCTCCCTTGGC
TGAGAGGTTTACTTCTCCAACATGCTAGTAGAATAATGTCTCAAGAATCTTCTTTACTCGCCTTGAATTCTCTATATCAGCTCATTGAGTCTAGAATTTCAACTTTCCAA
TCCGCTCTACTACTATCGAGTAGCTTAGACTTCCTTTACACAGGGGTCCTTGACGAGGAGGTGGACGAAAATGATGCCATTGTGCCGATTATTTACGAGGAGGACGATAG
CGACGATAAGGAATCAGGAGACGAAATGGAAACTGATGAAGATGATGAAGAGAATGAAGAAGTAGAAGTAGCCTTTGAGGATCTTAGTGCTGGTGAAGTGGATGATGACA
TGAGTGAATGA
Protein sequence
MEPRSVAGHLVSLEASTPQLHKLGRFTTSKNEQQNNHHKRRVGKPTSTTRLDTPLSLPSQIEPSASFPVVAVNASSHHTFPFLSYHQQHTTTPAPTSSGDVSNKGGVVVG
FRLAVQQRLLGASFGGCDSGLRRQHCWWSFSAVSRCFKVSSRNGSGSGSSSLRVVEHLVQWWAKGLEAKSVVVGSSGLSIEKQAAAAKCRPSLPSFASFPIWSTSDGSLL
AEWKDPEGKTDIGYSCMACCFVEKKRKNSSCVIAMGTNNGDVLAVNASSGETKWVSVGCHLGGVIGLSFANKGRRVHTVGSNGMAFEMDTETGSIIKEFKASKKSISSSA
FSLGPVLSMKHPPFVSDCRNICNEEDNIVVLSVSVSGVAYLWKLKFLSEDEAGPTKVTVKGNDFQSAEENHGSAKKNRISVIASRIQGLRDNEVSILVTHGSMDLPQHSV
LNIGYSAKEDTDTAHEKKTLQQNDDSSEQGPHVTEQAVTTPKSKKSKKKRAASDLDSLTAGDVSDVGNGNASDVLFNDDLNEPTMGEKLASLNLEDQNEEEDREQEPFIP
AIPPSADSVQVLLKQALHADDRALLLECLYTKDDKVISKSIAQLNSSDVLKLLHSLISIIQSRGAILVCALPWLRGLLLQHASRIMSQESSLLALNSLYQLIESRISTFQ
SALLLSSSLDFLYTGVLDEEVDENDAIVPIIYEEDDSDDKESGDEMETDEDDEENEEVEVAFEDLSAGEVDDDMSE
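
The protein sequence above should be the conceptual translation of the CDS sequence. A minimal sketch of that check follows, assuming the standard nuclear genetic code (the record does not name a translation table). `translate` is a hypothetical helper, and only the first 30 bases of the CDS are used in the demonstration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence up to the first stop codon.

    Whitespace and line breaks are ignored; codons containing
    anything other than A, C, G, T are reported as 'X'.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    residues = []
    for i in range(0, len(cds), 3):
        amino = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino == "*":  # stop codon ends the protein
            break
        residues.append(amino)
    return "".join(residues)


# First 30 bases of the CDS listed for Lag0005966.
print(translate("ATGGAACCGAGATCGGTCGCAGGTCATCTC"))  # prints: MEPRSVAGHL
```

Joining all CDS lines into one string and passing it to `translate` would allow a comparison against the full protein sequence listed above.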