CuGenDBv2

Sgr020877 (gene) of Monk fruit (Qingpiguo) v1 genome

Gene ID: Sgr020877
Organism: Siraitia grosvenorii cv. Qingpiguo (Monk fruit (Qingpiguo) v1)
Description: Trihelix transcription factor GT-2
Genome location: tig00153574:823149..830240
RNA-Seq Expression: Sgr020877
Synteny: Sgr020877
Gene Ontology terms: NA
InterPro domains: IPR001005 - SANT/Myb domain
                  IPR044822 - Myb/SANT-like DNA-binding domain 4


Homology
GenBank top hits (e value, % identity, alignment)
KAG6573527.1 hypothetical protein SDJN03_27414, partial [Cucurbita argyrosperma subsp. sororia] (e value: 3.2e-163; % identity: 84.01)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+ PSSRYSSYDVRSS SSHFSDPSSSSEFKLKSPM A+SSSSRA+VK KA+DLAR K+KPSDQNLTAMVKKFMEKRSGLKPKT K ATGL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKG   VEKKEK  E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKY EIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK WLQ  SSSPHTPTYDHEDASN LEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SP  PDD+LLKDVNPCLTPYYATKSK+FEAMGYDSPRDEILSHNRMESGF SCSRKLSKSSDCRQNSNK  TTKTARRSDEAKYTYGKPM KFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

XP_022142583.1 inner centromere protein A [Momordica charantia] (e value: 3.8e-172; % identity: 86.62)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAAN--SSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL----
        MASV+KPSSRYSSYDVRSSTSSHFSDPS+SSEFKLKSPMAAN  SSSSRALVK KASDLAR KSKPSDQNLTAMVKKFMEKRS  KPKTAK ATGL    
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAAN--SSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL----

Query:  ----------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR
                              KLFGKG+AAVEKKEK+ E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR
Subjt:  ----------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR

Query:  EEIKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACD
        EEIKSLKNAILFPDVMNSQLQE+LEKQDSELKQAKQ+IPTLQKQVT LTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYD EDASNSLEFSACD
Subjt:  EEIKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACD

Query:  PASPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        P SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNR ESGF+SCSRKLS+SSDCRQ SN+ NTT+TARRSDEAKY YGKPM KFY
Subjt:  PASPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

XP_022925334.1 uncharacterized protein LOC111432624 isoform X1 [Cucurbita moschata] (e value: 2.1e-162; % identity: 83.76)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+ PSSRYSSYDVRSS SSHFSDPSSSSEFKLKSPM A+SSSSRA+VK KA+DL R K+KPSDQNLTAMVKKFMEKRSGLKPKT K ATGL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKG   VEKKEK  E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKY EIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK WLQ  SSSPHTPTYDHEDASN LEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SP  PDD+LLKDVNPCLTPYYATKSK+FEAMGYDSPRDEILSHNRMESGF SCSRKLSKSSDCRQNSNK  TTKTARRSDEAKYTYGKPM KFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

XP_023542139.1 uncharacterized protein LOC111802113 isoform X1 [Cucurbita pepo subsp. pepo] (e value: 4.5e-165; % identity: 84.26)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+ PSSRYSSYDVRSS SSHFSDPSSSSEFKLKSPM A+SSSSRA+VK KA+DLAR K+KPSDQNLTAMVKKFMEKRSGLKPKT K ATGL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKG   VEKKEKE E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKY EIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK WLQ  SSSPHTPTYDHEDASN LEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SP  PDD+LLKDVNPCLTPYYATKSK+FEAMGYDSPRDEILSHNRMESGF SCSRKLSKSSDCRQNSNK  TTKTARRSDEAKYTYGKPM KFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

XP_038895034.1 uncharacterized protein LOC120083373 [Benincasa hispida] (e value: 6.5e-164; % identity: 84.73)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+KPSSRYSSYDVRSSTSSHFSDPSSS EF LKSP+ ANSSSSRALVK K SDLAR K+KPSDQNLTAMVKKFMEKRSG KPKT KQA GL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKGT  VEKKE + E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKL+LEEKYREIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGK+WLQ  S SPHTPTYD EDASNSLEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKF
        SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRME GFKSCSRKLSKSSDCRQNS+K NTTKTARRSDEAKY YGKPM KF
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKF

TrEMBL top hits (e value, % identity, alignment)
A0A0A0LTE0 Uncharacterized protein (e value: 4.2e-161; % identity: 82.49)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+KPSSRY+SYD+RSSTSSHFSDPSSSS+F +KSP+  NSSSSRALVK K SDLAR K KPSDQNLTAMVKKFMEKRSG KPKT K A GL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKGT  VEKKE + E KALTEVKGNTRTLAMVLRSERELLSLNK+QELEITELKLVLEEKYREIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNA+LFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK+WLQ  S SPHTPTYDHEDASNSLEFS CDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEIL  NRMESGFKSCSRKLSKSSDC+Q SNK NTTKT R+SDEAKYTYGKPMRKFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

A0A1S3CL74 uncharacterized protein LOC103501712 (e value: 1.9e-161; % identity: 82.95)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+KPSSRYSSYDVRSSTSSHFSDPSSSS+FK+KSP+ ANSSSSRALVK K +DLAR K KPSDQNLTAMVKKFMEKRSG KPK  K A GL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKGT  +EKK+ + E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGK+WLQ  S SPHTPTYDHEDASNSLEFS CDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKF
        SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPR E +S NRMESGFKSCSRKLSKSSDCRQNSNK NTTKT R+SDEAKYTYGKPM KF
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKF

A0A6J1CNL5 inner centromere protein A (e value: 1.8e-172; % identity: 86.62)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAAN--SSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL----
        MASV+KPSSRYSSYDVRSSTSSHFSDPS+SSEFKLKSPMAAN  SSSSRALVK KASDLAR KSKPSDQNLTAMVKKFMEKRS  KPKTAK ATGL    
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAAN--SSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL----

Query:  ----------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR
                              KLFGKG+AAVEKKEK+ E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR
Subjt:  ----------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQR

Query:  EEIKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACD
        EEIKSLKNAILFPDVMNSQLQE+LEKQDSELKQAKQ+IPTLQKQVT LTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYD EDASNSLEFSACD
Subjt:  EEIKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACD

Query:  PASPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        P SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNR ESGF+SCSRKLS+SSDCRQ SN+ NTT+TARRSDEAKY YGKPM KFY
Subjt:  PASPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

A0A6J1EHN1 uncharacterized protein LOC111432624 isoform X1 (e value: 1.0e-162; % identity: 83.76)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+ PSSRYSSYDVRSS SSHFSDPSSSSEFKLKSPM A+SSSSRA+VK KA+DL R K+KPSDQNLTAMVKKFMEKRSGLKPKT K ATGL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKG   VEKKEK  E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKY EIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK WLQ  SSSPHTPTYDHEDASN LEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SP  PDD+LLKDVNPCLTPYYATKSK+FEAMGYDSPRDEILSHNRMESGF SCSRKLSKSSDCRQNSNK  TTKTARRSDEAKYTYGKPM KFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

A0A6J1HTF1 uncharacterized protein LOC111466593 isoform X1 (e value: 3.2e-161; % identity: 83.25)
Query:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------
        MA V+ PSSRYSSYDVRSS SSHFSDPSSSSEFKLKSPM A+SSSSR +VK KA DLAR K+KP DQNLTAMVKKFMEKRSGLKPKT K ATGL      
Subjt:  MASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGL------

Query:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE
                            KLFGKG   VEKKEK  E KALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKY EIEKLKDLCLKQREE
Subjt:  --------------------KLFGKGTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREE

Query:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA
        IKSLKNAILFPDVMNSQLQ +LEKQDSELKQAKQ+IPTLQKQVTTLTGQL+SLAEDLAEVKADKYSGK WLQ  SSSPHTPTYDHEDASN LEFSACDP 
Subjt:  IKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPA

Query:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY
        SP  PDD+LLKDVNPCLTPYYATKSK+FEAMGYDSPRDEILSHNRMES F SCSRKLSKSSDCRQNSNK  TTKTARRSDEAKYTYGKPM KFY
Subjt:  SPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNRMESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY

SwissProt top hits (e value, % identity, alignment)
Q39117 Trihelix transcription factor GT-2 (e value: 3.5e-19; % identity: 32.2)
Query:  LDDDSCSTSDVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL----KESEGMRIE--NQCTED--
        L   S S+S   D+     +  + ++KR     LF + L  ++++KQE+M ++ ++ +E +EKERI REEAW+ + +    +E E +  E  N   +D  
Subjt:  LDDDSCSTSDVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL----KESEGMRIE--NQCTED--

Query:  --------DGGE----------SSIQKELKSD--------------------------------LSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEE
                 GG+           S +K+ +SD                                 S RWP+ EV+ALI +R +LE  ++  G+KG +WEE
Subjt:  --------DGGE----------SSIQKELKSD--------------------------------LSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEE

Query:  ISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        IS  M ++G+NRSAK+CKEKWEN+NKYFK+   + K
Subjt:  ISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

Q8H181 Trihelix transcription factor GTL2 (e value: 5.2e-15; % identity: 26.81)
Query:  NSSKKEKPVEVAMDNGGFGDIIGNNYFSEEETKDGGSGAVIAVENLSRR-----GEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRK---RTRSLELFVEN
        N +K+   VE     G  G+ +  +  +E++ +D   G V      ++R     G+   ++DD+ S+S     ++  +K    ++K   R   L+ F E 
Subjt:  NSSKKEKPVEVAMDNGGFGDIIGNNYFSEEETKDGGSGAVIAVENLSRR-----GEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRK---RTRSLELFVEN

Query:  LVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQR---RLKESEGMRIENQCTEDDGGES--------------------------------------
        LV  ++ +QE+MH++L++ + KKE+E+I REEAWK++   R+ +   +R + Q    D   +                                      
Subjt:  LVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQR---RLKESEGMRIENQCTEDDGGES--------------------------------------

Query:  -------------------SIQKEL----------------------KSDLSRRWPQAEVQALISLRTSL------EHKFR---ATGSKG-SIWEEISVE
                           +I K L                      KSDL +RWP+ EV ALI++R S+      +HK     +T SK   +WE IS +
Subjt:  -------------------SIQKEL----------------------KSDLSRRWPQAEVQALISLRTSL------EHKFR---ATGSKG-SIWEEISVE

Query:  MHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        M ++G+ RSAK+CKEKWEN+NKYF++T    K
Subjt:  MHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

Q9C6K3 Trihelix transcription factor DF1 (e value: 1.4e-20; % identity: 28.85)
Query:  GPQLDDDSCSTS---DVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWK---------------------
        G  L D+S S+S       D+         ++KR R  ++F E L+ +V+DKQE++ ++ ++ +EK+E ER+VREE+W+                     
Subjt:  GPQLDDDSCSTS---DVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWK---------------------

Query:  ----------QRRLKESE---------------GMRIEN------------------------------QCTEDDGGESSIQKELKSDLSRRWPQAEVQA
                   ++L E +                M++ N                                T+ D G         S  S RWP+ E++A
Subjt:  ----------QRRLKESE---------------GMRIEN------------------------------QCTEDDGGESSIQKELKSDLSRRWPQAEVQA

Query:  LISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        LI LRT+L+ K++  G KG +WEEIS  M ++G NR++K+CKEKWEN+NKYFK+   + K
Subjt:  LISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

Q9C882 Trihelix transcription factor GTL1 (e value: 2.8e-16; % identity: 52.44)
Query:  SSIQKELKSDLSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        SS Q  L S  S RWP+AE+ ALI+LR+ +E +++    KG +WEEIS  M +MG+NR+AK+CKEKWEN+NKY+K+   + K
Subjt:  SSIQKELKSDLSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

Q9C882 Trihelix transcription factor GTL1 (e value: 6.6e-02; % identity: 36.36)
Query:  GEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRKR------TRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL
        G G   DDD     D+  D  +     + KRKR       + +ELF E LV +V+ KQ  M +  ++ +EK+E+ER+ REEAWK++ +
Subjt:  GEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRKR------TRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL

Q9LZS0 Trihelix transcription factor PTL (e value: 3.9e-10; % identity: 43.08)
Query:  RWPQAEVQALISLRTSLEHKFRATGSKGSIWEEIS-VEMHKMGHNRSAKKCKEKWENMNKYFKRT
        RWP+ E   L+ +R+ L+HKF+    KG +W+E+S +   + G+ RS KKC+EK+EN+ KY+++T
Subjt:  RWPQAEVQALISLRTSLEHKFRATGSKGSIWEEIS-VEMHKMGHNRSAKKCKEKWENMNKYFKRT

Arabidopsis top hits (e value, % identity, alignment)
AT1G76880.1 Duplicated homeodomain-like superfamily protein (e value: 1.0e-21; % identity: 28.85)
Query:  GPQLDDDSCSTS---DVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWK---------------------
        G  L D+S S+S       D+         ++KR R  ++F E L+ +V+DKQE++ ++ ++ +EK+E ER+VREE+W+                     
Subjt:  GPQLDDDSCSTS---DVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWK---------------------

Query:  ----------QRRLKESE---------------GMRIEN------------------------------QCTEDDGGESSIQKELKSDLSRRWPQAEVQA
                   ++L E +                M++ N                                T+ D G         S  S RWP+ E++A
Subjt:  ----------QRRLKESE---------------GMRIEN------------------------------QCTEDDGGESSIQKELKSDLSRRWPQAEVQA

Query:  LISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        LI LRT+L+ K++  G KG +WEEIS  M ++G NR++K+CKEKWEN+NKYFK+   + K
Subjt:  LISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

AT1G76890.2 Duplicated homeodomain-like superfamily protein (e value: 2.5e-20; % identity: 32.2)
Query:  LDDDSCSTSDVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL----KESEGMRIE--NQCTED--
        L   S S+S   D+     +  + ++KR     LF + L  ++++KQE+M ++ ++ +E +EKERI REEAW+ + +    +E E +  E  N   +D  
Subjt:  LDDDSCSTSDVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQRRL----KESEGMRIE--NQCTED--

Query:  --------DGGE----------SSIQKELKSD--------------------------------LSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEE
                 GG+           S +K+ +SD                                 S RWP+ EV+ALI +R +LE  ++  G+KG +WEE
Subjt:  --------DGGE----------SSIQKELKSD--------------------------------LSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEE

Query:  ISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
        IS  M ++G+NRSAK+CKEKWEN+NKYFK+   + K
Subjt:  ISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK

AT4G17240.1 unknown protein (e value: 5.1e-58; % identity: 47.73)
Query:  SSRYSSYDVRSS-TSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARG----KSKPSDQNLTAMVKKFME-KRSGLK--------PKTAKQATG
        +SRY+SYD RSS TSS  SD SSS+EFK   P+     SS+A+V+ K+S L +     K   +  NLT M+KK ME K+S  K        P+  K+   
Subjt:  SSRYSSYDVRSS-TSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARG----KSKPSDQNLTAMVKKFME-KRSGLK--------PKTAKQATG

Query:  LKLFGKGTAAVEKKE--KETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREEIKSLKNAILFPDVMNSQ
         K  GK T    +++   + + KALTEVK NTRTL+MVLRSERELL +NK+QE+EI ELK  LEEK RE+EKLKDLCLKQREEIKSLK+A+LFPD MNSQ
Subjt:  LKLFGKGTAAVEKKE--KETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREEIKSLKNAILFPDVMNSQ

Query:  LQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKY-SGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPASPGSPDDFLLKDVNPC
        + ++      EL QA+++IP LQKQV +L GQL  +A+DLAEVKA+KY S   + Q  +SS     YD      SLEFS+      GSPD   L+D+NPC
Subjt:  LQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKY-SGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPASPGSPDDFLLKDVNPC

Query:  LTPYYATKSKEFEAMGYDSPRDEILSHNRMES---GFKSCSR--KLSKSSDCRQNSNKPNTTKTARRSDEAKYTY
        LTPY   K KE+E +  DS  + +   + + +     KS SR  K+S+SS+           K  +RS+E+K  Y
Subjt:  LTPYYATKSKEFEAMGYDSPRDEILSHNRMES---GFKSCSR--KLSKSSDCRQNSNKPNTTKTARRSDEAKYTY

AT4G17240.2 unknown protein (e value: 6.1e-43; % identity: 42.13)
Query:  SSRYSSYDVRSS-TSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARG----KSKPSDQNLTAMVKKFME-KRSGLK--------PKTAKQATG
        +SRY+SYD RSS TSS  SD SSS+EFK   P+     SS+A+V+ K+S L +     K   +  NLT M+KK ME K+S  K        P+  K+   
Subjt:  SSRYSSYDVRSS-TSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARG----KSKPSDQNLTAMVKKFME-KRSGLK--------PKTAKQATG

Query:  LKLFGKGTAAVEKKE--KETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREEIKSLKNAILFPDVMNSQ
         K  GK T    +++   + + KALTEVK NTRTL+M+            E+     ++K+ L     ++EKLKDLCLKQREEIKSLK+A+LFPD MNSQ
Subjt:  LKLFGKGTAAVEKKE--KETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREEIKSLKNAILFPDVMNSQ

Query:  LQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKY-SGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPASPGSPDDFLLKDVNPC
        + ++      EL QA+++IP LQKQV +L GQL  +A+DLAEVKA+KY S   + Q  +SS     YD      SLEFS+      GSPD   L+D+NPC
Subjt:  LQELLEKQDSELKQAKQVIPTLQKQVTTLTGQLHSLAEDLAEVKADKY-SGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPASPGSPDDFLLKDVNPC

Query:  LTPYYATKSKEFEAMGYDSPRDEILSHNRMES---GFKSCSR--KLSKSSDCRQNSNKPNTTKTARRSDEAKYTY
        LTPY   K KE+E +  DS  + +   + + +     KS SR  K+S+SS+           K  +RS+E+K  Y
Subjt:  LTPYYATKSKEFEAMGYDSPRDEILSHNRMES---GFKSCSR--KLSKSSDCRQNSNKPNTTKTARRSDEAKYTY

AT5G47660.1 Homeodomain-like superfamily protein (e value: 1.0e-26; % identity: 31.99)
Query:  MDLFTADHRITSSDDFPQHVAPFPDPTD-----LLYAAPSAVFPPADIIDHCPNPPPPPQKLRPIRCNGRSPAGSQAENIFDGALRNFQGIPSSPEGGFT
        M+L   D R    DDF + + PF D +D     +            D +    +   PPQKL+PIRC  + P+ S+  +  D        +P    G F 
Subjt:  MDLFTADHRITSSDDFPQHVAPFPDPTD-----LLYAAPSAVFPPADIIDHCPNPPPPPQKLRPIRCNGRSPAGSQAENIFDGALRNFQGIPSSPEGGFT

Query:  GDQLCVANIDPSEYFNSSKKEKPVEVAMDNGGFGDIIGNNYFSEEETKDGGSGAVIAVENLSRRGEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRKR-TR
                    E    SK     E      GF          E+++         A   +S  G       DS S SD   D+   +K +  KRKR TR
Subjt:  GDQLCVANIDPSEYFNSSKKEKPVEVAMDNGGFGDIIGNNYFSEEETKDGGSGAVIAVENLSRRGEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRKR-TR

Query:  -SLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQR---RLKESEGMR-------------------------------------IENQC
          LE F+E LV  ++ +QE+MH QLI+++EK E ERI REEAW+Q+   R+ ++E  R                                     +  QC
Subjt:  -SLELFVENLVMKVLDKQEQMHQQLIDMIEKKEKERIVREEAWKQR---RLKESEGMR-------------------------------------IENQC

Query:  TEDDGGESSIQKELKSDLS-------RRWPQAEVQALISLRTSLEHKFRATG-SKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK
         ++    +  ++E+K   S       RRWPQ EVQALIS R+ +E K   TG +KG+IW+EIS  M + G+ RSAKKCKEKWENMNKY++R    G+
Subjt:  TEDDGGESSIQKELKSDLS-------RRWPQAEVQALISLRTSLEHKFRATG-SKGSIWEEISVEMHKMGHNRSAKKCKEKWENMNKYFKRTIGTGK


Sequences
CDS sequence
ATGGACCTCTTCACCGCCGATCATCGGATAACCAGCTCCGACGACTTCCCGCAGCACGTAGCGCCATTTCCCGATCCCACGGACCTGCTTTACGCCGCTCCATCGGCCGT
ATTTCCCCCGGCCGATATTATCGACCACTGTCCGAACCCTCCACCGCCGCCGCAGAAGCTTCGTCCCATCCGGTGCAACGGCAGGTCCCCGGCGGGTTCTCAGGCCGAGA
ACATCTTCGACGGAGCGCTAAGAAATTTCCAGGGCATACCGTCGTCGCCAGAGGGTGGATTTACTGGTGATCAGCTCTGTGTGGCTAATATTGACCCGAGTGAGTACTTC
AATTCCTCGAAGAAGGAAAAGCCTGTTGAGGTCGCGATGGATAATGGCGGCTTCGGGGATATCATCGGAAACAATTATTTCTCGGAGGAAGAGACGAAGGACGGCGGTTC
GGGTGCAGTTATCGCTGTGGAGAATTTAAGCCGGAGAGGGGAAGGACCTCAATTAGATGACGATTCTTGTTCGACTTCAGATGTTGGTGATGACATTGTAAGCACAAAAA
AACCTTTAAATCATAAGAGAAAGAGAACAAGATCGCTCGAGCTTTTTGTGGAAAATTTGGTAATGAAGGTATTGGATAAACAGGAGCAGATGCATCAGCAGTTGATCGAT
ATGATAGAAAAGAAGGAGAAGGAAAGAATAGTCAGAGAAGAAGCTTGGAAGCAGAGGAGATTGAAAGAATCAGAAGGGATGAGGATTGAAAACCAATGTACAGAAGATGA
TGGAGGTGAAAGCAGCATTCAGAAGGAATTAAAAAGTGATCTTAGTAGGAGATGGCCTCAGGCTGAAGTACAGGCCTTGATATCACTACGAACGTCTCTGGAACATAAAT
TTCGTGCTACAGGCTCAAAAGGATCAATATGGGAGGAGATATCAGTTGAGATGCATAAGATGGGTCACAACCGCTCGGCGAAGAAGTGCAAAGAAAAATGGGAAAATATG
AACAAATATTTCAAAAGGACAATTGGAACCGGGAAGACTAGTATTACAAATGCCCAGATTTCTGATAACGCAGTTTCTTCATGTTTACCCTTGGTCCATTCATGGCTGCT
TCCCCAATCCGCTTGCATTTCGGAATTTAAACTCATGGATGTTTCAGACAATTGTGCTCCTATGACAGCGCCGAACACAGAGCTGCAGAAGCCGCCACGTGTCTACTCCG
GCGCAAACCTTTTAAAATCCAATCTCTTCGCTCCATCTCCTCCGATTCAAAACCCTAAAAATGCGCTGAATCAAACACTGCCAATCTTAGTTGAAGCTCTGGTAGTTGAT
ACCTCCCTCTTAAAGATCTCAAGTTTTCCATCCATGGCCTCCGTCGTCAAGCCCTCTTCGCGCTACAGTTCCTACGATGTGCGCTCTTCAACTTCCTCCCACTTCTCCGA
CCCTTCTTCTTCCTCCGAGTTCAAGCTCAAGTCCCCCATGGCAGCCAACTCGTCGTCTTCCCGCGCTCTTGTTAAGTGCAAGGCGTCTGATCTGGCTAGAGGCAAATCGA
AGCCGTCTGATCAGAACTTGACGGCGATGGTGAAGAAGTTCATGGAGAAACGCTCTGGTTTGAAGCCGAAGACGGCGAAGCAGGCGACGGGGTTGAAGCTGTTTGGGAAG
GGAACTGCGGCGGTGGAGAAGAAGGAGAAGGAGACGGAGGCGAAGGCGTTGACGGAAGTGAAAGGGAATACGAGGACATTGGCGATGGTGCTGAGAAGTGAAAGAGAGCT
TTTGAGTTTGAATAAGGAGCAGGAGTTGGAAATCACTGAGCTCAAATTAGTTCTGGAAGAGAAGTACAGAGAAATTGAGAAGTTGAAGGATTTATGTTTGAAGCAAAGAG
AAGAAATAAAGTCATTGAAAAATGCAATATTGTTCCCAGATGTTATGAATTCTCAGCTTCAAGAGCTGCTTGAAAAGCAGGATTCAGAGTTGAAGCAAGCCAAACAAGTC
ATCCCTACTCTGCAAAAGCAGGTCACTACTCTCACTGGCCAGCTTCATTCCCTCGCCGAGGACCTTGCCGAGGTGAAGGCAGATAAATATTCAGGAAAGGCCTGGTTGCA
AAATAATAGCAGTTCTCCTCACACACCAACATATGATCACGAGGATGCTTCTAACTCTTTGGAGTTCAGTGCCTGCGATCCAGCATCCCCTGGCAGTCCAGATGACTTTT
TGCTGAAGGATGTGAATCCCTGTCTAACACCCTATTATGCAACTAAATCCAAGGAGTTTGAGGCAATGGGATATGATTCTCCGCGAGATGAAATCTTATCCCACAACAGA
ATGGAATCTGGTTTTAAATCTTGTTCCAGAAAATTGTCCAAAAGTTCTGATTGCAGGCAGAATTCCAACAAACCAAACACTACAAAAACAGCCCGAAGATCTGATGAAGC
CAAATACACATATGGAAAGCCAATGCGTAAATTTTACTGA
mRNA sequence
ATGGACCTCTTCACCGCCGATCATCGGATAACCAGCTCCGACGACTTCCCGCAGCACGTAGCGCCATTTCCCGATCCCACGGACCTGCTTTACGCCGCTCCATCGGCCGT
ATTTCCCCCGGCCGATATTATCGACCACTGTCCGAACCCTCCACCGCCGCCGCAGAAGCTTCGTCCCATCCGGTGCAACGGCAGGTCCCCGGCGGGTTCTCAGGCCGAGA
ACATCTTCGACGGAGCGCTAAGAAATTTCCAGGGCATACCGTCGTCGCCAGAGGGTGGATTTACTGGTGATCAGCTCTGTGTGGCTAATATTGACCCGAGTGAGTACTTC
AATTCCTCGAAGAAGGAAAAGCCTGTTGAGGTCGCGATGGATAATGGCGGCTTCGGGGATATCATCGGAAACAATTATTTCTCGGAGGAAGAGACGAAGGACGGCGGTTC
GGGTGCAGTTATCGCTGTGGAGAATTTAAGCCGGAGAGGGGAAGGACCTCAATTAGATGACGATTCTTGTTCGACTTCAGATGTTGGTGATGACATTGTAAGCACAAAAA
AACCTTTAAATCATAAGAGAAAGAGAACAAGATCGCTCGAGCTTTTTGTGGAAAATTTGGTAATGAAGGTATTGGATAAACAGGAGCAGATGCATCAGCAGTTGATCGAT
ATGATAGAAAAGAAGGAGAAGGAAAGAATAGTCAGAGAAGAAGCTTGGAAGCAGAGGAGATTGAAAGAATCAGAAGGGATGAGGATTGAAAACCAATGTACAGAAGATGA
TGGAGGTGAAAGCAGCATTCAGAAGGAATTAAAAAGTGATCTTAGTAGGAGATGGCCTCAGGCTGAAGTACAGGCCTTGATATCACTACGAACGTCTCTGGAACATAAAT
TTCGTGCTACAGGCTCAAAAGGATCAATATGGGAGGAGATATCAGTTGAGATGCATAAGATGGGTCACAACCGCTCGGCGAAGAAGTGCAAAGAAAAATGGGAAAATATG
AACAAATATTTCAAAAGGACAATTGGAACCGGGAAGACTAGTATTACAAATGCCCAGATTTCTGATAACGCAGTTTCTTCATGTTTACCCTTGGTCCATTCATGGCTGCT
TCCCCAATCCGCTTGCATTTCGGAATTTAAACTCATGGATGTTTCAGACAATTGTGCTCCTATGACAGCGCCGAACACAGAGCTGCAGAAGCCGCCACGTGTCTACTCCG
GCGCAAACCTTTTAAAATCCAATCTCTTCGCTCCATCTCCTCCGATTCAAAACCCTAAAAATGCGCTGAATCAAACACTGCCAATCTTAGTTGAAGCTCTGGTAGTTGAT
ACCTCCCTCTTAAAGATCTCAAGTTTTCCATCCATGGCCTCCGTCGTCAAGCCCTCTTCGCGCTACAGTTCCTACGATGTGCGCTCTTCAACTTCCTCCCACTTCTCCGA
CCCTTCTTCTTCCTCCGAGTTCAAGCTCAAGTCCCCCATGGCAGCCAACTCGTCGTCTTCCCGCGCTCTTGTTAAGTGCAAGGCGTCTGATCTGGCTAGAGGCAAATCGA
AGCCGTCTGATCAGAACTTGACGGCGATGGTGAAGAAGTTCATGGAGAAACGCTCTGGTTTGAAGCCGAAGACGGCGAAGCAGGCGACGGGGTTGAAGCTGTTTGGGAAG
GGAACTGCGGCGGTGGAGAAGAAGGAGAAGGAGACGGAGGCGAAGGCGTTGACGGAAGTGAAAGGGAATACGAGGACATTGGCGATGGTGCTGAGAAGTGAAAGAGAGCT
TTTGAGTTTGAATAAGGAGCAGGAGTTGGAAATCACTGAGCTCAAATTAGTTCTGGAAGAGAAGTACAGAGAAATTGAGAAGTTGAAGGATTTATGTTTGAAGCAAAGAG
AAGAAATAAAGTCATTGAAAAATGCAATATTGTTCCCAGATGTTATGAATTCTCAGCTTCAAGAGCTGCTTGAAAAGCAGGATTCAGAGTTGAAGCAAGCCAAACAAGTC
ATCCCTACTCTGCAAAAGCAGGTCACTACTCTCACTGGCCAGCTTCATTCCCTCGCCGAGGACCTTGCCGAGGTGAAGGCAGATAAATATTCAGGAAAGGCCTGGTTGCA
AAATAATAGCAGTTCTCCTCACACACCAACATATGATCACGAGGATGCTTCTAACTCTTTGGAGTTCAGTGCCTGCGATCCAGCATCCCCTGGCAGTCCAGATGACTTTT
TGCTGAAGGATGTGAATCCCTGTCTAACACCCTATTATGCAACTAAATCCAAGGAGTTTGAGGCAATGGGATATGATTCTCCGCGAGATGAAATCTTATCCCACAACAGA
ATGGAATCTGGTTTTAAATCTTGTTCCAGAAAATTGTCCAAAAGTTCTGATTGCAGGCAGAATTCCAACAAACCAAACACTACAAAAACAGCCCGAAGATCTGATGAAGC
CAAATACACATATGGAAAGCCAATGCGTAAATTTTACTGA
Protein sequence
MDLFTADHRITSSDDFPQHVAPFPDPTDLLYAAPSAVFPPADIIDHCPNPPPPPQKLRPIRCNGRSPAGSQAENIFDGALRNFQGIPSSPEGGFTGDQLCVANIDPSEYF
NSSKKEKPVEVAMDNGGFGDIIGNNYFSEEETKDGGSGAVIAVENLSRRGEGPQLDDDSCSTSDVGDDIVSTKKPLNHKRKRTRSLELFVENLVMKVLDKQEQMHQQLID
MIEKKEKERIVREEAWKQRRLKESEGMRIENQCTEDDGGESSIQKELKSDLSRRWPQAEVQALISLRTSLEHKFRATGSKGSIWEEISVEMHKMGHNRSAKKCKEKWENM
NKYFKRTIGTGKTSITNAQISDNAVSSCLPLVHSWLLPQSACISEFKLMDVSDNCAPMTAPNTELQKPPRVYSGANLLKSNLFAPSPPIQNPKNALNQTLPILVEALVVD
TSLLKISSFPSMASVVKPSSRYSSYDVRSSTSSHFSDPSSSSEFKLKSPMAANSSSSRALVKCKASDLARGKSKPSDQNLTAMVKKFMEKRSGLKPKTAKQATGLKLFGK
GTAAVEKKEKETEAKALTEVKGNTRTLAMVLRSERELLSLNKEQELEITELKLVLEEKYREIEKLKDLCLKQREEIKSLKNAILFPDVMNSQLQELLEKQDSELKQAKQV
IPTLQKQVTTLTGQLHSLAEDLAEVKADKYSGKAWLQNNSSSPHTPTYDHEDASNSLEFSACDPASPGSPDDFLLKDVNPCLTPYYATKSKEFEAMGYDSPRDEILSHNR
MESGFKSCSRKLSKSSDCRQNSNKPNTTKTARRSDEAKYTYGKPMRKFY