CuGenDBv2

Sgr019020 (gene) of Monk fruit (Qingpiguo) v1 genome

Gene ID: Sgr019020
Organism: Siraitia grosvenorii cv. Qingpiguo (Monk fruit (Qingpiguo) v1)
Description: protein KOKOPELLI isoform X1
Genome location: tig00153234:331820..334312
RNA-Seq Expression: Sgr019020
Synteny: Sgr019020
Gene Ontology terms: NA
InterPro domains: NA
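The genome location in the record above follows a `contig:start..end` pattern. The following is a minimal sketch for splitting it into its parts. It assumes the coordinates are 1-based and inclusive, which the page does not state, and `parse_location` is a hypothetical helper name, not part of any CuGenDB tool.

```python
import re


def parse_location(location):
    """Split a 'contig:start..end' string into (contig, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    contig, start, end = match.groups()
    return contig, int(start), int(end)


contig, start, end = parse_location("tig00153234:331820..334312")
# Span length under the ASSUMED 1-based, inclusive coordinate convention.
span_length = end - start + 1
print(contig, start, end, span_length)
```

Under that assumption, the gene spans 2493 bases on the contig.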


Homology
GenBank top hits | e-value | % identity | Alignment
XP_022154937.1 protein KOKOPELLI isoform X1 [Momordica charantia] | e-value: 6.7e-168 | % identity: 64.05
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSE-RLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME
        M+V+E+YLDLLALR LYILLLKSCLRDANSE +LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSE-RLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME

Query:  HNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITE
        +NQS RK G                NVA N LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ E
Subjt:  HNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITE

Query:  PIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRK
        PI G                                   HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS   
Subjt:  PIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRK

Query:  LLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRY
         +N+ +I ERRCQN PG MIMRPTLL            NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRY
Subjt:  LLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRY

Query:  RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGK
        RSS IS+K FR SHGKKGSKKAIGRFKRL+NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+
Subjt:  RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGK

Query:  FQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        FQALAEGLRSHVWK  AMKKKELR  RLG   KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  FQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

XP_022154939.1 protein KOKOPELLI isoform X2 [Momordica charantia] | e-value: 6.0e-169 | % identity: 64.16
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH
        M+V+E+YLDLLALR LYILLLKSCLRDANSE LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME+
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH

Query:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP
        NQS RK G                NVA N LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ EP
Subjt:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP

Query:  IRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKL
        I G                                   HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS    
Subjt:  IRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKL

Query:  LNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYR
        +N+ +I ERRCQN PG MIMRPTLL            NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRYR
Subjt:  LNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYR

Query:  SSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKF
        SS IS+K FR SHGKKGSKKAIGRFKRL+NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+F
Subjt:  SSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKF

Query:  QALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        QALAEGLRSHVWK  AMKKKELR  RLG   KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  QALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

XP_022154940.1 uncharacterized protein LOC111022084 isoform X3 [Momordica charantia] | e-value: 5.0e-155 | % identity: 62.8
Query:  SERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEHNQSARKMGNLETEDNPRMARSSALNVATN
        S++LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME+NQS RK G                NVA N
Subjt:  SERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEHNQSARKMGNLETEDNPRMARSSALNVATN

Query:  HLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVE
         LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ EPI G                          
Subjt:  HLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVE

Query:  SLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHIS
                 HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS    +N+ +I ERRCQN PG MIMRPTLL    
Subjt:  SLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHIS

Query:  RGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLK
                NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRYRSS IS+K FR SHGKKGSKKAIGRFKRL+
Subjt:  RGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLK

Query:  NKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKFQALAEGLRSHVWKSKAMKKKELR--RLGG
        NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+FQALAEGLRSHVWK  AMKKKELR  RLG 
Subjt:  NKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKFQALAEGLRSHVWKSKAMKKKELR--RLGG

Query:  GRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
          KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  GRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

XP_038877121.1 protein KOKOPELLI-like isoform X1 [Benincasa hispida] | e-value: 3.4e-148 | % identity: 59.79
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH
        MDVD++YLDLLALR LYILLLKSCL DANSE LDERAQILLK+LLDDATAGVLE  S  LAT+S  F+NF HK         D KQ KPL  KV EWM+H
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH

Query:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP
        NQ+ RKMGN E  D     R+SA NVA N+LS+ IS ALRRIELHILSLQ CTSQ RR TR H       + LQ NE L+QQ V  RT  STL++R T+P
Subjt:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP

Query:  IRGSHNLRSHISRHLLGGQ-NVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSG-CSVGSKATVRSGR
        I+G          H +G Q  VKP              NHCSE+VHGFRIPLSQ NDE  KP T+ET I+K+HK++NPM LIDKSG  SVGSKAT R   
Subjt:  IRGSHNLRSHISRHLLGGQ-NVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSG-CSVGSKATVRSGR

Query:  KLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHV-ATQQESENTNSE--SESASSSSWETQQTSESETT-----DYPSSPTHQKGPPA
        KL    + Q +R QNS G+M+M PTLLDH      R +  + KTH+ ATQQESE T+SE  S S+SSSSW TQ+TS SET        PSSP+HQ  P +
Subjt:  KLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHV-ATQQESENTNSE--SESASSSSWETQQTSESETT-----DYPSSPTHQKGPPA

Query:  TGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIF-HHHHHHHHHHNTNTFMWK-HLRKIFHLHRTDNKKL--TSEGGYGKLKKSAIRS
        T S++SS        TKTF    GK  SKK +GRFKRLKNKLG++F HHHHHHHHHHN+N FMWK  LRKIF  H  DNK+L  + E G  K+KK AIR+
Subjt:  TGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIF-HHHHHHHHHHNTNTFMWK-HLRKIFHLHRTDNKKL--TSEGGYGKLKKSAIRS

Query:  VSRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQL
        V  KNQVGKFQALAEGLRSHVW+SKAMK+K ++ +  G KKGVKKL WW+MFR RRGV+LP KG +KIGYVN+K +L
Subjt:  VSRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQL

XP_038877123.1 protein KOKOPELLI-like isoform X3 [Benincasa hispida] | e-value: 3.4e-148 | % identity: 59.79
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH
        MDVD++YLDLLALR LYILLLKSCL DANSE LDERAQILLK+LLDDATAGVLE  S  LAT+S  F+NF HK         D KQ KPL  KV EWM+H
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH

Query:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP
        NQ+ RKMGN E  D     R+SA NVA N+LS+ IS ALRRIELHILSLQ CTSQ RR TR H       + LQ NE L+QQ V  RT  STL++R T+P
Subjt:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP

Query:  IRGSHNLRSHISRHLLGGQ-NVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSG-CSVGSKATVRSGR
        I+G          H +G Q  VKP              NHCSE+VHGFRIPLSQ NDE  KP T+ET I+K+HK++NPM LIDKSG  SVGSKAT R   
Subjt:  IRGSHNLRSHISRHLLGGQ-NVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSG-CSVGSKATVRSGR

Query:  KLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHV-ATQQESENTNSE--SESASSSSWETQQTSESETT-----DYPSSPTHQKGPPA
        KL    + Q +R QNS G+M+M PTLLDH      R +  + KTH+ ATQQESE T+SE  S S+SSSSW TQ+TS SET        PSSP+HQ  P +
Subjt:  KLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHV-ATQQESENTNSE--SESASSSSWETQQTSESETT-----DYPSSPTHQKGPPA

Query:  TGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIF-HHHHHHHHHHNTNTFMWK-HLRKIFHLHRTDNKKL--TSEGGYGKLKKSAIRS
        T S++SS        TKTF    GK  SKK +GRFKRLKNKLG++F HHHHHHHHHHN+N FMWK  LRKIF  H  DNK+L  + E G  K+KK AIR+
Subjt:  TGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIF-HHHHHHHHHHNTNTFMWK-HLRKIFHLHRTDNKKL--TSEGGYGKLKKSAIRS

Query:  VSRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQL
        V  KNQVGKFQALAEGLRSHVW+SKAMK+K ++ +  G KKGVKKL WW+MFR RRGV+LP KG +KIGYVN+K +L
Subjt:  VSRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQL

TrEMBL top hits | e-value | % identity | Alignment
A0A6J1DL21 uncharacterized protein LOC111022084 isoform X3 | e-value: 2.4e-155 | % identity: 62.8
Query:  SERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEHNQSARKMGNLETEDNPRMARSSALNVATN
        S++LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME+NQS RK G                NVA N
Subjt:  SERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEHNQSARKMGNLETEDNPRMARSSALNVATN

Query:  HLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVE
         LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ EPI G                          
Subjt:  HLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVE

Query:  SLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHIS
                 HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS    +N+ +I ERRCQN PG MIMRPTLL    
Subjt:  SLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHIS

Query:  RGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLK
                NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRYRSS IS+K FR SHGKKGSKKAIGRFKRL+
Subjt:  RGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLK

Query:  NKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKFQALAEGLRSHVWKSKAMKKKELR--RLGG
        NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+FQALAEGLRSHVWK  AMKKKELR  RLG 
Subjt:  NKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKFQALAEGLRSHVWKSKAMKKKELR--RLGG

Query:  GRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
          KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  GRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

A0A6J1DLN1 protein KOKOPELLI isoform X1 | e-value: 3.2e-168 | % identity: 64.05
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSE-RLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME
        M+V+E+YLDLLALR LYILLLKSCLRDANSE +LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSE-RLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME

Query:  HNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITE
        +NQS RK G                NVA N LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ E
Subjt:  HNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITE

Query:  PIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRK
        PI G                                   HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS   
Subjt:  PIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRK

Query:  LLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRY
         +N+ +I ERRCQN PG MIMRPTLL            NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRY
Subjt:  LLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRY

Query:  RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGK
        RSS IS+K FR SHGKKGSKKAIGRFKRL+NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+
Subjt:  RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGK

Query:  FQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        FQALAEGLRSHVWK  AMKKKELR  RLG   KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  FQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

A0A6J1DNR3 protein KOKOPELLI isoform X2 | e-value: 2.9e-169 | % identity: 64.16
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH
        M+V+E+YLDLLALR LYILLLKSCLRDANSE LDERAQILLK+LLDDATA +++ HSK                            TKP+++KVAEWME+
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEH

Query:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP
        NQS RK G                NVA N LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+ EP
Subjt:  NQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEP

Query:  IRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKL
        I G                                   HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS    
Subjt:  IRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKL

Query:  LNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYR
        +N+ +I ERRCQN PG MIMRPTLL            NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SSRYR
Subjt:  LNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYR

Query:  SSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKF
        SS IS+K FR SHGKKGSKKAIGRFKRL+NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQVG+F
Subjt:  SSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKF

Query:  QALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        QALAEGLRSHVWK  AMKKKELR  RLG   KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  QALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

A0A6J1DQ76 protein KOKOPELLI isoform X4 | e-value: 8.9e-142 | % identity: 65.68
Query:  MEHNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARI
        ME+NQS RK G                NVA N LSNGI LALRRIE HILSLQ  TSQS RNTRSHINGAKL+     N  L QQKVQSR DHS LKAR+
Subjt:  MEHNQSARKMGNLETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARI

Query:  TEPIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSG
         EPI G                                   HCSEFVHGFR+PLSQDN E  KPP V TQ+SK++K+INP+ILIDKS CSVGSKATVRS 
Subjt:  TEPIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSG

Query:  RKLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASS
           +N+ +I ERRCQN PG MIMRPTLL            NH KT + TQQESE TNSESES SSSSW TQQTSE+ETTDYPSS +HQ+  PATGSE SS
Subjt:  RKLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASS

Query:  RYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQV
        RYRSS IS+K FR SHGKKGSKKAIGRFKRL+NKLGLIFHHHHHHHHHH+ N+    FMWK LRKIF  H TD K++TS+G +  LKK+AIRSVSRKNQV
Subjt:  RYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHNTNT----FMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQV

Query:  GKFQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        G+FQALAEGLRSHVWK  AMKKKELR  RLG   KKGVKKL WW+MF RRRGVKLP KGRVKIGYVNRKPQ K+V
Subjt:  GKFQALAEGLRSHVWKSKAMKKKELR--RLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

A0A6J1K5J4 uncharacterized protein LOC111491355 isoform X1 | e-value: 1.9e-136 | % identity: 56.48
Query:  MDVDEVYLDLLALRALYILLLKSCLRDANSER-LDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME
        M+ DE+YLDLLALR LY  LLK CLRDANSE  +  RA+ILLK+LLDDAT G+LE HSK LA     F NF  K         D KQTKPLD+KVAEWME
Subjt:  MDVDEVYLDLLALRALYILLLKSCLRDANSER-LDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWME

Query:  HNQSARKMGNLE-TEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANY----LQGNEILSQQKVQSRTDHSTLK
        HNQ+AR+M N E  E  PR  R+SA NVA N LS+GI+ ALRRIELHILSLQ       R TRSHI+  KLA Y     QGNE  +QQK           
Subjt:  HNQSARKMGNLE-TEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANY----LQGNEILSQQKVQSRTDHSTLK

Query:  ARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATV
                                  VKP+V            NHCS+FV+GFRIPL+QD DE            K+H+L+ P  L+DKSGC  GSKAT 
Subjt:  ARITEPIRGSHNLRSHISRHLLGGQNVKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATV

Query:  RSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSE
        R   K LN+  IQE+R +NS GR++M+PTL  H SR V +E+ +H + H+A QQESE TNSES S SS +  T QTSESETTD  SSP +Q  P ATGSE
Subjt:  RSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSE

Query:  ASSRY--RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHH----HHHHHHHNTNTFMWKHLRKIFHLHRTDNKKLTS-EGGYGKLKKSAIRSV
        ASS+Y   SS+I+ K F+FSHGKK S  A+GRFK L+NKLGLIFHHH    HHHHHHH+ +  MWK +R +F  HRTD K+LTS E   GKL+K+ IRSV
Subjt:  ASSRY--RSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHH----HHHHHHHNTNTFMWKHLRKIFHLHRTDNKKLTS-EGGYGKLKKSAIRSV

Query:  SRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV
        SR NQVGKFQAL EGLRSHVWKSKAMKKKE R L  G     KKL WW+M RRRRGVK P KGRVKIGYVNRKP +K++
Subjt:  SRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKGRVKIGYVNRKPQLKVV

SwissProt top hits | e-value | % identity | Alignment
Q9FFP2 Protein KOKOPELLI | e-value: 5.5e-16 | % identity: 32.71
Query:  QPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHK--KTHVATQQESENT--------NSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATG
        +P    R  Q  P   IM+PTL+D  +   + +    +  +T  AT  ESE+         + E+ S+S S WETQ  +++E      S +    PP   
Subjt:  QPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHK--KTHVATQQESENT--------NSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATG

Query:  SEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHN----TNTFMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVS
           S    S   + +      GK+  +  +GRFKR+KNK+G IFHHHHHHHHHH+         W  L+  FH H+   K  + E      +   + +  
Subjt:  SEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHN----TNTFMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVS

Query:  RKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRR--GVKLPKKGRVKIG
        +++Q G F AL EGL  H   SK  K +         K   KK +WW++ ++R+  GVK+PK+GRVK+G
Subjt:  RKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRR--GVKLPKKGRVKIG

Arabidopsis top hits | e-value | % identity | Alignment
AT5G63720.1 kokopelli | e-value: 3.9e-17 | % identity: 32.71
Query:  QPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHK--KTHVATQQESENT--------NSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATG
        +P    R  Q  P   IM+PTL+D  +   + +    +  +T  AT  ESE+         + E+ S+S S WETQ  +++E      S +    PP   
Subjt:  QPRIQERRCQNSPGRMIMRPTLLDHISRGVEREKENHK--KTHVATQQESENT--------NSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATG

Query:  SEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHN----TNTFMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVS
           S    S   + +      GK+  +  +GRFKR+KNK+G IFHHHHHHHHHH+         W  L+  FH H+   K  + E      +   + +  
Subjt:  SEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHHHHHHHHN----TNTFMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVS

Query:  RKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRR--GVKLPKKGRVKIG
        +++Q G F AL EGL  H   SK  K +         K   KK +WW++ ++R+  GVK+PK+GRVK+G
Subjt:  RKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRR--GVKLPKKGRVKIG


Sequences
CDS sequence
ATGGATGTTGATGAGGTCTATCTTGATCTCCTTGCACTGAGGGCATTATATATCCTCCTCTTAAAGAGCTGTTTGCGAGATGCAAATTCAGAACGTCTGGATGAAAGGGC
ACAGATTTTGTTGAAGAATTTGCTCGATGATGCTACTGCAGGAGTTCTTGAGTTACACTCAAAGATCTTGGCAACAGACTCTGGCTTTTTTAACAACTTTCGGCATAAAG
AGGGATCAAGCTTTCTTACTGGCATTGATGCTAAACAGACGAAGCCACTGGACAAGAAAGTTGCTGAATGGATGGAACATAATCAAAGTGCAAGAAAGATGGGAAATCTG
GAGACTGAAGACAATCCCAGAATGGCCAGATCTTCAGCTTTAAATGTCGCCACTAATCACTTATCAAATGGTATTAGTTTAGCTCTCAGAAGAATTGAACTTCACATTTT
ATCTCTGCAACGTTGTACAAGTCAAAGTAGGAGGAATACAAGAAGCCATATCAATGGAGCTAAATTAGCTAACTATCTTCAAGGGAATGAGATATTGAGCCAGCAGAAAG
TTCAGTCAAGGACAGATCACTCAACTTTGAAGGCCAGAATTACTGAGCCGATTAGAGGTAGTCATAACTTGCGCAGTCATATAAGTCGTCATCTTCTTGGTGGACAGAAT
GTTAAGCCAGTAGTGAGGGGCGTTGAGTCACTAACTAGAGCGTGTCAGATGAACCATTGTTCTGAGTTCGTTCATGGGTTCAGAATACCTCTGAGTCAAGACAATGATGA
GGTCAGGAAACCTCCAACCGTTGAAACCCAGATATCTAAAGAACACAAACTTATAAATCCAATGATTCTGATAGATAAATCTGGATGTTCAGTGGGATCCAAGGCTACCG
TCAGGTCCGGTAGGAAACTGCTCAATCAACCTCGGATACAAGAAAGGAGGTGCCAGAATTCACCTGGTCGTATGATCATGAGGCCAACTTTGCTGGATCATATCTCCAGA
GGAGTAGAAAGAGAAAAGGAAAACCATAAGAAGACCCATGTGGCTACTCAGCAAGAATCTGAAAACACAAACTCAGAATCAGAATCAGCTTCTTCTTCGAGTTGGGAAAC
TCAGCAGACCAGTGAAAGTGAAACCACTGATTACCCTTCTTCGCCAACTCACCAAAAGGGTCCACCGGCAACCGGTTCTGAAGCAAGTAGCCGGTACAGAAGCAGCAGCA
TTTCAACAAAAACATTCAGATTCAGCCATGGGAAAAAGGGGTCCAAGAAAGCAATCGGACGGTTCAAGAGACTCAAGAACAAGTTAGGCCTTATCTTCCACCACCATCAC
CACCACCACCACCACCATAACACCAACACCTTCATGTGGAAGCATCTAAGAAAGATCTTCCATCTCCATCGCACAGATAACAAAAAACTAACAAGTGAAGGAGGATATGG
GAAGCTAAAGAAATCAGCAATCAGAAGTGTGTCTCGCAAGAACCAAGTTGGGAAGTTTCAGGCTCTTGCTGAAGGGCTTCGGAGCCATGTTTGGAAATCGAAAGCCATGA
AGAAGAAAGAGCTTAGGAGGCTGGGTGGTGGGAGGAAGAAGGGTGTGAAGAAGTTGCAGTGGTGGCAGATGTTTCGTCGCCGCCGTGGAGTGAAGTTACCCAAAAAAGGG
CGTGTTAAGATAGGGTATGTAAACAGAAAACCACAGCTTAAGGTAGTTTAG
mRNA sequence
ATGGATGTTGATGAGGTCTATCTTGATCTCCTTGCACTGAGGGCATTATATATCCTCCTCTTAAAGAGCTGTTTGCGAGATGCAAATTCAGAACGTCTGGATGAAAGGGC
ACAGATTTTGTTGAAGAATTTGCTCGATGATGCTACTGCAGGAGTTCTTGAGTTACACTCAAAGATCTTGGCAACAGACTCTGGCTTTTTTAACAACTTTCGGCATAAAG
AGGGATCAAGCTTTCTTACTGGCATTGATGCTAAACAGACGAAGCCACTGGACAAGAAAGTTGCTGAATGGATGGAACATAATCAAAGTGCAAGAAAGATGGGAAATCTG
GAGACTGAAGACAATCCCAGAATGGCCAGATCTTCAGCTTTAAATGTCGCCACTAATCACTTATCAAATGGTATTAGTTTAGCTCTCAGAAGAATTGAACTTCACATTTT
ATCTCTGCAACGTTGTACAAGTCAAAGTAGGAGGAATACAAGAAGCCATATCAATGGAGCTAAATTAGCTAACTATCTTCAAGGGAATGAGATATTGAGCCAGCAGAAAG
TTCAGTCAAGGACAGATCACTCAACTTTGAAGGCCAGAATTACTGAGCCGATTAGAGGTAGTCATAACTTGCGCAGTCATATAAGTCGTCATCTTCTTGGTGGACAGAAT
GTTAAGCCAGTAGTGAGGGGCGTTGAGTCACTAACTAGAGCGTGTCAGATGAACCATTGTTCTGAGTTCGTTCATGGGTTCAGAATACCTCTGAGTCAAGACAATGATGA
GGTCAGGAAACCTCCAACCGTTGAAACCCAGATATCTAAAGAACACAAACTTATAAATCCAATGATTCTGATAGATAAATCTGGATGTTCAGTGGGATCCAAGGCTACCG
TCAGGTCCGGTAGGAAACTGCTCAATCAACCTCGGATACAAGAAAGGAGGTGCCAGAATTCACCTGGTCGTATGATCATGAGGCCAACTTTGCTGGATCATATCTCCAGA
GGAGTAGAAAGAGAAAAGGAAAACCATAAGAAGACCCATGTGGCTACTCAGCAAGAATCTGAAAACACAAACTCAGAATCAGAATCAGCTTCTTCTTCGAGTTGGGAAAC
TCAGCAGACCAGTGAAAGTGAAACCACTGATTACCCTTCTTCGCCAACTCACCAAAAGGGTCCACCGGCAACCGGTTCTGAAGCAAGTAGCCGGTACAGAAGCAGCAGCA
TTTCAACAAAAACATTCAGATTCAGCCATGGGAAAAAGGGGTCCAAGAAAGCAATCGGACGGTTCAAGAGACTCAAGAACAAGTTAGGCCTTATCTTCCACCACCATCAC
CACCACCACCACCACCATAACACCAACACCTTCATGTGGAAGCATCTAAGAAAGATCTTCCATCTCCATCGCACAGATAACAAAAAACTAACAAGTGAAGGAGGATATGG
GAAGCTAAAGAAATCAGCAATCAGAAGTGTGTCTCGCAAGAACCAAGTTGGGAAGTTTCAGGCTCTTGCTGAAGGGCTTCGGAGCCATGTTTGGAAATCGAAAGCCATGA
AGAAGAAAGAGCTTAGGAGGCTGGGTGGTGGGAGGAAGAAGGGTGTGAAGAAGTTGCAGTGGTGGCAGATGTTTCGTCGCCGCCGTGGAGTGAAGTTACCCAAAAAAGGG
CGTGTTAAGATAGGGTATGTAAACAGAAAACCACAGCTTAAGGTAGTTTAG
Protein sequence
MDVDEVYLDLLALRALYILLLKSCLRDANSERLDERAQILLKNLLDDATAGVLELHSKILATDSGFFNNFRHKEGSSFLTGIDAKQTKPLDKKVAEWMEHNQSARKMGNL
ETEDNPRMARSSALNVATNHLSNGISLALRRIELHILSLQRCTSQSRRNTRSHINGAKLANYLQGNEILSQQKVQSRTDHSTLKARITEPIRGSHNLRSHISRHLLGGQN
VKPVVRGVESLTRACQMNHCSEFVHGFRIPLSQDNDEVRKPPTVETQISKEHKLINPMILIDKSGCSVGSKATVRSGRKLLNQPRIQERRCQNSPGRMIMRPTLLDHISR
GVEREKENHKKTHVATQQESENTNSESESASSSSWETQQTSESETTDYPSSPTHQKGPPATGSEASSRYRSSSISTKTFRFSHGKKGSKKAIGRFKRLKNKLGLIFHHHH
HHHHHHNTNTFMWKHLRKIFHLHRTDNKKLTSEGGYGKLKKSAIRSVSRKNQVGKFQALAEGLRSHVWKSKAMKKKELRRLGGGRKKGVKKLQWWQMFRRRRGVKLPKKG
RVKIGYVNRKPQLKVV
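
The CDS and protein sequences above can be checked against each other by translating the CDS. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1) applies to this nuclear gene. `translate` is a hypothetical helper name, and only the first 30 and last 12 bases of the CDS are used to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


cds_start = "ATGGATGTTGATGAGGTCTATCTTGATCTC"  # first 30 bases of the CDS
cds_end = "AAGGTAGTTTAG"                       # last 12 bases of the CDS
print(translate(cds_start))  # MDVDEVYLDL
print(translate(cds_end))    # KVV*
```

The translated start matches the first ten residues of the protein sequence, and the translated end matches its final residues followed by the stop codon.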