CuGenDBv2

Lag0018308 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0018308
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Reverse transcriptase
Genome location: chr5:22588366..22598884
RNA-Seq Expression: Lag0018308
Synteny: Lag0018308
Gene Ontology terms:
GO:0006508 - proteolysis (biological process)
GO:0004190 - aspartic-type endopeptidase activity (molecular function)
InterPro domains:
IPR019103 - Aspartic peptidase, DDI1-type
IPR021109 - Aspartic peptidase domain superfamily
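The genome location in this record follows a chrom:start..end pattern. As a minimal sketch, the string can be split into its parts and the gene span computed. The function name parse_location is hypothetical, and treating the coordinates as 1-based and inclusive is an assumption that the page itself does not confirm.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_location("chr5:22588366..22598884")

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the gene spans 10519 bp on chr5.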


Homology
GenBank top hits (e-value, % identity, alignment)
KAG6734747.1 hypothetical protein I3842_01G285500 [Carya illinoinensis] (e-value 5.1e-74, identity 31.3%)
Query:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------
        P  ++DY +PV  G  S I+  PINANNFELK  L+ M +   + GSP +DPN HL                                            
Subjt:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------

Query:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE
               + R +A   P          +G F    FE ++ +    KD                                 A+GGTL+SKT E A   LE
Subjt:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE

Query:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------
        +MA+N+YQWP+ER+  KK+ AG+ E++ ++AL AQ+ +L++     +  R  QS E  A+ +                                      
Subjt:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------

Query:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM
                              S+P E+ +  LED + +F+ E++ R  K +  +  I +      A +KN+E Q+GQL + ++   +G  P+  E    
Subjt:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM

Query:  EYCKAITVHK-EEAEEEPESEDYDAPTWEAEEDTSSDEAEKHE------------------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNL
        E CKAIT+   +E E  P  E    PT  A    S D+ E+ E                    PPI +P L  P+  +K+   K    QF +F++IF  +
Subjt:  EYCKAITVHK-EEAEEEPESEDYDAPTWEAEEDTSSDEAEKHE------------------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNL

Query:  NINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKS
        +INIPFA+ALE MP Y +F+K+ ++KKR+ ++ +TV L+  C+  +Q K+P+K+ DP SF +PC+ G   F R LCDLGASIN++P  +C+KL +GE+K 
Subjt:  NINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKS

Query:  TPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSK
        T + LQLAD+S+    GI+E+VL++V +F    D  V+DM E+  +P ILGRPFLATGR +ID++  ELT+RV  ++ +F   +  K
Subjt:  TPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSK

KAG7947748.1 hypothetical protein I3843_14G109500 [Carya illinoinensis] (e-value 1.2e-75, identity 31.59%)
Query:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------
        P  ++DY +PV  G  S I+  PINANNFELK  L+ M +   + GSP +DPN HL                                            
Subjt:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------

Query:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE
               + R +A   P          +G F    FE ++ +    KD                                 A+GGTL+SKT E A   LE
Subjt:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE

Query:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------
        +MA+N+YQWP+ER+  KK+ AG+ E++ ++AL AQ+ +L++     +  R  QS E  A+ +                                      
Subjt:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------

Query:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM
                              S+P E+ +  LED + +F+ E++ R  K +  +  I +      A +KN+E Q+GQL + ++   +G  P+  E    
Subjt:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM

Query:  EYCKAITVHK-EEAEEEPESEDYDAPTWEAEEDTSSDEAEKHE------------------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNL
        E CKAIT+   +E E  P  E    PT  A    S D+ E+ E                    PPI +P L  P+  +K+   K    QF +F++IF  +
Subjt:  EYCKAITVHK-EEAEEEPESEDYDAPTWEAEEDTSSDEAEKHE------------------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNL

Query:  NINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKS
        +INIPFA+ALE MP Y +F+K+ ++KKR+ ++ +TV L+  C+  +Q K+P+K+ DPGSF +PC+ G   F R LCDLGASIN++P S+C+KL +GE+K 
Subjt:  NINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKS

Query:  TPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSK
        T + LQLAD+S+    GI+E+VL++V +F    D  V+DM E+  +P ILGRPFLATGR +ID++  ELT+RV  ++ +F   +  K
Subjt:  TPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSK

KAG7990634.1 hypothetical protein I3843_02G035100 [Carya illinoinensis] (e-value 1.4e-74, identity 31.76%)
Query:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------
        P  ++DY +PV  G  S I+  PINANNFELK  L+ M +   + GSP +DPN HL                                            
Subjt:  PWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL--------------------------------------------

Query:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE
               + R +A   P          +G F    FE ++ +    KD                                 A+GGTL+SKT E A   LE
Subjt:  ------KSSRLVAVHSP-----WEHHHLGCFGPCLFEEIFPS---CKD---------------------------------AAGGTLLSKTVENARIRLE

Query:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------
        +MA+N+YQWP+ER+  KK+ AG+ +++ ++AL AQ+ +L++     +  R  QS E  A+ +                                      
Subjt:  DMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALA--------------------------------------

Query:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM
                              S+P E  +  LED + +F+ E++ R  K +  +  I +      AAIKNIE Q+GQL + ++   +G  P+  E    
Subjt:  ----------------------SRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQM

Query:  EYCKAITVHK-EEAEEEPESEDYDAPT---------WEAEEDTSSDEAEKHE--------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLN
        E CKAIT+   +E E  P  E    PT            E++  +D  E+ +          PPI +P L  P+  +K+   K    QF +F++IF  ++
Subjt:  EYCKAITVHK-EEAEEEPESEDYDAPT---------WEAEEDTSSDEAEKHE--------PEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLN

Query:  INIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKST
        INIPFA+ALE MP Y +F+K+ ++KKR+ ++ +TV L+  C+  +Q K+P+K+ DPGSF +PC+ G   F + LCDLGASIN++PLS+C+KL + E+K T
Subjt:  INIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKST

Query:  PVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFK
         + LQLAD+S+    GI+E+VL++V +F    D  V+DM E+  +P ILGRPFLATGR +ID++  ELT+RV  ++ +FK
Subjt:  PVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFK

XP_022843226.1 uncharacterized protein LOC111366761 [Olea europaea var. sylvestris] (e-value 3.1e-71, identity 32.52%)
Query:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL-----------------------------------------------
        IRDY +PV     SGI    I A NFELK GL+ M +   + G+  EDPN+HL                                               
Subjt:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL-----------------------------------------------

Query:  ----------------KSSRLVAVHSPWEHHHLGCFGPC--LFEEIFPSCKD--------------------------AAGGTLLSKTVENARIRLEDMA
                        KS++L    S ++      F      F+++   C                            AAGG L++KT E A   L+D+A
Subjt:  ----------------KSSRLVAVHSPWEHHHLGCFGPC--LFEEIFPSCKD--------------------------AAGGTLLSKTVENARIRLEDMA

Query:  TNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRPQEETI--EQ-----------------------------
        TNSYQWPSERS  KK+ AG+ EVD ++AL AQ+ SL N  +  +   + Q+++S  + +S  QE  +  EQ                             
Subjt:  TNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRPQEETI--EQ-----------------------------

Query:  --------------------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAI
                                  LED++G FI+E+ +R  K E  +  I + V    A +KN+E Q+GQL +++ +  KGK P++ E    E+C AI
Subjt:  --------------------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAI

Query:  TVHKEEAEEEPESEDYDAPTWEA-EEDTSSDEAEKHEPE---------------PPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEA
        T+   +  EE + +    PT +    D    E +K E E               PPI  P L  P    ++  KK    QF +F+ +F  ++INIPFAE 
Subjt:  TVHKEEAEEEPESEDYDAPTWEA-EEDTSSDEAEKHEPE---------------PPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEA

Query:  L-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLAD
        L +MP Y +F+KE ++ K+K ++ +T+ L   C+  +Q K+P K+ DPGSF +PC+ G  +F RALCD GASIN++PLS+ KKL +GE+K T + LQLAD
Subjt:  L-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLAD

Query:  QSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDI
        +S+    G++E+VL++V +F L +D  V+DM EN  +P ILGRPFLATGR +ID+
Subjt:  QSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDI

XP_022868848.1 uncharacterized protein LOC111388394 [Olea europaea var. sylvestris] (e-value 4.8e-72, identity 37.04%)
Query:  AAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRPQEETI--EQ-------
        AAGG L++KT E A   L+D+ATNSYQWPSERS  KK+ AG+ EVD ++AL AQ+ SL N  +  +   + Q ++S  + +S  QE  I  EQ       
Subjt:  AAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRPQEETI--EQ-------

Query:  ------------------------------------------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVS
                                                        LED++G FI+E+ +R  K E  +  I + V    A +KN+E Q+GQL + + 
Subjt:  ------------------------------------------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVS

Query:  TMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEP-----------PITSPTLMVPKEKKKKNKKKNNQVQFDR
        +  KGK P++ E    E+C AIT+   +  EE + +    PT    +   +DE +K EPE            P   P L  P    ++  KK    QF +
Subjt:  TMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEP-----------PITSPTLMVPKEKKKKNKKKNNQVQFDR

Query:  FMNIFMNLNINIPFAEAL-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKK
        F+ +F  ++INIPFAEAL +MP Y +F+KE ++ K+K ++ +T+ L   C+  +Q K+P K+ DPGSF +PC+ G  +F +ALCDLGASIN++PLS+ KK
Subjt:  FMNIFMNLNINIPFAEAL-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKK

Query:  LDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSKGHSNV
        L +GE+KST + LQLAD+S+    G++E+VL++V +F L +D  V+DM EN  +P ILGRP LATGR +ID++  +LT+RV  +   F   +  K    V
Subjt:  LDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSKGHSNV

Query:  LFMGYKKGARKSTTVGFTEKK
              + A     +G  EKK
Subjt:  LFMGYKKGARKSTTVGFTEKK

TrEMBL top hits (e-value, % identity, alignment)
A0A1S3Z766 Reverse transcriptase (e-value 1.2e-65, identity 34.72%)
Query:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQ-MARDCAYRGSPTEDPNSHLKSSRLVAVHSPWEHHHLGCFGPCLFEEIFPSCKDA----AGGTLLS
        +RDY +P     +S +    I ANNFE++T L+Q + + C + G  +EDP++HL                        F E+  +C++     +GG+++S
Subjt:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQ-MARDCAYRGSPTEDPNSHLKSSRLVAVHSPWEHHHLGCFGPCLFEEIFPSCKDA----AGGTLLS

Query:  KTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQS----IESAAALASRPQEETIEQLEDLVGAFIAESS
        KT   A   L +++ N  QW S+R   KK A  V +V+  ++L  Q+ +L     +  G    Q+     +        PQ+   + LED++  FI    
Subjt:  KTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQS----IESAAALASRPQEETIEQLEDLVGAFIAESS

Query:  NRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAP-----TWEAEEDTSSD----
                   A +  V+  S+AI N+E Q+ QL +++S   K   P+  EK   E+ KAI++   +  ++P ++    P       E E    S+    
Subjt:  NRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAP-----TWEAEEDTSSD----

Query:  --EAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEAL-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEK
          E EK   E  + +    VP    +K K+KN   QF +F++I   L INIPF EAL +MP Y +F+KE L+ KRK ++V  V L   C+  +Q+K+P+K
Subjt:  --EAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEAL-EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEK

Query:  VADPGSFYVPCSF-GTYSFRALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRP
        + DP SF + C+  G +  +ALCD GASI ++P S+ +KL++GE+K T V LQLADQS  R  GI+ENVL+RV +F   +D  V++M EN+ +P ILGRP
Subjt:  VADPGSFYVPCSF-GTYSFRALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRP

Query:  FLATGRVIIDIECRELTVRVRNDKEIF
        FLATGR IID+   +L +RV  ++ IF
Subjt:  FLATGRVIIDIECRELTVRVRNDKEIF

A0A1U7Z6K8 uncharacterized protein LOC104590935 (e-value 1.6e-57, identity 31.77%)
Query:  YFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL---------------------------KSSRL------------VAVHSP
        Y  P   G  S I+   + ANNFE+   ++QM +       P +DPNSH+                           K+++L              ++  
Subjt:  YFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL---------------------------KSSRL------------VAVHSP

Query:  WE----------HHHL------GCFGPCLFEEIFPSCKDAAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANA
        WE          HH L        F   + +         AGGTL+ KT E A   LE+MA NSYQW +E+S  K +  G++  D ++ L AQ+ +L N 
Subjt:  WE----------HHHL------GCFGPCLFEEIFPSCKDAAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANA

Query:  FMKFSGTRSAQSIES----AAALASRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKT
         ++  G  SA+S+ S    +  L  + +  +++      G       N   K       +   ++   A I+ IETQ+GQL   +S   +G  P   EK 
Subjt:  FMKFSGTRSAQSIES----AAALASRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKT

Query:  QMEYCKAIT------VHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEAL-
          E  KAIT      + KEE ++E   E+ +    EA+E       EK      +   +   P    K+  K     QF +F+ +F  L+INIP  +A+ 
Subjt:  QMEYCKAIT------VHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEAL-

Query:  EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQS
        +MP Y + +KE ++ KRK ++ D V L   C   +Q+K+P K+ D  SF +PC+ G  +F +ALCDLGASIN++P S  KK  + E   T   LQLAD+S
Subjt:  EMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQS

Query:  VVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDK---EIFKAVE
        +    G+VENVLI+V +F   +D  ++DM E+  MP ILGR FLATGR +ID++  +L+ R+ +D+    +FK  E
Subjt:  VVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDK---EIFKAVE

A0A2G9HYD8 Reverse transcriptase (e-value 1.5e-55, identity 36.23%)
Query:  GGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQ---SIESA--AALASRPQEETI--------
        G + LS T       L ++  N Y+  SER+TP K AA V EVD+V+AL A++  L  +   F    S Q   S+ES    + A +PQ            
Subjt:  GGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQ---SIESA--AALASRPQEETI--------

Query:  ----------EQLEDLVGAF--------IAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQME----YCKAITV
                   Q +     F              +   LEE +I   +++   +A  K +ETQ+GQL + +++  +G  P+  E    +     C+A+T+
Subjt:  ----------EQLEDLVGAF--------IAESSNRTTKLEEAVIAINSTVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQME----YCKAITV

Query:  HKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEP-PITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEALE-MPQYNRFMKEWLAK
              +E   +    PT   E++  S E EK    P  ++ PT + P   +K  K+K  + QF +F+ +F  L+INIPFAEALE MP Y +FMK+ L+K
Subjt:  HKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEP-PITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEALE-MPQYNRFMKEWLAK

Query:  KRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGT-YSFRALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRV
        KR+    +T  L   CN  +Q+K+P K+ DPGSF +PC+ GT +S RALCDLGASIN++P S+ + L +GE K T + LQLAD+S+    G++E++L++V
Subjt:  KRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGT-YSFRALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRV

Query:  GRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDK---EIFKAVE
         +F    D  V+DM  +S +P ILGRPFLATGR +ID++  ELT+RV++ +    +FKA++
Subjt:  GRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDK---EIFKAVE

A0A6J0ZX64 LOW QUALITY PROTEIN: uncharacterized protein LOC110412945 (e-value 6.2e-57, identity 29.41%)
Query:  PHSPWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCA-YRGSPTEDPNSHL----------------------------------------
        P +   +RDY  P+ QG    I    INANNFE+K   +QM +    + G P++DPNSHL                                        
Subjt:  PHSPWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCA-YRGSPTEDPNSHL----------------------------------------

Query:  -----------------------KSSRL------------VAVHSPWE----------HH------HLGCFGPCLFEEIFPSCKDAAGGTLLSKTVENAR
                               K++++             +++  WE          HH       +  F   L   I      AAGG L+SK   +A 
Subjt:  -----------------------KSSRL------------VAVHSPWE----------HH------HLGCFGPCLFEEIFPSCKDAAGGTLLSKTVENAR

Query:  IRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQS----IESAAALASRPQEETIEQLEDLVGAFIAESSN------
          LE+MA+N+YQWPSERS  +K A G +E+D +  L  Q+ +L+   +   G  + Q+     E      S  Q     +    VG F  + +N      
Subjt:  IRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQS----IESAAALASRPQEETIEQLEDLVGAFIAESSN------

Query:  ---------------------------------------RTTKLEEAVIAINSTVD----GHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQE--KTQME
                                               + ++LEE ++   S  D       A+++N+ETQ+GQL + ++   +G  P++ +      E
Subjt:  ---------------------------------------RTTKLEEAVIAINSTVD----GHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQE--KTQME

Query:  YCKAIT---------VHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEALE
         C+AIT         V+++  E E E  D +      E +    + +  + E   TS  +  P    ++ +K+  + QF +F+N+F  L+INIPFAEALE
Subjt:  YCKAIT---------VHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQFDRFMNIFMNLNINIPFAEALE

Query:  -MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQS
         MP Y +F+K+ L+KKRK  + +TV+L   C+  +Q+K+P K+ DPGSF +PC+ G   F +AL DLGASIN++P S+ +KL +GE K T V LQLAD+S
Subjt:  -MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINIIPLSLCKKLDIGEIKSTPVKLQLADQS

Query:  VVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSKGHSN
         V   GI+E+VL++V +F   +D  ++DM E+  +P ILGRPFLAT   IID+   +++ +V  +   F     SK  S+
Subjt:  VVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSKGHSN

A0A6J1DU19 uncharacterized protein LOC111024361 (e-value 1.7e-67, identity 33.61%)
Query:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL----------------------------------------------K
        IRDY QP F     GI+  PINANN ELK GL+QM R+  +RG+ TEDPN+HL                                              K
Subjt:  IRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHL----------------------------------------------K

Query:  SSRL-VAVHSPWEHHHLGCFGPC-LFEEIFPSCKD--------------------------AAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIA
        +++L   + S  ++ +   F     ++E+   C                            AAGGTLLS+T ENA I L+DMA NS+QWPSERS  KK+ 
Subjt:  SSRL-VAVHSPWEHHHLGCFGPC-LFEEIFPSCKD--------------------------AAGGTLLSKTVENARIRLEDMATNSYQWPSERSTPKKIA

Query:  AGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRP-QEETIEQ-------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHS
        AG++E+D++S+L+AQ+ +L NA  K SG  ++ S E  AA  +    E TIEQ             LEDL+GAFI E  +R +++E  V  +   ++G++
Subjt:  AGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRP-QEETIEQ-------------LEDLVGAFIAESSNRTTKLEEAVIAINSTVDGHS

Query:  AAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKN
         +IKN+E Q+GQ+   ++TM KGK P++ E    E+CKA+T+   +  +EPE +  + P    EE  + +E  K        +P L   K          
Subjt:  AAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKN

Query:  NQVQFDRFMNIFMNLNINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINII
        N + + +                ALE MP Y RFMK+ +  KRK +  +TV L   C+  +Q K+P+K+ DPGSF +PC+  + SF +ALCD+ ASIN++
Subjt:  NQVQFDRFMNIFMNLNINIPFAEALE-MPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSF-RALCDLGASINII

Query:  PLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIF
        PL                             G++E+VL++V R     D  V+   E+S +P ILGR FLATG  +ID++   LT+RV  +  +F
Subjt:  PLSLCKKLDIGEIKSTPVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIF

SwissProt top hits: no hits found
Arabidopsis top hits: no hits found
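In BLAST-style protein alignments such as those in this section, the middle row repeats the residue letter where query and subject are identical, shows "+" for a conservative substitution, and is blank otherwise. As a rough sketch, percent identity can be estimated by counting the letters in that middle row and dividing by the alignment length. The function name identity_from_midline is hypothetical, and whether CuGenDB derives its reported "% identity" in exactly this way is an assumption.

```python
def identity_from_midline(midline: str, alignment_length: int) -> float:
    """Percent identity: identical columns divided by total aligned columns.

    The alignment length is passed separately because trailing blanks in the
    middle row are often trimmed when a page is copied as text.
    """
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    identities = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identities / alignment_length


# First 11 columns of the first GenBank alignment in this record.
query_row = "PWPIRDYFQPV"
middle_row = "P  ++DY +PV"

fragment_identity = identity_from_midline(middle_row, len(query_row))
print(round(fragment_identity, 2))
```

The value for this short fragment is only illustrative; the identity reported for each hit covers the full alignment, gaps included.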

Sequences
CDS sequence
ATGAGACATATGCCCTATGCAGTAGGCATTGTCAGTAAATATCAATCAAATCCAGGATTTGATCTCTGGAGAGCCGTTAAGTACATCCTCCAGTATCTTAGGAGAACGAG
GATGCATTGCAGCCTCGACCATTTAGGCCGAACAGGGATCCCGCCCTCTCATTGGCCCGAGAGGGACTTTCTGTTTACTGGTTGGACCATAAACAGGTTGTTCATTAGAG
GAGCACTGGTACTTAAGGAGCCAGAGATAACTCGTACTATAGCTTCTCCATTAAGAGTGAAGACAGACACTGATGTAGATTTCATCAAATCCTTATCAGTTAAAACATCT
GTACATAAGAAACTTGAGACAAGACTAGCTCAGCTTCTCCCAAATTTTCATTTGGAAATGAGCTGCAAGCCTCTTTTAATACTCATCAACGTTCTGATTAAAGCCAAAAA
ATTTGATCACCGTATCAAATCTTCCATGCCAAGACCGAAATGCTTAGGTCAATCCATGAATGGACCAAGTAAGCTCGCAAACCCTTTTGCTTTTGATCTTGTTCAATGGA
CCCTTCTGGTTGGAACCTTAAATGTCTCATTTCCTCTTGTACATCCACTTGTAACCTCTAGGTGTTACCCCATCAAGCAAATTTACAAGATCCTAGATGGAATTGAAATG
CAAAGACTTCATTTCCAGGTAGCGTCGCGACGCTGTCACAACGTCGAGACGCTAAGGAGACAGCGTCTCGACGTTGTTTCTGTTGCCGCCCAATTCAGAAAAGGAAAGGC
AGCGTCGAGACGCTCCAAGAGCAGCGTCCCGACGCTGGCCTACGGATGGCTCTCTTTTGACTTCTTCTTGGTTGAATCTTTTCCGTTCTTTAGCCAATTTTCATGTCCCG
AACACGCCCTCCGAAGGGATGGTCGCTCCAGGACGACACGCAGGCGGATCATAGGAATCTCACGTTGTGACCTAGGGGTGGAGACCGAAGGATACGTTGACACGCGTCCC
GCTCCCACTTACTATAGTCACTTTCCCCATTCACCTTGGCCTATTAGAGACTACTTTCAGCCCGTGTTTCAGGGGCAACAGTCGGGGATTGTCTATGCCCCGATTAATGC
CAACAACTTTGAGCTGAAGACCGGTCTCATGCAGATGGCCCGAGACTGTGCATACAGAGGATCGCCCACCGAGGATCCAAATTCCCATCTAAAATCATCGAGATTGGTTG
CAGTCCATTCCCCCTGGGAGCATCACCACTTGGGATGCTTTGGTCCATGCCTTTTTGAAGAAATTTTTCCCTCCTGCAAAGACGCTGCAGGTGGGACTCTGTTGTCCAAG
ACCGTGGAAAATGCTCGCATACGTCTAGAGGATATGGCCACCAACAGCTATCAGTGGCCATCTGAGCGGTCTACACCCAAAAAGATTGCTGCTGGAGTGTTTGAGGTTGA
TAAAGTAAGTGCACTCCAGGCCCAGATGACCTCCCTTGCTAATGCTTTTATGAAATTTTCAGGTACAAGGAGTGCACAGTCAATTGAATCAGCTGCTGCTTTAGCATCTA
GACCTCAGGAGGAGACCATTGAACAGTTAGAGGATCTTGTTGGAGCTTTCATTGCAGAGTCTAGTAACAGGACAACCAAATTGGAGGAGGCAGTCATTGCCATCAACTCA
ACAGTTGATGGCCACAGTGCTGCCATCAAGAATATTGAGACTCAGCTGGGACAGTTGGTAAGCGTTGTAAGCACGATGAATAAAGGTAAGGCTCCAGCTGAGCAAGAGAA
AACTCAGATGGAGTACTGTAAGGCAATCACTGTGCACAAGGAGGAAGCTGAAGAGGAGCCTGAGTCTGAGGATTATGATGCACCTACATGGGAAGCTGAGGAGGACACAT
CATCAGATGAGGCTGAAAAGCATGAACCTGAGCCTCCTATTACTTCTCCCACACTGATGGTTCCCAAAGAAAAGAAAAAGAAAAATAAGAAAAAGAACAATCAGGTTCAG
TTTGATAGGTTTATGAATATTTTTATGAATCTGAACATTAATATTCCTTTTGCAGAGGCGTTAGAGATGCCCCAGTATAACAGGTTCATGAAGGAGTGGTTAGCAAAGAA
GCGAAAGGAAAAGAAGGTTGACACTGTATATCTTGCTTCCACGTGCAACACCAGAGTACAACATAAAGTACCTGAAAAAGTAGCAGATCCAGGGAGTTTTTATGTTCCTT
GTAGTTTTGGTACTTATTCTTTCAGAGCATTATGTGATTTAGGTGCTAGCATTAATATTATTCCTTTATCCTTGTGTAAAAAGTTAGATATAGGTGAGATTAAATCTACT
CCTGTTAAGCTTCAATTGGCTGATCAGTCTGTGGTTAGACTAGTTGGCATTGTAGAAAATGTTTTAATCAGAGTAGGTAGATTTTTCCTCTCTATTGATTTATATGTTAT
GGACATGATAGAAAATTCTTCAATGCCTTTCATATTAGGAAGACCATTCCTCGCTACTGGGCGAGTGATTATTGATATTGAGTGCAGGGAGCTCACTGTAAGAGTCAGGA
ATGACAAAGAAATATTTAAAGCAGTTGAAGACTCTAAGGGACACTCAAACGTGCTTTTCATGGGCTACAAGAAAGGTGCAAGAAAGAGCACCACTGTTGGATTCACAGAA
AAGAAGCCTCCTTGA
mRNA sequence
Identical to the CDS sequence above.
Protein sequence
MRHMPYAVGIVSKYQSNPGFDLWRAVKYILQYLRRTRMHCSLDHLGRTGIPPSHWPERDFLFTGWTINRLFIRGALVLKEPEITRTIASPLRVKTDTDVDFIKSLSVKTS
VHKKLETRLAQLLPNFHLEMSCKPLLILINVLIKAKKFDHRIKSSMPRPKCLGQSMNGPSKLANPFAFDLVQWTLLVGTLNVSFPLVHPLVTSRCYPIKQIYKILDGIEM
QRLHFQVASRRCHNVETLRRQRLDVVSVAAQFRKGKAASRRSKSSVPTLAYGWLSFDFFLVESFPFFSQFSCPEHALRRDGRSRTTRRRIIGISRCDLGVETEGYVDTRP
APTYYSHFPHSPWPIRDYFQPVFQGQQSGIVYAPINANNFELKTGLMQMARDCAYRGSPTEDPNSHLKSSRLVAVHSPWEHHHLGCFGPCLFEEIFPSCKDAAGGTLLSK
TVENARIRLEDMATNSYQWPSERSTPKKIAAGVFEVDKVSALQAQMTSLANAFMKFSGTRSAQSIESAAALASRPQEETIEQLEDLVGAFIAESSNRTTKLEEAVIAINS
TVDGHSAAIKNIETQLGQLVSVVSTMNKGKAPAEQEKTQMEYCKAITVHKEEAEEEPESEDYDAPTWEAEEDTSSDEAEKHEPEPPITSPTLMVPKEKKKKNKKKNNQVQ
FDRFMNIFMNLNINIPFAEALEMPQYNRFMKEWLAKKRKEKKVDTVYLASTCNTRVQHKVPEKVADPGSFYVPCSFGTYSFRALCDLGASINIIPLSLCKKLDIGEIKST
PVKLQLADQSVVRLVGIVENVLIRVGRFFLSIDLYVMDMIENSSMPFILGRPFLATGRVIIDIECRELTVRVRNDKEIFKAVEDSKGHSNVLFMGYKKGARKSTTVGFTE
KKPP
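The protein sequence in this record can be cross-checked against the CDS by translating it codon by codon. The sketch below builds the standard genetic code table and translates the first and last codons of the CDS. The name translate is hypothetical, and use of the standard (nuclear) genetic code for this gene is an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 21 and last 15 nucleotides of the CDS sequence in this record.
start_peptide = translate("ATGAGACATATGCCCTATGCA")
end_peptide = translate("AAGAAGCCTCCTTGA")
print(start_peptide, end_peptide)
```

The two fragments translate to MRHMPYA and KKPP followed by a stop, matching the start and end of the protein sequence above. For a complete CDS, the nucleotide length should equal three times the protein length plus three for the stop codon.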