CuGenDBv2

Lag0038780 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0038780
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: CCHC-type domain-containing protein
Genome location: chr2:26516971..26521230
RNA-Seq Expression: Lag0038780
Synteny: Lag0038780
Gene Ontology terms: NA
InterPro domains: IPR021109 - Aspartic peptidase domain superfamily
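The genome location field can be split into a chromosome name and coordinates for downstream use. The sketch below is illustrative only: `parse_location` is a hypothetical helper, and it assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

def parse_location(loc):
    """Split 'chrom:start..end' into its parts plus the span length.

    Assumes 1-based, inclusive coordinates (an assumption, not stated
    in the record itself).
    """
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    chrom, start, end = m.group(1), int(m.group(2)), int(m.group(3))
    if end < start:
        raise ValueError("end coordinate is smaller than start")
    return chrom, start, end, end - start + 1

chrom, start, end, span = parse_location("chr2:26516971..26521230")
print(chrom, start, end, span)  # chr2 26516971 26521230 4260
```

Under that assumption the gene spans 4,260 bp on chr2.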


Homology
GenBank top hits | e value | % identity | Alignment
KAG8501049.1 hypothetical protein CXB51_003148 [Gossypium anomalum] | 2.8e-88 | 39.93
Query:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMN--NQANIAPANAVSSLCCE
        GL+  ++ + +ASANG+ L  S NEA+ I++ +A+N+  W  N  T   +      E +  +S+ +Q+ +I SM+  +T N  N     P N   ++ C 
Subjt:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMN--NQANIAPANAVSSLCCE

Query:  IC------------------VGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESN-LETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI
         C                  +G+  S  +    +PTQ        Q PVQ++ SN LE L+K YM +ND  +R+L+ Q+GQ+A E++NRP G LPS T+ 
Subjt:  IC------------------VGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESN-LETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI

Query:  PHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTL
            GKE CK           T  SR+             +P   +  K      V++Q+S   T        N +AV +L  ++ Q+      +P    
Subjt:  PHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTL

Query:  PSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCD
        P + +   +E + +  +D+ K + I    V              +ILSKKR+LGE+ETVALT+  +  +++++PPKLKDPG FTIPC++G    G+ALCD
Subjt:  PSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCD

Query:  LGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQ
        LGASINLMP S+FK+LG+GE RPTTVTLQLADRSL H  GKI+DVLV+VDKF+FPADF+ILD EADK+V IILGRPFLAT RTLIDVQKGELTM V D Q
Subjt:  LGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQ

Query:  VTFNVMNVMKYSGDFEECSMIDEIDEL----SQSTLQEIMRRETLEDILGDEESDE
        VTFNV   M++    ++CS++ ++++L      +++++++ R    D   DEE DE
Subjt:  VTFNVMNVMKYSGDFEECSMIDEIDEL----SQSTLQEIMRRETLEDILGDEESDE

XP_022157836.1 uncharacterized protein LOC111024449 [Momordica charantia] | 7.6e-86 | 39.41
Query:  YKECGLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHW----GENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------
        Y++C  D  +K + N +ANG F   + NE   ILD +  +N  W       +P  I +  V     +  SSMQ+Q+  +  MM  M  NN A        
Subjt:  YKECGLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHW----GENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------

Query:  NIAPANAVSSLCCEICVGSSNSKTHQFGKQPT-----QQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI
        N+AP   +    C  C  S NS+       PT      Q    N  QV  Q           +Y   N   V    + L    Q+   + + + P +  +
Subjt:  NIAPANAVSSLCCEICVGSSNSKTHQFGKQPT-----QQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI

Query:  PHREG-KEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGT
           E   ++C   + K   IR                                   +Q   +  +T +++Y  RND A+R+L+ Q+GQ+A E+KNRP GT
Subjt:  PHREG-KEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGT

Query:  LPSKTEIPHREGKEQCK-VDLKKWIRIRWTK---------------------------------VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLK
        LPS TE P  EG+E CK +  +  +     K                                 + +I+++K++LGEYETVALTECSS + K+++ PKLK
Subjt:  LPSKTEIPHREGKEQCK-VDLKKWIRIRWTK---------------------------------VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLK

Query:  DPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFL
        DPGSFTIPCS+GGKDVGRALCDL ASINLMP S+FK+L +G+A PTTVTLQLADRS+T P GKIEDVLVKVDKF+FPADFIIL+CEADKDV IILGRPFL
Subjt:  DPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFL

Query:  ATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRRETLEDILGDEESDEKG
        +T  TLIDV+KGELTM V+DQ+VTFN+++ MKY  D EEC+ I     L+   L +++  E    +   EE++++G
Subjt:  ATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRRETLEDILGDEESDEKG

XP_022159235.1 uncharacterized protein LOC111025653 [Momordica charantia] | 2.9e-85 | 44.78
Query:  FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE----------------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHR
        F   P Q    KN+ Q P Q   SN+E LMKE                            YM RNDV VR L++QLGQ+  E++ RP G+LPS TE P R
Subjt:  FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE----------------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHR

Query:  EGKEQC-KVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEI
         GKE C  +  +  ++    ++  +   S S+     ++ TQ  P K  +    VPV  Q SN        + L+++    N     D+  QL       
Subjt:  EGKEQC-KVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEI

Query:  KNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASI
                     IP  E  EQ     K         + +I+++K++LGEYETVALTECSS + K+++PPKLKDPGSFTIPC +GGKDVGRALCDLGASI
Subjt:  KNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASI

Query:  NLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNV
        NLMP S+FK+  +G+A PTTVTLQLADRS+T P GKIEDVLVKVDKF+FP DFIILDCEADKDV IILGRPFLAT  TLIDV+KGELTMRV+DQ+VTFN+
Subjt:  NLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNV

Query:  MNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRRETLEDILGDEESDEKGGKKIEDEEINQP
        ++ MKY  D EEC++I         E+D+L  + ++  +     E I+      ++  K I+  +I  P
Subjt:  MNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRRETLEDILGDEESDEKGGKKIEDEEINQP

XP_024028757.1 uncharacterized protein LOC112093792 [Morus notabilis] | 6.6e-90 | 38.34
Query:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQANIAPANAVSSLCC---
        GL+ +++A+ +ASAN + L  + NEA+ IL+ M+ NN  W   E     +      E +  +++ AQ+ ++ +++  + +   AN A   A++ + C   
Subjt:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQANIAPANAVSSLCC---

Query:  ---EICVGS-----------------SNS---------------------------------KTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYM
           E C  +                 SNS                                   HQ  +QP Q+Q +   Q+ P Q+  + +E L+KEYM
Subjt:  ---EICVGS-----------------SNS---------------------------------KTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYM

Query:  VRND--------------VAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNF
         RND               ++R L+ Q+GQ+A  + NRP G+LPS T+ P R+GKE CK   K  I ++  +   Q+    + T     Q TQ+      
Subjt:  VRND--------------VAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNF

Query:  QQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNILSKKR
        QQ P +S++              DV  +D   +L Q   E   RP    P + +   ++ + +  +D+ K + I    V              +IL+KKR
Subjt:  QQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNILSKKR

Query:  RLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDK
        RLGE+ETVALTE  SA++KN +PPKLKDPGSFTIPCS+G + +G+ALCDLGASINLMP S+F++LG+GE  PTTVTLQLADRS  HP GKIEDVLV+VDK
Subjt:  RLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDK

Query:  FVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEID-----ELSQSTLQEIMRRETLEDILG
        F+FPADFI+LD EADK+V IILGRPFLAT +TLIDVQKGELTMRV+DQQVTFNV   M+++ + EECS ++ +D     E  ++  +++M  E L D   
Subjt:  FVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEID-----ELSQSTLQEIMRRETLEDILG

Query:  DEESDEKGGKKIE
        +E++++K   ++E
Subjt:  DEESDEKGGKKIE

XP_030509265.1 uncharacterized protein LOC115723943 [Cannabis sativa] | 7.3e-89 | 42.91
Query:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENE-PTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQANIAPANAVSSLCCEI
        GL+ AS+ + +ASANG+ L  S NEA  IL+T+A+NN  W     PT      V  V+          I A+ + M  MT NN  N++            
Subjt:  GLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENE-PTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQANIAPANAVSSLCCEI

Query:  CVGSSNSKTHQFGKQP-----TQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAV-------RDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQ
          G+S+S     G+Q      +QQ  H    Q    +Q S+LE+LM++YM +ND  +       R+L++QLG +A E+K RP G+LPS TE P R+GKEQ
Subjt:  CVGSSNSKTHQFGKQP-----TQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAV-------RDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQ

Query:  CKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPH
        CK      I +R     + + +S  +    G +PT            +Q+ E   +   +E      V     +    Q +  +  +P    P +     
Subjt:  CKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPH

Query:  REGKEQCKVDLKKWIRIRWTKV-------------SNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLM
        ++G+ +  +D+ K + I    V              +IL+KKRRLGE+ETVALTE  SA++K++IPPKLKDPGSFTIPCS+GG+DVGRALCDLGASINLM
Subjt:  REGKEQCKVDLKKWIRIRWTKV-------------SNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLM

Query:  PYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNV
        P S+FK+LG+GEARPTTVTLQLADRS+ HP GKIEDVLV+VDKF+FPADFIILD EAD+DV IILGRPFLAT RTLIDVQ GELTMR+ D          
Subjt:  PYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNV

Query:  MKYSGDFEECSMIDEIDEL-SQSTLQEIMRRE-------TLEDILGDEES
             + EECS I  ID + ++   +E  R E        LED+  DE++
Subjt:  MKYSGDFEECSMIDEIDEL-SQSTLQEIMRRE-------TLEDILGDEES

TrEMBL top hits | e value | % identity | Alignment
A0A2G9HWC5 DNA-directed DNA polymerase | 3.3e-79 | 49.71
Query:  GKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTE-IPHREGKEQCK-VDLKKWIRIRWTK-------
        G  P  QQ  +   Q P+Q ++ +LE  + ++M       + ++ Q+GQ A  I +RP G+LPS TE  P ++GK QC+ V LK  I I + +       
Subjt:  GKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTE-IPHREGKEQCK-VDLKKWIRIRWTK-------

Query:  ----VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHP
            + +ILSKKRRLG+YETVALTE  SA+++N++PPKLKDPGSFTIPC++G    GRALCDLGASINLMPYS+++ LG+GEA+PT++TLQLADRSLT+P
Subjt:  ----VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHP

Query:  IGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRR
         G IED+LVKVDKF+FPAD ++LD E D ++LIILGRPFLAT RTLIDVQKGELTMRV DQQ+TFNV   MK+  + +EC  +   D  + +        
Subjt:  IGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRR

Query:  ETLEDILGD--EESDEKGGKKIEDEEINQPMKRQRIKPYWGRGFEDEE
        ++LE  L D  +E +E      ED E+ + +   +   +  RG E  E
Subjt:  ETLEDILGD--EESDEKGGKKIEDEEINQPMKRQRIKPYWGRGFEDEE

A0A6J1CPJ3 uncharacterized protein LOC111012947 | 8.4e-83 | 38.38
Query:  DQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILK--NPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------NIAPANAV
        D  +  + N +ANG F   S NE   ILD ++ +N  W   +P    K  +P   +  +  +SMQ QI  I  M+  M  NN A        N +P   +
Subjt:  DQASKALANASANGSFLKNSTNEAHAILDTMATNNRHWGENEPTIILK--NPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------NIAPANAV

Query:  SSLCCEI------------------------CVGSSNSKTHQ------------------FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE--------
        +   C++                            S+S T Q                  F   P Q    KN+ Q P Q   SN+E LMKE        
Subjt:  SSLCCEI------------------------CVGSSNSKTHQ------------------FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE--------

Query:  --------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPT
                            YM RNDV VR+L++QLGQ+A E++ RP G+LPS TE P                        R V  S S+     ++ T
Subjt:  --------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPT

Query:  QQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNI
        Q  P K  +    V V  Q SN        + L+++    N     D+  QL                    IP  E  EQ     K         + +I
Subjt:  QQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNI

Query:  LSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVL
        +++K++LGEYETVALTECSS + K++ PPKLKDPGSFTI C +GGKDVGRALCDLGA INLMP S+FK+L +G+A PTTVTL LADRS+T P GKIEDVL
Subjt:  LSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVL

Query:  VKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRR
        VKVDKF+FPADFIILDCEADKDV IILGRPFLAT  TLIDV+KGELTMRV+DQ+VTFN+++ MKY  D EEC +I         E+D+L  + ++  +  
Subjt:  VKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRR

Query:  ETLEDILGDEESDEKGGKKIEDEEINQP
           E I+      ++  K I+  +I  P
Subjt:  ETLEDILGDEESDEKGGKKIEDEEINQP

A0A6J1DY39 uncharacterized protein LOC111025653 | 1.4e-85 | 44.78
Query:  FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE----------------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHR
        F   P Q    KN+ Q P Q   SN+E LMKE                            YM RNDV VR L++QLGQ+  E++ RP G+LPS TE P R
Subjt:  FGKQPTQQQPHKNFQQVPVQSQESNLETLMKE----------------------------YMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHR

Query:  EGKEQC-KVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEI
         GKE C  +  +  ++    ++  +   S S+     ++ TQ  P K  +    VPV  Q SN        + L+++    N     D+  QL       
Subjt:  EGKEQC-KVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQ---QVPVQSQESNL-------ETLMKEYMVRNDVAVRDLKVQLGQIAQEI

Query:  KNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASI
                     IP  E  EQ     K         + +I+++K++LGEYETVALTECSS + K+++PPKLKDPGSFTIPC +GGKDVGRALCDLGASI
Subjt:  KNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASI

Query:  NLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNV
        NLMP S+FK+  +G+A PTTVTLQLADRS+T P GKIEDVLVKVDKF+FP DFIILDCEADKDV IILGRPFLAT  TLIDV+KGELTMRV+DQ+VTFN+
Subjt:  NLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNV

Query:  MNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRRETLEDILGDEESDEKGGKKIEDEEINQP
        ++ MKY  D EEC++I         E+D+L  + ++  +     E I+      ++  K I+  +I  P
Subjt:  MNVMKYSGDFEECSMID--------EIDELSQSTLQEIMRRETLEDILGDEESDEKGGKKIEDEEINQP

A0A6J1DZC3 uncharacterized protein LOC111024449 | 3.7e-86 | 39.41
Query:  YKECGLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHW----GENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------
        Y++C  D  +K + N +ANG F   + NE   ILD +  +N  W       +P  I +  V     +  SSMQ+Q+  +  MM  M  NN A        
Subjt:  YKECGLDQASKALANASANGSFLKNSTNEAHAILDTMATNNRHW----GENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQA--------

Query:  NIAPANAVSSLCCEICVGSSNSKTHQFGKQPT-----QQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI
        N+AP   +    C  C  S NS+       PT      Q    N  QV  Q           +Y   N   V    + L    Q+   + + + P +  +
Subjt:  NIAPANAVSSLCCEICVGSSNSKTHQFGKQPT-----QQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEI

Query:  PHREG-KEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGT
           E   ++C   + K   IR                                   +Q   +  +T +++Y  RND A+R+L+ Q+GQ+A E+KNRP GT
Subjt:  PHREG-KEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGT

Query:  LPSKTEIPHREGKEQCK-VDLKKWIRIRWTK---------------------------------VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLK
        LPS TE P  EG+E CK +  +  +     K                                 + +I+++K++LGEYETVALTECSS + K+++ PKLK
Subjt:  LPSKTEIPHREGKEQCK-VDLKKWIRIRWTK---------------------------------VSNILSKKRRLGEYETVALTECSSALVKNEIPPKLK

Query:  DPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFL
        DPGSFTIPCS+GGKDVGRALCDL ASINLMP S+FK+L +G+A PTTVTLQLADRS+T P GKIEDVLVKVDKF+FPADFIIL+CEADKDV IILGRPFL
Subjt:  DPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFL

Query:  ATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRRETLEDILGDEESDEKG
        +T  TLIDV+KGELTM V+DQ+VTFN+++ MKY  D EEC+ I     L+   L +++  E    +   EE++++G
Subjt:  ATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRRETLEDILGDEESDEKG

A0A6J1GJ68 uncharacterized protein LOC111454344 | 6.3e-78 | 46.28
Query:  SNLETLMKEYMVRNDVAV-------RDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQP
        +++E+L+KEYM +NDV +       ++L+VQ+GQ+A E++NRP G LP+ TE P REGKEQC+      I +R  K     G  N++      Q T    
Subjt:  SNLETLMKEYMVRNDVAV-------RDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQP

Query:  HKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNIL
         +N  +  VQ + S      K+Y    +          GQ  + I   P    P + +    E   +  +D+ K I I   +V              ++L
Subjt:  HKNFQQVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKV-------------SNIL

Query:  SKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLV
        + +R+  E++ V+L E  SA++KN+IP K KDPGSFTIP S+GGK++GRALCDLGASINLMP S++K+LG+GEARPTTVTLQLADRS+T+P GKIED+L+
Subjt:  SKKRRLGEYETVALTECSSALVKNEIPPKLKDPGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLV

Query:  KVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDE
        +VDKF+FPADFIILD EAD DV IILGRPFL T RTL+DV KG +T+R+ DQ+V FN+ + MKY    EECS + E
Subjt:  KVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQKGELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDE

SwissProt top hits | e value | % identity | Alignment
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | 3.7e-11 | 24.28
Query:  ENFKRLLWEALDKKYKLEDAGTKKFLVGKFLDYKMIDTKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVIEKLPPSWKDFKYYLKHKRNELSMEN---
        E+  R +W  L+  Y  +    K +L  +     M +    ++ +     +I+ L + G+ I E  +   ++  LP S+ +    + H +  + +++   
Subjt:  ENFKRLLWEALDKKYKLEDAGTKKFLVGKFLDYKMIDTKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVIEKLPPSWKDFKYYLKHKRNELSMEN---

Query:  ---LGVKLRIKEDNR------KGDKASLGVEANAHIVESSKHGPKKQQLKKRNVNPRPRNDTNKLKKRNVNPR--------PRNDANKRIC---------
           L  K+R K +N+      +G   S    +N +   S   G  K + K R  N    N     K+   NPR         +ND N             
Subjt:  ---LGVKLRIKEDNR------KGDKASLGVEANAHIVESSKHGPKKQQLKKRNVNPRPRNDTNKLKKRNVNPR--------PRNDANKRIC---------

Query:  -----GVC---------WWIDIGATRHICKDKSLFTTYEKLDGGEKLYMGNASTTSVASKGNVLLKWTSGKIFTLNDVWHVPEIRKNLVSGTLLNKNGFK
               C         W +D  A+ H    + LF  Y   D G  + MGN S + +A  G++ +K   G    L DV HVP++R NL+SG  L+++G++
Subjt:  -----GVC---------WWIDIGATRHICKDKSLFTTYEKLDGGEKLYMGNASTTSVASKGNVLLKWTSGKIFTLNDVWHVPEIRKNLVSGTLLNKNGFK

Query:  LVFESDKFILTKG
          F + K+ LTKG
Subjt:  LVFESDKFILTKG

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | 3.1e-05 | 35.62
Query:  WWIDIGATRHICKDKSLFTTYEKLDGGEKLYMGNASTTSVASKGNVLLKWTSGKIFTLNDVWHVPEIRKNLVS
        W +D GAT HI  D +  + ++   GG+ + + + ST  +   G+  L  TS +   LN V +VP I KNL+S
Subjt:  WWIDIGATRHICKDKSLFTTYEKLDGGEKLYMGNASTTSVASKGNVLLKWTSGKIFTLNDVWHVPEIRKNLVS

Arabidopsis top hits | e value | % identity | Alignment
AT4G00980.1 zinc knuckle (CCHC-type) family protein | 2.6e-07 | 28.95
Query:  LWEALDKKYKLEDAGTKKFLVGKFLDYKMIDTKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVIEKLPPSWKDF
        LW+ L   Y+ +++ +K+  V K+++++M++ + ++ Q++    I   + S G+ + E F V+ +I K PPSW+ F
Subjt:  LWEALDKKYKLEDAGTKKFLVGKFLDYKMIDTKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVIEKLPPSWKDF
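The alignments above follow the BLAST convention in which the unlabeled middle line marks identical residues with a letter and conservative substitutions with `+`. As a rough sketch, the identities and positives of a single alignment block can be counted from that middle line. `midline_stats` is a hypothetical helper; it assumes the middle line has been stripped of its leading indent and may have lost trailing spaces during extraction, so it is padded to the query length. The % identity column in the hit tables covers the whole alignment, not one block.

```python
def midline_stats(query, midline):
    """Count identities and positives in one BLAST-style alignment block.

    query   : the aligned query row (letters and '-' gap characters)
    midline : the row between query and subject; letters mark identities,
              '+' marks conservative substitutions, spaces mark the rest
    Returns (identities, positives, block_length).
    """
    # Trailing spaces are often trimmed in scraped text; pad them back.
    midline = midline.ljust(len(query))
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives, len(query)

# Last block of the first GenBank alignment, copied from the record.
q = "VTFNVMNVMKYSGDFEECSMIDEIDEL----SQSTLQEIMRRETLEDILGDEESDE"
m = "VTFNV   M++    ++CS++ ++++L      +++++++ R    D   DEE DE"
ident, pos, length = midline_stats(q, m)
print(f"{ident}/{length} identities, {pos}/{length} positives")
```

Summing these counts over every block of one hit and dividing by the total alignment length gives a figure comparable to the reported % identity.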


Sequences
CDS sequence
ATGGCTGCAAACTCCTCCACCAATGTTCCTGGTGCTACCATGATGGGATCAACCATCCTCAAAACTCATGCCGAAAAACCAGAGAGATTCAAGGGAGAAAATTTCAAGAG
GTTATTGTGGGAGGCATTAGACAAGAAGTATAAGCTGGAAGATGCTGGTACTAAGAAGTTTCTTGTCGGAAAATTCTTAGATTATAAGATGATTGATACCAAGTTGGTAG
TCAATCAGATGGAAGAATTGCAAATTATCATTAGTGATTTGCAAAGTGAGGGATTGGACATCAGTGAACCATTCCAAGTTGCTGCTGTGATTGAGAAGCTACCTCCTTCT
TGGAAGGATTTCAAATACTATCTTAAGCACAAGCGAAATGAGCTTTCCATGGAGAATCTTGGAGTCAAACTCCGCATTAAAGAGGATAATAGAAAAGGAGACAAAGCTTC
GTTGGGAGTTGAAGCTAATGCACACATTGTTGAATCTTCAAAACATGGTCCTAAGAAGCAACAACTCAAGAAGAGGAATGTGAATCCTCGACCGAGGAATGACACCAACA
AACTCAAGAAGAGGAATGTGAATCCTCGACCGAGGAATGACGCCAACAAGCGCATTTGTGGAGTCTGCTGGTGGATTGATATTGGTGCTACTAGGCACATTTGTAAGGAT
AAGTCCTTATTCACCACATATGAGAAGCTAGATGGTGGTGAGAAGTTGTACATGGGAAATGCTTCAACTACCTCTGTGGCAAGTAAAGGGAATGTTCTGCTGAAATGGAC
TTCTGGAAAGATATTTACCCTCAATGATGTGTGGCACGTGCCAGAGATTCGAAAGAATCTAGTGTCAGGAACCCTACTTAATAAAAATGGGTTCAAGCTTGTGTTTGAGT
CAGATAAATTTATTCTGACAAAAGGGGGCTGGGGCATGTCAATTATCGTTCTTTACAAAGAATGTGGATTAGATCAAGCTTCAAAGGCACTAGCAAACGCATCTGCAAAC
GGATCTTTTCTGAAGAATTCTACTAATGAGGCGCATGCTATTCTAGACACCATGGCCACCAACAATCGACATTGGGGAGAAAATGAACCAACAATCATTTTGAAAAATCC
AGTGAAAGCAGTGGAAACAGAGACTAACTCTTCGATGCAAGCCCAGATCAAGGCCATTCATAGCATGATGATAGGCATGACCATGAATAATCAAGCCAACATAGCCCCTG
CCAATGCTGTCTCTTCTCTCTGTTGTGAGATATGTGTCGGGTCGAGTAATTCCAAAACTCATCAGTTTGGGAAGCAGCCTACACAACAACAACCGCATAAAAACTTCCAA
CAAGTTCCAGTACAGAGTCAAGAGTCAAATCTAGAGACTCTTATGAAGGAGTACATGGTAAGGAACGATGTTGCAGTAAGAGATTTGAAAGTTCAGCTTGGACAAATTGC
TCAGGAGATTAAGAATAGACCACATGGAACGTTGCCTAGCAAAACAGAGATCCCTCACAGAGAAGGAAAGGAACAATGCAAGGTAGACCTTAAGAAGTGGATTAGAATAC
GATGGACCAAAGTATCCAGACAAGTCGGGTCGAGTAATTCCAAAACTCATCAGTTTGGGAAGCAGCCTACACAACAACAACCGCATAAAAACTTCCAACAAGTTCCAGTA
CAGAGTCAAGAGTCAAATCTAGAGACTCTTATGAAGGAGTACATGGTAAGGAACGATGTTGCAGTAAGAGATTTGAAAGTTCAGCTTGGACAAATTGCTCAGGAGATTAA
GAATAGACCACATGGAACGTTGCCTAGCAAAACAGAGATCCCTCACAGAGAAGGAAAGGAACAATGCAAGGTAGACCTTAAGAAGTGGATTAGAATACGATGGACCAAAG
TATCCAATATTCTGTCAAAAAAGAGAAGGTTGGGAGAATATGAAACGGTTGCACTCACGGAATGTTCCAGTGCACTGGTGAAGAACGAGATCCCTCCCAAGCTTAAAGAC
CCAGGAAGCTTCACCATTCCATGCTCTGTCGGAGGCAAGGATGTAGGCAGAGCTCTGTGCGATTTAGGAGCAAGCATCAACTTAATGCCATACTCGGTTTTTAAACAGTT
AGGGGTGGGTGAAGCACGACCAACAACTGTGACCTTACAGTTGGCGGACAGATCGCTTACACACCCCATTGGAAAGATTGAAGACGTATTGGTTAAGGTTGACAAGTTTG
TTTTCCCTGCAGACTTCATCATTTTAGATTGTGAAGCTGACAAGGATGTGCTGATCATCTTAGGACGACCTTTCCTAGCCACTGACAGAACCTTAATAGATGTACAGAAA
GGAGAATTGACCATGAGGGTCAACGATCAGCAAGTCACGTTTAACGTGATGAATGTAATGAAGTACTCTGGAGATTTCGAGGAATGTTCTATGATTGATGAAATTGATGA
ACTCTCCCAGTCTACTTTGCAGGAAATAATGAGAAGAGAAACCTTGGAAGATATACTTGGAGATGAAGAATCAGACGAAAAAGGTGGGAAGAAGATTGAAGATGAAGAGA
TAAATCAACCCATGAAGCGACAAAGGATCAAACCATATTGGGGGAGAGGCTTCGAGGATGAGGAATCCCATGTTTCTGTGATTGATTTGGTATGA
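To check a CDS against its listed protein, the nucleotide lines can be joined and translated codon by codon. The sketch below assumes the standard nuclear genetic code and a CDS that starts in frame. `translate` and `CODON_TABLE` are names introduced here for illustration, not part of CuGenDB.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate an in-frame CDS, stopping at the first stop codon.

    Whitespace and line breaks are removed first, so the wrapped lines
    of a record can be passed in joined by newlines. Codons containing
    characters other than A, C, G, T raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)

# First 30 nt and last 12 nt of the CDS shown above.
print(translate("ATGGCTGCAAACTCCTCCACCAATGTTCCT"))  # MAANSSTNVP
print(translate("GATTTGGTATGA"))                    # DLV
```

Both fragments agree with the start (`MAANSSTNVP`) and end (`DLV`) of the protein sequence listed for this gene.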
mRNA sequence
ATGGCTGCAAACTCCTCCACCAATGTTCCTGGTGCTACCATGATGGGATCAACCATCCTCAAAACTCATGCCGAAAAACCAGAGAGATTCAAGGGAGAAAATTTCAAGAG
GTTATTGTGGGAGGCATTAGACAAGAAGTATAAGCTGGAAGATGCTGGTACTAAGAAGTTTCTTGTCGGAAAATTCTTAGATTATAAGATGATTGATACCAAGTTGGTAG
TCAATCAGATGGAAGAATTGCAAATTATCATTAGTGATTTGCAAAGTGAGGGATTGGACATCAGTGAACCATTCCAAGTTGCTGCTGTGATTGAGAAGCTACCTCCTTCT
TGGAAGGATTTCAAATACTATCTTAAGCACAAGCGAAATGAGCTTTCCATGGAGAATCTTGGAGTCAAACTCCGCATTAAAGAGGATAATAGAAAAGGAGACAAAGCTTC
GTTGGGAGTTGAAGCTAATGCACACATTGTTGAATCTTCAAAACATGGTCCTAAGAAGCAACAACTCAAGAAGAGGAATGTGAATCCTCGACCGAGGAATGACACCAACA
AACTCAAGAAGAGGAATGTGAATCCTCGACCGAGGAATGACGCCAACAAGCGCATTTGTGGAGTCTGCTGGTGGATTGATATTGGTGCTACTAGGCACATTTGTAAGGAT
AAGTCCTTATTCACCACATATGAGAAGCTAGATGGTGGTGAGAAGTTGTACATGGGAAATGCTTCAACTACCTCTGTGGCAAGTAAAGGGAATGTTCTGCTGAAATGGAC
TTCTGGAAAGATATTTACCCTCAATGATGTGTGGCACGTGCCAGAGATTCGAAAGAATCTAGTGTCAGGAACCCTACTTAATAAAAATGGGTTCAAGCTTGTGTTTGAGT
CAGATAAATTTATTCTGACAAAAGGGGGCTGGGGCATGTCAATTATCGTTCTTTACAAAGAATGTGGATTAGATCAAGCTTCAAAGGCACTAGCAAACGCATCTGCAAAC
GGATCTTTTCTGAAGAATTCTACTAATGAGGCGCATGCTATTCTAGACACCATGGCCACCAACAATCGACATTGGGGAGAAAATGAACCAACAATCATTTTGAAAAATCC
AGTGAAAGCAGTGGAAACAGAGACTAACTCTTCGATGCAAGCCCAGATCAAGGCCATTCATAGCATGATGATAGGCATGACCATGAATAATCAAGCCAACATAGCCCCTG
CCAATGCTGTCTCTTCTCTCTGTTGTGAGATATGTGTCGGGTCGAGTAATTCCAAAACTCATCAGTTTGGGAAGCAGCCTACACAACAACAACCGCATAAAAACTTCCAA
CAAGTTCCAGTACAGAGTCAAGAGTCAAATCTAGAGACTCTTATGAAGGAGTACATGGTAAGGAACGATGTTGCAGTAAGAGATTTGAAAGTTCAGCTTGGACAAATTGC
TCAGGAGATTAAGAATAGACCACATGGAACGTTGCCTAGCAAAACAGAGATCCCTCACAGAGAAGGAAAGGAACAATGCAAGGTAGACCTTAAGAAGTGGATTAGAATAC
GATGGACCAAAGTATCCAGACAAGTCGGGTCGAGTAATTCCAAAACTCATCAGTTTGGGAAGCAGCCTACACAACAACAACCGCATAAAAACTTCCAACAAGTTCCAGTA
CAGAGTCAAGAGTCAAATCTAGAGACTCTTATGAAGGAGTACATGGTAAGGAACGATGTTGCAGTAAGAGATTTGAAAGTTCAGCTTGGACAAATTGCTCAGGAGATTAA
GAATAGACCACATGGAACGTTGCCTAGCAAAACAGAGATCCCTCACAGAGAAGGAAAGGAACAATGCAAGGTAGACCTTAAGAAGTGGATTAGAATACGATGGACCAAAG
TATCCAATATTCTGTCAAAAAAGAGAAGGTTGGGAGAATATGAAACGGTTGCACTCACGGAATGTTCCAGTGCACTGGTGAAGAACGAGATCCCTCCCAAGCTTAAAGAC
CCAGGAAGCTTCACCATTCCATGCTCTGTCGGAGGCAAGGATGTAGGCAGAGCTCTGTGCGATTTAGGAGCAAGCATCAACTTAATGCCATACTCGGTTTTTAAACAGTT
AGGGGTGGGTGAAGCACGACCAACAACTGTGACCTTACAGTTGGCGGACAGATCGCTTACACACCCCATTGGAAAGATTGAAGACGTATTGGTTAAGGTTGACAAGTTTG
TTTTCCCTGCAGACTTCATCATTTTAGATTGTGAAGCTGACAAGGATGTGCTGATCATCTTAGGACGACCTTTCCTAGCCACTGACAGAACCTTAATAGATGTACAGAAA
GGAGAATTGACCATGAGGGTCAACGATCAGCAAGTCACGTTTAACGTGATGAATGTAATGAAGTACTCTGGAGATTTCGAGGAATGTTCTATGATTGATGAAATTGATGA
ACTCTCCCAGTCTACTTTGCAGGAAATAATGAGAAGAGAAACCTTGGAAGATATACTTGGAGATGAAGAATCAGACGAAAAAGGTGGGAAGAAGATTGAAGATGAAGAGA
TAAATCAACCCATGAAGCGACAAAGGATCAAACCATATTGGGGGAGAGGCTTCGAGGATGAGGAATCCCATGTTTCTGTGATTGATTTGGTATGA
Protein sequence
MAANSSTNVPGATMMGSTILKTHAEKPERFKGENFKRLLWEALDKKYKLEDAGTKKFLVGKFLDYKMIDTKLVVNQMEELQIIISDLQSEGLDISEPFQVAAVIEKLPPS
WKDFKYYLKHKRNELSMENLGVKLRIKEDNRKGDKASLGVEANAHIVESSKHGPKKQQLKKRNVNPRPRNDTNKLKKRNVNPRPRNDANKRICGVCWWIDIGATRHICKD
KSLFTTYEKLDGGEKLYMGNASTTSVASKGNVLLKWTSGKIFTLNDVWHVPEIRKNLVSGTLLNKNGFKLVFESDKFILTKGGWGMSIIVLYKECGLDQASKALANASAN
GSFLKNSTNEAHAILDTMATNNRHWGENEPTIILKNPVKAVETETNSSMQAQIKAIHSMMIGMTMNNQANIAPANAVSSLCCEICVGSSNSKTHQFGKQPTQQQPHKNFQ
QVPVQSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSRQVGSSNSKTHQFGKQPTQQQPHKNFQQVPV
QSQESNLETLMKEYMVRNDVAVRDLKVQLGQIAQEIKNRPHGTLPSKTEIPHREGKEQCKVDLKKWIRIRWTKVSNILSKKRRLGEYETVALTECSSALVKNEIPPKLKD
PGSFTIPCSVGGKDVGRALCDLGASINLMPYSVFKQLGVGEARPTTVTLQLADRSLTHPIGKIEDVLVKVDKFVFPADFIILDCEADKDVLIILGRPFLATDRTLIDVQK
GELTMRVNDQQVTFNVMNVMKYSGDFEECSMIDEIDELSQSTLQEIMRRETLEDILGDEESDEKGGKKIEDEEINQPMKRQRIKPYWGRGFEDEESHVSVIDLV
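The sequences in this record appear as fixed-width lines without a header. Most sequence tools expect FASTA input, so a small conversion step is often useful. `to_fasta` below is a hypothetical helper, and the 60-character line width is an arbitrary choice.

```python
def to_fasta(header, lines, width=60):
    """Join wrapped sequence lines and emit a single FASTA record."""
    # Remove any whitespace, including line breaks inside the pieces.
    seq = "".join("".join(lines).split())
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(chunks) + "\n"

# Only a short prefix of the protein is used here; pass all lines
# of the record to build the full entry.
prefix = ["MAANSSTNVPGATMMGSTILKTHAEKPERF", "KGENFKRLLWEALDKKYKLEDAGTKKF"]
print(to_fasta("Lag0038780", prefix), end="")
```

The gene ID is used as the header because it is the only identifier the record provides.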