CuGenDBv2

Moc09g04470 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc09g04470
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: DNA glycosylase superfamily protein
Genome location: chr9:3427952..3430815
RNA-Seq Expression: Moc09g04470
Synteny: Moc09g04470
Gene Ontology terms:
  GO:0006284 - base-excision repair (biological process)
  GO:0008725 - DNA-3-methyladenine glycosylase activity (molecular function)
InterPro domains:
  IPR005019 - Methyladenine glycosylase
  IPR011257 - DNA glycosylase
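
The genome location above uses a `chrom:start..end` notation. As a minimal sketch, it can be split into its parts and the locus span computed as follows. The helper name `parse_location` is hypothetical, and the span assumes 1-based inclusive coordinates, which this page does not state explicitly.

```python
import re


def parse_location(loc: str) -> tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    m = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognized location: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


chrom, start, end = parse_location("chr9:3427952..3430815")
# Assumption: both coordinates are inclusive, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)  # chr9 3427952 3430815 2864
```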


Homology
GenBank top hits (e value, % identity, alignment)
KAG6593364.1 hypothetical protein SDJN03_12840, partial [Cucurbita argyrosperma subsp. sororia] | e value: 8.7e-185 | % identity: 91.33
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RK G KPLKKLEKPHQEAESKDKRVPLSPPQCV +VPSVLRQQDRHQAIL LSMNASCSSDASSDSF
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF

Query:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
        NSRASSARGTRQRGPNLRRK  STVKRAEKAVEKVG ESVV   +TV  LEPKKRCAWVT NTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
Subjt:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW

Query:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS
        P IL KRHLFRE FLDFDPNAVSKLNEKKMVA GSAATSLLSE KVRAIIENGRQMCKVIDEFGSF+VY+WNFVNHKP ISQFRYPRQVPDKTSKAEVIS
Subjt:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS

Query:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
        KDLVKRGFRSVGPTVIYTFMQVAGLTNDHL+SCFRFPECIET E+GE+DG+IKP I EKIPEALKNLEL
Subjt:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL

XP_004136097.2 uncharacterized protein LOC101205558 [Cucumis sativus] | e value: 8.7e-185 | % identity: 92.18
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RKPG KPLKKLEKP QE ESKDKRVPLSPPQCV+VPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
        SRASSARGTRQRGPNLRRKQ STVK A+KAVEKVGVESV VVVDTV  LE KKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
Subjt:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP

Query:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK
        AILNKRHLFREIFLDFDP AVSKLNEKKMVA GSAATSLLSELKVRAIIENGRQMCKVIDEFGSF+VY+WNFVNHKPIISQFRYPRQVPDKTSKAEVISK
Subjt:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK

Query:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKD-GEIKPIINEKIPEALKNLEL
        DLVKRGFRSVGPTVIYTFMQVAGLTNDHLI CFRF ECIE  TAE+GE+D GE+K   NEK+PEALKNLEL
Subjt:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKD-GEIKPIINEKIPEALKNLEL

XP_008461179.1 PREDICTED: probable GMP synthase [glutamine-hydrolyzing] [Cucumis melo] | e value: 1.3e-185 | % identity: 91.89
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RKPG KPLKKLEKP QE ESKDKRVPLSPPQCV+VPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
        SRASSARGTRQRGPNLRRKQ STVK A+KAVEKVGVESV VV DTV  LE KKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
Subjt:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP

Query:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK
        AILNKRHLFREIFLDFDP  VSKLNEKKMVA GSAATSLLSELK+RAIIENGRQMCKVIDEFGSF+VY+WNFVNHKPIISQFRYPRQVPDKTSKAEVISK
Subjt:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK

Query:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL
        DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRF ECIE  TAE+GE+DGE+K   NEK+PEALKNLEL
Subjt:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL

XP_022155202.1 uncharacterized protein LOC111022341 [Momordica charantia] | e value: 1.5e-205 | % identity: 100
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA
        SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA
Subjt:  SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA

Query:  ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD
        ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD
Subjt:  ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD

Query:  LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
        LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
Subjt:  LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL

XP_023514420.1 uncharacterized protein LOC111778684 [Cucurbita pepo subsp. pepo] | e value: 8.7e-185 | % identity: 91.33
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RK G KPLKKLEKPHQEAESKDKRVPLSPPQCV +VPSVLRQQDRHQAIL LSMNASCSSDASSDSF
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF

Query:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
        NSRASSARGTRQRGPNLRRK  S+VKRAEKAVEKVG ESVV V +TV  LEPKKRCAWVT NTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
Subjt:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW

Query:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS
        P IL KRHLFRE FLDFDPNAVSKLNEKKMVA GSAATSLLSE KVRAIIENGRQMCKVIDEFGSF+VY+WNFVNHKP ISQFRYPRQVPDKTSKAEVIS
Subjt:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS

Query:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
        KDLVKRGFRSVGPTVIYTFMQVAGLTNDHL+SCFRFPECIET E+GE+DG+IKP I EKIPEALKNLEL
Subjt:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL

TrEMBL top hits (e value, % identity, alignment)
A0A0A0K8L6 Uncharacterized protein | e value: 4.2e-185 | % identity: 92.18
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RKPG KPLKKLEKP QE ESKDKRVPLSPPQCV+VPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
        SRASSARGTRQRGPNLRRKQ STVK A+KAVEKVGVESV VVVDTV  LE KKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
Subjt:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP

Query:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK
        AILNKRHLFREIFLDFDP AVSKLNEKKMVA GSAATSLLSELKVRAIIENGRQMCKVIDEFGSF+VY+WNFVNHKPIISQFRYPRQVPDKTSKAEVISK
Subjt:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK

Query:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKD-GEIKPIINEKIPEALKNLEL
        DLVKRGFRSVGPTVIYTFMQVAGLTNDHLI CFRF ECIE  TAE+GE+D GE+K   NEK+PEALKNLEL
Subjt:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKD-GEIKPIINEKIPEALKNLEL

A0A1S3CE52 probable GMP synthase [glutamine-hydrolyzing] | e value: 6.5e-186 | % identity: 91.89
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RKPG KPLKKLEKP QE ESKDKRVPLSPPQCV+VPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
        SRASSARGTRQRGPNLRRKQ STVK A+KAVEKVGVESV VV DTV  LE KKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
Subjt:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP

Query:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK
        AILNKRHLFREIFLDFDP  VSKLNEKKMVA GSAATSLLSELK+RAIIENGRQMCKVIDEFGSF+VY+WNFVNHKPIISQFRYPRQVPDKTSKAEVISK
Subjt:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK

Query:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL
        DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRF ECIE  TAE+GE+DGE+K   NEK+PEALKNLEL
Subjt:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL

A0A5A7UYZ9 Putative GMP synthase | e value: 6.5e-186 | % identity: 91.89
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RKPG KPLKKLEKP QE ESKDKRVPLSPPQCV+VPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
        SRASSARGTRQRGPNLRRKQ STVK A+KAVEKVGVESV VV DTV  LE KKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP
Subjt:  SRASSARGTRQRGPNLRRKQ-STVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWP

Query:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK
        AILNKRHLFREIFLDFDP  VSKLNEKKMVA GSAATSLLSELK+RAIIENGRQMCKVIDEFGSF+VY+WNFVNHKPIISQFRYPRQVPDKTSKAEVISK
Subjt:  AILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISK

Query:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL
        DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRF ECIE  TAE+GE+DGE+K   NEK+PEALKNLEL
Subjt:  DLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIE--TAERGEKDGEIKPIINEKIPEALKNLEL

A0A6J1DNQ3 uncharacterized protein LOC111022341 | e value: 7.4e-206 | % identity: 100
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
        MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFN

Query:  SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA
        SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA
Subjt:  SRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPA

Query:  ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD
        ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD
Subjt:  ILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKD

Query:  LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
        LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
Subjt:  LVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL

A0A6J1H7A2 uncharacterized protein LOC111461081 | e value: 7.9e-184 | % identity: 90.79
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF
        MSGPPRIRSMNVADSDSRPVLGPTGNKAR VE RK G KPLKKLEKPHQEAESKDKRVPLSPPQCV +VPSVLRQQDRHQAIL LSMNASCSSDASSDSF
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCV-SVPSVLRQQDRHQAILNLSMNASCSSDASSDSF

Query:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
        NSRASSARGTRQRGPNLRRK  STVKRAEKAVEKVG ESVV   +TV  LEPKKRCAWVT NTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW
Subjt:  NSRASSARGTRQRGPNLRRK-QSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTW

Query:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS
        P IL KRHLFRE FLDFDPNAVSKLNEKKMVA GSAATSLLSE KVRAIIENGRQMCKVIDEFGSF+VY+WNFVNHKP ISQFRYPRQVPDKTSKA+VIS
Subjt:  PAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVIS

Query:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
        KDLVKRGFRSVGPTVIYTFMQVAGLTNDHL+SCFRF ECIET E+GE+DG+IKP I EKIPEALKNLEL
Subjt:  KDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL

SwissProt top hits (e value, % identity, alignment)
P05100 DNA-3-methyladenine glycosylase 1 | e value: 1.1e-38 | % identity: 41.3
Query:  KRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENG
        +RC WV  + DP Y A+HD EWGVP  D KKLFE++CL G  A L+W  +L KR  +R  F  FDP  V+ + E+ +      A  +    K++AII N 
Subjt:  KRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENG

Query:  RQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFP
        R   ++      F  ++W+FVNH+P ++Q     ++P  TS ++ +SK L KRGF+ VG T+ Y+FMQ  GL NDH++ C  +P
Subjt:  RQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFP
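
In each alignment block, the unlabeled middle line marks identical residues with the residue letter, conservative substitutions with `+`, and mismatches or gaps with a space. The sketch below counts these for one segment, using the first 25 columns of the P05100 alignment above. The function name `midline_stats` is hypothetical. A percentage computed over a short excerpt like this will not match the % identity reported for the full hit.

```python
def midline_stats(query: str, midline: str) -> tuple[int, int, int]:
    """Return (identities, positives, length) for one aligned segment.

    A letter in the midline is an identical residue, '+' is a conservative
    substitution, and a space is a mismatch or a gap.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives, len(query)


# First 25 alignment columns of the P05100 hit, copied from the block above.
query = "KRCAWVTPNTDPCYAAFHDEEWGVP"
midline = "+RC WV  + DP Y A+HD EWGVP"

ident, pos, length = midline_stats(query, midline)
print(f"{ident}/{length} identical ({100 * ident / length:.1f}%), "
      f"{pos}/{length} positives")
```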

P44321 DNA-3-methyladenine glycosylase | e value: 3.2e-33 | % identity: 39.66
Query:  RCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGR
        RC WV       Y  +HD+EWG P  D +KLFE +CL G  A L+W  +L KR  +RE F  FDP  ++K+    + A    +  +    K+ AI++N +
Subjt:  RCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGR

Query:  QMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISC
            +     +F  +IW+FVNHKPI++     R VP KT  ++ +SK L KRGF  +G T  Y FMQ  GL +DHL  C
Subjt:  QMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISC

Q7VG78 Probable GMP synthase [glutamine-hydrolyzing] | e value: 2.5e-41 | % identity: 44.85
Query:  DTVAGLEPKKRCAWVTPNTDPC---YAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLL
        D+  G+  K RCAW T   +     Y  +HD EWG P+H+DKKLFE L L G  A L+W  IL KR  FR  F DFDP+ V+  +E K+         + 
Subjt:  DTVAGLEPKKRCAWVTPNTDPC---YAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLL

Query:  SELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFR
        +  K+ A I N +    V  EFGSFD YIW FV  KPII+ F     +P  T  ++ I+KDL KRGF+ VG T +Y  MQ  G+ NDHL SCF+
Subjt:  SELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFR

Arabidopsis top hits (e value, % identity, alignment)
AT1G15970.1 DNA glycosylase superfamily protein | e value: 1.6e-91 | % identity: 55.49
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQE---AESKDKR-----VPLSP----PQCVSV-PSVLRQQDRHQAILNLSMN
        MS PPR RS+N  + + R VLGPTGNK +    RKP G    KLEKP  E    +SKD++      P SP     QC S+  S+LR+        + SM 
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQE---AESKDKR-----VPLSP----PQCVSV-PSVLRQQDRHQAILNLSMN

Query:  ASCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELL
        AS SSDASS   +S  S A  +  +   + R+  +V    K    VG E   V  D  A  + +KRCAW+TP  DPCY AFHDEEWGVPVHDDKKLFELL
Subjt:  ASCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELL

Query:  CLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQV
        CLSGALAEL+W  IL++RH+ RE+F+DFDP AV++LN+KK+ A G+AA SLLSE+K+R+I++N R + K+I E GS   Y+WNFVN+KP  SQFRY RQV
Subjt:  CLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQV

Query:  PDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAE
        P KTSKAE ISKDLV+RGFRSV PTVIY+FMQ AGLTNDHLI CFR+ +C   AE
Subjt:  PDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAE

AT1G75090.1 DNA glycosylase superfamily protein | e value: 1.2e-67 | % identity: 44.73
Query:  PLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVG--VES
        P+K +++      S   R  ++  +    P +  +  +  A      N S S+D SS S +S   S+  T   G      + T       VEK+   V S
Subjt:  PLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVG--VES

Query:  VVVVVDTVAGLE-PKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAAT
        V VV D    +  P KRC W+TPN+DP Y  FHDEEWGVPV DDKKLFELL  S ALAE +WP+IL +R  FR++F +FDP+A+++  EK++++      
Subjt:  VVVVVDTVAGLE-PKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAAT

Query:  SLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPE
         +LSE K+RAI+EN + + KV  EFGSF  Y W FVNHKP+ + +RY RQVP K+ KAE ISKD+++RGFR VGPTV+Y+F+Q +G+ NDHL +CFR+ E
Subjt:  SLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPE

Query:  CIETAERGEKDGE
        C    ER  K  E
Subjt:  CIETAERGEKDGE

AT1G80850.1 DNA glycosylase superfamily protein | e value: 3.0e-90 | % identity: 53.11
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSD---
        MS PPR+RS++ +D + R VLGP GNK +     KP  KP+ +  K     E   +  PLSPP       +LR+         +SM AS SSDASS    
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSD---

Query:  ---SFNSRASSARGTRQRG----PNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLS
           S  S +S  R  R+ G     +  R+  T +R EKA +                 + +KRCAW+TP +D CY AFHDEEWGVPVHDDK+LFELL LS
Subjt:  ---SFNSRASSARGTRQRG----PNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLS

Query:  GALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDK
        GALAEL+W  IL+KR LFRE+F+DFDP A+S+L  KK+ +   AAT+LLSE K+R+I+EN  Q+CK+I  FGSFD YIWNFVN KP  SQFRYPRQVP K
Subjt:  GALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDK

Query:  TSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERG
        TSKAE+ISKDLV+RGFRSV PTVIY+FMQ AGLTNDHL  CFR  +C+   E G
Subjt:  TSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAERG

AT5G57970.1 DNA glycosylase superfamily protein | e value: 1.1e-100 | % identity: 56.06
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPP----------QCVSVPSVLRQQDRHQAIL--NLSMNA
        MSG PR++SMNVA++++R  LG T  KA P    K   K L+KLE+        D++   + P            ++  S+LR   RH+  L  NLS+NA
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPP----------QCVSVPSVLRQQDRHQAIL--NLSMNA

Query:  SCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLC
        S SSDAS DSF+SRAS+ R  R      R K S   +    V +  ++S         G E KKRC WVTPN+DPCY  FHDEEWGVPVHDDK+LFELL 
Subjt:  SCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLC

Query:  LSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVP
        LSGALAE TWP IL+KR  FRE+F DFDPNA+ K+NEKK++  GS A++LLS+LK+RA+IEN RQ+ KVI+E+GSFD YIW+FV +K I+S+FRY RQVP
Subjt:  LSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVP

Query:  DKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAER
         KT KAEVISKDLV+RGFRSVGPTV+Y+FMQ AG+TNDHL SCFRF  CI   ER
Subjt:  DKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAER

AT5G57970.2 DNA glycosylase superfamily protein | e value: 1.1e-100 | % identity: 56.06
Query:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPP----------QCVSVPSVLRQQDRHQAIL--NLSMNA
        MSG PR++SMNVA++++R  LG T  KA P    K   K L+KLE+        D++   + P            ++  S+LR   RH+  L  NLS+NA
Subjt:  MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPP----------QCVSVPSVLRQQDRHQAIL--NLSMNA

Query:  SCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLC
        S SSDAS DSF+SRAS+ R  R      R K S   +    V +  ++S         G E KKRC WVTPN+DPCY  FHDEEWGVPVHDDK+LFELL 
Subjt:  SCSSDASSDSFNSRASSARGTRQRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLC

Query:  LSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVP
        LSGALAE TWP IL+KR  FRE+F DFDPNA+ K+NEKK++  GS A++LLS+LK+RA+IEN RQ+ KVI+E+GSFD YIW+FV +K I+S+FRY RQVP
Subjt:  LSGALAELTWPAILNKRHLFREIFLDFDPNAVSKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVP

Query:  DKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAER
         KT KAEVISKDLV+RGFRSVGPTV+Y+FMQ AG+TNDHL SCFRF  CI   ER
Subjt:  DKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLISCFRFPECIETAER


Sequences
CDS sequence
ATGTCAGGCCCTCCCAGAATCCGGTCGATGAATGTGGCGGATTCCGACTCCCGGCCGGTTCTTGGGCCTACCGGAAACAAAGCCCGACCTGTCGAGCCCAGAAAACCTGG
TGGGAAGCCATTGAAGAAGCTTGAAAAGCCTCACCAGGAGGCTGAATCGAAGGACAAGAGGGTGCCATTGTCGCCGCCTCAATGCGTCTCGGTGCCATCGGTTCTGAGGC
AGCAGGACCGCCACCAGGCGATTCTCAATCTCTCGATGAATGCGTCGTGTTCTTCCGATGCGTCGTCTGATTCGTTCAATAGCCGGGCGTCTAGCGCGAGAGGTACGAGG
CAGCGCGGTCCCAATTTGAGGAGAAAGCAAAGTACGGTAAAGAGGGCTGAAAAGGCCGTTGAAAAGGTTGGCGTTGAAAGTGTGGTGGTGGTGGTGGATACAGTTGCTGG
TTTAGAGCCAAAAAAACGATGTGCTTGGGTAACACCTAATACAGATCCATGTTATGCTGCTTTTCATGATGAAGAGTGGGGAGTACCTGTTCACGATGACAAAAAATTGT
TTGAACTGCTCTGCCTATCGGGTGCTTTGGCTGAACTTACATGGCCTGCTATCCTTAACAAAAGACATCTATTTAGGGAAATCTTTTTGGACTTTGACCCAAATGCTGTT
TCAAAATTAAACGAGAAAAAGATGGTTGCAGCTGGAAGTGCTGCTACCTCTTTACTGTCAGAACTTAAGGTGCGAGCTATCATTGAAAATGGTCGTCAAATGTGCAAGGT
AATCGATGAATTTGGTTCCTTCGACGTGTACATTTGGAACTTTGTCAACCACAAACCGATCATCAGTCAGTTTCGGTACCCACGCCAGGTCCCCGATAAGACCTCAAAAG
CAGAGGTGATTAGCAAGGATCTCGTTAAGAGAGGGTTTCGTAGCGTGGGACCGACAGTCATCTATACATTCATGCAGGTGGCAGGGTTAACCAACGACCATCTCATCAGT
TGCTTTAGGTTTCCAGAATGTATAGAGACAGCAGAGAGAGGAGAAAAGGATGGTGAAATCAAGCCTATTATTAACGAGAAAATACCAGAGGCTCTGAAAAACTTGGAACT
ATAA
mRNA sequence
ATGTCAGGCCCTCCCAGAATCCGGTCGATGAATGTGGCGGATTCCGACTCCCGGCCGGTTCTTGGGCCTACCGGAAACAAAGCCCGACCTGTCGAGCCCAGAAAACCTGG
TGGGAAGCCATTGAAGAAGCTTGAAAAGCCTCACCAGGAGGCTGAATCGAAGGACAAGAGGGTGCCATTGTCGCCGCCTCAATGCGTCTCGGTGCCATCGGTTCTGAGGC
AGCAGGACCGCCACCAGGCGATTCTCAATCTCTCGATGAATGCGTCGTGTTCTTCCGATGCGTCGTCTGATTCGTTCAATAGCCGGGCGTCTAGCGCGAGAGGTACGAGG
CAGCGCGGTCCCAATTTGAGGAGAAAGCAAAGTACGGTAAAGAGGGCTGAAAAGGCCGTTGAAAAGGTTGGCGTTGAAAGTGTGGTGGTGGTGGTGGATACAGTTGCTGG
TTTAGAGCCAAAAAAACGATGTGCTTGGGTAACACCTAATACAGATCCATGTTATGCTGCTTTTCATGATGAAGAGTGGGGAGTACCTGTTCACGATGACAAAAAATTGT
TTGAACTGCTCTGCCTATCGGGTGCTTTGGCTGAACTTACATGGCCTGCTATCCTTAACAAAAGACATCTATTTAGGGAAATCTTTTTGGACTTTGACCCAAATGCTGTT
TCAAAATTAAACGAGAAAAAGATGGTTGCAGCTGGAAGTGCTGCTACCTCTTTACTGTCAGAACTTAAGGTGCGAGCTATCATTGAAAATGGTCGTCAAATGTGCAAGGT
AATCGATGAATTTGGTTCCTTCGACGTGTACATTTGGAACTTTGTCAACCACAAACCGATCATCAGTCAGTTTCGGTACCCACGCCAGGTCCCCGATAAGACCTCAAAAG
CAGAGGTGATTAGCAAGGATCTCGTTAAGAGAGGGTTTCGTAGCGTGGGACCGACAGTCATCTATACATTCATGCAGGTGGCAGGGTTAACCAACGACCATCTCATCAGT
TGCTTTAGGTTTCCAGAATGTATAGAGACAGCAGAGAGAGGAGAAAAGGATGGTGAAATCAAGCCTATTATTAACGAGAAAATACCAGAGGCTCTGAAAAACTTGGAACT
ATAA
Protein sequence
MSGPPRIRSMNVADSDSRPVLGPTGNKARPVEPRKPGGKPLKKLEKPHQEAESKDKRVPLSPPQCVSVPSVLRQQDRHQAILNLSMNASCSSDASSDSFNSRASSARGTR
QRGPNLRRKQSTVKRAEKAVEKVGVESVVVVVDTVAGLEPKKRCAWVTPNTDPCYAAFHDEEWGVPVHDDKKLFELLCLSGALAELTWPAILNKRHLFREIFLDFDPNAV
SKLNEKKMVAAGSAATSLLSELKVRAIIENGRQMCKVIDEFGSFDVYIWNFVNHKPIISQFRYPRQVPDKTSKAEVISKDLVKRGFRSVGPTVIYTFMQVAGLTNDHLIS
CFRFPECIETAERGEKDGEIKPIINEKIPEALKNLEL
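
The protein sequence should follow from the CDS by codon-wise translation. The sketch below checks this on two short excerpts copied from the CDS above, assuming the standard genetic code (NCBI translation table 1). The `translate` helper is hypothetical, stops at the first stop codon, and raises `KeyError` on ambiguous bases such as `N`.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt of the CDS; the result matches the start of the protein sequence.
print(translate("ATGTCAGGCCCTCCCAGAATCCGGTCGATG"))  # MSGPPRIRSM
# Last 15 nt of the CDS, including the TAA stop codon.
print(translate("AACTTGGAACTATAA"))  # NLEL
```

To check the whole record, the ten full CDS lines plus the final `ATAA` can be joined into one string, passed to `translate`, and compared with the protein sequence listed above.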