CuGenDBv2

HG10019668 (gene) of Bottle gourd (Hangzhou Gourd) v1 genome

Gene ID: HG10019668
Organism: Lagenaria siceraria cv. Hangzhou Gourd (Bottle gourd (Hangzhou Gourd) v1)
Description: Procollagen-proline 4-dioxygenase
Genome location: Chr04:24301008..24304216
RNA-Seq Expression: HG10019668
Synteny: HG10019668
Gene Ontology terms:
GO:0018401 - peptidyl-proline hydroxylation to 4-hydroxy-L-proline (biological process)
GO:0005789 - endoplasmic reticulum membrane (cellular component)
GO:0004656 - procollagen-proline 4-dioxygenase activity (molecular function)
GO:0005506 - iron ion binding (molecular function)
GO:0031418 - L-ascorbic acid binding (molecular function)
InterPro domains:
IPR003582 - ShKT domain
IPR005123 - Oxoglutarate/iron-dependent dioxygenase
IPR006620 - Prolyl 4-hydroxylase, alpha subunit
IPR044862 - Prolyl 4-hydroxylase alpha subunit, Fe(2+) 2OG dioxygenase domain
IPR045054 - Prolyl 4-hydroxylase
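
The genome location listed above uses a `seqid:start..end` form. As a minimal sketch, assuming the coordinates are 1-based and inclusive (this record does not state the convention), the span of the locus could be computed as follows. `parse_location` and `span_length` are hypothetical helper names, not part of CuGenDB.

```python
import re

def parse_location(location):
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

def span_length(start, end):
    # Assumption: 1-based, inclusive coordinates, so both endpoints count.
    return end - start + 1

seqid, start, end = parse_location("Chr04:24301008..24304216")
print(seqid, start, end, span_length(start, end))
```

Under that assumption the locus spans 3,209 bp on Chr04.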


Homology
GenBank top hits (e value, % identity, alignment)
TYK17735.1 putative prolyl 4-hydroxylase 7 isoform X2 [Cucumis melo var. makuwa] (e value: 2.3e-149; identity: 84.95%)
Query:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ
        F+ + N L   L   +N  +SVIRMKTGG AITIDPTRVI+LSSKPRAFLYKGFLS EEC HLI+LAKGKL QSLVAA TGESVTS+ERTSTGMFL+KAQ
Subjt:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ

Query:  DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVR
        D+IVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSPVKLSE+EK DLS+CAK+GYGVR
Subjt:  DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVR

Query:  PKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS
        PK+GDALLFFS+NPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPI E+WRNPACVDEN+ C AWA AGEC+KNP YMMGSKNELG+CR+SCKVCSP S
Subjt:  PKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS

XP_004147455.1 probable prolyl 4-hydroxylase 7 [Cucumis sativus] (e value: 2.1e-158; identity: 85.22%)
Query:  MASRFFLAFSICFLCFF--PFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE
        MAS FFL FSI FL  F  PF S S NR PKL+LHN ++++SVIRMKTGG A+TIDPTRVI+LSSKPRAFLYKGFLS EEC HLIN AKGKLHQSLVAA 
Subjt:  MASRFFLAFSICFLCFF--PFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE

Query:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV
        TG+SVTS+ERTSTGMFL KAQDEIVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLS+VEKGGETVFPNSPV
Subjt:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV

Query:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM
        KLSE+EKADLS+C K+GYGVRPK+GDALLFFS+NPNVTPD TSYHGSCPVIEGEKWSATKWIHMLPI E WRNPACVDEN+ C AWA AGECEKNP YMM
Subjt:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVCSP
        GSKNELG+CR SCKVCSP
Subjt:  GSKNELGYCRMSCKVCSP

XP_008443446.1 PREDICTED: probable prolyl 4-hydroxylase 7 [Cucumis melo] (e value: 1.4e-159; identity: 85.53%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG
        MAS F LAFSI FL   P  S S NR PK+LLHN +M +SVIRMKTGG AITIDPTRVI+LSSKPRAFLYKGFLS EEC HLI+LAKGKL QSLVAA TG
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG

Query:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL
        ESVTS+ERTSTGMFL+KAQD+IVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSPVKL
Subjt:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL

Query:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS
        SE+EK DLS+CAK+GYGVRPK+GDALLFFS+NPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPI E+WRNPACVDEN+ C AWA AGEC+KNP YMMGS
Subjt:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS

Query:  KNELGYCRMSCKVCSPPS
        KNELG+CR+SCKVCSP S
Subjt:  KNELGYCRMSCKVCSPPS

XP_038905408.1 probable prolyl 4-hydroxylase 7 isoform X1 [Benincasa hispida] (e value: 4.1e-175; identity: 92.45%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG
        MASRFFLAFS+CFLCFFPFFSRS NRLPKLLLHN NM+QSVIRMKT G  +TIDPTRVIKLSSKPRAFLYKGFLSE+EC HLINLAKGKL QSLVAAETG
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG

Query:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL
        ESVTS+ERTSTGMFL +AQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSP+KL
Subjt:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL

Query:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS
        SEQE+ADLSDCAK+GYGV+PKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDEN +CRAWANAGECEKNP YMMGS
Subjt:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS

Query:  KNELGYCRMSCKVCSPPS
        KNELG+CRMSCKVCSPPS
Subjt:  KNELGYCRMSCKVCSPPS

XP_038905410.1 probable prolyl 4-hydroxylase 7 isoform X2 [Benincasa hispida] (e value: 9.6e-156; identity: 91%)
Query:  LLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQDEIVARIESR
        LL   +N +QSVIRMKT G  +TIDPTRVIKLSSKPRAFLYKGFLSE+EC HLINLAKGKL QSLVAAETGESVTS+ERTSTGMFL +AQDEIVARIESR
Subjt:  LLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQDEIVARIESR

Query:  IAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFF
        IAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSP+KLSEQE+ADLSDCAK+GYGV+PKMGDALLFF
Subjt:  IAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFF

Query:  SLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS
        SLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDEN +CRAWANAGECEKNP YMMGSKNELG+CRMSCKVCSPPS
Subjt:  SLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS

TrEMBL top hits (e value, % identity, alignment)
A0A0A0LG32 Procollagen-proline 4-dioxygenase (e value: 1.0e-158; identity: 85.22%)
Query:  MASRFFLAFSICFLCFF--PFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE
        MAS FFL FSI FL  F  PF S S NR PKL+LHN ++++SVIRMKTGG A+TIDPTRVI+LSSKPRAFLYKGFLS EEC HLIN AKGKLHQSLVAA 
Subjt:  MASRFFLAFSICFLCFF--PFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE

Query:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV
        TG+SVTS+ERTSTGMFL KAQDEIVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLS+VEKGGETVFPNSPV
Subjt:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV

Query:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM
        KLSE+EKADLS+C K+GYGVRPK+GDALLFFS+NPNVTPD TSYHGSCPVIEGEKWSATKWIHMLPI E WRNPACVDEN+ C AWA AGECEKNP YMM
Subjt:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVCSP
        GSKNELG+CR SCKVCSP
Subjt:  GSKNELGYCRMSCKVCSP

A0A1S3B814 Procollagen-proline 4-dioxygenase (e value: 6.9e-160; identity: 85.53%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG
        MAS F LAFSI FL   P  S S NR PK+LLHN +M +SVIRMKTGG AITIDPTRVI+LSSKPRAFLYKGFLS EEC HLI+LAKGKL QSLVAA TG
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETG

Query:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL
        ESVTS+ERTSTGMFL+KAQD+IVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSPVKL
Subjt:  ESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKL

Query:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS
        SE+EK DLS+CAK+GYGVRPK+GDALLFFS+NPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPI E+WRNPACVDEN+ C AWA AGEC+KNP YMMGS
Subjt:  SEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGS

Query:  KNELGYCRMSCKVCSPPS
        KNELG+CR+SCKVCSP S
Subjt:  KNELGYCRMSCKVCSPPS

A0A5A7UCT9 Procollagen-proline 4-dioxygenase (e value: 4.8e-145; identity: 77.44%)
Query:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ
        F+ + N L   L   +N  +SVIRMKTGG AITIDPTRVI+LSSKPRAFLYKGFLS EEC HLI+LAKGKL QSLVAA TGESVTS+ERTSTGMFL+KAQ
Subjt:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ

Query:  DEIVARIESRIAAWTFLPI-----------------------------DNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGE
        D+IVARIESRIAAWTFLP+                             DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGE
Subjt:  DEIVARIESRIAAWTFLPI-----------------------------DNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGE

Query:  TVFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGEC
        TVFPNSPVKLSE+EK DLS+CAK+GYGVRPK+GDALLFFS+NPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPI E+WRNPACVDEN+ C AWA AGEC
Subjt:  TVFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGEC

Query:  EKNPTYMMGSKNELGYCRMSCKVCSPPS
        +KNP YMMGSKNELG+CR+SCKVCSP S
Subjt:  EKNPTYMMGSKNELGYCRMSCKVCSPPS

A0A5D3D1X2 Procollagen-proline 4-dioxygenase (e value: 1.1e-149; identity: 84.95%)
Query:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ
        F+ + N L   L   +N  +SVIRMKTGG AITIDPTRVI+LSSKPRAFLYKGFLS EEC HLI+LAKGKL QSLVAA TGESVTS+ERTSTGMFL+KAQ
Subjt:  FSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTSTGMFLKKAQ

Query:  DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVR
        D+IVARIESRIAAWTFLP+DNGEPIQILRYENGQKYEPHFDFFQDP NIA+GGHRIATILMYLSDVEKGGETVFPNSPVKLSE+EK DLS+CAK+GYGVR
Subjt:  DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVR

Query:  PKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS
        PK+GDALLFFS+NPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPI E+WRNPACVDEN+ C AWA AGEC+KNP YMMGSKNELG+CR+SCKVCSP S
Subjt:  PKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS

A0A6J1EYJ1 Procollagen-proline 4-dioxygenase (e value: 5.5e-141; identity: 75.23%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE-T
        M SRFFLAFS+CFLC FP F+RS NRLPKLLL +   E SVIRMK  G +I IDPTRV++LSS+PRAFLYKGFLS EEC HLI+LAK  L QSLV  + T
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAE-T

Query:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK
        G S +S +RTSTGMFL KAQD+IVA IE++IAAWTFLP+DNGEPIQILRYENGQ+Y PHFDFFQDP N+A GGHRIAT+LMYLS+VE+GGETVFP+SP K
Subjt:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK

Query:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYM--
        + E+E  DL DC+  GYGV+PK GDALLFFSL+PNVT D TSYHGSCPVIEGEKWSATKWIHMLP+ EIWRNP CVDENE C AWA AGECEKNP YM  
Subjt:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYM--

Query:  --MGSKNELGYCRMSCKVCSPPS
          +GSK ELGYCR+SCK CSPPS
Subjt:  --MGSKNELGYCRMSCKVCSPPS

SwissProt top hits (e value, % identity, alignment)
F4J0A8 Probable prolyl 4-hydroxylase 6 (e value: 2.3e-96; identity: 55.38%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAA--E
        M S++FLAFS+  L  F   S                            + ++DPTR+ +LS  PRAFLYKGFLS+EEC HLI LAKGKL +S+V A  +
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAA--E

Query:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV
        +GES  SE RTS+GMFL K QD+IVA +E+++AAWTFLP +NGE +QIL YENGQKY+PHFD+F D   + +GGHRIAT+LMYLS+V KGGETVFPN   
Subjt:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV

Query:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM
        K  + +    S CAK GY V+P+ GDALLFF+L+ N T D  S HGSCPVIEGEKWSAT+WIH+    +  +   CVD++E C+ WA+AGECEKNP YM+
Subjt:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVC
        GS+  LG+CR SCK C
Subjt:  GSKNELGYCRMSCKVC

F4JAU3 Prolyl 4-hydroxylase 2 (e value: 4.1e-85; identity: 56.99%)
Query:  PAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQIL
        P+  I+P++V ++SSKPRAF+Y+GFL++ EC HLI+LAK  L +S VA  + GES  S+ RTS+G F+ K +D IV+ IE +++ WTFLP +NGE +Q+L
Subjt:  PAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQIL

Query:  RYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPN----SPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYH
        RYE+GQKY+ HFD+F D  NIA GGHRIAT+L+YLS+V KGGETVFP+    S   LSE  K DLSDCAK G  V+PK G+ALLFF+L  +  PD  S H
Subjt:  RYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPN----SPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYH

Query:  GSCPVIEGEKWSATKWIHMLPIYEIWRNPA-CVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC
        G CPVIEGEKWSATKWIH+    +I  +   C D NE C  WA  GEC KNP YM+G+    G CR SCK C
Subjt:  GSCPVIEGEKWSATKWIHMLPIYEIWRNPA-CVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC

Q8L970 Probable prolyl 4-hydroxylase 7 (e value: 4.5e-108; identity: 59.31%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET
        M SR FLAFS+CFL   P  S + NR   L   +   + SVI+MKT   +   DPTRV +LS  PR FLY+GFLS+EEC H I LAKGKL +S+VA  ++
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET

Query:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK
        GESV SE RTS+GMFL K QD+IV+ +E+++AAWTFLP +NGE +QIL YENGQKYEPHFD+F D AN+ +GGHRIAT+LMYLS+VEKGGETVFP    K
Subjt:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK

Query:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGECEKNPTYMM
         ++ +    ++CAK GY V+P+ GDALLFF+L+PN T D+ S HGSCPV+EGEKWSAT+WIH+      + +   C+DEN  C  WA AGEC+KNPTYM+
Subjt:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVCS
        GS  + GYCR SCK CS
Subjt:  GSKNELGYCRMSCKVCS

Q8LAN3 Probable prolyl 4-hydroxylase 4 (e value: 5.2e-88; identity: 56.67%)
Query:  AITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILR
        ++ ++P++V ++SSKPRAF+Y+GFL+E EC H+++LAK  L +S VA  ++GES  SE RTS+G F+ K +D IV+ IE +I+ WTFLP +NGE IQ+LR
Subjt:  AITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILR

Query:  YENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV---KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGS
        YE+GQKY+ HFD+F D  NI  GGHR+ATILMYLS+V KGGETVFP++ +   ++  + K DLSDCAK G  V+P+ GDALLFF+L+P+  PD  S HG 
Subjt:  YENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV---KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGS

Query:  CPVIEGEKWSATKWIHMLPIYEI-WRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC
        CPVIEGEKWSATKWIH+     I   +  C D NE C  WA  GEC KNP YM+G+    GYCR SCK C
Subjt:  CPVIEGEKWSATKWIHMLPIYEI-WRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC

Q9LN20 Probable prolyl 4-hydroxylase 3 (e value: 8.4e-62; identity: 55.88%)
Query:  LSSKPRAFLYKGFLSEEECHHLINLAKGKLHQS-LVAAETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHF
        LS +PRAF+Y  FLS+EEC +LI+LAK  + +S +V +ETG+S  S  RTS+G FL++ +D+I+  IE RIA +TF+P D+GE +Q+L YE GQKYEPH+
Subjt:  LSSKPRAFLYKGFLSEEECHHLINLAKGKLHQS-LVAAETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHF

Query:  DFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQE-KADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATK
        D+F D  N   GG R+AT+LMYLSDVE+GGETVFP + +  S      +LS+C K G  V+P+MGDALLF+S+ P+ T D TS HG CPVI G KWS+TK
Subjt:  DFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQE-KADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATK

Query:  WIHM
        W+H+
Subjt:  WIHM

Arabidopsis top hits (e value, % identity, alignment)
AT3G06300.1 P4H isoform 2 (e value: 2.9e-86; identity: 56.99%)
Query:  PAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQIL
        P+  I+P++V ++SSKPRAF+Y+GFL++ EC HLI+LAK  L +S VA  + GES  S+ RTS+G F+ K +D IV+ IE +++ WTFLP +NGE +Q+L
Subjt:  PAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQIL

Query:  RYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPN----SPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYH
        RYE+GQKY+ HFD+F D  NIA GGHRIAT+L+YLS+V KGGETVFP+    S   LSE  K DLSDCAK G  V+PK G+ALLFF+L  +  PD  S H
Subjt:  RYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPN----SPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYH

Query:  GSCPVIEGEKWSATKWIHMLPIYEIWRNPA-CVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC
        G CPVIEGEKWSATKWIH+    +I  +   C D NE C  WA  GEC KNP YM+G+    G CR SCK C
Subjt:  GSCPVIEGEKWSATKWIHMLPIYEIWRNPA-CVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC

AT3G28480.1 Oxoglutarate/iron-dependent oxygenase (e value: 3.2e-109; identity: 59.31%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET
        M SR FLAFS+CFL   P  S + NR   L   +   + SVI+MKT   +   DPTRV +LS  PR FLY+GFLS+EEC H I LAKGKL +S+VA  ++
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET

Query:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK
        GESV SE RTS+GMFL K QD+IV+ +E+++AAWTFLP +NGE +QIL YENGQKYEPHFD+F D AN+ +GGHRIAT+LMYLS+VEKGGETVFP    K
Subjt:  GESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVK

Query:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGECEKNPTYMM
         ++ +    ++CAK GY V+P+ GDALLFF+L+PN T D+ S HGSCPV+EGEKWSAT+WIH+      + +   C+DEN  C  WA AGEC+KNPTYM+
Subjt:  LSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVCS
        GS  + GYCR SCK CS
Subjt:  GSKNELGYCRMSCKVCS

AT3G28480.2 Oxoglutarate/iron-dependent oxygenase (e value: 5.0e-102; identity: 55.69%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET
        M SR FLAFS+CFL   P  S + NR   L   +   + SVI+MKT   +   DPTRV +LS  PR FLY+GFLS+EEC H I LAKGKL +S+VA  ++
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AET

Query:  GESVTSEERTS----TGMFLKKAQ----DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGET
        GESV SE+  S    +  F+        D+IV+ +E+++AAWTFLP +NGE +QIL YENGQKYEPHFD+F D AN+ +GGHRIAT+LMYLS+VEKGGET
Subjt:  GESVTSEERTS----TGMFLKKAQ----DEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGET

Query:  VFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGEC
        VFP    K ++ +    ++CAK GY V+P+ GDALLFF+L+PN T D+ S HGSCPV+EGEKWSAT+WIH+      + +   C+DEN  C  WA AGEC
Subjt:  VFPNSPVKLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIW-RNPACVDENEKCRAWANAGEC

Query:  EKNPTYMMGSKNELGYCRMSCKVCS
        +KNPTYM+GS  + GYCR SCK CS
Subjt:  EKNPTYMMGSKNELGYCRMSCKVCS

AT3G28490.1 Oxoglutarate/iron-dependent oxygenase (e value: 1.7e-97; identity: 55.38%)
Query:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAA--E
        M S++FLAFS+  L  F   S                            + ++DPTR+ +LS  PRAFLYKGFLS+EEC HLI LAKGKL +S+V A  +
Subjt:  MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAA--E

Query:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV
        +GES  SE RTS+GMFL K QD+IVA +E+++AAWTFLP +NGE +QIL YENGQKY+PHFD+F D   + +GGHRIAT+LMYLS+V KGGETVFPN   
Subjt:  TGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV

Query:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM
        K  + +    S CAK GY V+P+ GDALLFF+L+ N T D  S HGSCPVIEGEKWSAT+WIH+    +  +   CVD++E C+ WA+AGECEKNP YM+
Subjt:  KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMM

Query:  GSKNELGYCRMSCKVC
        GS+  LG+CR SCK C
Subjt:  GSKNELGYCRMSCKVC

AT5G18900.1 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein (e value: 3.7e-89; identity: 56.67%)
Query:  AITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILR
        ++ ++P++V ++SSKPRAF+Y+GFL+E EC H+++LAK  L +S VA  ++GES  SE RTS+G F+ K +D IV+ IE +I+ WTFLP +NGE IQ+LR
Subjt:  AITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVA-AETGESVTSEERTSTGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILR

Query:  YENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV---KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGS
        YE+GQKY+ HFD+F D  NI  GGHR+ATILMYLS+V KGGETVFP++ +   ++  + K DLSDCAK G  V+P+ GDALLFF+L+P+  PD  S HG 
Subjt:  YENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPV---KLSEQEKADLSDCAKIGYGVRPKMGDALLFFSLNPNVTPDATSYHGS

Query:  CPVIEGEKWSATKWIHMLPIYEI-WRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC
        CPVIEGEKWSATKWIH+     I   +  C D NE C  WA  GEC KNP YM+G+    GYCR SCK C
Subjt:  CPVIEGEKWSATKWIHMLPIYEI-WRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVC
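
In the alignments above, the unlabeled middle line annotates each column. Assuming the usual BLAST-style pairwise convention (a letter marks an identical residue, `+` a positive-scoring substitution, and a space anything else), identity within one block could be tallied roughly as follows. `tally_block` is a hypothetical helper; the % identity values reported for each hit are computed by the search tool over the whole alignment, so a per-block figure need not match them.

```python
def tally_block(query, midline):
    """Count identical and positive columns in one alignment block.

    Assumes a BLAST-style midline: a letter marks an identical residue,
    '+' a positive-scoring substitution, and a space anything else.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have equal length")
    identical = sum(1 for ch in midline if ch.isalpha())
    positive = identical + midline.count("+")
    return identical, positive, len(query)

# Final block of the XP_004147455.1 alignment shown above.
query = "GSKNELGYCRMSCKVCSP"
midline = "GSKNELG+CR SCKVCSP"
identical, positive, columns = tally_block(query, midline)
print(f"{identical}/{columns} identical, {positive}/{columns} positive")
```

Trailing spaces on a midline may have been lost in a copied page, so padding the midline to the query length before calling the helper may be necessary.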


Sequences
CDS sequence
ATGGCTTCTCGATTTTTTCTCGCATTTTCTATCTGTTTCCTTTGCTTCTTCCCCTTCTTTTCTCGCTCCACAAATCGCTTGCCGAAATTGCTCTTACACAACATGAACAT
GGAACAATCTGTTATTAGGATGAAAACGGGTGGTCCCGCCATTACCATCGATCCCACTCGTGTAATTAAGCTTTCATCCAAACCCAGGGCTTTCTTATATAAGGGATTTT
TGTCTGAAGAGGAGTGCCATCATCTTATCAATTTGGCGAAAGGTAAGCTACATCAATCATTGGTAGCGGCTGAAACGGGTGAGAGTGTTACAAGTGAAGAACGGACAAGT
ACCGGCATGTTTCTTAAGAAGGCTCAGGATGAAATAGTTGCTCGCATTGAGTCCAGGATTGCTGCGTGGACCTTCCTTCCCATTGATAATGGGGAGCCTATTCAAATACT
AAGGTATGAGAACGGTCAGAAATACGAGCCACATTTTGATTTTTTTCAAGACCCAGCTAATATAGCTGTTGGAGGTCACCGGATAGCCACAATCTTGATGTATTTGTCCG
ATGTTGAAAAGGGTGGAGAAACAGTCTTTCCCAATTCTCCGGTGAAATTATCCGAGCAGGAGAAGGCTGACTTGTCTGATTGCGCTAAGATTGGTTATGGAGTAAGACCA
AAGATGGGTGATGCTTTACTGTTCTTCAGTCTGAATCCAAATGTGACGCCAGACGCGACCAGCTATCACGGGAGCTGCCCGGTAATAGAGGGTGAGAAATGGTCTGCGAC
TAAATGGATTCACATGCTTCCAATCTATGAAATTTGGAGGAATCCAGCTTGTGTGGACGAGAATGAGAAGTGTAGAGCGTGGGCAAATGCAGGTGAGTGTGAAAAGAATC
CTACTTATATGATGGGTTCTAAGAATGAACTTGGATATTGTAGGATGAGTTGCAAAGTGTGCTCTCCTCCCTCGTAA
mRNA sequence
ATGGCTTCTCGATTTTTTCTCGCATTTTCTATCTGTTTCCTTTGCTTCTTCCCCTTCTTTTCTCGCTCCACAAATCGCTTGCCGAAATTGCTCTTACACAACATGAACAT
GGAACAATCTGTTATTAGGATGAAAACGGGTGGTCCCGCCATTACCATCGATCCCACTCGTGTAATTAAGCTTTCATCCAAACCCAGGGCTTTCTTATATAAGGGATTTT
TGTCTGAAGAGGAGTGCCATCATCTTATCAATTTGGCGAAAGGTAAGCTACATCAATCATTGGTAGCGGCTGAAACGGGTGAGAGTGTTACAAGTGAAGAACGGACAAGT
ACCGGCATGTTTCTTAAGAAGGCTCAGGATGAAATAGTTGCTCGCATTGAGTCCAGGATTGCTGCGTGGACCTTCCTTCCCATTGATAATGGGGAGCCTATTCAAATACT
AAGGTATGAGAACGGTCAGAAATACGAGCCACATTTTGATTTTTTTCAAGACCCAGCTAATATAGCTGTTGGAGGTCACCGGATAGCCACAATCTTGATGTATTTGTCCG
ATGTTGAAAAGGGTGGAGAAACAGTCTTTCCCAATTCTCCGGTGAAATTATCCGAGCAGGAGAAGGCTGACTTGTCTGATTGCGCTAAGATTGGTTATGGAGTAAGACCA
AAGATGGGTGATGCTTTACTGTTCTTCAGTCTGAATCCAAATGTGACGCCAGACGCGACCAGCTATCACGGGAGCTGCCCGGTAATAGAGGGTGAGAAATGGTCTGCGAC
TAAATGGATTCACATGCTTCCAATCTATGAAATTTGGAGGAATCCAGCTTGTGTGGACGAGAATGAGAAGTGTAGAGCGTGGGCAAATGCAGGTGAGTGTGAAAAGAATC
CTACTTATATGATGGGTTCTAAGAATGAACTTGGATATTGTAGGATGAGTTGCAAAGTGTGCTCTCCTCCCTCGTAA
Protein sequence
MASRFFLAFSICFLCFFPFFSRSTNRLPKLLLHNMNMEQSVIRMKTGGPAITIDPTRVIKLSSKPRAFLYKGFLSEEECHHLINLAKGKLHQSLVAAETGESVTSEERTS
TGMFLKKAQDEIVARIESRIAAWTFLPIDNGEPIQILRYENGQKYEPHFDFFQDPANIAVGGHRIATILMYLSDVEKGGETVFPNSPVKLSEQEKADLSDCAKIGYGVRP
KMGDALLFFSLNPNVTPDATSYHGSCPVIEGEKWSATKWIHMLPIYEIWRNPACVDENEKCRAWANAGECEKNPTYMMGSKNELGYCRMSCKVCSPPS
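
The protein sequence above is expected to correspond to the CDS under the standard genetic code. A minimal sketch for checking this on fragments of the CDS is given below; the codon table is the standard one, and `translate` is a hypothetical helper rather than anything provided by CuGenDB.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks from a wrapped sequence
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First five codons and last four codons of the CDS shown above.
print(translate("ATGGCTTCTCGATTT"))
print(translate("CCTCCCTCGTAA"))
```

The two fragments translate to `MASRF` and `PPS`, which agree with the start and end of the listed protein sequence. Passing the full wrapped CDS as one string would allow a complete comparison.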