CuGenDBv2

Spg021826 (gene) of Sponge gourd (cylindrica) v1 genome

Gene ID: Spg021826
Organism: Luffa cylindrica (Sponge gourd (cylindrica) v1)
Description: SOUL heme-binding protein
Genome location: scaffold2:6862056..6870158
RNA-Seq Expression: Spg021826
Synteny: Spg021826
Gene Ontology terms:
    GO:0110165 - cellular anatomical structure (cellular component)
    GO:0016746 - transferase activity, transferring acyl groups (molecular function)
InterPro domains:
    IPR006917 - SOUL haem-binding protein
    IPR011256 - Regulatory factor, effector binding domain superfamily
    IPR018790 - Protein of unknown function DUF2358


Homology
GenBank top hits | e value | % identity | Alignment
KAF4400066.1 hypothetical protein G4B88_021280 [Cannabis sativa] | 2.3e-202 | 54.68 | shown below
Query:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW
        +  SKW ++  L +QS KKSTV++ RLVDFLY+DL H+FD+QGIDRTAY ++V+F+DPITK+D+I+GYL NIALL+  F+P+F LHWVK+TGPYEITTRW
Subjt:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW

Query:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL
        T VMKFILLPWKPELV TGTSVMGI+PETGKFCSH+D WDS++NNDYFSLEGLW+VFKQ R Y+TP+L +PKYQILK+TANYEVRKY PF +VE + DKL
Subjt:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL

Query:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ
        SG  GFN V+    GY+FGKNS  EKIP                VSIQ+ LP +KD++SLP+P +DT+ LRK+EGGIAAV+KF G PTE+VV +K K L+
Subjt:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ

Query:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPI-PIPM--------ATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKW
          L KDGLKP +  S  LA+YNDP RTWSFVM+  I P+           T +A  SL+   ++P             +GF   KS     L     SK 
Subjt:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPI-PIPM--------ATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKW

Query:  VIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKF
         + S +   +  KST +V+ LVDFLY+DL HVFD+QGIDR  Y + +RF+DPITK++++  YL NI+LL+  F P+F LH+VK+TGP+EITTRWT VM++
Subjt:  VIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKF

Query:  ILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFN
        ++ PWKPE+V+TG S MGI+P TGKF THVD WDS+++N++FSLEGLW V KQ+  ++T + L  PKYQILKR ANYEVRKY P +     +        
Subjt:  ILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFN

Query:  RVASFPDSKQDSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIHGCLLARYNNSARTFSF
              D+ Q +  LR  +GGIAA +KFSG  T+D+ ++K + L   L  DGL+P  GCLL   +  + +++F
Subjt:  RVASFPDSKQDSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIHGCLLARYNNSARTFSF

KAG6587902.1 hypothetical protein SDJN03_16467, partial [Cucurbita argyrosperma subsp. sororia] | 8.2e-144 | 78.79 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV FG RP KS  PT  AQSR      K  IRS L DQS +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY ++VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE  LHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG SIMGI+P TGKF +HVDLWDS+QNNDYFSLE LWDVFKQLRFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV S  D+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGL PI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYN+S RT+SFVMRNEVLIWLEEFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

XP_022933414.1 uncharacterized protein LOC111440839 [Cucurbita moschata] | 1.4e-143 | 78.48 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV  G RP KS  PT  AQSR      K  IRS LADQS +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY D+VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE  LHWVKKTGPYEITTRWTA+MKFILLPWKPELVLTG SIMGI+P TGKF +HVDLWDS+QNNDYFS+E LWDVFKQ RFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV S  D+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGLKPI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYN+S RT+SFVMRNEVLIWLEEFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

XP_022965046.1 uncharacterized protein LOC111465022 [Cucurbita maxima] | 3.9e-146 | 78.48 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV FG RP KS  PT  AQSR +    KW IRS LADQ  +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY ++VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE  LHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG SIMGI+P TGKF +HVDLWDS+QNNDYFS+E LWDVFKQ RFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV SFPD+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGLKPI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYN+S RT+ FVMRNEV+IWL+EFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

XP_023531546.1 uncharacterized protein LOC111793749 [Cucurbita pepo subsp. pepo] | 1.1e-145 | 79.39 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV FG RP KS  PT  AQSR      K  IRS LADQS +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY ++VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE   HWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG SIMGI+P TGKF +HVD+WDS+QNNDYFSLE LWDVFKQ RFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV SF D+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGLKPI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYNNSART+SFVMRNEVLIWLEEFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

TrEMBL top hits | e value | % identity | Alignment
A0A6J1CUY2 uncharacterized protein LOC111014503 isoform X1 | 1.7e-142 | 67.44 | shown below
Query:  MATAQVSLQNFLSIPTVAFGFRPTKSGQPT---------------GRAQSRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHD
        MA  Q+SLQNFLS PT  FGFRP KSG  T                +  +R SKW +R  L DQS  KS VDVDRLVDFLY+DLRH+FDEQGIDRTAY +
Subjt:  MATAQVSLQNFLSIPTVAFGFRPTKSGQPT---------------GRAQSRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHD

Query:  QVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLE
         VRF+DPITK+D I+GY  NI+LLRE F+PEF LHWVK+TGPYEITTRWT VMKF+LLPWKPE + TG S+MGI+PETGKFCSHVD WDS++NNDYFSLE
Subjt:  QVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLE

Query:  GLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFGKNSTKEKI---------------PKVSIQIVL
        GL DVFKQ RFY+TPELESPKY+ILKRTANYEVRKY PF +VET+GDKLSG  GFNT    VAGY+FGKNS KEKI               PKVSIQIVL
Subjt:  GLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFGKNSTKEKI---------------PKVSIQIVL

Query:  PSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPIPI
        PS+KD++SLPDPEQDT+ LRK+EGGIAAVLKF G PTE++VQ+K KEL+  L KDGLKP  +   LLA+YNDP RTWSF+M   + I
Subjt:  PSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPIPI

A0A6J1EZQ2 uncharacterized protein LOC111440839 | 6.7e-144 | 78.48 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV  G RP KS  PT  AQSR      K  IRS LADQS +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY D+VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE  LHWVKKTGPYEITTRWTA+MKFILLPWKPELVLTG SIMGI+P TGKF +HVDLWDS+QNNDYFS+E LWDVFKQ RFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV S  D+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGLKPI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYN+S RT+SFVMRNEVLIWLEEFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

A0A6J1HKM5 uncharacterized protein LOC111465022 | 1.9e-146 | 78.48 | shown below
Query:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD
        A AQVS QNFLSIPTV FG RP KS  PT  AQSR +    KW IRS LADQ  +K TVDVDRLVDF+YDDLRHVFDEQGIDRTAY ++VRF+DPITKYD
Subjt:  AIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKS----KWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYD

Query:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY
         I+GY+LNIALLREFF PE  LHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG SIMGI+P TGKF +HVDLWDS+QNNDYFS+E LWDVFKQ RFY
Subjt:  NITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFY

Query:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH
        ETPELESPKYQILKRTANYEVRKYAPF+V E N  ++SAGFNRV SFPD+KQ D++S+R+MEGGI AVLKFSG+PTED+A+QKAK+L  +LK+DGLKPI+
Subjt:  ETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQ-DSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIH

Query:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS
        GCLLARYN+S RT+ FVMRNEV+IWL+EFS
Subjt:  GCLLARYNNSARTFSFVMRNEVLIWLEEFS

A0A7J6HY64 Very-long-chain 3-oxoacyl-CoA synthase | 1.1e-202 | 54.68 | shown below
Query:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW
        +  SKW ++  L +QS KKSTV++ RLVDFLY+DL H+FD+QGIDRTAY ++V+F+DPITK+D+I+GYL NIALL+  F+P+F LHWVK+TGPYEITTRW
Subjt:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW

Query:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL
        T VMKFILLPWKPELV TGTSVMGI+PETGKFCSH+D WDS++NNDYFSLEGLW+VFKQ R Y+TP+L +PKYQILK+TANYEVRKY PF +VE + DKL
Subjt:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL

Query:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ
        SG  GFN V+    GY+FGKNS  EKIP                VSIQ+ LP +KD++SLP+P +DT+ LRK+EGGIAAV+KF G PTE+VV +K K L+
Subjt:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ

Query:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPI-PIPM--------ATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKW
          L KDGLKP +  S  LA+YNDP RTWSFVM+  I P+           T +A  SL+   ++P             +GF   KS     L     SK 
Subjt:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPI-PIPM--------ATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKW

Query:  VIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKF
         + S +   +  KST +V+ LVDFLY+DL HVFD+QGIDR  Y + +RF+DPITK++++  YL NI+LL+  F P+F LH+VK+TGP+EITTRWT VM++
Subjt:  VIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKF

Query:  ILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFN
        ++ PWKPE+V+TG S MGI+P TGKF THVD WDS+++N++FSLEGLW V KQ+  ++T + L  PKYQILKR ANYEVRKY P +     +        
Subjt:  ILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFN

Query:  RVASFPDSKQDSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIHGCLLARYNNSARTFSF
              D+ Q +  LR  +GGIAA +KFSG  T+D+ ++K + L   L  DGL+P  GCLL   +  + +++F
Subjt:  RVASFPDSKQDSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIHGCLLARYNNSARTFSF

A0A803NX00 Uncharacterized protein | 9.6e-191 | 58.78 | shown below
Query:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW
        +  SKW ++  L +QS KKSTV++ RLVDFLY+DL H+FD+QGIDRTAY ++V+F+DPITK+D+I+GYL NIALL+  F+P+F LHWVK+TGPYEITTRW
Subjt:  SRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRW

Query:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL
        T VMKFILLPWKPELV TGTSVMGI+PETGKFCSH+D WDS++NNDYFSLEGLW+VFKQ R Y+TP+L +PKY+ILK+TANYEVRKY PF +VE + DKL
Subjt:  TAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKL

Query:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ
        SG  GFN V+    GY+FGKNS  EKIP                VSIQ+ LP +KD++SLP+P +DT+ LRK+EGGIAAV+KF G PTE+VV +K K L+
Subjt:  SGFGGFNTVSGLVAGYLFGKNSTKEKIP---------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQ

Query:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPIPIPMATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKWVIRSKLADQ
          L KDGLKP +    LLA+YNDP RTWSFVM         T +A  SL+   ++P             +GF   KS     L     SK  + S +   
Subjt:  YNLKKDGLKPIINGSYLLAQYNDPRRTWSFVMLGPIPIPMATAIAQVSLQNFLSIPTV----------GFGFRPSKSGRPTGLAQSRKSKWVIRSKLADQ

Query:  SCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPEL
        +  KST +V+ LVDFLY+DL HVFD+QGIDR  Y + +RF+DPITK++++  YL NI+LL+  F P+F LH+VK+TGP+EITTRWT VM++++ PWKPE+
Subjt:  SCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPEL

Query:  VLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAP
        V+TG S MGI+P TGKF THVD WDS+++N++FSLEGLW V KQ+  ++T + L  PKYQILKR ANYEVRKY P
Subjt:  VLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDVFKQLRFYETPE-LESPKYQILKRTANYEVRKYAP

SwissProt top hits | e value | % identity | Alignment
Q9SR77 Heme-binding-like protein At3g10130, chloroplastic | 7.7e-12 | 28.96 | shown below
Query:  FYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVS--GLVAGYLFGKNSTKEKI----PKVS------------------------
        F   P+LE+  +++L RT  YE+R+  P+ + ET     +GF  +       ++A YLFGKN+ KEK+    P V+                        
Subjt:  FYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVS--GLVAGYLFGKNSTKEKI----PKVS------------------------

Query:  ----IQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIING-SYLLAQYNDP
            +  V+PS K  S+LP P+  +++++++   I AV+ F G  T+E ++++ +EL+  L+ D    + +G S+ +AQYN P
Subjt:  ----IQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIING-SYLLAQYNDP

Arabidopsis top hits | e value | % identity | Alignment
AT2G37970.1 SOUL heme-binding family protein | 3.3e-10 | 27.6 | shown below
Query:  LESPKYQILKRTANYEVRKYAPFTLVETTGD----KLSGFGGFNTVSGLVAGYLFGKNSTKEKIPK----------------------------------
        +E+PKY + K    YE+R+Y P    E T D    K    GGF  ++  +  +   +N   EKI                                    
Subjt:  LESPKYQILKRTANYEVRKYAPFTLVETTGD----KLSGFGGFNTVSGLVAGYLFGKNSTKEKIPK----------------------------------

Query:  -----------VSIQIVLPS-EKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLLAQYNDP
                   V++Q +LPS  K     P P  + + +++  G    V+KF G  +E VV +KVK+L  +L+KDG K  I G ++LA+YN P
Subjt:  -----------VSIQIVLPS-EKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLLAQYNDP

AT3G10130.1 SOUL heme-binding family protein | 5.5e-13 | 28.96 | shown below
Query:  FYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVS--GLVAGYLFGKNSTKEKI----PKVS------------------------
        F   P+LE+  +++L RT  YE+R+  P+ + ET     +GF  +       ++A YLFGKN+ KEK+    P V+                        
Subjt:  FYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVS--GLVAGYLFGKNSTKEKI----PKVS------------------------

Query:  ----IQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIING-SYLLAQYNDP
            +  V+PS K  S+LP P+  +++++++   I AV+ F G  T+E ++++ +EL+  L+ D    + +G S+ +AQYN P
Subjt:  ----IQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIING-SYLLAQYNDP

AT5G20140.1 SOUL heme-binding family protein | 7.2e-114 | 61.88 | shown below
Query:  STVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG
        STV+++ LV FLY+DL H+FD+QGID+TAY ++V+F+DPITK+D I+GYL NIA L+  F P+F LHW K+TGPYEITTRWT VMKFI LPWKPELV TG
Subjt:  STVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG

Query:  TSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFG
         S+M ++PET KFCSH+D WDS++NNDYFSLEGL DVFKQ R Y+TP+LE+PKYQILKRTANYEVR Y PF +VET GDKLSG  GFN     VAGY+FG
Subjt:  TSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFG

Query:  KNSTKEKIP----------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLL
        KNST EKIP                 VS+QIV+PS KD+SSLP P ++ + L+K+EGG AA +KF G PTE+VVQ K  EL+ +L KDGL+       +L
Subjt:  KNSTKEKIP----------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLL

Query:  AQYNDPRRTWSFVMLGPIPI
        A+YNDP RTW+F+M   + I
Subjt:  AQYNDPRRTWSFVMLGPIPI

AT5G20140.2 SOUL heme-binding family protein | 7.2e-114 | 62.74 | shown below
Query:  STVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG
        STV+++ LV FLY+DL H+FD+QGID+TAY ++V+F+DPITK+D I+GYL NIA L+  F P+F LHW K+TGPYEITTRWT VMKFI LPWKPELV TG
Subjt:  STVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLREFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTG

Query:  TSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFG
         S+M ++PET KFCSH+D WDS++NNDYFSLEGL DVFKQ R Y+TP+LE+PKYQILKRTANYEVR Y PF +VET GDKLSG  GFN     VAGY+FG
Subjt:  TSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRKYAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFG

Query:  KNSTKEKIP----------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLL
        KNST EKIP                 VS+QIV+PS KD+SSLP P ++ + L+K+EGG AA +KF G PTE+VVQ K  EL+ +L KDGL+       +L
Subjt:  KNSTKEKIP----------------KVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPIINGSYLL

Query:  AQYNDPRRTWSFVM
        A+YNDP RTW+F+M
Subjt:  AQYNDPRRTWSFVM


Sequences
CDS sequence
ATGGCCACTGCCCAAGTTTCCCTCCAAAACTTCCTCTCAATCCCAACCGTTGCTTTCGGTTTCCGGCCGACCAAATCCGGCCAACCAACCGGCCGAGCACAGAGCAGAAA
GTCAAAGTGGGTCATTCGATCGAAATTGGCGGATCAAAGCTGTAAGAAATCGACGGTGGACGTTGACCGATTGGTGGATTTCTTGTACGACGATCTCCGCCATGTGTTCG
ACGAACAGGGGATCGATCGGACGGCGTACCACGACCAAGTGAGATTCCAAGACCCAATCACAAAATATGATAACATCACTGGGTATTTGCTGAATATTGCCCTCTTGCGA
GAATTCTTCAAGCCTGAGTTCACATTGCACTGGGTCAAGAAGACTGGACCATATGAAATAACTACGAGATGGACAGCGGTAATGAAGTTCATCCTTCTTCCATGGAAACC
AGAATTAGTTTTGACAGGAACTTCCGTTATGGGTATCGATCCAGAGACGGGCAAGTTCTGTAGCCATGTGGATCGTTGGGATTCAGTTCGAAATAATGACTACTTTTCTC
TAGAAGGGCTTTGGGATGTATTTAAACAGTTTAGGTTTTATGAGACTCCAGAATTGGAATCGCCCAAATATCAGATACTGAAAAGGACTGCAAATTATGAGGTGAGAAAA
TATGCACCATTTACACTGGTTGAAACAACTGGGGACAAGCTCTCTGGGTTTGGTGGATTCAATACGGTTTCAGGGTTAGTTGCAGGGTATTTATTTGGGAAGAACTCTAC
AAAGGAGAAGATACCCAAAGTGTCCATCCAAATAGTTCTTCCTTCAGAAAAAGATATGAGCAGTTTACCAGATCCTGAACAAGACACAATGAGGTTGAGAAAGATTGAAG
GAGGAATTGCTGCAGTGCTGAAATTCGGTGGAAACCCCACTGAAGAAGTGGTGCAACAGAAGGTGAAAGAATTACAATATAATCTCAAAAAGGATGGTCTTAAACCCATA
ATTAATGGCAGCTATTTGCTTGCTCAGTACAACGACCCTCGACGAACATGGAGCTTTGTAATGTTAGGTCCCATTCCAATTCCAATGGCCACTGCCATTGCCCAAGTTTC
CCTCCAAAACTTCCTCTCAATCCCAACCGTTGGATTCGGTTTCCGGCCGAGCAAATCCGGCCGACCAACCGGGCTCGCACAGAGCAGAAAGTCAAAGTGGGTTATTCGAT
CAAAATTGGCAGATCAAAGCTGTAAGAAATCGACGGTGGACGTTGACCGATTGGTGGATTTCTTGTACGACGATCTCCGCCATGTGTTCGACGAACAGGGGATCGATCGG
ACGGCGTACCACGACCAAGTGAGATTCCAAGACCCAATCACAAAATATGATAACATCACTGGGTATTTGCTGAATATTGCCCTCTTGCGAGAATTCTTCAACCCTGAGTT
CACATTGCATTGGGTCAAGAAGACTGGACCATATGAAATAACTACAAGATGGACGGCCGTGATGAAGTTCATCCTTCTTCCATGGAAACCAGAATTAGTTTTGACTGGAA
TCTCCATTATGGGCATTGATCCCGACACGGGCAAGTTCCGTACCCACGTGGATCTTTGGGATTCAGTACAAAATAATGACTACTTTTCTCTAGAAGGATTGTGGGATGTA
TTTAAACAGTTGAGATTTTATGAGACTCCAGAATTGGAATCACCCAAGTATCAGATCTTGAAAAGGACTGCTAATTATGAGGTGAGAAAATATGCACCATTTATAGTGTT
TGAAACAAATGAAGACAGGCTTTCTGCTGGATTCAATAGGGTTGCTAGTTTCCCAGATTCTAAACAGGACTCACTCAGCTTGAGAAAGATGGAAGGAGGGATTGCTGCAG
TGTTGAAATTCAGTGGAAATCCCACAGAAGATTTGGCTGAACAAAAGGCAAAGCAATTACTGTATACTCTCAAAAGGGATGGTCTCAAACCCATCCATGGCTGTTTGCTT
GCTCGATACAACAACTCCGCCCGAACTTTCAGCTTTGTAATGAGAAATGAGGTGTTGATATGGCTCGAAGAATTCTCATTTTAG
mRNA sequence
ATGGCCACTGCCCAAGTTTCCCTCCAAAACTTCCTCTCAATCCCAACCGTTGCTTTCGGTTTCCGGCCGACCAAATCCGGCCAACCAACCGGCCGAGCACAGAGCAGAAA
GTCAAAGTGGGTCATTCGATCGAAATTGGCGGATCAAAGCTGTAAGAAATCGACGGTGGACGTTGACCGATTGGTGGATTTCTTGTACGACGATCTCCGCCATGTGTTCG
ACGAACAGGGGATCGATCGGACGGCGTACCACGACCAAGTGAGATTCCAAGACCCAATCACAAAATATGATAACATCACTGGGTATTTGCTGAATATTGCCCTCTTGCGA
GAATTCTTCAAGCCTGAGTTCACATTGCACTGGGTCAAGAAGACTGGACCATATGAAATAACTACGAGATGGACAGCGGTAATGAAGTTCATCCTTCTTCCATGGAAACC
AGAATTAGTTTTGACAGGAACTTCCGTTATGGGTATCGATCCAGAGACGGGCAAGTTCTGTAGCCATGTGGATCGTTGGGATTCAGTTCGAAATAATGACTACTTTTCTC
TAGAAGGGCTTTGGGATGTATTTAAACAGTTTAGGTTTTATGAGACTCCAGAATTGGAATCGCCCAAATATCAGATACTGAAAAGGACTGCAAATTATGAGGTGAGAAAA
TATGCACCATTTACACTGGTTGAAACAACTGGGGACAAGCTCTCTGGGTTTGGTGGATTCAATACGGTTTCAGGGTTAGTTGCAGGGTATTTATTTGGGAAGAACTCTAC
AAAGGAGAAGATACCCAAAGTGTCCATCCAAATAGTTCTTCCTTCAGAAAAAGATATGAGCAGTTTACCAGATCCTGAACAAGACACAATGAGGTTGAGAAAGATTGAAG
GAGGAATTGCTGCAGTGCTGAAATTCGGTGGAAACCCCACTGAAGAAGTGGTGCAACAGAAGGTGAAAGAATTACAATATAATCTCAAAAAGGATGGTCTTAAACCCATA
ATTAATGGCAGCTATTTGCTTGCTCAGTACAACGACCCTCGACGAACATGGAGCTTTGTAATGTTAGGTCCCATTCCAATTCCAATGGCCACTGCCATTGCCCAAGTTTC
CCTCCAAAACTTCCTCTCAATCCCAACCGTTGGATTCGGTTTCCGGCCGAGCAAATCCGGCCGACCAACCGGGCTCGCACAGAGCAGAAAGTCAAAGTGGGTTATTCGAT
CAAAATTGGCAGATCAAAGCTGTAAGAAATCGACGGTGGACGTTGACCGATTGGTGGATTTCTTGTACGACGATCTCCGCCATGTGTTCGACGAACAGGGGATCGATCGG
ACGGCGTACCACGACCAAGTGAGATTCCAAGACCCAATCACAAAATATGATAACATCACTGGGTATTTGCTGAATATTGCCCTCTTGCGAGAATTCTTCAACCCTGAGTT
CACATTGCATTGGGTCAAGAAGACTGGACCATATGAAATAACTACAAGATGGACGGCCGTGATGAAGTTCATCCTTCTTCCATGGAAACCAGAATTAGTTTTGACTGGAA
TCTCCATTATGGGCATTGATCCCGACACGGGCAAGTTCCGTACCCACGTGGATCTTTGGGATTCAGTACAAAATAATGACTACTTTTCTCTAGAAGGATTGTGGGATGTA
TTTAAACAGTTGAGATTTTATGAGACTCCAGAATTGGAATCACCCAAGTATCAGATCTTGAAAAGGACTGCTAATTATGAGGTGAGAAAATATGCACCATTTATAGTGTT
TGAAACAAATGAAGACAGGCTTTCTGCTGGATTCAATAGGGTTGCTAGTTTCCCAGATTCTAAACAGGACTCACTCAGCTTGAGAAAGATGGAAGGAGGGATTGCTGCAG
TGTTGAAATTCAGTGGAAATCCCACAGAAGATTTGGCTGAACAAAAGGCAAAGCAATTACTGTATACTCTCAAAAGGGATGGTCTCAAACCCATCCATGGCTGTTTGCTT
GCTCGATACAACAACTCCGCCCGAACTTTCAGCTTTGTAATGAGAAATGAGGTGTTGATATGGCTCGAAGAATTCTCATTTTAG
Protein sequence
MATAQVSLQNFLSIPTVAFGFRPTKSGQPTGRAQSRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDRTAYHDQVRFQDPITKYDNITGYLLNIALLR
EFFKPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGTSVMGIDPETGKFCSHVDRWDSVRNNDYFSLEGLWDVFKQFRFYETPELESPKYQILKRTANYEVRK
YAPFTLVETTGDKLSGFGGFNTVSGLVAGYLFGKNSTKEKIPKVSIQIVLPSEKDMSSLPDPEQDTMRLRKIEGGIAAVLKFGGNPTEEVVQQKVKELQYNLKKDGLKPI
INGSYLLAQYNDPRRTWSFVMLGPIPIPMATAIAQVSLQNFLSIPTVGFGFRPSKSGRPTGLAQSRKSKWVIRSKLADQSCKKSTVDVDRLVDFLYDDLRHVFDEQGIDR
TAYHDQVRFQDPITKYDNITGYLLNIALLREFFNPEFTLHWVKKTGPYEITTRWTAVMKFILLPWKPELVLTGISIMGIDPDTGKFRTHVDLWDSVQNNDYFSLEGLWDV
FKQLRFYETPELESPKYQILKRTANYEVRKYAPFIVFETNEDRLSAGFNRVASFPDSKQDSLSLRKMEGGIAAVLKFSGNPTEDLAEQKAKQLLYTLKRDGLKPIHGCLL
ARYNNSARTFSFVMRNEVLIWLEEFSF