CuGenDBv2

Spg027361 (gene) of Sponge gourd (cylindrica) v1 genome

Gene ID: Spg027361
Organism: Luffa cylindrica (Sponge gourd (cylindrica) v1)
Description: Nucleolar protein 58-like
Genome location: scaffold7:43549406..43551667
RNA-Seq Expression: Spg027361
Synteny: Spg027361
Gene Ontology terms: NA
InterPro domains: NA


Homology
GenBank top hits (e value, % identity, alignment)
PIN01433.1 hypothetical protein CDL12_26059 [Handroanthus impetiginosus] | e value: 1.3e-22 | % identity: 36.98
Query:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRN--PSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        F A P+  VLPLVREFYA   E      +VRG+ V F SV IN +Y I     P   D   N   +    +E  + +   G QWK ++ +  S   + L 
Subjt:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRN--PSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTID
          + +WL FI  R++PT H   ++ DR +LLYC+M G   +VG II D I+         L+F SLIT+LC R  +   + EE  F +  ID
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTID

PON35554.1 hypothetical protein PanWU01x14_335450, partial [Parasponia andersonii] | e value: 1.5e-28 | % identity: 35.87
Query:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        + FCAHP++ ++PLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I N +   +   L+ VA  G +W  S     + + S L 
Subjt:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-
          + VW HF+K+ L+PTTH  T+S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+  +     +EE+      ID   + ++ 
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-

Query:  QHNSIQRKDKASTSQATPPTGSN
        Q    +   + S+S+  P T S+
Subjt:  QHNSIQRKDKASTSQATPPTGSN

PON46472.1 hypothetical protein PanWU01x14_251180, partial [Parasponia andersonii] | e value: 1.1e-29 | % identity: 36.32
Query:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        + FCAHP++ ++PLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I+N + + +   L+ VA  G +W  S     + + S L 
Subjt:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-
          + VW HF+K+RL+PTTH  T+S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+  +     +EE+      ID   + ++ 
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-

Query:  QHNSIQRKDKASTSQATPPTGSN
        Q    +   + S+S+  P T S+
Subjt:  QHNSIQRKDKASTSQATPPTGSN

PON70375.1 hypothetical protein PanWU01x14_080440 [Parasponia andersonii] | e value: 7.1e-26 | % identity: 36.42
Query:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLE
        FCAHP++ ++PLVREFY  +         +RG  V  S   IN ++ +  P++   ++ + + +  ++   L+ VA  G +W  S     + + S L   
Subjt:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLE

Query:  SAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVK
        + VW HF+K+RL+PTTH  T+S + V LLY ++ G  INVG +I  EI AC  +++G LFF SLIT +C+  +
Subjt:  SAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVK

PON78020.1 hypothetical protein PanWU01x14_023740 [Parasponia andersonii] | e value: 5.1e-24 | % identity: 35.07
Query:  LPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLESAVWLHFIK
        LPLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I N +  ++   L+ VA  G +W  S     + + S L   + VW HF+K
Subjt:  LPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLESAVWLHFIK

Query:  NRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-QHNSIQRKDKA
        +RL+PTTH   +S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+    +   +EE+      ID   + ++ Q    +   + 
Subjt:  NRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-QHNSIQRKDKA

Query:  STSQATPPTGS
        S+S+    + S
Subjt:  STSQATPPTGS

TrEMBL top hits (e value, % identity, alignment)
A0A2G9G807 Uncharacterized protein | e value: 6.1e-23 | % identity: 36.98
Query:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRN--PSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        F A P+  VLPLVREFYA   E      +VRG+ V F SV IN +Y I     P   D   N   +    +E  + +   G QWK ++ +  S   + L 
Subjt:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRN--PSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTID
          + +WL FI  R++PT H   ++ DR +LLYC+M G   +VG II D I+         L+F SLIT+LC R  +   + EE  F +  ID
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTID

A0A2P5AGA5 Uncharacterized protein (Fragment) | e value: 7.4e-29 | % identity: 35.87
Query:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        + FCAHP++ ++PLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I N +   +   L+ VA  G +W  S     + + S L 
Subjt:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-
          + VW HF+K+ L+PTTH  T+S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+  +     +EE+      ID   + ++ 
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-

Query:  QHNSIQRKDKASTSQATPPTGSN
        Q    +   + S+S+  P T S+
Subjt:  QHNSIQRKDKASTSQATPPTGSN

A0A2P5BCG4 Uncharacterized protein (Fragment) | e value: 5.1e-30 | % identity: 36.32
Query:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK
        + FCAHP++ ++PLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I+N + + +   L+ VA  G +W  S     + + S L 
Subjt:  EDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLK

Query:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-
          + VW HF+K+RL+PTTH  T+S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+  +     +EE+      ID   + ++ 
Subjt:  LESAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-

Query:  QHNSIQRKDKASTSQATPPTGSN
        Q    +   + S+S+  P T S+
Subjt:  QHNSIQRKDKASTSQATPPTGSN

A0A2P5DAQ2 Uncharacterized protein | e value: 3.4e-26 | % identity: 36.42
Query:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLE
        FCAHP++ ++PLVREFY  +         +RG  V  S   IN ++ +  P++   ++ + + +  ++   L+ VA  G +W  S     + + S L   
Subjt:  FCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLE

Query:  SAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVK
        + VW HF+K+RL+PTTH  T+S + V LLY ++ G  INVG +I  EI AC  +++G LFF SLIT +C+  +
Subjt:  SAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVK

A0A2P5DXM3 Uncharacterized protein | e value: 2.5e-24 | % identity: 35.07
Query:  LPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLESAVWLHFIK
        LPLVREFYA L +   +   VRG  VS+S   IN V+ +  P++   ++ I N +  ++   L+ VA  G +W  S     + + S L   + VW HF+K
Subjt:  LPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLESAVWLHFIK

Query:  NRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-QHNSIQRKDKA
        +RL+PTTH   +S DR++LL+ ++ G  INVG +I  EI AC  ++ G LFF SLIT+LC+    +   +EE+      ID   + ++ Q    +   + 
Subjt:  NRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKL-QHNSIQRKDKA

Query:  STSQATPPTGS
        S+S+    + S
Subjt:  STSQATPPTGS

SwissProt top hits
No hits found

Arabidopsis top hits
No hits found

Sequences
CDS sequence
ATGCCGAGTTCATCCTCTTCGGCAATGCCGGCCACGTCGAGGGAGATGCCGAGTTCATCTACACCGAGACGGTTCACGCGCGCCACTGCTGTCTGCCAAACCCAAAAACC
CGCCACTCAACAGTTTAGAAAACGTTCGCGGGAATGGTTTGCAATGATCCGGGAGATAGGTGCTCAGAGACGTGTTGCCCTTGAAGAGGAAGGGAATCAGCAAGATGAAA
AAGAAGCCACCAAGGCAGCTGAAAGCTCTCTGCAAGGAGAAGCTTCAATGGGTAAGGTTTCCGAACCTTCAATTAACCCCTCTTTTTCTTGCAGGACCAAACCCGTTGTT
ACTTATAGTGCAAGAAAGAGGAGCCCGAAGAAGGTTGTGTCTGAAAAGCCGCTTGAGATTAAGCCCCTAAAAACCGCAAGGTTGCCTTCGGATGTATTCGAAGGAATAAT
TCGCCAAGCAGTGGCAAAAGCCCTTGTGATTGCTAAAGGGTATAAGGCTGAACAGGATGCTTTGAAAGAGGTTGAAGCTGAGAGAGAGATGGAAAATCAGAAAATGGCTG
AGGAAGATGAGTTTGCAAAGGAAAGAGATGAGAAAGATGAGAAAAGAAGGGGAAAAGAAGAACAAGAGGTCGAGAGGGCCTTAGAAGCTAAGGAAGAGAGAAAGTATGAG
GAAAATCTCAAGAGGGTAGCTATGGATTTGCAACTCCTTGAGGAAGAGAAAAAAAGAAGAGAAGAATTAAAAGAAGATGAAAAAAGAAGGAAAGAAGTTGAAGACTTCCT
TGCAGCCTTTGAGCCACTCCACAAGGCTCAAAGTGAAGCTGAAGCGCTGCAAGGGAGGGTAGAACAAGAGGCCCAACAGGGGCCAACTAAAGAAATTTTAGACGAAGAAA
AAGAAAGAGAAGTGGAGAATGAAGGCCAGAATGCGACTGCATCTGGGCCGCATTCTGAAGGAGGCCTAGCCGAGGCCACTATCGATCAGCCAGCTGAAGAGGTTTTTGAG
CCTCTATTCACGAATAACCCACCGGCTGTTGATAGCACCTCTTCGGGAGAGAAGAGGAACGAAGAGGAACAGGAAGTTGAGAAGGCCGAGACCTCCACTGACTCTGATAC
AGAATCTGATTCAGAGATTAGGGAACTAGATGATGACCAAGTCCCTATCTCTGCAGCGTTGAGAAGAAAGAGGAAGAGAGAGATTAAGGCTGAGAGGAGGACAAAGAACA
AGAATGACTCGATATTTGCCAAGAGGTCGAGGATAAGGTCCATGGACGCCTCTCCTGCAGTTCCTCCGACCGTCTCACCCGCGAAGCCAAAGGGTAAATCACCCAAGGCC
ACATCTCCCAGAAATCCATTGCTTGAGGACTTCTGTGCTCACCCTCAGGAGGCTGTTTTGCCTTTAGTGCGAGAGTTTTACGCCGGCCTGAGGGAGGAGAGCATTAGCAT
GGCGGTGGTGAGGGGGAAGATGGTCAGTTTCTCCTCAGTCGACATTAACAGGGTGTACAGGATCAAGGCGCCCTTGAACCCAAGAGGGAACGACGTTATCAGGAACCCTT
CGGCCAAGCAAATGAAGGAAGCATTGAAACTTGTGGCCAACAAGGGGGTCCAATGGAAAGAATCACAGACGAAAGTGAAGTCTTTAGTGCCAAGCGATCTAAAGCTAGAA
TCAGCAGTTTGGCTTCACTTCATAAAAAACCGTTTGATGCCAACCACCCACGATAGCACGATTTCAGTGGATAGAGTGATGCTACTCTATTGCGTTATGAAGGGGTTGGA
AATCAACGTAGGGAGCATTATCAGGGACGAAATTTTAGCCTGTGGGAGAAAGCGAGCAGGCAAGCTTTTCTTTGGATCACTCATCACTCAGCTTTGTCAAAGGGTGAAGA
TCATTCCGGGCAAGGACGAGGAGCGTCACTTCTTCAAGCCGACCATTGACCTGTCCTTGATTGGAAAGCTCCAGCATAATAGCATCCAGAGGAAAGACAAAGCCTCTACA
TCTCAGGCTACTCCACCTACAGGGTCGAATGTAGCTTCTCCATCCCAGCACACTCCTTTCACAGGGCCATCACCATCATCGGAGGCCCTAGCCATTGCCTACCGCCAGAT
AGATCAACTCAGGGAGAACCTGAGAACGTATTAG
mRNA sequence
ATGCCGAGTTCATCCTCTTCGGCAATGCCGGCCACGTCGAGGGAGATGCCGAGTTCATCTACACCGAGACGGTTCACGCGCGCCACTGCTGTCTGCCAAACCCAAAAACC
CGCCACTCAACAGTTTAGAAAACGTTCGCGGGAATGGTTTGCAATGATCCGGGAGATAGGTGCTCAGAGACGTGTTGCCCTTGAAGAGGAAGGGAATCAGCAAGATGAAA
AAGAAGCCACCAAGGCAGCTGAAAGCTCTCTGCAAGGAGAAGCTTCAATGGGTAAGGTTTCCGAACCTTCAATTAACCCCTCTTTTTCTTGCAGGACCAAACCCGTTGTT
ACTTATAGTGCAAGAAAGAGGAGCCCGAAGAAGGTTGTGTCTGAAAAGCCGCTTGAGATTAAGCCCCTAAAAACCGCAAGGTTGCCTTCGGATGTATTCGAAGGAATAAT
TCGCCAAGCAGTGGCAAAAGCCCTTGTGATTGCTAAAGGGTATAAGGCTGAACAGGATGCTTTGAAAGAGGTTGAAGCTGAGAGAGAGATGGAAAATCAGAAAATGGCTG
AGGAAGATGAGTTTGCAAAGGAAAGAGATGAGAAAGATGAGAAAAGAAGGGGAAAAGAAGAACAAGAGGTCGAGAGGGCCTTAGAAGCTAAGGAAGAGAGAAAGTATGAG
GAAAATCTCAAGAGGGTAGCTATGGATTTGCAACTCCTTGAGGAAGAGAAAAAAAGAAGAGAAGAATTAAAAGAAGATGAAAAAAGAAGGAAAGAAGTTGAAGACTTCCT
TGCAGCCTTTGAGCCACTCCACAAGGCTCAAAGTGAAGCTGAAGCGCTGCAAGGGAGGGTAGAACAAGAGGCCCAACAGGGGCCAACTAAAGAAATTTTAGACGAAGAAA
AAGAAAGAGAAGTGGAGAATGAAGGCCAGAATGCGACTGCATCTGGGCCGCATTCTGAAGGAGGCCTAGCCGAGGCCACTATCGATCAGCCAGCTGAAGAGGTTTTTGAG
CCTCTATTCACGAATAACCCACCGGCTGTTGATAGCACCTCTTCGGGAGAGAAGAGGAACGAAGAGGAACAGGAAGTTGAGAAGGCCGAGACCTCCACTGACTCTGATAC
AGAATCTGATTCAGAGATTAGGGAACTAGATGATGACCAAGTCCCTATCTCTGCAGCGTTGAGAAGAAAGAGGAAGAGAGAGATTAAGGCTGAGAGGAGGACAAAGAACA
AGAATGACTCGATATTTGCCAAGAGGTCGAGGATAAGGTCCATGGACGCCTCTCCTGCAGTTCCTCCGACCGTCTCACCCGCGAAGCCAAAGGGTAAATCACCCAAGGCC
ACATCTCCCAGAAATCCATTGCTTGAGGACTTCTGTGCTCACCCTCAGGAGGCTGTTTTGCCTTTAGTGCGAGAGTTTTACGCCGGCCTGAGGGAGGAGAGCATTAGCAT
GGCGGTGGTGAGGGGGAAGATGGTCAGTTTCTCCTCAGTCGACATTAACAGGGTGTACAGGATCAAGGCGCCCTTGAACCCAAGAGGGAACGACGTTATCAGGAACCCTT
CGGCCAAGCAAATGAAGGAAGCATTGAAACTTGTGGCCAACAAGGGGGTCCAATGGAAAGAATCACAGACGAAAGTGAAGTCTTTAGTGCCAAGCGATCTAAAGCTAGAA
TCAGCAGTTTGGCTTCACTTCATAAAAAACCGTTTGATGCCAACCACCCACGATAGCACGATTTCAGTGGATAGAGTGATGCTACTCTATTGCGTTATGAAGGGGTTGGA
AATCAACGTAGGGAGCATTATCAGGGACGAAATTTTAGCCTGTGGGAGAAAGCGAGCAGGCAAGCTTTTCTTTGGATCACTCATCACTCAGCTTTGTCAAAGGGTGAAGA
TCATTCCGGGCAAGGACGAGGAGCGTCACTTCTTCAAGCCGACCATTGACCTGTCCTTGATTGGAAAGCTCCAGCATAATAGCATCCAGAGGAAAGACAAAGCCTCTACA
TCTCAGGCTACTCCACCTACAGGGTCGAATGTAGCTTCTCCATCCCAGCACACTCCTTTCACAGGGCCATCACCATCATCGGAGGCCCTAGCCATTGCCTACCGCCAGAT
AGATCAACTCAGGGAGAACCTGAGAACGTATTAG
Protein sequence
MPSSSSSAMPATSREMPSSSTPRRFTRATAVCQTQKPATQQFRKRSREWFAMIREIGAQRRVALEEEGNQQDEKEATKAAESSLQGEASMGKVSEPSINPSFSCRTKPVV
TYSARKRSPKKVVSEKPLEIKPLKTARLPSDVFEGIIRQAVAKALVIAKGYKAEQDALKEVEAEREMENQKMAEEDEFAKERDEKDEKRRGKEEQEVERALEAKEERKYE
ENLKRVAMDLQLLEEEKKRREELKEDEKRRKEVEDFLAAFEPLHKAQSEAEALQGRVEQEAQQGPTKEILDEEKEREVENEGQNATASGPHSEGGLAEATIDQPAEEVFE
PLFTNNPPAVDSTSSGEKRNEEEQEVEKAETSTDSDTESDSEIRELDDDQVPISAALRRKRKREIKAERRTKNKNDSIFAKRSRIRSMDASPAVPPTVSPAKPKGKSPKA
TSPRNPLLEDFCAHPQEAVLPLVREFYAGLREESISMAVVRGKMVSFSSVDINRVYRIKAPLNPRGNDVIRNPSAKQMKEALKLVANKGVQWKESQTKVKSLVPSDLKLE
SAVWLHFIKNRLMPTTHDSTISVDRVMLLYCVMKGLEINVGSIIRDEILACGRKRAGKLFFGSLITQLCQRVKIIPGKDEERHFFKPTIDLSLIGKLQHNSIQRKDKAST
SQATPPTGSNVASPSQHTPFTGPSPSSEALAIAYRQIDQLRENLRTY