CuGenDBv2

Moc08g40540 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc08g40540
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: chr8:31092810..31098825
RNA-Seq Expression: Moc08g40540
Synteny: Moc08g40540
Gene Ontology terms: GO:0016021 - integral component of membrane (cellular component)
InterPro domains: IPR005162 - Retrotransposon gag domain
IPR007658 - Protein of unknown function DUF594
IPR025315 - Domain of unknown function DUF4220
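
The genome location above is given as a `chrom:start..end` string. As a minimal sketch, the snippet below splits such a string and computes the span length. It assumes the coordinates are 1-based and inclusive (the record itself does not state the convention), and the function names `parse_location` and `span_length` are hypothetical helpers, not part of any CuGenDB tooling.

```python
import re
from typing import Tuple


def parse_location(loc: str) -> Tuple[str, int, int]:
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc)
    if match is None:
        raise ValueError(f"unrecognized location string: {loc!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the interval, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_location("chr8:31092810..31098825")
print(chrom, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption, the locus spans 6016 bp on chr8.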


Homology
GenBank top hits (e value, % identity, alignment)
KAG6599136.1 hypothetical protein SDJN03_08914, partial [Cucurbita argyrosperma subsp. sororia] (e value: 3.9e-160; % identity: 53.35)
Query:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV
        NFVF + L     +      RR   G KL L+VW +YLL  K+A +VLGKL TIDIG+  RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVRQV
Subjt:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV

Query:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE
        FSM +QV IMFYIL RSWTD K+                            N GFT ADFF+Y E  K  E+L   NELP+AKLILRAY RFC LKPHLE
Subjt:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE

Query:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY
        NW   P ++ D +KL I+DCEYE+VFRITDSELGFMYD LYTKAPV+Y   GL+LRFIS +SLI TLCGFSVLFKD FVYNI  G IHYVLI  +I+E+Y
Subjt:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY

Query:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE
        QI+++PF+DWAIVQMIRH++TFPIL  LL SLAP SATW RWSN MGQFNL+DFC+  K++N+SRIK+LR WG+DMKLRKQ++L +IEV P+VKELVV E
Subjt:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE

Query:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------
        LRE+EKI+ Q  +EF + G+WT                                             I+R HD   G                       
Subjt:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------

Query:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM
                     C     F R  H N    C  L+++     LQ   P+    A    E+ V+GNW+LLKDV +LANSLL L +NE +W LIGSMW EM
Subjt:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM

Query:  LGYAASKCEMEYHAEHIRQAGELITH
        LGYAASKCEMEYH+EHIRQ GELITH
Subjt:  LGYAASKCEMEYHAEHIRQAGELITH

XP_022158134.1 uncharacterized protein LOC111024693 [Momordica charantia] (e value: 1.1e-196; % identity: 100)
Query:  MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL
        MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL
Subjt:  MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL

Query:  PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN
        PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN
Subjt:  PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN

Query:  IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL
        IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL
Subjt:  IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL

Query:  NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR
        NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR
Subjt:  NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR

XP_022158138.1 uncharacterized protein LOC111024697 [Momordica charantia] (e value: 9.8e-172; % identity: 56.51)
Query:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR
        L N VF I L     +      RR   G +L LSVW AYLL  KVA +VLGKL TIDIG+T+RNT TQIQ LLAPLM MQIG+ DTITAYSIEDN LGVR
Subjt:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR

Query:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP
        QVFSM++QV IMFYIL RSWTDFK+                            N GFT ADFF+Y E     ERL +G +ELP A+LILRAY RFC LKP
Subjt:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP

Query:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL
        HLENW   P ++ D +KLSIEDCEYEEVF+ITDSELGFMYD LYTKAPVVY   GLVLRFIS ISLI T+CGFSVLFKD FVYN+  G+IH+ L T VIL
Subjt:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL

Query:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV
        EVYQIL+LPFSDWAIVQM+RHY+TFPIL  LL+SLAP SATW RWSN MGQFNL+DFCL  K++N+SRIKILR+WGLDMKLRKQL+LGQ+EVH KVKELV
Subjt:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV

Query:  VKELREVEKIRKQDKDEFTKGGEWTIQR---------------------------------------------LHDVPFGCSFPRAIHNNCRYLISIL--
        VKELREVEKI++Q  +EFTK GEWTI+R                                              HD   G     AI N   Y++ +L  
Subjt:  VKELREVEKIRKQDKDEFTKGGEWTIQR---------------------------------------------LHDVPFGCSFPRAIHNNCRYLISIL--

Query:  ------QQDAPNVVDKAKIR------------------------------------------EEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSM
                 A  + D + ++                                           E+ V+GNWNLLKDVNELANSLLTL  NE +W LIGSM
Subjt:  ------QQDAPNVVDKAKIR------------------------------------------EEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSM

Query:  WVEMLGYAASKCEMEYHAEHIRQAGELITH
        WVEMLGYAAS CEMEYH+EHIRQ GELITH
Subjt:  WVEMLGYAASKCEMEYHAEHIRQAGELITH

XP_022999644.1 uncharacterized protein LOC111493941 [Cucurbita maxima] (e value: 2.0e-161; % identity: 53.38)
Query:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV
        NF+F + L     +      RR   G KL L+VW +YLL  K+A +VLGKL TIDIG+  RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVRQV
Subjt:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV

Query:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE
        FSM++QV IMFYIL RSWTD K+                            N GFT ADFF+Y E     ++L   NELP+AKLILRAY RFC LKPHLE
Subjt:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE

Query:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY
        NW   P ++ D +KL I+DCEYE+VFRITDSELGFMYD LYTKAPV+Y   GL+LRFIS +SLI TLCGFSVLFKD FVYNI  G IHYVLI  +I+E+Y
Subjt:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY

Query:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE
        QI+++PF+DWAIVQMIRH+ETFPIL  LL SLAP SATW RWSN MGQFNL+DFCL  K++N+SRIK+LR WGLDMKLRKQ++L +I+VHP+VKELVV E
Subjt:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE

Query:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------
        LRE+EKI+ Q  +EF + G+WT                                             I+R HD   G                       
Subjt:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------

Query:  -------------C----SFPRAIHNN----CRYLISILQQDAPNVVDKAKIREEEK-VIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEMLGYA
                     C     F R  H N    C  L+++ ++      +  +  E EK V+GNW+LLKDV +LA+SLL L +NE RW LIGSMW EMLGYA
Subjt:  -------------C----SFPRAIHNN----CRYLISILQQDAPNVVDKAKIREEEK-VIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEMLGYA

Query:  ASKCEMEYHAEHIRQAGELITH
        ASKCEMEYH+EHIRQ GELITH
Subjt:  ASKCEMEYHAEHIRQAGELITH

XP_023545210.1 uncharacterized protein LOC111804689 [Cucurbita pepo subsp. pepo] (e value: 5.0e-160; % identity: 53.04)
Query:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV
        NFVF + L     +      RR   G KL L+VW +YLL  K+A +VLGKL TIDIG+  RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVRQV
Subjt:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV

Query:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE
        FSM++QV IMFYIL RSWTD K+                            N GFT ADFF+Y E  K  E+L   NELP+AKLILRAY RFC LKP LE
Subjt:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE

Query:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY
        NW   P ++ D +KL I+DCEYE+VFRITDSELGFMYD LYTKAPV+Y   GL+LRFIS +SLI TLCGFSVLFKD FVYNI  G IHYVLI  +I+E+Y
Subjt:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY

Query:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE
        QI+++PF+DWAIVQM+RH++TFPIL  LL SLAP SATW RWSN MGQFNL+DFC+  K++N+SRIK+LR WG+DMKLRKQ++L +IEV P+VKELVV E
Subjt:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE

Query:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------
        LRE+EKI+ Q  +EF + G+WT                                             I+R HD   G                       
Subjt:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------

Query:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM
                     C     F R  H N    C  L+++     LQ   P+    A    E+ V+GNW+LLKDV +LANSLL L +NE +W LIGSMW EM
Subjt:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM

Query:  LGYAASKCEMEYHAEHIRQAGELITH
        LGYAASKCEMEYH+EHIRQ GELITH
Subjt:  LGYAASKCEMEYHAEHIRQAGELITH
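
The middle line of each alignment block encodes how the query and subject residues compare. As a rough sketch, the snippet below counts identities and positives for one block, using the final block of the hit above as input. It assumes the usual BLAST-style convention (a letter marks an identical residue, `+` a conservative substitution, a space a mismatch or gap); `midline_stats` is a hypothetical helper name. The result describes this one block only, whereas the % identity reported for the hit covers the whole alignment.

```python
def midline_stats(query: str, midline: str):
    """Count identities and positives from a BLAST-style midline.

    Assumed convention: a letter on the midline marks an identical residue,
    '+' marks a conservative substitution, and a space marks a mismatch or gap.
    """
    if len(query) != len(midline):
        raise ValueError("query and midline must have the same length")
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives, len(query)


query = "LGYAASKCEMEYHAEHIRQAGELITH"
midline = "LGYAASKCEMEYH+EHIRQ GELITH"
ident, pos, length = midline_stats(query, midline)
print(f"{ident}/{length} identical ({100 * ident / length:.1f}%), {pos}/{length} positive")
```

For this 26-residue block the counts are 24 identities and 25 positives.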

TrEMBL top hits (e value, % identity, alignment)
A0A1S4DZ97 uncharacterized protein LOC103493508 (e value: 2.8e-156; % identity: 52.16)
Query:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR
        L NFVF + L     +      RR   G +L L VW +YLL  K+A +VLGKL TIDIG  +RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVR
Subjt:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR

Query:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP
        QVFSM++QV IMFYIL RSWTD K+                            N GFT ADFF+Y E      +L +G NELP+A LILRAY RFC LKP
Subjt:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP

Query:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL
        HLENW   P ++ D  KL I+DC YE+VFRITD ELGFMYD LYTKAPVVY   GL+LRFIS +S+I TLCGFSVLFKD FVYNI  G IH+VLI  +I+
Subjt:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL

Query:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV
        E+YQIL+LPF+DWAIVQM+RH+E FPIL   LRSLAP SATW RWSN MGQFNL+DFCL  K++N+SRIKILR+WG+DMKLRKQL+L +I+VHP+V+E +
Subjt:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV

Query:  VKELREVEKIRKQDKDEFTKGGEWTIQRLHDV----------------PFG-CSFPRAIHNNCRYLI---------------------------------
        V ELRE+E+I+ Q  +EF   G+WTI R                    PF  C F   I  N  Y I                                 
Subjt:  VKELREVEKIRKQDKDEFTKGGEWTIQRLHDV----------------PFG-CSFPRAIHNNCRYLI---------------------------------

Query:  ----------------------------------------SILQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEML
                                                SIL    P+  +  +   E+ V+GNW+L+KDV ELA+ LLTL +NE +W LIGSMW EML
Subjt:  ----------------------------------------SILQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEML

Query:  GYAASKCEMEYHAEHIRQAGELITH
        GYAASKCEMEYH+EHIRQ GELITH
Subjt:  GYAASKCEMEYHAEHIRQAGELITH

A0A6J1DV94 uncharacterized protein LOC111024697 (e value: 4.7e-172; % identity: 56.51)
Query:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR
        L N VF I L     +      RR   G +L LSVW AYLL  KVA +VLGKL TIDIG+T+RNT TQIQ LLAPLM MQIG+ DTITAYSIEDN LGVR
Subjt:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVR

Query:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP
        QVFSM++QV IMFYIL RSWTDFK+                            N GFT ADFF+Y E     ERL +G +ELP A+LILRAY RFC LKP
Subjt:  QVFSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERL-EGGNELPDAKLILRAYCRFCWLKP

Query:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL
        HLENW   P ++ D +KLSIEDCEYEEVF+ITDSELGFMYD LYTKAPVVY   GLVLRFIS ISLI T+CGFSVLFKD FVYN+  G+IH+ L T VIL
Subjt:  HLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVIL

Query:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV
        EVYQIL+LPFSDWAIVQM+RHY+TFPIL  LL+SLAP SATW RWSN MGQFNL+DFCL  K++N+SRIKILR+WGLDMKLRKQL+LGQ+EVH KVKELV
Subjt:  EVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELV

Query:  VKELREVEKIRKQDKDEFTKGGEWTIQR---------------------------------------------LHDVPFGCSFPRAIHNNCRYLISIL--
        VKELREVEKI++Q  +EFTK GEWTI+R                                              HD   G     AI N   Y++ +L  
Subjt:  VKELREVEKIRKQDKDEFTKGGEWTIQR---------------------------------------------LHDVPFGCSFPRAIHNNCRYLISIL--

Query:  ------QQDAPNVVDKAKIR------------------------------------------EEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSM
                 A  + D + ++                                           E+ V+GNWNLLKDVNELANSLLTL  NE +W LIGSM
Subjt:  ------QQDAPNVVDKAKIR------------------------------------------EEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSM

Query:  WVEMLGYAASKCEMEYHAEHIRQAGELITH
        WVEMLGYAAS CEMEYH+EHIRQ GELITH
Subjt:  WVEMLGYAASKCEMEYHAEHIRQAGELITH

A0A6J1DYH9 uncharacterized protein LOC111024693 (e value: 5.6e-197; % identity: 100)
Query:  MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL
        MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL
Subjt:  MTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNEL

Query:  PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN
        PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN
Subjt:  PDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYN

Query:  IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL
        IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL
Subjt:  IGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQL

Query:  NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR
        NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR
Subjt:  NLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQR

A0A6J1G3A8 uncharacterized protein LOC111450396 (e value: 7.1e-160; % identity: 53.19)
Query:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV
        NFVF + L     +      RR   G KL L+VW +YLL  K+A +VLGKL TIDIG+  RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVRQV
Subjt:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV

Query:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE
        FSM +QV IMFYIL RSWTD K+                            N GFT ADFF+Y E  K  E+L   NELP+AKLILRAY RFC LKPHLE
Subjt:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE

Query:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY
        NW   P ++ D +KL I+DCEYE+VFRITDSELGFMYD LYTKAPV+Y   GL+LRFIS +SLI TLCGFSVLFKD FVYNI  G IHYVLI  +I+E+Y
Subjt:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY

Query:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE
        QI+++PF+DWAIVQMIRH++TFPIL  LL SLAP SATW RWSN MGQFNL+DFC+  K++N+SRIK+LR WG+DMKLRKQ++L +IEV P+VKELVV E
Subjt:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE

Query:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------
        LRE+E I+ Q  +EF + G+WT                                             I+R HD   G                       
Subjt:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------

Query:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM
                     C     F R  H N    C  L+++     LQ   P+    A    E+ V+GNW+LLKDV +LANSLL L +NE +W LIGSMW EM
Subjt:  -------------C----SFPRAIHNN----CRYLISI-----LQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEM

Query:  LGYAASKCEMEYHAEHIRQAGELITH
        LGYAASKCEMEYH+EHIRQ GELITH
Subjt:  LGYAASKCEMEYHAEHIRQAGELITH

A0A6J1KHN9 uncharacterized protein LOC111493941 (e value: 9.9e-162; % identity: 53.38)
Query:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV
        NF+F + L     +      RR   G KL L+VW +YLL  K+A +VLGKL TIDIG+  RNT TQ+QALLAPLM MQIG+ DTITAYSIEDN LGVRQV
Subjt:  NFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQV

Query:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE
        FSM++QV IMFYIL RSWTD K+                            N GFT ADFF+Y E     ++L   NELP+AKLILRAY RFC LKPHLE
Subjt:  FSMMMQVVIMFYILGRSWTDFKS----------------------------NSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLE

Query:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY
        NW   P ++ D +KL I+DCEYE+VFRITDSELGFMYD LYTKAPV+Y   GL+LRFIS +SLI TLCGFSVLFKD FVYNI  G IHYVLI  +I+E+Y
Subjt:  NWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVY--PGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVY

Query:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE
        QI+++PF+DWAIVQMIRH+ETFPIL  LL SLAP SATW RWSN MGQFNL+DFCL  K++N+SRIK+LR WGLDMKLRKQ++L +I+VHP+VKELVV E
Subjt:  QILKLPFSDWAIVQMIRHYETFPILWPLLRSLAPTSATWIRWSNKMGQFNLIDFCL-CKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKE

Query:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------
        LRE+EKI+ Q  +EF + G+WT                                             I+R HD   G                       
Subjt:  LREVEKIRKQDKDEFTKGGEWT---------------------------------------------IQRLHDVPFG-----------------------

Query:  -------------C----SFPRAIHNN----CRYLISILQQDAPNVVDKAKIREEEK-VIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEMLGYA
                     C     F R  H N    C  L+++ ++      +  +  E EK V+GNW+LLKDV +LA+SLL L +NE RW LIGSMW EMLGYA
Subjt:  -------------C----SFPRAIHNN----CRYLISILQQDAPNVVDKAKIREEEK-VIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEMLGYA

Query:  ASKCEMEYHAEHIRQAGELITH
        ASKCEMEYH+EHIRQ GELITH
Subjt:  ASKCEMEYHAEHIRQAGELITH

SwissProt top hits (e value, % identity, alignment)
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e value: 3.2e-08; % identity: 21.93)
Query:  KLTGHN-YLQWSQSVKMFMYGRGLEDHIMDKAESPKSNDPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTAEIFQIETT
        K  G N +  W + ++  +  +GL  H +   +S K +  K   W   + +  S +   ++ ++  N +   TA+ IW  +   + +K  T +++     
Subjt:  KLTGHN-YLQWSQSVKMFMYGRGLEDHIMDKAESPKSNDPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTAEIFQIETT

Query:  LQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRKQVMLGSSE
           LK+      ++ S  + +   L++F  +  + ++   L  +  E  +    L  L  S D +   IL  K    +++V   +   E  ++      +
Subjt:  LQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRKQVMLGSSE

Query:  HPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNRQRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNTTRSDRETKGNAAVSETSNQQPHAKELLNM
          +T+      S +   N+     ARG  ++   +R R     C +C++  H K  C           PN     R+ KG     ETS Q+         
Subjt:  HPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNRQRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNTTRSDRETKGNAAVSETSNQQPHAKELLNM

Query:  LQQLLHKTNLSTTVGGSGNVQQG-NQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNF-SIRIANGTPAKVTGIGSI----QISSSLILES
             +  N +  V  + NV    N+    +H     SEW+VD+ AS H T  R LF  +    G+F ++++ N + +K+ GIG I     +  +L+L+ 
Subjt:  LQQLLHKTNLSTTVGGSGNVQQG-NQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNF-SIRIANGTPAKVTGIGSI----QISSSLILES

Query:  ILFVPTLEYNLLSDL
        +  VP L  NL+S +
Subjt:  ILFVPTLEYNLLSDL

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 (e value: 2.1e-15; % identity: 21.9)
Query:  KLTGHNYLQWSQSVKMFMYGRGLEDHIMDKAESPKSN---------DPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTA
        KLT  NYL WS+ V     G  L   +      P +          +P + +W+ ++  + S ++ +++  +       +TA +IWE +R  ++N  +  
Subjt:  KLTGHNYLQWSQSVKMFMYGRGLEDHIMDKAESPKSN---------DPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTA

Query:  EIFQIETTLQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRK
         + Q+ T L+   +G  ++  Y   L   + QL L             L +     +++ + L  L +    V  +I      P++ E+   +   ES+ 
Subjt:  EIFQIETTLQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRK

Query:  QVMLGSSEHPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNR--QRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNTTRSDRETKGNAAVSETSNQ
          +  ++  P+T N  +  +     N+       GN  +  DNR      +PW                      N+ PN  +S    K      +    
Subjt:  QVMLGSSEHPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNR--QRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNTTRSDRETKGNAAVSETSNQ

Query:  QPHAKELLNMLQQLLHKTNLSTTVGGSGNVQQGNQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNFSIRIANGTPAKVTGIGSIQISSS-
        Q H+ +  + LQ  L   N           Q   + +LAL +    + W++DSGA+ H+T D +  S    YTG   + +A+G+   ++  GS  +S+  
Subjt:  QPHAKELLNMLQQLLHKTNLSTTVGGSGNVQQGNQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNFSIRIANGTPAKVTGIGSIQISSS-

Query:  --LILESILFVPTLEYNLLS
          L L +IL+VP +  NL+S
Subjt:  --LILESILFVPTLEYNLLS

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 (e value: 1.6e-12; % identity: 21.62)
Query:  KLTGHNYLQWSQSVKMFMYGRGLEDHIMDKAESPKSN---------DPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTA
        KLT  NYL WS+ V     G  L   +      P +          +P + +WR ++  + S ++ +++  +       +TA +IWE +R  ++N  +  
Subjt:  KLTGHNYLQWSQSVKMFMYGRGLEDHIMDKAESPKSN---------DPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTA

Query:  EIFQIETTLQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRK
         + Q+                    ++R+ Q   L + +D                +++ + L  L      V  +I      PS+ E+   +   ES+ 
Subjt:  EIFQIETTLQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLPSIREVFFEVRREESRK

Query:  QVMLGSSEHPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNRQRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNT--TRSD-RETKGNAAVSETSN
          +  +   P+T N   +++ R + N       RG+ ++Y +N  R                           +W+P++  +RSD R+ K      +  +
Subjt:  QVMLGSSEHPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNRQRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNT--TRSD-RETKGNAAVSETSN

Query:  QQPHAKELLNMLQQLLHKTNLSTTVGGSGNVQQGNQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNFSIRIANGTPAKVTGIGSIQI---
         Q H+ +    L Q    TN   +       Q   + +LA+++    + W++DSGA+ H+T D +  S    YTG   + IA+G+   +T  GS  +   
Subjt:  QQPHAKELLNMLQQLLHKTNLSTTVGGSGNVQQGNQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNFSIRIANGTPAKVTGIGSIQI---

Query:  SSSLILESILFVPTLEYNLLS
        S SL L  +L+VP +  NL+S
Subjt:  SSSLILESILFVPTLEYNLLS

Arabidopsis top hits (e value, % identity, alignment)
AT1G21280.1 CONTAINS InterPro DOMAIN/s: Retrotransposon gag protein (InterPro:IPR005162); Has 707 Blast hits to 705 proteins in 25 species: Archae - 0; Bacteria - 0; Metazoa - 4; Fungi - 0; Plants - 703; Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink). (e value: 2.8e-15; % identity: 30.37)
Query:  NYLQWSQSVKMFMYGRGLEDHIMDKAESPKSNDPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTAEIFQIETTLQDLKQ
        NY+ W    + F+        I      P    P ++ W   N  VM WL+NSM  ++ E+ +   TA ++WE +R  F    +  +I+Q+   L  L+Q
Subjt:  NYLQWSQSVKMFMYGRGLEDHIMDKAESPKSNDPKFRKWRAENNQVMSWLINSMATEIGENFLLFSTAKEIWEAVRDTFSNKENTAEIFQIETTLQDLKQ

Query:  GDLSVTIYYSTLSRYWQQLDLFETI-DWKCS----DDRTLFREFVETKRIFKFLMG--LNKSLDEVCGRILGTKPLPSIREVFFEVRREES
        G  SV  Y+  LS+ W +L  +  I + KC     +      E  E ++ ++FLMG  LN+  + V  +I+  KP PS+ E F  V+  ES
Subjt:  GDLSVTIYYSTLSRYWQQLDLFETI-DWKCS----DDRTLFREFVETKRIFKFLMG--LNKSLDEVCGRILGTKPLPSIREVFFEVRREES

AT5G45460.1 unknown protein (e value: 3.8e-12; % identity: 23.86)
Query:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLV---PKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDL
        LQ F+ C S +           R++   + L + +W +YLL       AV ++ K    D+   +     ++ AL AP +L+ +G  DTITA+++EDN L
Subjt:  LQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLV---PKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDL

Query:  GVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLER----------------LEGGNELPDAKLILRAYCRFCWLK----------P
         +R VF ++ Q +   Y++ +S      NS +           +K+LER                ++G +  P+   ++  Y      K          P
Subjt:  GVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLER----------------LEGGNELPDAKLILRAYCRFCWLK----------P

Query:  HLEN-----WHPSPTSNVDLQKLS-IEDCEY-----------------------------------EEVFRITDSELGFMYDVLYTKAPVVYP--GLVLR
          E+      HPS  S    ++L+ +E  +Y                                   EE  RI + ELGF+YD L+TK  V++   G V R
Subjt:  HLEN-----WHPSPTSNVDLQKLS-IEDCEY-----------------------------------EEVFRITDSELGFMYDVLYTKAPVVYP--GLVLR

Query:  FISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITC-VILEVYQILKLPFSDW--AIVQMIRHYETFPILW
         ++  SL+     F  +   G  ++    +I Y+L    ++L+   IL   FSDW  A +  ++     P+ W
Subjt:  FISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITC-VILEVYQILKLPFSDW--AIVQMIRHYETFPILW

AT5G45470.1 Protein of unknown function (DUF594) (e value: 7.8e-10; % identity: 24.62)
Query:  RRKNAGKKLRLSVWLAYLLV---PKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRS
        R++   + L + VW +YLL       AV ++ K    D+   +     ++ AL AP +L+ +G  DTITA+++EDN L +R VF ++ Q +   Y++  S
Subjt:  RRKNAGKKLRLSVWLAYLLV---PKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRS

Query:  WTDFKSNSGFTYADFFQYDEAVKFLERL------------EGGNELPD-------------AKLILRAYCRFCWL-KPHLEN-----WHPSPTS---NVD
              NS +           +K+LER             +   + PD             AK   R   +   + +P  EN      HP+  S     D
Subjt:  WTDFKSNSGFTYADFFQYDEAVKFLERL------------EGGNELPD-------------AKLILRAYCRFCWL-KPHLEN-----WHPSPTS---NVD

Query:  LQKLSIEDCEY---------------------------------EEVFRITDSELGFMYDVLYTKAPVVYPGL--VLRFISFISLITTLCGFSVLFKDGF
        L  L I    Y                                 EE  RI + ELGF+YD L+TK  +++ G+  V R  +  +L+     F      G 
Subjt:  LQKLSIEDCEY---------------------------------EEVFRITDSELGFMYDVLYTKAPVVYPGL--VLRFISFISLITTLCGFSVLFKDGF

Query:  VYNIGAGMIHYVLITC-VILEVYQILKLPFSDW
         ++    ++ Y L    ++L+   IL   FSDW
Subjt:  VYNIGAGMIHYVLITC-VILEVYQILKLPFSDW

AT5G45470.1 Protein of unknown function (DUF594) (e value: 1.3e-04; % identity: 35.59)
Query:  NLLKDVNELANSLLTL--PNNEKRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELI
        ++L D + LA  L  +   +N+ +W ++  +WVE+L YAA  C+   H E + + GELI
Subjt:  NLLKDVNELANSLLTL--PNNEKRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELI

AT5G45480.1 Protein of unknown function (DUF594) (e value: 9.6e-08; % identity: 22.94)
Query:  RRKNAGKKLRLS-VWLAYLLVPKVAVIVLGKLMTI---DIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGR
        +RK + +K+ LS +W AYLL    A    G++      D    E     ++ A   P +L+ +G  DTITA ++EDN+L +R +  +  Q V   Y+L +
Subjt:  RRKNAGKKLRLS-VWLAYLLVPKVAVIVLGKLMTI---DIGYTERNTLTQIQALLAPLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGR

Query:  S-----WTDF--------------------------------KSNSGFTYADFFQYDEAVKFLE---------------RLEGGNELPDA----KLILRA
        S     W                                   + + G  YA   +   A K ++               R +   + PD      ++  A
Subjt:  S-----WTDF--------------------------------KSNSGFTYADFFQYDEAVKFLE---------------RLEGGNELPDA----KLILRA

Query:  YCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYP--GLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIH
        Y  F   K  + +   +     +  K   +  + EE  RI + EL F+Y  LYTKA +++   G + RFI+   L   L  F    K  +      G+ +
Subjt:  YCRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYP--GLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIH

Query:  YVLITCVILEVYQILKLPFSDWAIVQM
         +L+  + L+   ++    SDW  V++
Subjt:  YVLITCVILEVYQILKLPFSDWAIVQM

AT5G45480.1 Protein of unknown function (DUF594) (e value: 1.9e-03; % identity: 37.1)
Query:  NLLKDVNELANSLLTLPNNE----KRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELIT
        ++L D   LA  L  L  N+    + W ++  +WVE+L YAA+KC    HA  + + GELI+
Subjt:  NLLKDVNELANSLLTLPNNE----KRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELIT

AT5G45540.1 Protein of unknown function (DUF594) (e value: 5.6e-16; % identity: 26.28)
Query:  RRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLA---PLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYIL---
        RR+ A K   + +W AYLL    A   +G++        E N  ++ + LLA   P +L+ +G  DTITA ++EDN+L  R +FS++ Q V   Y++   
Subjt:  RRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLA---PLMLMQIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYIL---

Query:  ----------------------------GRSWTDFKS------NSGFTYADFFQYDEA------------VKFLERLEGG-------NELPDAKLILRAY
                                      S   FK       + G  YA   +  EA            VK  E+   G       NEL   ++I  AY
Subjt:  ----------------------------GRSWTDFKS------NSGFTYADFFQYDEA------------VKFLERLEGG-------NELPDAKLILRAY

Query:  CRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVV--YPGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHY
          F   K  + +   +     + +K   +    EE  RI + ELG +YD L+TKA ++  + G V RFI+   L+ +LC F +  KD   Y+    ++ Y
Subjt:  CRFCWLKPHLENWHPSPTSNVDLQKLSIEDCEYEEVFRITDSELGFMYDVLYTKAPVV--YPGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHY

Query:  VLITC-VILEVYQILKLPFSDWAIVQMIRHYETFP---------ILWPL----LR-----------SLAPTSATWIRWSNKMGQFNLIDFCL
         L+ C + L+   +L    SDW I ++ +  E            + W L    LR            +   +  + RWS  +  +NLI FCL
Subjt:  VLITC-VILEVYQILKLPFSDWAIVQMIRHYETFP---------ILWPL----LR-----------SLAPTSATWIRWSNKMGQFNLIDFCL

AT5G45540.1 Protein of unknown function (DUF594) (e value: 1.7e-04; % identity: 26.43)
Query:  VKELVVKELREVEKIRKQDKDEFTKGGEWTIQRLHDVPFGCSFPRAIHNNCRYLISILQQDAPNVV--DKAKIREEEKVIGNWNLLKDVNELANSLLTLP
        V+  ++  +  + KIR +D  E  K      QR H           +   CR ++S+  +  P  V  D++K           ++L D + LA  L+   
Subjt:  VKELVVKELREVEKIRKQDKDEFTKGGEWTIQRLHDVPFGCSFPRAIHNNCRYLISILQQDAPNVV--DKAKIREEEKVIGNWNLLKDVNELANSLLTLP

Query:  NNEKRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELI
          E  W ++  +WVE+L YA+  C+ + HA  + + GELI
Subjt:  NNEKRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELI


Sequences
CDS sequence
ATGCTCCTATTTCACAAGCACCAACGTGGATTTGACAGTACTGGAAAAATAGTAGCTCTTGCTATGCTGCTCGACCACAATAAGTTACATAATCTTCAGAACTTTGTTTT
CTGTATTTCACTCATTTTGAAGCCTCACAAGCCTCCAAGCTATCTCTACAGAAGGAAAAATGCTGGAAAAAAGCTCAGATTATCTGTTTGGCTTGCCTACTTACTGGTTC
CTAAAGTTGCGGTCATCGTTCTGGGCAAGCTCATGACAATTGATATAGGCTATACCGAGCGCAACACGCTGACCCAAATTCAAGCATTGTTGGCACCTTTGATGTTGATG
CAGATAGGAAGTGCAGATACAATCACAGCCTACTCAATTGAAGACAATGATCTAGGAGTGAGACAAGTTTTCAGCATGATGATGCAAGTGGTAATTATGTTTTACATTCT
TGGAAGGTCGTGGACAGATTTCAAAAGCAACTCAGGCTTCACATATGCAGATTTCTTCCAATATGACGAAGCAGTTAAGTTTTTAGAAAGACTAGAGGGAGGTAATGAAC
TTCCAGATGCAAAATTGATCTTGAGAGCTTATTGTAGATTTTGCTGGCTCAAACCTCATCTTGAGAATTGGCATCCCTCTCCTACATCGAATGTTGATCTCCAAAAACTA
TCCATTGAAGACTGTGAGTATGAAGAGGTGTTCAGAATTACAGATTCTGAATTGGGTTTCATGTACGATGTGCTCTACACTAAGGCACCCGTTGTATATCCAGGTTTAGT
TCTTCGTTTTATTAGCTTCATCAGCTTGATTACAACACTATGTGGATTTTCAGTGTTGTTCAAGGATGGATTTGTCTATAATATTGGCGCTGGCATGATCCATTATGTGT
TGATAACATGTGTGATACTTGAGGTTTATCAGATCTTGAAACTACCATTCTCAGATTGGGCAATAGTGCAGATGATACGGCACTATGAAACCTTTCCCATCCTTTGGCCT
TTATTGCGGTCTCTAGCCCCTACATCAGCAACTTGGATAAGGTGGTCTAACAAAATGGGACAGTTCAATCTTATAGATTTCTGCTTATGCAAGAATCAGAACTTTAGCAG
AATCAAAATTCTTCGACATTGGGGCTTGGATATGAAACTCCGAAAGCAATTAAATTTGGGACAAATTGAAGTCCATCCAAAAGTGAAAGAACTTGTGGTTAAGGAACTGA
GAGAGGTAGAGAAGATCAGAAAACAAGACAAAGACGAATTCACTAAAGGAGGCGAATGGACAATCCAAAGATTACATGATGTACCTTTTGGCTGTTCGTTCCCACGTGCT
ATCCACAACAACTGCAGATATCTTATTTCAATATTGCAGCAAGACGCTCCAAATGTGGTTGACAAGGCAAAAATAAGAGAAGAGGAAAAGGTTATAGGGAATTGGAATCT
GCTGAAAGATGTAAATGAACTTGCAAATAGCTTACTCACTCTGCCCAACAACGAAAAAAGATGGAATTTAATTGGTAGTATGTGGGTTGAGATGTTGGGATATGCTGCAA
GTAAGTGTGAAATGGAATATCATGCAGAACACATCAGACAAGCTGGTGAATTGATCACTCATAAACCCAAGCTGTCATCAGATCCACCTCCGGCGAACTCTTCACGGCAG
CTCCGTCGCCGTCGATCCTCTCCATCGCGACGTCCTTCACGGCCTCTCCATCGCCGGTGGCCTTCATCGCGAACAGAACATCCCGACAGCCTCATCCCGCACAGATCCAG
CACATCCCGACCTCCGCAATCCGGCGAACATCCCGGCAGTTGCATAGATCCAGCCGGTCTTCATCGTGAGCAAGTCTCCATCTGCCGGCCTCCGCGATCCGTCCCCGCAC
AATCGGCTTTCGTGATTCTTTCTTCGCCGACATCACTATCTTCAAATCTGCTCAGATCTGTTACTGACGCCGCTGCTCGTGCCGCCGCTCATACTGCCGCTGCCGCCACT
GCTGCCACCGCTGCCATCCTCATAGAAGCAGAAATACTCGATCTTTATCGATTCGCTATGGCAGATTCGGAATCAGAATCTCAGAATAATATCCCAATTTTTCGTGAAAG
CAGCAACACTCAGATAACATGCCACAAACTCACCGGCCACAATTATCTTCAGTGGTCTCAATCAGTGAAGATGTTCATGTATGGTCGTGGATTGGAGGACCATATCATGG
ATAAAGCAGAATCACCAAAATCTAATGATCCAAAATTTCGCAAATGGAGGGCTGAAAATAATCAAGTAATGAGCTGGTTAATAAACTCTATGGCAACCGAAATTGGAGAA
AATTTTCTCCTATTTTCTACTGCCAAAGAGATTTGGGAAGCTGTTCGTGATACTTTCTCCAATAAGGAAAACACTGCTGAAATTTTCCAGATAGAGACTACTCTCCAAGA
TCTCAAACAAGGTGATTTATCTGTTACAATCTACTACTCCACTTTATCTCGCTATTGGCAACAATTAGATCTATTTGAAACTATTGATTGGAAATGTTCTGACGATCGTA
CTCTCTTTCGAGAATTTGTGGAAACAAAGAGGATTTTCAAATTCCTAATGGGTCTCAATAAATCCCTTGATGAAGTTTGTGGTCGAATTCTCGGAACTAAGCCATTACCT
AGTATCCGAGAAGTCTTTTTTGAAGTACGCAGAGAGGAAAGTCGAAAGCAAGTCATGTTGGGATCATCTGAACATCCTCTTACTCAAAATGGATCTGCTCTCATTTCCCA
ACGAGATCACCCCAATGATCCAATCGCTCTTGCTGCCCGGGGAAACTTCCAATCCTATGGTGACAATCGACAGCGCAAAGGGCGACCATGGTGCGATCACTGTCATAAAG
TGAGACATGTCAAAGAAACCTGTTGGAAACTACATGGGAAACCAGCAAATTGGAAGCCGAACACTACTAGATCAGATCGAGAAACTAAGGGGAATGCTGCTGTATCTGAA
ACTTCCAATCAACAACCACATGCTAAAGAGTTGTTGAACATGCTGCAACAACTATTGCACAAAACAAATCTATCAACAACAGTAGGTGGTTCTGGGAATGTACAGCAAGG
TAACCAAAATTCCCTTGCTCTCCATACTCAAACTCTTCCATCTGAATGGATAGTGGACTCTGGAGCATCAGATCATATGACAGGTGATAGATCCTTATTCTCATCCTTTT
CACTGTATACGGGAAACTTCTCCATTCGAATTGCTAACGGTACACCTGCTAAAGTGACCGGTATTGGAAGCATTCAGATATCAAGCTCTCTAATTCTTGAGTCAATTTTA
TTTGTGCCAACTTTGGAGTATAATCTGCTATCTGACTTGGGATCGGGGAAGATGATTGCCAATGCTGAACTATGTGCTGGATTGTATCTTCTGAGAGCAACTAGTCCTCC
ACCAACACATGAATGCAATAAACTTGTGGGGGAGCATTATGATCTTGAATCACAAAATTGGGATTGCACTCCTGATTTTCCTACGGTAACTTTAAATCTATCATCTGAAA
GTTCACTTGTACCATCTTCTGAGTCAACTCCCTTATCTACCCCTAAGCCTGAATTGCAAGTTTATTCGCGAAGAGCGAGACAACCTGAGAGGACTGAGATACAACCTACA
CAAGTTCAGCAAAGCCAAGATCTAAACCAAAATCCTAATCTCCTTGAAATAGCACACGACATGTCAGAAGATGATAGACCCATTGCTCTTTTGTGA
mRNA sequence
ATGCTCCTATTTCACAAGCACCAACGTGGATTTGACAGTACTGGAAAAATAGTAGCTCTTGCTATGCTGCTCGACCACAATAAGTTACATAATCTTCAGAACTTTGTTTT
CTGTATTTCACTCATTTTGAAGCCTCACAAGCCTCCAAGCTATCTCTACAGAAGGAAAAATGCTGGAAAAAAGCTCAGATTATCTGTTTGGCTTGCCTACTTACTGGTTC
CTAAAGTTGCGGTCATCGTTCTGGGCAAGCTCATGACAATTGATATAGGCTATACCGAGCGCAACACGCTGACCCAAATTCAAGCATTGTTGGCACCTTTGATGTTGATG
CAGATAGGAAGTGCAGATACAATCACAGCCTACTCAATTGAAGACAATGATCTAGGAGTGAGACAAGTTTTCAGCATGATGATGCAAGTGGTAATTATGTTTTACATTCT
TGGAAGGTCGTGGACAGATTTCAAAAGCAACTCAGGCTTCACATATGCAGATTTCTTCCAATATGACGAAGCAGTTAAGTTTTTAGAAAGACTAGAGGGAGGTAATGAAC
TTCCAGATGCAAAATTGATCTTGAGAGCTTATTGTAGATTTTGCTGGCTCAAACCTCATCTTGAGAATTGGCATCCCTCTCCTACATCGAATGTTGATCTCCAAAAACTA
TCCATTGAAGACTGTGAGTATGAAGAGGTGTTCAGAATTACAGATTCTGAATTGGGTTTCATGTACGATGTGCTCTACACTAAGGCACCCGTTGTATATCCAGGTTTAGT
TCTTCGTTTTATTAGCTTCATCAGCTTGATTACAACACTATGTGGATTTTCAGTGTTGTTCAAGGATGGATTTGTCTATAATATTGGCGCTGGCATGATCCATTATGTGT
TGATAACATGTGTGATACTTGAGGTTTATCAGATCTTGAAACTACCATTCTCAGATTGGGCAATAGTGCAGATGATACGGCACTATGAAACCTTTCCCATCCTTTGGCCT
TTATTGCGGTCTCTAGCCCCTACATCAGCAACTTGGATAAGGTGGTCTAACAAAATGGGACAGTTCAATCTTATAGATTTCTGCTTATGCAAGAATCAGAACTTTAGCAG
AATCAAAATTCTTCGACATTGGGGCTTGGATATGAAACTCCGAAAGCAATTAAATTTGGGACAAATTGAAGTCCATCCAAAAGTGAAAGAACTTGTGGTTAAGGAACTGA
GAGAGGTAGAGAAGATCAGAAAACAAGACAAAGACGAATTCACTAAAGGAGGCGAATGGACAATCCAAAGATTACATGATGTACCTTTTGGCTGTTCGTTCCCACGTGCT
ATCCACAACAACTGCAGATATCTTATTTCAATATTGCAGCAAGACGCTCCAAATGTGGTTGACAAGGCAAAAATAAGAGAAGAGGAAAAGGTTATAGGGAATTGGAATCT
GCTGAAAGATGTAAATGAACTTGCAAATAGCTTACTCACTCTGCCCAACAACGAAAAAAGATGGAATTTAATTGGTAGTATGTGGGTTGAGATGTTGGGATATGCTGCAA
GTAAGTGTGAAATGGAATATCATGCAGAACACATCAGACAAGCTGGTGAATTGATCACTCATAAACCCAAGCTGTCATCAGATCCACCTCCGGCGAACTCTTCACGGCAG
CTCCGTCGCCGTCGATCCTCTCCATCGCGACGTCCTTCACGGCCTCTCCATCGCCGGTGGCCTTCATCGCGAACAGAACATCCCGACAGCCTCATCCCGCACAGATCCAG
CACATCCCGACCTCCGCAATCCGGCGAACATCCCGGCAGTTGCATAGATCCAGCCGGTCTTCATCGTGAGCAAGTCTCCATCTGCCGGCCTCCGCGATCCGTCCCCGCAC
AATCGGCTTTCGTGATTCTTTCTTCGCCGACATCACTATCTTCAAATCTGCTCAGATCTGTTACTGACGCCGCTGCTCGTGCCGCCGCTCATACTGCCGCTGCCGCCACT
GCTGCCACCGCTGCCATCCTCATAGAAGCAGAAATACTCGATCTTTATCGATTCGCTATGGCAGATTCGGAATCAGAATCTCAGAATAATATCCCAATTTTTCGTGAAAG
CAGCAACACTCAGATAACATGCCACAAACTCACCGGCCACAATTATCTTCAGTGGTCTCAATCAGTGAAGATGTTCATGTATGGTCGTGGATTGGAGGACCATATCATGG
ATAAAGCAGAATCACCAAAATCTAATGATCCAAAATTTCGCAAATGGAGGGCTGAAAATAATCAAGTAATGAGCTGGTTAATAAACTCTATGGCAACCGAAATTGGAGAA
AATTTTCTCCTATTTTCTACTGCCAAAGAGATTTGGGAAGCTGTTCGTGATACTTTCTCCAATAAGGAAAACACTGCTGAAATTTTCCAGATAGAGACTACTCTCCAAGA
TCTCAAACAAGGTGATTTATCTGTTACAATCTACTACTCCACTTTATCTCGCTATTGGCAACAATTAGATCTATTTGAAACTATTGATTGGAAATGTTCTGACGATCGTA
CTCTCTTTCGAGAATTTGTGGAAACAAAGAGGATTTTCAAATTCCTAATGGGTCTCAATAAATCCCTTGATGAAGTTTGTGGTCGAATTCTCGGAACTAAGCCATTACCT
AGTATCCGAGAAGTCTTTTTTGAAGTACGCAGAGAGGAAAGTCGAAAGCAAGTCATGTTGGGATCATCTGAACATCCTCTTACTCAAAATGGATCTGCTCTCATTTCCCA
ACGAGATCACCCCAATGATCCAATCGCTCTTGCTGCCCGGGGAAACTTCCAATCCTATGGTGACAATCGACAGCGCAAAGGGCGACCATGGTGCGATCACTGTCATAAAG
TGAGACATGTCAAAGAAACCTGTTGGAAACTACATGGGAAACCAGCAAATTGGAAGCCGAACACTACTAGATCAGATCGAGAAACTAAGGGGAATGCTGCTGTATCTGAA
ACTTCCAATCAACAACCACATGCTAAAGAGTTGTTGAACATGCTGCAACAACTATTGCACAAAACAAATCTATCAACAACAGTAGGTGGTTCTGGGAATGTACAGCAAGG
TAACCAAAATTCCCTTGCTCTCCATACTCAAACTCTTCCATCTGAATGGATAGTGGACTCTGGAGCATCAGATCATATGACAGGTGATAGATCCTTATTCTCATCCTTTT
CACTGTATACGGGAAACTTCTCCATTCGAATTGCTAACGGTACACCTGCTAAAGTGACCGGTATTGGAAGCATTCAGATATCAAGCTCTCTAATTCTTGAGTCAATTTTA
TTTGTGCCAACTTTGGAGTATAATCTGCTATCTGACTTGGGATCGGGGAAGATGATTGCCAATGCTGAACTATGTGCTGGATTGTATCTTCTGAGAGCAACTAGTCCTCC
ACCAACACATGAATGCAATAAACTTGTGGGGGAGCATTATGATCTTGAATCACAAAATTGGGATTGCACTCCTGATTTTCCTACGGTAACTTTAAATCTATCATCTGAAA
GTTCACTTGTACCATCTTCTGAGTCAACTCCCTTATCTACCCCTAAGCCTGAATTGCAAGTTTATTCGCGAAGAGCGAGACAACCTGAGAGGACTGAGATACAACCTACA
CAAGTTCAGCAAAGCCAAGATCTAAACCAAAATCCTAATCTCCTTGAAATAGCACACGACATGTCAGAAGATGATAGACCCATTGCTCTTTTGTGA
Protein sequence
MLLFHKHQRGFDSTGKIVALAMLLDHNKLHNLQNFVFCISLILKPHKPPSYLYRRKNAGKKLRLSVWLAYLLVPKVAVIVLGKLMTIDIGYTERNTLTQIQALLAPLMLM
QIGSADTITAYSIEDNDLGVRQVFSMMMQVVIMFYILGRSWTDFKSNSGFTYADFFQYDEAVKFLERLEGGNELPDAKLILRAYCRFCWLKPHLENWHPSPTSNVDLQKL
SIEDCEYEEVFRITDSELGFMYDVLYTKAPVVYPGLVLRFISFISLITTLCGFSVLFKDGFVYNIGAGMIHYVLITCVILEVYQILKLPFSDWAIVQMIRHYETFPILWP
LLRSLAPTSATWIRWSNKMGQFNLIDFCLCKNQNFSRIKILRHWGLDMKLRKQLNLGQIEVHPKVKELVVKELREVEKIRKQDKDEFTKGGEWTIQRLHDVPFGCSFPRA
IHNNCRYLISILQQDAPNVVDKAKIREEEKVIGNWNLLKDVNELANSLLTLPNNEKRWNLIGSMWVEMLGYAASKCEMEYHAEHIRQAGELITHKPKLSSDPPPANSSRQ
LRRRRSSPSRRPSRPLHRRWPSSRTEHPDSLIPHRSSTSRPPQSGEHPGSCIDPAGLHREQVSICRPPRSVPAQSAFVILSSPTSLSSNLLRSVTDAAARAAAHTAAAAT
AATAAILIEAEILDLYRFAMADSESESQNNIPIFRESSNTQITCHKLTGHNYLQWSQSVKMFMYGRGLEDHIMDKAESPKSNDPKFRKWRAENNQVMSWLINSMATEIGE
NFLLFSTAKEIWEAVRDTFSNKENTAEIFQIETTLQDLKQGDLSVTIYYSTLSRYWQQLDLFETIDWKCSDDRTLFREFVETKRIFKFLMGLNKSLDEVCGRILGTKPLP
SIREVFFEVRREESRKQVMLGSSEHPLTQNGSALISQRDHPNDPIALAARGNFQSYGDNRQRKGRPWCDHCHKVRHVKETCWKLHGKPANWKPNTTRSDRETKGNAAVSE
TSNQQPHAKELLNMLQQLLHKTNLSTTVGGSGNVQQGNQNSLALHTQTLPSEWIVDSGASDHMTGDRSLFSSFSLYTGNFSIRIANGTPAKVTGIGSIQISSSLILESIL
FVPTLEYNLLSDLGSGKMIANAELCAGLYLLRATSPPPTHECNKLVGEHYDLESQNWDCTPDFPTVTLNLSSESSLVPSSESTPLSTPKPELQVYSRRARQPERTEIQPT
QVQQSQDLNQNPNLLEIAHDMSEDDRPIALL