CuGenDBv2

Sed0003957 (gene) of Chayote v1 genome

Gene ID: Sed0003957
Organism: Sechium edule (Chayote v1)
Description: Integrase catalytic domain-containing protein
Genome location: LG02:37957010..37960558
RNA-Seq Expression: Sed0003957
Synteny: Sed0003957
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR012337 - Ribonuclease H-like superfamily
IPR025724 - GAG-pre-integrase domain
IPR036397 - Ribonuclease H superfamily
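
The genome location string above can be split into its parts programmatically. The following is a minimal sketch, assuming the `seqid:start..end` layout shown in this record and assuming the coordinates are 1-based and inclusive (the page does not state the convention). The helper names `parse_location` and `span_length` are hypothetical and are not part of any CuGenDB tool.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'seqid:start..end' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location string: {location!r}")
    seqid, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start coordinate is larger than end coordinate")
    return seqid, start, end


def span_length(start: int, end: int) -> int:
    """Length of the interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Location string copied from this record.
seqid, start, end = parse_location("LG02:37957010..37960558")
print(seqid, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption, the span for this gene comes out to 3549 bp.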


Homology
GenBank top hits (e value, % identity, alignment)
CAA7015607.1 unnamed protein product [Microthlaspi erraticum] | e value: 1.2e-94 | % identity: 47.85
Query:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI
        N WIIDSGAS H+  +  LF     +S   V LP G R+ I H   +  S+ L+L +VL +P F  NLISV+CL++       F +T C+IQ+ +   MI
Subjt:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI

Query:  GKAECFNGLYVFKDCSDAITCFTS------------VSTWHKRLGHLSNKRLESMKN-TSNFCDHPCKL--CHICPLSKQKRLSFPFNNNVSSNCFDLIH
        G+A+  + LY+  +  DA    +S            VS WH RLGH S   L+ +K+   +F DH   L  C +CPL+KQ+RL++  +NN++S  FDLIH
Subjt:  GKAECFNGLYVFKDCSDAITCFTS------------VSTWHKRLGHLSNKRLESMKN-TSNFCDHPCKL--CHICPLSKQKRLSFPFNNNVSSNCFDLIH

Query:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK
         DIWGPF V +  GYR+FLT+VDDC+R TW +++++KSDV ++ P    L+ TQ+N  +K  RSDNAPEL F       G +HQ SCA TPQQN VVERK
Subjt:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK

Query:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL
        HQHLLN+AR+LLFQSNVPL +W DCV TA ++ NR+  PL++NK+PF +L     DYS+L+SFGCL + STL
Subjt:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL

KAA0065480.1 Cysteine-rich RLK (receptor-like protein kinase) 8 [Cucumis melo var. makuwa] | e value: 2.7e-99 | % identity: 54.19
Query:  INHVSGIFSRPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVIL
        I H SGIF+    ++   + WIIDSGASRHI  ++ LFKNW   + + V+LP G R+++D IGDI+ +  L LKDVLF+ +F++NLISV+CLL    + L
Subjt:  INHVSGIFSRPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVIL

Query:  QFSNTSCIIQDRNCSKMIGKAECFNGLYVFKDCSDAITCFT--------SVSTWHKRLGHLSNKRLESMKNTSNFCDHPC--KLCHICPLSKQKRLSFPF
         F +T CIIQD +   MIGKA C NGLYV    ++   C          SV TWH+RLGHLS K L S+ +T    +H      CH+CPL+KQKRLSF  
Subjt:  QFSNTSCIIQDRNCSKMIGKAECFNGLYVFKDCSDAITCFT--------SVSTWHKRLGHLSNKRLESMKNTSNFCDHPC--KLCHICPLSKQKRLSFPF

Query:  NNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSC
        NNNV+S+ FDL+H DIWGPFK+ +Y GY++FLTLVDDC R+TW ++++ KSDVL I+P+ F+L++TQF+K IK FRSDNAPELK  +FF+  GT+HQFSC
Subjt:  NNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSC

Query:  AETPQQNLVVERKHQHLLNIARSLLFQSNVPLMF
         E PQQN VVERKHQHLLN+AR+L F      +F
Subjt:  AETPQQNLVVERKHQHLLNIARSLLFQSNVPLMF

KAG7587171.1 Integrase catalytic core [Arabidopsis thaliana x Arabidopsis arenosa] | e value: 9.5e-97 | % identity: 49.87
Query:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI
        + WIIDSGA+ H+  +   FK    +SGI V LP G  L I H G I+ S  L+L +VL +P F  NLISV+ LL  N+    F  TSC +Q+ +   MI
Subjt:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI

Query:  GKAECFNGLYVFKDCSDAITCFTSV------------STWHKRLGHLSNKRLESMKNTSNFCDHPCKL---CHICPLSKQKRLSFPFNNNVSSNCFDLIH
        G A   + LY+    S       S+            + WH+RLGH S  +LE +  T +           CH+CPL+KQKRLSFPFNNNVSSN FDLIH
Subjt:  GKAECFNGLYVFKDCSDAITCFTSV------------STWHKRLGHLSNKRLESMKNTSNFCDHPCKL---CHICPLSKQKRLSFPFNNNVSSNCFDLIH

Query:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK
         DIWGPF V +  GY+FFLT+VDDC+R TW +++K+KSDV ++     KL+ TQ+   IK  RSDNAPELKF +     G IH FSCA TPQQN VVERK
Subjt:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK

Query:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTLQ
        HQHLLN+AR+LLFQS VPL++W DCVLTA+++ NRI   ++ N TP+  L   + DY+ L+SFGCL + STLQ
Subjt:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTLQ

RVW82526.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Vitis vinifera] | e value: 2.6e-94 | % identity: 46.02
Query:  MGFLSSHMNSGK----------ENNINHVSGIFS-RPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLK
        +  LS H +SG           + +I++ +GI S  P +     + WI+DSGA+ H+  N  +F +    S   V LPTG ++ I  IG I  S  LVL+
Subjt:  MGFLSSHMNSGK----------ENNINHVSGIFS-RPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLK

Query:  DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLY-----VFKDCSDAITCFTSVST-----WHKRLGHLSNKRLESMKN-
         VL+IP F  NLIS++ L + N     F+   C IQD +  K+IG       LY     VF+  S       + S      WH RL H SN +L  +K  
Subjt:  DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLY-----VFKDCSDAITCFTSVST-----WHKRLGHLSNKRLESMKN-

Query:  ---TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNK
            SN   +    C ICPL+KQKRL F  +NN+SS+ FDLIHCDIWGPF + T+ G+R+FLT+VDDC+R TW  L+++KSDV +I P+ F +VKT+F  
Subjt:  ---TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNK

Query:  DIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADY
         IK  RSDNAPEL   + F+ +  +H FSC ETPQQN VVERKHQH+LN+AR+L FQSN+P+ +WGDCVLT+ Y+ NRI  PL++NKTPF +LH     Y
Subjt:  DIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADY

Query:  SVLRSFGCLFFRSTL
        S L+SFGCL + STL
Subjt:  SVLRSFGCLFFRSTL

XP_013594438.1 PREDICTED: uncharacterized protein LOC106302482 [Brassica oleracea var. oleracea] | e value: 6.8e-95 | % identity: 49.18
Query:  TWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIG
        +WI+DSGA+ H+  ++ +F+     S + V LP G R+ I H G +  S+ L+L DVL +P F  NL+SV+ LL+ N+    F  TSC IQ+ +    IG
Subjt:  TWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIG

Query:  KAECFNGLYVF----KDCSDAITCFTSVS--TWHKRLGHLSNKRLESMKNTSNF--CDHPC-KLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPF
        K    + LY+     +D S A T   S     WH+RLGH S  +L+ +    +   C  P    C +CPL+KQKRLSFPFNNN+S   FDL+H DIWGPF
Subjt:  KAECFNGLYVF----KDCSDAITCFTSVS--TWHKRLGHLSNKRLESMKNTSNF--CDHPC-KLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPF

Query:  KVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNI
           +  GY++FLT+VDDCSR TW  L+K KSDV+++ P   + V TQ+N  +K  RSDNAPEL+F     T G IH FSCA TP+QN VVERKHQHLLN+
Subjt:  KVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNI

Query:  ARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTLQ
        AR+LLFQSNVPL++W DC+ TA ++ NRI  PL+ ++TP+ +L   + DYS+LRSFGCL + STLQ
Subjt:  ARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTLQ

TrEMBL top hits (e value, % identity, alignment)
A0A392M266 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (Fragment) | e value: 6.5e-91 | % identity: 44.39
Query:  GFLSSHMNSGKENNINHVSGIFSRPFTHDNY--KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSH
        G ++   ++G  N  + +  +F+    H+ Y   +TWI+DSGA+ HI  + +LF +++ LS   V LP   ++++  IG +  ++GL L +VL+IP+F  
Subjt:  GFLSSHMNSGKENNINHVSGIFSRPFTHDNY--KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSH

Query:  NLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLYVFK---DCSDAITCF-----------------TSVSTWHKRLGHLSNKRLESMKN-
        NL+SV  LL+ + + L  ++T  +IQD    K+IGK++  +GLY  K   + +    C                   SVS WH RLGHLS+  L+S+ N 
Subjt:  NLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLYVFK---DCSDAITCF-----------------TSVSTWHKRLGHLSNKRLESMKN-

Query:  -----TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQF
             TSN      + C +CPL+K  RL F  NNN +   FDLIHCD+WGPFK  TY GY +FLT+VDD +RYTWT LMK K++   +I   FK VKTQF
Subjt:  -----TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQF

Query:  NKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEA
        NK IK   +DNA EL+   F    GT+HQFSC   PQQN VVERKHQHLLN+AR+LLFQS VPL FWGDC+ TA+Y+ NR+    + N +P+ +LH    
Subjt:  NKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEA

Query:  DYSVLRSFGCLFFRSTLQA
        DY+ LR FGC+ F ST+ A
Subjt:  DYSVLRSFGCLFFRSTLQA

A0A438HDI8 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | e value: 1.3e-94 | % identity: 46.02
Query:  MGFLSSHMNSGK----------ENNINHVSGIFS-RPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLK
        +  LS H +SG           + +I++ +GI S  P +     + WI+DSGA+ H+  N  +F +    S   V LPTG ++ I  IG I  S  LVL+
Subjt:  MGFLSSHMNSGK----------ENNINHVSGIFS-RPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLK

Query:  DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLY-----VFKDCSDAITCFTSVST-----WHKRLGHLSNKRLESMKN-
         VL+IP F  NLIS++ L + N     F+   C IQD +  K+IG       LY     VF+  S       + S      WH RL H SN +L  +K  
Subjt:  DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMIGKAECFNGLY-----VFKDCSDAITCFTSVST-----WHKRLGHLSNKRLESMKN-

Query:  ---TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNK
            SN   +    C ICPL+KQKRL F  +NN+SS+ FDLIHCDIWGPF + T+ G+R+FLT+VDDC+R TW  L+++KSDV +I P+ F +VKT+F  
Subjt:  ---TSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNK

Query:  DIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADY
         IK  RSDNAPEL   + F+ +  +H FSC ETPQQN VVERKHQH+LN+AR+L FQSN+P+ +WGDCVLT+ Y+ NRI  PL++NKTPF +LH     Y
Subjt:  DIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADY

Query:  SVLRSFGCLFFRSTL
        S L+SFGCL + STL
Subjt:  SVLRSFGCLFFRSTL

A0A5A7VE66 Cysteine-rich RLK (Receptor-like protein kinase) 8 | e value: 1.3e-99 | % identity: 54.19
Query:  INHVSGIFSRPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVIL
        I H SGIF+    ++   + WIIDSGASRHI  ++ LFKNW   + + V+LP G R+++D IGDI+ +  L LKDVLF+ +F++NLISV+CLL    + L
Subjt:  INHVSGIFSRPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVIL

Query:  QFSNTSCIIQDRNCSKMIGKAECFNGLYVFKDCSDAITCFT--------SVSTWHKRLGHLSNKRLESMKNTSNFCDHPC--KLCHICPLSKQKRLSFPF
         F +T CIIQD +   MIGKA C NGLYV    ++   C          SV TWH+RLGHLS K L S+ +T    +H      CH+CPL+KQKRLSF  
Subjt:  QFSNTSCIIQDRNCSKMIGKAECFNGLYVFKDCSDAITCFT--------SVSTWHKRLGHLSNKRLESMKNTSNFCDHPC--KLCHICPLSKQKRLSFPF

Query:  NNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSC
        NNNV+S+ FDL+H DIWGPFK+ +Y GY++FLTLVDDC R+TW ++++ KSDVL I+P+ F+L++TQF+K IK FRSDNAPELK  +FF+  GT+HQFSC
Subjt:  NNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSC

Query:  AETPQQNLVVERKHQHLLNIARSLLFQSNVPLMF
         E PQQN VVERKHQHLLN+AR+L F      +F
Subjt:  AETPQQNLVVERKHQHLLNIARSLLFQSNVPLMF

A0A6D2HNE3 Uncharacterized protein | e value: 5.6e-95 | % identity: 47.85
Query:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI
        N WIIDSGAS H+  +  LF     +S   V LP G R+ I H   +  S+ L+L +VL +P F  NLISV+CL++       F +T C+IQ+ +   MI
Subjt:  NTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKMI

Query:  GKAECFNGLYVFKDCSDAITCFTS------------VSTWHKRLGHLSNKRLESMKN-TSNFCDHPCKL--CHICPLSKQKRLSFPFNNNVSSNCFDLIH
        G+A+  + LY+  +  DA    +S            VS WH RLGH S   L+ +K+   +F DH   L  C +CPL+KQ+RL++  +NN++S  FDLIH
Subjt:  GKAECFNGLYVFKDCSDAITCFTS------------VSTWHKRLGHLSNKRLESMKN-TSNFCDHPCKL--CHICPLSKQKRLSFPFNNNVSSNCFDLIH

Query:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK
         DIWGPF V +  GYR+FLT+VDDC+R TW +++++KSDV ++ P    L+ TQ+N  +K  RSDNAPEL F       G +HQ SCA TPQQN VVERK
Subjt:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERK

Query:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL
        HQHLLN+AR+LLFQSNVPL +W DCV TA ++ NR+  PL++NK+PF +L     DYS+L+SFGCL + STL
Subjt:  HQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL

A0A6J0KKJ8 uncharacterized protein LOC108819979 | e value: 6.5e-91 | % identity: 46.81
Query:  KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKM
        +N WI+DSGAS H+ ++  +F     +S I+V LP G ++ I H G I  S  L+L +VL +P F  NLISV+CL++       F    C IQ+ +   M
Subjt:  KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNCSKM

Query:  IGKAECFNGLYVFKDCSDAITCFTSVST---------------WHKRLGHLSNKRLESMKNT--SNFCDHPCKL-CHICPLSKQKRLSFPFNNNVSSNCF
        IGK    + LY+    S +I    S S+               WH+RLGH S+  L+ + ++  S   + P +  C +CPLSKQKRLSF   NN+SS  F
Subjt:  IGKAECFNGLYVFKDCSDAITCFTSVST---------------WHKRLGHLSNKRLESMKNT--SNFCDHPCKL-CHICPLSKQKRLSFPFNNNVSSNCF

Query:  DLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLV
        DL+H DIWGPF V +  G+++FLTLVDDC+R  W +++++KSDV +  P    LV TQFN  IK  RSDNAPEL F D  +  G IH FSC  TPQQN V
Subjt:  DLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLV

Query:  VERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL
        VERKHQHLLN+ARSLLFQSNVPL +W DCV TA+++ NR+  PL+ N +P+  L      Y+ LR+FGCL + STL
Subjt:  VERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTL

SwissProt top hits (e value, % identity, alignment)
P04146 Copia protein | e value: 3.7e-27 | % identity: 26.98
Query:  WIIDSGASRHISFNQELFKNWRKLSGINVVLP-------TGFRLNIDHIGDIECSNG--LVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQD
        +++DSGAS H+  ++ L+ +      + VV P        G  +     G +   N   + L+DVLF  + + NL+SV  L +   + ++F  +   I  
Subjt:  WIIDSGASRHISFNQELFKNWRKLSGINVVLP-------TGFRLNIDHIGDIECSNG--LVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQD

Query:  RNCSKMIGKAECFNGLYV--FKDCSDAITCFTSVSTWHKRLGHLSNKRLESMKNTSNFCDH--------PCKLCHICPLSKQKRLSF---PFNNNVSSNC
        +N   ++  +   N + V  F+  S       +   WH+R GH+S+ +L  +K  + F D          C++C  C   KQ RL F       ++    
Subjt:  RNCSKMIGKAECFNGLYV--FKDCSDAITCFTSVSTWHKRLGHLSNKRLESMKNTSNFCDH--------PCKLCHICPLSKQKRLSF---PFNNNVSSNC

Query:  FDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQDFFSTIGTIHQFSCAETPQ
        F ++H D+ GP   +T     +F+  VD  + Y  T+L+K KSDV S+        +  FN  +     DN  E    + + F    G  +  +   TPQ
Subjt:  FDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQDFFSTIGTIHQFSCAETPQ

Query:  QNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLV--SNKTPFSILHSYEADYSVLRSFGCLFF
         N V ER  + +   AR+++  + +   FWG+ VLTA+Y+ NRI    +  S+KTP+ + H+ +     LR FG   +
Subjt:  QNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLV--SNKTPFSILHSYEADYSVLRSFGCLFF

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | e value: 2.5e-39 | % identity: 30.08
Query:  KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNG----LVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRN
        ++ W++D+ AS H +  ++LF  +       V +       I  IGDI         LVLKDV  +P    NLIS    L  +     F+N    +    
Subjt:  KNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNG----LVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRN

Query:  CSKMIGKAECFNGLY-----VFKDCSDAITCFTSVSTWHKRLGHLSNKRLESMKNTSNFC---DHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCD
         S +I K      LY     + +   +A     SV  WHKR+GH+S K L+ +   S          K C  C   KQ R+SF  ++    N  DL++ D
Subjt:  CSKMIGKAECFNGLY-----VFKDCSDAITCFTSVSTWHKRLGHLSNKRLESMKNTSNFC---DHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCD

Query:  IWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQDFFSTIGTIHQFSCAETPQQNLVVER
        + GP ++ +  G ++F+T +DD SR  W +++K+K  V  +  +   LV+ +  + +K+ RSDN  E    +F+++ S+ G  H+ +   TPQ N V ER
Subjt:  IWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQDFFSTIGTIHQFSCAETPQQNLVVER

Query:  KHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFF
         ++ ++   RS+L  + +P  FWG+ V TA Y+ NR     ++ + P  +  + E  YS L+ FGC  F
Subjt:  KHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFF

Q07791 Transposon Ty2-DR3 Gag-Pol polyprotein | e value: 9.8e-12 | % identity: 21.92
Query:  THDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECS--NGLVLK-DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCII
        ++D   +  +IDSGAS+ +  +     +    S IN+V      + I+ IG++  +  NG       L  P  +++L+S++ L   N +   F+  +   
Subjt:  THDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECS--NGLVLK-DVLFIPKFSHNLISVNCLLKINEVILQFSNTSCII

Query:  QDRNCSKMIGKAECFNGL--------YVFKDCSDAITCFTSVSTW-----HKRLGHLSNKRLE-SMKNT------------SNFCDHPCKLCHICPLSKQ
         D      I K   F  L        ++ K   + +    SV+ +     H+ LGH + + ++ S+K              SN   + C  C I   +K 
Subjt:  QDRNCSKMIGKAECFNGL--------YVFKDCSDAITCFTSVSTW-----HKRLGHLSNKRLE-SMKNT------------SNFCDHPCKLCHICPLSKQ

Query:  KRL-SFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLM--KSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQD
        + +         S   F  +H DI+GP   +      +F++  D+ +R+ W + +  + +  +L++   +   +K QFN  +   + D   E        
Subjt:  KRL-SFPFNNNVSSNCFDLIHCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLM--KSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPEL---KFQD

Query:  FFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMP
        FF+  G    ++     + + V ER ++ LLN  R+LL  S +P   W   V  ++ I N ++ P
Subjt:  FFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMP

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 | e value: 8.5e-40 | % identity: 29.23
Query:  NTWIIDSGASRHISFN-QELFKNWRKLSGINVVLPTGFRLNIDHIGDIEC---SNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNC
        N W++DSGA+ HI+ +   L  +     G +V++  G  + I H G       S  L L ++L++P    NLISV  L   N V ++F   S  ++D N 
Subjt:  NTWIIDSGASRHISFN-QELFKNWRKLSGINVVLPTGFRLNIDHIGDIEC---SNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRNC

Query:  SKMIGKAECFNGLYVFK-DCSDAITCFTSV------STWHKRLGHLSNKRLESMKNTSNFC----DHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIH
           + + +  + LY +    S  ++ F S       S+WH RLGH +   L S+ +  +       H    C  C ++K  ++ F  +   S+   + I+
Subjt:  SKMIGKAECFNGLYVFK-DCSDAITCFTSV------STWHKRLGHLSNKRLESMKNTSNFC----DHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIH

Query:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPE-LKFQDFFSTIGTIHQFSCAETPQQNLVVER
         D+W    ++++  YR+++  VD  +RYTW + +K KS V         L++ +F   I  F SDN  E +   ++FS  G  H  S   TP+ N + ER
Subjt:  CDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPE-LKFQDFFSTIGTIHQFSCAETPQQNLVVER

Query:  KHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGC
        KH+H++    +LL  +++P  +W      A Y+ NR+  PL+  ++PF  L     +Y  LR FGC
Subjt:  KHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGC

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | e value: 1.7e-40 | % identity: 30.79
Query:  NTWIIDSGASRHIS--FNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIE---CSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRN
        N W++DSGA+ HI+  FN   F +     G +V++  G  + I H G       S  L L  VL++P    NLISV  L   N V ++F   S  ++D N
Subjt:  NTWIIDSGASRHIS--FNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIE---CSNGLVLKDVLFIPKFSHNLISVNCLLKINEVILQFSNTSCIIQDRN

Query:  CSKMIGKAECFNGLYVFK-DCSDAITCFTSV------STWHKRLGHLSNKRLESMKNTSNF----CDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLI
            + + +  + LY +    S A++ F S       S+WH RLGH S   L S+ +  +       H    C  C ++K  ++ F  +   SS   + I
Subjt:  CSKMIGKAECFNGLYVFK-DCSDAITCFTSV------STWHKRLGHLSNKRLESMKNTSNF----CDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLI

Query:  HCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPE-LKFQDFFSTIGTIHQFSCAETPQQNLVVE
        + D+W    +++   YR+++  VD  +RYTW + +K KS V         LV+ +F   I    SDN  E +  +D+ S  G  H  S   TP+ N + E
Subjt:  HCDIWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPE-LKFQDFFSTIGTIHQFSCAETPQQNLVVE

Query:  RKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGC
        RKH+H++ +  +LL  ++VP  +W      A Y+ NR+  PL+  ++PF  L     +Y  L+ FGC
Subjt:  RKHQHLLNIARSLLFQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGC

Arabidopsis top hits (e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGGGATTTCTTTCATCCCACATGAATTCTGGCAAAGAAAACAACATAAACCATGTCTCAGGTATATTTTCACGCCCTTTTACTCATGACAATTATAAAAATACATGGAT
TATTGACTCAGGAGCCTCACGACATATTAGTTTTAATCAAGAACTTTTTAAGAATTGGAGAAAACTATCTGGCATAAATGTGGTTTTACCTACTGGATTCCGTTTGAACA
TTGACCATATAGGAGATATTGAATGTTCCAATGGATTAGTACTAAAGGATGTCTTGTTCATACCTAAGTTTTCACATAATCTAATATCTGTAAATTGCCTTCTTAAAATT
AATGAAGTCATTTTACAGTTTTCTAATACTTCGTGTATAATTCAGGACAGAAATTGTTCGAAGATGATTGGCAAGGCTGAATGTTTTAATGGTCTTTACGTGTTCAAAGA
TTGTTCTGATGCTATTACTTGTTTTACCTCTGTATCTACATGGCATAAAAGATTAGGCCATTTATCAAATAAGCGTTTAGAATCAATGAAAAATACTTCGAATTTCTGTG
ATCATCCTTGTAAACTTTGTCATATTTGTCCCCTTTCTAAACAAAAGAGACTTTCATTCCCTTTTAATAATAATGTGTCTTCTAACTGCTTTGATTTGATTCATTGTGAT
ATATGGGGCCCTTTTAAAGTTATTACTTATATGGGTTATAGATTCTTTCTCACATTAGTTGATGATTGTTCCCGATACACTTGGACATTTTTAATGAAATCAAAATCTGA
TGTTCTATCTATAATACCTAGATTATTTAAATTGGTTAAGACGCAGTTTAACAAAGATATCAAACAATTTCGGTCAGATAATGCACCTGAGCTTAAATTCCAAGATTTTT
TCTCCACAATTGGAACTATTCATCAATTTTCTTGTGCAGAAACTCCTCAACAAAATTTGGTGGTTGAAAGAAAACATCAACACCTTTTAAACATAGCTAGATCTTTACTT
TTTCAGTCCAATGTTCCTTTAATGTTTTGGGGAGATTGTGTTCTCACTGCATCTTACATTGCTAATCGTATTCTTATGCCTCTTGTTTCTAATAAAACTCCATTTTCCAT
ACTACATTCTTATGAAGCTGATTATTCTGTGCTAAGATCTTTTGGTTGTCTTTTTTTTCGTTCTACCCTTCAAGCTCATTGA
mRNA sequence
CTTCCCTCAGCTCACTATTTAAGTGAAGCTTATATTGAATAATAAACCAACTGAGTATTGTACACACTCGTGAAATGGTATTAGGGCATTTTCCCTAAAAACAAAATCTA
CCTCACCTTTTCGTTGCGTCTTTTCTCCTTTCCCGTTTCTCAAATGGCGGAACAAACTCCAATTCCACCTTCTCACACAACAAGTCAGAACACTAACAATGGAGACGATG
CAGCATCACTTGAAGCTCAATTCAATCCTTTCCACTTATATCATTCGTTCTCATTAGCATCTTCCCTAGTAACTCAGCCACTCACTAGAGTAGAAAATTGCAACTCCTGG
AAGCGTGTGACTTGATTTCACTATCAGAAAGAAACAAGCAAGGGTTCTTGACTGGCGAGGTTGTGAAACCAACGAACCCTACAACTTTACTCGCTTGGAAATGCAACAAC
GACGTTATAGCTTCGTGGATCTTGAACTCAGTGTCCAAAAAGATTGCGGCTAGCATTGTATACACAGATTCGGTGAAAGAAATTTGGGATGAATTAGGGGAAAGATTCAA
AGAGAGCAACAAAACCAAGATCTTTTCTCTTCGAAAAGCTTTGATCACTACAACTCAAGGAAATCAGTCTGTTGAAACCTATTTCACTCAATTGAAAACAATCTGGCAAG
ATCTCTATGGTTTTCGCCCAATAGATCCCTGTTCCTATGAAGCAATAAAGCCTCTCCTTGCACATTTAGAATCGGAATACATCATGGTGTTTCTCATGGGCCTTAACGAT
AGCTATGCTGCTGTCCGTACACAAATTCTTCTAATGGAACCTCTTAAATCATTCAGTAAGGTCTTCTCCCTCATAATCCAAGAAGAACGCCAAAGAGCTGCTTCATCCTC
AAACTCTTCAATCATGGATCCAATGGTTTTTCTTGCGGATGTGTCAAAAGCCTTAGTAAAGAAAGATTTTAACAAGCCACTCTGTTCTCACTGTGGAAAAAGGGGGCACA
CGGCTGAAAAATGCTATCAATTGCACTGATTCCCTCTAGGTTTCAAATTTCGCAACCCCACTTACAATCCCAATTACCGTCAATCCAGTGATGTAAAATCGTCCACCACT
ACAAACATCACTGTCGCAGTCAATTCTGTTGTTCAAACTCCGAGCGACTCTACAAATTTTTTCTCAAGCTTAACCGCAAAACAAAATTCACAACTTATGGGATTTCTTTC
ATCCCACATGAATTCTGGCAAAGAAAACAACATAAACCATGTCTCAGGTATATTTTCACGCCCTTTTACTCATGACAATTATAAAAATACATGGATTATTGACTCAGGAG
CCTCACGACATATTAGTTTTAATCAAGAACTTTTTAAGAATTGGAGAAAACTATCTGGCATAAATGTGGTTTTACCTACTGGATTCCGTTTGAACATTGACCATATAGGA
GATATTGAATGTTCCAATGGATTAGTACTAAAGGATGTCTTGTTCATACCTAAGTTTTCACATAATCTAATATCTGTAAATTGCCTTCTTAAAATTAATGAAGTCATTTT
ACAGTTTTCTAATACTTCGTGTATAATTCAGGACAGAAATTGTTCGAAGATGATTGGCAAGGCTGAATGTTTTAATGGTCTTTACGTGTTCAAAGATTGTTCTGATGCTA
TTACTTGTTTTACCTCTGTATCTACATGGCATAAAAGATTAGGCCATTTATCAAATAAGCGTTTAGAATCAATGAAAAATACTTCGAATTTCTGTGATCATCCTTGTAAA
CTTTGTCATATTTGTCCCCTTTCTAAACAAAAGAGACTTTCATTCCCTTTTAATAATAATGTGTCTTCTAACTGCTTTGATTTGATTCATTGTGATATATGGGGCCCTTT
TAAAGTTATTACTTATATGGGTTATAGATTCTTTCTCACATTAGTTGATGATTGTTCCCGATACACTTGGACATTTTTAATGAAATCAAAATCTGATGTTCTATCTATAA
TACCTAGATTATTTAAATTGGTTAAGACGCAGTTTAACAAAGATATCAAACAATTTCGGTCAGATAATGCACCTGAGCTTAAATTCCAAGATTTTTTCTCCACAATTGGA
ACTATTCATCAATTTTCTTGTGCAGAAACTCCTCAACAAAATTTGGTGGTTGAAAGAAAACATCAACACCTTTTAAACATAGCTAGATCTTTACTTTTTCAGTCCAATGT
TCCTTTAATGTTTTGGGGAGATTGTGTTCTCACTGCATCTTACATTGCTAATCGTATTCTTATGCCTCTTGTTTCTAATAAAACTCCATTTTCCATACTACATTCTTATG
AAGCTGATTATTCTGTGCTAAGATCTTTTGGTTGTCTTTTTTTTCGTTCTACCCTTCAAGCTCATTGAGGTAAATTTGATCCCCATGCCTCCGCTTGTATTTTCATTGGT
TATCCTCCTGTCATAAAAGGCTATAAACTGTATGATATTCAAAAGATGCAGGTTTATATCTCAAGAGATGTCATCTTTCATGAGAATACTTTTCTTTTACATGAAACACC
TATGCAAGATGAAACATCAAACATAGATATCTTTTCAGAACATGTGCTTCCGTGTTGTGTGCCTAATTCTTTTGCGAATTATTCGGATGTCACTGATAGTTTATACATGC
CTACTGTCACACCACCTGTGGAAAATAATGAAACCAATTGTACTGAGTTTCATGAAGACAATCTTGGCAACTTCTATCATGGAAATACCAATGATTCAAGTTCCAGTTAT
CACAATTCTTAGCTCTCAACTAGTTCAGCTGCACTGCCTTCTTCTACACCAGTTATCATACCTTTACGAAGGTCTCAAAGATCAGCCAATCCTCCTAGTTACCTCTCAGA
TTACCACTGCTACCTTTTACAACATAATATTTCTCATATAGTTACTACTTGCCCATATGACATTGACAAGTTTCTGAGATATGATTCATTGTCTTCAAGATGCAAAAATT
TCATACTTAACATCTCTACCTATTATGAACCATCCTTTTATCACCAAGCTGTTAAGTTTAAAGAATGGCGTATGGCTATGGATGATGAGATATATTGTAACGGCCCAGGT
TTTCATCCCGGTCTTCCAGGTATGCCCTAGGGGTATTTTGGTAATTTGGTCGGGCCGTCAGGGCCGTTTATGTCTTTTTCAGTTGCTAAAATTATTGGATGGTCTTATGT
AGGATTTTGTTGCTTTGTGGATGTGTTTGGGGTGGAGAGATGCAAAAGAAATAATTTAGGGAATTAAAGGTGTTGTGGTTGGGACTCGAGCCAAGGTGGCCTTGTTTGTA
GTAGCAGGGCCTTGCCAGTTGAGCCACCAGGTGATAGTGATGCTCATTCTGCGGTATGTCTTGGTTAGCTATGGGTCTTTGGGGTTTTGGTGCATTTGACTAAATGAAGG
GAAATTAGGTTTGTTTGTGTTCTTCTCCAAATCCTCCATAGCCGCCCAAGCCCTTCTCTCCCTAGCCGCCCAAGCCTTCTCTCCCAACCCTTTGAAATGCTTTTCTTCCC
CGTGCCAAGCTCCGTCCATCGATCGTTTT
Protein sequence
MGFLSSHMNSGKENNINHVSGIFSRPFTHDNYKNTWIIDSGASRHISFNQELFKNWRKLSGINVVLPTGFRLNIDHIGDIECSNGLVLKDVLFIPKFSHNLISVNCLLKI
NEVILQFSNTSCIIQDRNCSKMIGKAECFNGLYVFKDCSDAITCFTSVSTWHKRLGHLSNKRLESMKNTSNFCDHPCKLCHICPLSKQKRLSFPFNNNVSSNCFDLIHCD
IWGPFKVITYMGYRFFLTLVDDCSRYTWTFLMKSKSDVLSIIPRLFKLVKTQFNKDIKQFRSDNAPELKFQDFFSTIGTIHQFSCAETPQQNLVVERKHQHLLNIARSLL
FQSNVPLMFWGDCVLTASYIANRILMPLVSNKTPFSILHSYEADYSVLRSFGCLFFRSTLQAH