CuGenDBv2

Moc06g30840 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc06g30840
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: DUF659 domain-containing protein
Genome location: chr6:23227718..23234383
RNA-Seq Expression: Moc06g30840
Synteny: Moc06g30840
Gene Ontology terms:
GO:0003677 - DNA binding (molecular function)
GO:0016853 - isomerase activity (molecular function)
GO:0046872 - metal ion binding (molecular function)
GO:0046983 - protein dimerization activity (molecular function)
InterPro domains:
IPR007021 - Domain of unknown function DUF659
IPR012337 - Ribonuclease H-like superfamily
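The genome location above uses a chromosome:start..end notation. As a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, the page does not state this), the string can be split and the locus span computed as follows. The function name parse_location is hypothetical and used only for illustration.

```python
import re

def parse_location(loc):
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", loc)
    if m is None:
        raise ValueError(f"unrecognised location: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

chrom, start, end = parse_location("chr6:23227718..23234383")
# Assumption: 1-based, inclusive coordinates, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the locus spans 6666 bp on chr6, which is longer than the CDS listed further down, as expected when a gene model contains introns.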


Homology
GenBank top hits (e value, % identity, alignment)
KAG5532188.1 hypothetical protein RHGRI_026721 [Rhododendron griersonianum] | e value: 3.8e-175 | % identity: 65.04
Query:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH
        M+KLE+EAK + +  APK +PLPPS  S  G  +             R  T S +EK+++K  RDQL+A IARMFYS G+PF+LARNP++  ++ +AAN+
Subjt:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH

Query:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN
         L+GY+PPG+N LRT+LLQQEK N+ERLL+PIKG WR KGVSIVSDGWSDSQRRPLINFMA++EG P+FLKAVDCS E KDK+FI  LM++VI EVGPDN
Subjt:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN

Query:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET
        VVQVITDNA NC                    VHTLNL L+NICAAKNVE+NQ+ Y ECSWI+ +A DV  +K+FIMNHS+RL++FN+FVPLKLLSVA T
Subjt:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET

Query:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH
        RFAS++VMLKRFKL+K  L+ MVIS +W +YREDD GKA+ VKE +L+D+WWD ID+ILSFTSP+YDM+R CDTDKPCLHLVYDMWDTMIEKVK  IYRH
Subjt:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH

Query:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        E  +  ++S+ YDVVHTILVDRWNKN+TPLHCLAHSLNPRYYS+EWL E  + VPP +DVE+ RERMK +K+
Subjt:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

RWR74797.1 DUF659 domain-containing protein/Dimer_Tnp_hAT domain-containing protein [Cinnamomum micranthum f. kanehirae] | e value: 4.1e-182 | % identity: 69.49
Query:  EEAKNRKERKAPKNIPLPPSFISIDGVNVS-NSPGTSNIEPKKRK----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH
        EE K R +  APK +PLP   +++   ++S NS      + KKRK    G  + IEK+FN  + DQL+A IARMFYSAGLPFHLARNPHF  AF++AAN 
Subjt:  EEAKNRKERKAPKNIPLPPSFISIDGVNVS-NSPGTSNIEPKKRK----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH

Query:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN
         LTGYVPPG+N LRTSLLQ+EKANIERLL PIKG WR KGVSIVSDGWSDSQRRPLI+FMA++EG P+FLKAVDCS E KDK+FIANLMK+VIN+VG +N
Subjt:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN

Query:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET
        VVQVITDNAPNCK                   VHTLNL L NICAAKNVE+NQ+ YGECSWI D+  DVM +KHFIMNHS+RL+MFNEFV LKLLSVA+T
Subjt:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET

Query:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH
        RFAS IVMLKRFKLIK GL+ MVISDKW+ YRE DVG A+ VKE LL+D+WWD ID+ILSFTSPIYDM+R CDTDKPCLHLVYDMWDTMIEKVK  I+RH
Subjt:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH

Query:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        E  + +E S  YDVVH ILVD WNKNNTPLHCLAHSLNPRYYS+EWL ED + VPP +DVE++RER K L +
Subjt:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

XP_022156304.1 uncharacterized protein LOC111023231 isoform X1 [Momordica charantia] | e value: 4.7e-178 | % identity: 66.11
Query:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA
        M++LE+EAK RKE+ APK + LPP   +     G     S   S  +PKKRK + S +EKSFN  + DQL++ IA+MFYS+GLPF LARNPHF  AF++A
Subjt:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA

Query:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG
        AN++L+GYVPPG+N LRT+LLQ+EK NIERLL PIK  W  KGVSIVSDGWSDSQRRP INFMAI++G PIFLK VDCS EVKDK+FI NL+K+VINEVG
Subjt:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG

Query:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV
          N++Q+ITDN PNC+                   V TLNL LKNIC++KN+E N+  + EC WIS  + DVM+VK FIMNH +RL+MF EFV LKLLS+
Subjt:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV

Query:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII
        AETRFA  I MLKRFKLIK GL+ M ISDKW+ YREDDVGKA+H+K+L+LND+WWDKID+ILSFTSPIYDMIRACDTDKPCLHL+YDMWDTMIEKVK  I
Subjt:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII

Query:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        YR++    ++ SS Y VVH IL+DRWNKNNTPLHCLAHSLNPRYYSE+WL ED N VPP QD+E+TRERMK +KR
Subjt:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

XP_022156306.1 uncharacterized protein LOC111023231 isoform X2 [Momordica charantia] | e value: 4.7e-178 | % identity: 66.11
Query:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA
        M++LE+EAK RKE+ APK + LPP   +     G     S   S  +PKKRK + S +EKSFN  + DQL++ IA+MFYS+GLPF LARNPHF  AF++A
Subjt:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA

Query:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG
        AN++L+GYVPPG+N LRT+LLQ+EK NIERLL PIK  W  KGVSIVSDGWSDSQRRP INFMAI++G PIFLK VDCS EVKDK+FI NL+K+VINEVG
Subjt:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG

Query:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV
          N++Q+ITDN PNC+                   V TLNL LKNIC++KN+E N+  + EC WIS  + DVM+VK FIMNH +RL+MF EFV LKLLS+
Subjt:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV

Query:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII
        AETRFA  I MLKRFKLIK GL+ M ISDKW+ YREDDVGKA+H+K+L+LND+WWDKID+ILSFTSPIYDMIRACDTDKPCLHL+YDMWDTMIEKVK  I
Subjt:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII

Query:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        YR++    ++ SS Y VVH IL+DRWNKNNTPLHCLAHSLNPRYYSE+WL ED N VPP QD+E+TRERMK +KR
Subjt:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

XP_038721052.1 uncharacterized protein LOC120013346 isoform X1 [Tripterygium wilfordii] | e value: 2.9e-175 | % identity: 65.62
Query:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRK-----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFS
        M++LEEEAKNR+   APK +PLPPS +   G+           E KKRK       PS +EKSFN  +R+QL+ALIAR FY++GLPFHLAR+P++   F 
Subjt:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRK-----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFS

Query:  YAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINE
        +A +H L GY+PPG+N LRT+LLQQEKAN+ERLL PIK  WR KGVSIVSDGWSDSQRRPLINFMA+SE  P+FLKAVDCS E KDKFFI NLMK+VI E
Subjt:  YAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINE

Query:  VGPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLL
        VGP NVVQVITDNA NC                    VHTLNL L+NICAAKN+E+NQ+ Y ECSWI+ V+ DV ++K+FIMNHS+RL++FNEFVPLKLL
Subjt:  VGPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLL

Query:  SVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKK
        S+A TRFAS++VMLKRF LIK  L  MVIS++W +YREDD GKA+ VKE +L+D+WWD ID+IL FT+PIYDM+RACDTDKPCLHLVYDMWD+MIEKV+ 
Subjt:  SVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKK

Query:  IIYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
         IYR E  +  E+S  YDVVH ILV RWNKNNTPLHCLAHSLNPRYYSE+WL ED   VPP +DVE+ RERMK LK+
Subjt:  IIYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

TrEMBL top hits (e value, % identity, alignment)
A0A443N8D6 DUF659 domain-containing protein/Dimer_Tnp_hAT domain-containing protein | e value: 2.0e-182 | % identity: 69.49
Query:  EEAKNRKERKAPKNIPLPPSFISIDGVNVS-NSPGTSNIEPKKRK----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH
        EE K R +  APK +PLP   +++   ++S NS      + KKRK    G  + IEK+FN  + DQL+A IARMFYSAGLPFHLARNPHF  AF++AAN 
Subjt:  EEAKNRKERKAPKNIPLPPSFISIDGVNVS-NSPGTSNIEPKKRK----GTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANH

Query:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN
         LTGYVPPG+N LRTSLLQ+EKANIERLL PIKG WR KGVSIVSDGWSDSQRRPLI+FMA++EG P+FLKAVDCS E KDK+FIANLMK+VIN+VG +N
Subjt:  MLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDN

Query:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET
        VVQVITDNAPNCK                   VHTLNL L NICAAKNVE+NQ+ YGECSWI D+  DVM +KHFIMNHS+RL+MFNEFV LKLLSVA+T
Subjt:  VVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAET

Query:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH
        RFAS IVMLKRFKLIK GL+ MVISDKW+ YRE DVG A+ VKE LL+D+WWD ID+ILSFTSPIYDM+R CDTDKPCLHLVYDMWDTMIEKVK  I+RH
Subjt:  RFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRH

Query:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        E  + +E S  YDVVH ILVD WNKNNTPLHCLAHSLNPRYYS+EWL ED + VPP +DVE++RER K L +
Subjt:  ERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

A0A5B7AFB0 Uncharacterized protein | e value: 5.1e-178 | % identity: 65.76
Query:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRKGTPSA----IEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSY
        M+KLE+E K R +  A K +PLP S IS+ G   S S      + KKRK T S     +EK+FN  + +QL+A IARMFYS+GLPFHLARNP++  +F++
Subjt:  MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRKGTPSA----IEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSY

Query:  AANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEV
        AAN+ + GY+PPG+N LRT+LLQ EK NIERLL PIKG W+ KGVSIVSDGWS+SQRRPLINFMA++E  P+FLK VDCS E KDK+FIANLM++VINEV
Subjt:  AANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEV

Query:  GPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLS
        G +NV+Q+ITDNAPNCK                   VHTLNL LKNICAAKNVE+NQ+ Y ECSWISD+A DVM +KHFIMNHS+RL MFNEFV LKLLS
Subjt:  GPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLS

Query:  VAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKI
        VA+TRFAS+IVM +RFKLIK GL+ MVISDKW+ Y+EDDVG+ + VKE +LND+WWD ID+ILSFT+PIY+M++ACDTDKPCLHLVYDMWD+M+EKVK  
Subjt:  VAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKI

Query:  IYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        IYRHE  +  E+S+ YDVVH ILVDRWNKNNTPLHCLAHSLNP+YYS EWL E+ N VPP ++ E+++ER+K LKR
Subjt:  IYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

A0A6J1DT13 uncharacterized protein LOC111023231 isoform X1 | e value: 2.3e-178 | % identity: 66.11
Query:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA
        M++LE+EAK RKE+ APK + LPP   +     G     S   S  +PKKRK + S +EKSFN  + DQL++ IA+MFYS+GLPF LARNPHF  AF++A
Subjt:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA

Query:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG
        AN++L+GYVPPG+N LRT+LLQ+EK NIERLL PIK  W  KGVSIVSDGWSDSQRRP INFMAI++G PIFLK VDCS EVKDK+FI NL+K+VINEVG
Subjt:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG

Query:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV
          N++Q+ITDN PNC+                   V TLNL LKNIC++KN+E N+  + EC WIS  + DVM+VK FIMNH +RL+MF EFV LKLLS+
Subjt:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV

Query:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII
        AETRFA  I MLKRFKLIK GL+ M ISDKW+ YREDDVGKA+H+K+L+LND+WWDKID+ILSFTSPIYDMIRACDTDKPCLHL+YDMWDTMIEKVK  I
Subjt:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII

Query:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        YR++    ++ SS Y VVH IL+DRWNKNNTPLHCLAHSLNPRYYSE+WL ED N VPP QD+E+TRERMK +KR
Subjt:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

A0A6J1DUJ6 uncharacterized protein LOC111023231 isoform X2 | e value: 2.3e-178 | % identity: 66.11
Query:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA
        M++LE+EAK RKE+ APK + LPP   +     G     S   S  +PKKRK + S +EKSFN  + DQL++ IA+MFYS+GLPF LARNPHF  AF++A
Subjt:  MEKLEEEAKNRKERKAPKNIPLPP---SFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYA

Query:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG
        AN++L+GYVPPG+N LRT+LLQ+EK NIERLL PIK  W  KGVSIVSDGWSDSQRRP INFMAI++G PIFLK VDCS EVKDK+FI NL+K+VINEVG
Subjt:  ANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVG

Query:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV
          N++Q+ITDN PNC+                   V TLNL LKNIC++KN+E N+  + EC WIS  + DVM+VK FIMNH +RL+MF EFV LKLLS+
Subjt:  PDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSV

Query:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII
        AETRFA  I MLKRFKLIK GL+ M ISDKW+ YREDDVGKA+H+K+L+LND+WWDKID+ILSFTSPIYDMIRACDTDKPCLHL+YDMWDTMIEKVK  I
Subjt:  AETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKII

Query:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR
        YR++    ++ SS Y VVH IL+DRWNKNNTPLHCLAHSLNPRYYSE+WL ED N VPP QD+E+TRERMK +KR
Subjt:  YRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQDVELTRERMKLLKR

A0A7J0FQI8 BED-type domain-containing protein | e value: 5.9e-166 | % identity: 69.21
Query:  AIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQR
        AI K+FN   RD L   IARMFYS GLPFHLARNP+F  AFSYAA H + G+VPPG+N LRT+LLQ+EKANI+ LL  IK  W   GVSIVSDGWSDSQR
Subjt:  AIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQR

Query:  RPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQ
        RPLI FMA+S G P+FLKAVD S E+KDK+FIANLMK VINEVGP NVVQ+IT NAPNCK                   VHTLNL LKNICAAK+VE+N+
Subjt:  RPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCK-------------------VHTLNLVLKNICAAKNVEDNQ

Query:  IAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWD
          Y ECSWISD+A D  ++K+FI NHS+RL+M+NEFV LKLLSVAETRFAS IVMLKRFKLIKGGL+ MVI+DKW+ YREDDVG+A+ VK+ +L+D+WWD
Subjt:  IAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWD

Query:  KIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNC
         ID+IL FT PIYDMIRACDTD PCLHLVYDMWD+MIE+VK  IYRHE     E+S+ Y V+H ILVDRWNKNNTPLHCLAHSLNPRYYS++WL E +N 
Subjt:  KIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNC

Query:  VPPSQDVELTRERMKLLKR
        VPP +D E++RER+K LKR
Subjt:  VPPSQDVELTRERMKLLKR

SwissProt top hits (e value, % identity, alignment)
P04146 Copia protein | e value: 3.2e-04 | % identity: 24.44
Query:  QELWEKLAAFYKAKGISNRLYLKEQFHTLRMEEGTKISYHLSNLNSIIFELQAIEVKIDDEDKALRLILSLPPSYEH-MKPILMYKNDTLNFAEATSKLL
        +++ E L A Y+ K ++++L L+++  +L++     +  H    + +I EL A   KI++ DK   L+++LP  Y+  +  I     + L  A   ++LL
Subjt:  QELWEKLAAFYKAKGISNRLYLKEQFHTLRMEEGTKISYHLSNLNSIIFELQAIEVKIDDEDKALRLILSLPPSYEH-MKPILMYKNDTLNFAEATSKLL

Query:  SEERRLKSEGHTSHED--SALVASNWKKKKDSVQK
         +E ++K++ + + +   +A+V +N    K+++ K
Subjt:  SEERRLKSEGHTSHED--SALVASNWKKKKDSVQK

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | e value: 6.7e-18 | % identity: 32.95
Query:  FGLWQVQVKDVLIQSGLHKA------------------LKGRPSEGAFEKLSNDEGPMESSGGPSRGSKNQELWEKLAAFYKAKGISNRLYLKEQFHTLR
        F  WQ +++D+LIQ GLHK                   L  R +      LS+D          +RG     +W +L + Y +K ++N+LYLK+Q + L 
Subjt:  FGLWQVQVKDVLIQSGLHKA------------------LKGRPSEGAFEKLSNDEGPMESSGGPSRGSKNQELWEKLAAFYKAKGISNRLYLKEQFHTLR

Query:  MEEGTKISYHLSNLNSIIFELQAIEVKIDDEDKALRLILSLPPSYEHMKPILMYKNDTLNFAEATSKLLSEERRLK
        M EGT    HL+  N +I +L  + VKI++EDKA+ L+ SLP SY+++   +++   T+   + TS LL  E+  K
Subjt:  MEEGTKISYHLSNLNSIIFELQAIEVKIDDEDKALRLILSLPPSYEHMKPILMYKNDTLNFAEATSKLLSEERRLK

Arabidopsis top hits (e value, % identity, alignment)
AT1G79740.1 hAT transposon superfamily | e value: 9.0e-26 | % identity: 25.2
Query:  SRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVP--PGF--NSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLIN
        ++D     I+  F+   + F +AR+P +        +HML       PGF   S +T  L + K++I   L   + EW   G +I+++ W+D++ R LIN
Subjt:  SRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVP--PGF--NSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLIN

Query:  FMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCKVHTLNLVLKNI-------CAAKNVEDNQIAYGECSWISDVAVDVM
        F   S  R  F K+VD S   K+   +A+L   VI ++G +++VQ+I DN+  C     N +L+N        CA++ +      + +  W++       
Subjt:  FMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCKVHTLNLVLKNI-------CAAKNVEDNQIAYGECSWISDVAVDVM

Query:  VVKHFIMNHSIRLSMFNEFV-PLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVK--ELLLNDLWWDKIDHILSFTSPIYD
        V+  F+ N+S  L +  +      ++    TR  S  + L+     K  LK M    ++      +  K Q +    +L ++ +W  ++  ++ + PI  
Subjt:  VVKHFIMNHSIRLSMFNEFV-PLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVK--ELLLNDLWWDKIDHILSFTSPIYD

Query:  MIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKN-NTPLHCLAHSLNP
        ++R   T KP +  +Y+    ++ K K+ I  +  +  N++    D+V T     W ++ ++PLH  A  LNP
Subjt:  MIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKN-NTPLHCLAHSLNP

AT3G13020.1 hAT transposon superfamily protein | e value: 2.1e-22 | % identity: 24.32
Query:  IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFL
        I R FY   +      +P F+            G   P  + L   LLQ+    ++  +  IK  W++ G SI+ D W D +   L++F+A     P++L
Subjt:  IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFL

Query:  KAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCKVHTLNL-------VLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIR
        K++D S    D   + +L+  ++ EVG  NV Q+I  +          L       V  ++  +   E   +  G+     D+   V  +  FI N+   
Subjt:  KAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCKVHTLNL-------VLKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIR

Query:  LSMFNEFVPLKLLSVAETRFASI--IVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLH
        L ++ +    K ++V+ + F  +   ++LK     K  L  M  S  W    + + GK+  V  L+ +  +W+ ++ IL  TSP+ D +R         H
Subjt:  LSMFNEFVPLKLLSVAETRFASI--IVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLNDLWWDKIDHILSFTSPIYDMIRACDTDKPCLH

Query:  L--VYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKN-NTPLHCLAHSLNP-RYYSEEW
        +  +YD  D +   +KK        + N+    Y  +  ++ D WNK+ + PLH   + LNP  +YS ++
Subjt:  L--VYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKN-NTPLHCLAHSLNP-RYYSEEW

AT3G17450.1 hAT dimerisation domain-containing protein | e value: 1.3e-21 | % identity: 22.34
Query:  SRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAI
        SR  + + I++  +  G+P   A + +F+        +   G+V P        LLQ+E + I+  L   +  W + G SI++D W++++ + +I+F+  
Subjt:  SRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAI

Query:  SEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNA-------------------PNCKVHTLNLVLKNICAAKNVEDNQIAYGECSWI
              F  ++D +  V+D   +   + K+++++G +NVVQVIT N                      C +H   LVL++             + +  ++
Subjt:  SEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNA-------------------PNCKVHTLNLVLKNICAAKNVEDNQIAYGECSWI

Query:  SDVAVDVMVVKHFIMNHSIRLS-MFNEFVP-LKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW-ANYREDDVGKAQHVKELLLNDLWWDKIDHIL
        S+       +  FI N +  L+ M NEF   L LL  A  R AS    L+     K  L+ +  SD W  +       + + V++++L+ ++W K+ ++L
Subjt:  SDVAVDVMVVKHFIMNHSIRLS-MFNEFVP-LKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW-ANYREDDVGKAQHVKELLLNDLWWDKIDHIL

Query:  SFTSPIYDMIRACDT--DKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNK-NNTPLHCLAHSLNPRY-YSEEWLAE
            P+  +I   +   D+  +   Y         +K I         ++++  Y     ++  RWN   + PL+  A+  NP Y Y  +++A+
Subjt:  SFTSPIYDMIRACDT--DKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNK-NNTPLHCLAHSLNPRY-YSEEWLAE

AT4G15020.1 hAT transposon superfamily | e value: 4.6e-22 | % identity: 24.62
Query:  SAIEKSFNKASRDQLNAL---IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWS
        S+++   + + RD+ N +   I R  +  G  F    + +F+      A+    G   P  + LR  +L+     + + +   K  W+  G SI+ +  +
Subjt:  SAIEKSFNKASRDQLNAL---IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWS

Query:  DSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVIT--DNAPNCKVHTLNLVLKNI----CAAKNVEDNQIAYGECSWI
          +   ++NF+     + +FLK+VD S  +     +  L+ +++ EVG  NVVQVIT  D+        L LV  ++    CAA  ++     +G+  WI
Subjt:  DSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVIT--DNAPNCKVHTLNLVLKNI----CAAKNVEDNQIAYGECSWI

Query:  SDVAVDVMVVKHFIMNHSIRLSMFNEF-----VPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW--ANYREDDVGKAQHVKELLLNDLWWDKI
        S+       +  F+ NHS  L++  +F     + L   S + T FA+    L R   +K  L+ MV S +W   +Y E+  G    V   L ++ +W  +
Subjt:  SDVAVDVMVVKHFIMNHSIRLSMFNEF-----VPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW--ANYREDDVGKAQHVKELLLNDLWWDKI

Query:  DHILSFTSPIYDMIR-ACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRW--NKNNTPLHCLAHSLNPRYY
          +   TSP+   +R  C   +P +  VY      + + K  I  H  L + E+     +++  ++DRW   + + PL      LNP+ +
Subjt:  DHILSFTSPIYDMIR-ACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRW--NKNNTPLHCLAHSLNPRYY

AT4G15020.2 hAT transposon superfamily | e value: 4.6e-22 | % identity: 24.62
Query:  SAIEKSFNKASRDQLNAL---IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWS
        S+++   + + RD+ N +   I R  +  G  F    + +F+      A+    G   P  + LR  +L+     + + +   K  W+  G SI+ +  +
Subjt:  SAIEKSFNKASRDQLNAL---IARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGFNSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWS

Query:  DSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVIT--DNAPNCKVHTLNLVLKNI----CAAKNVEDNQIAYGECSWI
          +   ++NF+     + +FLK+VD S  +     +  L+ +++ EVG  NVVQVIT  D+        L LV  ++    CAA  ++     +G+  WI
Subjt:  DSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVIT--DNAPNCKVHTLNLVLKNI----CAAKNVEDNQIAYGECSWI

Query:  SDVAVDVMVVKHFIMNHSIRLSMFNEF-----VPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW--ANYREDDVGKAQHVKELLLNDLWWDKI
        S+       +  F+ NHS  L++  +F     + L   S + T FA+    L R   +K  L+ MV S +W   +Y E+  G    V   L ++ +W  +
Subjt:  SDVAVDVMVVKHFIMNHSIRLSMFNEF-----VPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKW--ANYREDDVGKAQHVKELLLNDLWWDKI

Query:  DHILSFTSPIYDMIR-ACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRW--NKNNTPLHCLAHSLNPRYY
          +   TSP+   +R  C   +P +  VY      + + K  I  H  L + E+     +++  ++DRW   + + PL      LNP+ +
Subjt:  DHILSFTSPIYDMIR-ACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRW--NKNNTPLHCLAHSLNPRYY
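In the alignments above, the unlabeled middle line follows the usual BLAST convention: a letter marks an identical residue, "+" marks a conservative substitution, and a blank marks a mismatch or gap. As a rough sketch, percent identity over a stretch of alignment can be estimated by counting the letters in that middle line. The function name midline_identity is hypothetical, and the denominator (here the aligned length passed in by the caller) is an assumption, so the result need not reproduce the % identity figures reported on this page.

```python
def midline_identity(midline, aligned_length):
    """Percent identity from a BLAST-style midline.

    Letters in the midline mark identical residues; '+' and spaces do not count.
    aligned_length is supplied by the caller because trailing spaces in the
    midline are often lost when the text is copied.
    """
    identical = sum(1 for ch in midline if ch.isalpha())
    return 100.0 * identical / aligned_length

# First six columns of the KAG5532188.1 alignment: query "MEKLEE", midline "M+KLE+".
print(round(midline_identity("M+KLE+", 6), 2))
```

Only the middle line is used here, since it is the line that records which columns match.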


Sequences
CDS sequence
ATGGAGAAATTGGAAGAAGAAGCGAAAAATCGGAAGGAGAGAAAAGCCCCTAAAAATATTCCTTTACCACCTTCATTTATATCAATCGATGGTGTTAATGTGAGTAATTC
TCCTGGCACGAGTAATATTGAGCCAAAAAAGAGGAAAGGCACTCCAAGTGCAATTGAGAAGTCATTCAACAAGGCATCTCGAGATCAACTGAATGCACTCATTGCGCGAA
TGTTTTATTCTGCTGGCTTGCCATTTCATTTAGCTAGAAACCCACACTTTAGAGGTGCCTTTAGTTATGCGGCGAACCATATGTTGACCGGATATGTACCTCCGGGATTT
AATTCGTTGAGGACGAGTCTTTTACAACAAGAGAAGGCGAATATTGAGAGGTTATTAATACCAATTAAAGGTGAATGGCGTCTGAAAGGAGTGAGCATTGTGAGTGATGG
ATGGAGTGACTCACAAAGGAGGCCTTTAATTAACTTTATGGCCATCTCTGAAGGTAGACCAATATTTTTGAAAGCAGTAGATTGTTCTTGCGAGGTAAAAGACAAGTTTT
TTATTGCTAATTTGATGAAAAAAGTGATCAATGAAGTTGGTCCTGATAATGTAGTGCAAGTGATAACCGATAATGCTCCTAATTGCAAAGTACATACATTGAATCTTGTC
CTGAAGAACATTTGTGCTGCTAAGAATGTTGAAGACAATCAAATTGCATATGGAGAATGCAGCTGGATATCTGATGTTGCTGTAGATGTCATGGTTGTGAAACATTTTAT
CATGAATCATTCTATTAGGTTGTCCATGTTTAATGAATTTGTACCTCTTAAATTGTTGTCTGTGGCTGAAACACGTTTTGCATCGATTATTGTTATGTTGAAGAGGTTTA
AATTGATTAAAGGTGGCTTAAAAGTCATGGTCATCAGTGACAAGTGGGCAAATTATAGAGAAGATGATGTGGGAAAGGCCCAACATGTAAAGGAGTTATTGCTTAATGAT
TTATGGTGGGACAAAATCGACCACATTCTTTCCTTCACTTCACCTATATACGACATGATCAGAGCTTGTGATACAGACAAGCCTTGTTTGCACTTGGTATATGATATGTG
GGATACGATGATCGAAAAGGTGAAGAAAATCATATATAGACATGAAAGATTGCAGTCGAACGAAAATTCATCTTGTTATGATGTGGTGCACACCATTCTCGTTGATCGTT
GGAACAAAAATAATACTCCACTACATTGTTTAGCACATTCTCTAAATCCAAGGTATTATAGTGAAGAGTGGCTTGCAGAAGACTCTAATTGTGTGCCTCCGAGTCAAGAT
GTGGAATTAACTAGGGAGAGAATGAAGTTGTTGAAGAGGATCAACTTTGGCCTGTGGCAAGTGCAAGTCAAGGATGTGCTGATACAATCAGGGTTACACAAGGCGTTGAA
GGGAAGACCGAGTGAAGGTGCTTTTGAAAAGCTAAGCAATGATGAAGGTCCAATGGAGTCCAGCGGTGGTCCCAGCAGAGGTTCTAAGAACCAAGAACTTTGGGAGAAGC
TCGCAGCGTTTTATAAAGCAAAGGGCATCTCAAATCGACTGTACCTGAAGGAGCAGTTTCACACGCTGCGAATGGAAGAAGGTACGAAAATTTCATATCATCTGAGTAAT
CTCAATAGCATCATCTTTGAGCTACAGGCGATCGAAGTGAAGATAGATGACGAAGATAAAGCACTCAGGCTCATCTTATCACTTCCACCTTCTTATGAACACATGAAGCC
GATTTTGATGTATAAGAATGATACTTTGAATTTTGCCGAGGCTACTAGTAAACTCTTGTCAGAGGAAAGAAGGTTGAAGAGTGAAGGGCATACTTCACATGAAGATTCGG
CACTGGTAGCTAGCAATTGGAAGAAGAAGAAAGACTCCGTACAAAAGAAAGCTTGTTGCTAG
mRNA sequence
ATGGAGAAATTGGAAGAAGAAGCGAAAAATCGGAAGGAGAGAAAAGCCCCTAAAAATATTCCTTTACCACCTTCATTTATATCAATCGATGGTGTTAATGTGAGTAATTC
TCCTGGCACGAGTAATATTGAGCCAAAAAAGAGGAAAGGCACTCCAAGTGCAATTGAGAAGTCATTCAACAAGGCATCTCGAGATCAACTGAATGCACTCATTGCGCGAA
TGTTTTATTCTGCTGGCTTGCCATTTCATTTAGCTAGAAACCCACACTTTAGAGGTGCCTTTAGTTATGCGGCGAACCATATGTTGACCGGATATGTACCTCCGGGATTT
AATTCGTTGAGGACGAGTCTTTTACAACAAGAGAAGGCGAATATTGAGAGGTTATTAATACCAATTAAAGGTGAATGGCGTCTGAAAGGAGTGAGCATTGTGAGTGATGG
ATGGAGTGACTCACAAAGGAGGCCTTTAATTAACTTTATGGCCATCTCTGAAGGTAGACCAATATTTTTGAAAGCAGTAGATTGTTCTTGCGAGGTAAAAGACAAGTTTT
TTATTGCTAATTTGATGAAAAAAGTGATCAATGAAGTTGGTCCTGATAATGTAGTGCAAGTGATAACCGATAATGCTCCTAATTGCAAAGTACATACATTGAATCTTGTC
CTGAAGAACATTTGTGCTGCTAAGAATGTTGAAGACAATCAAATTGCATATGGAGAATGCAGCTGGATATCTGATGTTGCTGTAGATGTCATGGTTGTGAAACATTTTAT
CATGAATCATTCTATTAGGTTGTCCATGTTTAATGAATTTGTACCTCTTAAATTGTTGTCTGTGGCTGAAACACGTTTTGCATCGATTATTGTTATGTTGAAGAGGTTTA
AATTGATTAAAGGTGGCTTAAAAGTCATGGTCATCAGTGACAAGTGGGCAAATTATAGAGAAGATGATGTGGGAAAGGCCCAACATGTAAAGGAGTTATTGCTTAATGAT
TTATGGTGGGACAAAATCGACCACATTCTTTCCTTCACTTCACCTATATACGACATGATCAGAGCTTGTGATACAGACAAGCCTTGTTTGCACTTGGTATATGATATGTG
GGATACGATGATCGAAAAGGTGAAGAAAATCATATATAGACATGAAAGATTGCAGTCGAACGAAAATTCATCTTGTTATGATGTGGTGCACACCATTCTCGTTGATCGTT
GGAACAAAAATAATACTCCACTACATTGTTTAGCACATTCTCTAAATCCAAGGTATTATAGTGAAGAGTGGCTTGCAGAAGACTCTAATTGTGTGCCTCCGAGTCAAGAT
GTGGAATTAACTAGGGAGAGAATGAAGTTGTTGAAGAGGATCAACTTTGGCCTGTGGCAAGTGCAAGTCAAGGATGTGCTGATACAATCAGGGTTACACAAGGCGTTGAA
GGGAAGACCGAGTGAAGGTGCTTTTGAAAAGCTAAGCAATGATGAAGGTCCAATGGAGTCCAGCGGTGGTCCCAGCAGAGGTTCTAAGAACCAAGAACTTTGGGAGAAGC
TCGCAGCGTTTTATAAAGCAAAGGGCATCTCAAATCGACTGTACCTGAAGGAGCAGTTTCACACGCTGCGAATGGAAGAAGGTACGAAAATTTCATATCATCTGAGTAAT
CTCAATAGCATCATCTTTGAGCTACAGGCGATCGAAGTGAAGATAGATGACGAAGATAAAGCACTCAGGCTCATCTTATCACTTCCACCTTCTTATGAACACATGAAGCC
GATTTTGATGTATAAGAATGATACTTTGAATTTTGCCGAGGCTACTAGTAAACTCTTGTCAGAGGAAAGAAGGTTGAAGAGTGAAGGGCATACTTCACATGAAGATTCGG
CACTGGTAGCTAGCAATTGGAAGAAGAAGAAAGACTCCGTACAAAAGAAAGCTTGTTGCTAG
Protein sequence
MEKLEEEAKNRKERKAPKNIPLPPSFISIDGVNVSNSPGTSNIEPKKRKGTPSAIEKSFNKASRDQLNALIARMFYSAGLPFHLARNPHFRGAFSYAANHMLTGYVPPGF
NSLRTSLLQQEKANIERLLIPIKGEWRLKGVSIVSDGWSDSQRRPLINFMAISEGRPIFLKAVDCSCEVKDKFFIANLMKKVINEVGPDNVVQVITDNAPNCKVHTLNLV
LKNICAAKNVEDNQIAYGECSWISDVAVDVMVVKHFIMNHSIRLSMFNEFVPLKLLSVAETRFASIIVMLKRFKLIKGGLKVMVISDKWANYREDDVGKAQHVKELLLND
LWWDKIDHILSFTSPIYDMIRACDTDKPCLHLVYDMWDTMIEKVKKIIYRHERLQSNENSSCYDVVHTILVDRWNKNNTPLHCLAHSLNPRYYSEEWLAEDSNCVPPSQD
VELTRERMKLLKRINFGLWQVQVKDVLIQSGLHKALKGRPSEGAFEKLSNDEGPMESSGGPSRGSKNQELWEKLAAFYKAKGISNRLYLKEQFHTLRMEEGTKISYHLSN
LNSIIFELQAIEVKIDDEDKALRLILSLPPSYEHMKPILMYKNDTLNFAEATSKLLSEERRLKSEGHTSHEDSALVASNWKKKKDSVQKKACC
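The protein sequence above corresponds to the CDS read codon by codon. The following is a minimal sketch of that translation, assuming the standard genetic code; the helper name translate is hypothetical and not part of any CuGenDB tooling. It is shown on the first and last codons of the CDS rather than on the full sequence.

```python
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a CDS into protein, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGAGAAATTGGAAGAAGAAGCG"))  # first eight codons of the CDS
print(translate("AAAGCTTGTTGCTAG"))           # last five codons, ending in TAG
```

The two fragments give MEKLEEEA and KACC, matching the start and end of the protein sequence listed above.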