CuGenDBv2

Tan0022828 (gene) of Snake gourd v1 genome

Gene ID: Tan0022828
Organism: Trichosanthes anguina (Snake gourd v1)
Description: Gag/pol protein
Genome location: LG02:52254812..52256325
RNA-Seq Expression: Tan0022828
Synteny: Tan0022828
Gene Ontology terms:
  GO:0006508 - proteolysis (biological process)
  GO:0015074 - DNA integration (biological process)
  GO:0003676 - nucleic acid binding (molecular function)
  GO:0008234 - cysteine-type peptidase activity (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
  IPR001584 - Integrase, catalytic core
  IPR012337 - Ribonuclease H-like superfamily
  IPR025724 - GAG-pre-integrase domain
  IPR036397 - Ribonuclease H superfamily
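
The genome location string above can be split into its parts programmatically. The following is a minimal sketch, not part of the database: the function names `parse_location` and `span_length` are hypothetical, and the coordinates are assumed to be 1-based and inclusive, which the page itself does not state.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a 'SEQID:START..END' string into (seqid, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {location!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start: int, end: int) -> int:
    """Length of the interval, ASSUMING 1-based inclusive coordinates."""
    return abs(end - start) + 1


seqid, start, end = parse_location("LG02:52254812..52256325")
print(seqid, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption the locus spans 1514 bp on LG02.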


Homology
GenBank top hits (e value, % identity, alignment)
KAA0034863.1 gag/pol protein [Cucumis melo var. makuwa] | e value: 3.0e-71 | % identity: 50.83
Query:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG
        +S+VL KKHE M++A  + E + N+        H ++   +   + E  R S           + R K  +    RGST   K V  S  ++  ++ + G
Subjt:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG

Query:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES
        + ++     A+  ++ +  K +   CN    W+   PK+  E+K   Q T+ K+  +SP+ENAHLWHLRLGH+NL RIEKLVK+GLL+ELEENSLPVCES
Subjt:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES

Query:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE
        CLEGKMTKRPF+GKG+ +KEPLELVHSDLCGPMN KARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +
Subjt:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE

Query:  F
        F
Subjt:  F

KAA0046415.1 gag/pol protein [Cucumis melo var. makuwa] | e value: 5.8e-67 | % identity: 79.49
Query:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV
        K   T+ K+ K+SPKENAHLWHLRLGHINL RIE+LVK+GLL+ELEENSLPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF+
Subjt:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV

Query:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF
        +F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +F
Subjt:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF

KAA0048404.1 gag/pol protein [Cucumis melo var. makuwa] | e value: 5.8e-67 | % identity: 79.49
Query:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV
        K   T+ K+ K+SPKENAHLWHLRLGHINL RIE+LVK+GLL+ELEENSLPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF+
Subjt:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV

Query:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF
        +F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +F
Subjt:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF

TYJ96675.1 gag/pol protein [Cucumis melo var. makuwa] | e value: 3.0e-71 | % identity: 50.83
Query:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG
        +S+VL KKHE M++A  + E + N+        H ++   +   + E  R S           + R K  +    RGST   K V  S  ++  ++ + G
Subjt:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG

Query:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES
        + ++     A+  ++ +  K +   CN    W+   PK+  E+K   Q T+ K+  +SP+ENAHLWHLRLGH+NL RIEKLVK+GLL+ELEENSLPVCES
Subjt:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES

Query:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE
        CLEGKMTKRPF+GKG+ +KEPLELVHSDLCGPMN KARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +
Subjt:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE

Query:  F
        F
Subjt:  F

TYJ96910.1 gag/pol protein [Cucumis melo var. makuwa] | e value: 3.1e-68 | % identity: 60.62
Query:  RGSTCRMKLVAPSRPSE--EEDEEGKSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINL
        RGST   K +  S  ++  ++++ G+ +++    A+ S++ +  K +   CN    W+   PK+  E+K   Q T+ K+ ++SPKENAHLWHLRL HINL
Subjt:  RGSTCRMKLVAPSRPSE--EEDEEGKSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINL

Query:  KRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKT
         RIE+LV++GLL+ELEEN LPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK 
Subjt:  KRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKT

Query:  EVENLLGKSLKTLRSNRGREYMDTEF
        EVEN L K++KT RS+RG EYMD +F
Subjt:  EVENLLGKSLKTLRSNRGREYMDTEF

TrEMBL top hits (e value, % identity, alignment)
A0A5A7SMH8 Gag/pol protein | e value: 2.8e-67 | % identity: 79.49
Query:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV
        K   T+ K+ K+SPKENAHLWHLRLGHINL RIE+LVK+GLL+ELEENSLPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF+
Subjt:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV

Query:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF
        +F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +F
Subjt:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF

A0A5A7SWF4 Gag/pol protein | e value: 1.4e-71 | % identity: 50.83
Query:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG
        +S+VL KKHE M++A  + E + N+        H ++   +   + E  R S           + R K  +    RGST   K V  S  ++  ++ + G
Subjt:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG

Query:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES
        + ++     A+  ++ +  K +   CN    W+   PK+  E+K   Q T+ K+  +SP+ENAHLWHLRLGH+NL RIEKLVK+GLL+ELEENSLPVCES
Subjt:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES

Query:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE
        CLEGKMTKRPF+GKG+ +KEPLELVHSDLCGPMN KARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +
Subjt:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE

Query:  F
        F
Subjt:  F

A0A5D3BAN6 Gag/pol protein | e value: 1.5e-68 | % identity: 60.62
Query:  RGSTCRMKLVAPSRPSE--EEDEEGKSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINL
        RGST   K +  S  ++  ++++ G+ +++    A+ S++ +  K +   CN    W+   PK+  E+K   Q T+ K+ ++SPKENAHLWHLRL HINL
Subjt:  RGSTCRMKLVAPSRPSE--EEDEEGKSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINL

Query:  KRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKT
         RIE+LV++GLL+ELEEN LPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK 
Subjt:  KRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKT

Query:  EVENLLGKSLKTLRSNRGREYMDTEF
        EVEN L K++KT RS+RG EYMD +F
Subjt:  EVENLLGKSLKTLRSNRGREYMDTEF

A0A5D3BDY3 Gag/pol protein | e value: 1.4e-71 | % identity: 50.83
Query:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG
        +S+VL KKHE M++A  + E + N+        H ++   +   + E  R S           + R K  +    RGST   K V  S  ++  ++ + G
Subjt:  MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPS----------KECRGKCCLTVLSRGSTCRMKLVAPSRPSE--EEDEEG

Query:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES
        + ++     A+  ++ +  K +   CN    W+   PK+  E+K   Q T+ K+  +SP+ENAHLWHLRLGH+NL RIEKLVK+GLL+ELEENSLPVCES
Subjt:  KSDRLCRQEARRSRRLRERK-VFPPCNGAILWRETVPKFTVERK--NQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCES

Query:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE
        CLEGKMTKRPF+GKG+ +KEPLELVHSDLCGPMN KARGG+EYF++F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +
Subjt:  CLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE

Query:  F
        F
Subjt:  F

A0A5D3CPJ6 Gag/pol protein | e value: 2.8e-67 | % identity: 79.49
Query:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV
        K   T+ K+ K+SPKENAHLWHLRLGHINL RIE+LVK+GLL+ELEENSLPVCESCLEGKMTKRPF+GKG+RAKEPLELVHSDLCGPMNVKARGG+EYF+
Subjt:  KNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFV

Query:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF
        +F DDYSRYGY+YLM  KSE LEKFKEYK EVEN L K++KT RS+RG EYMD +F
Subjt:  SFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF

SwissProt top hits (e value, % identity, alignment)
P04146 Copia protein | e value: 5.7e-17 | % identity: 35.57
Query:  KENAHLWHLRLGHIN------LKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRA--KEPLELVHSDLCGPMNVKARGGYEYFVSFIDDY
        K N  LWH R GHI+      +KR        LLN L E S  +CE CL GK  + PF     +   K PL +VHSD+CGP+         YFV F+D +
Subjt:  KENAHLWHLRLGHIN------LKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRA--KEPLELVHSDLCGPMNVKARGGYEYFVSFIDDY

Query:  SRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE
        + Y   YL+  KS+    F+++  + E      +  L  + GREY+  E
Subjt:  SRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTE

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 | e value: 1.7e-24 | % identity: 34.68
Query:  GAILWRETVPKFTVERKNQETRTKKAKVSPKE-NAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSD
        G+++  + V + T+ R N E    +   +  E +  LWH R+GH++ K ++ L K  L++  +  ++  C+ CL GK  +  F     R    L+LV+SD
Subjt:  GAILWRETVPKFTVERKNQETRTKKAKVSPKE-NAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSD

Query:  LCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF
        +CGPM +++ GG +YFV+FIDD SR  ++Y++  K +  + F+++   VE   G+ LK LRS+ G EY   EF
Subjt:  LCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEF

Q12491 Transposon Ty2-B Gag-Pol polyprotein | e value: 2.1e-11 | % identity: 28.9
Query:  VPKFTVERKNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLP-------VCESCLEGKMTKRPFSGKGYRAK-----EPLELV
        + K T+   N      K+K   K    L H  LGH N + I+K +K   +  L+E+ +         C  CL GK TK     KG R K     EP + +
Subjt:  VPKFTVERKNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLP-------VCESCLEGKMTKRPFSGKGYRAK-----EPLELV

Query:  HSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSE--TLEKFKEYKTEVENLLGKSLKTLRSNRGREY
        H+D+ GP++   +    YF+SF D+ +R+ ++Y +H + E   L  F      ++N     +  ++ +RG EY
Subjt:  HSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKSE--TLEKFKEYKTEVENLLGKSLKTLRSNRGREY

Q94HW2 Retrovirus-related Pol polyprotein from transposon RE1 | e value: 3.2e-12 | % identity: 29.86
Query:  AKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELE-ENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSR
        A  S K     WH RLGH     +  ++ +  L+ L   +    C  CL  K  K PFS     +  PLE ++SD+     + +   Y Y+V F+D ++R
Subjt:  AKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELE-ENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSR

Query:  YGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYM
        Y ++Y + +KS+  E F  +K  +EN     + T  S+ G E++
Subjt:  YGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYM

Q9ZT94 Retrovirus-related Pol polyprotein from transposon RE2 | e value: 1.1e-12 | % identity: 31.58
Query:  WHLRLGHINLKRIEKLVKSGLLNELE-ENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKS
        WH RLGH +L  +  ++ +  L  L   + L  C  C   K  K PFS     + +PLE ++SD+     + +   Y Y+V F+D ++RY ++Y + +KS
Subjt:  WHLRLGHINLKRIEKLVKSGLLNELE-ENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNVKARGGYEYFVSFIDDYSRYGYIYLMHKKS

Query:  ETLEKFKEYKTEVENLLGKSLKTLRSNRGREYM
        +  + F  +K+ VEN     + TL S+ G E++
Subjt:  ETLEKFKEYKTEVENLLGKSLKTLRSNRGREYM

Arabidopsis top hits (e value, % identity, alignment)
ATMG00300.1 Gag-Pol-related retrotransposon family protein | e value: 2.2e-11 | % identity: 39.08
Query:  ETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNV
        ET       + K+   LWH RL H++ + +E LVK G L+  + +SL  CE C+ GK  +  FS   +  K PL+ VHSDL G  +V
Subjt:  ETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNV
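
The % identity values listed for the hits can be cross-checked against the alignment blocks. The sketch below assumes the usual BLAST convention for the middle line of each block (a letter marks an identical residue, `+` a conservative substitution, a blank a mismatch or gap) and divides by the number of aligned columns. The function name `percent_identity` is hypothetical, and the inputs are expected without the `Query:` prefix.

```python
def percent_identity(query_line: str, midline: str) -> float:
    """Percent identity of one alignment block.

    query_line: aligned query residues, gaps ('-') included, no 'Query:' prefix.
    midline:    the match line printed under the query (ASSUMED BLAST convention).
    """
    aligned_length = len(query_line)
    if aligned_length == 0:
        raise ValueError("empty alignment")
    # Only letters on the match line count as identities; '+' and blanks do not.
    identical = sum(1 for ch in midline if ch.isalpha())
    return round(100.0 * identical / aligned_length, 2)


# Small synthetic block: 4 identical positions over 8 aligned columns.
print(percent_identity("MKT-AYLV", "MK  +YL "))
```

For a hit that spans several blocks, concatenate the query lines and the match lines before calling the function. As a plausibility check, 34 identical positions over 87 aligned columns would give 39.08, which agrees with the value reported for ATMG00300.1.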


Sequences
CDS sequence
ATGTCTGATGTCTTGGCAAAGAAGCATGAGCTGATGGTTTCCGCTAAGGAGATCATGGAGTCCTTGCGAAATGTTTGGGCGGCGGTCCTTTTCGGTCGGCATGACTCGCT
CAAATCGTTTTCAACGCCGCGGATGAAAGAGGGTCGTCGTCCGTCCAAGGAATGCAGAGGCAAATGTTGCCTCACGGTCTTATCACGGGGTTCGACCTGCAGGATGAAAC
TCGTTGCTCCTTCACGCCCAAGTGAAGAAGAGGATGAAGAGGGTAAAAGCGACCGTCTTTGCCGCCAAGAGGCAAGAAGGTCAAGGAGGTTGCGAGAGAGGAAAGTGTTT
CCACCTTGCAATGGGGCGATCCTTTGGAGAGAAACTGTCCCCAAATTCACAGTCGAGAGGAAGAATCAAGAAACACGAACTAAGAAAGCAAAAGTTTCTCCAAAAGAAAA
TGCCCACCTTTGGCATCTTCGGTTAGGCCACATTAATCTCAAGAGGATTGAGAAACTAGTGAAGAGTGGACTTCTAAACGAGTTGGAAGAAAACTCTTTGCCGGTGTGTG
AGTCATGCCTTGAGGGCAAAATGACCAAACGTCCTTTTAGTGGAAAAGGATATAGAGCCAAAGAGCCTCTTGAGTTAGTACATTCTGACCTCTGTGGTCCGATGAATGTT
AAAGCTCGAGGCGGTTATGAGTATTTTGTGTCTTTCATAGACGATTACTCCAGATATGGGTATATTTACCTAATGCACAAGAAGTCTGAAACTCTTGAAAAGTTCAAGGA
GTACAAGACTGAGGTTGAGAACCTCTTAGGTAAATCGCTTAAAACACTTCGATCGAATCGAGGTAGAGAGTACATGGACACCGAATTCGGGACTATATGA
mRNA sequence
ATGTCTGATGTCTTGGCAAAGAAGCATGAGCTGATGGTTTCCGCTAAGGAGATCATGGAGTCCTTGCGAAATGTTTGGGCGGCGGTCCTTTTCGGTCGGCATGACTCGCT
CAAATCGTTTTCAACGCCGCGGATGAAAGAGGGTCGTCGTCCGTCCAAGGAATGCAGAGGCAAATGTTGCCTCACGGTCTTATCACGGGGTTCGACCTGCAGGATGAAAC
TCGTTGCTCCTTCACGCCCAAGTGAAGAAGAGGATGAAGAGGGTAAAAGCGACCGTCTTTGCCGCCAAGAGGCAAGAAGGTCAAGGAGGTTGCGAGAGAGGAAAGTGTTT
CCACCTTGCAATGGGGCGATCCTTTGGAGAGAAACTGTCCCCAAATTCACAGTCGAGAGGAAGAATCAAGAAACACGAACTAAGAAAGCAAAAGTTTCTCCAAAAGAAAA
TGCCCACCTTTGGCATCTTCGGTTAGGCCACATTAATCTCAAGAGGATTGAGAAACTAGTGAAGAGTGGACTTCTAAACGAGTTGGAAGAAAACTCTTTGCCGGTGTGTG
AGTCATGCCTTGAGGGCAAAATGACCAAACGTCCTTTTAGTGGAAAAGGATATAGAGCCAAAGAGCCTCTTGAGTTAGTACATTCTGACCTCTGTGGTCCGATGAATGTT
AAAGCTCGAGGCGGTTATGAGTATTTTGTGTCTTTCATAGACGATTACTCCAGATATGGGTATATTTACCTAATGCACAAGAAGTCTGAAACTCTTGAAAAGTTCAAGGA
GTACAAGACTGAGGTTGAGAACCTCTTAGGTAAATCGCTTAAAACACTTCGATCGAATCGAGGTAGAGAGTACATGGACACCGAATTCGGGACTATATGA
Protein sequence
MSDVLAKKHELMVSAKEIMESLRNVWAAVLFGRHDSLKSFSTPRMKEGRRPSKECRGKCCLTVLSRGSTCRMKLVAPSRPSEEEDEEGKSDRLCRQEARRSRRLRERKVF
PPCNGAILWRETVPKFTVERKNQETRTKKAKVSPKENAHLWHLRLGHINLKRIEKLVKSGLLNELEENSLPVCESCLEGKMTKRPFSGKGYRAKEPLELVHSDLCGPMNV
KARGGYEYFVSFIDDYSRYGYIYLMHKKSETLEKFKEYKTEVENLLGKSLKTLRSNRGREYMDTEFGTI
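
The protein sequence can be checked against the CDS by translating it. This is a minimal sketch that assumes the standard genetic code (NCBI translation table 1); the names `CODON_TABLE` and `translate` are illustrative and not part of any database tooling. Translation stops at the first stop codon, and unrecognised codons are reported as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (ASSUMED: table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons and last four codons of the CDS listed above.
print(translate("ATGTCTGATGTCTTGGCAAAGAAGCATGAG"))
print(translate("GGGACTATATGA"))
```

The two fragments translate to `MSDVLAKKHE` and `GTI`, matching the start and end of the protein sequence. Passing the full CDS, with the line breaks left in, should reproduce the whole protein if the record is internally consistent.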