CuGenDBv2

Clc01G12600 (gene) of Watermelon (cordophanus) v2 genome

Gene ID: Clc01G12600
Organism: Citrullus lanatus subsp. cordophanus (Watermelon (cordophanus) v2)
Description: MuDRA-like transposase
Genome location: ClcChr01:23526259..23529997
RNA-Seq Expression: Clc01G12600
Synteny: Clc01G12600
Gene Ontology terms:
  GO:0006313 - transposition, DNA-mediated (biological process)
  GO:0003677 - DNA binding (molecular function)
  GO:0004803 - transposase activity (molecular function)
  GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
  IPR001207 - Transposase, mutator type
  IPR006564 - Zinc finger, PMZ-type
  IPR007527 - Zinc finger, SWIM-type
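
The Genome location field above packs a chromosome name and two coordinates into one string. A minimal sketch of how it could be split and the locus span computed follows; the helper name `parse_location` is hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re


def parse_location(loc: str):
    """Split a 'chrom:start..end' string into (chrom, start, end).

    Hypothetical helper; the format is inferred from the record's
    'Genome location' field, not from any documented schema.
    """
    m = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", loc.strip())
    if m is None:
        raise ValueError(f"unrecognised location string: {loc!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


chrom, start, end = parse_location("ClcChr01:23526259..23529997")
# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the locus spans 3,739 bp.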


Homology
GenBank top hits (e value, % identity, alignment)
XP_038874877.1 uncharacterized protein LOC120067378 [Benincasa hispida]; e value: 1.9e-85; % identity: 66.13
Query:  MTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWAR
        MTHLKASI D+PNLVIIS   ISI  AV+ IF   F+AL IYHI NNL+DKFKNKDIIPHFYL AK YRMSEFQ+YW+KL +Y G+T YLEE+GLQQWAR
Subjt:  MTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWAR

Query:  VYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVD
        VYQV+C+Y KITTNI E LN VL DARELP+TKLLEHIR                              E++ +SRTYRVSPVDM+++NVDD + GGLV 
Subjt:  VYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVD

Query:  LRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA
        LRSRTCTCM+FNC+EIPCSHA++AA ++NINVQTLC KWFT ECVLAA
Subjt:  LRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA

XP_038876499.1 uncharacterized protein LOC120068931 [Benincasa hispida]; e value: 4.6e-76; % identity: 65.07
Query:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVNDGEKDASWLWFMTHLKASIR-DI
        Y RAYR REY LV+ RGS +GSY +V +Y EALKL N  T+F+VEVE+++YFKYV+MA G C KGFLNCI  +IVV+            THL    + ++
Subjt:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVNDGEKDASWLWFMTHLKASIR-DI

Query:  PNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKI
        PNLVI  D  ISI  AV+ IF + FYALCIYHI NNL+DKFKNKDII HFYL+ KAYRMS F +YW+KL +Y G+T YLEE+GLQ WARVYQVHCRY K+
Subjt:  PNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKI

Query:  TTNIVECLNEVLIDARELPITKLLEHIRE
        TTNIVECLN VL DA+ELPITKLLEHIRE
Subjt:  TTNIVECLNEVLIDARELPITKLLEHIRE

XP_038884809.1 protein FAR-RED ELONGATED HYPOCOTYL 3-like [Benincasa hispida]; e value: 2.7e-76; % identity: 55.28
Query:  IVGPTNDSSYDDLITILHDCLSDEDIREGQIFFSKYDLYGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLC
        ++G    S Y+ +  I +     ED+R     +     YGRAY  +EY L++ RG  EGSY +V +Y EALKLAN GT+FEVEVE+ +YFKYV+MA   C
Subjt:  IVGPTNDSSYDDLITILHDCLSDEDIREGQIFFSKYDLYGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLC

Query:  TKGFLNCIRSVIVVN--------------------------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIY
         +GFLNCIR VIVV+                                D E DASW WFMTHLKASI D+ NLVIIS+  ISI  A++ IF   F+ALCIY
Subjt:  TKGFLNCIRSVIVVN--------------------------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIY

Query:  HIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDA
        HI NNLLDKFKN+DIIPHFYL+AKAYR+SEFQ+YW+KL +Y G+T YLEE GLQ+WARVYQ+HCRY K+TTNIVE LN VL DA
Subjt:  HIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDA

XP_038887021.1 uncharacterized protein LOC120077189 [Benincasa hispida]; e value: 1.1e-77; % identity: 63.45
Query:  MTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWAR
        MTHLKASIRD+PNLVIISD  ISI  AV+SIFL+ F+ALCIYHIWNNL+DK KNKDIIP+FYL+AKAYRMS+FQ+YW+KL +Y  + TYLE++GLQ+WAR
Subjt:  MTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWAR

Query:  VYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVD
        VYQV+CRY K+T NIVECLNEVL  ARELPITKLLEHIR                              E+E +SRTY V  VDM+++NVDDG+L G VD
Subjt:  VYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVD

Query:  LRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSK
        LRS TCT M+FNC EIPCSHA++   ++NIN+QT C K
Subjt:  LRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSK

XP_038904212.1 uncharacterized protein LOC120090557 [Benincasa hispida]; e value: 2.3e-83; % identity: 66.12
Query:  SKYDLYGRAYRAREYVLVFARGSSEG-SY--PLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN---------DGEKDA
        SKY+  G+ YR ++ +    R      SY    V +YGEALKLANPGT+FEVEVE+++YFKYV+MA G C KGFLNC   VIVV+         DGE DA
Subjt:  SKYDLYGRAYRAREYVLVFARGSSEG-SY--PLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN---------DGEKDA

Query:  SWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGL
        SW WFMTHLKASIRD+PNL+IISD  ISI  AV+SIF + F+ALCIYHI NNL+DKFKNK+ IPHFYL+AK YRM EFQ+YW+KL +Y G+T YLEEIGL
Subjt:  SWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGL

Query:  QQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIRE
         QWARVYQVHCRY K+TTNI+ECLN VL DAR+L ITKLLEHI+E
Subjt:  QQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIRE

TrEMBL top hits (e value, % identity, alignment)
A0A6J1BRM2 protein FAR1-RELATED SEQUENCE 4-like; e value: 6.3e-63; % identity: 36.5
Query:  IVGPTNDSSYDDLITILHDCLSDEDIREGQIFFSKYDLYGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLC
        +VG    S + D+          +DIRE       YD   +A+R+ E  L   RG    SY L+ +YGEA+K+ NPGT+FE+E+++ +YFKYV+MA G  
Subjt:  IVGPTNDSSYDDLITILHDCLSDEDIREGQIFFSKYDLYGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLC

Query:  TKGFLNCIRSVIVVN--------------------------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIY
         +GF+ CIR V+V++                                  E  ASW+WFMT LK+ +  + NLV ISD   +I  A+  +F   F+  CI+
Subjt:  TKGFLNCIRSVIVVN--------------------------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIY

Query:  HIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR---
        H+  NLL KFK   +   F+ +AKA+R S F   W +L  + G+  YLE IG ++WAR +Q   RY ++TTNI E +N +   AR+L IT LL+HIR   
Subjt:  HIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR---

Query:  ---------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTV
                                   EA   +R + V  +D +   V DG+L G VDL+S+TCTC +F+  ++PCSHA+AAA+ ++IN  TLC + +TV
Subjt:  ---------------------------EAESVSRTYRVSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTV

A0A6J1D278 uncharacterized protein LOC111016623; e value: 9.8e-64; % identity: 35.29
Query:  YDL---YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------
        YD+   Y +A+RA+E  L    GS + SY L+  YGEALK  NPGT+F +++E+ +YF+Y +MA G   +GF +CIRSV+V++                 
Subjt:  YDL---YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------

Query:  ---------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQ
                       D E D SW WF+  +K++I ++  LV +SD   +I N+V+++F    +  C++HI  NL DKFKN+ +   +  +A+A++ S F+
Subjt:  ---------------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQ

Query:  IYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESV
         YW++L  +  +  YL+E+G  +W+R YQ   RY ++TTNI E +N VL+ AR LP+T LLE+                              +R++   
Subjt:  IYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESV

Query:  SRTYRVSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA
        +R + + P+D Y   V DG     V+L S+TC+C +F+  +IPCSHA+AAA +QN+N  TLCS  + +E ++ A
Subjt:  SRTYRVSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA

A0A6J1DL12 uncharacterized protein LOC111022077; e value: 2.2e-63; % identity: 35.66
Query:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------
        Y +A+RA+E  L    GS + SY  +  Y EALK+ N GT+FE+E+EE +YFKY +MA G C +GF +CIR V+V++                       
Subjt:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------

Query:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL
                 D E DASW WF+  LK  I ++  L+ +SD  +SI  +V  +F E  + +C++H+  NL DKFKN DI   F L+AKA++ S F+ Y+S+L
Subjt:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL

Query:  QRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESVSRTYRV
          +  +  YLE IG ++W R +Q   RY ++T+N  E +N VL  AR LP+T LLE                               +R AE++SR Y +
Subjt:  QRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESVSRTYRV

Query:  SPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAAMPNRYTQLAIDKSGDKDLI
        +P+D++ L V DG     V+L +RTC C +F+  E+PCSHA+AA   QN+N  +LCS  ++++ ++    N Y +       ++D +
Subjt:  SPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAAMPNRYTQLAIDKSGDKDLI

A0A6J1DLB0 uncharacterized protein LOC111021969; e value: 1.8e-62; % identity: 35.6
Query:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------
        Y + +RARE  L    GS + SY  ++ YG ALK AN GTVF++++E+  YFKY +MA G   +GF +CIR V+VV+                       
Subjt:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------

Query:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL
                 D E D SW WF+  +K  I ++  LV +SD   +I N+V+++FL+  +  C++H+   L +KF+N  +   FY +AKA+++S+F+ YW +L
Subjt:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL

Query:  QRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESVSRTYRV
          + G+  YLE+IGL +WAR YQ   RY ++T+N+ E +N VL+ AR+LPIT L E+                              ++E    +R + V
Subjt:  QRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEH------------------------------IREAESVSRTYRV

Query:  SPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA
         P+D +   V DG     V++ S+TCTC +F   EIPCSHA+A A ++NI+V TLCS  + ++ ++ A
Subjt:  SPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA

A0A6J1DRN0 uncharacterized protein LOC111022579 isoform X1; e value: 3.7e-63; % identity: 37.94
Query:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------
        Y +A+RARE V +  +GSSE SY L++ YGEALKLANPGT +E+++E+  +FKY++MA G C +GFLNCIR VIV++                       
Subjt:  YGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVN-----------------------

Query:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL
                 D E D S  WF   LK +I ++ +L+ +SD   SI  +++ +F   F+ LCI+H+  NL  KF N+ I   F  +AKAYR S+F+  W ++
Subjt:  ---------DGEKDASWLWFMTHLKASIRDIPNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKL

Query:  QRY-HGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------EAESV------------------SRTYR
          + +G+  YLEE+GL +W R+Y    RY  +TTNI E +N +L +AREL +  ++EH+R            EA  V                  S T +
Subjt:  QRY-HGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNEVLIDARELPITKLLEHIR------------EAESV------------------SRTYR

Query:  VSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA
        V+ ++ +  +V D      V+L +R CTCM+F   ++PC+HA+ AA  QNI+V +LC+ ++T EC+LAA
Subjt:  VSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAA

SwissProt top hits (e value, % identity, alignment)
No hits found

Arabidopsis top hits (e value, % identity, alignment)
AT1G49920.1 MuDR family transposase; e value: 9.2e-06; % identity: 21.25
Query:  SWLWFMTHLKASIRDIPNLVIISDHQ---ISIGNAVSSIFLE--TFYALCIYHIWNNLLDKFKNKDIIPHFYL--SAKAYRMSEFQIYWSKL-QRYHGMT
        SW WF+T ++  +     + +IS      +++ N   S + E   ++  C+YH+ + L       D   HF +  +  + +  EF  Y  ++ +R     
Subjt:  SWLWFMTHLKASIRDIPNLVIISDHQ---ISIGNAVSSIFLE--TFYALCIYHIWNNLLDKFKNKDIIPHFYL--SAKAYRMSEFQIYWSKL-QRYHGMT

Query:  TYLEEIGLQQWARVYQVHCRY--YKITTNIVEC--------------------LNEVLIDARELPITKL----------LEHIREAESVSRTY--RVSPV
         +L++    QWA  +    RY   +I T  +                      L +   ++ +L    L          +E + E E+ S T+   ++P+
Subjt:  TYLEEIGLQQWARVYQVHCRY--YKITTNIVEC--------------------LNEVLIDARELPITKL----------LEHIREAESVSRTY--RVSPV

Query:  --DMY-----------VLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVE
          D Y           ++   +    G+V L   TCTC +F   + PC HA+A      IN        +TVE
Subjt:  --DMY-----------VLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVE


Sequences
CDS sequence
ATGATACTAGGCGATCGATTAGTCCATTACCAGGCGATCGCATACATGATACTAGGCGATTGCTTAATCTATCACCAGGCGATTGCATACATGATACTAGGCAATCGATT
AGTCCACTACCAAGCGATTGCATACATGATACTAGGCGATCGATTAGTCCACCAAGCGATCGCATACATGATACTAGGCGATCGTTTACACTTACCTGAATTCAGACCAT
TTCTAACTAGAAAACTTAACTTCTCTGCAGAAGGAAGTGATGCATCTCGAATGTCATTGTTCGTATCACTTATACCTCATGAGAGGCATGGAGTACATGACATGCACCAA
GAATGCAATCCAACAGCTGAAGCAGTTCCAAGCATCTCAACATATGATTTTCCCACCACGTCATCATTGGAGAAGAATGTTGAGGCATGCAATCCGGGGGATGACAGAAG
TGAAGAATGGGCTGGTCAAAGTATTCTGACGTATGACGTGTACGGACAGTGGTCAATACCACAGACAATGCCAACCTCACCTGTGGCAATGCCATCCGTCCAAGTGCCAC
CCATTGTCCCAACCCAACCATTTGTAGACATAGTCGGACCAACAAATGACTCATCGTACGACGACCTCATCACAATTCTCCACGATTGTTTGAGCGACGAAGATATAAGG
GAAGGTCAAATTTTCTTCTCCAAATATGACCTCTATGGGCGAGCTTATCGTGCAAGGGAGTACGTCTTGGTATTTGCAAGAGGGTCGTCAGAAGGATCATATCCGCTTGT
CAATTCATATGGTGAGGCACTGAAACTTGCAAATCCAGGCACGGTGTTTGAGGTTGAAGTAGAAGAGCAGCGGTACTTTAAGTACGTCTACATGGCATTTGGTCTATGTA
CTAAGGGATTCTTGAACTGCATCCGTTCTGTTATTGTTGTCAACGACGGCGAGAAGGATGCATCTTGGCTTTGGTTCATGACCCACTTGAAGGCCTCGATTAGAGACATC
CCTAACTTGGTGATTATATCAGATCATCAGATATCCATCGGGAATGCTGTATCAAGTATATTCCTAGAGACATTCTATGCACTGTGTATATACCACATTTGGAATAACTT
GCTGGATAAGTTCAAAAATAAGGACATAATTCCGCATTTTTACCTATCAGCTAAGGCCTACAGGATGTCTGAGTTTCAAATATATTGGTCTAAGCTTCAAAGGTATCATG
GGATGACAACCTACCTTGAAGAGATTGGATTACAACAGTGGGCACGGGTGTACCAAGTCCATTGTAGGTATTACAAGATAACAACAAATATAGTAGAATGTCTCAATGAA
GTATTAATAGACGCGAGGGAACTGCCTATTACGAAGCTTCTAGAGCATATTCGGGAGGCTGAATCGGTGTCCAGAACGTATCGTGTATCTCCAGTGGACATGTATGTGCT
AAATGTTGACGATGGGCATCTTGGTGGGCTGGTTGATCTTCGTTCACGAACATGTACGTGTATGAAGTTCAATTGCATGGAGATTCCATGCTCACACGCAGTAGCTGCGG
CAGCCATGCAGAATATCAACGTTCAAACGTTGTGCTCAAAGTGGTTCACAGTAGAGTGTGTGCTTGCCGCTATGCCGAACCGATATACCCAGTTGGCTATCGACAAGAGT
GGAGACAAAGACCTAATTTCGATAACTTTGAAATCTTACCACCGCAAAAGGTGCCAAGTGTAG
mRNA sequence
ATGATACTAGGCGATCGATTAGTCCATTACCAGGCGATCGCATACATGATACTAGGCGATTGCTTAATCTATCACCAGGCGATTGCATACATGATACTAGGCAATCGATT
AGTCCACTACCAAGCGATTGCATACATGATACTAGGCGATCGATTAGTCCACCAAGCGATCGCATACATGATACTAGGCGATCGTTTACACTTACCTGAATTCAGACCAT
TTCTAACTAGAAAACTTAACTTCTCTGCAGAAGGAAGTGATGCATCTCGAATGTCATTGTTCGTATCACTTATACCTCATGAGAGGCATGGAGTACATGACATGCACCAA
GAATGCAATCCAACAGCTGAAGCAGTTCCAAGCATCTCAACATATGATTTTCCCACCACGTCATCATTGGAGAAGAATGTTGAGGCATGCAATCCGGGGGATGACAGAAG
TGAAGAATGGGCTGGTCAAAGTATTCTGACGTATGACGTGTACGGACAGTGGTCAATACCACAGACAATGCCAACCTCACCTGTGGCAATGCCATCCGTCCAAGTGCCAC
CCATTGTCCCAACCCAACCATTTGTAGACATAGTCGGACCAACAAATGACTCATCGTACGACGACCTCATCACAATTCTCCACGATTGTTTGAGCGACGAAGATATAAGG
GAAGGTCAAATTTTCTTCTCCAAATATGACCTCTATGGGCGAGCTTATCGTGCAAGGGAGTACGTCTTGGTATTTGCAAGAGGGTCGTCAGAAGGATCATATCCGCTTGT
CAATTCATATGGTGAGGCACTGAAACTTGCAAATCCAGGCACGGTGTTTGAGGTTGAAGTAGAAGAGCAGCGGTACTTTAAGTACGTCTACATGGCATTTGGTCTATGTA
CTAAGGGATTCTTGAACTGCATCCGTTCTGTTATTGTTGTCAACGACGGCGAGAAGGATGCATCTTGGCTTTGGTTCATGACCCACTTGAAGGCCTCGATTAGAGACATC
CCTAACTTGGTGATTATATCAGATCATCAGATATCCATCGGGAATGCTGTATCAAGTATATTCCTAGAGACATTCTATGCACTGTGTATATACCACATTTGGAATAACTT
GCTGGATAAGTTCAAAAATAAGGACATAATTCCGCATTTTTACCTATCAGCTAAGGCCTACAGGATGTCTGAGTTTCAAATATATTGGTCTAAGCTTCAAAGGTATCATG
GGATGACAACCTACCTTGAAGAGATTGGATTACAACAGTGGGCACGGGTGTACCAAGTCCATTGTAGGTATTACAAGATAACAACAAATATAGTAGAATGTCTCAATGAA
GTATTAATAGACGCGAGGGAACTGCCTATTACGAAGCTTCTAGAGCATATTCGGGAGGCTGAATCGGTGTCCAGAACGTATCGTGTATCTCCAGTGGACATGTATGTGCT
AAATGTTGACGATGGGCATCTTGGTGGGCTGGTTGATCTTCGTTCACGAACATGTACGTGTATGAAGTTCAATTGCATGGAGATTCCATGCTCACACGCAGTAGCTGCGG
CAGCCATGCAGAATATCAACGTTCAAACGTTGTGCTCAAAGTGGTTCACAGTAGAGTGTGTGCTTGCCGCTATGCCGAACCGATATACCCAGTTGGCTATCGACAAGAGT
GGAGACAAAGACCTAATTTCGATAACTTTGAAATCTTACCACCGCAAAAGGTGCCAAGTGTAG
Protein sequence
MILGDRLVHYQAIAYMILGDCLIYHQAIAYMILGNRLVHYQAIAYMILGDRLVHQAIAYMILGDRLHLPEFRPFLTRKLNFSAEGSDASRMSLFVSLIPHERHGVHDMHQ
ECNPTAEAVPSISTYDFPTTSSLEKNVEACNPGDDRSEEWAGQSILTYDVYGQWSIPQTMPTSPVAMPSVQVPPIVPTQPFVDIVGPTNDSSYDDLITILHDCLSDEDIR
EGQIFFSKYDLYGRAYRAREYVLVFARGSSEGSYPLVNSYGEALKLANPGTVFEVEVEEQRYFKYVYMAFGLCTKGFLNCIRSVIVVNDGEKDASWLWFMTHLKASIRDI
PNLVIISDHQISIGNAVSSIFLETFYALCIYHIWNNLLDKFKNKDIIPHFYLSAKAYRMSEFQIYWSKLQRYHGMTTYLEEIGLQQWARVYQVHCRYYKITTNIVECLNE
VLIDARELPITKLLEHIREAESVSRTYRVSPVDMYVLNVDDGHLGGLVDLRSRTCTCMKFNCMEIPCSHAVAAAAMQNINVQTLCSKWFTVECVLAAMPNRYTQLAIDKS
GDKDLISITLKSYHRKRCQV
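
One way to check that the protein sequence above corresponds to the listed CDS is to translate the CDS and compare the two. The sketch below assumes the standard nuclear genetic code (NCBI translation table 1); the function name `translate` and the choice to report unknown codons as `X` are illustrative, not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a CDS to protein, dropping a single trailing stop codon.

    Whitespace and line breaks are removed first, so the sequence can be
    pasted as the multi-line block shown in the record. Codons not in the
    table (for example those containing N) are reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3)
    )
    return protein[:-1] if protein.endswith("*") else protein


# First 30 nucleotides of the CDS listed above.
print(translate("ATGATACTAGGCGATCGATTAGTCCATTAC"))  # MILGDRLVHY
```

Passing the full CDS block to `translate` and comparing the result with the protein block (after joining its lines) gives a quick consistency check between the two sequences.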