CuGenDBv2

Moc11g29310 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc11g29310
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Unknown protein
Genome location: chr11:21346354..21357253
RNA-Seq Expression: Moc11g29310
Synteny: Moc11g29310
Gene Ontology terms:
GO:0016021 - integral component of membrane (cellular component)
GO:0003676 - nucleic acid binding (molecular function)
GO:0004523 - RNA-DNA hybrid ribonuclease activity (molecular function)
GO:0043167 - ion binding (molecular function)
InterPro domains:
IPR002156 - Ribonuclease H domain
IPR012337 - Ribonuclease H-like superfamily
IPR036397 - Ribonuclease H superfamily
IPR044730 - Ribonuclease H-like domain, plant type
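
The genome location in this record uses a `chromosome:start..end` string. As a minimal sketch, the Python below parses that string and reports the span of the gene region. The helper names `Region` and `parse_location` are hypothetical, and the length calculation assumes 1-based, inclusive coordinates, which this record does not state explicitly.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A genomic interval; coordinates are ASSUMED 1-based and inclusive."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive interval, so both endpoints count.
        return self.end - self.start + 1


def parse_location(text: str) -> Region:
    """Parse a 'chrom:start..end' string (hypothetical helper, not a CuGenDB API)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start coordinate is greater than end coordinate")
    return Region(chrom, start, end)


region = parse_location("chr11:21346354..21357253")
print(region.chrom, region.start, region.end, region.length)
```

Under the inclusive-coordinate assumption the region spans 10900 bp.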


Homology
GenBank top hits (e-value, % identity, alignment)
XP_022138041.1 uncharacterized protein LOC111009298 [Momordica charantia] (e-value: 1.8e-118; identity: 89.37%)
Query:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP
        MCARKGA GIVKGPTSIKGWVRKWFYASGEWLAKDES              V+IRPVPELTQASFDTLKYYKE FPRGRKVGTLVTDKLLLESGLLDYNP
Subjt:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELREEVPLKRRR
        AVRPIESSRPNSELAMVCGFASNVKRKSKGQ HALEA QSSKP TPAVVGPASEDPAPVIELESS GPSREKRPRDQTE VDVSPLGEE+REEVPLKRRR
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELREEVPLKRRR

Query:  KKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQ
        KKKKTTSPLEVGARGVLPASF DRVDDPEARMGGT  VT RFRVEPSSSGVRDQ
Subjt:  KKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQ

XP_022144034.1 uncharacterized protein LOC111013826 [Momordica charantia] (e-value: 2.1e-138; identity: 92.31%)
Query:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
        MFEYGLRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS

Query:  NVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVV-------DVSPLGE
         VKRKSKG+ HALEA QSSKP TPAVVGPASEDPAPVIELESSGGPSREKRPRDQTE V       DV PLGE
Subjt:  NVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVV-------DVSPLGE

XP_022158122.1 uncharacterized protein LOC111024680 [Momordica charantia] (e-value: 1.1e-104; identity: 97.92%)
Query:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
        MFEYGLRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSEL
        KWFYASGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSEL

XP_022159063.1 uncharacterized protein LOC111025502, partial [Momordica charantia] (e-value: 1.7e-185; identity: 92.96%)
Query:  MSSPFSSDLGSDEDLARRLESELDEIENFRFSDYGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLRIPEEGERADNPLEGWVTLYFKMFEYG
        MSS  SS+L  + DLARRLES+L+EIEN R SD GEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLR+PEEGERADNP EGWVTLYFKMFEYG
Subjt:  MSSPFSSDLGSDEDLARRLESELDEIENFRFSDYGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLRIPEEGERADNPLEGWVTLYFKMFEYG

Query:  LRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA
        LRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAEL DVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA
Subjt:  LRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA

Query:  SGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK
        SGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS VKRK
Subjt:  SGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK

Query:  SKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVD
        SKG+ HALEA QSSKPATPAVVGPASEDPA VIELESSGGPSREKRPRDQTE VD
Subjt:  SKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVD

XP_022159252.1 uncharacterized protein LOC111025665 [Momordica charantia] (e-value: 3.1e-171; identity: 64.69%)
Query:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP
        MCARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+F DVPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTDKLLLESGLLDYNP
Subjt:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAV--------VGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELRE
         VR IE+SRPNSELAMVCGF  +VKRKSKG+ HAL+ V  ++P TP V         GP+S  P PVIEL+ SGG S EKR R+++E +DVSPL  E+R 
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAV--------VGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELRE

Query:  EVPLKRRRKKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAF
        E PL+RRRKKKKT+S  E GARG LP S  D VDDPEARM GTS V  RF +EPSSSGV+DQVSRISA  LDR LRRASKFVSDPGSVLQRTID  AEAF
Subjt:  EVPLKRRRKKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAF

Query:  VASIQSALAVKAELDGREVLAAREKEEFSAALEAASSTMKDELLKAHSEVEILKAEVETKAELLKKEEDRRKAQLQAAHAITKDLEKEKFQLLKEKDNML
        +ASI  A+ VKAELDGRE LAA+E+E   AALEAA +T+K ELLKA  EV+IL+AEV+ K +LLKKE ++ KA L+AAHAITK LEKEKFQLLKEKD++ 
Subjt:  VASIQSALAVKAELDGREVLAAREKEEFSAALEAASSTMKDELLKAHSEVEILKAEVETKAELLKKEEDRRKAQLQAAHAITKDLEKEKFQLLKEKDNML

Query:  QALEAKEEELKHATAELATVKE-------------------------------LSQKGIASDMPDLQIDLGGLKKRYAEQWASGPSGTPGPQALVDKYVR
        Q LE K+  +   T EL  +KE                                  KGIA+DMP LQIDL GLKK+Y+E+WASGP+GTP PQ+LVDKYVR
Subjt:  QALEAKEEELKHATAELATVKE-------------------------------LSQKGIASDMPDLQIDLGGLKKRYAEQWASGPSGTPGPQALVDKYVR

Query:  DLDSDYSDLEEDQGRAARSISLGS
        +LDSDYSD+EE+   +     +G+
Subjt:  DLDSDYSDLEEDQGRAARSISLGS

TrEMBL top hits (e-value, % identity, alignment)
A0A6J1C8K9 uncharacterized protein LOC111009298 (e-value: 8.8e-119; identity: 89.37%)
Query:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP
        MCARKGA GIVKGPTSIKGWVRKWFYASGEWLAKDES              V+IRPVPELTQASFDTLKYYKE FPRGRKVGTLVTDKLLLESGLLDYNP
Subjt:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELREEVPLKRRR
        AVRPIESSRPNSELAMVCGFASNVKRKSKGQ HALEA QSSKP TPAVVGPASEDPAPVIELESS GPSREKRPRDQTE VDVSPLGEE+REEVPLKRRR
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELREEVPLKRRR

Query:  KKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQ
        KKKKTTSPLEVGARGVLPASF DRVDDPEARMGGT  VT RFRVEPSSSGVRDQ
Subjt:  KKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQ

A0A6J1CR42 uncharacterized protein LOC111013826 (e-value: 1.0e-138; identity: 92.31%)
Query:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
        MFEYGLRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS

Query:  NVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVV-------DVSPLGE
         VKRKSKG+ HALEA QSSKP TPAVVGPASEDPAPVIELESSGGPSREKRPRDQTE V       DV PLGE
Subjt:  NVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVV-------DVSPLGE

A0A6J1DWD2 uncharacterized protein LOC111024680 (e-value: 5.5e-105; identity: 97.92%)
Query:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
        MFEYGLRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSEL
        KWFYASGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSEL

A0A6J1DXS5 uncharacterized protein LOC111025502 (e-value: 8.3e-186; identity: 92.96%)
Query:  MSSPFSSDLGSDEDLARRLESELDEIENFRFSDYGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLRIPEEGERADNPLEGWVTLYFKMFEYG
        MSS  SS+L  + DLARRLES+L+EIEN R SD GEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLR+PEEGERADNP EGWVTLYFKMFEYG
Subjt:  MSSPFSSDLGSDEDLARRLESELDEIENFRFSDYGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPENILLRIPEEGERADNPLEGWVTLYFKMFEYG

Query:  LRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA
        LRLP+HPFVQEFLFRTGLAPAQVAPNGWGVIFALA LFWLRARDSEEAEL DVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA
Subjt:  LRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYA

Query:  SGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK
        SGEWLAKDESGRSF DVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS VKRK
Subjt:  SGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK

Query:  SKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVD
        SKG+ HALEA QSSKPATPAVVGPASEDPA VIELESSGGPSREKRPRDQTE VD
Subjt:  SKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVD

A0A6J1DZB3 uncharacterized protein LOC111025665 (e-value: 1.5e-171; identity: 64.69%)
Query:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP
        MCARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+F DVPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTDKLLLESGLLDYNP
Subjt:  MCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGTLVTDKLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAV--------VGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELRE
         VR IE+SRPNSELAMVCGF  +VKRKSKG+ HAL+ V  ++P TP V         GP+S  P PVIEL+ SGG S EKR R+++E +DVSPL  E+R 
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAV--------VGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGEELRE

Query:  EVPLKRRRKKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAF
        E PL+RRRKKKKT+S  E GARG LP S  D VDDPEARM GTS V  RF +EPSSSGV+DQVSRISA  LDR LRRASKFVSDPGSVLQRTID  AEAF
Subjt:  EVPLKRRRKKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAF

Query:  VASIQSALAVKAELDGREVLAAREKEEFSAALEAASSTMKDELLKAHSEVEILKAEVETKAELLKKEEDRRKAQLQAAHAITKDLEKEKFQLLKEKDNML
        +ASI  A+ VKAELDGRE LAA+E+E   AALEAA +T+K ELLKA  EV+IL+AEV+ K +LLKKE ++ KA L+AAHAITK LEKEKFQLLKEKD++ 
Subjt:  VASIQSALAVKAELDGREVLAAREKEEFSAALEAASSTMKDELLKAHSEVEILKAEVETKAELLKKEEDRRKAQLQAAHAITKDLEKEKFQLLKEKDNML

Query:  QALEAKEEELKHATAELATVKE-------------------------------LSQKGIASDMPDLQIDLGGLKKRYAEQWASGPSGTPGPQALVDKYVR
        Q LE K+  +   T EL  +KE                                  KGIA+DMP LQIDL GLKK+Y+E+WASGP+GTP PQ+LVDKYVR
Subjt:  QALEAKEEELKHATAELATVKE-------------------------------LSQKGIASDMPDLQIDLGGLKKRYAEQWASGPSGTPGPQALVDKYVR

Query:  DLDSDYSDLEEDQGRAARSISLGS
        +LDSDYSD+EE+   +     +G+
Subjt:  DLDSDYSDLEEDQGRAARSISLGS

SwissProt top hits: no hits found
Arabidopsis top hits: no hits found

Sequences
CDS sequence
ATGGTGATTAATTCCGATGCAGCCTGTAACATTTCATTGTCAGTGACTGGAATTGCGATTTCAATTCGTAGAGGTACTCATCTTTCGGAGGTTGCCATGTCCAAA
TTTGTGAGCATCAGGTACAGTACCCTCCATGCTGAGTTGATGGTGATTAAGGAAGATATTCGTCTTGTTTTCAGGATGAATTTCAGTGCTGTGGTGGTGGAATCT
GACGCTTTAGAGACGATTCACCTTCTAAATAGAAGGGAGATGAACAAATCAGAAGCTGGGATTTGGGTTTTGAGGATCCTTGAGCTTGGGAGAAATTTTGAAATG
GTGGAATTTCGTCATATCCGAAGGATTCACAATGAAATTGCAGATTCTTTGACCAGAGAGGTTGTCCGTTCGAGGTCCACTTTTCTTTGGGTGGATCACTTCCCT
TTGTGGCTTTCCCATCCTCCTAAGGATGAGAGCAAGAAAGCTTGGTTGCGGAATGATGCTCGACTGGTTTTACAAATCAAGAATTCAATCGAAGGAAATGTTAGT
CGAATATTTGAAGAAATCTACAAGCGCTTCTATCAACCTGAGTTTGGTGACCAATCACTTACAAATTACTTTATGGAACACAAAAGAATTTATGTAGAGTTTAAT
GCATTACTCCCAGATAGTAATTGTTTATGCAAGGATATGCACAACAGTGTATTTCAGATTGCAACTCGAACTCGGCCTCCGGACCGATCTGAATACTTGGGCGGA
CCTGCACAAAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGGTCGGATTCTCAGTTTAGTTCGAGTTGCCATGTCGTCCCCTTTTAGCAGCGACTTAGGGTCC
GATGAGGATTTAGCTCGTAGGTTAGAGTCCGAGCTCGATGAGATAGAAAACTTTAGGTTCTCCGATTACGGGGAGGATAGTGATGCCTCCACATCGGGTCAGGGT
TTGGAATACCCTTCTAGGATACCTGAGCACTACCTCGGATCCCTTCGTAGGGGGTTCGCTATCCCTGAGAACATCCTCCTTAGGATTCCGGAGGAGGGGGAGAGA
GCTGACAATCCTCTAGAGGGATGGGTCACTCTCTACTTCAAAATGTTTGAGTACGGCCTCAGACTTCCCGTTCACCCTTTTGTCCAAGAATTTCTCTTCCGAACT
GGGTTGGCTCCGGCTCAAGTGGCCCCCAATGGGTGGGGTGTCATTTTCGCTTTGGCCACCCTTTTTTGGCTACGAGCTCGGGATAGTGAAGAGGCCGAGCTGTTG
GACGTAGACCAGCTCCTCGCGTGCTTCGAAGCGAAAAGGATAGCTAAGAAGCCTGGTCGGTTCTATATGTGCGCAAGGAAAGGCGCAGGCGGTATAGTTAAGGGG
CCGACCTCCATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCTGGGGAATGGCTTGCAAAGGATGAGTCAGGTCGTTCCTTCCTTGACGTTCCCACTAGGTTT
GGGAACCTAGTTTCAATCCGACCAGTCCCTGAGCTTACGCAAGCCTCCTTCGACACGCTGAAATATTACAAGGAGCGTTTTCCGAGGGGTAGGAAGGTCGGAACC
TTGGTGACCGACAAGCTGCTGCTTGAGTCCGGGCTGCTAGATTACAACCCTGCAGTTCGTCCCATTGAATCCTCAAGGCCGAACTCCGAACTTGCCATGGTTTGC
GGATTTGCAAGCAACGTGAAGCGCAAGTCCAAGGGCCAAGTCCATGCTCTTGAGGCCGTCCAGAGTTCGAAACCTGCCACCCCTGCTGTGGTAGGGCCAGCCTCG
GAAGATCCAGCCCCAGTGATCGAGCTGGAGTCTTCTGGGGGTCCTTCGAGGGAGAAGCGCCCCAGGGATCAGACCGAGGTGGTGGACGTCTCGCCCTTGGGCGAG
GAGTTGAGGGAGGAAGTCCCTCTGAAGCGAAGGAGGAAGAAGAAGAAGACCACCTCCCCCTTGGAGGTCGGAGCTCGTGGGGTCTTGCCTGCGAGCTTCGTAGAT
CGGGTGGATGATCCTGAAGCCAGGATGGGCGGGACGTCCGCCGTGACGGCACGGTTCAGAGTTGAGCCGTCAAGTTCTGGGGTGAGGGACCAGGTGTCCCGCATC
TCGGCTGCAAGTTTGGACCGCTGCCTCAGAAGAGCGTCCAAATTTGTAAGTGACCCGGGGTCCGTCCTGCAGAGGACCATCGACTACGCCGCTGAGGCATTTGTT
GCTTCCATTCAATCGGCTCTGGCTGTAAAGGCCGAGCTGGATGGGAGGGAAGTTTTGGCAGCGAGGGAGAAAGAGGAGTTCTCTGCTGCCTTGGAGGCTGCTTCC
TCCACCATGAAGGATGAGCTGCTGAAGGCTCACTCTGAGGTGGAAATTTTGAAGGCCGAGGTGGAGACCAAGGCCGAGCTGCTGAAGAAAGAAGAGGACAGACGC
AAGGCCCAGCTCCAAGCTGCCCATGCTATCACCAAGGACTTGGAGAAGGAGAAGTTCCAACTCCTCAAGGAGAAGGACAACATGCTCCAGGCGCTTGAAGCGAAG
GAGGAGGAGCTGAAGCATGCGACTGCCGAGCTGGCGACGGTGAAGGAGCTTTCTCAGAAGGGCATTGCTTCCGACATGCCTGACCTTCAGATCGATCTCGGTGGT
CTGAAGAAGAGGTACGCTGAGCAGTGGGCGTCTGGGCCTAGCGGCACCCCTGGCCCCCAGGCGTTGGTGGATAAGTATGTCAGAGATCTGGACTCTGACTACTCC
GATCTCGAAGAGGACCAGGGCAGAGCTGCAAGGTCTATAAGCCTTGGCTCTGCTCTCCATTTAATGAAGAAGCTTTGTTTGAATGTTAAGTTCGTCAGTGGTTTT
GGCATCGCACCTCGTACCCTTAGATCCATTGAGAACCATTTCGGCATTTCAAGGATAATAACGCTTCAGGTGCTCCGCGTTCCATGGGTGCGCGATGACGTCTCC
TTTCAGATCGGCCAACATGTACGTCCCAGGCCCTCTTCAGGGGTTAGGCATCTCAATAGAGGCAGAGAAAAGCCGCGTCGGTACAATGCTCTACCTCGGACCACG
AACCGAGCTGCTCGCCTTGCCAACTTTCTGCGCTCCTTGGGGTCTTGTGGTGAATTGCCCCGAATGAAGTCCGCAATCGGGTCCATTCATGAGGACTCTGGAGCG
TCGATCTCCATCAGATCTCGCTCTGAGATCGAGGGATTATCTAAGATCTCGACGGGGACCGACCTGGCCAGGTCGTATCAAAAGTCATTGACGACATCGTCCTCT
AATCCAATTACCGTCATCGCTGAGTCAGGTAACTCTAACTTATTTGCTAATTTCTCCCCATCTGCATCTTTGCCTGATGTTACCATAGCAGATGGCACCACTTCT
CCTATTCTTGGCTCTGGCACAGATCTTACAACGAAGAAGACTATTGGTAAAGGGCATGAATCCAATGACCTCTACACGTTTGATACACAAATACCTACATCTACC
ATTTTCACTCGAGTACCATCTCCTTTTGAAGAACATTGTCATTTAGATATCTTGTCTCTCCCGTTCTTTGAAGATACTTCTTTTTCATCATCTTTGAGTACCAGT
CAGGGCGAACGTTCACAAGAAGATGATGACTTTCTTGTCTATTTCATTGTCTCTACTTCTACTAAAGAGCTTTATAGCAATACATCTCCATATGTGCCTGATCCT
TCTCCTCCCACCATTACTCAAGTTTATTTTCGTCGGCAACCTCCTACGGACTCATGCCCTATACTAGCAGCTTCTTCATCCATGGATCCAGGAACAAGTGATGAC
ATTCATATTGCTCTTAGAAAAGGTAAGCATAAATGTACTTATCATGTTTCTTCTTTTGTTTCATATAACCATTTGTCCTCACCTACTTGTTTGTTTCTTGCATCC
CTTGAGTCTGTATCTGTTCCTAAAACTGTTCATGAAGCTTTGTCTCATCCTGGTTGGCGAGACCATTTGGGAGTGCAAATTAATCAAAAGAAGCGAAAAGACGGA
AAAACCTACCTAGGAGGCGCCAGGCGCCTGGGAAGCCTGCAGAAAACAGTTTTTCTTCCAACTTTGCCCTTAATGAAACGCGTCTTCCAATGCGTTTTGGTGGTT
CCAACCGATGCATACGTGTAG
mRNA sequence
ATGGTGATTAATTCCGATGCAGCCTGTAACATTTCATTGTCAGTGACTGGAATTGCGATTTCAATTCGTAGAGGTACTCATCTTTCGGAGGTTGCCATGTCCAAA
TTTGTGAGCATCAGGTACAGTACCCTCCATGCTGAGTTGATGGTGATTAAGGAAGATATTCGTCTTGTTTTCAGGATGAATTTCAGTGCTGTGGTGGTGGAATCT
GACGCTTTAGAGACGATTCACCTTCTAAATAGAAGGGAGATGAACAAATCAGAAGCTGGGATTTGGGTTTTGAGGATCCTTGAGCTTGGGAGAAATTTTGAAATG
GTGGAATTTCGTCATATCCGAAGGATTCACAATGAAATTGCAGATTCTTTGACCAGAGAGGTTGTCCGTTCGAGGTCCACTTTTCTTTGGGTGGATCACTTCCCT
TTGTGGCTTTCCCATCCTCCTAAGGATGAGAGCAAGAAAGCTTGGTTGCGGAATGATGCTCGACTGGTTTTACAAATCAAGAATTCAATCGAAGGAAATGTTAGT
CGAATATTTGAAGAAATCTACAAGCGCTTCTATCAACCTGAGTTTGGTGACCAATCACTTACAAATTACTTTATGGAACACAAAAGAATTTATGTAGAGTTTAAT
GCATTACTCCCAGATAGTAATTGTTTATGCAAGGATATGCACAACAGTGTATTTCAGATTGCAACTCGAACTCGGCCTCCGGACCGATCTGAATACTTGGGCGGA
CCTGCACAAAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGGTCGGATTCTCAGTTTAGTTCGAGTTGCCATGTCGTCCCCTTTTAGCAGCGACTTAGGGTCC
GATGAGGATTTAGCTCGTAGGTTAGAGTCCGAGCTCGATGAGATAGAAAACTTTAGGTTCTCCGATTACGGGGAGGATAGTGATGCCTCCACATCGGGTCAGGGT
TTGGAATACCCTTCTAGGATACCTGAGCACTACCTCGGATCCCTTCGTAGGGGGTTCGCTATCCCTGAGAACATCCTCCTTAGGATTCCGGAGGAGGGGGAGAGA
GCTGACAATCCTCTAGAGGGATGGGTCACTCTCTACTTCAAAATGTTTGAGTACGGCCTCAGACTTCCCGTTCACCCTTTTGTCCAAGAATTTCTCTTCCGAACT
GGGTTGGCTCCGGCTCAAGTGGCCCCCAATGGGTGGGGTGTCATTTTCGCTTTGGCCACCCTTTTTTGGCTACGAGCTCGGGATAGTGAAGAGGCCGAGCTGTTG
GACGTAGACCAGCTCCTCGCGTGCTTCGAAGCGAAAAGGATAGCTAAGAAGCCTGGTCGGTTCTATATGTGCGCAAGGAAAGGCGCAGGCGGTATAGTTAAGGGG
CCGACCTCCATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCTGGGGAATGGCTTGCAAAGGATGAGTCAGGTCGTTCCTTCCTTGACGTTCCCACTAGGTTT
GGGAACCTAGTTTCAATCCGACCAGTCCCTGAGCTTACGCAAGCCTCCTTCGACACGCTGAAATATTACAAGGAGCGTTTTCCGAGGGGTAGGAAGGTCGGAACC
TTGGTGACCGACAAGCTGCTGCTTGAGTCCGGGCTGCTAGATTACAACCCTGCAGTTCGTCCCATTGAATCCTCAAGGCCGAACTCCGAACTTGCCATGGTTTGC
GGATTTGCAAGCAACGTGAAGCGCAAGTCCAAGGGCCAAGTCCATGCTCTTGAGGCCGTCCAGAGTTCGAAACCTGCCACCCCTGCTGTGGTAGGGCCAGCCTCG
GAAGATCCAGCCCCAGTGATCGAGCTGGAGTCTTCTGGGGGTCCTTCGAGGGAGAAGCGCCCCAGGGATCAGACCGAGGTGGTGGACGTCTCGCCCTTGGGCGAG
GAGTTGAGGGAGGAAGTCCCTCTGAAGCGAAGGAGGAAGAAGAAGAAGACCACCTCCCCCTTGGAGGTCGGAGCTCGTGGGGTCTTGCCTGCGAGCTTCGTAGAT
CGGGTGGATGATCCTGAAGCCAGGATGGGCGGGACGTCCGCCGTGACGGCACGGTTCAGAGTTGAGCCGTCAAGTTCTGGGGTGAGGGACCAGGTGTCCCGCATC
TCGGCTGCAAGTTTGGACCGCTGCCTCAGAAGAGCGTCCAAATTTGTAAGTGACCCGGGGTCCGTCCTGCAGAGGACCATCGACTACGCCGCTGAGGCATTTGTT
GCTTCCATTCAATCGGCTCTGGCTGTAAAGGCCGAGCTGGATGGGAGGGAAGTTTTGGCAGCGAGGGAGAAAGAGGAGTTCTCTGCTGCCTTGGAGGCTGCTTCC
TCCACCATGAAGGATGAGCTGCTGAAGGCTCACTCTGAGGTGGAAATTTTGAAGGCCGAGGTGGAGACCAAGGCCGAGCTGCTGAAGAAAGAAGAGGACAGACGC
AAGGCCCAGCTCCAAGCTGCCCATGCTATCACCAAGGACTTGGAGAAGGAGAAGTTCCAACTCCTCAAGGAGAAGGACAACATGCTCCAGGCGCTTGAAGCGAAG
GAGGAGGAGCTGAAGCATGCGACTGCCGAGCTGGCGACGGTGAAGGAGCTTTCTCAGAAGGGCATTGCTTCCGACATGCCTGACCTTCAGATCGATCTCGGTGGT
CTGAAGAAGAGGTACGCTGAGCAGTGGGCGTCTGGGCCTAGCGGCACCCCTGGCCCCCAGGCGTTGGTGGATAAGTATGTCAGAGATCTGGACTCTGACTACTCC
GATCTCGAAGAGGACCAGGGCAGAGCTGCAAGGTCTATAAGCCTTGGCTCTGCTCTCCATTTAATGAAGAAGCTTTGTTTGAATGTTAAGTTCGTCAGTGGTTTT
GGCATCGCACCTCGTACCCTTAGATCCATTGAGAACCATTTCGGCATTTCAAGGATAATAACGCTTCAGGTGCTCCGCGTTCCATGGGTGCGCGATGACGTCTCC
TTTCAGATCGGCCAACATGTACGTCCCAGGCCCTCTTCAGGGGTTAGGCATCTCAATAGAGGCAGAGAAAAGCCGCGTCGGTACAATGCTCTACCTCGGACCACG
AACCGAGCTGCTCGCCTTGCCAACTTTCTGCGCTCCTTGGGGTCTTGTGGTGAATTGCCCCGAATGAAGTCCGCAATCGGGTCCATTCATGAGGACTCTGGAGCG
TCGATCTCCATCAGATCTCGCTCTGAGATCGAGGGATTATCTAAGATCTCGACGGGGACCGACCTGGCCAGGTCGTATCAAAAGTCATTGACGACATCGTCCTCT
AATCCAATTACCGTCATCGCTGAGTCAGGTAACTCTAACTTATTTGCTAATTTCTCCCCATCTGCATCTTTGCCTGATGTTACCATAGCAGATGGCACCACTTCT
CCTATTCTTGGCTCTGGCACAGATCTTACAACGAAGAAGACTATTGGTAAAGGGCATGAATCCAATGACCTCTACACGTTTGATACACAAATACCTACATCTACC
ATTTTCACTCGAGTACCATCTCCTTTTGAAGAACATTGTCATTTAGATATCTTGTCTCTCCCGTTCTTTGAAGATACTTCTTTTTCATCATCTTTGAGTACCAGT
CAGGGCGAACGTTCACAAGAAGATGATGACTTTCTTGTCTATTTCATTGTCTCTACTTCTACTAAAGAGCTTTATAGCAATACATCTCCATATGTGCCTGATCCT
TCTCCTCCCACCATTACTCAAGTTTATTTTCGTCGGCAACCTCCTACGGACTCATGCCCTATACTAGCAGCTTCTTCATCCATGGATCCAGGAACAAGTGATGAC
ATTCATATTGCTCTTAGAAAAGGTAAGCATAAATGTACTTATCATGTTTCTTCTTTTGTTTCATATAACCATTTGTCCTCACCTACTTGTTTGTTTCTTGCATCC
CTTGAGTCTGTATCTGTTCCTAAAACTGTTCATGAAGCTTTGTCTCATCCTGGTTGGCGAGACCATTTGGGAGTGCAAATTAATCAAAAGAAGCGAAAAGACGGA
AAAACCTACCTAGGAGGCGCCAGGCGCCTGGGAAGCCTGCAGAAAACAGTTTTTCTTCCAACTTTGCCCTTAATGAAACGCGTCTTCCAATGCGTTTTGGTGGTT
CCAACCGATGCATACGTGTAG
Protein sequence
MVINSDAACNISLSVTGIAISIRRGTHLSEVAMSKFVSIRYSTLHAELMVIKEDIRLVFRMNFSAVVVESDALETIHLLNRREMNKSEAGIWVLRILELGRNFEM
VEFRHIRRIHNEIADSLTREVVRSRSTFLWVDHFPLWLSHPPKDESKKAWLRNDARLVLQIKNSIEGNVSRIFEEIYKRFYQPEFGDQSLTNYFMEHKRIYVEFN
ALLPDSNCLCKDMHNSVFQIATRTRPPDRSEYLGGPAQKGEHSDDQVSIGRILSLVRVAMSSPFSSDLGSDEDLARRLESELDEIENFRFSDYGEDSDASTSGQG
LEYPSRIPEHYLGSLRRGFAIPENILLRIPEEGERADNPLEGWVTLYFKMFEYGLRLPVHPFVQEFLFRTGLAPAQVAPNGWGVIFALATLFWLRARDSEEAELL
DVDQLLACFEAKRIAKKPGRFYMCARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFLDVPTRFGNLVSIRPVPELTQASFDTLKYYKERFPRGRKVGT
LVTDKLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRKSKGQVHALEAVQSSKPATPAVVGPASEDPAPVIELESSGGPSREKRPRDQTEVVDVSPLGE
ELREEVPLKRRRKKKKTTSPLEVGARGVLPASFVDRVDDPEARMGGTSAVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFV
ASIQSALAVKAELDGREVLAAREKEEFSAALEAASSTMKDELLKAHSEVEILKAEVETKAELLKKEEDRRKAQLQAAHAITKDLEKEKFQLLKEKDNMLQALEAK
EEELKHATAELATVKELSQKGIASDMPDLQIDLGGLKKRYAEQWASGPSGTPGPQALVDKYVRDLDSDYSDLEEDQGRAARSISLGSALHLMKKLCLNVKFVSGF
GIAPRTLRSIENHFGISRIITLQVLRVPWVRDDVSFQIGQHVRPRPSSGVRHLNRGREKPRRYNALPRTTNRAARLANFLRSLGSCGELPRMKSAIGSIHEDSGA
SISIRSRSEIEGLSKISTGTDLARSYQKSLTTSSSNPITVIAESGNSNLFANFSPSASLPDVTIADGTTSPILGSGTDLTTKKTIGKGHESNDLYTFDTQIPTST
IFTRVPSPFEEHCHLDILSLPFFEDTSFSSSLSTSQGERSQEDDDFLVYFIVSTSTKELYSNTSPYVPDPSPPTITQVYFRRQPPTDSCPILAASSSMDPGTSDD
IHIALRKGKHKCTYHVSSFVSYNHLSSPTCLFLASLESVSVPKTVHEALSHPGWRDHLGVQINQKKRKDGKTYLGGARRLGSLQKTVFLPTLPLMKRVFQCVLVV
PTDAYV
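
The CDS and protein sequences in this record can be cross-checked by translating the CDS codon by codon. The sketch below is one way to do that in plain Python. It assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly, and the function name `translate` is a hypothetical helper rather than part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in T, C, A, G order.
# ASSUMPTION: the record does not say which translation table was used.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS into a protein string, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Final line of the CDS above; it ends in the stop codon TAG.
print(translate("CCAACCGATGCATACGTGTAG"))  # PTDAYV
```

Passing the full CDS, with its line breaks left in, should reproduce the protein sequence listed above if the standard-code assumption holds.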