CuGenDBv2

Moc08g30560 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc08g30560
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Retrotrans_gag domain-containing protein
Genome location: chr8:21914124..21920721
RNA-Seq Expression: Moc08g30560
Synteny: Moc08g30560
Gene Ontology terms: GO:0003676 - nucleic acid binding (molecular function)
InterPro domains: IPR005162 - Retrotransposon gag domain; IPR036397 - Ribonuclease H superfamily
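The genome location above is written as chromosome:start..end. As a minimal sketch, the string can be split into its parts and the gene span computed. The function name `parse_region` is hypothetical, and the span calculation assumes 1-based, inclusive coordinates, which this record does not state explicitly.

```python
import re


def parse_region(region: str):
    """Split a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", region)
    if match is None:
        raise ValueError(f"unrecognized region string: {region!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_region("chr8:21914124..21920721")

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end - start + 1
print(chrom, start, end, span)
```

Under that assumption the gene region spans 6598 bp on chr8.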


Homology
GenBank top hits (e-value, % identity, alignment)
XP_022158314.1 uncharacterized protein LOC111024824 [Momordica charantia] (e-value: 8.5e-133; identity: 66.67%)
Query:  NAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAMG--------------IMDGARTWLNALEPNS-INTWTELTEKFLAKYHT-----LTR
        N QDPP P NPPV+GD AGEGAANRAGE+PNPILL DNRDVA+               + DG     +  +P S + ++ E+   F     +     L  
Subjt:  NAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAMG--------------IMDGARTWLNALEPNS-INTWTELTEKFLAKYHT-----LTR

Query:  NADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQV
        NADLRE+IVSFRQKENEAVQE WERFKELLRRC SHGLP CVQIEQFYRGLDR SRMMLNTAAN SL E S++EI+DILNKM D NDQGEIGRSLPKKQV
Subjt:  NADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQV

Query:  SAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENCPANLASIFYVG-----------------------
        SA +FELDTVASMQAQMA +NQMLKQLTMEKETKT T A+ EPS  LQISDISCVYCGDN LYENCPAN  S+FYVG                       
Subjt:  SAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENCPANLASIFYVG-----------------------

Query:  -----LGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAM-----------MKEFMTRTDAAIRSLEMQVGQIANDQKS
              GVASSSAQ PAQQYKQNYTPP FPTQPASQPQQYNQQR Q+TTQQ GSN SLEAM            KEFMTRTD  IR LEMQVGQIAND+KS
Subjt:  -----LGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAM-----------MKEFMTRTDAAIRSLEMQVGQIANDQKS

Query:  RPQGTLPEHTENPK
        RPQGTLP +TENPK
Subjt:  RPQGTLPEHTENPK

XP_022158836.1 uncharacterized protein LOC111025302 [Momordica charantia] (e-value: 1.0e-109; identity: 63.36%)
Query:  QNAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAM--------------------------------------------------------
        +NAQDPPPPQNPPVNGDMAGE AANR GEIPN ILLADNRDVAM                                                        
Subjt:  QNAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAM--------------------------------------------------------

Query:  ------------------------GIMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLP
                                 + DGARTW+NALEPNSINTW ELT+KFLAKYHTLT+NADLRE+IVSFRQKENEAVQEAWERFKELLRRC SHGLP
Subjt:  ------------------------GIMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLP

Query:  ACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLA
        +CVQIEQFYRGLDRSS+MMLNT ANGSLLE SVNEIVD+LNKM DINDQGE+GRSLPKKQVS GIFELDTVASMQAQMAAMNQMLKQLTMEKETKT T A
Subjt:  ACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLA

Query:  IPEPSPILQISDISCVYCGD---------NHLYENCPANLASIFYVGLGVASSSAQVPAQQYK
        IPE SPILQISDISCVYCG          ++ Y     +  +  +   GVASSSAQ PAQQYK
Subjt:  IPEPSPILQISDISCVYCGD---------NHLYENCPANLASIFYVGLGVASSSAQVPAQQYK

XP_022159127.1 uncharacterized protein LOC111025557 [Momordica charantia] (e-value: 2.7e-86; identity: 73.19%)
Query:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN
        + DGA TW+N LE N I TW ELT+KFLAKYHTLTRNADL+E+IVSFRQ+E+EAVQEAWERFKELL+RC SHGLP CVQI+QFYRGLD   RMM +TAAN
Subjt:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN

Query:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETK-TTTLAIPEPSPILQISDISCVYCGDNHLY
         SLLE SVNEI+DILNKMIDINDQ E+GRSLPKKQ SAGIFELDTV S+QAQ++AM+QMLKQLTM+K  K  T++ I EPS ILQISDISCVYC DNHLY
Subjt:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETK-TTTLAIPEPSPILQISDISCVYCGDNHLY

Query:  ENCPANLASIFYVGLGVASSSAQVPAQQYKQNYTP
        ENC AN A IFYVG GV     Q     Y   Y P
Subjt:  ENCPANLASIFYVGLGVASSSAQVPAQQYKQNYTP

XP_022159235.1 uncharacterized protein LOC111025653 [Momordica charantia] (e-value: 2.4e-66; identity: 43.01%)
Query:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL
        A  WLNA   ++I TW+++ +KFL KY   TRNAD+RE I+SFRQKENEAV  AWERFK+L+  C + G+PACVQIE F+RG D  ++MMLN AANG   
Subjt:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL

Query:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC
          S NEIV+IL+++ + N Q   E  R+  K+   AG+  LD + SMQ Q+  + QMLK +        +  A   PSP+ QI++ +C YCGD H  ENC
Subjt:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC

Query:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASL-------
        P+N +S++YVG                           G  SS+     QQYK+ YTPPGFP  PA    P QYNQQ+      QQ+ SN  +       
Subjt:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASL-------

Query:  --EAMMKEFMTRT-----------------DAAIRSLEMQVGQIANDQKSRPQGTLPEHTENPKR
          +A MKE MTRT                 D  +R LEMQ+GQ+ N+ ++RPQG+LP  TE P+R
Subjt:  --EAMMKEFMTRT-----------------DAAIRSLEMQVGQIANDQKSRPQGTLPEHTENPKR

XP_030494802.1 uncharacterized protein LOC115710583 [Cannabis sativa] (e-value: 3.6e-67; identity: 46.56%)
Query:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN
        + D AR WLN L P+S+  W +L EKFL KY   TRNA  R  I+SF+Q E+E   +AWERFKE+LR+C  HG+P C+Q+E FY GL+ +SRM+L+ +AN
Subjt:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN

Query:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYE
        G++L  S NE  +IL ++   N Q    R+ P  +  AG+ E+D + ++ AQMA+M  +LK + M            +P+  +Q ++ISCVYCGD H +E
Subjt:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYE

Query:  NCPANLASIFYVGLGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAMMKEFMTRTDAAIRS-------LEMQVGQIAN
        NCP+N AS+ YVG   ASSS    AQ  KQ++ PPGF  QP  +PQQ +Q +G  T       +SLE++M+++M + DA I+S       LE+Q+GQ+AN
Subjt:  NCPANLASIFYVGLGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAMMKEFMTRTDAAIRS-------LEMQVGQIAN

Query:  DQKSRPQGTLPEHTENPKRD
        D K+RPQGTLP  TENP+RD
Subjt:  DQKSRPQGTLPEHTENPKRD

TrEMBL top hits (e-value, % identity, alignment)
A0A6J1DSZ5 uncharacterized protein LOC111024107 (e-value: 5.5e-61; identity: 46.08%)
Query:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL
        A  WLNA    +I TW+++ +KFL KY   TRNAD+RE I+SFRQKENEAV  AWE FK+L+R C + G+PACVQIE F+RG D  ++MMLN AANG   
Subjt:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL

Query:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC
          S NEIV+IL+++ + NDQ   E  R+  K+   AG+  LD + SMQ Q+  + QMLK +        +  A   PSP+ QI++ +C YCGD H  ENC
Subjt:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC

Query:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASLEAMMKEF
        P+N +S++YVG                           G  SSS     QQYKQ YTPPGFP  PA    P QYNQQ+      QQ+ SN  +E +MKEF
Subjt:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASLEAMMKEF

Query:  MTRTDA
        +T+ DA
Subjt:  MTRTDA

A0A6J1DY39 uncharacterized protein LOC111025653 (e-value: 1.1e-66; identity: 43.01%)
Query:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL
        A  WLNA   ++I TW+++ +KFL KY   TRNAD+RE I+SFRQKENEAV  AWERFK+L+  C + G+PACVQIE F+RG D  ++MMLN AANG   
Subjt:  ARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLL

Query:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC
          S NEIV+IL+++ + N Q   E  R+  K+   AG+  LD + SMQ Q+  + QMLK +        +  A   PSP+ QI++ +C YCGD H  ENC
Subjt:  ENSVNEIVDILNKMIDINDQ--GEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENC

Query:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASL-------
        P+N +S++YVG                           G  SS+     QQYK+ YTPPGFP  PA    P QYNQQ+      QQ+ SN  +       
Subjt:  PANLASIFYVG--------------------------LGVASSSAQVPAQQYKQNYTPPGFPTQPA--SQPQQYNQQRGQ-STTQQSGSNASL-------

Query:  --EAMMKEFMTRT-----------------DAAIRSLEMQVGQIANDQKSRPQGTLPEHTENPKR
          +A MKE MTRT                 D  +R LEMQ+GQ+ N+ ++RPQG+LP  TE P+R
Subjt:  --EAMMKEFMTRT-----------------DAAIRSLEMQVGQIANDQKSRPQGTLPEHTENPKR

A0A6J1DYY9 uncharacterized protein LOC111025557 (e-value: 9.9e-87; identity: 73.62%)
Query:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN
        + DGA TWLN LE N I TW ELT+KFLAKYHTLTRNADL+E+IVSFRQ+E+EAVQEAWERFKELL+RC SHGLP CVQI+QFYRGLD   RMM +TAAN
Subjt:  IMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAAN

Query:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETK-TTTLAIPEPSPILQISDISCVYCGDNHLY
         SLLE SVNEI+DILNKMIDINDQ E+GRSLPKKQ SAGIFELDTV S+QAQ++AM+QMLKQLTM+K  K  T++ I EPS ILQISDISCVYC DNHLY
Subjt:  GSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETK-TTTLAIPEPSPILQISDISCVYCGDNHLY

Query:  ENCPANLASIFYVGLGVASSSAQVPAQQYKQNYTP
        ENC AN A IFYVG GV     Q     Y   Y P
Subjt:  ENCPANLASIFYVGLGVASSSAQVPAQQYKQNYTP

A0A6J1DZ19 uncharacterized protein LOC111024824 (e-value: 4.1e-133; identity: 66.67%)
Query:  NAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAMG--------------IMDGARTWLNALEPNS-INTWTELTEKFLAKYHT-----LTR
        N QDPP P NPPV+GD AGEGAANRAGE+PNPILL DNRDVA+               + DG     +  +P S + ++ E+   F     +     L  
Subjt:  NAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAMG--------------IMDGARTWLNALEPNS-INTWTELTEKFLAKYHT-----LTR

Query:  NADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQV
        NADLRE+IVSFRQKENEAVQE WERFKELLRRC SHGLP CVQIEQFYRGLDR SRMMLNTAAN SL E S++EI+DILNKM D NDQGEIGRSLPKKQV
Subjt:  NADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQV

Query:  SAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENCPANLASIFYVG-----------------------
        SA +FELDTVASMQAQMA +NQMLKQLTMEKETKT T A+ EPS  LQISDISCVYCGDN LYENCPAN  S+FYVG                       
Subjt:  SAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENCPANLASIFYVG-----------------------

Query:  -----LGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAM-----------MKEFMTRTDAAIRSLEMQVGQIANDQKS
              GVASSSAQ PAQQYKQNYTPP FPTQPASQPQQYNQQR Q+TTQQ GSN SLEAM            KEFMTRTD  IR LEMQVGQIAND+KS
Subjt:  -----LGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQQSGSNASLEAM-----------MKEFMTRTDAAIRSLEMQVGQIANDQKS

Query:  RPQGTLPEHTENPK
        RPQGTLP +TENPK
Subjt:  RPQGTLPEHTENPK

A0A6J1E251 uncharacterized protein LOC111025302 (e-value: 4.9e-110; identity: 63.36%)
Query:  QNAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAM--------------------------------------------------------
        +NAQDPPPPQNPPVNGDMAGE AANR GEIPN ILLADNRDVAM                                                        
Subjt:  QNAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAM--------------------------------------------------------

Query:  ------------------------GIMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLP
                                 + DGARTW+NALEPNSINTW ELT+KFLAKYHTLT+NADLRE+IVSFRQKENEAVQEAWERFKELLRRC SHGLP
Subjt:  ------------------------GIMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFRQKENEAVQEAWERFKELLRRCSSHGLP

Query:  ACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLA
        +CVQIEQFYRGLDRSS+MMLNT ANGSLLE SVNEIVD+LNKM DINDQGE+GRSLPKKQVS GIFELDTVASMQAQMAAMNQMLKQLTMEKETKT T A
Subjt:  ACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQMAAMNQMLKQLTMEKETKTTTLA

Query:  IPEPSPILQISDISCVYCGD---------NHLYENCPANLASIFYVGLGVASSSAQVPAQQYK
        IPE SPILQISDISCVYCG          ++ Y     +  +  +   GVASSSAQ PAQQYK
Subjt:  IPEPSPILQISDISCVYCGD---------NHLYENCPANLASIFYVGLGVASSSAQVPAQQYK

SwissProt top hits (e-value, % identity, alignment)
No hits found
Arabidopsis top hits (e-value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGCCTTCGACCTGTAGAGCTAACATGAATTTTGTTATGGAGAACAAAAACGACCTGAAACGGGACAAGGAGTTGAAGCCTAAAATCATCGATTGGGGGAGACCG
ACCCGTGACGTAACGGACCGGCCCAAAGAACGACCTAGCGTCGGGGCAAAGAACGAGGAAAGCGACCAGCGGAGAACTAACCGGGCCGAGAGGGCTCGGTCCCGA
CCCCTGCTCGTGAGATCCTACCCTGAAGAGGCAGGTATCGGTTCCTTTATCATCGAGTTTGGTGTGTGCGAGAAGAAGAGCCGGAAGGTCAAAATCCACGCATCA
ACAATTGGCGTTGTCTGTGGGAATGATACAGTAAAGCGAGCTTCGTCAAAGCGAAATCCCAATCCAAGCGACATGAATCTTGATGAGAGCCATCGGTGCTTCGAA
AAGAAGAGCTTGAAAGAAGCGGCCGATCACTCGGGTGAGGCAAGACCGAGCCTTGGCCATCGACTTAATCCGACCACCCTTGATAGGGGAGAACCAGTGACGAGC
TTGAGTGAAGCGACTGACCTCTCGGGTGAGGCAAGGCCGAGCCCTGGTCATCGACTTAATTCGACCACCCTTGACAAGGGAGAACCGGCTAAGGAAGTAGAATCT
GTCCCTCTGACAGCCGAAGATCAACAGGTGAAAGATGAGTTTCAGGCTAGAGAACCGCGAATGGAGAAATACCTTTCACAGGTAAGAAACCAGCTCGAACAGTTC
TCAAAATATGAGATTCGACAAATTCCACGTGCTCAAAACGTCAACACCGATGCGTTAGCTCGACTAGCTGCAGTTTATGAAACCGACCTAGGCAGAACTGTACCG
GTCGAGATTCTACCTGAGCCAAGCATAGTAGCTCATGAAGTAATGGATATCGACGAGCAAAGGCAACAAGAAGAAAACTGGAAGAGCCATTTGATCAAATATTTG
AGAGACAAGATCCTGCCTACTGAAAAGATAGAGGCCCAAAATGCACAAGATCCTCCACCGCCACAAAATCCACCTGTGAATGGAGATATGGCAGGTGAAGGAGCA
GCAAACCGAGCAGGAGAAATTCCTAATCCGATTCTTCTAGCAGATAATCGAGATGTAGCCATGGGAATTATGGATGGTGCAAGGACTTGGCTAAACGCGTTAGAA
CCAAATTCTATCAACACATGGACAGAACTGACGGAGAAATTTTTGGCAAAGTACCATACTTTGACCAGGAACGCAGACCTTCGAGAGAACATTGTGTCTTTTAGA
CAGAAGGAGAACGAAGCAGTTCAAGAAGCTTGGGAGCGTTTTAAGGAATTACTTAGAAGGTGCTCGAGCCATGGATTGCCTGCATGTGTGCAGATTGAACAATTC
TATAGAGGATTGGATCGTTCATCACGGATGATGTTGAACACTGCAGCCAATGGCTCGTTGTTAGAGAATTCGGTAAATGAGATCGTTGATATCTTGAATAAGATG
ATAGACATTAATGACCAAGGTGAAATAGGAAGGTCATTACCAAAGAAGCAAGTATCAGCTGGAATCTTTGAGTTAGACACAGTAGCTTCAATGCAAGCCCAAATG
GCAGCTATGAACCAAATGTTAAAGCAGTTGACAATGGAGAAAGAAACCAAAACCACAACTTTGGCGATACCTGAACCCTCTCCTATTTTACAAATTTCAGATATA
TCTTGTGTATATTGTGGTGATAACCACTTGTATGAGAACTGTCCAGCTAATCTAGCGTCTATTTTCTATGTAGGTCTAGGAGTAGCTAGTAGCAGTGCACAAGTA
CCCGCTCAACAATACAAACAAAACTACACTCCTCCTGGTTTTCCAACTCAACCGGCGTCGCAGCCTCAACAATACAATCAGCAAAGAGGTCAAAGTACTACTCAG
CAAAGTGGTAGCAACGCAAGTTTGGAGGCCATGATGAAAGAGTTCATGACAAGAACTGATGCTGCGATAAGAAGCTTGGAGATGCAAGTGGGGCAGATTGCAAAT
GACCAGAAATCTAGACCCCAAGGTACATTGCCTGAACACACAGAAAACCCGAAGCGAGATCGTGACGGAGCACTGTAA
mRNA sequence
ATGCCTTCGACCTGTAGAGCTAACATGAATTTTGTTATGGAGAACAAAAACGACCTGAAACGGGACAAGGAGTTGAAGCCTAAAATCATCGATTGGGGGAGACCG
ACCCGTGACGTAACGGACCGGCCCAAAGAACGACCTAGCGTCGGGGCAAAGAACGAGGAAAGCGACCAGCGGAGAACTAACCGGGCCGAGAGGGCTCGGTCCCGA
CCCCTGCTCGTGAGATCCTACCCTGAAGAGGCAGGTATCGGTTCCTTTATCATCGAGTTTGGTGTGTGCGAGAAGAAGAGCCGGAAGGTCAAAATCCACGCATCA
ACAATTGGCGTTGTCTGTGGGAATGATACAGTAAAGCGAGCTTCGTCAAAGCGAAATCCCAATCCAAGCGACATGAATCTTGATGAGAGCCATCGGTGCTTCGAA
AAGAAGAGCTTGAAAGAAGCGGCCGATCACTCGGGTGAGGCAAGACCGAGCCTTGGCCATCGACTTAATCCGACCACCCTTGATAGGGGAGAACCAGTGACGAGC
TTGAGTGAAGCGACTGACCTCTCGGGTGAGGCAAGGCCGAGCCCTGGTCATCGACTTAATTCGACCACCCTTGACAAGGGAGAACCGGCTAAGGAAGTAGAATCT
GTCCCTCTGACAGCCGAAGATCAACAGGTGAAAGATGAGTTTCAGGCTAGAGAACCGCGAATGGAGAAATACCTTTCACAGGTAAGAAACCAGCTCGAACAGTTC
TCAAAATATGAGATTCGACAAATTCCACGTGCTCAAAACGTCAACACCGATGCGTTAGCTCGACTAGCTGCAGTTTATGAAACCGACCTAGGCAGAACTGTACCG
GTCGAGATTCTACCTGAGCCAAGCATAGTAGCTCATGAAGTAATGGATATCGACGAGCAAAGGCAACAAGAAGAAAACTGGAAGAGCCATTTGATCAAATATTTG
AGAGACAAGATCCTGCCTACTGAAAAGATAGAGGCCCAAAATGCACAAGATCCTCCACCGCCACAAAATCCACCTGTGAATGGAGATATGGCAGGTGAAGGAGCA
GCAAACCGAGCAGGAGAAATTCCTAATCCGATTCTTCTAGCAGATAATCGAGATGTAGCCATGGGAATTATGGATGGTGCAAGGACTTGGCTAAACGCGTTAGAA
CCAAATTCTATCAACACATGGACAGAACTGACGGAGAAATTTTTGGCAAAGTACCATACTTTGACCAGGAACGCAGACCTTCGAGAGAACATTGTGTCTTTTAGA
CAGAAGGAGAACGAAGCAGTTCAAGAAGCTTGGGAGCGTTTTAAGGAATTACTTAGAAGGTGCTCGAGCCATGGATTGCCTGCATGTGTGCAGATTGAACAATTC
TATAGAGGATTGGATCGTTCATCACGGATGATGTTGAACACTGCAGCCAATGGCTCGTTGTTAGAGAATTCGGTAAATGAGATCGTTGATATCTTGAATAAGATG
ATAGACATTAATGACCAAGGTGAAATAGGAAGGTCATTACCAAAGAAGCAAGTATCAGCTGGAATCTTTGAGTTAGACACAGTAGCTTCAATGCAAGCCCAAATG
GCAGCTATGAACCAAATGTTAAAGCAGTTGACAATGGAGAAAGAAACCAAAACCACAACTTTGGCGATACCTGAACCCTCTCCTATTTTACAAATTTCAGATATA
TCTTGTGTATATTGTGGTGATAACCACTTGTATGAGAACTGTCCAGCTAATCTAGCGTCTATTTTCTATGTAGGTCTAGGAGTAGCTAGTAGCAGTGCACAAGTA
CCCGCTCAACAATACAAACAAAACTACACTCCTCCTGGTTTTCCAACTCAACCGGCGTCGCAGCCTCAACAATACAATCAGCAAAGAGGTCAAAGTACTACTCAG
CAAAGTGGTAGCAACGCAAGTTTGGAGGCCATGATGAAAGAGTTCATGACAAGAACTGATGCTGCGATAAGAAGCTTGGAGATGCAAGTGGGGCAGATTGCAAAT
GACCAGAAATCTAGACCCCAAGGTACATTGCCTGAACACACAGAAAACCCGAAGCGAGATCGTGACGGAGCACTGTAA
Protein sequence
MPSTCRANMNFVMENKNDLKRDKELKPKIIDWGRPTRDVTDRPKERPSVGAKNEESDQRRTNRAERARSRPLLVRSYPEEAGIGSFIIEFGVCEKKSRKVKIHAS
TIGVVCGNDTVKRASSKRNPNPSDMNLDESHRCFEKKSLKEAADHSGEARPSLGHRLNPTTLDRGEPVTSLSEATDLSGEARPSPGHRLNSTTLDKGEPAKEVES
VPLTAEDQQVKDEFQAREPRMEKYLSQVRNQLEQFSKYEIRQIPRAQNVNTDALARLAAVYETDLGRTVPVEILPEPSIVAHEVMDIDEQRQQEENWKSHLIKYL
RDKILPTEKIEAQNAQDPPPPQNPPVNGDMAGEGAANRAGEIPNPILLADNRDVAMGIMDGARTWLNALEPNSINTWTELTEKFLAKYHTLTRNADLRENIVSFR
QKENEAVQEAWERFKELLRRCSSHGLPACVQIEQFYRGLDRSSRMMLNTAANGSLLENSVNEIVDILNKMIDINDQGEIGRSLPKKQVSAGIFELDTVASMQAQM
AAMNQMLKQLTMEKETKTTTLAIPEPSPILQISDISCVYCGDNHLYENCPANLASIFYVGLGVASSSAQVPAQQYKQNYTPPGFPTQPASQPQQYNQQRGQSTTQ
QSGSNASLEAMMKEFMTRTDAAIRSLEMQVGQIANDQKSRPQGTLPEHTENPKRDRDGAL
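
As a quick consistency check between the CDS and protein sequences listed above, the following minimal sketch translates short excerpts of the CDS with the standard genetic code. The helper `translate` and the variable names are hypothetical; the two excerpts are the first 30 and the last 18 nucleotides of the CDS as given in this record, and the use of the standard code (rather than an organelle-specific table) is an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate an in-frame nucleotide string; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Excerpts copied from the CDS above: first 30 nt and last 18 nt.
cds_start = "ATGCCTTCGACCTGTAGAGCTAACATGAAT"
cds_end = "CGTGACGGAGCACTGTAA"

start_peptide = translate(cds_start)
end_peptide = translate(cds_end)
print(start_peptide, end_peptide)
```

The first excerpt translates to MPSTCRANMN and the second to RDGAL followed by a stop, matching the beginning and end of the protein sequence above. Passing the full CDS (with line breaks, which `translate` strips) would allow the whole protein to be compared the same way.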