CuGenDBv2

Moc03g21560 (gene) of Bitter gourd (OHB3-1) v2 genome

Gene ID: Moc03g21560
Organism: Momordica charantia cv. OHB3-1 (Bitter gourd (OHB3-1) v2)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: chr3:14717292..14723471
RNA-Seq Expression: Moc03g21560
Synteny: Moc03g21560
Gene Ontology terms: GO:0016021 - integral component of membrane (cellular component)
GO:0043167 - ion binding (molecular function)
InterPro domains: NA
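
The genome location above uses a `chromosome:start..end` notation. A minimal sketch of how such a string could be split into its parts is shown below. The names `Location` and `parse_location` are hypothetical helpers, not part of any CuGenDB tool, and the span calculation assumes 1-based, inclusive coordinates, which the record itself does not state.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A genomic interval as written in the record (hypothetical helper type)."""
    chrom: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # ASSUMPTION: coordinates are 1-based and inclusive on both ends.
        return self.end - self.start + 1


# Matches strings of the form "chr3:14717292..14723471".
_LOCATION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_location(text: str) -> Location:
    """Parse a 'chrom:start..end' string into a Location."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Location(match["chrom"], start, end)


loc = parse_location("chr3:14717292..14723471")
print(loc.chrom, loc.start, loc.end, loc.span)
```

Under the inclusive-coordinate assumption, the locus spans 6180 positions on chr3.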


Homology
GenBank top hits (e value, % identity, alignment)
XP_022144034.1 uncharacterized protein LOC111013826 [Momordica charantia] (e value: 8.4e-110; % identity: 88.99)
Query:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR
        MFEY LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS

Query:  NVKRKYRLRCCGRASLGRSSPSDRAGV
         VKRK + R     +   S P   A V
Subjt:  NVKRKYRLRCCGRASLGRSSPSDRAGV

XP_022150343.1 uncharacterized protein LOC111018538 [Momordica charantia] (e value: 6.0e-124; % identity: 87.02)
Query:  GTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKD
        G   + A+ R+EPSSSGVRDQVSRISAASLDRCLRRASKFVS PGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALE ASS MKD
Subjt:  GTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKD

Query:  ELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFD
        ELLKAHSEVETLKAEVE++AELLKKEEDRR+AQLRAAHAIT+GLE+EKFQLLKEKDDMLQALEAK++EL+HATAELET KERLSNG LLEE+FRQHPDFD
Subjt:  ELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFD

Query:  GFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGDLDSDYSDLEEDQVDTTQEGVSQAGS
        GFAKDFSDAGFKFLMKGIASDMPD+QIDL GLK+RYAE+WASGP GT G QALVD+YV DLDSDYSD EEDQV +TQEG S  GS
Subjt:  GFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGDLDSDYSDLEEDQVDTTQEGVSQAGS

XP_022158122.1 uncharacterized protein LOC111024680 [Momordica charantia] (e value: 8.1e-105; % identity: 97.92)
Query:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR
        MFEY LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSEL
        KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSEL

XP_022159063.1 uncharacterized protein LOC111025502, partial [Momordica charantia] (e value: 2.0e-159; % identity: 89.13)
Query:  MSSSFSSNLGSDEDLARRLESELEEIENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPDNILLRIPEEGERAGNPPEEWVTLYFKMFEYS
        MSSS SSNL  + DLARRLES+LEEIEN R SDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIP+NILLR+PEEGERA NPPE WVTLYFKMFEY 
Subjt:  MSSSFSSNLGSDEDLARRLESELEEIENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPDNILLRIPEEGERAGNPPEEWVTLYFKMFEYS

Query:  LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVRKWFYA
        LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAEL DVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVRKWFYA
Subjt:  LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVRKWFYA

Query:  SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK
        SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS VKRK
Subjt:  SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK

Query:  YRLRCCGRASLGRSSPSDRAGV
         + R     +   S P+  A V
Subjt:  YRLRCCGRASLGRSSPSDRAGV

XP_022159252.1 uncharacterized protein LOC111025665 [Momordica charantia] (e value: 6.8e-160; % identity: 62.48)
Query:  ICARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNP
        +CARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+FFDVPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTD+LLLESGLLDYNP
Subjt:  ICARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKYRLRCCGRASL----------------GRSSPS----------DRAGVFWG-----------------SFEGE
         VR IE+SRPNSELAMVCGF  +VKRK + R     ++                G S PS          D +G   G                    GE
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKYRLRCCGRASL----------------GRSSPS----------DRAGVFWG-----------------SFEGE

Query:  AP-----------QGSDRGG-GRLALGRGDRVDDPKARMSGTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFV
        +P             S+ G  G L     D VDDP+ARM GTS+V  RF +EPSSSGV+DQVSRISA  LDR LRRASKFVSDPGSVLQRTID  AEAF+
Subjt:  AP-----------QGSDRGG-GRLALGRGDRVDDPKARMSGTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFV

Query:  ASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKDELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQ
        ASI  A+ VKAELDGRE LAA+E+E   AALEAA++ +K ELLKA  EV+ L+AEV+ K +LLKKE ++ KA LRAAHAITKGLEKEKFQLLKEKDD+ Q
Subjt:  ASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKDELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQ

Query:  ALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFDGFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGD
         LE K+  +   T EL+ +KERL+NG LLEESFRQHPDFDGFAKDFSDAGFKFLMKGIA+DMP +QIDL GLKK+Y+E+WASGP+GT   Q+LVDKYV +
Subjt:  ALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFDGFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGD

Query:  LDSDYSDLEED--------QVDTTQEGV-SQAG
        LDSDYSD+EE+        +V TTQE V SQ G
Subjt:  LDSDYSDLEED--------QVDTTQEGV-SQAG

TrEMBL top hits (e value, % identity, alignment)
A0A6J1CR42 uncharacterized protein LOC111013826 (e value: 4.1e-110; % identity: 88.99)
Query:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR
        MFEY LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS
        KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIE SRPNS LAMVC FAS
Subjt:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS

Query:  NVKRKYRLRCCGRASLGRSSPSDRAGV
         VKRK + R     +   S P   A V
Subjt:  NVKRKYRLRCCGRASLGRSSPSDRAGV

A0A6J1D971 uncharacterized protein LOC111018538 (e value: 2.9e-124; % identity: 87.02)
Query:  GTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKD
        G   + A+ R+EPSSSGVRDQVSRISAASLDRCLRRASKFVS PGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALE ASS MKD
Subjt:  GTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFVASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKD

Query:  ELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFD
        ELLKAHSEVETLKAEVE++AELLKKEEDRR+AQLRAAHAIT+GLE+EKFQLLKEKDDMLQALEAK++EL+HATAELET KERLSNG LLEE+FRQHPDFD
Subjt:  ELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFD

Query:  GFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGDLDSDYSDLEEDQVDTTQEGVSQAGS
        GFAKDFSDAGFKFLMKGIASDMPD+QIDL GLK+RYAE+WASGP GT G QALVD+YV DLDSDYSD EEDQV +TQEG S  GS
Subjt:  GFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGDLDSDYSDLEEDQVDTTQEGVSQAGS

A0A6J1DWD2 uncharacterized protein LOC111024680 (e value: 3.9e-105; % identity: 97.92)
Query:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR
        MFEY LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVR
Subjt:  MFEYSLRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVR

Query:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSEL
        KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSEL
Subjt:  KWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSEL

A0A6J1DXS5 uncharacterized protein LOC111025502 (e value: 9.5e-160; % identity: 89.13)
Query:  MSSSFSSNLGSDEDLARRLESELEEIENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPDNILLRIPEEGERAGNPPEEWVTLYFKMFEYS
        MSSS SSNL  + DLARRLES+LEEIEN R SDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIP+NILLR+PEEGERA NPPE WVTLYFKMFEY 
Subjt:  MSSSFSSNLGSDEDLARRLESELEEIENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPDNILLRIPEEGERAGNPPEEWVTLYFKMFEYS

Query:  LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVRKWFYA
        LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAEL DVDQLLACFEAKRIAKKPGRFY+CARKGAGGIVKGPTSIKGWVRKWFYA
Subjt:  LRLPLHPFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVRKWFYA

Query:  SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK
        SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKE FPRGRKVGTLVTD+LLLESGLLDYNPAVRPIESSRPNSELAMVCGFAS VKRK
Subjt:  SGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRK

Query:  YRLRCCGRASLGRSSPSDRAGV
         + R     +   S P+  A V
Subjt:  YRLRCCGRASLGRSSPSDRAGV

A0A6J1DZB3 uncharacterized protein LOC111025665 (e value: 3.3e-160; % identity: 62.48)
Query:  ICARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNP
        +CARKG GGIVKGPTSIKGWV KWF+ASGEWLAKDESGR+FFDVPTRFGNLVSI+ +PEL QA+FDTLK+YK+ FPR RK+ TLVTD+LLLESGLLDYNP
Subjt:  ICARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFDVPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNP

Query:  AVRPIESSRPNSELAMVCGFASNVKRKYRLRCCGRASL----------------GRSSPS----------DRAGVFWG-----------------SFEGE
         VR IE+SRPNSELAMVCGF  +VKRK + R     ++                G S PS          D +G   G                    GE
Subjt:  AVRPIESSRPNSELAMVCGFASNVKRKYRLRCCGRASL----------------GRSSPS----------DRAGVFWG-----------------SFEGE

Query:  AP-----------QGSDRGG-GRLALGRGDRVDDPKARMSGTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFV
        +P             S+ G  G L     D VDDP+ARM GTS+V  RF +EPSSSGV+DQVSRISA  LDR LRRASKFVSDPGSVLQRTID  AEAF+
Subjt:  AP-----------QGSDRGG-GRLALGRGDRVDDPKARMSGTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFV

Query:  ASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKDELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQ
        ASI  A+ VKAELDGRE LAA+E+E   AALEAA++ +K ELLKA  EV+ L+AEV+ K +LLKKE ++ KA LRAAHAITKGLEKEKFQLLKEKDD+ Q
Subjt:  ASIQSALAVKAELDGREVLAAREKEEFSAALEAASSAMKDELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQ

Query:  ALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFDGFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGD
         LE K+  +   T EL+ +KERL+NG LLEESFRQHPDFDGFAKDFSDAGFKFLMKGIA+DMP +QIDL GLKK+Y+E+WASGP+GT   Q+LVDKYV +
Subjt:  ALEAKEEELKHATAELETVKERLSNGALLEESFRQHPDFDGFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGD

Query:  LDSDYSDLEED--------QVDTTQEGV-SQAG
        LDSDYSD+EE+        +V TTQE V SQ G
Subjt:  LDSDYSDLEED--------QVDTTQEGV-SQAG

SwissProt top hits (e value, % identity, alignment)
P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e value: 1.8e-06; % identity: 36.78)
Query:  LSIGVYSLVPKETTAKELLQALQDRYEKPSANTKILLWTKYFNIHMEEGTSVNSHINELTDILNKLEGMSVKIEEEMKAMRLLTSLP
        LS  V + +  E TA+ +   L+  Y   +   K+ L  + + +HM EGT+  SH+N    ++ +L  + VKIEEE KA+ LL SLP
Subjt:  LSIGVYSLVPKETTAKELLQALQDRYEKPSANTKILLWTKYFNIHMEEGTSVNSHINELTDILNKLEGMSVKIEEEMKAMRLLTSLP

Arabidopsis top hits (e value, % identity, alignment)
No hits found

Sequences
CDS sequence
ATGACACTATCAATAGGGGTATACAGTCTGGTGCCGAAAGAGACTACAGCGAAAGAATTGTTGCAGGCCTTGCAAGATAGGTATGAAAAACCTTCTGCCAATACAAAAAT
ACTTCTGTGGACGAAGTATTTTAATATCCACATGGAGGAGGGAACCTCGGTGAATTCACACATTAATGAGCTCACCGATATCTTGAACAAATTAGAAGGGATGAGTGTCA
AGATTGAGGAAGAGATGAAGGCTATGAGGCTGTTGACATCTTTGCCTGACAAAAAAGCTTCAGTGTATGTGTTGAGGTTTGGTGTTGCCAGAGGATTAGAGAGACGGATT
ATGCACAAGGCTGCAGATAGTTTAGGGGGAGACTTTAAAAAACTAGCAGCATTGACAGCCGAGACAGATCAAGAGAATATGCCATCAATTCAAGTACAACAGCTAGGAAG
TAGAGGAAAGGGAAAGGGGAACAGCTCAGTGAGACAAGACCACGTAGGATTGCTCAGTCTCAGGGCAATCACAGATGAAGTCTTTGTTGGTGCCAAGAAGATGTTAGAAG
CTGTTGGTGTAGTGGGAGTCGAGTCTTGTGCTCAGAAACTAGTGACTATTGTTGTGGCATCTAAAGAGAACTTTATTGCAGCTCGAACTCGGCCTCCGGACCGATCTGAA
TACTTGGGCGGACCTGCACAGAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGGTCGGATTCCCAGTTTAGTTCGAGGGTATTCTCTTCCCCAAACATTGGCCCCCTC
TCTGTCTGGTCCGATCTCGACCTGGCTGAGAAGTTCATTCGACTTGCTTTGGACGCGTGGCGACTTCCTATTCGTGGGAAAATATAACCGTTGTGGTAGATTTATCGTCG
GAATATTCAAATATTCCGACGCTTCGGATCTTAGGGAGGATCCTAACCGCTCGTTGATTACACGTCTCGAACCCTTGGTAGGTCGGTCTCTTCCCTCACTTTCTCTTTCG
AACGTAGTTGCCATGTCGTCCTCTTTTAGCAGCAACTTAGGATCCGATGAGGATTTAGCTCGTAGGTTAGAGTCCGAGCTCGAGGAGATAGAAAACTTTAGGTTCTCCGA
TGACGGGGAGGATAGTGATGCCTCCACCTCGGGTCAGGGTTTGGAATACCCTTCTAGGATACCTGAGCACTACCTCGGATCCCTTCGTAGGGGGTTCGCTATCCCTGACA
ACATCCTCCTTAGGATTCCGGAGGAGGGGGAGAGAGCTGGCAATCCTCCAGAGGAATGGGTCACCCTCTACTTCAAAATGTTTGAGTACAGCCTCAGACTTCCCCTTCAC
CCTTTCGTCCAAGAGTTTCTTTTCCGAACTGGGCTGGCTCCGGCTCAAGTGGCCCCCAATGGGTGGGGTGTCATTTTCGCTTTGGCCATCCTTTTTTGGCTACGAGCTCG
GGATAGTGAAGAGGCCGAGCTGTTAGACGTAGACCAGCTTCTCGCGTGCTTCGAAGCGAAAAGGATAGCTAAGAAGCCTGGTCGGTTCTATATATGCGCAAGGAAAGGCG
CAGGCGGTATAGTTAAGGGCCCGACCTCTATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCTGGGGAATGGCTTGCAAAGGACGAGTCAGGTCGTTCCTTCTTTGAC
GTTCCCACTAGGTTTGGAAACCTAGTTTCAATCCGACCAGTCCCCGAGCTTACGCAAGCCTCCTTCGACACGCTGAAATATTACAAGGAGTGTTTTCCGAGGGGTAGGAA
GGTCGGAACCTTGGTGACTGACCAGCTGCTGCTCGAGTCCGGGCTGCTAGATTACAACCCCGCAGTTCGTCCCATTGAATCCTCAAGACCGAACTCCGAACTTGCCATGG
TTTGTGGATTTGCGAGCAACGTGAAGCGCAAGTATCGACTACGCTGCTGTGGTAGGGCCAGCCTCGGAAGATCCAGCCCCAGTGATCGAGCTGGAGTCTTCTGGGGGTCC
TTCGAGGGAGAAGCGCCCCAGGGATCAGACCGAGGCGGTGGACGTCTCGCCCTTGGGCGAGGGGATCGGGTGGACGATCCTAAGGCCAGGATGAGCGGGACGTCCGATGT
GACGGCACGGTTCAGAGTTGAGCCGTCAAGCTCTGGGGTGAGGGACCAGGTGTCCCGCATCTCGGCCGCAAGTTTGGACCGCTGCCTAAGGAGGGCGTCCAAATTTGTGA
GCGACCCAGGGTCCGTTCTGCAGAGGACCATCGACTACGCTGCCGAGGCGTTCGTTGCTTCCATTCAATCGGCTCTGGCCGTGAAGGCCGAGCTGGATGGGAGGGAAGTT
CTGGCAGCGAGGGAGAAAGAGGAGTTCTCTGCTGCCTTGGAGGCTGCTTCCTCCGCCATGAAGGATGAGCTACTGAAGGCTCACTCTGAGGTGGAAACTTTGAAGGCCGA
GGTGGAGACCAAGGCCGAGCTGCTGAAGAAGGAAGAAGACAGACGCAAGGCCCAGCTCCGAGCTGCCCATGCTATTACCAAGGGCTTGGAGAAGGAGAAGTTCCAACTCC
TCAAGGAGAAGGACGACATGCTCCAGGCGCTTGAAGCGAAGGAGGAGGAGCTGAAGCATGCGACTGCCGAGCTGGAGACGGTGAAGGAGCGTCTCAGCAATGGAGCCCTA
TTGGAGGAATCATTTAGGCAACATCCTGACTTCGATGGATTTGCCAAAGACTTCTCTGACGCGGGCTTCAAGTTTCTCATGAAGGGCATTGCTTCCGACATGCCTGACGT
TCAGATCGATCTCGGTGGTCTGAAGAAGAGGTATGCTGAGCAGTGGGCGTCTGGGCCTAGCGGCACCCGTGGCTCCCAAGCGTTGGTGGATAAGTACGTCGGAGATCTGG
ACTCTGACTACTCCGACCTCGAAGAGGATCAGGTCGACACCACTCAAGAGGGCGTTTCTCAAGCAGGTTCTTAG
mRNA sequence
ATGACACTATCAATAGGGGTATACAGTCTGGTGCCGAAAGAGACTACAGCGAAAGAATTGTTGCAGGCCTTGCAAGATAGGTATGAAAAACCTTCTGCCAATACAAAAAT
ACTTCTGTGGACGAAGTATTTTAATATCCACATGGAGGAGGGAACCTCGGTGAATTCACACATTAATGAGCTCACCGATATCTTGAACAAATTAGAAGGGATGAGTGTCA
AGATTGAGGAAGAGATGAAGGCTATGAGGCTGTTGACATCTTTGCCTGACAAAAAAGCTTCAGTGTATGTGTTGAGGTTTGGTGTTGCCAGAGGATTAGAGAGACGGATT
ATGCACAAGGCTGCAGATAGTTTAGGGGGAGACTTTAAAAAACTAGCAGCATTGACAGCCGAGACAGATCAAGAGAATATGCCATCAATTCAAGTACAACAGCTAGGAAG
TAGAGGAAAGGGAAAGGGGAACAGCTCAGTGAGACAAGACCACGTAGGATTGCTCAGTCTCAGGGCAATCACAGATGAAGTCTTTGTTGGTGCCAAGAAGATGTTAGAAG
CTGTTGGTGTAGTGGGAGTCGAGTCTTGTGCTCAGAAACTAGTGACTATTGTTGTGGCATCTAAAGAGAACTTTATTGCAGCTCGAACTCGGCCTCCGGACCGATCTGAA
TACTTGGGCGGACCTGCACAGAAAGGTGAGCACTCCGACGATCAAGTCAGTATAGGTCGGATTCCCAGTTTAGTTCGAGGGTATTCTCTTCCCCAAACATTGGCCCCCTC
TCTGTCTGGTCCGATCTCGACCTGGCTGAGAAGTTCATTCGACTTGCTTTGGACGCGTGGCGACTTCCTATTCGTGGGAAAATATAACCGTTGTGGTAGATTTATCGTCG
GAATATTCAAATATTCCGACGCTTCGGATCTTAGGGAGGATCCTAACCGCTCGTTGATTACACGTCTCGAACCCTTGGTAGGTCGGTCTCTTCCCTCACTTTCTCTTTCG
AACGTAGTTGCCATGTCGTCCTCTTTTAGCAGCAACTTAGGATCCGATGAGGATTTAGCTCGTAGGTTAGAGTCCGAGCTCGAGGAGATAGAAAACTTTAGGTTCTCCGA
TGACGGGGAGGATAGTGATGCCTCCACCTCGGGTCAGGGTTTGGAATACCCTTCTAGGATACCTGAGCACTACCTCGGATCCCTTCGTAGGGGGTTCGCTATCCCTGACA
ACATCCTCCTTAGGATTCCGGAGGAGGGGGAGAGAGCTGGCAATCCTCCAGAGGAATGGGTCACCCTCTACTTCAAAATGTTTGAGTACAGCCTCAGACTTCCCCTTCAC
CCTTTCGTCCAAGAGTTTCTTTTCCGAACTGGGCTGGCTCCGGCTCAAGTGGCCCCCAATGGGTGGGGTGTCATTTTCGCTTTGGCCATCCTTTTTTGGCTACGAGCTCG
GGATAGTGAAGAGGCCGAGCTGTTAGACGTAGACCAGCTTCTCGCGTGCTTCGAAGCGAAAAGGATAGCTAAGAAGCCTGGTCGGTTCTATATATGCGCAAGGAAAGGCG
CAGGCGGTATAGTTAAGGGCCCGACCTCTATCAAGGGATGGGTGAGGAAGTGGTTCTACGCTTCTGGGGAATGGCTTGCAAAGGACGAGTCAGGTCGTTCCTTCTTTGAC
GTTCCCACTAGGTTTGGAAACCTAGTTTCAATCCGACCAGTCCCCGAGCTTACGCAAGCCTCCTTCGACACGCTGAAATATTACAAGGAGTGTTTTCCGAGGGGTAGGAA
GGTCGGAACCTTGGTGACTGACCAGCTGCTGCTCGAGTCCGGGCTGCTAGATTACAACCCCGCAGTTCGTCCCATTGAATCCTCAAGACCGAACTCCGAACTTGCCATGG
TTTGTGGATTTGCGAGCAACGTGAAGCGCAAGTATCGACTACGCTGCTGTGGTAGGGCCAGCCTCGGAAGATCCAGCCCCAGTGATCGAGCTGGAGTCTTCTGGGGGTCC
TTCGAGGGAGAAGCGCCCCAGGGATCAGACCGAGGCGGTGGACGTCTCGCCCTTGGGCGAGGGGATCGGGTGGACGATCCTAAGGCCAGGATGAGCGGGACGTCCGATGT
GACGGCACGGTTCAGAGTTGAGCCGTCAAGCTCTGGGGTGAGGGACCAGGTGTCCCGCATCTCGGCCGCAAGTTTGGACCGCTGCCTAAGGAGGGCGTCCAAATTTGTGA
GCGACCCAGGGTCCGTTCTGCAGAGGACCATCGACTACGCTGCCGAGGCGTTCGTTGCTTCCATTCAATCGGCTCTGGCCGTGAAGGCCGAGCTGGATGGGAGGGAAGTT
CTGGCAGCGAGGGAGAAAGAGGAGTTCTCTGCTGCCTTGGAGGCTGCTTCCTCCGCCATGAAGGATGAGCTACTGAAGGCTCACTCTGAGGTGGAAACTTTGAAGGCCGA
GGTGGAGACCAAGGCCGAGCTGCTGAAGAAGGAAGAAGACAGACGCAAGGCCCAGCTCCGAGCTGCCCATGCTATTACCAAGGGCTTGGAGAAGGAGAAGTTCCAACTCC
TCAAGGAGAAGGACGACATGCTCCAGGCGCTTGAAGCGAAGGAGGAGGAGCTGAAGCATGCGACTGCCGAGCTGGAGACGGTGAAGGAGCGTCTCAGCAATGGAGCCCTA
TTGGAGGAATCATTTAGGCAACATCCTGACTTCGATGGATTTGCCAAAGACTTCTCTGACGCGGGCTTCAAGTTTCTCATGAAGGGCATTGCTTCCGACATGCCTGACGT
TCAGATCGATCTCGGTGGTCTGAAGAAGAGGTATGCTGAGCAGTGGGCGTCTGGGCCTAGCGGCACCCGTGGCTCCCAAGCGTTGGTGGATAAGTACGTCGGAGATCTGG
ACTCTGACTACTCCGACCTCGAAGAGGATCAGGTCGACACCACTCAAGAGGGCGTTTCTCAAGCAGGTTCTTAG
Protein sequence
MTLSIGVYSLVPKETTAKELLQALQDRYEKPSANTKILLWTKYFNIHMEEGTSVNSHINELTDILNKLEGMSVKIEEEMKAMRLLTSLPDKKASVYVLRFGVARGLERRI
MHKAADSLGGDFKKLAALTAETDQENMPSIQVQQLGSRGKGKGNSSVRQDHVGLLSLRAITDEVFVGAKKMLEAVGVVGVESCAQKLVTIVVASKENFIAARTRPPDRSE
YLGGPAQKGEHSDDQVSIGRIPSLVRGYSLPQTLAPSLSGPISTWLRSSFDLLWTRGDFLFVGKYNRCGRFIVGIFKYSDASDLREDPNRSLITRLEPLVGRSLPSLSLS
NVVAMSSSFSSNLGSDEDLARRLESELEEIENFRFSDDGEDSDASTSGQGLEYPSRIPEHYLGSLRRGFAIPDNILLRIPEEGERAGNPPEEWVTLYFKMFEYSLRLPLH
PFVQEFLFRTGLAPAQVAPNGWGVIFALAILFWLRARDSEEAELLDVDQLLACFEAKRIAKKPGRFYICARKGAGGIVKGPTSIKGWVRKWFYASGEWLAKDESGRSFFD
VPTRFGNLVSIRPVPELTQASFDTLKYYKECFPRGRKVGTLVTDQLLLESGLLDYNPAVRPIESSRPNSELAMVCGFASNVKRKYRLRCCGRASLGRSSPSDRAGVFWGS
FEGEAPQGSDRGGGRLALGRGDRVDDPKARMSGTSDVTARFRVEPSSSGVRDQVSRISAASLDRCLRRASKFVSDPGSVLQRTIDYAAEAFVASIQSALAVKAELDGREV
LAAREKEEFSAALEAASSAMKDELLKAHSEVETLKAEVETKAELLKKEEDRRKAQLRAAHAITKGLEKEKFQLLKEKDDMLQALEAKEEELKHATAELETVKERLSNGAL
LEESFRQHPDFDGFAKDFSDAGFKFLMKGIASDMPDVQIDLGGLKKRYAEQWASGPSGTRGSQALVDKYVGDLDSDYSDLEEDQVDTTQEGVSQAGS
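
The protein sequence above should be the translation of the CDS sequence. The sketch below shows one way to spot-check that relationship on short fragments copied from the record: the first 30 nucleotides and the last 15 nucleotides of the CDS. The `translate` function and the table-building code are illustrative helpers, not a CuGenDB API, and the sketch assumes the standard genetic code applies to this nuclear gene.

```python
# Standard genetic code laid out in the conventional T, C, A, G base order.
# ASSUMPTION: the standard code is the right one for this gene.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid ('*' marks a stop).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA string; whitespace and line breaks are ignored."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Fragments copied from the CDS sequence in this record.
cds_start = "ATGACACTATCAATAGGGGTATACAGTCTG"  # first 30 nucleotides
cds_end = "CAAGCAGGTTCTTAG"                  # last 15 nucleotides

print(translate(cds_start))
print(translate(cds_end))
```

The first fragment translates to `MTLSIGVYSL`, matching the start of the protein sequence, and the second to `QAGS*`, matching its last four residues followed by the stop codon. The same function could be applied to the full CDS after joining its lines.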