CuGenDBv2

Lag0039041 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0039041
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Reverse transcriptase
Genome location: chr2:34388617..34400851
RNA-Seq Expression: Lag0039041
Synteny: Lag0039041
Gene Ontology terms:
GO:0015074 - DNA integration (biological process)
GO:0003676 - nucleic acid binding (molecular function)
GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
IPR001584 - Integrase, catalytic core
IPR005162 - Retrotransposon gag domain


Homology
GenBank top hits (e value, % identity, alignment)
XP_022151688.1 uncharacterized protein LOC111019603 [Momordica charantia] (e value: 7.3e-63; % identity: 49.6)
Query:  APMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQ
        A + +   ALQ + DN     A     P        E+QFIRDF+RYGPP+F+G+SE     E WI +LEAL   + C+D LK++GAVFML+ +A  WW 
Subjt:  APMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQ

Query:  SVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEA
         VA  EDH N+PI+W   KDLLYD YFP+T+KD+KE EFLHL Q ++ V QYE+KFT  SRFA DL+ T  RKIKRF++GL + I+G + L RP T+AEA
Subjt:  SVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEA

Query:  LTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRR
        + GAL+MDK+V +K QP  + G +SG KRK+ P+ +   +P++  P++
Subjt:  LTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRR

XP_022155000.1 uncharacterized protein LOC111022144 [Momordica charantia] (e value: 2.8e-62; % identity: 59.35)
Query:  ESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKE
        E+ FI+DFKRYGPP+FDG+SE   AAE WI +LEA    + C D  K++GAVFML+ +A  WW S+AAAEDHAN  I W RFKDLLYD Y+ ETVKD KE
Subjt:  ESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKE

Query:  AEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSP-LR
        AEFLHL QG++SV QYERKFT LSRFA +L+     KIKRF+KGL + IRG V L RPA++AEA+ GALIMDK+VS K     E GS+SG KRK  P   
Subjt:  AEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSP-LR

Query:  NPPIEPTQQQPRRQ
        +P +   Q Q + +
Subjt:  NPPIEPTQQQPRRQ

XP_022155925.1 uncharacterized protein LOC111022925 [Momordica charantia] (e value: 9.2e-58; % identity: 49.61)
Query:  PPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDD
        PP  P   +L+  EALQ + DN         + P+    + EE QFIRDFKR+GPP F+G SE P AAE W+ +LEAL   + C+D  K+RGAVFML+ +
Subjt:  PPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDD

Query:  ARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRP
        A  WW+SVAAAEDHAN P++W RFKDLLY+ YFP TV+++K AEFL L Q S+ V QYERKFT LSRF    + T + KI +FI GLR EI+G + L  P
Subjt:  ARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRP

Query:  ATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPL-RNPPIEPTQQQPRRQVS
         T+A A+  AL+MDK + ++PQ     GS+SG KRK +    + P    Q   +RQ +
Subjt:  ATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPL-RNPPIEPTQQQPRRQVS

XP_022156326.1 uncharacterized protein LOC111023247 [Momordica charantia] (e value: 2.3e-64; % identity: 58.59)
Query:  PPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLY
        PP      P++  E++FI+DFKRYGPP+FDG+SE   A E WI +LEAL   + C D  K++GAVFML+ +A  WW SVAAAED+AN PI W RFK+LLY
Subjt:  PPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLY

Query:  DNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGS
        D Y+PETVKD KEAEFLHL QG++SV QYERKFT LSRFA +L+ T   KIKRF+KGLR+ IRG V L RP T+AEA+ GAL+MDK+VS K  P  E GS
Subjt:  DNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGS

Query:  TSGDKRKL-SPLRNPPIEPTQQQPRRQ
        +SG KRK  S   +  +   Q+Q + Q
Subjt:  TSGDKRKL-SPLRNPPIEPTQQQPRRQ

XP_022156546.1 uncharacterized protein LOC111023424 [Momordica charantia] (e value: 2.5e-63; % identity: 50.36)
Query:  PPGQRRVDPPPPP-------PPPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALL
        P G+   DPPPPP        PP PPAA   +       + +N A       + P  +     E+QFI+DFKRYGPP+F G SE    AE W+ +LEAL 
Subjt:  PPGQRRVDPPPPP-------PPPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALL

Query:  DLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERK
          + C D  K++GAVFML+ +A  WW SVAA EDHAN P+ W RFK+LLYD+Y+ ETV+D KE EFLHL QG+++V QYERKFT LS FA +L+ T   K
Subjt:  DLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERK

Query:  IKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRRQ
        IKRF+KGL + IRGSV L RP T+AEA+ G LIMDK+VS + QP +E GS+ G KRK+ P          Q+P +Q
Subjt:  IKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRRQ

TrEMBL top hits (e value, % identity, alignment)
A0A6J1DCW8 uncharacterized protein LOC111019603 (e value: 3.5e-63; % identity: 49.6)
Query:  APMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQ
        A + +   ALQ + DN     A     P        E+QFIRDF+RYGPP+F+G+SE     E WI +LEAL   + C+D LK++GAVFML+ +A  WW 
Subjt:  APMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQ

Query:  SVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEA
         VA  EDH N+PI+W   KDLLYD YFP+T+KD+KE EFLHL Q ++ V QYE+KFT  SRFA DL+ T  RKIKRF++GL + I+G + L RP T+AEA
Subjt:  SVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEA

Query:  LTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRR
        + GAL+MDK+V +K QP  + G +SG KRK+ P+ +   +P++  P++
Subjt:  LTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRR

A0A6J1DL73 uncharacterized protein LOC111022144 (e value: 1.3e-62; % identity: 59.35)
Query:  ESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKE
        E+ FI+DFKRYGPP+FDG+SE   AAE WI +LEA    + C D  K++GAVFML+ +A  WW S+AAAEDHAN  I W RFKDLLYD Y+ ETVKD KE
Subjt:  ESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKE

Query:  AEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSP-LR
        AEFLHL QG++SV QYERKFT LSRFA +L+     KIKRF+KGL + IRG V L RPA++AEA+ GALIMDK+VS K     E GS+SG KRK  P   
Subjt:  AEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSP-LR

Query:  NPPIEPTQQQPRRQ
        +P +   Q Q + +
Subjt:  NPPIEPTQQQPRRQ

A0A6J1DNV8 uncharacterized protein LOC111022925 (e value: 4.5e-58; % identity: 49.61)
Query:  PPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDD
        PP  P   +L+  EALQ + DN         + P+    + EE QFIRDFKR+GPP F+G SE P AAE W+ +LEAL   + C+D  K+RGAVFML+ +
Subjt:  PPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDD

Query:  ARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRP
        A  WW+SVAAAEDHAN P++W RFKDLLY+ YFP TV+++K AEFL L Q S+ V QYERKFT LSRF    + T + KI +FI GLR EI+G + L  P
Subjt:  ARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRP

Query:  ATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPL-RNPPIEPTQQQPRRQVS
         T+A A+  AL+MDK + ++PQ     GS+SG KRK +    + P    Q   +RQ +
Subjt:  ATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPL-RNPPIEPTQQQPRRQVS

A0A6J1DUM2 uncharacterized protein LOC111023247 (e value: 1.1e-64; % identity: 58.59)
Query:  PPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLY
        PP      P++  E++FI+DFKRYGPP+FDG+SE   A E WI +LEAL   + C D  K++GAVFML+ +A  WW SVAAAED+AN PI W RFK+LLY
Subjt:  PPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLY

Query:  DNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGS
        D Y+PETVKD KEAEFLHL QG++SV QYERKFT LSRFA +L+ T   KIKRF+KGLR+ IRG V L RP T+AEA+ GAL+MDK+VS K  P  E GS
Subjt:  DNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGS

Query:  TSGDKRKL-SPLRNPPIEPTQQQPRRQ
        +SG KRK  S   +  +   Q+Q + Q
Subjt:  TSGDKRKL-SPLRNPPIEPTQQQPRRQ

A0A6J1DVA0 uncharacterized protein LOC111023424 (e value: 1.2e-63; % identity: 50.36)
Query:  PPGQRRVDPPPPP-------PPPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALL
        P G+   DPPPPP        PP PPAA   +       + +N A       + P  +     E+QFI+DFKRYGPP+F G SE    AE W+ +LEAL 
Subjt:  PPGQRRVDPPPPP-------PPPAPPAAPMLITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALL

Query:  DLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERK
          + C D  K++GAVFML+ +A  WW SVAA EDHAN P+ W RFK+LLYD+Y+ ETV+D KE EFLHL QG+++V QYERKFT LS FA +L+ T   K
Subjt:  DLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPISWERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERK

Query:  IKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRRQ
        IKRF+KGL + IRGSV L RP T+AEA+ G LIMDK+VS + QP +E GS+ G KRK+ P          Q+P +Q
Subjt:  IKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGSTSGDKRKLSPLRNPPIEPTQQQPRRQ

SwissProt top hits: No hits found
Arabidopsis top hits: No hits found

Sequences
CDS sequence
ATGGGCTTATTTGTAGCTCAAGTCGTCATTCTTGATGATGACATACAAGATTCATTATTTTTTAGAAAGAAAATGAATGATGTTGACAAGGATCAATGGTTTAAAGTCAT
CGACCTGGAAATGAAGTCTTTGCATTTCAATTCCATCTGGGGTCTTGTAAATTTGCTTGATGAGACACCTCAAGAAGTTGAGGACATGAGACATATGCCCTATGCAGTAG
GGTTGTCAGTGAGTTTCAATCAATTCCAAGATTTGATCTCTGGAGTGTCGTTACATACATCCTCCAGTATCTTAGGAGAACGAGGATACACTGACACAGATGTTTTAACT
GATAAGGATTTGATGAAATCTACATCAGTGTCTGTCTTCACTCTTAATGGAAAAGCTATACTGATCAATCTCGTATCGTTGGAGCTTCTGATCTCCGCAACCCCTCCCCT
TCTGCGTTTTCTTCAGCCGCCAGCCGCAGATCCTCCTCTCTGCCGTGTGTTGCGCCGCCGCCAGCCGCTCTCCGCGTCTTCCCTCGCTCTCTCCGAGTCCTTCTCGCGCC
GCCTCCTCGCAAGCAGTCGCCTCGCGCGTCCTTCTCCTCTGCGCGTCTCTCCCGCGAATCTCCCTCTCTCTCGTTTCTGGTTCGTGTGGAGCCGCGGCCCAGCTTCGCTC
CGAGCTTGTCAGCCGTCACGTGAGTCCTCTTTCGCCGTTTTTGCAGAACTCTGGGTAAGTCTCATCCTCTGTGTATTCGTTAAGGCCGATCTAACTCCTCTTGGTTTGCG
GCAGCAGTCGGAGTCTCCTTTCCTCGCGTTTTCGTTGCTGTCCAGCAGCGTCATTGGGCGTTTCCGGCGTCGGTTAGCGTGTCGGGCCTCGGGTATAAAAGGTCGGGGAC
TGATATATCACTATTGGTGTCGATGCCTCGGGTATAAATGGTCGAGGGTCGATATGCCAATGTTAGATAAAGAGGAGCATCGAGGCCTTGGGTATAAATGGTCAAGGGTC
GGTGTGATGAGTCCTGAGGCAAGTATTGAGGCCTTGGGTATAAATGGTCAAGGGTCAGGTCGAATGCCGAGCTCTGTAGAGAAGTGTCGAGGCCCTGGGTATAAATGGTC
AGGGGTCGGTACAACTCGAAGGGTAGGGGCTGCTTACCAGTACCTTAGTGTACTGACCCCCTCCCCTCTCTCTCCCCCCAACTACCAGATTTTGCAGGTTATGAGGACTG
CGTGGACCGTGGCTGTTGGTTCTGTGGATATTCTGTTGCGTTGTAGAATGGCGGAGTCGTTTAATTTGTTAAATAGAAGTGTTTTATGCATAACAGTTGGTATCAGAGCG
AAACCTCTCCCAGTAGGATGTGGTTCGGGGACGAACCAAGGCGGAAGCTGGTGGGCATGTGACGCCCGAGGGAGGTGTAGGGATGCCGTGAGCTGCCTCGTTCGATCCTC
TGGCGTTACATGTATCAGCCAGGCAATGTCTCGCGGTCATGACCCTGAAGTTCCAATTGTCAATCAAGATGATCAAGTAGAGGAAGTTACTACTCAGCAAGGGGTCGATC
CTCTGGCTCCCCCTATGCAGGAGGCTAATCCCCTGATTCCTCCCGGTCAGCGCAGGGTTGATCCTCCTCCTCCCCCGCCTCCTCCGGCCCCTCCTGCGGCTCCTATGCTG
ATCACTCCGGAAGCCCTCCAGACCATGTTCGATAACATGGCCCAGAAGAATGCTAGGCCACCGCGAAACCCTAATTGGGTACCTGAGAACGCGGAGGAATCCCAGTTCAT
TAGGGACTTCAAGCGCTACGGGCCTCCCTCTTTTGATGGGCAATCCGAAAATCCTTTAGCAGCAGAGCGATGGATCACTGATTTGGAGGCACTGTTGGACCTCATGAACT
GTAATGATTCCTTGAAAATCAGAGGGGCAGTTTTCATGCTCAAGGATGACGCTCGCACGTGGTGGCAATCGGTGGCAGCAGCCGAAGACCATGCTAATCAACCGATCTCG
TGGGAAAGGTTCAAGGATCTATTGTATGATAATTACTTCCCGGAGACAGTCAAGGACGACAAAGAAGCGGAATTTCTTCATTTGGCCCAGGGGAGTATGTCTGTAGTGCA
GTATGAGAGGAAGTTCACTGCACTATCACGCTTTGCTCCTGACCTGGTCAGCACGCCAGAGCGGAAGATTAAGAGGTTCATTAAAGGTCTTCGTGAGGAAATTCGAGGCT
CTGTAGCCCTAAGCAGGCCCGCGACCTTTGCTGAAGCACTCACAGGTGCATTGATCATGGATAAGAATGTTTCTAAGAAGCCACAACCTCATCTCGAGAAGGGATCAACC
TCTGGAGATAAAAGAAAGTTGTCTCCCCTGAGGAACCCACCTATTGAGCCTACTCAGCAACAGCCCAGACGCCAAGTGTCCAAGGAGGTTAGCCAAGCAAGCATCAATGG
AGTCCTTACAGGTGGGAAGTCGGTTTCACCTCTCATTTCCAGATCGTGCAAGATCCGAGTTCGGTTTGGTCTTTGTCAGACCCTTAGTGTTCTTAATGCACTTGGTCAAG
TTGCCTGTAAGCGAGCTTTGCTTCAAGTGTTGCACCAGCCGCAGGTCTCCTCCTCTCTGTCGTGTGTTGCGCCGCCGCCAGCCCCTTCTCCGGCGTCTTTCCCCCCCCCC
GCGAGCTCTCTCCGTGAGTTCTCTTTCGCTCGCGCCGCCTCCGTCGTCAAGGCCGAGCCATGGGTCACGCGTCCTTCTCCCTCTGCGCGTCTCTCTCCGCGAATCTCCCT
CTCTCTCGATTTCTGGTTCGTGTGGAAGCCGCCGCCGCCAAGCCTGTCGCTGCCGGAGCTTGTGCAACGCCGTCATCGAGCCCGATCTAACTCCTTCTTGCCCGTATTGC
GCGGACAGCAGCTCGGAGTCTCCTTTCCTCGCGTTTTCGTTGCTGTCCAGCAGCGTCATTGGGCGTTTCCGGCGTCAGTTAGCGTGTCCGCGCCGTCTAGGTGTTCGATT
AAGTTCGAAACACTTCAACTTGGGTACCCACTGCTCAAAGAGCGTTCTAGCTCATTGGTTGTGGTTGGTGTAACCCGTCTAGCGCAGAAGCAGGTCCGTTGGCGAGCGTT
TGATCTCAAATATCATGTTCAGCGAATACCCACAACTCGAAAGACCTTGATTTTGGTTACCCATAACCCGGTGACTTGGGTTCTTGGTTGTTGGGTCGTTTCGAACACAA
GTCAGCTTGTTCTCAAGCATGAGTTGAGAGCCTGTGATTACATGTATATGCTTGGTGGGCATAATACGGTCAATACGTTGCTTAGTCATCGAGGCCTTGGGTATAAATGG
TCAAGGGTCGGTGTGATGAGTCCTGAGGCAAGTATTGGGCTTGGGTATAAATGGTCAAGGGTCAATACGTTGCTTAGTCGTCGAGGCCTTGAGTATAAATGGTCAAGGGT
CGATGCACAGTTCGAGGCCTTGGGTATAAATGGTCAAGGGTCGAATGTCGAGCTCTGTAGAGAAGTGTCGAGGCCCTGGGTAGGGGCTGCTTACCAGTACCTTAGTGTAC
TGACCCCCTCCCCTCTCTCTCCCCCCAACTACCAGATTTTGCAGGTTATGAGGACTGCGTGGACCGTGGTGATGCGGAGGAGGCGTATGAGGAAGGACCTTAGTTAG
mRNA sequence
ATGGGCTTATTTGTAGCTCAAGTCGTCATTCTTGATGATGACATACAAGATTCATTATTTTTTAGAAAGAAAATGAATGATGTTGACAAGGATCAATGGTTTAAAGTCAT
CGACCTGGAAATGAAGTCTTTGCATTTCAATTCCATCTGGGGTCTTGTAAATTTGCTTGATGAGACACCTCAAGAAGTTGAGGACATGAGACATATGCCCTATGCAGTAG
GGTTGTCAGTGAGTTTCAATCAATTCCAAGATTTGATCTCTGGAGTGTCGTTACATACATCCTCCAGTATCTTAGGAGAACGAGGATACACTGACACAGATGTTTTAACT
GATAAGGATTTGATGAAATCTACATCAGTGTCTGTCTTCACTCTTAATGGAAAAGCTATACTGATCAATCTCGTATCGTTGGAGCTTCTGATCTCCGCAACCCCTCCCCT
TCTGCGTTTTCTTCAGCCGCCAGCCGCAGATCCTCCTCTCTGCCGTGTGTTGCGCCGCCGCCAGCCGCTCTCCGCGTCTTCCCTCGCTCTCTCCGAGTCCTTCTCGCGCC
GCCTCCTCGCAAGCAGTCGCCTCGCGCGTCCTTCTCCTCTGCGCGTCTCTCCCGCGAATCTCCCTCTCTCTCGTTTCTGGTTCGTGTGGAGCCGCGGCCCAGCTTCGCTC
CGAGCTTGTCAGCCGTCACGTGAGTCCTCTTTCGCCGTTTTTGCAGAACTCTGGGTAAGTCTCATCCTCTGTGTATTCGTTAAGGCCGATCTAACTCCTCTTGGTTTGCG
GCAGCAGTCGGAGTCTCCTTTCCTCGCGTTTTCGTTGCTGTCCAGCAGCGTCATTGGGCGTTTCCGGCGTCGGTTAGCGTGTCGGGCCTCGGGTATAAAAGGTCGGGGAC
TGATATATCACTATTGGTGTCGATGCCTCGGGTATAAATGGTCGAGGGTCGATATGCCAATGTTAGATAAAGAGGAGCATCGAGGCCTTGGGTATAAATGGTCAAGGGTC
GGTGTGATGAGTCCTGAGGCAAGTATTGAGGCCTTGGGTATAAATGGTCAAGGGTCAGGTCGAATGCCGAGCTCTGTAGAGAAGTGTCGAGGCCCTGGGTATAAATGGTC
AGGGGTCGGTACAACTCGAAGGGTAGGGGCTGCTTACCAGTACCTTAGTGTACTGACCCCCTCCCCTCTCTCTCCCCCCAACTACCAGATTTTGCAGGTTATGAGGACTG
CGTGGACCGTGGCTGTTGGTTCTGTGGATATTCTGTTGCGTTGTAGAATGGCGGAGTCGTTTAATTTGTTAAATAGAAGTGTTTTATGCATAACAGTTGGTATCAGAGCG
AAACCTCTCCCAGTAGGATGTGGTTCGGGGACGAACCAAGGCGGAAGCTGGTGGGCATGTGACGCCCGAGGGAGGTGTAGGGATGCCGTGAGCTGCCTCGTTCGATCCTC
TGGCGTTACATGTATCAGCCAGGCAATGTCTCGCGGTCATGACCCTGAAGTTCCAATTGTCAATCAAGATGATCAAGTAGAGGAAGTTACTACTCAGCAAGGGGTCGATC
CTCTGGCTCCCCCTATGCAGGAGGCTAATCCCCTGATTCCTCCCGGTCAGCGCAGGGTTGATCCTCCTCCTCCCCCGCCTCCTCCGGCCCCTCCTGCGGCTCCTATGCTG
ATCACTCCGGAAGCCCTCCAGACCATGTTCGATAACATGGCCCAGAAGAATGCTAGGCCACCGCGAAACCCTAATTGGGTACCTGAGAACGCGGAGGAATCCCAGTTCAT
TAGGGACTTCAAGCGCTACGGGCCTCCCTCTTTTGATGGGCAATCCGAAAATCCTTTAGCAGCAGAGCGATGGATCACTGATTTGGAGGCACTGTTGGACCTCATGAACT
GTAATGATTCCTTGAAAATCAGAGGGGCAGTTTTCATGCTCAAGGATGACGCTCGCACGTGGTGGCAATCGGTGGCAGCAGCCGAAGACCATGCTAATCAACCGATCTCG
TGGGAAAGGTTCAAGGATCTATTGTATGATAATTACTTCCCGGAGACAGTCAAGGACGACAAAGAAGCGGAATTTCTTCATTTGGCCCAGGGGAGTATGTCTGTAGTGCA
GTATGAGAGGAAGTTCACTGCACTATCACGCTTTGCTCCTGACCTGGTCAGCACGCCAGAGCGGAAGATTAAGAGGTTCATTAAAGGTCTTCGTGAGGAAATTCGAGGCT
CTGTAGCCCTAAGCAGGCCCGCGACCTTTGCTGAAGCACTCACAGGTGCATTGATCATGGATAAGAATGTTTCTAAGAAGCCACAACCTCATCTCGAGAAGGGATCAACC
TCTGGAGATAAAAGAAAGTTGTCTCCCCTGAGGAACCCACCTATTGAGCCTACTCAGCAACAGCCCAGACGCCAAGTGTCCAAGGAGGTTAGCCAAGCAAGCATCAATGG
AGTCCTTACAGGTGGGAAGTCGGTTTCACCTCTCATTTCCAGATCGTGCAAGATCCGAGTTCGGTTTGGTCTTTGTCAGACCCTTAGTGTTCTTAATGCACTTGGTCAAG
TTGCCTGTAAGCGAGCTTTGCTTCAAGTGTTGCACCAGCCGCAGGTCTCCTCCTCTCTGTCGTGTGTTGCGCCGCCGCCAGCCCCTTCTCCGGCGTCTTTCCCCCCCCCC
GCGAGCTCTCTCCGTGAGTTCTCTTTCGCTCGCGCCGCCTCCGTCGTCAAGGCCGAGCCATGGGTCACGCGTCCTTCTCCCTCTGCGCGTCTCTCTCCGCGAATCTCCCT
CTCTCTCGATTTCTGGTTCGTGTGGAAGCCGCCGCCGCCAAGCCTGTCGCTGCCGGAGCTTGTGCAACGCCGTCATCGAGCCCGATCTAACTCCTTCTTGCCCGTATTGC
GCGGACAGCAGCTCGGAGTCTCCTTTCCTCGCGTTTTCGTTGCTGTCCAGCAGCGTCATTGGGCGTTTCCGGCGTCAGTTAGCGTGTCCGCGCCGTCTAGGTGTTCGATT
AAGTTCGAAACACTTCAACTTGGGTACCCACTGCTCAAAGAGCGTTCTAGCTCATTGGTTGTGGTTGGTGTAACCCGTCTAGCGCAGAAGCAGGTCCGTTGGCGAGCGTT
TGATCTCAAATATCATGTTCAGCGAATACCCACAACTCGAAAGACCTTGATTTTGGTTACCCATAACCCGGTGACTTGGGTTCTTGGTTGTTGGGTCGTTTCGAACACAA
GTCAGCTTGTTCTCAAGCATGAGTTGAGAGCCTGTGATTACATGTATATGCTTGGTGGGCATAATACGGTCAATACGTTGCTTAGTCATCGAGGCCTTGGGTATAAATGG
TCAAGGGTCGGTGTGATGAGTCCTGAGGCAAGTATTGGGCTTGGGTATAAATGGTCAAGGGTCAATACGTTGCTTAGTCGTCGAGGCCTTGAGTATAAATGGTCAAGGGT
CGATGCACAGTTCGAGGCCTTGGGTATAAATGGTCAAGGGTCGAATGTCGAGCTCTGTAGAGAAGTGTCGAGGCCCTGGGTAGGGGCTGCTTACCAGTACCTTAGTGTAC
TGACCCCCTCCCCTCTCTCTCCCCCCAACTACCAGATTTTGCAGGTTATGAGGACTGCGTGGACCGTGGTGATGCGGAGGAGGCGTATGAGGAAGGACCTTAGTTAG
Protein sequence
MGLFVAQVVILDDDIQDSLFFRKKMNDVDKDQWFKVIDLEMKSLHFNSIWGLVNLLDETPQEVEDMRHMPYAVGLSVSFNQFQDLISGVSLHTSSSILGERGYTDTDVLT
DKDLMKSTSVSVFTLNGKAILINLVSLELLISATPPLLRFLQPPAADPPLCRVLRRRQPLSASSLALSESFSRRLLASSRLARPSPLRVSPANLPLSRFWFVWSRGPASL
RACQPSRESSFAVFAELWVSLILCVFVKADLTPLGLRQQSESPFLAFSLLSSSVIGRFRRRLACRASGIKGRGLIYHYWCRCLGYKWSRVDMPMLDKEEHRGLGYKWSRV
GVMSPEASIEALGINGQGSGRMPSSVEKCRGPGYKWSGVGTTRRVGAAYQYLSVLTPSPLSPPNYQILQVMRTAWTVAVGSVDILLRCRMAESFNLLNRSVLCITVGIRA
KPLPVGCGSGTNQGGSWWACDARGRCRDAVSCLVRSSGVTCISQAMSRGHDPEVPIVNQDDQVEEVTTQQGVDPLAPPMQEANPLIPPGQRRVDPPPPPPPPAPPAAPML
ITPEALQTMFDNMAQKNARPPRNPNWVPENAEESQFIRDFKRYGPPSFDGQSENPLAAERWITDLEALLDLMNCNDSLKIRGAVFMLKDDARTWWQSVAAAEDHANQPIS
WERFKDLLYDNYFPETVKDDKEAEFLHLAQGSMSVVQYERKFTALSRFAPDLVSTPERKIKRFIKGLREEIRGSVALSRPATFAEALTGALIMDKNVSKKPQPHLEKGST
SGDKRKLSPLRNPPIEPTQQQPRRQVSKEVSQASINGVLTGGKSVSPLISRSCKIRVRFGLCQTLSVLNALGQVACKRALLQVLHQPQVSSSLSCVAPPPAPSPASFPPP
ASSLREFSFARAASVVKAEPWVTRPSPSARLSPRISLSLDFWFVWKPPPPSLSLPELVQRRHRARSNSFLPVLRGQQLGVSFPRVFVAVQQRHWAFPASVSVSAPSRCSI
KFETLQLGYPLLKERSSSLVVVGVTRLAQKQVRWRAFDLKYHVQRIPTTRKTLILVTHNPVTWVLGCWVVSNTSQLVLKHELRACDYMYMLGGHNTVNTLLSHRGLGYKW
SRVGVMSPEASIGLGYKWSRVNTLLSRRGLEYKWSRVDAQFEALGINGQGSNVELCREVSRPWVGAAYQYLSVLTPSPLSPPNYQILQVMRTAWTVVMRRRRMRKDLS