CuGenDBv2

Lag0023079 (gene) of Sponge gourd (AG-4) v1 genome

Gene ID: Lag0023079
Organism: Luffa acutangula AG-4 (Sponge gourd (AG-4) v1)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Genome location: chr7:43862761..43869460
RNA-Seq Expression: Lag0023079
Synteny: Lag0023079
Gene Ontology terms:
GO:0044237 - cellular metabolic process (biological process)
GO:0044238 - primary metabolic process (biological process)
GO:0071704 - organic substance metabolic process (biological process)
GO:0110165 - cellular anatomical structure (cellular component)
GO:0003676 - nucleic acid binding (molecular function)
GO:0003824 - catalytic activity (molecular function)
GO:0008270 - zinc ion binding (molecular function)
InterPro domains:
IPR001878 - Zinc finger, CCHC-type
IPR036875 - Zinc finger, CCHC-type superfamily


Homology
GenBank top hits (hit, e-value, % identity, alignment)
TXG54059.1 hypothetical protein EZV62_019315 [Acer yangbiense] (e-value: 7.6e-47, identity: 48.58%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF +W+ K+KA+L QQK L+AI  P+KLP  +T E K+ M  +  GTIILN+S++VLR++ DE T   +W+KL  LY  K   N++Y++ER F +K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSK
        MDA K L +NLD+FKKMT +  N G  EK+ DENE  +LLNSLP+++++VK A KYGR S++ +E ISA+++KELEL   KK++ E LFV+G+   K+  
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSK

Query:  HQTEDKPKPKVK
        + + +K K + K
Subjt:  HQTEDKPKPKVK

TYK27723.1 Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Cucumis melo var. makuwa] (e-value: 6.2e-49, identity: 46.49%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF L   +I A LG QKAL+A+ DP +LPA +T  ++E++  + Y T+I+NI+++VLRQVI+E T +  W+ L  LYE KD  N+M++RE+ F++K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKT--RSKD
        M+  K L ENLDEFKK+T+      EK+  E+E  +L+N + + Y+EVK + KYGRE+IT + +I+A+++KELEL +  K S  +E LF KG    R   
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKT--RSKD

Query:  SKHQTEDKPKPKVKCNYCHKEGHIKREC
        +K+Q   + KP +KC  CHKEGH KR C
Subjt:  SKHQTEDKPKPKVKCNYCHKEGHIKREC

XP_038885928.1 uncharacterized protein LOC120076236 [Benincasa hispida] (e-value: 2.3e-59, identity: 48.11%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        + K DF LWKAKIK +L +QKAL AI+DP K P ++   +KE++    YGTI+LN+ +SVLRQ++D  T Y +W KLN +Y  KD  N+ ++RERFFTYK
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKTRSKDSK
        MD  K+L++NL+EFK ++S+F+++G+ I +ENE F+LLNSLPE +++VK A KYGRE ITT  IISA+  KELEL   KK+    EG F KG  ++    
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKTRSKDSK

Query:  HQTEDKPKPKVKCNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQSEVVVGENSITYTDALAGLD
        ++T +          C ++  +K++CY+LKRK   +     GKQ+E  VGENS+ Y+DALA  +
Subjt:  HQTEDKPKPKVKCNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQSEVVVGENSITYTDALAGLD

XP_038887098.1 uncharacterized protein LOC120077280 [Benincasa hispida] (e-value: 5.1e-51, identity: 53.73%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        D K+DF L KAKIKA+LGQQKAL AI+DP K P  ++  +KE++    YGTIILN+++SVLRQ++D+ T Y +W KLN++Y  KD  N+ ++RERFFTYK
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSKHQ
        MD  K+L++NL+EFK+++SEF+++G+ I +ENE F+L NSLPE +++VK A KY R+ IT D IISA+R KELEL  I +E  +   VKG+TR+    ++
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSKHQ

Query:  T
        T
Subjt:  T

XP_038890043.1 uncharacterized protein LOC120079747 [Benincasa hispida] (e-value: 7.1e-53, identity: 55.1%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        D K DF LWK KIK +LGQQKAL AI+DP K P  +T  +KE++ +   GTI+LN++++VLRQVI++ T Y +W KLN++Y  KD  N+ ++RERFFTYK
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKK--ESSEGLFVKGKTRS
        MDA K+L++ L+EFK+++SEF+++G  I +ENE F+LLNSLPE++++ K A KYGRE ITT+ IISA+R +ELEL   KK  +  EGLF KGK ++
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKK--ESSEGLFVKGKTRS

TrEMBL top hits (hit, e-value, % identity, alignment)
A0A5C7GPM1 Uncharacterized protein (e-value: 4.1e-46, identity: 51.81%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF +W+ K+KA+L QQK L+AI  PDKLP  +  E K  M  +   TIILN+S++VLR+V DE T Y +W KL  LY  K   N++Y++ER F++K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNL--GEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGK
        MDA K L +NLDE+KKMT E  N    EK+ DENE  +LLNSLP+++++VK A KYGR S++ +E ISA+++KELEL   KK++ E LFV G+
Subjt:  MDAGKTLSENLDEFKKMTSEFKNL--GEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGK

A0A5C7GXL9 Sucrose-phosphate phosphatase (e-value: 1.1e-46, identity: 51.26%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF +W+ K+KA+L QQK L+AI  P+KLP  +T E K+ M  +  GTIILN+S++VLR++ DE T   +W+KL  LY  K   N++Y++ER F +K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDS
        MDA K L +NLD+FKKMT E  N G  EK+ DENE  +LLNSLP+++++VK A KYGR S++ +E ISA+++KELEL   KK++ E LFV+    SK S
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDS

A0A5C7HB65 gag_pre-integrs domain-containing protein (e-value: 3.7e-47, identity: 48.58%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF +W+ K+KA+L QQK L+AI  P+KLP  +T E K+ M  +  GTIILN+S++VLR++ DE T   +W+KL  LY  K   N++Y++ER F +K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSK
        MDA K L +NLD+FKKMT +  N G  EK+ DENE  +LLNSLP+++++VK A KYGR S++ +E ISA+++KELEL   KK++ E LFV+G+   K+  
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRSKDSK

Query:  HQTEDKPKPKVK
        + + +K K + K
Subjt:  HQTEDKPKPKVK

A0A5C7I661 Uncharacterized protein (e-value: 1.8e-46, identity: 51.83%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF +W+ K+KA+L QQK L+AI  P+KLP  +T E K+ M  +  GTIILN+S++VLR++ DE T   +W+KL  LY  K   N++Y++ER F +K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVK
        MDA K L +NLD+FKKMT E  N G  EK+ DENE  +LLNSLP+++++VK A KYGR S++ +E ISA+++KELEL   KK++ E LFV+
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLG--EKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVK

A0A5D3DVM0 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e-value: 3.0e-49, identity: 46.49%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        DG  DF L   +I A LG QKAL+A+ DP +LPA +T  ++E++  + Y T+I+NI+++VLRQVI+E T +  W+ L  LYE KD  N+M++RE+ F++K
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKT--RSKD
        M+  K L ENLDEFKK+T+      EK+  E+E  +L+N + + Y+EVK + KYGRE+IT + +I+A+++KELEL +  K S  +E LF KG    R   
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKES--SEGLFVKGKT--RSKD

Query:  SKHQTEDKPKPKVKCNYCHKEGHIKREC
        +K+Q   + KP +KC  CHKEGH KR C
Subjt:  SKHQTEDKPKPKVKCNYCHKEGHIKREC

SwissProt top hits (hit, e-value, % identity, alignment)
P04146 Copia protein (e-value: 4.1e-11, identity: 25.59%)
Query:  FNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYKMDAGK
        + +WK +I+A+L +Q  L+ +   D L      +  +        TII  +S+S L     + T  +I + L+ +YE K   +Q+ +R+R  + K+ +  
Subjt:  FNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYKMDAGK

Query:  TLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNA-SKYGRESITTDEIISAIRTKELELMSIKKESSE----------------GLFV
        +L  +   F ++ SE    G KI++ ++   LL +LP  Y  +  A      E++T   + + +  +E+++ +   ++S+                 LF 
Subjt:  TLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNA-SKYGRESITTDEIISAIRTKELELMSIKKESSE----------------GLFV

Query:  KGKTRSKDSKHQTEDKPKPKVKCNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQ
          K R    K   +   K KVKC++C +EGHIK++C+  KR    + +K N KQ
Subjt:  KGKTRSKDSKHQTEDKPKPKVKCNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQ

P10978 Retrovirus-related Pol polyprotein from transposon TNT 1-94 (e-value: 1.1e-29, identity: 22.14%)
Query:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK
        +G   F+ W+ +++ +L QQ   + +    K P  + AED   ++      I L++S+ V+  +IDEDT   IW +L  LY  K   N++Y++++ +   
Subjt:  DGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNISNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYK

Query:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRS------
        M  G     +L+ F  + ++  NLG KI++E++  +LLNSLP +Y  +     +G+ +I   ++ SA+   E ++    +   + L  +G+ RS      
Subjt:  MDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIISAIRTKELELMSIKKESSEGLFVKGKTRS------

Query:  ----KDSKHQTEDKPKPKVK-CNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQSEVVVGENSITYTDALAGLDFSCIQTSNKSLPKEGRVRPCSSIQDGN
              ++ +++++ K +V+ C  C++ GH KR+C +                                               P++G+        D N
Subjt:  ----KDSKHQTEDKPKPKVK-CNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQSEVVVGENSITYTDALAGLDFSCIQTSNKSLPKEGRVRPCSSIQDGN

Query:  RWLPDCGSWKSIAIVRLLRGVCSALFRKLPVISHHSSRTWSCRRLDVVPSLGRTRVMSFAELRNFDGVMKFDGKNFGYWKMQVNDYLTCRKVHKALKERP
                                                                 + A ++N D V+ F           +N+               
Subjt:  RWLPDCGSWKSIAIVRLLRGVCSALFRKLPVISHHSSRTWSCRRLDVVPSLGRTRVMSFAELRNFDGVMKFDGKNFGYWKMQVNDYLTCRKVHKALKERP

Query:  KGMADKDWEAMDEEAVASIRMCLSMDVASLVAHETTTVKLMETFTNKCSNHSSDWILDSAASVHIASYRSLFTSFTEGHHGLVRMENGRTSKTSGIGDVS
                   +EE       C+ +                       S   S+W++D+AAS H    R LF  +  G  G V+M N   SK +GIGD+ 
Subjt:  KGMADKDWEAMDEEAVASIRMCLSMDVASLVAHETTTVKLMETFTNKCSNHSSDWILDSAASVHIASYRSLFTSFTEGHHGLVRMENGRTSKTSGIGDVS

Query:  LKTECGDKLVLQNVRLVPNIKMNLISTGKLTDNGYMCEFGSHQCKLKLGSQVVAVGHRKSTLYKCQLNV
        +KT  G  LVL++VR VP+++MNLIS   L  +GY   F + + +L  GS V+A G  + TLY+    +
Subjt:  LKTECGDKLVLQNVRLVPNIKMNLISTGKLTDNGYMCEFGSHQCKLKLGSQVVAVGHRKSTLYKCQLNV

Arabidopsis top hits (hit, e-value, % identity, alignment)
AT3G29785.1 unknown protein (e-value: 4.5e-05, identity: 28.57%)
Query:  KFDGKNFGYWKMQVNDYLTCRKVHKALKERPKGMADKDWEAMDEEAVASIRMCLSMDVASLVAHETTTVKLMETFTN
        K DG ++ + +M++ DYL  +K+H+ L ++ + M+  DW  +  + +  IR+ +S ++A  VA E +   LM+  ++
Subjt:  KFDGKNFGYWKMQVNDYLTCRKVHKALKERPKGMADKDWEAMDEEAVASIRMCLSMDVASLVAHETTTVKLMETFTN


Sequences
CDS sequence
ATGTCGGTTGCTCCCAAAGGGTCCAGCTGCCACGTGTCAGCCATGGAGGAGTCCACGCATAACAACTCCTCGGTCAAAGGCACGCAAAAAGGCACCAAAATCGGTCAACG
AACAGACTGGTCAACTGCTGGAGTGTGTGGACGCCGGATTGACACGTGGACAGATGGAAAGGTAGATTTTAACTTGTGGAAGGCAAAGATTAAGGCCATACTTGGACAAC
AAAAGGCCTTACAAGCAATATCCGATCCAGATAAGTTGCCTGCATTAGTGACAGCAGAAGACAAAGAAAGCATGAATATGATAGTCTATGGCACTATTATTTTGAATATA
AGTAATAGTGTCTTGAGGCAAGTCATTGATGAGGACACTCCTTACAAAATTTGGCAGAAATTGAACAAGTTGTATGAGGTAAAGGACACACACAATCAAATGTACATGAG
GGAAAGATTTTTTACCTATAAGATGGATGCTGGAAAAACTTTGTCAGAGAACCTTGATGAATTCAAGAAGATGACAAGTGAATTCAAGAACCTCGGAGAGAAAATAGATG
ATGAAAACGAAACTTTTGTCCTTCTGAATTCACTTCCAGAGGCATACAGAGAAGTAAAAAATGCCTCAAAGTACGGAAGAGAATCTATAACTACTGATGAGATTATATCA
GCCATAAGGACAAAAGAATTAGAACTAATGTCAATCAAGAAAGAATCATCTGAAGGCCTGTTTGTAAAAGGAAAGACAAGAAGCAAAGACAGCAAACACCAAACCGAAGA
TAAACCTAAGCCCAAGGTGAAGTGTAACTATTGCCACAAGGAGGGGCACATAAAGAGAGAGTGCTACTCCTTGAAAAGGAAGAATCAGTACCATTCCTCTAAGAATAACG
GCAAACAGTCAGAAGTCGTTGTAGGGGAAAACTCCATAACATACACGGATGCCTTGGCTGGATTAGATTTCTCTTGTATCCAGACCTCAAACAAATCCCTTCCAAAAGAA
GGAAGAGTTCGACCTTGCAGTAGCATCCAGGATGGGAATAGATGGTTACCGGATTGTGGTTCTTGGAAGTCAATAGCTATTGTTAGGCTATTGAGGGGTGTTTGTAGTGC
GTTATTTCGAAAGCTTCCTGTAATCTCCCATCATTCTAGTAGAACCTGGAGTTGCAGAAGACTGGATGTAGTCCCGAGCTTGGGACGAACCAGAGTCATGAGTTTTGCAG
AGCTAAGAAATTTCGATGGAGTCATGAAGTTCGATGGAAAAAATTTTGGATATTGGAAAATGCAAGTCAATGATTATTTAACTTGCAGGAAAGTGCATAAGGCATTGAAG
GAGAGACCGAAAGGGATGGCAGACAAAGATTGGGAAGCTATGGATGAAGAGGCCGTTGCAAGCATAAGGATGTGTTTGTCAATGGATGTGGCAAGTCTAGTGGCTCATGA
GACAACTACGGTTAAATTAATGGAAACGTTTACAAACAAGTGTAGTAACCACTCATCAGATTGGATATTAGACAGTGCAGCGTCTGTACACATAGCTTCATATAGGAGTT
TGTTCACATCATTCACAGAAGGACATCATGGTCTAGTGAGGATGGAGAATGGTAGAACCTCCAAAACTAGTGGGATTGGAGATGTTAGTCTGAAGACAGAATGTGGAGAT
AAATTAGTATTGCAAAATGTCAGGCTTGTGCCTAATATCAAGATGAATCTTATTTCTACTGGCAAGTTGACAGATAATGGTTACATGTGTGAGTTTGGTAGTCACCAGTG
TAAACTCAAGTTAGGATCCCAGGTAGTGGCAGTTGGTCACAGGAAATCTACATTATACAAATGTCAGTTGAATGTTGACAAAAGATCAAAAGACAGTGGACTGGTTAAAG
CTACAGATGATAGTTGTAGAGGTAAAGTGGAACCAGCAGCAAAGAAGTCGCTTAGGCGAGTTGAGGCATCAAAGTGGAAGACTAGAGCAGTTACTAAGGTCAAAGATCAG
GTCTCTAGCTTGGCAACAGATATGAATAGAGGATTCAAGCCATTCAAAGTGTATCTTCTCCGGAAACCGTTGTTCGAGTTGGAAGGAGATGATAGGTGTCACCACTTAGC
TGAAGTGGGAGTACGTGTCCGTCTTGCTCATCTGGGAGATAGGTATTTTGAGCCTTGGAATTTACCAAGAAGATCACAGATAACAGTTGTAGTGGGAGTAGATCCTTGGA
GTTTGCCAAGATGA
mRNA sequence
ATGTCGGTTGCTCCCAAAGGGTCCAGCTGCCACGTGTCAGCCATGGAGGAGTCCACGCATAACAACTCCTCGGTCAAAGGCACGCAAAAAGGCACCAAAATCGGTCAACG
AACAGACTGGTCAACTGCTGGAGTGTGTGGACGCCGGATTGACACGTGGACAGATGGAAAGGTAGATTTTAACTTGTGGAAGGCAAAGATTAAGGCCATACTTGGACAAC
AAAAGGCCTTACAAGCAATATCCGATCCAGATAAGTTGCCTGCATTAGTGACAGCAGAAGACAAAGAAAGCATGAATATGATAGTCTATGGCACTATTATTTTGAATATA
AGTAATAGTGTCTTGAGGCAAGTCATTGATGAGGACACTCCTTACAAAATTTGGCAGAAATTGAACAAGTTGTATGAGGTAAAGGACACACACAATCAAATGTACATGAG
GGAAAGATTTTTTACCTATAAGATGGATGCTGGAAAAACTTTGTCAGAGAACCTTGATGAATTCAAGAAGATGACAAGTGAATTCAAGAACCTCGGAGAGAAAATAGATG
ATGAAAACGAAACTTTTGTCCTTCTGAATTCACTTCCAGAGGCATACAGAGAAGTAAAAAATGCCTCAAAGTACGGAAGAGAATCTATAACTACTGATGAGATTATATCA
GCCATAAGGACAAAAGAATTAGAACTAATGTCAATCAAGAAAGAATCATCTGAAGGCCTGTTTGTAAAAGGAAAGACAAGAAGCAAAGACAGCAAACACCAAACCGAAGA
TAAACCTAAGCCCAAGGTGAAGTGTAACTATTGCCACAAGGAGGGGCACATAAAGAGAGAGTGCTACTCCTTGAAAAGGAAGAATCAGTACCATTCCTCTAAGAATAACG
GCAAACAGTCAGAAGTCGTTGTAGGGGAAAACTCCATAACATACACGGATGCCTTGGCTGGATTAGATTTCTCTTGTATCCAGACCTCAAACAAATCCCTTCCAAAAGAA
GGAAGAGTTCGACCTTGCAGTAGCATCCAGGATGGGAATAGATGGTTACCGGATTGTGGTTCTTGGAAGTCAATAGCTATTGTTAGGCTATTGAGGGGTGTTTGTAGTGC
GTTATTTCGAAAGCTTCCTGTAATCTCCCATCATTCTAGTAGAACCTGGAGTTGCAGAAGACTGGATGTAGTCCCGAGCTTGGGACGAACCAGAGTCATGAGTTTTGCAG
AGCTAAGAAATTTCGATGGAGTCATGAAGTTCGATGGAAAAAATTTTGGATATTGGAAAATGCAAGTCAATGATTATTTAACTTGCAGGAAAGTGCATAAGGCATTGAAG
GAGAGACCGAAAGGGATGGCAGACAAAGATTGGGAAGCTATGGATGAAGAGGCCGTTGCAAGCATAAGGATGTGTTTGTCAATGGATGTGGCAAGTCTAGTGGCTCATGA
GACAACTACGGTTAAATTAATGGAAACGTTTACAAACAAGTGTAGTAACCACTCATCAGATTGGATATTAGACAGTGCAGCGTCTGTACACATAGCTTCATATAGGAGTT
TGTTCACATCATTCACAGAAGGACATCATGGTCTAGTGAGGATGGAGAATGGTAGAACCTCCAAAACTAGTGGGATTGGAGATGTTAGTCTGAAGACAGAATGTGGAGAT
AAATTAGTATTGCAAAATGTCAGGCTTGTGCCTAATATCAAGATGAATCTTATTTCTACTGGCAAGTTGACAGATAATGGTTACATGTGTGAGTTTGGTAGTCACCAGTG
TAAACTCAAGTTAGGATCCCAGGTAGTGGCAGTTGGTCACAGGAAATCTACATTATACAAATGTCAGTTGAATGTTGACAAAAGATCAAAAGACAGTGGACTGGTTAAAG
CTACAGATGATAGTTGTAGAGGTAAAGTGGAACCAGCAGCAAAGAAGTCGCTTAGGCGAGTTGAGGCATCAAAGTGGAAGACTAGAGCAGTTACTAAGGTCAAAGATCAG
GTCTCTAGCTTGGCAACAGATATGAATAGAGGATTCAAGCCATTCAAAGTGTATCTTCTCCGGAAACCGTTGTTCGAGTTGGAAGGAGATGATAGGTGTCACCACTTAGC
TGAAGTGGGAGTACGTGTCCGTCTTGCTCATCTGGGAGATAGGTATTTTGAGCCTTGGAATTTACCAAGAAGATCACAGATAACAGTTGTAGTGGGAGTAGATCCTTGGA
GTTTGCCAAGATGA
Protein sequence
MSVAPKGSSCHVSAMEESTHNNSSVKGTQKGTKIGQRTDWSTAGVCGRRIDTWTDGKVDFNLWKAKIKAILGQQKALQAISDPDKLPALVTAEDKESMNMIVYGTIILNI
SNSVLRQVIDEDTPYKIWQKLNKLYEVKDTHNQMYMRERFFTYKMDAGKTLSENLDEFKKMTSEFKNLGEKIDDENETFVLLNSLPEAYREVKNASKYGRESITTDEIIS
AIRTKELELMSIKKESSEGLFVKGKTRSKDSKHQTEDKPKPKVKCNYCHKEGHIKRECYSLKRKNQYHSSKNNGKQSEVVVGENSITYTDALAGLDFSCIQTSNKSLPKE
GRVRPCSSIQDGNRWLPDCGSWKSIAIVRLLRGVCSALFRKLPVISHHSSRTWSCRRLDVVPSLGRTRVMSFAELRNFDGVMKFDGKNFGYWKMQVNDYLTCRKVHKALK
ERPKGMADKDWEAMDEEAVASIRMCLSMDVASLVAHETTTVKLMETFTNKCSNHSSDWILDSAASVHIASYRSLFTSFTEGHHGLVRMENGRTSKTSGIGDVSLKTECGD
KLVLQNVRLVPNIKMNLISTGKLTDNGYMCEFGSHQCKLKLGSQVVAVGHRKSTLYKCQLNVDKRSKDSGLVKATDDSCRGKVEPAAKKSLRRVEASKWKTRAVTKVKDQ
VSSLATDMNRGFKPFKVYLLRKPLFELEGDDRCHHLAEVGVRVRLAHLGDRYFEPWNLPRRSQITVVVGVDPWSLPR