CuGenDBv2

Tan0019772 (gene) of Snake gourd v1 genome

Gene ID: Tan0019772
Organism: Trichosanthes anguina (Snake gourd v1)
Description: endonuclease MutS2 isoform X1
Genome location: LG01:432444..473189
RNA-Seq Expression: Tan0019772
Synteny: Tan0019772
Gene Ontology terms:
  GO:0006298 - mismatch repair (biological process)
  GO:0005524 - ATP binding (molecular function)
  GO:0030983 - mismatched DNA binding (molecular function)
InterPro domains:
  IPR045076 - DNA mismatch repair MutS family
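
The genome location in this record has the form `sequence:start..end`. As a minimal sketch, the Python below splits that string and computes the span length. It assumes the coordinates are 1-based and inclusive, which this page does not state; the function names `parse_location` and `span_length` are hypothetical helpers, not part of any CuGenDB tool.

```python
import re


def parse_location(location: str) -> tuple[str, int, int]:
    """Split a string such as 'LG01:432444..473189' into (name, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", location.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    name = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return name, start, end


def span_length(start: int, end: int) -> int:
    # Assumption: 1-based inclusive coordinates (not confirmed by the page).
    return end - start + 1


name, start, end = parse_location("LG01:432444..473189")
print(name, start, end, span_length(start, end))
```

Under the inclusive-coordinate assumption the gene spans 40746 bp on LG01, far longer than the 781 nt CDS, which would be consistent with introns or untranslated regions making up most of the locus.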


Homology
GenBank top hits (e value, % identity, alignment)
XP_022922840.1 uncharacterized protein LOC111430703 isoform X1 [Cucurbita moschata]; e value 6.7e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

XP_022922841.1 uncharacterized protein LOC111430703 isoform X2 [Cucurbita moschata]; e value 6.7e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

XP_022922843.1 uncharacterized protein LOC111430703 isoform X3 [Cucurbita moschata]; e value 6.7e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

XP_022922844.1 uncharacterized protein LOC111430703 isoform X4 [Cucurbita moschata]; e value 6.7e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

XP_022922845.1 uncharacterized protein LOC111430703 isoform X5 [Cucurbita moschata]; e value 6.7e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

TrEMBL top hits (e value, % identity, alignment)
A0A6J1E4M8 uncharacterized protein LOC111430703 isoform X4; e value 3.3e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

A0A6J1E586 uncharacterized protein LOC111430703 isoform X2; e value 3.3e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

A0A6J1E7X9 uncharacterized protein LOC111430703 isoform X3; e value 3.3e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

A0A6J1E9Y4 uncharacterized protein LOC111430703 isoform X1; e value 3.3e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

A0A6J1E9Z0 uncharacterized protein LOC111430703 isoform X5; e value 3.3e-87; identity 66.08%
Query:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP
        MLSAA+F H LT I SATLP  +V SFRFQN A   HFSLS N SV N IR D N+HSIHLDSLRALEWDKLCDSVASFARTSLG QAIK          
Subjt:  MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMP

Query:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL
                                                                 AQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDL L
Subjt:  LSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHL

Query:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        VKSA+EHAQRSLPMDGNEAVA+A LLQFADMLQFNLKTAIKED DWS RF PLT+VIMGMVVNQSLIKLILNVVDEDGSVKDSA S
Subjt:  VKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS

SwissProt top hits (e value, % identity, alignment)
No hits found
Arabidopsis top hits (e value, % identity, alignment)
AT5G54090.1 DNA mismatch repair protein MutS, type 2; e value 1.1e-37; identity 38.79%
Query:  NKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMPLSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIP
        +K     DSLR LEWDKLCD VASFARTSLG +A K                                                                
Subjt:  NKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMPLSIPVSGLNLKDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIP

Query:  LSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHLVKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLT
            +LWSL++++ ESL+LLDET+AA++M +HG   LDLS + + LV+S I HA+R L +  ++A+ VA+LL+F + LQ +LK AIK+D DW  RF PL+
Subjt:  LSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHLVKSAIEHAQRSLPMDGNEAVAVATLLQFADMLQFNLKTAIKEDVDWSARFRPLT

Query:  EVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
        E+I+  V+N+S +KL+  V+D DG++KDSA S
Subjt:  EVIMGMVVNQSLIKLILNVVDEDGSVKDSAFS
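
In BLAST-style pairwise output, the unlabeled middle line of each block marks identical residues with the residue letter, conservative substitutions with `+`, and mismatches or gaps with a space. The sketch below counts those marks for one block. The function name `midline_stats` is hypothetical, and the block width is passed explicitly because trailing spaces on the midline may be lost in copied text. Percentages computed this way from page excerpts need not reproduce the reported % identity, since the page does not say which region that figure covers.

```python
def midline_stats(midline: str, width: int) -> dict[str, float]:
    """Count identities and positives on one alignment midline.

    `width` is the number of aligned columns in the block (the length of
    the Query line's sequence portion).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    identical = sum(ch.isalpha() for ch in midline)  # letters mark identities
    positives = identical + midline.count("+")       # '+' marks similar residues
    return {
        "identical": identical,
        "positives": positives,
        "pct_identity": round(100.0 * identical / width, 2),
    }


# Toy example: 9 aligned columns, 7 identities, 1 conservative substitution.
print(midline_stats("MLSAA+F H", 9))
```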


Sequences
CDS sequence
ATGCTTTCTGCAGCTCTTTTTGCCCATCCCCTCACCTCGATCATCTCTGCTACACTGCCGGCTAAAACCGTCGGTTCGTTCAGATTCCAGAATGGAGCTGCATCTAAACA
CTTCTCCCTCTCTGTAAACAACTCCGTCAGCAATGGCATTAGAGATGACGGAAACAAACATTCAATCCACCTCGATAGTCTCCGAGCGCTGGAATGGGATAAACTTTGCG
ATTCTGTTGCTTCCTTCGCGCGCACTTCTCTGGGCCATCAAGCTATCAAGGTGGCTCCTAATATTTGCGGATTTATGCCTTTATCTATTCCGGTTTCAGGGTTGAATCTA
AAGGATTTTTCATTGAGTTTCGAACCTTTCACGGAGTCGCATAGTTGTTGCGTTAACAATATCTATTTCAGTATTGGGGGCGAGAAGGATAATTTCTTTATGAAGTACAA
AATGAATGGCTTGTTTATCCCTCTTTCTTTGGCTCAACTTTGGTCTTTGAACCGGACATATGAAGAAAGTTTGAGACTTTTGGATGAGACTAATGCGGCAGTAGAAATGC
ACAAGCATGGTGGCTGCAGCCTTGATTTAAGTGGTGTCGACCTTCATCTTGTGAAATCTGCAATAGAACATGCTCAAAGGAGTCTGCCAATGGATGGAAATGAAGCAGTG
GCTGTTGCAACTCTTCTACAGTTTGCTGATATGTTGCAATTTAATTTGAAAACTGCAATCAAAGAAGATGTGGACTGGTCTGCACGTTTTAGGCCCCTAACAGAAGTGAT
AATGGGAATGGTGGTAAATCAATCATTGATTAAACTGATACTGAATGTAGTAGATGAAGATGGCTCAGTTAAAGATTCTGCGTTTTCAGTAGAATGTGCTACAAGTGCTG
ATTGGGTCTAA
mRNA sequence
TAGCAAGATAGCATTAACCGGGAGCGGTGACTTGTCCTAACTCCTACAAAGCTTTTCGTCCCTCGTCGGTGGCAGCAAGGCTAAAACGATGCTTTCTGCAGCTCTTTTTG
CCCATCCCCTCACCTCGATCATCTCTGCTACACTGCCGGCTAAAACCGTCGGTTCGTTCAGATTCCAGAATGGAGCTGCATCTAAACACTTCTCCCTCTCTGTAAACAAC
TCCGTCAGCAATGGCATTAGAGATGACGGAAACAAACATTCAATCCACCTCGATAGTCTCCGAGCGCTGGAATGGGATAAACTTTGCGATTCTGTTGCTTCCTTCGCGCG
CACTTCTCTGGGCCATCAAGCTATCAAGGTGGCTCCTAATATTTGCGGATTTATGCCTTTATCTATTCCGGTTTCAGGGTTGAATCTAAAGGATTTTTCATTGAGTTTCG
AACCTTTCACGGAGTCGCATAGTTGTTGCGTTAACAATATCTATTTCAGTATTGGGGGCGAGAAGGATAATTTCTTTATGAAGTACAAAATGAATGGCTTGTTTATCCCT
CTTTCTTTGGCTCAACTTTGGTCTTTGAACCGGACATATGAAGAAAGTTTGAGACTTTTGGATGAGACTAATGCGGCAGTAGAAATGCACAAGCATGGTGGCTGCAGCCT
TGATTTAAGTGGTGTCGACCTTCATCTTGTGAAATCTGCAATAGAACATGCTCAAAGGAGTCTGCCAATGGATGGAAATGAAGCAGTGGCTGTTGCAACTCTTCTACAGT
TTGCTGATATGTTGCAATTTAATTTGAAAACTGCAATCAAAGAAGATGTGGACTGGTCTGCACGTTTTAGGCCCCTAACAGAAGTGATAATGGGAATGGTGGTAAATCAA
TCATTGATTAAACTGATACTGAATGTAGTAGATGAAGATGGCTCAGTTAAAGATTCTGCGTTTTCAGTAGAATGTGCTACAAGTGCTGATTGGGTCTAAATTGTCGGCAA
AACCACGGATTATTTAATTCAATGCGGTGAAAGCTATCTTTTTTGAAACTTGGTTTGAAAGAAATCAAAGAGTCTTTCAAAGCAAGTTTCTTCCTTGGGCAGTTCAGTTT
GATTCGGCTCGTATTAAGGCTTCATCATGGTGCTTTCTTTCCAAACTCTTTGCAGATTATTCTCTTCA
Protein sequence
MLSAALFAHPLTSIISATLPAKTVGSFRFQNGAASKHFSLSVNNSVSNGIRDDGNKHSIHLDSLRALEWDKLCDSVASFARTSLGHQAIKVAPNICGFMPLSIPVSGLNL
KDFSLSFEPFTESHSCCVNNIYFSIGGEKDNFFMKYKMNGLFIPLSLAQLWSLNRTYEESLRLLDETNAAVEMHKHGGCSLDLSGVDLHLVKSAIEHAQRSLPMDGNEAV
AVATLLQFADMLQFNLKTAIKEDVDWSARFRPLTEVIMGMVVNQSLIKLILNVVDEDGSVKDSAFSVECATSADWV
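
The CDS and protein sequences in this record can be checked against each other by translating the CDS. The sketch below assumes the standard genetic code, which is an assumption for this nuclear gene and is not stated on the page; `translate` is a hypothetical helper that stops at the first stop codon and ignores a trailing partial codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt of the CDS above; prints MLSAALFAHP, the start of the protein.
print(translate("ATGCTTTCTGCAGCTCTTTTTGCCCATCCC"))
```

Passing the full CDS (line breaks included) to `translate` and comparing the result with the protein sequence is a quick consistency check on a copied record.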