# Wolfram Language™

## String Decomposition

Examine the relative frequencies of codons (groups of three consecutive nucleotides) in the nucleotide sequence of a gene.

Get the DNA sequence of the human gene "SCNN1A".

In[1]:=
`dnasequence = GenomeData["SCNN1A", "FullSequence"];`

Use StringPartition to split the sequence into consecutive, non-overlapping blocks of three nucleotides, giving the corresponding list of codons.

In[2]:=
`codons = StringPartition[dnasequence, 3];`
In[3]:=
`Take[codons, 10]`
Out[3]=
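
To illustrate how StringPartition behaves, the sketch below applies it to a short made-up string rather than to the real gene. The name `toySequence` and its value are assumptions chosen for illustration. The example shows that the blocks do not overlap and that trailing characters that do not fill a complete block are dropped.

```wolfram
(* Hypothetical 7-character sequence, not taken from SCNN1A *)
toySequence = "ATGGCCA";

(* Non-overlapping blocks of length 3; the trailing "A" is discarded *)
StringPartition[toySequence, 3]
(* {"ATG", "GCC"} *)

(* Number of characters left over after partitioning *)
Mod[StringLength[toySequence], 3]
(* 1 *)
```

The same `Mod[StringLength[dnasequence], 3]` check on the real data would show how many nucleotides at the end of the sequence, if any, are left out of the codon list.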

Compute the relative frequency of each codon in this gene.

In[4]:=
`frequencies = N[Counts[codons]/Length[codons]];`
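
The following sketch walks through the same computation on a tiny hand-made list, so the intermediate association is easy to inspect. The names `toyCodons`, `toyCounts` and `toyFrequencies` are hypothetical and exist only for this illustration.

```wolfram
(* Hypothetical list of four codons, for illustration only *)
toyCodons = {"ATG", "GCC", "ATG", "TAA"};

(* Counts returns an association from each distinct element to its count *)
toyCounts = Counts[toyCodons]
(* <|"ATG" -> 2, "GCC" -> 1, "TAA" -> 1|> *)

(* Division and N act on the values and leave the keys in place *)
toyFrequencies = N[toyCounts/Length[toyCodons]]
(* <|"ATG" -> 0.5, "GCC" -> 0.25, "TAA" -> 0.25|> *)

(* Relative frequencies sum to 1 *)
Total[toyFrequencies]
(* 1. *)
```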

There are 4^3 = 64 possible codons formed from the A, C, G, T nucleotides, and all of them appear in the chosen gene.

In[5]:=
`frequencies // Length`
Out[5]=
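
One way to make this check explicit is to enumerate every possible codon and compare the result with the keys of `frequencies`. This is a minimal sketch, not part of the original example; the name `allCodons` is an assumption introduced here.

```wolfram
(* All ordered triples of the four nucleotides, joined into strings *)
allCodons = StringJoin /@ Tuples[{"A", "C", "G", "T"}, 3];

Length[allCodons]
(* 64 *)

(* Codons that never occur in the gene; an empty list means all 64 appear *)
Complement[allCodons, Keys[frequencies]]
```

Comparing the key sets directly also guards against the case where the count matches by coincidence, for example if the sequence contained a symbol other than A, C, G or T.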

Find the three codons with the highest frequencies.

In[6]:=
`TakeLargest[frequencies, 3]`
Out[6]=

Find the three codons with the lowest frequencies.

In[7]:=
`TakeSmallest[frequencies, 3]`
Out[7]=
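
When applied to an association, TakeLargest and TakeSmallest select entries by value and return an association that keeps the keys. The sketch below shows this on a small made-up association; `toyAssoc` and its values are hypothetical.

```wolfram
(* Hypothetical frequencies, for illustration only *)
toyAssoc = <|"AAA" -> 0.1, "CCC" -> 0.4, "GGG" -> 0.2, "TTT" -> 0.3|>;

(* The two entries with the largest values, largest first *)
TakeLargest[toyAssoc, 2]
(* <|"CCC" -> 0.4, "TTT" -> 0.3|> *)

(* The two entries with the smallest values, smallest first *)
TakeSmallest[toyAssoc, 2]
(* <|"AAA" -> 0.1, "GGG" -> 0.2|> *)
```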

Visualize all relative frequencies in a Grid.

In[8]:=
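```
(* Shade alternating codon/frequency cell pairs in a checkerboard pattern *)
background = Thread[Rule[
    Flatten[{
      Outer[List, Range[1, 15, 2], {3, 4, 7, 8}],
      Outer[List, Range[2, 16, 2], {1, 2, 5, 6}]
     }, 2],
    GrayLevel[0.9]]];

(* Lay out the sorted codons and frequencies, four pairs per row *)
Grid[Partition[Sequence @@@ Normal[KeySort@frequencies], 8],
 Spacings -> {1, 1}, Dividers -> All,
 Background -> {None, None, background}]
```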
Out[8]=
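
The expression `Sequence @@@ Normal[KeySort@frequencies]` in the Grid input above is compact, so the following sketch unpacks it step by step on a two-entry association. The names `toyGridData`, `rules` and `flat` are hypothetical and used only for this illustration.

```wolfram
(* Hypothetical association with two entries, standing in for frequencies *)
toyGridData = <|"CCC" -> 0.75, "AAA" -> 0.25|>;

(* KeySort orders the keys; Normal turns the association into a list of rules *)
rules = Normal[KeySort[toyGridData]]
(* {"AAA" -> 0.25, "CCC" -> 0.75} *)

(* Replacing each Rule head with Sequence splices key and value into one flat list *)
flat = Sequence @@@ rules
(* {"AAA", 0.25, "CCC", 0.75} *)

(* Two entries per row here; the full example uses 8, that is, 4 codon/frequency pairs *)
Grid[Partition[flat, 2], Dividers -> All]
```

In the full example, 64 codons give a flat list of 128 entries, so partitioning into rows of 8 produces 16 rows. That matches the row indices 1 through 16 used to build `background`.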