Wolfram Language

Generate Random Pronounceable Words

Generate random English-like words in which each letter is chosen with probability proportional to how often it follows the previous two letters in dictionary words.


code

trigrams = Append[#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) & /@ GatherBy[ Tally[Flatten[ Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@ Characters[DictionaryLookup[RegularExpression["[a-z]+"]]], 1]], #[[1, ;; 2]] &], {__String} -> {"e"}];
randomWord[] := StringJoin[ NestWhile[Append[#, RandomChoice[Take[#, -2] /. trigrams]] &, {"A", "A"}, #[[-1]] =!= "Z" &][[3 ;; -2]]]
englishWordQ[word_] := DictionaryLookup[word] =!= {}
CloudDeploy[APIFunction[{}, With[{word = randomWord[]}, Pane[Style[word, FontFamily -> "Times", FontSize -> 48, FontColor -> If[englishWordQ[word], Darker[Red], Black]], 1000, ImageMargins -> 100] ] &, "PNG"], Permissions -> "Public"]

Make random words using bigrams instead of trigrams.

bigrams = Append[#[[1, 1, ;; 1]] -> (#[[All, 2]] -> #[[All, 1, 2]]) & /@ GatherBy[ Tally[Flatten[ Partition[Join[{"A"}, #, {"Z"}], 2, 1] & /@ Characters[DictionaryLookup[RegularExpression["[a-z]+"]]], 1]], #[[1, 1]] &], {_String} -> {"e"}];
bigramRandomWord[] := StringJoin[ NestWhile[Append[#, RandomChoice[Take[#, -1] /. bigrams]] &, {"A"}, #[[-1]] =!= "Z" &][[2 ;; -2]]]
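
A quick way to try the bigram version is to mirror the trigram tests: list a sample of words, then estimate how often the result is a dictionary word. This is only a sketch; the sample sizes of 20 and 1000 are arbitrary, and the output differs on every evaluation. Since each letter now depends on a single predecessor, one would expect the words to look less word-like than the trigram ones.

(* sample 20 bigram words, then estimate the fraction that are dictionary words *)
Column[Table[bigramRandomWord[], {20}]]
N[Count[Table[englishWordQ[bigramRandomWord[]], {1000}], True]/1000]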

how it works

The data that guides word generation is stored in trigrams, a list of rules like the one below. This rule says that the dictionary words contain 1 occurrence of aa followed by h, 4 occurrences of aa followed by r, and so on. The right side of the first arrow is in precisely the weights -> elements format that RandomChoice requires to make a weighted choice from a list.

{"a", "a"} -> {1, 4, 2, 1, 1, 1, 2, 4} -> {"h", "r", "Z", "e", "i", "s", "l", "m"}
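
As a small illustration of that format, the right side of the entry above can be passed to RandomChoice directly. This sketch draws 1600 letters and counts them; the sample size is an arbitrary choice. Since the weights sum to 16, r and m should each turn up in roughly a quarter of the draws.

(* weighted draw in the form weights -> elements, repeated 1600 times and counted per letter *)
Counts[RandomChoice[{1, 4, 2, 1, 1, 1, 2, 4} -> {"h", "r", "Z", "e", "i", "s", "l", "m"}, 1600]]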

To build the complete list of trigrams, first find all triples of characters in the dictionary, tally them, and gather together those that have the same first two letters. Here is an example of the result when the first two letters are aa.

GatherBy[Tally[ Flatten[Partition[#, 3, 1] & /@ Characters[DictionaryLookup[RegularExpression["[a-z]+"]]], 1]], #[[1, ;; 2]] &][[1]]

By convention, AA marks the beginning of a word and Z marks the end. Because the dictionary query returns only all-lowercase words, these uppercase markers cannot collide with real letters. The special entry for {A,A} gives the frequencies of first letters, and entries of the form {A, a} give the frequencies of second letters in words that begin with a. A Z among the possible next letters means that the word can end there. A default rule is appended to trigrams so that a lookup never fails. This is the complete list of trigrams:

trigrams = Append[#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) & /@ GatherBy[ Tally[Flatten[ Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@ Characters[DictionaryLookup[RegularExpression["[a-z]+"]]], 1]], #[[1, ;; 2]] &], {__String} -> {"e"}];
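
The same construction is easier to inspect on a tiny word list. In this sketch, toyWords and toyTrigrams are hypothetical names that are not part of the original code, the three words are chosen only for illustration, and the default rule is left out.

(* hypothetical toy data, not part of the original code *)
toyWords = {"cat", "car", "cot"};
(* pad each word with AA and Z, tally the triples, and group them by their first two characters *)
toyTrigrams = (#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) &) /@ GatherBy[Tally[Flatten[Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@ Characters[toyWords], 1]], #[[1, ;; 2]] &]

Every toy word starts with c, so the {"A", "A"} entry is {3} -> {"c"}. The {"A", "c"} entry is {2, 1} -> {"a", "o"}, because two of the words continue with a and one with o.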

To generate a random word using the trigrams, start with AA and repeatedly add random characters guided by the weights in trigrams until a Z is produced.

randomWordCharacters = NestWhile[ Append[#, RandomChoice[Take[#, -2] /. trigrams]] &, {"A", "A"}, #[[-1]] =!= "Z" &]
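
To watch a word grow one letter at a time, one option is to swap NestWhile for NestWhileList, which keeps every intermediate list. This sketch assumes trigrams has been defined as above; the output differs on every evaluation.

(* each row shows the characters so far, starting from AA and ending with the Z marker *)
Column[StringJoin /@ NestWhileList[Append[#, RandomChoice[Take[#, -2] /. trigrams]] &, {"A", "A"}, #[[-1]] =!= "Z" &]]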

Extract the third through next-to-last characters, which drops the AA start marker and the Z end marker, and join them together to make a pronounceable word.

StringJoin[randomWordCharacters[[3 ;; -2]]]

Package those steps as the function randomWord.

randomWord[] := StringJoin[ NestWhile[Append[#, RandomChoice[Take[#, -2] /. trigrams]] &, {"A", "A"}, #[[-1]] =!= "Z" &][[3 ;; -2]]]

Test randomWord by making a list of random words.

Column[Table[randomWord[], {20}]]

englishWordQ tests if a word is in the dictionary.

englishWordQ[word_] := DictionaryLookup[word] =!= {}
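
For example, a real word and a made-up string of letters (both chosen here only for illustration) give different answers:

(* "apple" is in the dictionary and "qzxv" is not, so this returns {True, False} *)
{englishWordQ["apple"], englishWordQ["qzxv"]}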

This selects the English words in a sample of 20 random words:

Select[Table[randomWord[], {20}], englishWordQ]

You can estimate the likelihood of getting an actual word with randomWord experimentally:

Row[{Length[Select[Table[randomWord[], {10000}], englishWordQ]]/ 10000*100., "%"}]
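
Because this is a sampling estimate, it comes with some noise. As a rough sketch, treating each generated word as an independent trial, the standard error of an estimated proportion p over n trials is Sqrt[p (1 - p)/n]. The names n and p are local to this example and are not part of the original code.

(* returns {estimated proportion of real words, its approximate standard error} *)
With[{n = 10000}, With[{p = N[Count[Table[englishWordQ[randomWord[]], {n}], True]/n]}, {p, Sqrt[p (1 - p)/n]}]]

Multiply both numbers by 100 to compare them with the percentage above.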

Here's how to make a web page that gives you a different, nicely formatted word each time you visit it.

Format a word using Style to set the font and size and Pane to add some white space around the word. Set the color to red if it is an English word.

With[{word = randomWord[]}, Pane[Style[word, FontFamily -> "Times", FontSize -> 48, FontColor -> If[englishWordQ[word], Darker[Red], Black]], ImageMargins -> 100] ]

Deploy that to the cloud with APIFunction and CloudDeploy, specifying Permissions -> "Public" so that everyone has access. The extra argument 1000 sets the width of the Pane.

CloudDeploy[APIFunction[{}, With[{word = randomWord[]}, Pane[Style[word, FontFamily -> "Times", FontSize -> 48, FontColor -> If[englishWordQ[word], Darker[Red], Black]], 1000, ImageMargins -> 100] ] &, "PNG"], Permissions -> "Public"]