47 Writing Good Code

Writing good code is in many ways like writing good prose: you need to have your thoughts clear, and express them well. When you first start writing code, you’ll most likely think about what your code does in English or whatever natural language you use. But as you become fluent in the Wolfram Language you’ll start thinking directly in code, and it’ll be faster for you to type a program than to describe what it does.
My goal as a language designer has been to make it as easy as possible to express things in the Wolfram Language. The functions in the Wolfram Language are much like the words in a natural language, and I’ve worked hard to choose them well.
Functions like Table or NestList or FoldList exist in the Wolfram Language because they express common things one wants to do. As in natural language, there are always many ways one can in principle express something. But good code involves finding the most direct and simple way.
To create a table of the first 10 squares in the Wolfram Language, there’s an obvious good piece of code that just uses the function Table.
Simple and good Wolfram Language code for making a table of the first 10 squares:
Table[n^2, {n, 10}]
Why would anyone write anything else? A common issue is not thinking about the “whole table”, but instead thinking about the steps in building it. In the early days of computing, computers needed all the help they could get, and there was no choice but to give code that described every step to take.
A much worse piece of code that builds up the table step by step:
Module[{list, i}, list = {}; For[i = 1, i <= 10, i++, list = Append[list, i^2]]; list]
But the point of the Wolfram Language is to let one express things at a higher level, and to create code that as directly as possible captures the concept of what one wants to do. Once one knows the language, it’s vastly more efficient to operate at this level. And it leads to code that’s easier for both computers and humans to understand.
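As a quick check on the two pieces of code above, the following sketch confirms that the whole-list and step-by-step versions give the same result, and shows two other whole-list ways of saying the same thing. The names squares, stepwise, viaRange and viaMap are made up for this illustration.

```wolfram
(* Whole-list version from the text *)
squares = Table[n^2, {n, 10}];

(* Step-by-step version from the text, kept here only for comparison *)
stepwise = Module[{list, i}, list = {}; For[i = 1, i <= 10, i++, list = Append[list, i^2]]; list];

(* Two other whole-list forms (illustrative alternatives, not from the text) *)
viaRange = Range[10]^2;
viaMap = Map[#^2 &, Range[10]];

(* True if all four lists are identical *)
squares === stepwise === viaRange === viaMap
```

All four produce the same list, but only the step-by-step version forces the reader to trace a loop to find that out.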
Make code to convert a list of {hundreds, tens, ones} digits to a single integer:
fromdigits[{h_, t_, o_}] := 100 h + 10 t + o
Run the code:
fromdigits[{5, 6, 1}]
Generalize the code to a list of any length, using Table:
fromdigits[list_List] := Total[Table[10^(Length[list] - i)*list[[i]], {i, Length[list]}]]
The new code works:
fromdigits[{5, 6, 1, 7, 8}]
Simplify the code by multiplying the whole list of powers of 10 at the same time:
fromdigits[list_List] := Total[10^Reverse[Range[Length[list]] - 1]*list]
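To see why the whole-list version works, it can help to evaluate it one piece at a time. This is only an illustrative breakdown; the names digits, exponents, powers and terms are not part of the definition above.

```wolfram
digits = {5, 6, 1, 7, 8};

exponents = Reverse[Range[Length[digits]] - 1]   (* {4, 3, 2, 1, 0} *)
powers = 10^exponents                            (* {10000, 1000, 100, 10, 1} *)
terms = powers*digits                            (* {50000, 6000, 100, 70, 8} *)
Total[terms]                                     (* 56178 *)
```

Each arithmetic operation acts on the whole list at once, which is what removes the need for an explicit index i.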
Try a different, recursive, approach, after first clearing the earlier definitions:
Clear[fromdigits]
fromdigits[{k_}] := k
fromdigits[{digits___, k_}] := 10*fromdigits[{digits}] + k
The new approach works too:
fromdigits[{5, 6, 1, 7, 8}]
But then you realize: it’s actually all just a Fold!
fromdigits[list_] := Fold[10*#1 + #2 &, list]
fromdigits[{5, 6, 1, 7, 8}]
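One way to see what the Fold is doing is to look at its intermediate results with FoldList, or to run it on symbolic digits. This is a small sketch; a, b and c are assumed to be symbols with no values assigned.

```wolfram
(* Each step multiplies the running value by 10 and adds the next digit *)
FoldList[10*#1 + #2 &, {5, 6, 1, 7, 8}]
(* {5, 56, 561, 5617, 56178} *)

(* With symbolic digits the positional structure shows up directly *)
Expand[Fold[10*#1 + #2 &, {a, b, c}]]
(* 100 a + 10 b + c *)
```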
Of course, there’s a built-in function that does it too:
FromDigits[{5, 6, 1, 7, 8}]
An overly short version of fromdigits, that’s starting to be difficult to understand:
fromdigits = Fold[{10, 1} . {##} &, #] & ;
It still works though:
fromdigits[{5, 6, 1, 7, 8}]
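Unpacking the short version piece by piece shows why it takes effort to read. The helper name appendDigit below is hypothetical; it is just one possible way of giving the step a name.

```wolfram
(* {##} collects both arguments of the pure function into one list *)
{##} &[56, 1]            (* {56, 1} *)

(* The dot product with {10, 1} is then 10*56 + 1*1 *)
{10, 1} . {56, 1}        (* 561 *)

(* The same step written with named arguments *)
appendDigit[sofar_, digit_] := 10*sofar + digit
Fold[appendDigit, {5, 6, 1, 7, 8}]   (* 56178 *)
```

The named version is longer, but the name tells the reader what each step is for.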
If what you’re trying to do is complicated, then your code may inevitably need to be complicated. Good code, though, is broken up into functions and definitions that are each as simple and self-contained as possible. Even in very large Wolfram Language programs, there may be no individual definitions longer than a handful of lines.
Here’s a single definition that combines several cases:
fib[n_] := If[! IntegerQ[n] || n < 1, "Error", If[n == 1 || n == 2, 1, fib[n - 1] + fib[n - 2]]]
It’s much better to break it up into several simpler definitions:
fib[1] = fib[2] = 1;
fib[n_Integer] := fib[n - 1] + fib[n - 2]
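Note that the split definitions no longer reject integers less than 1, so something like fib[0] would keep recursing until it hits the recursion limit. One possible refinement, not from the text, is to put a condition on the pattern. The name fibChecked is hypothetical, chosen so that the definitions of fib above are left alone.

```wolfram
(* Starting values, as in the text *)
fibChecked[1] = fibChecked[2] = 1;

(* The general rule only applies to integers greater than 2 *)
fibChecked[n_Integer /; n > 2] := fibChecked[n - 1] + fibChecked[n - 2]

fibChecked[10]    (* 55 *)
fibChecked[0]     (* stays as fibChecked[0]: no definition matches *)
fibChecked[2.5]   (* stays as fibChecked[2.5] *)
```

Inputs that match no definition are simply returned unevaluated, which plays the role of the "Error" case in the single combined definition.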
When you’re writing code, it’s common to first define a new function because you need it in some very specific context. But it’s almost always worth trying to give it a name that you’ll understand even outside that context. And if you can’t find a good name, it’s often a sign that it’s not quite the right function to define in the first place.
Graphics[{White, Riffle[NestList[Scale[Rotate[#, 0.1], 0.9] &, Rectangle[], 40], {Pink, Yellow}]}]
When you write Wolfram Language code, you’ll sometimes have to choose between using a single rare built-in function that happens to do exactly what you want, and building up the same functionality from several more common functions. In this book, I’ve sometimes chosen to avoid rare functions so as to minimize vocabulary. But the best code tends to use single functions whenever it can, because the name of the function explains the intent of the code in a way that individual pieces cannot.
Use a small piece of code to reverse the digits in an integer:
Using a single built-in function explains the intent more clearly:
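The two approaches can be sketched as follows, using functions listed in this section’s vocabulary. The sample value 123456 and the name number are arbitrary choices for illustration.

```wolfram
number = 123456;

(* Built up from common functions: split into digits, reverse, reassemble *)
FromDigits[Reverse[IntegerDigits[number]]]   (* 654321 *)

(* The single built-in function states the intent directly *)
IntegerReverse[number]                       (* 654321 *)
```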
Good code needs to be correct and easy to understand. But it also needs to run efficiently. And in the Wolfram Language, simpler code is typically better here too, because by explaining your intent more clearly, it makes it easier for the Wolfram Language to optimize how the computations you need are done internally.
With every new version, the Wolfram Language does better at automatically figuring out how to make your code run fast. But you can always help by structuring your algorithms well.
Timing gives the timing of a computation (in seconds), together with its result:
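As a minimal illustration, the following sketch times an arbitrary computation and separates the two parts of what Timing returns. The computation and the names seconds and result are chosen only for this example.

```wolfram
(* Timing returns a list: {time in seconds, result of the computation} *)
{seconds, result} = Timing[Total[Range[10^6]]];

result    (* 500000500000 *)
seconds   (* a small number that varies from run to run *)
```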
With the definitions of fib above, the time grows very rapidly:
ListLinePlot[Table[First[Timing[fib[n]]], {n, 25}]]
The algorithm we used happens to do an exponential amount of unnecessary work recomputing what it’s already computed before. We can avoid this by making the definition for fib[n_] always do an assignment for fib[n], so it stores the result of each intermediate computation.
Redefine the fib function to remember every value it computes:
fib[1] = fib[2] = 1;
fib[n_Integer] := fib[n] = fib[n - 1] + fib[n - 2]
Now even up to 1000 each new value takes only microseconds to compute:
ListLinePlot[Table[First[Timing[fib[n]]], {n, 1000}]]
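One way to see the stored values is to count the definitions attached to the function before and after a computation. This sketch uses a fresh hypothetical name, fibMemo, so the counts are not affected by earlier evaluations of fib.

```wolfram
(* Same definitions as in the text, under a different name *)
Clear[fibMemo];
fibMemo[1] = fibMemo[2] = 1;
fibMemo[n_Integer] := fibMemo[n] = fibMemo[n - 1] + fibMemo[n - 2]

Length[DownValues[fibMemo]]   (* 3: two starting values and the general rule *)
fibMemo[30]                   (* 832040 *)
Length[DownValues[fibMemo]]   (* 31: values for 1 through 30, plus the general rule *)
```

Every value computed along the way has become its own definition, so it is looked up rather than recomputed the next time it is needed.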
Good code should have a structure that’s easy to read. But sometimes there are details that you have to include in the code, but that get in the way of seeing its main point. In Wolfram Notebooks, there’s a convenient way to deal with this: iconize the details.
Make a plot with a long sequence of options specified:
ListLinePlot[Table[Prime[n]/n, {n, 100}], Frame -> True, PlotStyle -> Orange, Filling -> Axis, FillingStyle -> LightPurple, AspectRatio -> 1/3, Mesh -> All]
Iconize the details of the options (the code underneath the icon is the Sequence of options shown here):
ListLinePlot[Table[Prime[n]/n, {n, 100}], Sequence[Frame -> True, PlotStyle -> Orange, Filling -> Axis, FillingStyle -> LightPurple, AspectRatio -> 1/3, Mesh -> All]]
In a typical Wolfram Notebook interface, you can just select what you want to iconize, and use the right-click menu. You’ll see an iconized form, but it’s the code “underneath” the icon that’ll actually be used.
It’s also often convenient to use iconization when you want to include lots of data in a notebook, but don’t want to show it explicitly.
Create a table of a thousand primes, but show it only in iconized form:
Iconize[Table[Prime[n], {n, 1000}]]
You can copy the iconized form, then use it in input:
Total[CompressedData[" 1:eJwV138oqN0dAHB3ZmZ3d+Y1MzOTJEmSJEmSJEmSSZIku0mSV5IkSZIkSZIk SZIk6SZJkiRJkiRJ0k2S7CZJkiTdfd4/Pp0/z3O+P875PnH/bfnPr58CAgL+ FxgQ8DurJSCIYD7zhTDCiSSGWBJIIplUMsgmh3wKKaKUcqqoo54GmmimlR76 GGSIcSaYZo4FlllljW122GOfU6644ZYfPPLEO58cMJgvhBJBFNEkkE4mWeRQ TBnV1FBHAy2000UP/QwzyiRzzPONFdbZZIcDjjjmlCvuuOeZVz4IkYTPxBBP CulkkU0exZRRSRW11NNIEx300scAw4wxzizzLLHKFnscc853brjjkRfe+KQw gggnhiSSySCTbHIopIIqqqmlmVba6GCIESaYZZFlVtlgn0NOOOeKW5544Z0P gn8vXoQSRTQJJJJCLvkUUEQpFdTwlUZ66GWIMaaZY5EVtthhnyOOOeOaez74 SaCmCeEzYUQQSTSJJJFKGZXUUE8HvYwwxjiTzLDAEsusss0BhxxzwhmX3PHM O8F/UOuEEUks8SSTSg55FFJECeVUUE09nfTQRz/TzDLHMhtssscRp1xwxROv vPNBiMsknFgSyaSAYkopo4JK6mimjU6GGGaaGdbYZJcDjjjmlEuuueeBV4L+ qD6III54EkkinVzyKKOcKmqopZFW+phihnmW2WKbXQ655DvX/OCRF94ICpET YogngWRSySCfIkoopYIqammilQ76GWSEKeZYZJl9LrjhnkdeeOWD8D+JP3Gk kUUORVRSx1caaKGdLnropZ8hxphgkhkWWWKNLXY45oJb7njgiWcCPSphRBBN PEmkkEshZVRSQz0tdDHIMGNMMsU6G+xyyDHfueYHL7wT/Ge9SRgJpJJGHtXU UkczbXTQxyCjjDPNKmtssMk+N9xyzxOvvPPJ4xlIKOHEEkcyKeSQSykVNNJJ L/0MMMQwcyyxzhb7HHDEKWdccssdz3zwk5C/uAeIIY5EUskgm2IqqaKGJlpp p5sBRhhnihXW2WCXE8645IFH3ggOFXO+EEkCSaSRTyEllFJOFdU00EQP48wy xwLLrLLGJieccsEPHnnng8C/ij2xxJFAMlnkkEsZjbTRSRf9DDDKGNPMMc83 VthkhwPueOKZD4IMTJ+JIJEUsiigkCpq+UojLfTSxzBTzDLPAkvssscJ59zx yAvvfPlFfoklniSSSSOTPAopp5ZW2uikmxEm+MYqa2ywzQ5HnHDFB8EGwi9E kUgKmeRTQBGlVFJDA40000UvAwwxyhjTzPCNTXY544Ir7nngmTd+Ev03uSWV dLIpoIQqmmihlXaGmWSGWVZYZ4tDzrjhjqAIeSSMSGJJIIlkMskhj3yKqKWe Bpropoc+RhhlginW2GSHPY4455Xgv+tbQgkngiiiSSCdLHIooJQKGmmmhXbG mGCGRb6xxha7HHPKBZdcc8s9D7zy9tu3GP5DiSeRdPKo4SsddDHMOFPMMM8q G2xzwCFnXPODZ174SfA/nJcYMsgkm3yKKKGCBlrpoJse+hlhimkWWGaVDXbY 54RzLrjilnte+CAiSpxJI50Ciiijmk56GWOCSeZYZIkV1tjnmEse+CDwn+qK MCKJI54UUskglxLKaKSJLroZYJAxFlhimRU2OOCYM8654Zl3fv62d7T9SCCV NDLJJo9CSqmginqaaKOHPkaZZoFt9jjhnAseeSXkX3JJFIkkk0I6ORRQTCkV tNHLEMNMMsc8a6xzwDXPvBHihzSaOOLJIItc8iimnBq+0kgr7XTRzQCDjDPJ DPMssMo6e5xzyTUvvPHp32qWz4QTSQLJpJJDESWU0kwrg4wwzizLbLLDHvsc 8p0bHvk/t8YqXw== "]]
Vocabulary
FromDigits[list]     assemble an integer from its digits
IntegerReverse[n]    reverse the digits in an integer
Timing[expr]         do a computation, timing how long it takes
Iconize[expr]        make an expression display in iconized form
Exercises
47.1 Find a simpler form for Module[{a, i}, a = 0; For[i = 1, i <= 1000, i++, a = i*(i + 1) + a]; a]
47.2 Find a simpler form for Module[{a, i}, a = x; For[i = 1, i <= 10, i++, a = 1/(1 + a)]; a]
47.3 Find a simpler form for Module[{i, j, a}, a = {}; For[i = 1, i <= 10, i++, For[j = 1, j <= 10, j++, a = Join[a, {i, j}]]]; a]
47.4 Make a line plot of the timing for computing n^n for n up to 10000.
47.5 Make a line plot of the timing for Sort to sort Range[n] from a random order, for n up to 200.
Q&A
What is i++?
It’s a short notation for i=i+1. It’s the same notation that C and many other low-level computer languages use for this increment operation.
What does the For function do?
It’s a direct analog of the for(...) statement in C. For[start, test, step, body] first executes start, then checks test, then executes body, then step. It repeats the test, body, step cycle until test no longer gives True.
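The order in which For evaluates its parts can be checked by recording each evaluation with Sow and collecting the records with Reap. This is just an illustrative sketch; the name order and the strings used as labels are arbitrary.

```wolfram
(* Record a label each time the test, the step, or the body is evaluated *)
order = Module[{i},
   Reap[
     For[i = 1, (Sow["test"]; i <= 2), (Sow["step"]; i++), Sow["body"]]
   ][[2, 1]]
  ]
(* {"test", "body", "step", "test", "body", "step", "test"} *)
```

The final "test" is the one that fails and ends the loop.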
Why can shortened pieces of code be hard to understand?
The most common issue is that variables and sometimes even functions have been factored out, so there are fewer names to read that might give clues about what the code is supposed to do.
What are examples of good and bad function names?
There’s a function DigitCount that counts the number of times different digits occur in an integer. Worse names for this function might be TotalDigits (confusing with IntegerLength), IntegerTotals (might be about adding up the integer somehow), PopulationCount (weird and whimsical), etc.
What’s the best environment for authoring Wolfram Language code?
For everyday programming, Wolfram Notebooks are best. Be sure to add sections, text and examples right alongside your code. For large multi-developer software projects, there are plug-ins for popular IDEs. (You can also edit Wolfram Language code textually in .wl files using the notebook system.)
What does Timing actually measure?
It measures the CPU time spent in the Wolfram Language actually computing your result. It doesn’t include time to display the result. Nor does it include time spent on external operations, like fetching data from the cloud. If you want the absolute “wall clock” time, use AbsoluteTiming.
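A simple way to see the difference is to time something that waits without computing. In this sketch, Pause[1] stands in for any operation that takes elapsed time but very little CPU time; the names cpu and wall are chosen for the example.

```wolfram
cpu = First[Timing[Pause[1]]];            (* close to 0: waiting uses almost no CPU time *)
wall = First[AbsoluteTiming[Pause[1]]];   (* about 1 second of elapsed time *)

{cpu, wall}
```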
How can I get more accurate timings for code that runs fast?
Use RepeatedTiming, which runs code many times and averages the timings it gets. (This won’t work if the code is modifying itself, like in the last definition of fib in this section.)
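For example, a computation this small usually finishes too quickly for a single Timing to resolve well. The computation here is an arbitrary choice, and the names once and averaged are made up for the illustration.

```wolfram
(* A single run: often reported as 0 or as a very coarse value *)
once = First[Timing[Total[Range[1000]]]]

(* Averaged over many runs: a more usable estimate, in seconds *)
averaged = First[RepeatedTiming[Total[Range[1000]]]]
```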
What are some tricks for speeding up code?
Beyond keeping the code simple, one thing is not to recompute anything you don’t have to. Also, if you’re dealing with lots of numbers, it may make sense to use N to force the numbers to be approximate. For some internal algorithms you can pick your PerformanceGoal, typically trading off speed and accuracy. There are also functions like Compile that force more of the work associated with optimization to be done up front, rather than during a computation.
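As a rough illustration of the point about N, the following sketch adds up the same 5000 terms once as exact fractions and once as approximate numbers. The sum and the variable names are arbitrary choices, and the actual timings will depend on the machine.

```wolfram
(* Exact arithmetic: the total is a fraction with a very large numerator and denominator *)
exactTime = First[AbsoluteTiming[exactSum = Total[Table[1/k, {k, 5000}]];]];

(* Approximate arithmetic: N converts the terms to machine numbers before adding *)
approxTime = First[AbsoluteTiming[approxSum = Total[N[Table[1/k, {k, 5000}]]];]];

{exactTime, approxTime}     (* the approximate version is typically faster *)
N[exactSum] - approxSum     (* the two totals agree to roughly machine precision *)
```

The trade-off is the one mentioned above: the approximate result is faster to get but is no longer exact.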
How do I uniconize an iconized piece of code?
Click it and select uniconize.