Here is a short introduction to the Wolfram Language that should help you understand the code snippets present in this book.

The Wolfram Language is a high-level programming language that can be used in a notebook interface. We can type some code and press Shift+Return to obtain a result:
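As a minimal sketch of such an evaluation (the input below is chosen for illustration), typing the expression in a notebook cell and pressing Shift+Return returns its value:

```wolfram
(* Illustrative input: a simple arithmetic expression *)
2 + 2
(* 4 *)
```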

This language is composed of more than 6000 built-in functions that aim to capture the most common operations that we might want to perform. We can, for example, sort numbers using the function Sort:
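A minimal sketch of Sort, using a list of numbers made up for illustration:

```wolfram
(* Sort returns the elements in canonical (here, increasing) order *)
Sort[{3, 1, 2, 5, 4}]
(* {1, 2, 3, 4, 5} *)
```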

We can also plot a curve using the function Plot:
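For instance, assuming we want to plot a sine curve over one period (the choice of function and range is purely illustrative):

```wolfram
(* Plot takes an expression and a range {variable, min, max} *)
Plot[Sin[x], {x, 0, 2 Pi}]
```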

We can also recognize and extract objects in an image using the function ImageCases:

These high-level functions allow us to write small and understandable programs. In a sense, the Wolfram Language is closer to a natural language than most programming languages are. Note that nobody remembers the name or even the existence of all of these built-in functions, but it usually does not take long to browse the documentation and find what we are looking for.

Another aspect of this language is that it is *knowledge based*, which means that we can use it to obtain data about the world. Using natural language input (by typing Ctrl+=) is the best way to access such data. For example, let’s obtain the population of France:
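A programmatic counterpart of this query can be sketched with Entity. This is an assumption about what the natural language input resolves to, the exact generated program may differ, and an internet connection is needed to retrieve the value:

```wolfram
(* Query the "Population" property of the country entity for France; *)
(* the result is a Quantity in units of people *)
Entity["Country", "France"]["Population"]
```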

Here the natural language input is first converted into a proper program, which queries the data. Having direct access to such data is quite useful for machine learning or, more generally, for data science projects.

Besides these “one-shot” computations, we can use the Wolfram Language to create all kinds of programs. Here is a custom function to compute the root mean square of a list of values:

Let’s use this function:
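A definition along these lines might look as follows. The name rms and the input values are illustrative assumptions, not necessarily the ones used in the original notebook:

```wolfram
(* Root mean square: square every value, average, then take the square root *)
rms[values_List] := Sqrt[Mean[values^2]]

rms[{1, 7, 1, 7}]
(* 5 *)
```

Here the squares are {1, 49, 1, 49}, their mean is 25, and the square root of 25 is 5.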

We could have also written this program as a *pure function*, which is more concise:

The symbol # represents the input (a.k.a. *argument*) of the function, and the symbol & shows that this is a pure function.
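As a sketch, the same root mean square computation can be written as a pure function (the name rmsPure and the input values are illustrative):

```wolfram
(* # stands for the argument; & marks the end of the pure function *)
rmsPure = Sqrt[Mean[#^2]] &;

rmsPure[{1, 7, 1, 7}]
(* 5 *)
```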

Here is another function to standardize the data (note that we reuse our root mean square function):

This time we used a Module to define a local variable (centered) in the program in order to reuse an intermediate result. Each line inside the module performs a small computation and the result of the last line is returned. Let’s try this function on the same list of values:
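A standardization function of this kind could be sketched as follows. The names rms and standardize and the input values are assumptions made for illustration; rms is redefined here so that the example is self-contained:

```wolfram
(* Root mean square helper *)
rms[values_List] := Sqrt[Mean[values^2]]

(* Center the values, then divide by the root mean square of the centered values *)
standardize[values_List] := Module[{centered},
  centered = values - Mean[values];
  centered / rms[centered]
]

standardize[{1, 7, 1, 7}]
(* {-1, 1, -1, 1} *)
```

The mean of the input is 4, so the centered values are {-3, 3, -3, 3}; their root mean square is 3, which gives the result above.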

We could also develop arbitrarily complex programs this way. As an example, the computational knowledge engine Wolfram|Alpha is developed in the Wolfram Language.

Data is central to the field of machine learning. Let’s look at the classic data structures of this language. We have already seen the *list*, which is the simplest kind of data structure. A list can contain numbers, strings, images, or any other kind of *expression* (everything in the Wolfram Language is an expression):

Let’s obtain the third element of this list with the function Part:

We can also use the shorthand notation for Part by typing [[ and ]]:

Let’s now take the first three elements:

Or elements two through four:
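The extraction operations above can be sketched on a small list of numbers (the list is made up for illustration):

```wolfram
list = {10, 20, 30, 40, 50};

Part[list, 3]     (* 30: third element, full form *)
list[[3]]         (* 30: same operation with the [[ ]] shorthand *)
list[[;; 3]]      (* {10, 20, 30}: first three elements *)
list[[2 ;; 4]]    (* {20, 30, 40}: elements two through four *)
```

The ;; notation is the shorthand for Span, which selects a range of positions.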

We can use the function Map to apply a function to every element of the list:

Here is the shorthand syntax for Map (/@):

Note that the function f has no definition here, which is why no computation is done.
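A short sketch of both notations, with an undefined symbol f and then with an actual pure function (inputs chosen for illustration):

```wolfram
(* f has no definition, so the result stays symbolic *)
Map[f, {1, 2, 3}]      (* {f[1], f[2], f[3]} *)
f /@ {1, 2, 3}         (* {f[1], f[2], f[3]} *)

(* With a defined function, the computation is carried out *)
#^2 & /@ {1, 2, 3}     (* {1, 4, 9} *)
```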

The list is a fundamental data structure that is used everywhere. We can even define arrays of arbitrary dimensions by creating lists of lists. Here is a 2×3 matrix of numbers:

Let’s extract the value in the second row and first column:
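As an illustration with made-up entries, a matrix with two rows and three columns is a list of two lists of length three, and Part accepts one index per level:

```wolfram
matrix = {{1, 2, 3}, {4, 5, 6}};

Dimensions[matrix]   (* {2, 3}: two rows, three columns *)
matrix[[2, 1]]       (* 4: second row, first column *)
```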

We can also use Map on this matrix:

The function is applied only to the elements of the outer list, though. To apply the function deeper we need to add a *level specification* (which corresponds to a depth):
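The difference between the default level and a deeper level can be sketched as follows (the matrix entries are illustrative and f is left undefined):

```wolfram
matrix = {{1, 2, 3}, {4, 5, 6}};

(* Default: f is applied at level 1, that is, to each row *)
Map[f, matrix]
(* {f[{1, 2, 3}], f[{4, 5, 6}]} *)

(* Level specification {2}: f is applied to each entry *)
Map[f, matrix, {2}]
(* {{f[1], f[2], f[3]}, {f[4], f[5], f[6]}} *)
```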

We can also apply a function to the columns of the matrix using MapThread:

To apply the function on the rows (but with the inner lists removed), we can use the intimidating but practical @@@ syntax (which is a special case of the function Apply):
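A sketch of both operations on an illustrative matrix, again with an undefined f so that the structure of the result is visible:

```wolfram
matrix = {{1, 2, 3}, {4, 5, 6}};

(* Columns: f receives the corresponding elements of each row *)
MapThread[f, matrix]
(* {f[1, 4], f[2, 5], f[3, 6]} *)

(* Rows: the head List of each row is replaced by f *)
f @@@ matrix
(* {f[1, 2, 3], f[4, 5, 6]} *)
```

The expression f @@@ matrix is equivalent to Apply[f, matrix, {1}].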

And that is basically how we manipulate lists.

The other main data structure in the Wolfram Language is called the *association*:

This is an associative array (a.k.a. dictionary) and can be used to store values associated with keys. We can, for example, query the value associated with the key "Weight":

The Map function transforms the values:
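As a minimal sketch, with keys and values invented for illustration (outputs are written in input form; the notebook displays strings without quotes):

```wolfram
assoc = <|"Age" -> 5, "Weight" -> 12.5|>;

(* Look up the value stored under a key *)
assoc["Weight"]
(* 12.5 *)

(* Map acts on the values and leaves the keys unchanged *)
Map[f, assoc]
(* <|"Age" -> f[5], "Weight" -> f[12.5]|> *)
```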

Lists and associations can be nested together to form proper datasets. Here is a list of two associations:

Again, we can extract any value from this data:
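For example, assuming two records with hypothetical "Age" and "Weight" fields, Part can mix positions and keys to reach a value:

```wolfram
data = {
  <|"Age" -> 2, "Weight" -> 4.1|>,
  <|"Age" -> 7, "Weight" -> 9.8|>
};

data[[2, "Age"]]        (* 7: second record, key "Age" *)
data[[1]]["Weight"]     (* 4.1: first record, then key lookup *)
```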

These structures would typically be the way to represent a dataset in machine learning, and they can be better visualized using Dataset:

We can query or transform this dataset as if it were a list of associations. For example, we can remove the "Age" key:

Or select the rows for which "Age" is larger than 3:

Or obtain a random row:
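These queries can be sketched on a small dataset with made-up rows. The query forms below are one possible way to write them and are not necessarily the ones used in the original notebook:

```wolfram
dataset = Dataset[{
  <|"Age" -> 2, "Weight" -> 4.1|>,
  <|"Age" -> 7, "Weight" -> 9.8|>,
  <|"Age" -> 4, "Weight" -> 6.3|>
}];

(* Remove the "Age" key from every row *)
dataset[All, KeyDrop["Age"]]

(* Keep the rows for which "Age" is larger than 3 *)
dataset[Select[#Age > 3 &]]

(* Pick one row at random; the result changes between evaluations *)
dataset[RandomChoice]
```

In each case, the function inside the brackets is applied to the underlying list of associations, and the result is wrapped in a new Dataset when it is structured data.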

We will often use such datasets in this book.

That is it for this minimal introduction that should help you follow the code present in this book. To go further, you can read *An Elementary Introduction to the Wolfram Language* (wolfr.am/eiwl) or simply browse the documentation (reference.wolfram.com), which describes the functions and contains many examples.