Machine learning—which roughly refers to computers learning to do things by themselves—is one of the most transformative domains in today’s world and its use is growing. I joined Wolfram Research in 2012 and led the early development of the machine learning tools that are now part of the Wolfram Language. We started by developing automatic functions to perform classic machine learning tasks such as classification, regression, or dimensionality reduction. Then, we developed a user-friendly neural network framework. Along the way, we used these tools to develop applications such as image identification, topic identification, or text entity recognition. I decided to write this book to share my understanding of machine learning as it is after these eight years of design and development. I hope that it will be useful to you.
What Is This Book About?
This book is an introduction to machine learning, and it assumes no prior knowledge of this field. The first goal of this book is to teach you what machine learning is and what its applications are. The second goal is to teach you how to practice machine learning: how to create models, how to test them, and how to use them. The final goal is to give you an understanding of how machine learning works, including the methods and algorithms that power it.
This book is written in computational essay style, which is a “show, don’t tell” approach that alternates between text and simple computations. These computations usually consist of an input and an output, such as:
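As a minimal illustration (chosen for this sketch and not taken from the book's own examples), an input and its output might look like the following, where Mean is the built-in Wolfram Language function that averages a list and the numbers are arbitrary:

```wolfram
(* Input: average of a small list of numbers; the values are illustrative only *)
In[1]:= Mean[{1, 2, 3, 6}]

(* Output: (1 + 2 + 3 + 6)/4 *)
Out[1]= 3
```

The In[...] and Out[...] labels follow the usual notebook convention; in a notebook, only the expression after In[1]:= is typed and evaluated.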
These small programs are written in the Wolfram Language and are composed of rather self-explanatory functions. These code snippets are used to show how to practice machine learning, to illustrate concepts, and to complement—or even replace—mathematical formulations. To improve readability, some parts of the code are hidden, but all of the code is accessible in the online version. Note that even regular illustrations are made using the Wolfram Language, and their corresponding code is also accessible in the online version.
What Are the Prerequisites for Reading This Book?
This book has been written with as little math as possible. Nevertheless, the mathematical concepts that would be most useful to know beforehand are the basics of algebra (what a vector is, what a matrix is, what a dot product is, etc.), the basics of probability (what a probability is, what a distribution is, etc.), and the basics of analysis (what a function is, what a derivative is, etc.). Overall, an end-of-high-school math education should be enough for you to understand the math content.
On the programming side, nothing much is required before reading this book. However, a grasp of the Wolfram Language is needed to fully understand the code snippets and have a better reading experience. This can be obtained through the Short Introduction to the Wolfram Language included in this book and through the Wolfram Language & System Documentation Center (reference.wolfram.com). Also, keep in mind that if your goal is to use machine learning, it will be hard to avoid learning at least one programming language.
Who Is This Book For?
This book is for anyone who wants to know what machine learning is, how to use it, or how it works. A scientist or an engineer might use it to apply machine learning to their problems. A data analyst might use it to transition to a data scientist position. A student might use it to learn valuable skills. A decision maker might use it to get an intuition about what machine learning is. A manager might use it to interact more effectively with their data scientists. More generally, this book should benefit anyone curious about this fascinating field.
How to Read This Book
This book has 13 chapters that are loosely meant to be read in order. Chapters 1 and 2 form a minimal introduction that is easy to grasp, and this might be enough for those only looking for an overview of machine learning. Chapters 3 through 9 (except for Chapter 5) offer a deeper dive into the tasks of machine learning through accessible examples. Chapter 5 gives a detailed overview of how machine learning works. The final chapters are mostly about how the methods and algorithms of machine learning work and are overall a bit harder to understand. Note that Chapter 11 is an introduction to neural networks, which might be of interest to many.
Each chapter ends with some takeaways, some exercises, and a vocabulary section. The vocabulary section provides definitions for the important concepts present in the chapter. As with the takeaways, reading the vocabulary section can be a good way to test and solidify your understanding of these concepts. The exercises are intended to be tackled using the Wolfram Language, but in principle, other languages could also be used. These exercises are open-ended; their goal is to encourage you to play with machine learning tools by solving problems, which is an effective way to learn concepts and a necessary one for learning how to use machine learning. The best way to use the Wolfram Language is with a notebook, which can be freely accessed in the cloud (wolframcloud.com).
Besides the exercises, it might be a good idea to read this book with a Wolfram Notebook open so that you can re-evaluate and play with the code snippets. Also, having the documentation nearby is useful for checking what a given function does or for exploring the details of machine learning functions (wolfr.am/MachineLearning); browsing it is a good complementary way to learn machine learning.
Access this book and all the examples online at: wolfr.am/iml.