Wolfram Language

Classify an Audio Dataset

Automated feature extraction is built into all of the high-level machine learning functions, which makes it easy to create a powerful audio classifier. This example classifies ESC-50, a standard dataset for environmental sound classification, without any manual feature engineering.

Download the dataset.
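
The input for this step is not shown on this page, so the following is only a minimal sketch of one way to fetch the data. It assumes the dataset is taken from its public GitHub repository and that the archive unpacks into a folder named "ESC-50-master"; the URL, the folder name and the variable names `dir` and `root` are assumptions, not part of the original example.

```wolfram
(* Assumption: ESC-50 is fetched from its public GitHub repository;
   the URL and the extracted folder name may differ *)
url = "https://github.com/karolpiczak/ESC-50/archive/master.zip";

(* Create a fresh temporary directory and download the archive into it *)
dir = CreateDirectory[];
archive = URLDownload[url, FileNameJoin[{dir, "ESC-50.zip"}]];

(* Unpack the archive next to the downloaded file *)
ExtractArchive[archive, dir];

(* Assumed top-level folder of the extracted archive *)
root = FileNameJoin[{dir, "ESC-50-master"}];
DirectoryQ[root]
```

The final `DirectoryQ` call is just a sanity check that the assumed folder name matches what was actually extracted.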


Import the metadata. The dataset is a labeled collection of 2000 environmental audio recordings. The files are five-second-long recordings organized into 50 semantic classes.
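
As a hedged illustration, the metadata could be loaded along these lines. The sketch assumes that `root` (a hypothetical variable) holds the path of the extracted dataset folder and that the metadata is stored in "meta/esc50.csv" with a single header row; both are assumptions about the archive layout.

```wolfram
(* Assumption: root is the extracted dataset folder and the
   metadata file is meta/esc50.csv with one header line *)
metadata = Import[
   FileNameJoin[{root, "meta", "esc50.csv"}],
   "Dataset", "HeaderLines" -> 1];

(* Number of recordings described by the metadata *)
Length[metadata]

(* Column names available for each recording *)
Keys[First[Normal[metadata]]]
```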


Inspect a sample from the metadata.
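
A possible way to look at a few rows, assuming `metadata` is a `Dataset` of the recordings and that it has a column named "category" holding the class label (the column name is an assumption):

```wolfram
(* Show three randomly chosen rows of the metadata *)
RandomSample[metadata, 3]

(* Count the recordings per class; "category" is an assumed column name *)
metadata[Counts, "category"]
```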

Divide the dataset into training and testing subsets.
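
The split used in the original example is not shown here. One plausible sketch, assuming the metadata has "filename", "fold" and "category" columns and that the audio files live in an "audio" subfolder of `root`, holds out one fold for testing. The choice of fold 5, the helper `toExample` and the variable names are all illustrative assumptions.

```wolfram
(* Convert the Dataset into a plain list of associations *)
rows = Normal[metadata];

(* Hypothetical helper: map one metadata row to an audio -> class rule.
   Audio[file] references the file on disk *)
toExample[row_] :=
  Audio[FileNameJoin[{root, "audio", row["filename"]}]] -> row["category"];

(* Assumption: use fold 5 for testing and the remaining folds for training *)
trainingData = toExample /@ Select[rows, #["fold"] != 5 &];
testingData = toExample /@ Select[rows, #["fold"] == 5 &];

Length /@ {trainingData, testingData}
```

Splitting by a predefined fold rather than at random keeps the held-out set fixed from run to run.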

Train a ClassifierFunction on the training data using Classify. The preprocessing, feature extraction and classification algorithm are all chosen automatically according to the input data.
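
A minimal sketch of this step, assuming `trainingData` and `testingData` are lists of rules of the form audio -> class:

```wolfram
(* Train with default settings; no method or feature extractor is specified *)
cf = Classify[trainingData]

(* Apply the trained classifier to the audio of the first test example *)
cf[testingData[[1, 1]]]
```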

Compute the accuracy on the test data and plot the confusion matrix. Even though no preprocessing or model settings were specified by the user, the classification accuracy exceeds 90%.
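
The evaluation could be sketched as follows, assuming `cf` is the trained ClassifierFunction and `testingData` is a list of audio -> class rules:

```wolfram
(* Evaluate the classifier on the held-out examples *)
cm = ClassifierMeasurements[cf, testingData];

(* Fraction of test examples classified correctly *)
cm["Accuracy"]

(* Plot of actual versus predicted classes *)
cm["ConfusionMatrixPlot"]
```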
