
MUS 2016

Professionally-produced music recordings

This year, we propose to reproduce the 2015 Professionally-produced music recordings task.

0. Introduction

The purpose of this task is to evaluate source separation algorithms for estimating one or more sources from a set of mixtures in the context of professionally-produced music recordings.

The data set consists of a total of 100 full-track songs of different styles and includes both the mixtures and the original sources, divided between a development subset and a test subset (see I.).

The participants are kindly asked to download the data set and an evaluation function, to run the evaluation function (which calls their separation function, collects the estimated sources, and computes the performance results), and to send us back the performance results (see II.).

The evaluation will be performed along with the separation using a common set of performance measures, and the final results will be made available on the website after scaling them according to a lower-bound and/or an upper-bound criterion (see III.).

I. The Dataset

This year, the dataset considered for the MUS task is called the Demixing Secrets Dataset 100 (DSD100). It consists of exactly the same songs and stems as the MSD100 dataset used in MUS2015, but each track has now been mixed using real professional Digital Audio Workstations.

The DSD100 dataset can be downloaded here.

MD5 sum is: e8964d97d6f320962647e24ebc6773e3

DSD100 contains two folders, a folder with the mixture set, “Mixtures,” and a folder with the source set, “Sources.”

Each folder contains two subfolders, a folder with a development set, “Dev,” and a folder with a test set, “Test”;
supervised approaches should be trained on the former set and tested on both sets.

Each subfolder contains 50 sub-subfolders corresponding to 50 songs, for a total of 100 different songs.
Each sub-subfolder from “Mixtures” contains one file, “mixture.wav,” corresponding to the mixture, and each sub-subfolder from “Sources” contains four files, “bass.wav,” “drums.wav,” “other.wav” (i.e., the other instruments), and “vocals.wav,” corresponding to the sources.

For a given song, the mixture and the sources have the same length and the same sampling frequency (i.e., 44,100 Hz), and all signals are stereophonic.
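
As an illustration, the following minimal Python sketch walks this folder structure and checks that every song comes with its mixture and its four sources. The root path “DSD100” is an assumption; adapt it to wherever you unzipped the archive:

    import os

    ROOT = "DSD100"  # assumed location of the unzipped dataset
    SOURCES = ["bass.wav", "drums.wav", "other.wav", "vocals.wav"]

    for subset in ["Dev", "Test"]:
        songs = sorted(os.listdir(os.path.join(ROOT, "Mixtures", subset)))
        for song in songs:
            mixture = os.path.join(ROOT, "Mixtures", subset, song, "mixture.wav")
            stems = [os.path.join(ROOT, "Sources", subset, song, s) for s in SOURCES]
            for path in [mixture] + stems:
                if not os.path.isfile(path):
                    print("Missing file:", path)
        print(subset, "contains", len(songs), "songs")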

The sources for DSD100 were created from stems downloaded from The ‘Mixing Secrets’ Free Multitrack Download Library.
We would like to thank Mike Senior, not only for giving us permission to use this multitrack material, but also for maintaining such a resource for the audio community.

Antoine Liutkus and Stylianos Ioannis Mimilakis mixed the whole set of 100 full tracks.

II. The Settings

The participants are kindly asked to first download the data set DSD100.zip (12 GB!) and unzip it in a root folder, so that there is a folder “DSD100” containing the subfolders “Mixtures” and “Sources,” along with the files “dsd100.txt” and “dsd100.xlsx.”

The participants are also kindly asked to download the evaluation function DSD100_eval.m and place it in the root folder, along with the folder “DSD100.”

Download DSD100.zip (12 GB, LINK TO DSD100.zip dataset)

MD5 sum is e8964d97d6f320962647e24ebc6773e3

DSD100 comes with two different implementations for running the evaluation: one in MATLAB and one in Python.

II.1 MATLAB scripts

Download the DSD100 MATLAB evaluation scripts and get details here.

II.2 Python scripts

Download the DSD100 python evaluation scripts and get details here.

  • Please do not hesitate to open issues on GitHub, for both the MATLAB and the Python versions, in case you discover something going wrong!
  • The documentation of these packages can be found HERE.

III. The Evaluation

The participants are kindly asked to run the evaluation function themselves and to provide the final .mat result file to Antoine Liutkus at the following address: antoine(dot)liutkus(at)inria(dot)fr

In short, the evaluation process is the following:

  1. Download the DSD100 dataset here.
  2. Get the dsd100mat tools here.
  3. If you are going to use Python, download dsdtools here.
  4. Do not use the test set to learn your model if your method is supervised. You may use the dev set and any additional data you may have.
  5. Under MATLAB: proceed to simultaneous separation and evaluation with the scripts in dsd100mat.
  6. Under Python: first separate the songs using dsdtools to handle the file structure, then use the “eval only” tools in dsd100mat to compute the BSS Eval metrics (see the sketch after this list).
  7. Send the resulting .mat file to Antoine Liutkus.
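
To make step 6 more concrete, here is a minimal sketch of how a separation method plugs into dsdtools, following the package’s documented DB.run interface. The “separator” below simply splits the mixture equally across the four targets and is only a placeholder for your own method; all paths are assumptions:

    import dsdtools

    def my_separation(track):
        # track.audio is the stereo mixture, shape (n_samples, 2),
        # sampled at track.rate (44100 Hz for DSD100).
        mix = track.audio
        # Placeholder only: replace with your actual separation method.
        estimates = {
            "bass": mix * 0.25,
            "drums": mix * 0.25,
            "other": mix * 0.25,
            "vocals": mix * 0.25,
        }
        return estimates

    db = dsdtools.DB(root_dir="DSD100")  # assumed dataset location
    # Writes one wav file per estimated target, for every song and both subsets.
    db.run(my_separation, estimates_dir="Estimates")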

The metrics considered for MUS2016 are limited to BSS Eval images v3.0, as described here.
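
The reference numbers come from the BSS Eval 3.0 toolbox via dsd100mat, but if you wish to sanity-check your estimates from Python, mir_eval ships a bss_eval_images implementation modeled on BSS Eval v3. A sketch, assuming soundfile and mir_eval are installed; the song and estimate paths below are hypothetical and should point to one track of the test set:

    import numpy as np
    import soundfile as sf
    import mir_eval

    TARGETS = ["bass", "drums", "other", "vocals"]
    REF_DIR = "DSD100/Sources/Test/some_song"  # hypothetical song folder
    EST_DIR = "Estimates/Test/some_song"       # hypothetical estimates folder

    # Both arrays must have shape (n_sources, n_samples, n_channels).
    reference = np.stack([sf.read("%s/%s.wav" % (REF_DIR, t))[0] for t in TARGETS])
    estimated = np.stack([sf.read("%s/%s.wav" % (EST_DIR, t))[0] for t in TARGETS])

    # Returns SDR, ISR, SIR, SAR per source, plus the best permutation.
    sdr, isr, sir, sar, perm = mir_eval.separation.bss_eval_images(reference, estimated)
    for target, value in zip(TARGETS, sdr):
        print("%s SDR: %.2f dB" % (target, value))

Note that BSS Eval on full-length tracks is computationally demanding, and the official dsd100mat scripts remain the reference for the submitted results.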

In case of any question, please ask Antoine Liutkus.

We would like to thank Emmanuel Vincent for giving us permission to use the BSS Eval toolbox 3.0.

References:

  • Emmanuel Vincent, Shoko Araki, Fabian J. Theis, Guido Nolte, Pau Bofill, Hiroshi Sawada, Alexey Ozerov, B. Vikrham Gowreesunker, Dominik Lutter and Ngoc Q.K. Duong, “The Signal Separation Evaluation Campaign (2007-2010): Achievements and remaining challenges”, Signal Processing, 92, pp.1928-1936, 2012.
  • Emmanuel Vincent, Hiroshi Sawada, Pau Bofill, Shoji Makino and Justinian P. Rosca, “First stereo audio source separation evaluation campaign: data, algorithms and results,” In Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), pp.552-559, 2007.

Please Note:

The participants are kindly asked not to delete their separation results, i.e., the full directory tree containing all the separated tracks.
A perceptual evaluation may indeed be performed on all the MUS results, and this requires the separated tracks to be available.
In case such a perceptual evaluation is performed, further instructions will be given to the participants concerning the server and the way to transmit these results for evaluation.
Meanwhile, please be so kind as to keep these tracks, exactly as they were evaluated with BSS Eval, somewhere on your local hard drives.

Let’s separate!

The organizers for MUS2016,

Zafar Rafii (zafarrafii[at]gmail.com), Fabian-Robert Stöter (fabian-robert.stoeter[at]audiolabs-erlangen.de) and Antoine Liutkus (antoine.liutkus[at]inria.fr)