Getting TensorFlow and Magenta installed with GPU support on the Donders compute cluster

Although many software packages are nowadays much easier to get up and running than, say, a few years ago, some of them can still cause the occasional headache. For instance, I spent the better part of a day getting the Magenta toolbox (part of the TensorFlow ecosystem) for modelling and generating music using machine learning to work on the GPUs of the high-performance compute (HPC) cluster here at the Donders Institute (Donders Centre for Cognitive Neuroimaging; DCCN). Since I don't want anyone else to have to go through the same pains as I did, I thought I'd document my experiences here. (Also, I'll have to explain this to at least one of my students soon, so why not blog about it instead?)

DCCN-specific initialization

First of all, some stuff that's relatively specific to the Torque cluster running here at the Donders Centre for Cognitive Neuroimaging. There are a few GPU-capable nodes in the cluster (running NVIDIA Tesla P100 GPUs). Make sure to get an interactive job on one of these nodes, so that we can actually do GPU computations once the installation is done:

qsub -I -l mem=32gb,walltime=06:00:00,nodes=1:gpus=1

Once the job starts, activate the Anaconda platform, as well as the CUDA libraries required for GPU computation (note that these commands do nothing more than some path shuffling):

module load anaconda3/5.0.0
module load cuda/10.0
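
Since the module commands only shuffle paths around, a quick sanity check is to see whether the tools you expect can actually be found afterwards. The following is a minimal sketch, not something the cluster provides: require_cmds is a made-up helper name, and it is an assumption that the cuda module puts nvcc on the PATH and that nvidia-smi is available on the GPU nodes.

```shell
# Hypothetical helper: report whether each named command is on the PATH.
# Returns non-zero if at least one of them cannot be found.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "not on PATH: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

After loading the modules, something like require_cmds conda nvcc nvidia-smi should then list all three as found; if not, the module load step probably did not do what you expected.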

If you're reading this guide while not at the DCCN, you can skip the above of course (but make sure you have the required dependencies installed, like a working conda setup). Note that we're going to be installing TensorFlow version 1.15.2, since that is the one used by the Magenta transformer models we're interested in. Things are very similar with TensorFlow 2.0.0 (which I initially tried, before discovering that TF1 was needed).

Creating the env and installing TensorFlow

Install TF into a conda env called tfgpu and activate it. Using a conda env allows easy package management:

conda create -n tfgpu python=3.6
source activate tfgpu

It's important here to use Python version 3.6, since version 3.7 causes conflicts with the python-rtmidi dependency, which we'll need later on. Now we're ready to install TF. We'll install the main package tensorflow as well as the tensorflow-gpu package. In principle, a working TF setup only needs one of these (depending on whether you want to run models on the CPU or GPU), but if we don't install the tensorflow package, Magenta will later on try to install its own tensorflow, thereby causing conflicts. Note furthermore that we have to use pip install, rather than conda install, because TF is not available from conda repos. Also: order is important here! Install tensorflow first, then tensorflow-gpu:

pip install tensorflow==1.15.2
pip install tensorflow-gpu==1.15.2
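
At this point it's worth checking that TF can actually see the GPU before going any further. A minimal sketch, assuming you are inside the interactive GPU job with the tfgpu env active: tf_gpu_check is a made-up helper name, while tf.test.is_gpu_available() is the TF 1.x call it relies on. The optional argument (which Python interpreter to use) is only there for convenience.

```shell
# Hypothetical helper: ask TensorFlow whether it can see a GPU.
# Optional first argument: the Python interpreter to use (default: python).
tf_gpu_check() {
    py="${1:-python}"
    if "$py" -c 'import sys, tensorflow as tf; sys.exit(0 if tf.test.is_gpu_available() else 1)'; then
        echo "TensorFlow can see a GPU"
    else
        echo "TensorFlow cannot see a GPU (or failed to import)" >&2
        return 1
    fi
}
```

Running tf_gpu_check will print a fair amount of TF logging before the final verdict; the last line is the one that matters.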

We also want these specific TF addons:

pip install tensorflow-probability==0.8
pip install tensorflow-addons==0.6.0
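
Because later installs can silently swap out these pinned versions, it can help to compare what pip reports against what we asked for. A rough sketch: check_pins is a made-up helper that reads pip freeze output on standard input and looks for exact name==version lines. It is an assumption that pip freeze prints these packages with hyphenated names and full version strings (so 0.8 would show up as 0.8.0).

```shell
# Hypothetical helper: read `pip freeze` output on stdin and check that
# every pin given as an argument appears as an exact line.
check_pins() {
    frozen=$(cat)
    missing=0
    for pin in "$@"; do
        if printf '%s\n' "$frozen" | grep -qxF -- "$pin"; then
            echo "ok: $pin"
        else
            echo "MISSING: $pin"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage would be along the lines of pip freeze | check_pins tensorflow==1.15.2 tensorflow-gpu==1.15.2 tensorflow-probability==0.8.0 tensorflow-addons==0.6.0, and it is worth repeating once Magenta itself has been installed.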

Getting python-rtmidi to work

One dependency of Magenta causes some headaches in particular: python-rtmidi. It's possible that this is a DCCN-specific thing again, so you could try to see what happens if you just omit this step and try to install Magenta immediately. python-rtmidi depends, in turn, on the ALSA sound library. On our cluster, we cannot simply do sudo apt-get install of some ALSA package or whatever, since we cannot sudo. So, instead, we want to install a version of ALSA with conda, specifically from the conda-forge repository:

conda install -c conda-forge alsa-lib

Then, we can install python-rtmidi, but we'll have to make sure the binaries are compiled against the version of ALSA that we just manually installed. To do this:

pip install --global-option=build_ext --global-option="-I/home/predatt/eelspa/.conda/envs/tfgpu/include" python-rtmidi==1.1

where the --global-option=build_ext instructs pip to (re)compile the binaries from source, and the other -I option specifies the include path to use in this compilation step (the ALSA installation put its header files in there). This path needs to be changed to reflect wherever your conda env lives (+ /include), of course.
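
To avoid hard-coding the path, the -I flag can be derived from the active env instead. A minimal sketch under a few assumptions: the env is activated so that conda has set CONDA_PREFIX, and the conda-forge alsa-lib package ships alsa/asoundlib.h under the env's include directory. alsa_include_flag is a made-up helper name.

```shell
# Hypothetical helper: print the -I flag for an env prefix, but only if
# the ALSA header is really there. Defaults to the active conda env.
alsa_include_flag() {
    prefix="${1:-${CONDA_PREFIX:-}}"
    if [ -z "$prefix" ]; then
        echo "no env prefix given and CONDA_PREFIX is not set" >&2
        return 1
    fi
    if [ ! -f "$prefix/include/alsa/asoundlib.h" ]; then
        echo "ALSA header not found under $prefix/include" >&2
        return 1
    fi
    printf '%s\n' "-I$prefix/include"
}
```

The install command then becomes pip install --global-option=build_ext --global-option="$(alsa_include_flag)" python-rtmidi==1.1, and the helper complains up front if the ALSA headers are not where the compiler will look.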

Getting pyfluidsynth to work

Another dependency of Magenta also requires a bit of special attention, though this one is much easier to fix: pyfluidsynth. The "official version" does not work in our case (related to this issue on GitHub), but fortunately there's an easy fix. First we need to install fluidsynth, again from the conda-forge repo:

conda install -c conda-forge fluidsynth

and then we can install a patched version of pyfluidsynth:

pip install git+https://github.com/0xf0f/pyfluidsynth

Almost there...

The only thing still needed:

pip install magenta
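
Before celebrating, a quick smoke test is to check that all the troublesome packages can be imported. The sketch below assumes the usual import names (rtmidi for python-rtmidi, fluidsynth for pyfluidsynth); check_imports is a made-up helper, and its first argument is the Python interpreter to use.

```shell
# Hypothetical helper: try to import each module with the given interpreter.
# Usage: check_imports <python> <module> [<module> ...]
check_imports() {
    py="$1"
    shift
    failed=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "FAILED: $mod"
            failed=1
        fi
    done
    return "$failed"
}
```

For example: check_imports python tensorflow rtmidi fluidsynth magenta. Any FAILED line points at the installation step that needs another look.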

Done!

('Universal Install Script' comic courtesy of xkcd.)
