Qarnot Technical Team
Engineers

MLflow on Qarnot Cloud - documentation

December 22, 2021 - HPC discovery, Documentation, Machine Learning / AI

Introduction

MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud).

This step-by-step walkthrough shows how to run an MLflow pipeline on Qarnot, so follow along!

Version

Release year    Version
2021            1.21.0

If you are interested in another version, please send us an email at qlab@qarnot.com.

Prerequisites

Before starting a calculation with the Python SDK, a few steps are required:

  • Retrieve the authentication token (here)
  • Install Qarnot’s Python SDK (here)

Note: in addition to the Python SDK, Qarnot provides C# and Node.js SDKs and a command-line interface.

Test Case

This test case shows how to train an ElasticNet model, a type of regularized linear regression that combines two popular penalties: the L1 and L2 penalty functions. This combination makes it possible to learn a sparse model in which few of the weights are non-zero, as with Lasso, while retaining the regularization properties of Ridge. The balance between L1 and L2 is controlled by the l1_ratio parameter.

We will use MLflow to run two experiments with different values of alpha and l1_ratio. In each experiment, we will fix one of the hyper-parameters, vary the other and record how the model's performance changes.

  • alpha: a float that multiplies the penalty terms. alpha = 0 is equivalent to ordinary least squares.
  • l1_ratio: the ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
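
To make the roles of alpha and l1_ratio concrete, here is a minimal sketch of the penalty term as scikit-learn's ElasticNet defines it: alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2. The function name elastic_net_penalty and the sample weights are illustrative only and are not part of the tutorial files.

```python
def elastic_net_penalty(weights, alpha, l1_ratio):
    """Penalty added to the least-squares loss (scikit-learn's ElasticNet convention)."""
    if not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("l1_ratio must be between 0 and 1")
    l1_norm = sum(abs(w) for w in weights)          # ||w||_1
    l2_norm_squared = sum(w * w for w in weights)   # ||w||_2^2
    return (alpha * l1_ratio * l1_norm
            + 0.5 * alpha * (1.0 - l1_ratio) * l2_norm_squared)


weights = [0.5, -2.0, 0.0]  # illustrative coefficients, not fitted values

# alpha = 0 removes the penalty entirely (ordinary least squares).
print(elastic_net_penalty(weights, alpha=0.0, l1_ratio=0.5))  # 0.0
# l1_ratio = 1 keeps only the L1 term.
print(elastic_net_penalty(weights, alpha=1.0, l1_ratio=1.0))  # 2.5
# l1_ratio = 0 keeps only the L2 term.
print(elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.0))  # 2.125
```

Larger values of alpha scale the whole penalty up, which pushes the fitted weights toward zero.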

The data used here is the Wine Quality Data Set. The goal is to predict the quality of the wine (a score between 0 and 10) from various properties (pH, density, alcohol, etc.).

The input files needed for this tutorial can be downloaded here.

Once you have downloaded everything, your working directory should look like this:

  • input
    • sklearn_elasticnet_wine: MLflow project content
      • conda.yaml: YAML file for defining the conda environment to be used
      • MLproject: file defining the MLflow pipeline
      • train.py: python script for training the elastic net model
    • mlflow_driver.py: script for launching the experiments
  • run-mlflow.py: launches the computation on Qarnot (code found below)
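
If you want to double-check the layout before launching anything, a small helper along these lines may be useful. It is only a sketch: the file paths are copied from the listing above, while the helper name missing_files is made up for this example.

```python
from pathlib import Path

# Paths taken from the directory listing above, relative to the working directory.
EXPECTED_FILES = [
    "input/sklearn_elasticnet_wine/conda.yaml",
    "input/sklearn_elasticnet_wine/MLproject",
    "input/sklearn_elasticnet_wine/train.py",
    "input/mlflow_driver.py",
    "run-mlflow.py",
]


def missing_files(root="."):
    """Return the expected files that are not present under root."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("Working directory looks complete.")
```

Run it from the working directory; an empty result means every file from the listing is in place.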

Launching the test case

Once everything is set up, use the following script to launch the computation on Qarnot. Be sure to replace <<<MY_SECRET_TOKEN>>> in the script with your authentication token, otherwise the task cannot be launched.

To launch this script, copy the following code into a file named run-mlflow.py and execute python3 run-mlflow.py & in your terminal.

Results

At any given time, you can monitor the status of your task on Tasq.

Once training is done, the task status should change to green and the results should be downloaded automatically into an output directory. There you will find all the files generated by MLflow for each run of each experiment, including logs, metrics, parameters, the trained model and more.
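
As a rough illustration of what those files contain, the sketch below reads the parameters and the last logged metric values straight from disk. It assumes MLflow's local file-store layout, that is mlruns/<experiment_id>/<run_id>/params/<name> and mlruns/<experiment_id>/<run_id>/metrics/<name>, with each metric line of the form "timestamp value step". The function name read_runs is hypothetical, and the location of the mlruns directory inside your results folder is an assumption to adapt.

```python
from pathlib import Path


def read_runs(mlruns_dir):
    """Collect params and last logged metric values for every run found."""
    runs = []
    for run_dir in sorted(Path(mlruns_dir).glob("*/*")):
        params_dir = run_dir / "params"
        metrics_dir = run_dir / "metrics"
        if not (params_dir.is_dir() and metrics_dir.is_dir()):
            continue  # skip meta.yaml files and any other non-run entries

        # Each params file holds the parameter value as plain text.
        params = {
            p.name: p.read_text().strip()
            for p in params_dir.iterdir() if p.is_file()
        }

        metrics = {}
        for metric_file in metrics_dir.iterdir():
            if not metric_file.is_file():
                continue
            lines = metric_file.read_text().splitlines()
            if lines:
                # Assumed line format: "<timestamp> <value> <step>"; keep the last value.
                metrics[metric_file.name] = float(lines[-1].split()[1])

        runs.append({
            "experiment": run_dir.parent.name,
            "run": run_dir.name,
            "params": params,
            "metrics": metrics,
        })
    return runs
```

Calling read_runs on the mlruns directory of your results returns one dictionary per run, which is convenient for quick checks before opening the UI.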

MLflow UI

You can visualize these results in the integrated MLflow UI.

  • To do so, make sure you have mlflow installed on your machine by running pip install mlflow
  • Change directories to outputs/ with cd outputs/
  • Run mlflow ui

This will launch the UI on localhost:5000 which you can access from your browser.

From this dashboard you have access to:

  • Your experiments
  • The training runs with all the metrics and parameters you have defined
  • The model corresponding to each run

It is also possible to compare different runs from the same experiment. For example, the image below compares 5 runs to show the effect of alpha, as a scatter plot with alpha on the x-axis and the RMSE score on the y-axis.
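
For reference, RMSE is the root mean squared error between the true and predicted quality scores, so lower is better. Here is a minimal sketch of the computation; the function name rmse and the sample scores are illustrative and do not come from the tutorial's runs.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up quality scores, for illustration only.
true_quality = [5, 6, 7, 5]
predicted_quality = [5.5, 5.5, 6.0, 5.0]
print(rmse(true_quality, predicted_quality))
```

Because the errors are squared before averaging, a few large mistakes weigh more heavily on the score than many small ones.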

Wrapping up

That’s it! If you have any questions, please contact qlab@qarnot.com and we will be happy to help!
