✍️ Exercise: Intro to MLFlow - Part III

Now that we have logged models into MLFlow, it's time to learn how to register them and deploy them to a production environment.

Load a regression dataset
Train a model
Log the model into MLFlow
Register the model
Stage the model into production/development
Deploy the model using MLFlow

In [7]:

from sklearn import datasets

# Load the dataset as a pandas DataFrame (features) and Series (target)
diabetes_dataset = datasets.load_diabetes(as_frame=True)
X = diabetes_dataset.data
y = diabetes_dataset.target

Exercise I: Split the Data into Train and Test Sets

💡 Remember that we need to split our data into train and test sets. We can use the train_test_split function from sklearn.model_selection to do this and store the result in X_train, X_test, y_train, y_test (the order in which train_test_split returns the pieces).
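
One possible way to do the split is sketched below. It reloads the dataset so that it runs on its own, and it reuses the RANDOM_STATE and TEST_SIZE values from this exercise. Treat it as an illustration rather than the only valid solution.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

RANDOM_STATE = 42
TEST_SIZE = 0.2

# Reload the dataset so the example is self-contained.
diabetes_dataset = datasets.load_diabetes(as_frame=True)
X = diabetes_dataset.data
y = diabetes_dataset.target

# train_test_split returns both X pieces first, then both y pieces.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=TEST_SIZE, random_state=RANDOM_STATE
)

# Show how many rows and columns ended up in each feature set.
print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible, so everyone running the notebook evaluates on the same test rows.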

In [6]:

from sklearn.model_selection import train_test_split

RANDOM_STATE = 42
TEST_SIZE = 0.2

# 👇 Add the relevant code below to split the data into training and testing sets

Exercise II: Train a Linear Regression Model

Then, train a linear regression model using the scikit-learn library.

👉 Initialize the model calling the LinearRegression class.
👉 Train the model using the fit method.
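
The two steps above could look roughly like the following sketch. It repeats the data loading and the split (with the same test size and random state as in Exercise I) only so that it can be run in isolation.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Recreate the train/test split so the example runs on its own.
data = datasets.load_diabetes(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Initialize the model, then fit it on the training data only.
model = LinearRegression()
model.fit(X_train, y_train)

# One learned coefficient per input feature, plus an intercept.
print(model.coef_.shape, model.intercept_)
```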

In [7]:

from sklearn.linear_model import LinearRegression

# Add code to train the model 👇

Out[7]:

LinearRegression()

Exercise III: Compute the Accuracy of the Model

Finally, evaluate the model using the root_mean_squared_error function from the sklearn.metrics module. For a regression model this is an error rather than an accuracy, so lower values are better.

👉 Compute the predictions by passing X_test to the predict method of the model.
👉 Compute the error by passing y_test and the predictions to the root_mean_squared_error function.
👉 Print the error.
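
A minimal sketch of the evaluation step follows. It takes the square root of mean_squared_error, which gives the root mean squared error and also runs on scikit-learn releases that predate root_mean_squared_error. The data loading, split and training are repeated only to keep the example self-contained.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Recreate the split and the trained model so the example runs on its own.
data = datasets.load_diabetes(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Predict on the held-out rows only.
predictions = model.predict(X_test)

# Root mean squared error: the square root of the mean squared error.
rmse = mean_squared_error(y_test, predictions) ** 0.5
print(f"RMSE: {rmse:.2f}")
```

The error is in the same units as the target, which makes it easier to interpret than the raw mean squared error.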

In [11]:

from sklearn.metrics import root_mean_squared_error

# Add code to calculate the root mean squared error 👇

Exercise IV: Create a Run and log the model and metrics

👉 Connect to MLFlow
👉 Set the experiment "Diabetes Linear Regression"

In [ ]:

import mlflow

EXPERIMENT_NAME = "Diabetes Linear Regression"
MLFLOW_TRACKING_URI = "http://localhost:5000"

# Connect to MLFlow 👇

👉 Log the root mean squared error metric using the mlflow.log_metric function.
👉 Log the model using the mlflow.sklearn.log_model function.

In [ ]:

# launch a run to log the model
with mlflow.start_run() as run:
    
    # Add code to log the model and the root mean squared error 👇

Exercise V: Register the model

Registering a model in MLFlow is a way to keep track of the different versions of the same model. Each registered model has numbered versions that track how the model changes over time.

👉 Get the run ID of the model you want to register using run.info.run_id.
👉 Register the model using the mlflow.register_model function.

In [ ]:

# register the model for this run
MODEL_NAME = "diabetes-predictions"  # change this to your model name

# Compute model path: models stored in a run follow this convention
model_path = f"runs:/{run_id}/model"  # fill in the `run_id` variable

Exercise VI: Deploy a model

Deploying a model is a complex task that involves many steps. MLFlow simplifies this process by providing a set of tools to deploy models to different platforms. In this exercise, we will deploy a model to a local server.

First, you need to connect the terminal to the MLFlow Server by setting the MLFLOW_TRACKING_URI environment variable.

export MLFLOW_TRACKING_URI=http://localhost:5000

Then, you can deploy the model using the mlflow models serve command in your terminal:

mlflow models serve --model-uri models:/<model_name>/<model_version> --port 5001

Here <model_name> is the name of the model and <model_version> is the version you want to deploy; both are shown in the MLFlow UI. The --port argument sets the port the model server listens on. Choose a port other than 5000, which the MLFlow server is already using.

BONUS: Make a request to the model

Finally, make a request to the model using the requests library. You can use the following code to make a request to the model:

In [20]:

import requests
import json
import numpy as np

# Define the URL and payload (JSON data)
url = 'http://localhost:5001/invocations'
headers = {'Content-Type': 'application/json'}

# Create two random input rows; the diabetes dataset has 10 features per row
input_vector = np.random.rand(2, 10).tolist()

# Create a dictionary with the 'inputs' key and the input_vector
payload = {'inputs': input_vector}

# Convert the payload to JSON format
json_payload = json.dumps(payload)

# Make a POST request
response = requests.post(url, headers=headers, data=json_payload)

# Check the response
if response.status_code == 200:
    print("Request successful. Response:")
    print(response.text)
else:
    print(f"Request failed with status code {response.status_code}")
    print(response.text)

Request successful. Response:
{"predictions": [-36.83689486654794, -37.10261875122955]}