Exercise: Intro to MLFlow - Part III
Now that we have logged models into MLFlow, it's time to learn how to register them and deploy them to a production environment.
- Load a regression dataset
- Train a model
- Log the model into MLFlow
- Register the model
- Stage the model into production/development
- Deploy the model using MLFlow
from sklearn import datasets
# Download dataset and convert to pandas dataframe
diabetes_dataset = datasets.load_diabetes(as_frame=True)
X = diabetes_dataset.data
y = diabetes_dataset.target
Exercise I: Split the Data into Train and Test Sets
Remember that we need to split our data into train and test sets. We can use the train_test_split function from sklearn.model_selection to do this. We should store the split in X_train, y_train, X_test, and y_test.
from sklearn.model_selection import train_test_split
RANDOM_STATE = 42
TEST_SIZE = 0.2
# Add the relevant code below to split the data into training and testing sets
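One possible way to fill in the split is sketched below. It reloads the dataset so the block runs on its own, and it relies on train_test_split returning the pieces in the order X_train, X_test, y_train, y_test, which differs from the order in which the names are listed above.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

RANDOM_STATE = 42
TEST_SIZE = 0.2

# Reload the dataset so this sketch is self-contained
diabetes_dataset = datasets.load_diabetes(as_frame=True)
X = diabetes_dataset.data
y = diabetes_dataset.target

# Note the return order: both X pieces first, then both y pieces
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=TEST_SIZE, random_state=RANDOM_STATE
)

print(X_train.shape, X_test.shape)
```

Fixing random_state keeps the split reproducible, so metrics logged later can be compared across runs.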
Exercise II: Train a Linear Regression Model
Then, train a linear regression model using the scikit-learn library.
- Initialize the model by calling the LinearRegression class.
- Train the model using the fit method.
from sklearn.linear_model import LinearRegression
# Add code to train the model
LinearRegression()
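A minimal sketch of the training step, assuming the same dataset and split parameters as in Exercise I. The data loading is repeated only so the block can run by itself.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same data and split parameters as in Exercise I
diabetes_dataset = datasets.load_diabetes(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    diabetes_dataset.data, diabetes_dataset.target, test_size=0.2, random_state=42
)

# Initialize the estimator, then fit it on the training portion only
model = LinearRegression()
model.fit(X_train, y_train)

# One coefficient is learned per input feature
print(model.coef_.shape)
```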
Exercise III: Compute the Accuracy of the Model
Finally, measure the accuracy of the model using the root_mean_squared_error function from the sklearn.metrics module. The root mean squared error (RMSE) is an error measure, so lower values indicate a better fit.
- Compute the predictions by passing X_test to the predict method of the model.
- Compute the RMSE using the root_mean_squared_error function, passing y_test and the predictions as arguments.
- Print the RMSE.
from sklearn.metrics import root_mean_squared_error
# Add code to calculate the root mean squared error
Exercise IV: Create a Run and log the model and metrics
- Connect to MLFlow
- Set the experiment "Diabetes Linear Regression"
import mlflow
EXPERIMENT_NAME = "Diabetes Linear Regression"
MLFLOW_TRACKING_URI = "http://localhost:5000"
# Connect to MLFlow
- Log the root mean squared error metric using the mlflow.log_metric function.
- Log the model using the mlflow.sklearn.log_model function.
# launch a run to log the model
with mlflow.start_run() as run:
    # Add code to log the model and the root mean squared error
Exercise V: Register the model
Registering a model in MLFlow is a way to keep track of the different versions of the same model. Each registered model has numbered versions that track changes to the model over time.
- Get the run ID of the model you want to register using run.info.run_id.
- Register the model using the mlflow.register_model function.
# register the model for this run
MODEL_NAME = "diabetes-predictions" # change this to your model name
# Compute model path: models stored in a run follow this convention
model_path = f"runs:/{run_id}/model"  # fill in the `run_id` variable
Exercise VI: Deploy a model
Deploying a model is a complex task that involves many steps. MLFlow simplifies this process by providing a set of tools to deploy models to different platforms. In this exercise, we will deploy a model to a local server.
First, you need to connect the terminal to the MLFlow server by setting the MLFLOW_TRACKING_URI environment variable.
export MLFLOW_TRACKING_URI=http://localhost:5000
Then, you can deploy the model using the mlflow models serve command in your terminal:
mlflow models serve --model-uri models:/<model_name>/<model_version> --port 5001
Here, <model_name> is the name of the model and <model_version> is the version of the model you want to deploy. You can find both in the MLFlow UI. The --port argument sets the port on which the model server listens. It must differ from port 5000, where the MLFlow server is already running.
BONUS: Make a request to the model
Finally, make a request to the model using the requests library. You can use the following code to make a request to the model:
import requests
import json
import numpy as np
# Define the URL and payload (JSON data)
url = 'http://localhost:5001/invocations'
headers = {'Content-Type': 'application/json'}
# Create a nested list with 2 random rows of 10 values, one per diabetes feature
input_vector = np.random.rand(2, 10).tolist()
# Create a dictionary with the 'inputs' key and the input_vector
payload = {'inputs': input_vector}
# Convert the payload to JSON format
json_payload = json.dumps(payload)
# Make a POST request
response = requests.post(url, headers=headers, data=json_payload)
# Check the response
if response.status_code == 200:
    print("Request successful. Response:")
    print(response.text)
else:
    print(f"Request failed with status code {response.status_code}")
    print(response.text)
Request successful. Response:
{"predictions": [-36.83689486654794, -37.10261875122955]}
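Random inputs only confirm that the endpoint responds. As a sketch of a more meaningful request body, real rows from the dataset can be sent with their column names. This assumes the serving endpoint accepts the dataframe_split JSON layout, as MLFlow 2.x scoring servers do; the helper name build_payload is hypothetical.

```python
import json

from sklearn import datasets


def build_payload(frame):
    """Wrap a pandas DataFrame in the 'dataframe_split' JSON layout."""
    split = frame.to_dict(orient="split")
    # Keep only column names and row values; the index is not needed
    return {"dataframe_split": {"columns": split["columns"], "data": split["data"]}}


# Use the first two rows of the diabetes features as example inputs
X = datasets.load_diabetes(as_frame=True).data
payload = build_payload(X.head(2))
json_payload = json.dumps(payload)

print(payload["dataframe_split"]["columns"])
```

The resulting json_payload can be passed as the data argument of the requests.post call above in place of the random input.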