{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ✍️ Exercise: Intro to MLFlow - Part III\n", "\n", "Now that we have logged models into MLFlow, it's time to learn how to register them and deploy them to a production environment.\n", "\n", "\n", "- Load a regression dataset\n", "- Train a model\n", "- Log the model into MLFlow\n", "- Register the model\n", "- Stage the model into production/development\n", "- Deploy the model using MLFlow" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "from sklearn import datasets\n", "\n", "\n", "# Load the diabetes dataset as pandas objects (DataFrame of features, Series target)\n", "diabetes_dataset = datasets.load_diabetes(as_frame=True)\n", "X = diabetes_dataset.data\n", "y = diabetes_dataset.target" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise I: Split the Data into Train and Test Sets\n", "\n", "💡 Remember that we need to split our data into train and test sets. We can use the [`train_test_split` function](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) from `sklearn.model_selection` to do this. Store the result in `X_train`, `X_test`, `y_train`, and `y_test` (the order in which the function returns them)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "\n", "RANDOM_STATE = 42\n", "TEST_SIZE = 0.2\n", "\n", "# 👇 Add the relevant code below to split the data into training and testing sets\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise II: Train a Linear Regression Model\n", "\n", "Then, train a [**linear regression model** using the scikit-learn library](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html).\n", "\n", "1. 👉 Initialize the model by calling the `LinearRegression` class.\n", "2. 👉 Train the model on `X_train` and `y_train` using the `fit` method." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
LinearRegression()
" ], "text/plain": [ "LinearRegression()" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.linear_model import LinearRegression\n", "\n", "\n", "# Add code to train the model 👇\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise III: Compute the Error of the Model\n", "\n", "Finally, measure how well the model performs using the [`root_mean_squared_error` function](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.root_mean_squared_error.html) from the `sklearn.metrics` module. Because this is a regression problem, we report an error (lower is better) rather than an accuracy.\n", "\n", "1. 👉 Compute the predictions by passing `X_test` to the `predict` method of the model.\n", "2. 👉 Compute the error by calling `root_mean_squared_error` with `y_test` and the `predictions` as arguments.\n", "3. 👉 Print the error." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import root_mean_squared_error\n", "\n", "\n", "# Add code to calculate the root mean squared error 👇\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise IV: Create a Run and Log the Model and Metrics\n", "\n", "1. 👉 Connect to MLFlow using the `mlflow.set_tracking_uri` function.\n", "2. 👉 Set the experiment \"Diabetes Linear Regression\" using the `mlflow.set_experiment` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlflow\n", "\n", "\n", "EXPERIMENT_NAME = \"Diabetes Linear Regression\"\n", "MLFLOW_TRACKING_URI = \"http://localhost:5000\"\n", "\n", "# Connect to MLFlow 👇\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "1. 👉 Log the root mean squared error metric using the `mlflow.log_metric` function.\n", "2. 👉 Log the model using the `mlflow.sklearn.log_model` function." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Launch a run to log the model\n", "with mlflow.start_run() as run:\n", "    \n", "    # Add code to log the model and the root mean squared error 👇" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise V: Register the model\n", "\n", "Registering a model in MLFlow is a way to keep track of the different versions of the same model. Each registered model has numbered versions that track changes to the model over time.\n", "\n", "1. 👉 Get the **run ID** of the model you want to register using `run.info.run_id`.\n", "2. 👉 Register the model using the `mlflow.register_model` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Register the model for this run\n", "MODEL_NAME = \"diabetes-predictions\" # change this to your model name\n", "\n", "\n", "# Compute model path: models stored in a run follow this convention\n", "model_path = f\"runs:/{run_id}/model\" # fill in the `run_id` variable" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise VI: Deploy a model\n", "\n", "Deploying a model is a complex task that involves many steps. MLFlow simplifies this process by providing a set of tools to deploy models to different platforms. In this exercise, we will deploy a model to a local server. \n", "\n", "First, you need to connect the terminal to the MLFlow Server by setting the `MLFLOW_TRACKING_URI` environment variable. \n", "\n", "```bash\n", "export MLFLOW_TRACKING_URI=http://localhost:5000\n", "```\n", "\n", "Then, you can deploy the model using the `mlflow models serve` command **in your terminal**:\n", "\n", "```bash\n", "mlflow models serve --model-uri models:/<model_name>/<model_version> --port 5001\n", "```\n", "\n", "Here, `<model_name>` is the name of the registered model and `<model_version>` is the version you want to deploy. You can find both in the MLFlow UI. 
The `--port` argument sets the port the model server listens on. Choose a port other than `5000`, which the MLFlow server is already using." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## BONUS: Make a request to the model\n", "\n", "Finally, send a prediction request to the deployed model using the `requests` library. You can use the following code to do so:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Request successful. Response:\n", "{\"predictions\": [-36.83689486654794, -37.10261875122955]}\n" ] } ], "source": [ "import requests\n", "import json\n", "import numpy as np\n", "\n", "# Define the URL of the model server and the request headers\n", "url = 'http://localhost:5001/invocations'\n", "headers = {'Content-Type': 'application/json'}\n", "\n", "# Create 2 random samples with 10 features each (the diabetes dataset has 10 features)\n", "input_vector = np.random.rand(2, 10).tolist()\n", "\n", "# Create a dictionary with the 'inputs' key and the input_vector\n", "payload = {'inputs': input_vector}\n", "\n", "# Convert the payload to JSON format\n", "json_payload = json.dumps(payload)\n", "\n", "# Make a POST request\n", "response = requests.post(url, headers=headers, data=json_payload)\n", "\n", "# Check the response\n", "if response.status_code == 200:\n", "    print(\"Request successful. Response:\")\n", "    print(response.text)\n", "else:\n", "    print(f\"Request failed with status code {response.status_code}\")\n", "    print(response.text)" ] } ], "metadata": { "kernelspec": { "display_name": "mlops-cookbook", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 2 }