Creating a Python Project for ML
In this article we will create a Python project for machine learning. We will use the following tools:
- Poetry: Dependency manager.
- Devcontainers: Development environment.
- Visual Studio Code: IDE.
Requirements
To follow this tutorial, you will need Docker and Visual Studio Code installed, since both are used to build the devcontainer in section 1.
Why Poetry?
Poetry handles dependency management, packaging, and virtual environments for Python projects from a single tool. Dependencies are declared in the pyproject.toml file and the resolved versions are locked, so the same environment can be reproduced on different machines. Poetry also automates building distributable packages, which makes projects easier to share, and it can create virtual environments that isolate each project's dependencies and avoid conflicts.
More About Poetry
You can find more info about Poetry in the official documentation.
Why Devcontainers?
Devcontainers simplify and standardize development environments. Instead of configuring each machine by hand, you define a containerized environment that includes all the tools, libraries, and configuration a project needs. Everyone on the team then works in the same environment, which reduces compatibility issues and lets new team members get started without manual setup steps. Because the environment runs inside a container, the project's dependencies stay isolated from the host machine and cannot conflict with it.
More About Devcontainers
You can find more info about Devcontainers in this official tutorial.
Why Visual Studio Code?
Visual Studio Code (VS Code) is an extensible, cross-platform code editor that supports a wide range of programming languages and frameworks. Its extensions add tools and integrations on top of built-in features such as IntelliSense code completion, debugging support, an integrated terminal, version control integration, and collaboration tools. It behaves consistently across operating systems and has a large, active community of users.
More About Visual Studio Code
You can find more info about Visual Studio Code in the official documentation.
Note
GitHub Copilot is an AI assistant that integrates with Visual Studio Code as an extension. It provides code suggestions and completions, and it can generate code snippets, functions, entire classes, comments, and documentation. You can find more info about Copilot in this official blog post.
1. Setting Up the Devcontainer
The fastest way to start developing is to get a development environment up and running using Docker and VSCode:
1.1. Installing Remote Containers Extension
- The first step is to install the Remote Containers extension in VSCode (published as Dev Containers in recent versions).
- VSCode automatically searches for the .devcontainer/devcontainer.json file in the root folder, so once that file exists you can run the container in development mode from VSCode (the first run takes some time).
Pre-built python devcontainers
VSCode provides pre-built images for Python, so you don't need to create a Dockerfile for your project. You can find more info about the available images in the official documentation. These images also include some common formatters and linters such as black, pylint, pydocstyle, and isort, among others.
1.2. Creating a Python Devcontainer
- Add a devcontainer configuration file.
- Select the Python 3 image.
- Select the buster tag.
- (Optional) Select features. There are some features that you can add to your devcontainer. For example, if you want to use Docker inside your devcontainer, you can add the docker feature. It is also possible to add Anaconda to your devcontainer.

A devcontainer.json file is created in the .devcontainer folder. This file contains the configuration for the devcontainer.
- (Optional) Select extensions. VSCode provides a list of extensions that can be installed in the devcontainer.
  a. Find the desired extension.
  b. Click on Add to devcontainer.json so it will be installed automatically when the devcontainer is created.

List of recommended extensions
You can directly copy this list of extensions to the devcontainer.json file under the customizations section.

"customizations": {
  "vscode": {
    // Add the IDs of extensions you want installed when the container is created.
    "extensions": [
      "ms-python.python",
      "ms-python.vscode-pylance",
      "ms-python.pylint",
      "njpwerner.autodocstring",
      "eamodio.gitlens",
      "mhutchie.git-graph",
      "zhuangtongfa.material-theme",
      "PKief.material-icon-theme",
      "ms-azuretools.vscode-docker",
      "yzhang.markdown-all-in-one",
      "DavidAnson.vscode-markdownlint",
      "christian-kohler.path-intellisense",
      "ms-vsliveshare.vsliveshare",
      "Vtrois.gitmoji-vscode",
      "GitHub.vscode-pull-request-github",
      "seatonjiang.gitmoji-vscode",
      "perkovec.emoji",
      "ms-toolsai.jupyter",
      "bungcip.better-toml",
      "GitHub.copilot",
      "GitLab.gitlab-workflow"
    ]
  }
},
- Rebuild and reopen in container. Once the devcontainer.json file is created, you can rebuild and reopen the devcontainer. This means that the devcontainer will be created and opened in a new VSCode window.
2. Create a Virtualenv using Pyenv
Virtualenvs are a great way to isolate your Python project dependencies. They allow you to create an isolated environment for your project, which means that you can install packages without affecting the rest of your system. This is especially useful if you are working on multiple projects at the same time, or if you want to test out new packages without affecting your system.
Common tools for creating virtualenvs include virtualenv, conda, and pyenv with its pyenv-virtualenv plugin, which is the one used in this tutorial.
How to install pyenv?
Open a terminal (Ctrl+Shift+` or Terminal > New Terminal) and use the following command:
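A minimal sketch of this step, assuming the installer script that the pyenv project publishes at https://pyenv.run (it also installs the pyenv-virtualenv plugin) and the default install location $HOME/.pyenv. The function names `install_pyenv` and `load_pyenv` are hypothetical; wrapping the commands in functions only keeps anything from running until you call them.

```shell
# Download and run the pyenv installer script (assumed source: https://pyenv.run).
install_pyenv() {
    curl -fsSL https://pyenv.run | bash
}

# Make pyenv and its virtualenv plugin available in the current shell session.
# Assumes the installer's default location, $HOME/.pyenv.
load_pyenv() {
    export PYENV_ROOT="$HOME/.pyenv"
    export PATH="$PYENV_ROOT/bin:$PATH"
    eval "$(pyenv init -)"
    eval "$(pyenv virtualenv-init -)"
}
```

Run `install_pyenv` once, then `load_pyenv` in each new shell (or copy the body of `load_pyenv` into your shell startup file) so that the `pyenv` command is found.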
- Create a Virtualenv using Pyenv. Note: You can use any name for your virtualenv.
- Activate the Virtualenv. Note: You can deactivate the virtualenv using pyenv deactivate.
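The two steps above could look roughly like this. The Python version `3.9.16` and the environment name `ml-project` are hypothetical placeholders, as are the function names; the sketch assumes pyenv and its pyenv-virtualenv plugin are installed and initialized in the current shell.

```shell
# Hypothetical values: pick the Python version and environment name you need.
PYTHON_VERSION="3.9.16"
VENV_NAME="ml-project"

# Install the interpreter (skipped if already present) and create the virtualenv.
create_virtualenv() {
    pyenv install --skip-existing "$PYTHON_VERSION"
    pyenv virtualenv "$PYTHON_VERSION" "$VENV_NAME"
}

# Activate the virtualenv in the current shell.
activate_virtualenv() {
    pyenv activate "$VENV_NAME"
}
```

Call `create_virtualenv` once and `activate_virtualenv` whenever you open a new terminal for the project.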
3. Starting a Poetry Project
Now that we have our development environment up and running, we can start creating our project. We will use Poetry to manage the project's dependencies and packaging. It resolves dependency versions, keeps them isolated from the rest of the system, and locks the resolved versions so that installs are reproducible.
Avoid creating a virtualenv using Poetry
By default, Poetry creates a virtual environment for each project, but this is normally handled by other tools like virtualenv or conda (pyenv in this tutorial). To stop Poetry from creating its own virtual environment, you can set the virtualenvs.create config option to false.
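As a short sketch of that setting, assuming Poetry is already installed and on the PATH (the function names are hypothetical and only delay execution until called):

```shell
# Tell Poetry to install into the already-active environment
# instead of creating a virtual environment of its own.
disable_poetry_virtualenvs() {
    poetry config virtualenvs.create false
}

# Print the current value of the option to confirm the change.
show_poetry_virtualenvs_setting() {
    poetry config virtualenvs.create
}
```

After calling `disable_poetry_virtualenvs`, Poetry installs packages into whichever environment is active, for example the pyenv virtualenv created earlier.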
How to install poetry?
Open a terminal (Ctrl+Shift+` or Terminal > New Terminal) and use pip:
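A minimal sketch of the pip installation, assuming the virtualenv is active so that `pip` points at it. `POETRY_VERSION` and `install_poetry` are hypothetical names, and the pinned value is just one of the versions this tutorial mentions.

```shell
# Version to install; equivalent to running: pip install poetry==<version>
POETRY_VERSION="1.5.0"

# Install the pinned Poetry release with pip.
install_poetry() {
    pip install "poetry==${POETRY_VERSION}"
}
```

Calling `install_poetry` installs that release; change `POETRY_VERSION` first if you need a different one.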
- ℹ️ Remember to use a specific version (like 1.2.2, 1.4.2 or 1.5.0) by replacing <version> with the desired version.
3.1. Creating a new project
To create a new project, we can use the poetry new command followed by the project name.
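For illustration, with a hypothetical project name `my_ml_project` and a hypothetical wrapper function, this could be sketched as:

```shell
# Hypothetical project name, used only for illustration.
PROJECT_NAME="my_ml_project"

# Scaffold a new Poetry project in the current directory.
create_project() {
    poetry new "$PROJECT_NAME"
}
```

Call `create_project` from the folder where the project directory should be created.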
The default poetry project structure
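The poetry new <project_name> command will create a new project folder containing a pyproject.toml file, a README.md file, a package folder named after the project with an __init__.py file, and a tests folder.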
The default pyproject.toml file
The pyproject.toml file contains the project's metadata and dependencies, and it will look like this:
[tool.poetry]
name = "<project_name>"
version = "0.1.0"
description = ""
authors = ["Andrés Matesanz <matesanz.cuadrado@gmail.com>"]
readme = "README.md"
packages = [{include = "<project_name>"}]
[tool.poetry.dependencies]
python = "^3.9"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
3.2. Adding dependencies
To add a dependency, we can use the poetry add command followed by the package name.
How to add a dependency: requests
This will add the dependency to the pyproject.toml file and install it in the virtual environment. For example, to add the requests library, we can use the following command:
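A sketch of that command, meant to be run from the project folder that contains pyproject.toml. The wrapper function `add_requests` is hypothetical; the `poetry show` call is an optional extra that prints details of the installed package.

```shell
# Add requests as a regular dependency and show what was installed.
add_requests() {
    poetry add requests
    poetry show requests
}
```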
This will add a requests entry, with the version constraint chosen by Poetry, under the [tool.poetry.dependencies] section of the pyproject.toml file.
Adding dependencies just for development
If we want to add a dependency just for development, we can use the --group dev option (the older --dev flag is deprecated since Poetry 1.2):
How to add a development dependency: pytest
This will add the dependency to the pyproject.toml file under the [tool.poetry.group.dev.dependencies] section. For example, to add the pytest library, we can use the following command:
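A sketch of that command, assuming Poetry 1.2 or newer, where `--group dev` is the replacement for the deprecated `--dev` flag. The wrapper function name is hypothetical.

```shell
# Add pytest to the "dev" dependency group.
add_dev_dependency() {
    poetry add --group dev pytest
}
```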
This will add a pytest entry, with its version constraint, to the [tool.poetry.group.dev.dependencies] section of the pyproject.toml file.
3.3. Removing dependencies
To remove a dependency, we can use the poetry remove command followed by the package name.
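As a sketch, removing the two packages added earlier could look like this. The function names are hypothetical, and the `--group dev` option assumes Poetry 1.2 or newer.

```shell
# Remove a regular dependency.
remove_requests() {
    poetry remove requests
}

# Remove a dependency that lives in the "dev" group.
remove_dev_dependency() {
    poetry remove --group dev pytest
}
```

Both commands also remove the matching entry from the pyproject.toml file.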