🔢 NumPy Slicing and Masking
NumPy, short for Numerical Python, is an essential Python library for performing mathematical and logical operations on arrays. In this tutorial, we'll cover two important techniques for data manipulation in NumPy: slicing and masking.
1. Slicing
Like Python lists, NumPy arrays can be sliced. Because arrays can be multidimensional, you can give one slice per dimension, separated by commas; any trailing dimension you leave out is taken in full.
import numpy as np
# Create a 1D array
arr1 = np.array([1, 2, 3, 4, 5])
print("1D Array:", arr1)
# Slice the elements at indices 1 through 3 (the stop index 4 is excluded)
print("Sliced Array:", arr1[1:4])
# Create a 2D array
arr2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("2D Array:\n", arr2)
# Slice a specific row
print("First row:\n", arr2[0])
# Slice a specific column
print("First column:\n", arr2[:, 0])
# Slice a submatrix
print("Submatrix:\n", arr2[0:2, 1:3])
1D Array: [1 2 3 4 5]
Sliced Array: [2 3 4]
2D Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
First row:
 [1 2 3]
First column:
 [1 4 7]
Submatrix:
 [[2 3]
 [5 6]]
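The slicing rules above also cover a step value, negative indices, and omitted trailing dimensions. The following is a minimal sketch of those cases; the array names (`every_other`, `grid`, `base`, and so on) are made up for illustration and do not appear in the tutorial's own cells. It also shows that a basic slice is a view onto the original data, so writing through the slice changes the original array unless you call `.copy()` first.

```python
import numpy as np

arr = np.arange(10)                 # [0 1 2 3 4 5 6 7 8 9]
every_other = arr[::2]              # start:stop:step, here every second element
last_three = arr[-3:]               # negative indices count from the end

grid = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
top_rows = grid[:2]                 # column slice omitted, same as grid[:2, :]

print("Every other:", every_other)
print("Last three:", last_three)
print("Top rows:\n", top_rows)

# A basic slice is a view: assigning through it modifies the original array.
base = np.array([10, 20, 30, 40, 50])
window = base[1:4]
window[0] = -1
print("Base after writing through the view:", base)

# An explicit copy is independent of the original.
detached = base[1:4].copy()
detached[1] = 0
print("Base after writing to the copy:", base)
```

If you need a slice that you can modify freely, taking a copy as in the last step is one straightforward option.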
Array masking is a powerful NumPy feature that lets you select and modify data based on conditions. This section covers array masking and introduces np.where, a function that works well alongside masks.
2. Masking in NumPy
A mask is essentially a boolean array that can be used to "mask" or "hide" certain values in an array.
import numpy as np
# Create an array
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Create a mask
mask = arr > 5
print("Mask:", mask)
Mask: [False False False False False True True True True True]
This creates a boolean mask in which each element is True if the corresponding element of the array is greater than 5 and False otherwise.
You can use this mask to index into the original array, which returns a new array containing only the values where the mask is True.
# Using the mask to index into the array
filtered_arr = arr[mask]
print("Filtered Array:", filtered_arr)
Filtered Array: [ 6 7 8 9 10]
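Masks built from single comparisons can be combined into more specific conditions. The sketch below assumes the same kind of integer array as above (named `values` here for illustration) and uses NumPy's element-wise operators `&` (and), `|` (or), and `~` (not). Each comparison needs its own parentheses, because these operators bind more tightly than `>` and `<`.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Element-wise AND: keep values strictly between 3 and 8
in_range = (values > 3) & (values < 8)
print("In range:", values[in_range])

# Element-wise OR: keep values that are even or greater than 8
even_or_large = (values % 2 == 0) | (values > 8)
print("Even or large:", values[even_or_large])

# Element-wise NOT: invert a mask
print("Out of range:", values[~in_range])
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays, which is why the symbolic operators are used here.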
2.1 Modifying Values with a Mask
You can also modify the values in your original array using a mask. This can be useful if you want to apply changes to certain elements in your array based on a condition.
# Modifying elements of the array using a mask
arr[mask] = 99
print("Modified Array:", arr)
Modified Array: [ 1 2 3 4 5 99 99 99 99 99]
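One detail worth checking for yourself is the difference between reading with a mask and assigning with a mask. As a hedged illustration (the array `scores` and the variable `selected` are hypothetical names), the sketch below shows that indexing with a mask produces an independent copy, while assigning through the mask updates the original array in place.

```python
import numpy as np

scores = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
mask = scores > 5

# Reading with a mask returns a copy, so changing it leaves scores untouched.
selected = scores[mask]
selected[:] = 0
print("Scores after editing the copy:", scores)

# Assigning through the mask changes scores itself.
scores[mask] *= 2
print("Scores after masked update:", scores)
```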
2.2 np.where
np.where returns the indices of the elements in an input array that satisfy a given condition. Called with only the condition, it returns a tuple of index arrays (one per dimension), which can be used to index the array just like a mask.
# Create an array
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Use np.where to find the indices where the condition holds, then filter the array
indices = np.where(arr > 5)
print("Indices:", indices)
filtered_arr = arr[indices]
print("Filtered Array:", filtered_arr)
Indices: (array([5, 6, 7, 8, 9]),)
Filtered Array: [ 6 7 8 9 10]
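The trailing comma in the output above is there because np.where returns a tuple with one index array per dimension. A small sketch on a 2D array makes this easier to see; the array `grid` and its values are made up for illustration.

```python
import numpy as np

grid = np.array([[1, 8, 3],
                 [9, 2, 7]])

# For a 2D input, np.where returns (row_indices, column_indices).
rows, cols = np.where(grid > 5)
print("Rows:", rows)
print("Cols:", cols)

# Indexing with both arrays picks out the matching elements,
# giving the same result as indexing with the boolean mask directly.
print("Via indices:", grid[rows, cols])
print("Via mask:   ", grid[grid > 5])
```

The index form is mainly useful when you need the positions themselves; if you only want the values, indexing with the mask is shorter.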
You can also call np.where with three arguments, np.where(condition, x, y), to build a new array that takes its values from x where the condition is True and from y where it is False.
# Use np.where to replace elements based on a condition
new_arr = np.where(arr > 5, 99, arr)
print("New Array:", new_arr)
New Array: [ 1 2 3 4 5 99 99 99 99 99]
In this case, np.where replaces the elements where the condition arr > 5 is True with 99 and keeps the original values everywhere else. Unlike arr[mask] = 99, it leaves arr itself unchanged.
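The second and third arguments do not have to be a scalar and the original array. As a rough sketch (the names `numbers`, `labels`, and `scaled` are illustrative), they can be any values or arrays that broadcast against the condition:

```python
import numpy as np

numbers = np.arange(1, 7)  # [1 2 3 4 5 6]

# Two scalars: label each element according to the condition
labels = np.where(numbers % 2 == 0, "even", "odd")
print("Labels:", labels)

# Two arrays: pick element-wise from one expression or the other
scaled = np.where(numbers > 3, numbers * 10, -numbers)
print("Scaled:", scaled)

# The input array is not modified
print("Numbers:", numbers)
```

Note that both candidate expressions are computed for every element before np.where chooses between them, so this form does not skip work for the unselected branch.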
By combining array masking and np.where, you can select and modify the elements of your NumPy arrays based on a wide variety of conditions without writing explicit loops.