🌪️ Itertools
The itertools module is part of Python's standard library and provides building blocks for creating and combining iterators. It is especially useful when dealing with large datasets, for several reasons:
Memory Efficiency: Most iterators in itertools don't need to load the entire dataset into memory. Instead, they generate one item at a time, as you need it. This makes them well suited to processing large datasets that might not fit into memory.
Lazy Evaluation: itertools uses lazy evaluation, meaning that computations are only performed when needed. For instance, if you only need the first few items from an infinite iterator, only those items will be computed.
High Performance: In CPython, the itertools functions are implemented in C, so they typically run faster than equivalent loops written in pure Python.
Composability: The tools in itertools are designed to be combined with each other. This means you can build more complex iterators from the basic ones provided by the module.
Flexible Iteration: The module offers iterator-based solutions for creating permutations and combinations, Cartesian products, slicing, grouping, and more. These cover a wide range of real-world problems, including many encountered in data analysis and machine learning.
By using memory efficiently, processing large datasets quickly, and offering a suite of composable tools for iteration and combinatorics, itertools is a natural choice for iterator-heavy Python code.
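The laziness and composability described above can be sketched as follows. This is a minimal illustration rather than a benchmark, and the helper name squares_of_evens is hypothetical, introduced only for this example.

```python
import itertools

def squares_of_evens():
    """Hypothetical helper: lazily yield squares of even numbers, forever."""
    evens = itertools.count(0, 2)        # 0, 2, 4, ... (infinite)
    return map(lambda n: n * n, evens)   # nothing is computed yet

# islice pulls only the first five items, so the infinite stream is safe to use.
first_five = list(itertools.islice(squares_of_evens(), 5))
print(first_five)  # [0, 4, 16, 36, 64]
```

Only five squares are ever computed here, even though the underlying stream has no end.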
1. Importing itertools
You can import the itertools module as follows:
import itertools
2. Infinite Iterators
itertools provides three infinite iterators:
count(start, step): This iterator starts at start and adds step on each iteration, indefinitely. Both arguments are optional and default to 0 and 1.
for num in itertools.count(10, 2):
    print(num)
    if num > 20:
        break
10 12 14 16 18 20 22
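As a complement to the loop above, here is a hedged sketch of two common ways to bound count without a manual break. The labels list is made-up sample data.

```python
import itertools

# Pair each label with a running ID starting at 100; zip stops at the shorter input.
labels = ['alpha', 'beta', 'gamma']
numbered = list(zip(itertools.count(100), labels))
print(numbered)  # [(100, 'alpha'), (101, 'beta'), (102, 'gamma')]

# takewhile expresses the stopping condition declaratively.
evens = list(itertools.takewhile(lambda n: n <= 22, itertools.count(10, 2)))
print(evens)  # [10, 12, 14, 16, 18, 20, 22]
```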
cycle(iterable): This iterator repeats the elements of an iterable indefinitely, in order.
count = 0
for item in itertools.cycle('XYZ'):
    if count > 7:
        break
    print(item)
    count += 1
X Y Z X Y Z X Y
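The same eight items can be taken with islice instead of a counter, and cycle pairs naturally with zip for round-robin assignment. This is a small sketch; the task and worker names are invented for illustration.

```python
import itertools

# islice takes the first eight items, replacing the manual counter.
first_eight = list(itertools.islice(itertools.cycle('XYZ'), 8))
print(first_eight)  # ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y']

# Round-robin assignment: zip stops when the finite task list is exhausted.
tasks = ['t1', 't2', 't3', 't4', 't5']
assignment = list(zip(tasks, itertools.cycle(['worker_a', 'worker_b'])))
print(assignment)
```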
repeat(object, times): This iterator returns the same object over and over again. If times is specified, it repeats the object that many times; otherwise it repeats forever.
for item in itertools.repeat('A', 4):
    print(item)
A A A A
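One typical use of repeat, sketched here under the assumption that you want a constant argument alongside a varying one, is feeding map a fixed second operand.

```python
import itertools

# repeat(2) supplies the same exponent for every base; map stops when range(5) ends.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```

Because repeat(2) has no times argument it is infinite, but map still terminates as soon as its shortest input does.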
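3. Iterators Terminating on the Shortest Input Sequence
itertools provides several functions whose iterators are finite: they stop once their input is exhausted. The category name follows the official documentation; note that zip_longest, covered later in this section, runs until the longest input ends.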
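accumulate(iterable, func): This iterator returns accumulated results. With no func it yields running sums; when func is given, it yields the running result of applying that two-argument function, such as the running product below.
import operator
data = [1, 2, 3, 4, 5]
result = itertools.accumulate(data, operator.mul)
for each in result:
    print(each)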
1 2 6 24 120
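To show how the choice of func changes the result, here is a brief sketch with made-up data. The initial keyword is assumed to be available, which requires Python 3.8 or later.

```python
import itertools

data = [3, 1, 4, 1, 5]

running_sum = list(itertools.accumulate(data))        # default: addition
running_max = list(itertools.accumulate(data, max))   # any two-argument function works
with_initial = list(itertools.accumulate(data, initial=10))  # Python 3.8+

print(running_sum)   # [3, 4, 8, 9, 14]
print(running_max)   # [3, 3, 4, 4, 5]
print(with_initial)  # [10, 13, 14, 18, 19, 24]
```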
chain(*iterables): This function takes several iterables as arguments and returns a single iterator that produces the contents of the inputs one after the other.
for item in itertools.chain([1, 2, 3], ['a', 'b', 'c']):
    print(item)
1 2 3 a b c
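When the inputs are already collected in a single iterable, the related constructor chain.from_iterable avoids unpacking them with *. A minimal sketch with sample data:

```python
import itertools

nested = [[1, 2], [3], [], [4, 5]]

# Flattens one level of nesting, lazily; empty sublists contribute nothing.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```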
zip_longest(*iterables, fillvalue=None): This function makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue.
for item in itertools.zip_longest([1, 2, 3], ['a', 'b'], fillvalue='_'):
    print(item)
(1, 'a') (2, 'b') (3, '_')
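For contrast, this short sketch puts the built-in zip next to zip_longest on the same inputs, leaving fillvalue at its default of None.

```python
import itertools

nums = [1, 2, 3]
letters = ['a', 'b']

shortest = list(zip(nums, letters))                    # stops at the shorter input
longest = list(itertools.zip_longest(nums, letters))   # pads with None by default

print(shortest)  # [(1, 'a'), (2, 'b')]
print(longest)   # [(1, 'a'), (2, 'b'), (3, None)]
```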
4. Combinatoric Iterators
itertools also provides several functions that generate combinatorial arrangements of their inputs, yielding each arrangement as a tuple:
product(*iterables, repeat=1): This function creates the Cartesian product of the input iterables.
for item in itertools.product([1, 2], ['a', 'b']):
    print(item)
(1, 'a') (1, 'b') (2, 'a') (2, 'b')
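The repeat parameter in the signature is not exercised above, so here is a small sketch of it, along with a check that product matches the equivalent nested loops.

```python
import itertools

# repeat=3 is shorthand for product([0, 1], [0, 1], [0, 1]).
bits = list(itertools.product([0, 1], repeat=3))
print(len(bits))          # 8
print(bits[0], bits[-1])  # (0, 0, 0) (1, 1, 1)

# product yields tuples in the same order as nested for-loops, rightmost varying fastest.
pairs = list(itertools.product([1, 2], ['a', 'b']))
nested_loops = [(x, y) for x in [1, 2] for y in ['a', 'b']]
print(pairs == nested_loops)  # True
```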
permutations(iterable, r=None): This function makes an iterator that returns successive r-length permutations of the elements in the iterable. If r is omitted, it defaults to the length of the iterable.
for item in itertools.permutations([1, 2, 3], 2):
    print(item)
(1, 2) (1, 3) (2, 1) (2, 3) (3, 1) (3, 2)
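As a sanity check on the counts, this sketch compares the number of generated permutations with the closed-form values. It assumes Python 3.8 or later for math.perm.

```python
import itertools
import math

# With r omitted, every full-length ordering is produced.
full = list(itertools.permutations('abc'))
print(len(full))  # 6, i.e. 3!

# With r=2, the count is n! / (n - r)!.
partial = list(itertools.permutations([1, 2, 3], 2))
print(len(partial) == math.perm(3, 2))  # True
```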
combinations(iterable, r): This function makes an iterator that returns successive r-length combinations of the elements in the iterable.
for item in itertools.combinations([1, 2, 3, 4], 2):
    print(item)
(1, 2) (1, 3) (1, 4) (2, 3) (2, 4) (3, 4)
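The difference between combinations and permutations is that order is ignored. The following sketch illustrates this on the same input, assuming Python 3.8 or later for math.comb.

```python
import itertools
import math

items = [1, 2, 3, 4]
pairs = list(itertools.combinations(items, 2))
print(len(pairs) == math.comb(4, 2))  # True: 6 combinations

# Collapsing each permutation to sorted order recovers exactly the combinations.
unordered = {tuple(sorted(p)) for p in itertools.permutations(items, 2)}
print(unordered == set(pairs))  # True
```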