🌪️ Itertools
The itertools module is part of Python's standard library and provides building blocks for creating and combining iterators. It is especially useful when dealing with large datasets, for several reasons:
Memory Efficiency: Iterators in itertools don't need to load the entire dataset into memory. Instead, they generate one item at a time, as you need them. This makes them ideal for processing large datasets that might not fit into memory.
Lazy Evaluation: itertools uses lazy evaluation, meaning that computations are only performed when needed. For instance, if you only need the first few items from an infinite iterator, only those items will be computed.
High Performance: The itertools functions are implemented in C in CPython, so they typically run faster than equivalent loops written in pure Python.
Composability: The tools in itertools are designed to be combined with each other. This means you can build more complex iterators from the basic ones provided by the module.
Flexible Iteration: The module offers flexible, iterator-based solutions for creating permutations and combinations, Cartesian products, slicing, grouping, and more. It can solve a wide range of real-world problems, including many encountered in data analysis and machine learning.
Together, memory efficiency, speed, and a composable set of tools for iteration and combinatorics make itertools well suited to processing large datasets.
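As a minimal sketch of lazy evaluation and composability, the snippet below builds an infinite stream and only computes the handful of values that are actually requested. It uses itertools.islice, which is part of the module but not otherwise covered here; the variable names are illustrative.

```python
import itertools

# Infinite stream of squares; nothing is computed until items are requested.
squares = (n * n for n in itertools.count(1))

# Keep only the even squares, then take the first five of them.
even_squares = filter(lambda x: x % 2 == 0, squares)
first_five = list(itertools.islice(even_squares, 5))

print(first_five)  # [4, 16, 36, 64, 100]
```

Only ten squares are ever computed, even though the underlying counter never ends.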
1. Importing itertools
You can import the itertools module as follows:
import itertools
2. Infinite Iterators
itertools provides three types of infinite iterators:
count(start, step): This iterator starts with start and increments it by step indefinitely.
for num in itertools.count(10, 2):
    print(num)
    if num > 20:
        break
10 12 14 16 18 20 22
cycle(iterable): This iterator cycles indefinitely over an iterable.
count = 0
for item in itertools.cycle('XYZ'):
    if count > 7:
        break
    print(item)
    count += 1
X Y Z X Y Z X Y
repeat(object, times): This iterator returns an object over and over again. If times is specified, it repeats the object that number of times; otherwise it repeats indefinitely.
for item in itertools.repeat('A', 4):
    print(item)
A A A A
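Infinite iterators are usually bounded by something else rather than by a manual counter and break. The following sketch shows three common ways to do that, assuming itertools.islice (not covered above) and the built-in zip and map; the task and worker names are hypothetical.

```python
import itertools

# Take the first seven values of an infinite counter without a manual break.
evens = list(itertools.islice(itertools.count(10, 2), 7))
print(evens)  # [10, 12, 14, 16, 18, 20, 22]

# Pair a finite list with an infinite cycle; zip stops at the shorter input.
tasks = ['parse', 'train', 'evaluate', 'report', 'archive']  # hypothetical task names
workers = itertools.cycle(['w1', 'w2'])                      # hypothetical worker labels
assignments = list(zip(tasks, workers))
print(assignments)

# repeat() without `times` supplies a constant argument to map().
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```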
3. Iterators Terminating on the Shortest Input Sequence
itertools provides several functions that produce iterators which stop once their inputs are exhausted:
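accumulate(iterable, func): This iterator returns accumulated sums by default. If func is given, it returns the accumulated results of applying that two-argument function instead, such as a running product with operator.mul.
import operator
data = [1, 2, 3, 4, 5]
result = itertools.accumulate(data, operator.mul)
for each in result:
    print(each)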
1 2 6 24 120
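To illustrate how the choice of func changes the result, here is a small sketch with the default running sum, a running maximum, and the initial keyword. The initial parameter requires Python 3.8 or later; the sample data is made up.

```python
import itertools

data = [3, 1, 4, 1, 5]  # illustrative sample values

# Default behaviour: running sum.
running_sum = list(itertools.accumulate(data))
print(running_sum)  # [3, 4, 8, 9, 14]

# Any two-argument function works, for example the built-in max.
running_max = list(itertools.accumulate(data, max))
print(running_max)  # [3, 3, 4, 4, 5]

# `initial` (Python 3.8+) seeds the accumulation and adds one extra output item.
with_initial = list(itertools.accumulate(data, initial=100))
print(with_initial)  # [100, 103, 104, 108, 109, 114]
```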
chain(*iterables): This function takes several iterables as arguments and returns a single iterator that produces the contents of the inputs one after the other.
for item in itertools.chain([1, 2, 3], ['a', 'b', 'c']):
    print(item)
1 2 3 a b c
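When the inputs are already collected in a single iterable, the alternate constructor chain.from_iterable can flatten them without unpacking. A minimal sketch, with hypothetical nested data:

```python
import itertools

batches = [[1, 2], [3], [], [4, 5]]  # hypothetical nested data

# Flatten one level of nesting lazily; empty inner lists contribute nothing.
flat = list(itertools.chain.from_iterable(batches))
print(flat)  # [1, 2, 3, 4, 5]
```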
zip_longest(*iterables, fillvalue=None): This function makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue, and iteration continues until the longest iterable is exhausted.
for item in itertools.zip_longest([1, 2, 3], ['a', 'b'], fillvalue='_'):
    print(item)
(1, 'a') (2, 'b') (3, '_')
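The difference from the built-in zip is easiest to see side by side. This sketch uses made-up inputs and the default fillvalue of None:

```python
import itertools

ids = [1, 2, 3]
labels = ['a', 'b']

# Built-in zip stops at the shortest input and silently drops the extra item.
short = list(zip(ids, labels))
print(short)  # [(1, 'a'), (2, 'b')]

# zip_longest keeps going and pads the missing value with None by default.
padded = list(itertools.zip_longest(ids, labels))
print(padded)  # [(1, 'a'), (2, 'b'), (3, None)]
```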
4. Combinatoric Iterators
itertools also provides several functions that produce combinatoric iterators over Cartesian products, permutations, and combinations:
product(*iterables, repeat=1): This function creates a Cartesian product from several input iterables.
for item in itertools.product([1, 2], ['a', 'b']):
    print(item)
(1, 'a') (1, 'b') (2, 'a') (2, 'b')
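The repeat argument takes the product of an iterable with itself, and product in general behaves like nested for loops. A short sketch of both points:

```python
import itertools

# All 3-bit tuples: the product of [0, 1] with itself three times.
bits = list(itertools.product([0, 1], repeat=3))
print(len(bits))  # 8
print(bits[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]

# product() yields the same tuples, in the same order, as nested loops.
nested = [(x, y) for x in [1, 2] for y in ['a', 'b']]
print(nested == list(itertools.product([1, 2], ['a', 'b'])))  # True
```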
permutations(iterable, r=None): This function makes an iterator that returns successive r length permutations of elements in the iterable.
for item in itertools.permutations([1, 2, 3], 2):
    print(item)
(1, 2) (1, 3) (2, 1) (2, 3) (3, 1) (3, 2)
combinations(iterable, r): This function makes an iterator that returns successive r length combinations of elements in the iterable.
for item in itertools.combinations([1, 2, 3, 4], 2):
    print(item)
(1, 2) (1, 3) (1, 4) (2, 3) (2, 4) (3, 4)
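Because these iterators grow quickly, it can help to check how many items to expect before materializing them. The sketch below compares the generated counts with math.perm and math.comb, both of which require Python 3.8 or later; the item list is illustrative.

```python
import itertools
import math

items = ['a', 'b', 'c', 'd']  # illustrative input

# Order matters for permutations: ('a', 'b') and ('b', 'a') are both produced.
perms = list(itertools.permutations(items, 2))
print(len(perms), math.perm(len(items), 2))  # 12 12

# Order does not matter for combinations: only ('a', 'b') is produced.
combs = list(itertools.combinations(items, 2))
print(len(combs), math.comb(len(items), 2))  # 6 6
```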