Iterable objects and iterators
Motivation
The objective is to write more elegant code that the for(int i = 0; i < n; i++)
style of code. For instance, the following code is ugly.
A = [1, 7, 9, 3, 4]
for i in range(len(A)):
print(f"We have {A[i]} in the array")
On a 1 dans le tableau
On a 7 dans le tableau
On a 9 dans le tableau
On a 3 dans le tableau
On a 4 dans le tableau
Instead, write this:
A = [1, 7, 9, 3, 4]
for el in A:
print(f"We have {el} in the array")
We have 1 in the array
We have 7 in the array
We have 9 in the array
We have 3 in the array
We have 4 in the array
Definition
Iterators
An iterator is an object that gives you elements when pressing the next button. Like a coffee machine or a food dispenser:
Iterable objects
An iterable object is an object that can be transformed into an iterator. In for x in BLOUP
, BLOUP
should be an iterable object.
List
[1, 6, 2]
[1, 6, 2]
In Python:
- An iterable object can be transformed into an interator via
__iter__()
- An iterator is an object implementing
__next__()
.
it = [1, 6, 2].__iter__()
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__()) # raise a StopIteration error
1
6
2
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[12], line 5
3 print(it.__next__())
4 print(it.__next__())
----> 5 print(it.__next__()) # raise a StopIteration error
StopIteration:
Range
range(4)
range(0, 4)
for i in range(4):
print(i)
0
1
2
3
it = range(4).__iter__()
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__()) # raise a StopIteration error
0
1
2
3
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[17], line 6
4 print(it.__next__())
5 print(it.__next__())
----> 6 print(it.__next__()) # raise a StopIteration error
StopIteration:
Many iterable objects and iterators are lazy data structures. For instance, range(100000000)
will not create a list of 100000000 elements! It is inspired from constructions in Haskell.
Generators
A generator is a special kind of function that contains the keyword yield
. It returns an iterable object.
def squares(start, stop):
for i in range(start, stop):
yield i * i
it = squares(1, 5).__iter__()
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
1
4
9
16
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[10], line 6
4 print(it.__next__())
5 print(it.__next__())
----> 6 print(it.__next__())
StopIteration:
slots and dict
for i in squares(1, 6):
print(i)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[1], line 1
----> 1 for i in squares(1, 6):
2 print(i)
NameError: name 'squares' is not defined
def all_integers():
i = 0
while True:
yield i
i += 1
N = all_integers()
iter = N.__iter__()
print(next(iter))
print(next(iter))
0
1
Generators can also be defined with a functional construction, using parenthesis.
all_squares = (a ** 2 for a in all_integers())
N = all_squares
iter = N.__iter__()
print(next(iter))
print(next(iter))
print(next(iter))
print(next(iter))
0
1
4
9
Nest generators
Generators can be nested but not in naive way:
def f():
yield 1
yield 2
def g():
f()
yield 3
for i in g():
print(i)
3
You may write:
def f():
yield 1
yield 2
def g():
for j in f():
yield j
yield 3
for i in g():
print(i)
1
2
3
Or more elegantly:
def f():
yield 1
yield 2
def g():
yield from f()
yield 3
for i in g():
print(i)
1
2
3
Converting iterable objects into whatever you want
list(ITERABLE)
takes the iterable, get an iterator, and then press the next button until no more food gets out. All the items are then collected into a list.
list((1, 2, 3))
[1, 2, 3]
list(squares(1, 6))
[1, 4, 9, 16, 25]
list(all_integers())
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
Cell In[20], line 1
----> 1 list(all_integers())
Cell In[19], line 4, in all_integers()
2 i = 0
3 while True:
----> 4 yield i
5 i += 1
KeyboardInterrupt:
list({"a": 0, "b": 1})
['a', 'b']
set((1, 2, 3))
{1, 2, 3}
frozenset([1, 2, 3])
frozenset({1, 2, 3})
tuple([1,2, 3])
(1, 2, 3)
Enumerate
Sometimes you need the index and the element and you are stuck with: The enumerate function transforms any iterable object into a new iterable object that incorporates a counter.
fruits = ['banana', 'kiwi', 'strawberry']
for fruit in fruits:
print(f"Fruit number ?? is a {fruits[i]}")
le fruit numéro ?? est une pomme
le fruit numéro ?? est une banane
le fruit numéro ?? est une orange
So you are lazy and you write:
fruits = ['banana', 'kiwi', 'strawberry']
for i in range(len(fruits)):
fruit = fruits[i]
print(f"Fruit number {i+1} is a {fruits[i]}")
le fruit numéro 1 est une pomme
le fruit numéro 2 est une banane
le fruit numéro 3 est une orange
But you can use enumerate
that creates an interable object.
fruits = ['banana', 'kiwi', 'strawberry']
enumerate([fruits], start=1)
<enumerate at 0x7f87a626d200>
fruits = ['banana', 'kiwi', 'strawberry']
for i, fruit in enumerate(fruits, start=1):
print(f"Fruit number {i+1} is a {fruit}")
1 pomme
2 banane
3 orange
Zip
Zip consists in transforming several iterables into a single one.
In order to see the result, we put list(....)
to transform the produced iterable into a list.
fruits = ['banana', 'kiwi', 'strawberry']
prices = [2, 4, 8]
for i in len(fruits):
print(f"{fruits[i]} and costs {prices[i]}")
The fruits[i]
and prices[i]
are ugly. The solution is to zip fruits
and prices
.
fruits = ['banana', 'kiwi', 'strawberry']
prices = [2, 4, 8]
for (fruit, price) in zip(fruits, prices):
print(f"{fruit} and costs {price}")
banana and costs 2
kiwi and costs 4
strawberry and costs 8
fruits = ['banana', 'kiwi', 'strawberry']
prices = [2, 4, 8]
for i, (fruit, price) in enumerate(zip(fruits, prices), start=1):
print(f"{i}: {fruit} and costs {price}")
1: banana and costs 2
2: kiwi and costs 4
3: strawberry and costs 8
zip(["a", "b", "c", "d"], (1, 2, 3, 4))
<zip at 0x7fda498d3b40>
list(zip(["a", "b", "c", "d"], (1, 2, 3, 4)))
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
zip(["a", "b", "c", "d"], (1, 2, 3, 4), ("alpha","beta", "gamma", "delta"))
<zip at 0x7fda498c3e40>
zip
peut être défini comme ça :
def myZip(A, B):
map(lambda x, y: (x, y), A, B)
list(zip(["a", "b", "c", "d"], (1, 2, 3, 4), ("alpha","beta", "gamma", "delta")))
[('a', 1, 'alpha'), ('b', 2, 'beta'), ('c', 3, 'gamma'), ('d', 4, 'delta')]
from itertools import zip_longest
list(zip_longest(["a", "b", "c"],(1, 2, 3, 4), ("alpha","beta", "gamma", "delta"), fillvalue=0))
[('a', 1, 'alpha'), ('b', 2, 'beta'), ('c', 3, 'gamma'), (0, 4, 'delta')]
Dictionnaries
costs = {"banana": 2, "kiwi": 4, "straberry": 8}
for key in costs.keys():
print(f"{key} costs {costs[key]}")
banana costs 2
kiwi costs 4
straberry costs 8
costs = {"banana": 2, "kiwi": 4, "straberry": 8}
for key in costs:
print(f"{key} costs {costs[key]}")
banana costs 2
kiwi costs 4
straberry costs 8
costs = {"banana": 2, "kiwi": 4, "straberry": 8}
for key, cost in costs.items():
print(f"{key} costs {cost}")
banana costs 2
kiwi costs 4
straberry costs 8
Modifing a dictionnary during a loop
costs = {"banana": 2, "kiwi": 4, "straberry": 8}
for key in costs:
costs[f"{key}'"] = costs[key]*2
print(costs)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[39], line 2
1 cost = {"banana": 2, "kiwi": 4, "straberry": 8}
----> 2 for key in cost:
3 cost[f"{key}'"] = cost[key]*2
5 print(cost)
RuntimeError: dictionary changed size during iteration
costs = {"banana": 2, "kiwi": 4, "straberry": 8}
for key in list(costs):
costs[f"{key}'"] = costs[key]*2
print(costs)
{'banana': 2, 'kiwi': 4, 'straberry': 8, "banana'": 4, "kiwi'": 8, "straberry'": 16}
Counting
import itertools
itertools.count(start=10, step=2)
count(10, 2)
import itertools
itertools.cycle(["player1", "player2"])
<itertools.cycle at 0x7fda4a716d00>
import itertools
itertools.repeat("X", 4)
Permutations
itertools.permutations(ITERABLE, k)
returns an iterable object that enumerates all k-length tuples which are all possible orderings with no-repeated elements.
import itertools
list(itertools.permutations(range(4), 2))
[(0, 1),
(0, 2),
(0, 3),
(1, 0),
(1, 2),
(1, 3),
(2, 0),
(2, 1),
(2, 3),
(3, 0),
(3, 1),
(3, 2)]
import itertools
list(itertools.permutations(range(4), 4))
[(0, 1, 2, 3),
(0, 1, 3, 2),
(0, 2, 1, 3),
(0, 2, 3, 1),
(0, 3, 1, 2),
(0, 3, 2, 1),
(1, 0, 2, 3),
(1, 0, 3, 2),
(1, 2, 0, 3),
(1, 2, 3, 0),
(1, 3, 0, 2),
(1, 3, 2, 0),
(2, 0, 1, 3),
(2, 0, 3, 1),
(2, 1, 0, 3),
(2, 1, 3, 0),
(2, 3, 0, 1),
(2, 3, 1, 0),
(3, 0, 1, 2),
(3, 0, 2, 1),
(3, 1, 0, 2),
(3, 1, 2, 0),
(3, 2, 0, 1),
(3, 2, 1, 0)]
The following program computes all $n \times n$ chess configurations of $n$ non-attacking queens.
from itertools import permutations
"""
We represent a board of n-queens by an array board such that
board[r] is the column index such that there is a queen at row r and column board[r]
"""
def printBoard(board):
"""
print a board in the terminal
"""
n = len(board)
print('_' * (n+2))
for r in board:
print('|' + ' ' * r + '♛' + ' ' * (n-r-1) + '|')
print('\n')
def nqueenboards(n):
"""
A generator that generates all configurations of non-attacking n queens on a n*n boards
"""
columns = range(n)
for board in permutations(columns):
s1 = set(board[i] + i for i in columns)
s2 = set(board[i] - i for i in columns)
if n == len(s1) == len(s2):
yield board
def printAllBoards(N):
for board in nqueenboards(N):
printBoard(board)
printAllBoards(5)
_______
|♛ |
| ♛ |
| ♛|
| ♛ |
| ♛ |
_______
|♛ |
| ♛ |
| ♛ |
| ♛|
| ♛ |
_______
| ♛ |
| ♛ |
|♛ |
| ♛ |
| ♛|
_______
| ♛ |
| ♛|
| ♛ |
|♛ |
| ♛ |
_______
| ♛ |
|♛ |
| ♛ |
| ♛ |
| ♛|
_______
| ♛ |
| ♛|
| ♛ |
| ♛ |
|♛ |
_______
| ♛ |
|♛ |
| ♛ |
| ♛|
| ♛ |
_______
| ♛ |
| ♛ |
| ♛|
| ♛ |
|♛ |
_______
| ♛|
| ♛ |
| ♛ |
|♛ |
| ♛ |
_______
| ♛|
| ♛ |
|♛ |
| ♛ |
| ♛ |
Cartesian product
import itertools
list(itertools.product(["a", "b", "c"], [1, 2, 3, 4]))
[('a', 1),
('a', 2),
('a', 3),
('a', 4),
('b', 1),
('b', 2),
('b', 3),
('b', 4),
('c', 1),
('c', 2),
('c', 3),
('c', 4)]