Python Concepts/Iteration and Iterators

Objective

  • What is iteration?
  • Why is iteration important?
  • What are examples of iterables?
  • What is the difference between an iterable and an iterator?
  • What are examples of iterators?
  • May an iterator be user-written?

Lesson

Iteration is the process of moving from one member of a sequence to the next. Depending on the code that accesses the sequence, the member currently accessed may be retrieved, changed or ignored.

For iteration to be possible, the sequence or object must be "iterable."

Python's error messages tell us whether or not an object is iterable:

>>> 6 in 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument of type 'int' is not iterable
>>> 
>>> 6 in [1] ; 6 in [4,5,6,7]
False
True
>>>

A list object is iterable. The value 6 does not exist in the first list above; it exists in the second.

The built-in function next() works only on an iterator:

>>> next([1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
>>>

A list is an iterable, but it is not an iterator.
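
The same distinction can be checked programmatically. The sketch below is one way to do it, assuming the abstract base classes Iterable and Iterator from the standard collections.abc module; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]            # a list: iterable, but not an iterator
numbers_iter = iter(numbers)   # iter() produces an iterator over the list

print(isinstance(numbers, Iterable))       # True
print(isinstance(numbers, Iterator))       # False
print(isinstance(numbers_iter, Iterator))  # True
print(isinstance(6, Iterable))             # False: an int cannot be iterated
```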

Iteration is important because the concept is so elementary that it is almost impossible to write meaningful code without it.


Iterables

Many common and familiar sequences are iterables. Python's tuple() built-in function accepts an iterable as input:

>>> tuple(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>>
>>> tuple([1,2,3]) # list is iterable.
(1, 2, 3)
>>> tuple((1,2,3)) # tuple is iterable.
(1, 2, 3)
>>> tuple( {1,2,3,2,3,3} ) # set is iterable.
(1, 2, 3)
>>> tuple( '123' ) # string is iterable.
('1', '2', '3')
>>> tuple( bytes([1,2,3]) ) # bytes object is iterable.
(1, 2, 3)
>>> tuple( bytearray([1,2,3]) ) # bytearray is iterable.
(1, 2, 3)
>>> tuple ( range(2,10,3) ) # range(...) built-in is iterable.
(2, 5, 8)
>>> tuple ( {'one': 1, 'two': 2, 'three': 3} ) # dictionary is iterable.
('one', 'two', 'three')
>>>

When we say "dictionary is iterable," iteration over a dictionary means iteration over the keys of the dictionary.
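
If the values or the key/value pairs are wanted instead, the dictionary methods values() and items() return iterables over them. A short sketch, assuming Python 3.7 or later (where dictionaries preserve insertion order); the variable names are illustrative:

```python
d = {'one': 1, 'two': 2, 'three': 3}

keys = list(d)              # iterating the dictionary itself yields its keys
values = list(d.values())   # iterate over the values
pairs = list(d.items())     # iterate over (key, value) tuples

print(keys)    # ['one', 'two', 'three']
print(values)  # [1, 2, 3]
print(pairs)   # [('one', 1), ('two', 2), ('three', 3)]
```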

When an object is iterable, we expect it to support operations over iterables:

>>> 2 in {1,2,3}
True
>>> 'abc' in ' abcd '
True
>>> for p in (2,3,4) : print (p)
... 
2
3
4
>>> 2 in {'one': 1, 'two': 2, 'three': 3} 
False
>>> 'two' in {'one': 1, 'two': 2, 'three': 3}
True
>>> [ {'one': 1, 'two': 2, 'three': 3}[p] for p in {'one': 1, 'two': 2, 'three': 3} ] # Retrieving the values of the dictionary.
[1, 2, 3]
>>>

If you know exactly how many items the iterable contains, the following syntax is possible:

>>> d,e = {1,2} ; d ; e
1
2
>>>
>>> v1,v2,v3 = {'one': 1, 'two': 2, 'three': 3} ; v1 ; v2 ; v3
'one'
'two'
'three'
>>>

If the iterable contains one member:

>>> L1 = [6] ; L1
[6]
>>> L1, = [6] ; L1
6
>>>
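
When the number of items is not known exactly, a starred target can collect whatever is left over. A brief sketch with illustrative variable names; it also shows that a plain unpacking with the wrong number of targets raises ValueError:

```python
first, *rest = [1, 2, 3, 4]   # 'rest' receives the remaining items as a list
print(first)  # 1
print(rest)   # [2, 3, 4]

*head, last = 'abcd'          # the starred target may come first
print(head)   # ['a', 'b', 'c']
print(last)   # d

unpack_failed = False
try:
    d, e = [1, 2, 3]          # three items, but only two targets
except ValueError:
    unpack_failed = True
print(unpack_failed)  # True
```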

Iterators

Iteration over an iterable with a for loop always starts at the beginning. If one loop is interrupted, a later loop over the same iterable does not resume at the point of interruption; it starts again from the first member.

>>> for p in 'abcd' :
...     print(p)
...     if p == 'b' : break
... 
a
b
>>> for p in 'abcd' :
...     print(p)
... 
a
b
c
d
>>>

An iterator, by contrast, remembers its position: iteration may be interrupted and later resumed at the member immediately after the one at which the interruption occurred.

Function iter(object[, sentinel])

To create an iterator, use built-in function iter() with an iterable as argument.

>>> L1 = [1,2,3,4]
>>> it1 = iter(L1)
>>> it1
<list_iterator object at 0x101a95a90>
>>> v0 = next(it1) ; v0
1
>>> v1 = next(it1) ; v1
2
>>> v2 = next(it1) ; v2
3
>>> v3 = next(it1) ; v3
4
>>> v4 = next(it1) ; v4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
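
The heading above mentions an optional sentinel argument. In the two-argument form, the first argument must be a callable taking no arguments; iter() calls it repeatedly and stops as soon as it returns the sentinel. A minimal sketch, assuming an in-memory text stream from the standard io module as the data source:

```python
import io

stream = io.StringIO('alpha\nbeta\n\ngamma\n')

# iter(callable, sentinel): call stream.readline() repeatedly
# and stop as soon as it returns the sentinel '\n' (an empty line).
lines = list(iter(stream.readline, '\n'))
print(lines)  # ['alpha\n', 'beta\n']
```

The sentinel itself is not included in the result, and the stream is left positioned just after the empty line.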

The built-in function next() accepts an optional default argument. If provided, it is returned instead of raising StopIteration when the iterator is exhausted.

>>> L1 = [1,2,3]
>>> it1 = iter(L1)
>>> v0 = next(it1) ; v0
1
>>> v1 = next(it1) ; v1
2
>>> v2 = next(it1, None) ; v2
3
>>> v3 = next(it1, None) ; v3
>>>

An iterator can behave like an iterable:

>>> L1 = [1,2,3]
>>> it1 = iter(L1)
>>> for p in it1 : print(p)
... 
1
2
3
>>> 
>>> L1 = list(range(8)) ; L1
[0, 1, 2, 3, 4, 5, 6, 7]
>>> it1 = iter(L1)
>>> 5 in it1 # Equivalent to an interruption.
True
>>> for p in it1 : print(p)
... 
6            # Execution resumes after the interruption.  
7
>>>
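
This works because calling iter() on an iterator returns the iterator itself, so a for loop simply continues from the current position. The sketch below illustrates this and spells out, roughly, what a for loop does behind the scenes; the names it2 and seen are illustrative only.

```python
L1 = [1, 2, 3]
it1 = iter(L1)

print(iter(it1) is it1)  # True: an iterator returns itself from iter()
print(iter(L1) is L1)    # False: a list returns a new iterator each time

# Roughly what "for p in L1: seen.append(p)" does behind the scenes.
seen = []
it2 = iter(L1)
while True:
    try:
        p = next(it2)
    except StopIteration:   # raised when the iterator is exhausted
        break
    seen.append(p)

print(seen)  # [1, 2, 3]
```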

An iterator may be re-initialized at any time by calling iter() again.

>>> L1 = [1,2,3]
>>> it1 = iter(L1)
>>> next(it1)
1
>>> next(it1)
2
>>> it1 = iter(L1)
>>> next(it1)
1
>>> next(it1)
2
>>> next(it1)
3
>>> next(it1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

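After initialization, the name of the original iterable may be rebound to a new object without affecting the iterator, because the iterator keeps its own reference to the original list. (Modifying the original list in place is different: such changes are visible to the iterator.)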

>>> L1 = [1,2,3]
>>> it1 = iter(L1)
>>> next(it1)
1
>>> next(it1)
2
>>> L1 = [1,2,4,5]
>>> next(it1)
3
>>> next(it1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
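
The example above rebinds the name L1 to a new list, so the iterator still refers to the original list. Changing the original list in place is a different matter, as the following sketch suggests for a list iterator in CPython:

```python
L1 = [1, 2, 3]
it1 = iter(L1)
first = next(it1)    # 1
second = next(it1)   # 2

L1.append(4)         # mutate the SAME list object in place
L1[2] = 30           # change an item that has not been visited yet

remaining = list(it1)
print(remaining)     # [30, 4]
```

For this reason it is usually safer not to modify a list while an iterator over it is still in use.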

Examples of iterators:

>>> it1 = iter( range(2) )
>>> next(it1, None) ; next(it1, None) ; next(it1, None) ;
0
1
>>> 
>>> it1 = iter( zip('abcd', '1234') )
>>> next(it1, None) ; next(it1, None) ; next(it1, None) ; next(it1, None) ; next(it1, None) ;
('a', '1')
('b', '2')
('c', '3')
('d', '4')
>>> 
>>> import re
>>> it1 = iter( re.finditer(r'\w+', ' abc DEF 123 ') )
>>> next(it1, 'None') ; next(it1, 'None') ; next(it1, 'None') ; next(it1, 'None') ; next(it1, 'None') ; 
<_sre.SRE_Match object; span=(1, 4), match='abc'>
<_sre.SRE_Match object; span=(5, 8), match='DEF'>
<_sre.SRE_Match object; span=(9, 12), match='123'>
'None'
'None'
>>>
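
Several other built-in functions also return iterators. The sketch below looks at enumerate(), map() and reversed(); the variable names are illustrative.

```python
letters = ['a', 'b', 'c']

it_enum = enumerate(letters)       # yields (index, item) pairs
it_map = map(str.upper, letters)   # applies a function lazily
it_rev = reversed(letters)         # walks the list backwards

print(next(it_enum))   # (0, 'a')
print(next(it_map))    # A
print(next(it_rev))    # c

# Each object is its own iterator, so iter() returns the same object.
print(all(iter(it) is it for it in (it_enum, it_map, it_rev)))  # True
```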

Generator objects as iterators

Derived from generator expression

We are familiar with list comprehensions "listcomps":

>>> [ p.upper() for p in ('abc',' d ','xyz') ]
['ABC', ' D ', 'XYZ']
>>>

and also with set comprehensions ("setcomps"):

>>> {p+1 for p in (1,2,3,4,3,4,5)}
{2, 3, 4, 5, 6}
>>>

A generator expression has the syntax of a listcomp written within parentheses. It may look like a "tuple comprehension," but it produces a generator object, not a tuple.

>>> line_list = ['  line 1\n', 'line 2  \n', ' line 3   \n']
>>> 
>>> stripped_iter = (line.strip() for line in line_list) # Syntax of listcomp, but within parentheses '()'.
>>> 
>>> stripped_iter 
<generator object <genexpr> at 0x101a92620> # generator object derived from generator expression.
>>>

A generator object may be used as an iterable.

>>> list(( (line, len(line)) for line in line_list ))
[('  line 1\n', 9), ('line 2  \n', 9), (' line 3   \n', 11)]
>>> 
>>> tuple(( (line, len(line)) for line in line_list ))
(('  line 1\n', 9), ('line 2  \n', 9), (' line 3   \n', 11))
>>> 
>>> for p in ( (line, len(line)) for line in line_list ) : print (p)
... 
('  line 1\n', 9)
('line 2  \n', 9)
(' line 3   \n', 11)
>>>

A generator object may be used as an iterator.

>>> next( stripped_iter )
'line 1'
>>> next( stripped_iter )
'line 2'
>>> next( stripped_iter )
'line 3'
>>> next( stripped_iter )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
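
As the StopIteration above shows, a generator object can be traversed only once. The sketch below contrasts this with a list built by a listcomp, which can be traversed repeatedly; the names squares and squares_list are illustrative.

```python
squares = (n * n for n in range(5))

print(sum(squares))    # 30: consumes 0, 1, 4, 9, 16
print(list(squares))   # []: the generator is now exhausted

# A list comprehension builds the whole list, which can be reused.
squares_list = [n * n for n in range(5)]
print(sum(squares_list), sum(squares_list))  # 30 30
```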

As with listcomps, conditions may be added.

>>> stripped_iter = (line.strip() for line in line_list if '2' not in line)
>>> next( stripped_iter, None ) ; next( stripped_iter, None ) ; next( stripped_iter, None ) ; 
'line 1'
'line 3'
>>> 
>>> stripped_iter = (line.strip().upper() for line in line_list if ('2' in line) or (len(line) >= 10))
>>> 
>>> for p in stripped_iter : print (p)
... 
LINE 2
LINE 3
>>>

Derived from generator

Generators are special functions that simplify the task of writing iterators. Regular functions compute a value and return it, but generators return an iterator that returns a stream of values.
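
Before the larger example, a minimal sketch of the idea may help. The function countdown below is a hypothetical example, not part of Python's library:

```python
def countdown(n):
    """Hypothetical example generator: yield n, n-1, ..., 1."""
    while n > 0:
        yield n      # hand one value to the caller and pause here
        n -= 1       # execution resumes here on the next call to next()

go = countdown(3)    # calling the function returns a generator object
print(next(go))      # 3
print(next(go))      # 2
print(list(go))      # [1]
```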

Suppose that the method re.finditer(reg_exp, string) does not exist. You want to create an iterator that will iterate over string and return all the substrings that match reg_exp.

import re

# The string to search. The second line is only a column ruler.
s1 = '''The quick, brown fox jum......
#####.....#####.....#####.....'''

def words_iter(flag=None) :
    # Reads the module-level names s1 (the string) and
    # word (the compiled pattern), both defined outside this function.
    start = 0

    while True :
        m = word.search(s1, start)
        if m is None : return # No further match: the iterator is exhausted.

        if flag :
            yield m
            # 'yield' statement identifies this function as
            # generator function.
            # m is returned to caller.
            # Execution of code in function words_iter is suspended.
            # All local variables are preserved.
            # On the next call to next(), execution resumes
            # immediately after 'yield' statement.
        else :
            yield m[0] # Another 'yield' statement.

        start = m.span()[1]

word = re.compile(r'\w+\s+\w+')

go1 = words_iter(1) # generator object 1, flag supplied.

print ('go1 =', go1, '# generator object derived from function words_iter.')
for m in go1: print (m)

Output:

go1 = <generator object words_iter at 0x1019dfd00> # generator object derived from function words_iter.
<_sre.SRE_Match object; span=(0, 9), match='The quick'>
<_sre.SRE_Match object; span=(11, 20), match='brown fox'>

With a different pattern and no flag, the generator yields the matched strings:

word = re.compile(r'\w+')

go1 = words_iter() # generator object 1, flag not supplied.

while True :
    m = next(go1, None)
    if m is None : break
    print (m)

Output:

The
quick
brown
fox
jum

A generator object also supports the "in" operator:

go1 = words_iter() # generator object 1, flag not supplied.
s2 = "'brown' in go1"
print (
       "{}: {}".format( s2,
                        eval(s2)
                      )
      )

Output:

'brown' in go1: True
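
An iterator may also be user-written as a class, without a generator. The sketch below is one possible class-based counterpart of the word generator; the class name WordIterator and its attributes are hypothetical. A class is an iterator if it provides the methods __iter__() (returning the object itself) and __next__() (returning the next value or raising StopIteration).

```python
import re

class WordIterator:
    """Hypothetical class-based equivalent of a word generator."""

    def __init__(self, pattern, text):
        self.pattern = re.compile(pattern)
        self.text = text
        self.start = 0               # position at which the next search begins

    def __iter__(self):
        return self                  # an iterator returns itself

    def __next__(self):
        m = self.pattern.search(self.text, self.start)
        if m is None:
            raise StopIteration      # signals that the iterator is exhausted
        self.start = m.end()         # remember where to resume
        return m.group(0)

wi = WordIterator(r'\w+', 'The quick, brown fox')
print(next(wi))   # The
print(list(wi))   # ['quick', 'brown', 'fox']
```

The generator version is shorter because Python preserves the local variables between calls; the class has to store its position explicitly in self.start.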

An endless, but controllable generator

At times it's convenient to have a generator that provides endless iteration but also terminates on command. The code below accomplishes this by communicating with the generator via the generator's .close() method.

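def counter(count=0):
    if not isinstance(count, int) :
        exit (99) # Reject a non-integer starting value.
    while True :
        count += 1
        status = 0
        try:
            yield count
        except GeneratorExit :
            # Raised at the 'yield' when the caller invokes .close().
            status = 98
        except :
            # Any other exception thrown into the generator.
            status = 97

        if status == 97 :
            exit (97)
        if status == 98 :
            return # Finish cleanly; the caller's loop then ends.

it = counter(3)
for count in it :
    print ( 'count = {}'.format(count) )
    if count == 7 :
        v = it.close() # Ask the generator to terminate.
        print ('v =', v)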

Output:

count = 4
count = 5
count = 6
count = 7
v = None

Endless generator in a listcomp

Listcomps accept free-format Python. Modify the syntax of the loop and it fits in a listcomp. The following example mimics str.lstrip().

s1 = '    abcd  123   '

L2 = [
s1[start:]

for it in (counter(-1),)
for start in it
if s1[start:start+1] != ' '
for v in (it.close(),)
]

print ('L2 =', L2)

Output:

L2 = ['abcd  123   ']
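
The standard library offers ready-made building blocks for the same purposes. The sketch below is an alternative, not a replacement for the code above: it assumes the functions count(), islice() and dropwhile() from the itertools module, and the names first_four and stripped are illustrative.

```python
from itertools import count, dropwhile, islice

# count(4) is an endless iterator: 4, 5, 6, ...
# islice() stops it after a fixed number of items.
first_four = list(islice(count(4), 4))
print(first_four)       # [4, 5, 6, 7]

# dropwhile() skips leading items while the test is true,
# which is another way to mimic str.lstrip() for spaces.
s1 = '    abcd  123   '
stripped = ''.join(dropwhile(lambda ch: ch == ' ', s1))
print(repr(stripped))   # 'abcd  123   '
```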

Assignments


Enhanced range(....) constructor

Python's range(start, stop[, step]) constructor accepts only integers as input.

>>> list(range(-3,17,5))
[-3, 2, 7, 12]
>>>

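Produce an enhanced version of range(...) that is an iterator and accepts non-integer arguments, something like:

>>> list(range(-23/11, 2.9, 5/7))

The built-in range(...) raises TypeError for this call. One approach, shown in the session that follows, scales the three arguments by a common factor (here 770) so that they become integers, iterates over the integer range, and divides each value by the same factor:
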
>>> a=-23/11 ; b=2.9; c=5/7 ; a;b;c
-2.090909090909091
2.9
0.7142857142857143
>>> 
>>> d,e,f = [r for p in (a,b,c) for q in (p*770,) for r in (int(q),) if r==q]
>>> d;e;f
-1610
2233
550
>>> go1 = (p/770 for p in range(d,e,f))
>>> 
>>> next(go1,'None');next(go1,'None');next(go1,'None');next(go1,'None');
-2.090909090909091
-1.3766233766233766
-0.6623376623376623
0.05194805194805195
>>> next(go1,'None');next(go1,'None');next(go1,'None');next(go1,'None');
0.7662337662337663
1.4805194805194806
2.1948051948051948
'None'
>>>
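
Another possible approach is a generator function that works with exact fractions, which avoids having to choose a scaling factor by hand. This is only a sketch: the name frange is hypothetical, and the function assumes that every argument can be converted by fractions.Fraction from the standard library.

```python
from fractions import Fraction

def frange(start, stop, step):
    """Hypothetical helper: yield start, start+step, ... while short of stop."""
    start, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    if step == 0:
        raise ValueError("step must not be zero")
    n = 0
    while True:
        value = start + n * step          # exact arithmetic, no rounding drift
        if (step > 0 and value >= stop) or (step < 0 and value <= stop):
            return                        # end of the range: stop iterating
        yield float(value)
        n += 1

values = list(frange(Fraction(-23, 11), Fraction(29, 10), Fraction(5, 7)))
print(len(values))   # 7
print(values[0])     # -2.090909090909091
```

Note that a float argument such as 2.9 is converted to the exact binary value it stores, which is very slightly different from 29/10; passing Fraction objects or strings such as '2.9' avoids this.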

Further Reading or Review

References

1. Python's documentation:

"Iterators," "Data Types That Support Iterators," "Generator expressions and list comprehensions," "Generators"


2. Python's methods:

"str.lstrip([chars])"


3. Python's built-in functions:

"iter(object[, sentinel])," "next(iterator[, default])," "tuple([iterable])," "range(start, stop[, step])"