Objective

edit
 
  • Learn about Python sets.
  • Learn how to dynamically manipulate sets.
  • Learn about set math and comparison.
  • Learn about built-in set functions.
  • Learn when to use sets and when not to.

Lesson

edit

Python Sets

edit

Sets are mutable sequences, like lists. However, sets and lists differ. Unlike lists, you cannot use append() nor can you index or slice. Although the set has limitations, it has two advantages. The set can only contain unique items, so if there are two or more items the same, all but one will be removed. This can help get rid of duplicates. A set is an unordered collection with no duplicate elements. (The technical definition: A set object is an unordered collection of distinct hashable objects. ) Secondly, sets can perform set mathematics. This makes Python sets much like mathematical sets. To create a set, use curly braces ({}). To create an empty set you have to use set().

>>> spam = {1, 2, 3}
>>> spam
{1, 2, 3}
>>> eggs = {1, 2, 1, 3, 5, 2, 7, 3, 4}
>>> eggs
{1, 2, 3, 4, 5, 7}    # each object unique
>>> {True, False, True, False, True}
{False, True}
>>> {"hi", "hello", "hey", "hi", "hiya", "sup"}
{'hey', 'sup', 'hi', 'hello', 'hiya'}
>>>
>>> a = {} ; a
{}
>>> isinstance(a,set)
False
>>>
>>> a = set() ; a # to create empty set.
set()
>>> isinstance(a,set)
True
>>>


Operations on a single set

edit

Initialize the set:

edit
>>> b = set('alacazam') ; b
{'z', 'l', 'm', 'c', 'a'}
>>> 
>>> b = {'alacazam'} ; b
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan',  'plum', 'peach', 'pecan',  'plum', 'peach'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan', 'plum', 'peach', 'pecan', 'plum', 'peach']
>>> b = set(d) ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>>

Familiar operations:

edit
>>> isinstance(b,set)
True
>>> len(b)
5
>>> 'apple' in b
True
>>> 'grape' in b
False
>>> 
>>> 'grape' not in b
True
>>> 
>>> for x in b : print ( x[0:3] ) # for x in set :
... 
pea
pec
pea
plu
app
>>> 
>>> f = b # A shallow copy.
>>> f == b
True
>>> f is b
True
>>> f = set(b) # A deep copy.
>>> f == b
True
>>> f is b
False
>>>

Operations available for set:

edit
>>> b = set() ; b.add('alacazam') ; b # add element 'alacazam' to set b.
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan']
>>> for c in d : b.add(c) ; b
... 
{'pear', 'apple', 'plum'} # 'apple' was added
{'pear', 'apple', 'plum'} # 'pear' was not added.
{'pear', 'apple', 'plum'} # 'plum' was not added.
{'peach', 'pear', 'apple', 'plum'} # 'peach' was added
{'peach', 'pecan', 'pear', 'plum', 'apple'} # 'pecan' was added. ordering not same as list d.
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> b.clear() ; b # remove all elements from set b
set()
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> a = b.pop() ; a ; b # Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
'peach'
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('grape') ; b # Remove element 'grape' from set b if element is present.
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('pear') ; b
{'pecan', 'plum', 'apple'}
>>> 
>>> b.remove('apple') ; b # Remove element 'apple' from set b. Raises KeyError if element is not contained in the set.
{'pecan', 'plum'}
>>>

Set comprehensions

edit

Similarly to list comprehensions, set comprehensions are also supported:

>>> {x*x%7   for x in range(-234,79)}
{0, 1, 2, 4}
>>>
>>> a = {x for x in 'abracadabra' if x in 'abcrmgz'} ; a
{'b', 'a', 'c', 'r'}
>>>

Operations on two sets

edit

set.isdisjoint(other)

edit

Return True if set set has no elements in common with other set. Sets are disjoint if and only if their intersection is the empty set.

>>> set1 = {'pecan', 'pear', 'plum', 'apple'}
>>> set2 = {'pecan', 'pear', 'orange', 'mandarin'}
>>> set3 = {'grape', 'watermelon', 'orange', 'mandarin'}
>>> set1.isdisjoint(set2)
False
>>> set1.isdisjoint(set3)
True
>>> 
>>> {'a', 'b', 'c'}.isdisjoint( {'a', 'd', 'e'} )
False
>>> {'a', 'b', 'c'}.isdisjoint( {'z', 'd', 'e'} )
True
>>>

set.issubset(other)

edit

Test whether every element in set set is in other. Equivalent to set <= other.

>>> {'a', 'b', 'c'}.issubset( {'a', 'b', 'c', 'd'} )
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'c'] ) # this form accepts iterable for 'other'.
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'd'] )
False
>>> {'a', 'b', 'c'}.issubset( 'abd' )
False
>>> {'a', 'b', 'c'}.issubset( 'abcdef' )
True
>>> 
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c'} # In this form both arguments are sets.
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c', 'd'}
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'd'}
False
>>> 
>>> {'a', 'b', 'c'} < {'a', 'b', 'c'}
False
>>> {'a', 'b', 'c'} < {'a', 'b', 'c', 'd'} # set is a proper subset of other
True
>>> {'a', 'b', 'c'} < {'a', 'b', 'd'}
False
>>>

Symmetric difference

edit

newSet = set.symmetric_difference(other).

Return a new set with elements in either set or other but not both.

>>> newSet = {'a', 'b', 'c', 'g'}.symmetric_difference( {'a', 'b', 'h', 'i'} ) ; newSet
{'i', 'c', 'h', 'g'}
>>> {'a', 'b', 'c', 'g'}.symmetric_difference( 'abcdef' ) # this form accepts iterable for 'other'.
{'e', 'f', 'd', 'g'}
>>> {'a', 'b', 'c', 'g'} ^ {'a', 'b', 'h', 'i'} # In this form both arguments are sets.
{'i', 'c', 'h', 'g'}
>>>

Operations on two or more sets

edit

Union

edit

newSet = set.union(*others).

Return a new set with elements from set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.union( {'b', 'h'},'abcz', [1,2,3] ) ; newSet # iterable as argument
{'b', 1, 'z', 'c', 2, 'h', 3, 'g', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | set([1,2,3,4]) ; newSet # operands must be sets.
{'b', 1, 2, 'c', 3, 'h', 4, 'g', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | [1,2,3,4] ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
>>>

Intersection

edit

newSet = set.intersection(*others).

Return a new set with elements common to set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg' ) ; newSet # iterable as argument
{'b', 'g'}
>>
>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg','p', 'q', 's', 'b', 'g' ) ; newSet
set()
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'}  ; newSet # operands must be sets
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & set(('g', 'b', 'z', 'm')) ; newSet
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & ('g', 'b', 'z', 'm') ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'set' and 'tuple'
>>>

Difference

edit

newSet = set.difference(*others).

Return a new set with elements in set that are not in others.

>>> newSet = {'a', 'b', 'c', 'g','q', 'x'}.difference( {'g', 'b', 'h'},'bczg' ) ; newSet # iterable as argument
{'q', 'x', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} -  {'g', 'b', 'h'} - set('bczg')  ; newSet # operands are sets
{'a', 'q', 'x'}
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} - set( {'g', 'b', 'h'} | set('bczg') )  ; newSet
{'q', 'x', 'a'} # same as above
>>> 
>>> {'a', 'q', 'x'} == {'q', 'x', 'a'}
True
>>>

Assignments

edit
 

String str1 contains the names of all 50 states of the United States of America with some duplicates and extraneous white space. Use sets, including set comprehensions, to determine the one letter that does not appear in the name of any state.

str1 = '''                                                                                                                                       
  Indiana , Kentucky , Nebraska , California , Oregon , Washington , Hawaii , Alaska ,                                                           
  Arizona , Utah , Nevada , Idaho , New Mexico , Colorado , Wyoming , Montana , Texas                                                            
 , Oklahoma , Kansas , Nebraska , South Dakota , North D , Louisiana , Ark , Missouri ,                                                          
 Iowa , Illinois , Minnesota , Michigan , Mississippi , Tennessee , Alabama ,                                                                    
Ohio,West V , Virginia , Michigan , Florida , Georgia , S Carolina ,  ,    N C ,                                                                 
      , Maryland , Delaware , New Jersey , New York , Pennsylvania , Vermont , New  Hampshire                                                    
   , Maine , Connecticut ,         , Rhode Island , Massachussetts , Maine , Wisconsin ,                                                         
'''

Why are the letters "N C" sufficient for "North Carolina"?

One Possible Solution

Further Reading or Review

edit
  Completion status: this resource is just getting off the ground. Please feel welcome to help!

References

edit

1. Python's documentation:

"Sets", "Set Types — set, frozenset", "Displays for lists, sets and dictionaries", "Set displays"


2. Python's methods:


3. Python's built-in functions:

"class set(....)"