Python Concepts/Sets

Objective

 Learn about Python sets. Learn how to dynamically manipulate sets. Learn about set math and comparison. Learn about built-in set functions. Learn when to use sets and when not to.

Lesson

Python Sets

Sets are mutable collections, like lists, but they differ from lists in important ways. A set is not a sequence: it has no `append()` method, and it cannot be indexed or sliced. In exchange for these limitations it offers two advantages.

First, a set contains only unique items: if the same item is supplied two or more times, only one copy is kept. This makes sets a convenient way to get rid of duplicates. (The technical definition: a set object is an unordered collection of distinct hashable objects.)

Second, sets can perform set mathematics, which makes Python sets behave much like mathematical sets.

To create a set, list its elements inside curly braces (`{}`). To create an empty set you have to use `set()`, because empty braces create an empty dictionary.

```
>>> spam = {1, 2, 3}
>>> spam
{1, 2, 3}
>>> eggs = {1, 2, 1, 3, 5, 2, 7, 3, 4}
>>> eggs
{1, 2, 3, 4, 5, 7}    # each object unique
>>> {True, False, True, False, True}
{False, True}
>>> {"hi", "hello", "hey", "hi", "hiya", "sup"}
{'hey', 'sup', 'hi', 'hello', 'hiya'}
>>>
>>> a = {} ; a # empty braces create an empty dict, not a set.
{}
>>> isinstance(a,set)
False
>>>
>>> a = set() ; a # to create empty set.
set()
>>> isinstance(a,set)
True
>>>
```
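The requirement that set elements be hashable can be illustrated with a short sketch. The names `words`, `unique_words`, `points`, and `rejected` are illustrative and do not come from the lesson.

```python
# Converting a list to a set drops the duplicates.
words = ["hi", "hello", "hey", "hi", "hiya", "sup", "hello"]
unique_words = set(words)
print(len(words), len(unique_words))   # 7 5

# Immutable (hashable) objects such as tuples can be set elements.
points = {(0, 0), (1, 2), (0, 0)}
print(len(points))                     # 2

# Mutable objects such as lists are unhashable and are rejected.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as error:
    rejected = True
    print("TypeError:", error)
else:
    rejected = False
```

Because a set has no defined order, the example compares sizes rather than relying on the order in which elements are displayed.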

Operations on a single set

Initialize the set:

```
>>> b = set('alacazam') ; b
{'z', 'l', 'm', 'c', 'a'}
>>>
>>> b = {'alacazam'} ; b
{'alacazam'}
>>>
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan',  'plum', 'peach', 'pecan',  'plum', 'peach'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan', 'plum', 'peach', 'pecan', 'plum', 'peach']
>>> b = set(d) ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>>
```

Familiar operations:

```
>>> isinstance(b,set)
True
>>> len(b)
5
>>> 'apple' in b
True
>>> 'grape' in b
False
>>>
>>> 'grape' not in b
True
>>>
>>> for x in b : print(x[0:3]) # iterate over the elements of the set
...
pea
pec
pea
plu
app
>>>
>>> f = b # Not a copy: f is a second name for the same set.
>>> f == b
True
>>> f is b
True
>>> f = set(b) # A shallow copy: a new set containing the same elements.
>>> f == b
True
>>> f is b
False
>>>
```
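The difference between binding a second name to a set and copying it becomes visible once the original set is modified. The following is a minimal sketch; the names `alias` and `clone` are illustrative.

```python
b = {'pear', 'plum'}

alias = b          # a second name for the same set object
clone = b.copy()   # a new set with the same elements; set(b) has the same effect

b.add('apple')     # modify the original

print(alias is b, 'apple' in alias)   # True True
print(clone is b, 'apple' in clone)   # False False
```

The change made through `b` is visible through `alias` but not through `clone`.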

Operations available for set:

```
>>> b = set() ; b.add('alacazam') ; b # add element 'alacazam' to set b.
{'alacazam'}
>>>
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan']
>>> for c in d : b.add(c) ; b
...
{'pear', 'apple', 'plum'} # 'apple' was added
{'pear', 'apple', 'plum'} # 'pear' was not added.
{'pear', 'apple', 'plum'} # 'plum' was not added.
{'peach', 'pear', 'apple', 'plum'} # 'peach' was added
{'peach', 'pecan', 'pear', 'plum', 'apple'} # 'pecan' was added. ordering not same as list d.
>>>
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> b.clear() ; b # remove all elements from set b
set()
>>>
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> a = b.pop() ; a ; b # Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
'peach'
{'pecan', 'pear', 'plum', 'apple'}
>>>
>>> b.discard('grape') ; b # Remove element 'grape' from set b if element is present.
{'pecan', 'pear', 'plum', 'apple'}
>>>
>>> b.discard('pear') ; b # Remove element 'pear' from set b.
{'pecan', 'plum', 'apple'}
>>>
>>> b.remove('apple') ; b # Remove element 'apple' from set b. Raises KeyError if element is not contained in the set.
{'pecan', 'plum'}
>>>
```
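The three removal methods differ in how they treat a missing element or an empty set. A small sketch of the error cases, using the illustrative names `missing` and `pop_failed`:

```python
b = {'pecan', 'plum'}

b.discard('grape')        # absent element: silently ignored
print(b == {'pecan', 'plum'})

try:
    b.remove('grape')     # absent element: raises KeyError
except KeyError as error:
    missing = error.args[0]
    print("not in set:", missing)

b.clear()
try:
    b.pop()               # empty set: raises KeyError
except KeyError:
    pop_failed = True
    print("cannot pop from an empty set")
```

Use `discard()` when a missing element is acceptable, and `remove()` when a missing element should be treated as an error.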

Set comprehensions

Similarly to list comprehensions, set comprehensions are also supported:

```
>>> {x*x%7   for x in range(-234,79)}
{0, 1, 2, 4}
>>>
>>> a = {x for x in 'abracadabra' if x in 'abcrmgz'} ; a
{'b', 'a', 'c', 'r'}
>>>
```
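To compare the two kinds of comprehension side by side, the sketch below applies the same expression inside square brackets and inside curly braces. The list `fruits` and the result names are illustrative.

```python
fruits = ['apple', 'pear', 'plum', 'peach', 'pecan']

# List comprehension: keeps every result, in order.
first_letters_list = [name[0] for name in fruits]

# Set comprehension: keeps one copy of each distinct result.
first_letters_set = {name[0] for name in fruits}

print(first_letters_list)          # ['a', 'p', 'p', 'p', 'p']
print(sorted(first_letters_set))   # ['a', 'p']
```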

Operations on two sets

`set.isdisjoint(other)`

Return `True` if set `set` has no elements in common with `other`. Sets are disjoint if and only if their intersection is the empty set.

```
>>> set1 = {'pecan', 'pear', 'plum', 'apple'}
>>> set2 = {'pecan', 'pear', 'orange', 'mandarin'}
>>> set3 = {'grape', 'watermelon', 'orange', 'mandarin'}
>>> set1.isdisjoint(set2)
False
>>> set1.isdisjoint(set3)
True
>>>
>>> {'a', 'b', 'c'}.isdisjoint( {'a', 'd', 'e'} )
False
>>> {'a', 'b', 'c'}.isdisjoint( {'z', 'd', 'e'} )
True
>>>
```
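The statement that sets are disjoint exactly when their intersection is empty can be checked directly. This sketch reuses the three sets from the example above and uses the `&` (intersection) operator described later in this lesson.

```python
set1 = {'pecan', 'pear', 'plum', 'apple'}
set2 = {'pecan', 'pear', 'orange', 'mandarin'}
set3 = {'grape', 'watermelon', 'orange', 'mandarin'}

for other in (set2, set3):
    common = set1 & other
    # The two tests always agree.
    print(set1.isdisjoint(other), len(common) == 0, sorted(common))
```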

`set.issubset(other)`

Test whether every element in set `set` is in `other`. Equivalent to `set <= other`.

```
>>> {'a', 'b', 'c'}.issubset( {'a', 'b', 'c', 'd'} )
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'c'] ) # this form accepts iterable for 'other'.
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'd'] )
False
>>> {'a', 'b', 'c'}.issubset( 'abd' )
False
>>> {'a', 'b', 'c'}.issubset( 'abcdef' )
True
>>>
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c'} # In this form both arguments are sets.
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c', 'd'}
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'd'}
False
>>>
>>> {'a', 'b', 'c'} < {'a', 'b', 'c'}
False
>>> {'a', 'b', 'c'} < {'a', 'b', 'c', 'd'} # set is a proper subset of other
True
>>> {'a', 'b', 'c'} < {'a', 'b', 'd'}
False
>>>
```
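The relationship between `<=`, `<`, and `issubset()` can be summarized in a few lines. This is a sketch with the illustrative names `small`, `big`, and `operator_rejects_list`.

```python
small = {'a', 'b', 'c'}
big = {'a', 'b', 'c', 'd'}

# Every set is a subset of itself, but not a proper subset of itself.
print(small <= small, small < small)   # True False

# A proper subset is a subset that is not equal to the other set.
print((small < big) == (small <= big and small != big))   # True

# The method accepts any iterable; the operator requires a set on both sides.
print(small.issubset(['a', 'b', 'c']))   # True
try:
    small <= ['a', 'b', 'c']
except TypeError:
    operator_rejects_list = True
    print("<= is not supported between a set and a list")
```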

Symmetric difference

`newSet = set.symmetric_difference(other)`

Return a new set with elements in either `set` or `other` but not both.

```
>>> newSet = {'a', 'b', 'c', 'g'}.symmetric_difference( {'a', 'b', 'h', 'i'} ) ; newSet
{'i', 'c', 'h', 'g'}
>>> {'a', 'b', 'c', 'g'}.symmetric_difference( 'abcdef' ) # this form accepts iterable for 'other'.
{'e', 'f', 'd', 'g'}
>>> {'a', 'b', 'c', 'g'} ^ {'a', 'b', 'h', 'i'} # In this form both arguments are sets.
{'i', 'c', 'h', 'g'}
>>>
```
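"In either set but not both" can be expressed in terms of the union (`|`), intersection (`&`), and difference (`-`) operators described in the sections that follow. A sketch, with illustrative variable names:

```python
a = {'a', 'b', 'c', 'g'}
b = {'a', 'b', 'h', 'i'}

either_not_both = (a | b) - (a & b)   # everything, minus what is shared
two_differences = (a - b) | (b - a)   # only in a, plus only in b

print(a ^ b == either_not_both == two_differences)   # True
print(a ^ b == b ^ a)                                # True: the order of operands does not matter
```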

Operations on two or more sets

Union

`newSet = set.union(*others)`

Return a new set with elements from `set` and all `others`.

```
>>> newSet = {'a', 'b', 'c', 'g'}.union( {'b', 'h'},'abcz', [1,2,3] ) ; newSet # iterable as argument
{'b', 1, 'z', 'c', 2, 'h', 3, 'g', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | set([1,2,3,4]) ; newSet # operands must be sets.
{'b', 1, 2, 'c', 3, 'h', 4, 'g', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | [1,2,3,4] ; newSet
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
>>>
```
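Because `union()` returns a new set, the sets it is applied to are left unchanged. A brief sketch with the illustrative names `vowels`, `more`, and `combined`:

```python
vowels = {'a', 'e'}
more = {'i', 'o', 'u'}

# 'y' is a string, so it is treated as an iterable of characters.
combined = vowels.union(more, 'y')

print(sorted(combined))   # ['a', 'e', 'i', 'o', 'u', 'y']
print(sorted(vowels))     # ['a', 'e']  (the original set is unchanged)
```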

Intersection

`newSet = set.intersection(*others)`

Return a new set with elements common to `set` and all `others`.

```
>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg' ) ; newSet # iterable as argument
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg','p', 'q', 's', 'b', 'g' ) ; newSet
set()
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'}  ; newSet # operands must be sets
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & set(('g', 'b', 'z', 'm')) ; newSet
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & ('g', 'b', 'z', 'm') ; newSet
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'set' and 'tuple'
>>>
```
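Each argument to `intersection()` is a separate iterable, and a string argument is an iterable of single characters. The sketch below, using the illustrative sets `letters` and `words`, shows why passing several one-character strings yields an empty result.

```python
letters = {'a', 'b', 'c', 'g'}

# One string argument supplies the characters 'b' and 'g'.
print(sorted(letters.intersection('bg')))   # ['b', 'g']

# Two arguments: 'b' and 'g' share no element, so nothing is common to all.
print(letters.intersection('b', 'g'))       # set()

words = {'pear', 'plum'}

# The string 'pear' is compared character by character, not as a word.
print(words.intersection('pear'))           # set()

# Wrap the word in a list to treat it as a single element.
print(words.intersection(['pear']))         # {'pear'}
```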

Difference

`newSet = set.difference(*others)`

Return a new set with elements in `set` that are not in `others`.

```
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'}.difference( {'g', 'b', 'h'},'bczg' ) ; newSet # iterable as argument
{'q', 'x', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} -  {'g', 'b', 'h'} - set('bczg')  ; newSet # operands are sets
{'a', 'q', 'x'}
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} - set( {'g', 'b', 'h'} | set('bczg') )  ; newSet
{'q', 'x', 'a'} # same as above
>>>
>>> {'a', 'q', 'x'} == {'q', 'x', 'a'}
True
>>>
```
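The three forms shown above give the same result, while swapping the operands of a difference generally does not. A sketch with illustrative variable names:

```python
a = {'a', 'b', 'c', 'g', 'q', 'x'}
b = {'g', 'b', 'h'}
c = set('bczg')

chained = a - b - c             # remove b's elements, then c's
method = a.difference(b, c)     # the same, in one call
combined = a - (b | c)          # remove everything found in b or c

print(chained == method == combined)   # True
print(sorted(chained))                 # ['a', 'q', 'x']

# Unlike union and intersection, difference depends on operand order.
print(sorted(b - a))                   # ['h']
```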

Assignments

String `str1` contains the names of all 50 states of the United States of America with some duplicates and extraneous white space. Use sets, including set comprehensions, to determine the one letter that does not appear in the name of any state.

```
str1 = ''' Indiana , Kentucky , Nebraska , California , Oregon , Washington , Hawaii , Alaska , Arizona , Utah , Nevada , Idaho , New Mexico , Colorado , Wyoming , Montana , Texas , Oklahoma , Kansas , Nebraska , South Dakota , North D , Louisiana , Ark , Missouri , Iowa , Illinois , Minnesota , Michigan , Mississippi , Tennessee , Alabama , Ohio,West V , Virginia , Michigan , Florida , Georgia , S Carolina , , N C , , Maryland , Delaware , New Jersey , New York , Pennsylvania , Vermont , New Hampshire , Maine , Connecticut , , Rhode Island , Massachussetts , Maine , Wisconsin , '''
```

Why are the letters "N C" sufficient for "North Carolina"?