Sets in Python

Posted by Daksh on Sunday, April 3, 2022

Python Sets

A set is an unordered collection of items. Every element is unique (no duplicates) and must be hashable, which in practice usually means immutable (strings, numbers, tuples of hashable values). However, the set itself is mutable: we can add or remove items from it.

In other words, a set is an unordered collection of unique, hashable objects. Sets are implemented as hash tables, so a membership test takes constant time on average, regardless of how many elements the set holds.
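To make the uniqueness and membership-test points concrete, here is a minimal sketch. The list of page names is made up purely for illustration.

```python
# Hypothetical data: page names with repeats.
visits = ["home", "about", "home", "contact", "about"]

# Building a set from the list collapses the duplicates.
unique_pages = set(visits)

print(len(visits))                # 5
print(len(unique_pages))          # 3
print("home" in unique_pages)     # True
print("pricing" in unique_pages)  # False
```

Note that the set does not remember the order in which the pages first appeared.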


Set Creation

To define a set, we use curly braces {} or the set() function. Note that curly braces are used by both sets and dictionaries, but empty curly braces always create a dictionary.

print(type({}))  # <class 'dict'>
# If you want to make an empty set, you have to be explicit about it
print(type(set()))  # <class 'set'>

Don’t name a variable “set”, as this shadows the built-in type for the rest of that scope.
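As a quick illustration of the creation rules above, the following sketch builds sets from a literal and from other iterables. The variable names are chosen only for this example.

```python
# A set literal with curly braces.
vowels = {"a", "e", "i", "o", "u"}

# set() accepts any iterable; repeated items are kept only once.
letters = set("banana")
numbers = set([3, 1, 3, 2])

# The only way to write an empty set.
empty = set()

print(len(vowels))   # 5
print(len(letters))  # 3
print(len(numbers))  # 3
print(len(empty))    # 0
```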

set() vs hash()

Often they are considered the same, but they are not. set is a built-in Python data type that represents an unordered collection of unique elements. Calling set() creates a new set object, and you can add or remove elements from it as needed. Sets are commonly used for membership testing and for deduplicating lists.

hash() is a built-in Python function that returns a hash value for an object. The hash value is an integer derived from the object, and it is used for efficient lookup in hash tables and other data structures. Objects that are hashable (such as strings, numbers, and tuples whose items are all hashable) can be used as keys in dictionaries and added to sets.

Sets and hashes are related in that sets use hash values to locate their elements. When you add an element to a set, Python calculates the element's hash value and uses it to find a slot in the underlying hash table. If an existing element has the same hash value, Python then compares the two with ==. Only if they are also equal is the new element treated as a duplicate and not added. Two different objects that merely share a hash value (a collision) are both stored. In short, set is a container type, while hash() is a function that returns an integer for a single hashable object.
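The interplay between hash values and equality can be seen in a small sketch. The SameHash class below is hypothetical and exists only to force every instance onto the same hash value. It is not something you would write in real code.

```python
# 1, 1.0 and True are equal to each other and share a hash value,
# so a set keeps only one of them.
mixed = {1, 1.0, True}
print(len(mixed))  # 1


class SameHash:
    """Hypothetical class whose instances all return the same hash."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberate collision for every instance

    def __eq__(self, other):
        return isinstance(other, SameHash) and self.name == other.name


# "a" and "b" collide on the hash but are not equal, so both are stored.
# The second "a" is equal to the first, so it is treated as a duplicate.
collided = {SameHash("a"), SameHash("b"), SameHash("a")}
print(len(collided))  # 2
```

So an equal hash value alone does not make an element a duplicate. The equality check has the final say.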

Set Operations

Operations commonly used on sets:

  • s.add(x) - adds the element x to the set s
  • s.copy() - returns a shallow copy of s
  • s.clear() - removes all elements from s
  • s.pop() - removes an arbitrary element from s and returns it; raises KeyError if s is empty
  • s.remove(x) - removes the element x from s; raises KeyError if x is not present
  • s.discard(x) - removes the element x from s if it is present; does nothing otherwise
  • s1.update(s2) - updates s1 with the union of s1 and s2
  • len(s) - returns the number of elements in the set
  • x in s - returns True if x is in the set s
  • x not in s - returns True if x is not in the set s
  • s1.issubset(s2) - returns True if s1 is a subset of s2
  • s1.issuperset(s2) - returns True if s1 is a superset of s2
  • s1.isdisjoint(s2) - returns True if s1 and s2 have no elements in common
  • s1 == s2 - returns True if s1 and s2 are equal
  • s1 != s2 - returns True if s1 and s2 are not equal
  • s1 <= s2 - returns True if s1 is a subset of s2
  • s1 < s2 - returns True if s1 is a proper subset of s2
  • s1 >= s2 - returns True if s1 is a superset of s2
  • s1 > s2 - returns True if s1 is a proper superset of s2
  • s1 | s2 or s1.union(s2) - returns the union of s1 and s2
  • s1 & s2 or s1.intersection(s2) - returns the intersection of s1 and s2
  • s1 - s2 or s1.difference(s2) - returns the difference of s1 and s2
  • s1 ^ s2 or s1.symmetric_difference(s2) - returns the symmetric difference of s1 and s2
  • s1 |= s2 or s1.update(s2) - updates s1 with the union of s1 and s2
  • s1 &= s2 or s1.intersection_update(s2) - updates s1 with the intersection of s1 and s2
  • s1 -= s2 or s1.difference_update(s2) - updates s1 with the difference of s1 and s2
  • s1 ^= s2 or s1.symmetric_difference_update(s2) - updates s1 with the symmetric difference of s1 and s2
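A few of the operations listed above can be tried out in a short sketch. The sets a and b are arbitrary sample values, and sorted() is used only so that the printed output has a predictable order.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

# Operators that return a new set.
print(sorted(a | b))  # [1, 2, 3, 4, 5]  union
print(sorted(a & b))  # [3, 4]           intersection
print(sorted(a - b))  # [1, 2]           difference
print(sorted(a ^ b))  # [1, 2, 5]        symmetric difference

# Comparisons.
print({3, 4} <= a)         # True, subset
print({3, 4} < a)          # True, proper subset
print(a < a)               # False, a set is not a proper subset of itself
print(a.isdisjoint({9}))   # True, no common elements

# In-place update on a copy, so that a stays unchanged.
c = a.copy()
c |= b
print(sorted(c))  # [1, 2, 3, 4, 5]
print(sorted(a))  # [1, 2, 3, 4]
```

One practical difference: the operators require both operands to be sets, while the named methods such as union() and update() accept any iterable.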

Set Sorting

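Sets cannot be sorted in place because their items have no order. However, the built-in sorted() function returns a new sorted list of the set's elements. Keep the result as a list: converting it back into a set would discard the ordering again.

my_set = {1, 3, 5, 7, 9}
sorted_items = sorted(my_set)  # new sorted list; my_set itself is unchanged
print(sorted_items)  # [1, 3, 5, 7, 9]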

Set Comprehension

my_set = {x for x in 'dakshgaur' if x not in 'dak'}
print(my_set)  # e.g. {'h', 'r', 's', 'u', 'g'} (the printed order may vary)

Elements in a Set

Working with sets can be confusing at first because every element inside a set must be hashable. Strings, numbers, and tuples of hashable items qualify; mutable built-ins such as lists, dictionaries, and other sets do not.

If you try to put a list (even an empty one) inside a set, you will get an error.

# hash([])  # TypeError: unhashable type: 'list'
# {[]}  # TypeError: unhashable type: 'list', so a set containing an empty list is not possible

Sets store hashable objects, but they themselves are mutable: you can add and remove elements. If you want an immutable set, use frozenset(), which builds a set that cannot be changed after creation and is itself hashable.
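A minimal sketch of frozenset() in use follows. The names vowels and groups are illustrative only.

```python
vowels = frozenset("aeiou")

# A frozenset has no mutating methods such as add() or remove().
try:
    vowels.add("y")
except AttributeError:
    print("frozenset cannot be modified")

# Because a frozenset is hashable, it can be an element of another set.
# frozenset({1, 2}) and frozenset({2, 1}) are equal, so only one is kept.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

# Non-mutating operations still work and return a new frozenset.
print(sorted(vowels & {"a", "b", "e"}))  # ['a', 'e']
```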

Membership test & removing elements

As usual, we can test whether an element is in a set using the in and not in operators.

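For removing elements, there are two choices: discard and remove. remove raises a KeyError when the item is not in the set, while discard silently does nothing. That makes discard the safer choice when the item may be absent, and remove the better choice when a missing item signals a bug you want to hear about.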
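The difference between the two methods shows up as soon as the element is missing. Here is a short sketch using a made-up set of color names.

```python
colors = {"red", "green"}

# discard() ignores a missing element.
colors.discard("blue")
print(len(colors))  # 2

# remove() raises KeyError for a missing element.
try:
    colors.remove("blue")
except KeyError:
    print("remove() raised KeyError")

# Both behave the same when the element is present.
colors.remove("red")
print(colors)  # {'green'}
```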