Python Sets
A set is an unordered collection of items. Every element is unique (no duplicates) and must be hashable, which in practice usually means immutable (it cannot be changed). The set itself, however, is mutable: we can add or remove items from it.
In other words, a set is an unordered collection of unique, hashable objects. Membership tests (lookups) on a set are very fast, taking constant time on average.
Contents of this page:
- Set Creation
- set() vs hash()
- Set Operations
- Set Sorting
- Set Comprehension
- Elements in a Set
- Membership test & removing elements
Set Creation
To define a set, we use curly braces {} or the set() function. Note that {} is used by both sets and dictionaries, but empty curly braces always create a dictionary.
print(type({})) # <class 'dict'>
# If you want to make an empty set, you have to be explicit about it
print(type(set())) # <class 'set'>
Don’t name a variable set, as this will shadow the built-in type for the rest of that scope.
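As a rough sketch of the two creation routes described above (the variable names are illustrative and not part of the original page):

```python
# Literal syntax: duplicates collapse as the set is built.
primes = {2, 3, 5, 7, 5, 3}

# set() accepts any iterable; a common use is deduplicating a list.
letters = set("hello")
unique_ids = set([10, 20, 20, 30])

# An empty set must come from set(), because {} is a dict.
empty = set()

print(len(primes))         # 4
print(sorted(letters))     # ['e', 'h', 'l', 'o']
print(sorted(unique_ids))  # [10, 20, 30]
print(type(empty).__name__, len(empty))  # set 0
```

The outputs are passed through sorted() only to make them predictable, since a set has no defined order.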
set() vs hash()
The two are sometimes confused, but they are different things. set is a built-in Python data type that represents an unordered collection of unique elements. Calling set() creates a new set object, and you can add or remove elements as needed. Sets are commonly used for membership testing and for deduplicating lists.
hash() is a built-in Python function that returns a hash value for an object. The hash value is an integer derived from the object, and it is used for efficient lookup and comparison in hash tables and other data structures. Hashable objects (such as strings, numbers, and tuples that contain only hashable items) can be used as keys in dictionaries and added to sets.
Sets and hashes are related in that a set uses hash values to locate its elements. When you add an element, Python computes its hash and looks in the matching slot of the set's internal hash table. If no existing element has the same hash and also compares equal (==) to the new one, the element is added. If such an element already exists, the new one is a duplicate and is not added. Two different objects can share a hash value (a collision), so the hash alone never decides uniqueness; equality does. The two also differ in kind: set is a container type that accepts only hashable elements, while hash() is a function that returns an integer for any hashable object.
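The relationship between hashing and uniqueness can be sketched with a small class whose instances all share one hash value. The class Tag is hypothetical and invented purely for illustration. It suggests that a shared hash on its own does not make two elements duplicates; they must also compare equal.

```python
class Tag:
    """Hypothetical class: every instance has the same hash on purpose."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberate collision for every instance

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name == other.name


tags = {Tag("a"), Tag("b"), Tag("a")}
print(len(tags))  # 2: "a" and "b" collide on hash but differ on ==

# Equal values share a hash, so they collapse to one element.
numbers = {1, 1.0, True}
print(len(numbers))                        # 1
print(hash(1) == hash(1.0) == hash(True))  # True
```

A constant hash like this keeps the set correct but makes lookups slow, so it is only suitable as a demonstration.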
Set Operations
Operations commonly used on sets:
- s.add(x): adds the element x to the set s
- s.copy(): returns a shallow copy of s
- s.clear(): removes all elements from s
- s.pop(): removes an arbitrary element from s and returns it
- s.remove(x): removes the element x from s
- s.discard(x): removes the element x from s if it is present
- s1.update(s2): updates s1 with the union of s1 and s2
- len(s): returns the number of elements in the set
- x in s: returns True if x is in the set s
- x not in s: returns True if x is not in the set s
- s1.issubset(s2): returns True if s1 is a subset of s2
- s1.issuperset(s2): returns True if s1 is a superset of s2
- s1.isdisjoint(s2): returns True if s1 and s2 have no elements in common
- s1 == s2: returns True if s1 and s2 are equal
- s1 != s2: returns True if s1 and s2 are not equal
- s1 <= s2: returns True if s1 is a subset of s2
- s1 < s2: returns True if s1 is a proper subset of s2
- s1 >= s2: returns True if s1 is a superset of s2
- s1 > s2: returns True if s1 is a proper superset of s2
- s1 | s2 or s1.union(s2): returns the union of s1 and s2
- s1 & s2 or s1.intersection(s2): returns the intersection of s1 and s2
- s1 - s2 or s1.difference(s2): returns the difference of s1 and s2
- s1 ^ s2 or s1.symmetric_difference(s2): returns the symmetric difference of s1 and s2
- s1 |= s2 or s1.update(s2): updates s1 with the union of s1 and s2
- s1 &= s2 or s1.intersection_update(s2): updates s1 with the intersection of s1 and s2
- s1 -= s2 or s1.difference_update(s2): updates s1 with the difference of s1 and s2
- s1 ^= s2 or s1.symmetric_difference_update(s2): updates s1 with the symmetric difference of s1 and s2
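A short sketch of a few of the operations listed above, using two small example sets chosen for illustration:

```python
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5}

# These return new sets and leave s1 and s2 untouched.
print(sorted(s1 | s2))  # [1, 2, 3, 4, 5]
print(sorted(s1 & s2))  # [3, 4]
print(sorted(s1 - s2))  # [1, 2]
print(sorted(s1 ^ s2))  # [1, 2, 5]

# Comparisons.
print({3, 4} <= s1)           # True  (subset)
print({3, 4} < {3, 4})        # False (a set is not a proper subset of itself)
print(s1.isdisjoint({8, 9}))  # True

# In-place updates modify s1.
s1 |= {6}
s1 -= {1, 2}
print(sorted(s1))  # [3, 4, 6]

# The method forms accept any iterable; the operators require sets.
print(sorted(s2.union([5, 6, 7])))  # [3, 4, 5, 6, 7]
```

The operator forms are compact, while the method forms are handy when the other operand is a list or another iterable rather than a set.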
Set Sorting
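Sets cannot be sorted in place because their items have no order. However, the built-in sorted() function accepts a set and returns a new sorted list of its elements. Converting that list back into a set would discard the ordering again, so keep the result as a list.
my_set = {1, 3, 5, 7, 9}
# list.sort() sorts in place and returns None, so use sorted() instead
sorted_items = sorted(my_set)
print(sorted_items) # [1, 3, 5, 7, 9]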
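A minimal sketch of ordering a set's elements with the built-in sorted(), which returns a new list and leaves the set unchanged. The example data and variable names are illustrative.

```python
words = {"pear", "fig", "banana", "kiwi"}

# sorted() works on any iterable and always returns a list.
alphabetical = sorted(words)
by_length = sorted(words, key=lambda w: (len(w), w))
descending = sorted(words, reverse=True)

print(alphabetical)  # ['banana', 'fig', 'kiwi', 'pear']
print(by_length)     # ['fig', 'kiwi', 'pear', 'banana']
print(descending)    # ['pear', 'kiwi', 'fig', 'banana']

# list.sort() sorts in place and returns None, so it cannot be chained.
result = list(words).sort()
print(result)  # None
```

Because list.sort() returns None, passing its result to set() would raise a TypeError rather than produce a sorted set.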
Set Comprehension
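A set comprehension builds a set from an iterable, using the same syntax as a list comprehension but with curly braces.
my_set = {x for x in 'dakshgaur' if x not in 'dak'}
print(my_set) # e.g. {'h', 'r', 's', 'u', 'g'} (order may vary)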
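As a further sketch, a set comprehension also removes duplicates produced by the expression itself. The input list is made up for illustration.

```python
values = [-2, -1, 0, 1, 2, 3]

# Squares of the values; -2 and 2 both give 4, so the set keeps one copy.
squares = {v * v for v in values}
print(sorted(squares))  # [0, 1, 4, 9]

# A condition filters elements, as in a list comprehension.
even_squares = {v * v for v in values if v % 2 == 0}
print(sorted(even_squares))  # [0, 4]
```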
Elements in a Set
Working with sets can be confusing at first because every element of a set must be hashable, which rules out mutable built-in containers such as lists, dictionaries, and other sets. Strings, numbers, and tuples of hashable items are fine.
If you try to put a list, even an empty one, inside a set, you will get an error.
# hash([]) # TypeError: unhashable type: 'list'
# {[]} # TypeError: unhashable type: 'list', so a set containing an empty list is not possible
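The following sketch probes which objects can be set elements. It catches TypeError rather than printing the message, since the exact wording may differ between Python versions.

```python
# A tuple of hashable items is hashable, so it can be a set element.
points = {(0, 0), (1, 2)}
print((1, 2) in points)  # True

# Immutability alone is not enough: a tuple that holds a list is unhashable.
try:
    {(1, [2, 3])}
    tuple_with_list_ok = True
except TypeError:
    tuple_with_list_ok = False
print(tuple_with_list_ok)  # False

# Lists, dicts and sets are rejected for the same reason.
unhashable = []
for candidate in ([], {}, set()):
    try:
        hash(candidate)
    except TypeError:
        unhashable.append(type(candidate).__name__)
print(unhashable)  # ['list', 'dict', 'set']
```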
Sets store hashable (typically immutable) objects, but they themselves are mutable: you can add and remove elements. If you want an immutable set, use the frozenset() function.
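A brief sketch of frozenset() in use, assuming nothing beyond the standard library. The names are illustrative.

```python
vowels = frozenset("aeiou")

# Read-only operations work as they do on a regular set.
print("a" in vowels)                  # True
print(sorted(vowels & set("hello")))  # ['e', 'o']

# Mutating methods do not exist on a frozenset.
print(hasattr(vowels, "add"))  # False

# A frozenset is hashable, so it can be a set element or a dict key.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2
```

This is one way to build a set of sets, which a regular set cannot hold because it is unhashable.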
Membership test & removing elements
As usual, we can test whether an element is in a set using the in and not in operators.
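For removing elements, there are two similar methods, discard and remove. remove raises a KeyError if the item is not in the set, while discard does nothing in that case. discard is the safer choice when the item may be absent; remove is useful when a missing item should be treated as an error.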
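A minimal sketch contrasting the two removal methods alongside a membership test. The basket contents are made up for illustration.

```python
basket = {"apple", "banana", "cherry"}

print("apple" in basket)      # True
print("mango" not in basket)  # True

basket.discard("mango")  # absent: silently does nothing
basket.discard("apple")  # present: removed

try:
    basket.remove("mango")  # absent: raises KeyError
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'mango'

print(sorted(basket))  # ['banana', 'cherry']
```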