multiset

This package provides a multiset implementation for Python.

A multiset is similar to the builtin set, but it allows an element to occur multiple times. It is an unordered collection of element which have to be hashable just like in a set. It supports the same methods and operations as set does, e.g. membership test, union, intersection, and (symmetric) difference. Multisets can be used in combination with regular sets for those operations.

The implementation is based on a dict that maps the elements to their multiplicity in the multiset. Hence, some dictionary operations are supported.

In contrast to the collections.Counter from the standard library, it has proper support for set operations and only allows positive counts. Also, elements with a zero multiplicity are automatically removed from the multiset.

There is an immutable version of the multiset called FrozenMultiset.

The package also uses the typing module for type hints (see PEP 484) so you can specify the type of a multiset like Multiset[ElementType].

API Overview

The following is an overview over the methods of the Multiset class. For more details on each method and some examples see the autogenerated API Documentation.

class Multiset([mapping or iterable])

Return a new multiset object whose elements are taken from the given optional iterable or mapping. If a mapping is given, it must have positive int values representing each element’s multiplicity. If no iterable or mapping is specified, a new empty multiset is returned.

The elements of a set must be hashable. In contrast to regular sets, duplicate elements will not be removed from the multiset.

The Multiset class provides the following operations which the builtin set also supports:

len(s)

Return the total number of elements in multiset s (cardinality of s).

Note that this is the sum of the multiplicities and not not the number of distinct elements. You can use distinct_elements() to get the number of distinct elements:

>>> len(Multiset('aab').distinct_elements())
2
x in s

Test x for membership in s.

x not in s

Test x for non-membership in s.

isdisjoint(other)

Return True if the multiset has no elements in common with other.

issubset(other)
multiset <= other

Test whether every element in the multiset is in other and each element’s multiplicity in the multiset is less than or equal to its multiplicity in other.

multiset < other

Test whether the multiset is a proper subset of other, that is, multiset <= other and multiset != other.

issuperset(other)
multiset >= other

Test whether every element in other is in the multiset and each element’s multiplicity in other is less than or equal to its multiplicity in the multiset.

multiset > other

Test whether the multiset is a proper superset of other, that is, multiset >= other and multiset != other.

union(*others)
multiset | other | ...

Return a new multiset with elements from the multiset and all others. The maximal multiplicity over all sets is used for each element.

combine(*others)
multiset + other + ...

Return a new multiset with elements from the multiset and all others. Each element’s multiplicities are summed up for the new set.

intersection(*others)
multiset & other & ...

Return a new multiset with elements common to the multiset and all others. The minimal multiplicity over all sets is used for each element.

difference(*others)
multiset - other - ...

Return a new multiset with elements in the multiset that are not in the others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.

symmetric_difference(other)
multiset ^ other

Return a new multiset with elements from either the multiset or other where their multiplicity is the absolute difference of the two multiplicities.

copy()
Multiset(multiset)

Return a new shallow copy of the multiset.

The following methods are not supported by FrozenMultiset, as it is equivalent to the builtin frozenset class and is immutable:

union_update(*others)
multiset |= other | ...

Update the multiset, adding elements from all others. The maximal multiplicity over all sets is used for each element. For a version of this method that works more closely to dict.update(), see update().

intersection_update(*others)
multiset &= other & ...

Update the multiset, keeping only elements found in it and all others. The minimal multiplicity over all sets is used for each element.

difference_update(*others)
multiset -= other | ...

Update the multiset, removing elements found in others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.

symmetric_difference_update(other)
multiset ^= other

Update the multiset, keeping only elements found in either the multiset or other but not both. The new multiplicity is the absolute difference of the two original multiplicities.

add(element, multiplicity=1)
s[element] += multiplicity
s[element] = multiplicity

Add element element to the multiset s. If the optional multiplicity is specified, more than one element can be added at the same time by adding to its.

Note that adding the same element multiple times will increase its multiplicity and thus change the multiset, whereas for a regular set this would have no effect.

You can also set the element’s multiplicity directly via key assignment.

remove(element, multiplicity=None)
del s[element]

Remove all elements element from the multiset. Raises KeyError if element is not contained in the set.

If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.

This method returns the original multiplicity of the element before it was removed.

You can also delete the element directly via key access.

discard(element, multiplicity=None)
s[element] -= multiplicity
s[element] = 0

Remove element element from the set if it is present. If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.

This method returns the original multiplicity of the element before it was removed.

You can also set the element’s multiplicity directly via key assignment.

clear()

Remove all elements from the multiset.

Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset(), update(), intersection_update(), difference_update(), and symmetric_difference_update() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like Multiset('abc') & 'cbs' in favor of the more readable Multiset('abc').intersection('cbs').

The Multiset supports set to set comparisons. Two multisets are equal if and only if every element of each multiset is contained in the other and each element’s multiplicity is the same in both multisets (each is a subset of the other). A multiset is less than another set if and only if it is a proper subset of the second set (is a subset, but is not equal). A multiset is greater than another set if and only if it is a proper superset of the second set (is a superset, but is not equal). These comparisons work with both sets and multisets:

>>> Multiset('ab') == {'a', 'b'}
True

Multiset elements, like set elements and dictionary keys, must be hashable.

Binary operations that mix set or frozenset instances with Multiset instances will always return a Multiset.

Since the Multiset internally uses a dict, it exposes some of its methods and allows key-based access for elements:

s[element]

Return the multiplicity of element in s. Returns 0 for elements that are not in the multiset.

s[element] = value

Set the multiplicity of element to value. Setting the multiplicity to 0 removes the element. This is not supported by FrozenMultiset.

del s[element]

See remove(). This is not supported by FrozenMultiset.

iter(s)

Return an iterator over the elements in the multiset.

In contrast to both the dict and set implementations, this will repeat elements whose multiplicity is greater than 1:

>>> sorted(Multiset('aab'))
['a', 'a', 'b']

To only get distinct elements, use the distinct_elements() method.

classmethod from_elements(elements, multiplicity)

Create a new multiset with elements from elements and all multiplicities set to multiplicity.

get(element, default)

Return the multiplicity for element if it is in the multiset, else default.

items()

Return a new view of the multiset’s items ((element, multiplicity) pairs). See the documentation of dict view objects.

Note that this view is unordered.

distinct_elements()
keys()

Return a new view of the multiset’s distinct elements. See the documentation of dict view objects.

Note that this view is unordered.

update(*others)
multiset += other + ...

Update the multiset, adding elements from all others. Each element’s multiplicities is summed up for the new multiset. This is not supported by FrozenMultiset.

For a version of this method that works more closely to the original set.update(), see union_update().

pop(element, default)

If element is in the multiset, remove it and return its multiplicity, else return default. This is not supported by FrozenMultiset.

popitem()

Remove and return an arbitrary (element, multiplicity) pair from the multiset. This is not supported by FrozenMultiset.

popitem() is useful to destructively iterate over a multiset. If the multiset is empty, calling popitem() raises a KeyError.

setdefault(element, default)

If element is in the multiset, return its multiplicity. If not, insert element with a multiplicity of default and return default. This is not supported by FrozenMultiset.

multiplicities()
values()

Return a new view of the multiset’s multiplicities. See the documentation of dict view objects.

Note that this view is unordered.

The multiset also adds some new methods for multiplying a multiset with an int factor:

times(factor)
multiset * factor

Return a copy of the multiset where each multiplicity is multiplied by factor.

times_update(factor)
multiset *= factor

Update the multiset, multiplying each multiplicity with factor. This is not supported by FrozenMultiset.

class FrozenMultiset([mapping or iterable])

This is an immutable version of the Multiset and supports all non-mutating methods of it.

Because it is immutable, it is also hashable and can be used e.g. as a dictionary key or in a set.

class BaseMultiset

This is the base class of both Multiset and FrozenMultiset. While it cannot instantiated directly, it can be used for isinstance checks:

>>> isinstance(Multiset(), BaseMultiset)
True
>>> isinstance(FrozenMultiset(), BaseMultiset)
True