_types¶
This module provides a set of types that can be used as building blocks
in the aggregation of a Clustering
object.
Cluster parameters¶
-
class
cnnclustering._types.
ClusterParameters
(double radius_cutoff: float, similarity_cutoff: int = 0, double similarity_cutoff_continuous: float = 0., n_member_cutoff: int = None, current_start: int = 1)¶ Input parameters for clustering procedure
- Parameters
radius_cutoff – Neighbour search radius \(r\).
- Keyword Arguments
similarity_cutoff – Value used to check the similarity criterion. In common-nearest-neighbours clustering, it is the minimum required number of shared neighbours \(c\).
similarity_cutoff_continuous – Same as similarity_cutoff but allowed to be a floating point value.
n_member_cutoff – Minimum required number of points in neighbour lists to be considered (tested in cnnclustering._types.Neighbours.enough). If None, will be set to similarity_cutoff.
current_start – Use this as the first label for identified clusters.
- Members
to_dict
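The fallback for n_member_cutoff described above can be sketched in plain Python. This is a minimal illustration, not the extension type itself: the class name ClusterParametersSketch is hypothetical, and the content returned by to_dict here (all fields) is an assumption.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ClusterParametersSketch:
    """Hypothetical stand-in that mirrors the documented arguments."""

    radius_cutoff: float
    similarity_cutoff: int = 0
    similarity_cutoff_continuous: float = 0.0
    n_member_cutoff: Optional[int] = None
    current_start: int = 1

    def __post_init__(self):
        # Documented behaviour: None falls back to similarity_cutoff
        if self.n_member_cutoff is None:
            self.n_member_cutoff = self.similarity_cutoff

    def to_dict(self):
        # Assumption: the dictionary simply lists all parameters
        return asdict(self)


params = ClusterParametersSketch(radius_cutoff=0.5, similarity_cutoff=2)
print(params.to_dict())
```

With n_member_cutoff left at None, the sketch ends up with the same value as similarity_cutoff.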
Cluster labels¶
-
class
cnnclustering._types.
Labels
(labels, consider=None, *, meta=None)¶ Represents cluster label assignments
- Parameters
labels – A container of integer cluster labels supporting the buffer protocol
- Keyword Arguments
consider – A boolean (uint8) container of same length as labels indicating if a cluster label should be considered for assignment during clustering. If None, will be created as all true.
meta – Meta information. If None, will be created as empty dictionary.
-
n_points
¶ The length of the labels container
-
labels
¶ The labels container converted to a NumPy ndarray
-
meta
¶ The meta information dictionary
-
consider
¶ The consider container converted to a NumPy ndarray
-
mapping
¶ A mapping of cluster labels to indices in
labels
-
set
¶ The set of cluster labels
-
consider_set
¶ A set of cluster labels to consider for cluster label assignments
- Members
from_sequence, sort_by_size
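As a rough illustration of the mapping and set attributes, the following sketch derives both from a plain list of labels. The helper name label_mapping is hypothetical and the real type works on buffer-backed containers, so treat this only as a picture of the resulting data.

```python
from collections import defaultdict


def label_mapping(labels):
    """Map each cluster label to the indices at which it occurs."""
    mapping = defaultdict(list)
    for index, label in enumerate(labels):
        mapping[label].append(index)
    return dict(mapping)


labels = [1, 1, 0, 2, 1, 2]
mapping = label_mapping(labels)  # label -> list of point indices
label_set = set(labels)          # distinct cluster labels
print(mapping, label_set)
```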
Input data¶
Types used as input data to a clustering have to adhere to the input
data interface, which is defined through
InputDataExtInterface
for Cython extension
types. For pure Python types, the input data interface is defined through
the abstract base class InputData
and the specialised abstract classes that extend it.
-
class
cnnclustering._types.
InputDataExtInterface
¶ Defines the input data interface for Cython extension types
-
compute_distances
(self, InputDataExtInterface input_data)¶
-
compute_neighbourhoods
(self, InputDataExtInterface input_data, AVALUE r, ABOOL is_sorted, ABOOL is_selfcounting)¶
-
get_builder_kwargs
(type cls)¶
-
get_component
(self, point: int, dimension: int) → int¶
-
get_distance
(self, point_a: int, point_b: int) → int¶
-
get_n_neighbours
(self, point: int) → int¶
-
get_neighbour
(self, point: int, member: int) → int¶
-
meta
¶ dict
- Type
meta
-
n_dim
¶ ‘AINDEX’
- Type
n_dim
-
n_points
¶ ‘AINDEX’
- Type
n_points
-
-
class
cnnclustering._types.
InputData
¶ Defines the input data interface
-
abstract property
data
¶ Return underlying data (only for user convenience, not to be relied on)
-
classmethod
get_builder_kwargs
(cls)¶
-
abstract property
meta
¶ Return meta-information
-
abstract property
n_points
¶ Return total number of points
-
class
cnnclustering._types.
InputDataComponents
¶ Extends the input data interface
-
abstract
get_component
(self, point: int, dimension: int) → float¶ Return one component of point coordinates
-
abstract property
n_dim
¶ Return total number of dimensions
-
abstract
to_components_array
(self) → Type[np.ndarray]¶ Return input data as NumPy array of shape (#points, #components)
-
class
cnnclustering._types.
InputDataPairwiseDistances
¶ Extends the input data interface
-
abstract
get_distance
(self, point_a: int, point_b: int) → float¶ Return the pairwise distance between two points
-
class
cnnclustering._types.
InputDataPairwiseDistancesComputer
¶ Extends the input data interface
-
class
cnnclustering._types.
InputDataNeighbourhoods
¶ Extends the input data interface
-
abstract
get_n_neighbours
(self, point: int) → int¶ Return number of neighbours for point
-
abstract
get_neighbour
(self, point: int, member: int) → int¶ Return a member for point
-
class
cnnclustering._types.
InputDataNeighbourhoodsComputer
¶ Extends the input data interface
-
abstract
compute_neighbourhoods
(self, input_data: Type[u'InputData'], double r: float, is_sorted: bool = False, is_selfcounting: bool = True) → None¶ Pre-compute neighbourhoods at radius
-
class
cnnclustering._types.
InputDataExtComponentsMemoryview
¶ Implements the input data interface
Stores components as a Cython memoryview.
-
by_parts
(self) → Iterator¶ Yield data by parts
- Returns
Generator of 2D
numpy.ndarray
s (parts)
-
get_component
(self, point: int, dimension: int) → int¶
-
get_subset
(self, indices: Sequence) → Type[InputDataExtComponentsMemoryview]¶
-
to_components_array
(self)¶
-
-
class
cnnclustering._types.
InputDataExtDistancesLinearMemoryview
¶ Implements the input data interface
Stores distances as 1D memoryview
-
class
cnnclustering._types.
InputDataExtNeighbourhoodsMemoryview
¶ Implements the input data interface
Neighbours of points stored using a Cython memoryview.
-
get_n_neighbours
(self, point: int) → int¶
-
get_neighbour
(self, point: int, member: int) → int¶
-
get_subset
(self, indices: Sequence) → Type[InputDataExtNeighbourhoodsMemoryview]¶ Return input data subset
-
-
class
cnnclustering._types.
InputDataNeighbourhoodsSequence
(data: Sequence, *, meta=None)¶ Implements the input data interface
Neighbours of points stored as a sequence.
- Parameters
data – Any sequence of neighbour index sequences (need to be sized, indexable, and iterable)
- Keyword Arguments
meta – Meta-information dictionary.
-
property
data
¶
-
get_n_neighbours
(self, point: int) → int¶
-
get_neighbour
(self, point: int, member: int) → int¶
-
get_subset
(self, indices: Container) → Type[InputDataNeighbourhoodsSequence]¶
-
property
meta
¶
-
property
n_neighbours
¶
-
property
n_points
¶
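The neighbourhoods part of the input data interface can be pictured with a small pure-Python class. This is a sketch under assumptions: the class name NeighbourhoodsSequenceSketch is hypothetical, get_subset is left out, and only the accessor methods named above are mirrored.

```python
class NeighbourhoodsSequenceSketch:
    """Hypothetical stand-in storing neighbour indices per point."""

    def __init__(self, data, *, meta=None):
        self._data = data
        # Cache the neighbour counts so lookups do not call len() again
        self._n_neighbours = [len(neighbours) for neighbours in data]
        self._meta = {} if meta is None else meta

    @property
    def n_points(self):
        return len(self._data)

    def get_n_neighbours(self, point):
        return self._n_neighbours[point]

    def get_neighbour(self, point, member):
        return self._data[point][member]


# Point 1 has the neighbours 0, 1 and 2 (self counting)
input_data = NeighbourhoodsSequenceSketch([[0, 1], [0, 1, 2], [1, 2]])
print(input_data.n_points, input_data.get_n_neighbours(1))
```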
-
class
cnnclustering._types.
InputDataSklearnKDTree
(data: Type[numpy.ndarray], *, meta=None, **kwargs)¶ Implements the input data interface
Components stored as a NumPy array. Neighbour queries delegated to pre-built KDTree.
-
build_tree
(self, **kwargs)¶
-
clear_cached
(self)¶
-
compute_neighbourhoods
(self, input_data: Type[u'InputData'], double radius: float, is_sorted: bool = False, is_selfcounting: bool = True)¶
-
property
data
¶
-
get_component
(self, point: int, dimension: int) → float¶
-
get_n_neighbours
(self, point: int) → int¶
-
get_neighbour
(self, point: int, member: int) → int¶ Return a member for point
-
get_subset
(self, indices: Container) → Type[InputDataSklearnKDTree]¶ Return input data subset
-
property
meta
¶
-
property
n_dim
¶
-
property
n_neighbours
¶
-
property
n_points
¶
-
to_components_array
(self)¶
-
Neighbour containers¶
-
class
cnnclustering._types.
NeighboursExtInterface
¶ -
assign
(self, member: int)¶
-
contains
(self, member: int)¶
-
enough
(self, member_cutoff: int)¶
-
get_builder_kwargs
(type cls)¶
-
get_member
(self, index: int)¶
-
n_points
¶ ‘AINDEX’
- Type
n_points
-
reset
(self)¶
-
-
class
cnnclustering._types.
Neighbours
¶ Defines the neighbours interface
-
abstract
assign
(self, member: int) → None¶ Add a member to this container
-
abstract
contains
(self, member: int) → bool¶ Return True if member is in neighbours container
-
abstract
enough
(self, member_cutoff: int) → bool¶ Return True if there are enough points
-
classmethod
get_builder_kwargs
(cls)¶
-
abstract
get_member
(self, index: int) → int¶ Return indexable neighbours container
-
abstract property
n_points
¶ Return total number of points
-
abstract property
neighbours
¶ Return point indices as NumPy array
-
abstract
reset
(self) → None¶ Reset/empty this container
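A list-backed container is perhaps the simplest way to picture the neighbours interface. The class name NeighboursListSketch is hypothetical, and the strict comparison used in enough is an assumption about how the cutoff is applied.

```python
class NeighboursListSketch:
    """Hypothetical list-backed neighbours container."""

    def __init__(self, neighbours=None):
        self._neighbours = [] if neighbours is None else list(neighbours)

    @property
    def n_points(self):
        return len(self._neighbours)

    def assign(self, member):
        self._neighbours.append(member)

    def reset(self):
        self._neighbours.clear()

    def enough(self, member_cutoff):
        # Assumption: more than member_cutoff points are required
        return self.n_points > member_cutoff

    def get_member(self, index):
        return self._neighbours[index]

    def contains(self, member):
        # Linear scan; a set-backed container would avoid this
        return member in self._neighbours


container = NeighboursListSketch()
for member in (4, 7, 9):
    container.assign(member)
print(container.n_points, container.contains(7), container.enough(2))
```

Because contains scans the list, a similarity check built on this container pays for each lookup in proportion to the container length.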
-
class
cnnclustering._types.
NeighboursExtVector
¶ Implements the neighbours interface
Uses an underlying C++ std::vector.
- Parameters
initial_size – Number of elements reserved for the size of vector.
- Keyword Arguments
neighbours – A sequence of labels suitable to be cast to a vector.
-
class
cnnclustering._types.
NeighboursExtCPPSet
¶ Implements the neighbours interface
Uses an underlying C++ std::set.
- Keyword Arguments
neighbours – A sequence of labels suitable to be cast to a C++ set.
-
class
cnnclustering._types.
NeighboursExtCPPUnorderedSet
¶ Implements the neighbours interface
Uses an underlying C++ std::unordered_set.
- Keyword Arguments
neighbours – A sequence of labels suitable to be cast to a C++ set.
-
class
cnnclustering._types.
NeighboursExtVectorCPPUnorderedSet
¶ Implements the neighbours interface
Uses a combination of an underlying C++ std::vector and a std::unordered_set.
- Keyword Arguments
neighbours – A sequence of labels suitable to be cast to a C++ vector.
Neighbours getter¶
-
class
cnnclustering._types.
NeighboursGetterExtInterface
¶ -
get
(self, AINDEX index, InputDataExtInterface input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)¶
-
get_builder_kwargs
(type cls)¶
-
get_other
(self, AINDEX index, InputDataExtInterface input_data, InputDataExtInterface other_input_data, NeighboursExtInterface neighbours, ClusterParameters cluster_params)¶
-
is_selfcounting
¶ ‘bool’
- Type
is_selfcounting
-
is_sorted
¶ ‘bool’
- Type
is_sorted
-
-
class
cnnclustering._types.
NeighboursGetter
¶ Defines the neighbours-getter interface
-
abstract
get
(self, index: int, input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters]) → None¶ Collect neighbours for point in input data
-
classmethod
get_builder_kwargs
(cls)¶
-
get_other
(self, index: int, input_data: Type[InputData], other_input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters]) → None¶ Collect neighbours in input data for point in other input data
-
abstract property
is_selfcounting
¶ Return True if points count as their own neighbour
-
abstract property
is_sorted
¶ Return True if neighbour indices are sorted
-
class
cnnclustering._types.
NeighboursGetterExtBruteForce
(distance_getter: Type[DistanceGetterExtInterface])¶ Implements the neighbours getter interface
This getter retrieves the neighbours of a point by comparing the distances (from a distance getter) between the point and all other points to the radius cutoff (\(r_{ij} \leq r\)).
The resulting neighbour containers are in general not sorted and include points as their own neighbour (self counting).
- Parameters
distance_getter – An object implementing the distance getter interface. Has to be a Cython extension type.
-
get_builder_kwargs
(type cls)¶
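The brute-force strategy described for this getter can be sketched as follows. The function name brute_force_neighbours is hypothetical, and plain Euclidean distances from the standard library stand in for the distance getter.

```python
import math


def brute_force_neighbours(index, points, radius):
    """Collect all j with distance(points[index], points[j]) <= radius."""
    reference = points[index]
    neighbours = []
    for j, other in enumerate(points):
        # The point itself has distance 0 and is therefore included
        if math.dist(reference, other) <= radius:
            neighbours.append(j)
    return neighbours


points = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (0.5, 0.5)]
neighbours = brute_force_neighbours(0, points, radius=1.0)
print(neighbours)
```

The result contains index 0 itself (self counting). Here it also comes out in ascending order because the loop visits points in index order, which is a property of this sketch and not something to rely on for the getter.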
-
class
cnnclustering._types.
NeighboursGetterExtLookup
¶ Implements the neighbours getter interface
-
class
cnnclustering._types.
NeighboursGetterBruteForce
(distance_getter: Type[DistanceGetter])¶ Implements the neighbours getter interface
-
get
(self, index: int, input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters])¶
-
classmethod
get_builder_kwargs
(cls)¶
-
get_other
(self, index: int, input_data: Type[InputData], other_input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters])¶
-
property
is_selfcounting
¶
-
property
is_sorted
¶
-
-
class
cnnclustering._types.
NeighboursGetterLookup
(is_sorted=False, is_selfcounting=False)¶ Implements the neighbours getter interface
-
get
(self, index: int, input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters]) → None¶
-
get_other
(self, index: int, input_data: Type[InputData], other_input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters])¶
-
property
is_selfcounting
¶
-
property
is_sorted
¶
-
-
class
cnnclustering._types.
NeighboursGetterRecomputeLookup
(is_sorted=False, is_selfcounting=True)¶ Implements the neighbours getter interface
-
get
(self, index: int, input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters]) → None¶
-
get_other
(self, index: int, input_data: Type[InputData], other_input_data: Type[InputData], neighbours: Type[Neighbours], cluster_params: Type[ClusterParameters])¶
-
property
is_selfcounting
¶
-
property
is_sorted
¶
-
Distance getter¶
-
class
cnnclustering._types.
DistanceGetterExtInterface
¶ -
get_builder_kwargs
(type cls)¶
-
get_single
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)¶
-
get_single_other
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)¶
-
-
class
cnnclustering._types.
DistanceGetter
¶ Defines the distance getter interface
-
classmethod
get_builder_kwargs
(cls)¶
-
class
cnnclustering._types.
DistanceGetterExtMetric
¶ Implements the distance getter interface
-
get_builder_kwargs
(type cls)¶
-
get_single
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)¶
-
get_single_other
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)¶
-
-
class
cnnclustering._types.
DistanceGetterExtLookup
¶ Implements the distance getter interface
-
class
cnnclustering._types.
DistanceGetterMetric
(metric: Type[Metric])¶ Implements the distance getter interface
-
classmethod
get_builder_kwargs
(cls)¶
-
class
cnnclustering._types.
DistanceGetterLookup
¶ Implements the distance getter interface
-
get_single
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data)¶
-
get_single_other
(self, AINDEX point_a, AINDEX point_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data)¶
-
Metrics¶
-
class
cnnclustering._types.
MetricExtInterface
¶ Defines the metric interface for extension types
-
adjust_radius
(self, AVALUE radius_cutoff) → float¶
-
calc_distance
(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data) → float¶
-
calc_distance_other
(self, AINDEX index_a, AINDEX index_b, InputDataExtInterface input_data, InputDataExtInterface other_input_data) → float¶
-
get_builder_kwargs
(type cls)¶
-
-
class
cnnclustering._types.
Metric
¶ Defines the metric-interface
-
class
cnnclustering._types.
MetricExtDummy
¶ Implements the metric interface
-
class
cnnclustering._types.
MetricExtPrecomputed
¶ Implements the metric interface
-
class
cnnclustering._types.
MetricExtEuclidean
¶ Implements the metric interface
-
class
cnnclustering._types.
MetricExtEuclideanReduced
¶ Implements the metric interface
-
class
cnnclustering._types.
MetricExtEuclideanPeriodicReduced
¶ Implements the metric interface
-
class
cnnclustering._types.
MetricDummy
¶ Implements the metric interface
-
adjust_radius
(self, double radius_cutoff: float) → float¶
-
-
class
cnnclustering._types.
MetricEuclidean
¶ Implements the metric interface
-
adjust_radius
(self, double radius_cutoff: float) → float¶
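This page does not describe what adjust_radius does. One plausible reading, stated here purely as an assumption, is that a metric working with a transformed distance (for example a squared Euclidean distance that skips the square root) needs the radius cutoff transformed in the same way before comparing. The function names below are hypothetical.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance; avoids the square root."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adjust_radius_reduced(radius_cutoff):
    # Assumption: compare squared distances against the squared radius
    return radius_cutoff ** 2


a, b = (0.0, 0.0), (3.0, 4.0)
# True distance is 5.0, so the point lies exactly on the cutoff
is_neighbour = squared_euclidean(a, b) <= adjust_radius_reduced(5.0)
print(is_neighbour)
```

For a metric that returns the true distance, the adjustment would simply return the radius unchanged.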
-
Similarity checker¶
-
class
cnnclustering._types.
SimilarityCheckerExtInterface
¶ Defines the similarity checker interface for extension types
-
check
(self, NeighboursExtInterface neighbours_a, NeighboursExtInterface neighbours_b, ClusterParameters cluster_params)¶
-
get_builder_kwargs
(type cls)¶
-
-
class
cnnclustering._types.
SimilarityChecker
¶ Defines the similarity checker interface
-
abstract
check
(self, neighbours_a: Type[Neighbours], neighbours_b: Type[Neighbours], cluster_params: Type[ClusterParameters]) → bool¶ Return True if a and b have sufficiently many common neighbours
-
classmethod
get_builder_kwargs
(cls)¶
-
class
cnnclustering._types.
SimilarityCheckerExtContains
¶ Implements the similarity checker interface
- Strategy:
Loops over the members of one neighbours container and checks whether they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. The worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup. Note that the neighbours containers are not switched to ensure that the first container is the shorter one (compare
cnnclustering._types.SimilarityCheckerExtSwitchContains
).
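A minimal sketch of this strategy, using built-in Python containers in place of the neighbours types. The function name check_contains, the shortcut for a cutoff of zero, and the exact form of the criterion (at least c shared members) are assumptions.

```python
def check_contains(neighbours_a, neighbours_b, c):
    """Return True once c members of a are also found in b."""
    if c == 0:
        return True
    common = 0
    for member in neighbours_a:
        # Cost of this test depends on the container type of b
        if member in neighbours_b:
            common += 1
            if common == c:
                return True  # early exit, remaining members are skipped
    return False


# b as a set: each containment test is a hash lookup
print(check_contains([1, 2, 3, 4], {2, 4, 6}, 2))
# b as a list: each containment test scans b
print(check_contains([1, 2, 3], [2, 6], 2))
```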
-
class
cnnclustering._types.
SimilarityCheckerExtSwitchContains
¶ Implements the similarity checker interface
- Strategy:
Loops over the members of one neighbours container and checks whether they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. The worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup. Note that the neighbours containers are switched if necessary so that the first container is the shorter one (compare
SimilarityCheckerExtContains
).
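The switching variant can be sketched the same way. The function name check_switch_contains is hypothetical, and the criterion (at least c shared members) is again an assumption.

```python
def check_switch_contains(neighbours_a, neighbours_b, c):
    """Like a plain containment check, but iterate the shorter container."""
    if c == 0:
        return True
    # Swap so that the loop runs over the container with fewer members
    if len(neighbours_b) < len(neighbours_a):
        neighbours_a, neighbours_b = neighbours_b, neighbours_a
    common = 0
    for member in neighbours_a:
        if member in neighbours_b:
            common += 1
            if common == c:
                return True
    return False


print(check_switch_contains([1, 2, 3, 4, 5], [2, 4], 2))
```

Iterating over the shorter container bounds the number of containment tests by the smaller of the two lengths.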
-
class
cnnclustering._types.
SimilarityCheckerExtScreensorted
¶ Implements the similarity checker interface
- Strategy:
Loops over the members of two neighbour containers alternately and checks whether neighbours are contained in both containers. Requires the containers to be sorted in ascending order to return the correct result; sorting is neither checked nor enforced. Breaks early when the similarity criterion is reached. The performance of the check depends on the neighbour containers used. The worst-case time complexity is \(\mathcal{O}(n + m)\), with \(n\) and \(m\) being the lengths of the neighbours containers.
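The screening of two sorted containers can be pictured as a two-pointer walk. The function name check_screensorted is hypothetical and the criterion (at least c shared members) is an assumption. As in the description, the inputs are trusted to be sorted in ascending order.

```python
def check_screensorted(neighbours_a, neighbours_b, c):
    """Count shared members of two ascending sequences, stop at c."""
    if c == 0:
        return True
    i = j = common = 0
    while i < len(neighbours_a) and j < len(neighbours_b):
        a, b = neighbours_a[i], neighbours_b[j]
        if a == b:
            common += 1
            if common == c:
                return True
            i += 1
            j += 1
        elif a < b:
            i += 1  # a cannot appear later in b, advance in a
        else:
            j += 1  # b cannot appear later in a, advance in b
    return False


print(check_screensorted([1, 3, 5, 7], [2, 3, 4, 7, 9], 2))
```

Each step advances at least one of the two positions, which is where the bound of n + m steps comes from.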
-
class
cnnclustering._types.
SimilarityCheckerContains
¶ Implements the similarity checker interface
- Strategy:
Loops over the members of one neighbours container and checks whether they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. The worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup. Note that the neighbours containers are not switched to ensure that the first container is the shorter one (compare
cnnclustering._types.SimilarityCheckerSwitchContains
).
-
check
(self, neighbours_a: Type[Neighbours], neighbours_b: Type[Neighbours], cluster_params: Type[ClusterParameters]) → bool¶
-
class
cnnclustering._types.
SimilarityCheckerSwitchContains
¶ Implements the similarity checker interface
- Strategy:
Loops over the members of one neighbours container and checks whether they are contained in the other neighbours container. Breaks early when the similarity criterion is reached. The performance and time complexity of the check depend on the neighbour containers used. The worst-case time complexity is \(\mathcal{O}(n \cdot m)\), with \(n\) and \(m\) being the lengths of the neighbours containers, if the containment check is performed by iteration. It is \(\mathcal{O}(n)\) if the containment check can be performed as a constant-time lookup. Note that the neighbours containers are switched if necessary so that the first container is the shorter one (compare
cnnclustering._types.SimilarityCheckerContains
).
-
check
(self, neighbours_a: Type[Neighbours], neighbours_b: Type[Neighbours], cluster_params: Type[ClusterParameters]) → bool¶
Queues¶
Queues can optionally be used by a fitter, e.g.
FitterExtBFS
or
FitterBFS
-
class
cnnclustering._types.
QueueExtInterface
¶ -
get_builder_kwargs
(type cls)¶
-
is_empty
(self) → bool¶
-
pop
(self) → int¶
-
push
(self, value: int)¶
-
-
class
cnnclustering._types.
Queue
¶ Defines the queue interface
-
classmethod
get_builder_kwargs
(cls)¶
-
abstract
is_empty
(self) → bool¶ Return True if there are no values in the queue
-
abstract
pop
(self)¶ Retrieve value from the queue
-
abstract
push
(self, value)¶ Put value into the queue
-
class
cnnclustering._types.
QueueExtLIFOVector
¶ Implements the queue interface
-
class
cnnclustering._types.
QueueExtFIFOQueue
¶ Implements the queue interface
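The difference between a first-in-first-out and a last-in-first-out queue behind the same push/pop/is_empty interface can be sketched in pure Python. The class names below are hypothetical; the extension types presumably wrap C++ containers instead.

```python
from collections import deque


class FIFOQueueSketch:
    """Hypothetical queue returning the oldest value first."""

    def __init__(self):
        self._queue = deque()

    def push(self, value):
        self._queue.append(value)

    def pop(self):
        return self._queue.popleft()

    def is_empty(self):
        return not self._queue


class LIFOQueueSketch(FIFOQueueSketch):
    """Hypothetical queue returning the newest value first."""

    def pop(self):
        return self._queue.pop()


def drain(queue, values):
    """Push all values, then pop until the queue is empty."""
    for value in values:
        queue.push(value)
    popped = []
    while not queue.is_empty():
        popped.append(queue.pop())
    return popped


fifo_order = drain(FIFOQueueSketch(), [1, 2, 3])
lifo_order = drain(LIFOQueueSketch(), [1, 2, 3])
print(fifo_order, lifo_order)
```

A fitter that walks the neighbour graph visits points in the order the queue returns them, so swapping the queue type changes the traversal order without touching the fitter.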