pysad.models.RSHash

class pysad.models.RSHash(feature_mins, feature_maxes, sampling_points=1000, decay=0.015, num_components=100, num_hash_fns=1)[source]

Subspace outlier detection in linear time with randomized hashing [BSA16]. This implementation is adapted from cmuxstream-baselines.

Parameters:

feature_mins (np.float64 array of shape (num_features,)) – Lower bound of each feature.
feature_maxes (np.float64 array of shape (num_features,)) – Upper bound of each feature.
sampling_points (int) – The number of sampling points (Default=1000).
decay (float) – The decay hyperparameter (Default=0.015).
num_components (int) – The number of ensemble components (Default=100).
num_hash_fns (int) – The number of hashing functions (Default=1).
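
The two boundary arrays need one entry per feature. The following is a minimal sketch of one way to obtain them, assuming a representative warm-up batch is available. The helper `feature_bounds` and the `warmup` data are illustrative names, not part of pysad.

```python
import numpy as np


def feature_bounds(X):
    """Return per-feature (mins, maxes) as np.float64 arrays of shape (num_features,)."""
    X = np.asarray(X, dtype=np.float64)
    if X.ndim != 2:
        raise ValueError("expected an array of shape (num_instances, num_features)")
    return X.min(axis=0), X.max(axis=0)


# Hypothetical warm-up batch: 4 instances, 2 features.
warmup = np.array([[0.0, 10.0], [1.0, 12.0], [0.5, 11.0], [2.0, 9.0]])
feature_mins, feature_maxes = feature_bounds(warmup)

# With pysad installed, the arrays would then be passed to the constructor
# using the signature documented above:
#   from pysad.models import RSHash
#   model = RSHash(feature_mins, feature_maxes, sampling_points=1000, decay=0.015)
```

Bounds estimated from a sample may not cover values that arrive later in the stream, so known domain limits are preferable when they exist.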

Methods

`__init__`(feature_mins, feature_maxes[, ...])
`fit`(X[, y])	Fits the model to all instances in order.
`fit_partial`(X[, y])	Fits the model to the next instance.
`fit_score`(X[, y])	This helper method applies fit_score_partial to all instances in order.
`fit_score_partial`(X[, y])	Applies fit_partial and then score_partial to the next instance.
`score`(X)	Scores all instances via score_partial iteratively.
`score_partial`(X)	Scores the anomalousness of the next instance.

fit(X, y=None)

Fits the model to all instances in order.

Parameters:

X (np.float64 array of shape (num_instances, num_features)) – The instances to fit, in order.
y (np.int32 array of shape (num_instances, )) – The labels of the instances to fit, in order (Optional for unsupervised models, default=None).

Returns:

Fitted model.

Return type:

object

fit_partial(X, y=None)[source]

Fits the model to the next instance.

Parameters:

X (np.float64 array of shape (num_features,)) – The instance to fit.
y (int) – Ignored since the model is unsupervised (Default=None).

Returns:

The fitted model itself (self).

Return type:

object

fit_score(X, y=None)

This helper method applies fit_score_partial to all instances in order.

Parameters:

X (np.float64 array of shape (num_instances, num_features)) – The instances to fit and score, in order.
y (np.int32 array of shape (num_instances, )) – The labels of the instances in order to fit (Optional for unsupervised models, default=None).

Returns:

The anomalousness scores of the instances in order.

Return type:

np.float64 array of shape (num_instances,)

fit_score_partial(X, y=None)

Applies fit_partial and then score_partial to the next instance.

Parameters:

X (np.float64 array of shape (num_features,)) – The instance to fit and score.
y (int) – The label of the instance (Optional for unsupervised models, default=None).

Returns:

The anomalousness score of the input instance.

Return type:

float
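
The fit-then-score order described above can be sketched as a plain loop over any object that exposes `fit_partial` and `score_partial`. The function `stream_scores` and the class `RunningMeanModel` are hypothetical names introduced for illustration. The stand-in scores by distance from a running mean and does not reproduce the RSHash computation.

```python
def stream_scores(model, instances):
    """Fit each instance, then score it, yielding one score per instance in order."""
    for x in instances:
        model.fit_partial(x)
        yield model.score_partial(x)


class RunningMeanModel:
    """Hypothetical stand-in, NOT the RSHash algorithm.

    Scores an instance by its distance from the running mean of feature 0.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def fit_partial(self, X, y=None):
        self.count += 1
        self.mean += (X[0] - self.mean) / self.count  # incremental mean update
        return self

    def score_partial(self, X):
        return abs(X[0] - self.mean)


stream = [[1.0], [1.0], [4.0]]
scores = list(stream_scores(RunningMeanModel(), stream))
```

Because each instance is fitted before it is scored, it contributes to its own score: the last instance in `stream` is compared against a mean that already includes it.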

score(X)

Scores all instances via score_partial iteratively.

Parameters:

X (np.float64 array of shape (num_instances, num_features)) – The instances to score, in order.

Returns:

The anomalousness scores of the instances in order.

Return type:

np.float64 array of shape (num_instances,)

score_partial(X)[source]

Returns the anomalousness score computed for the most recently fitted instance. fit_partial must be called at least once before this method is called.

Parameters:

X (any) – Ignored.

Returns:

The anomalousness score of the last fitted instance.

Return type:

float
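
Because X is ignored, the returned value depends only on the most recent fit_partial call. The pattern can be sketched with a hypothetical stand-in class. `LastScoreModel` and its placeholder score are illustrative only and are not taken from the pysad source.

```python
class LastScoreModel:
    """Hypothetical stand-in, not pysad code.

    fit_partial stores a score; score_partial returns the stored value.
    """

    def __init__(self):
        self.last_score = None

    def fit_partial(self, X, y=None):
        # Placeholder score: the sum of the instance's features.
        self.last_score = float(sum(X))
        return self

    def score_partial(self, X):
        # X is ignored; the value stored by the last fit_partial is returned.
        return self.last_score


model = LastScoreModel()
model.fit_partial([1.0, 2.0])
score_same = model.score_partial([1.0, 2.0])
score_other = model.score_partial([100.0, 200.0])  # different argument, same result
```

Under this pattern, passing a different instance to score_partial does not score that instance. To score a new instance, fit it first.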