pysad.models.RSHash

class pysad.models.RSHash(feature_mins, feature_maxes, sampling_points=1000, decay=0.015, num_components=100, num_hash_fns=1)[source]

Subspace outlier detection in linear time with randomized hashing [BSA16]. This implementation is adapted from cmuxstream-baselines.

Parameters:
  • feature_mins (np.float64 array of shape (num_features,)) – Minimum boundary of the features.

  • feature_maxes (np.float64 array of shape (num_features,)) – Maximum boundary of the features.

  • sampling_points (int) – The number of sampling points (Default=1000).

  • decay (float) – The decay hyperparameter (Default=0.015).

  • num_components (int) – The number of ensemble components (Default=100).

  • num_hash_fns (int) – The number of hashing functions (Default=1).
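The two bound arrays are usually derived from data available before streaming starts. A minimal sketch, assuming a small warm-up batch exists and NumPy is installed; the `warmup` array and its values are illustrative and not part of pysad:

```python
import numpy as np

# Hypothetical warm-up batch: 4 instances, 3 features.
warmup = np.array(
    [
        [0.2, 10.0, -1.0],
        [0.5, 12.0, 0.0],
        [0.1, 9.0, 2.5],
        [0.9, 15.0, 1.0],
    ],
    dtype=np.float64,
)

# Per-feature bounds, each of shape (num_features,).
feature_mins = warmup.min(axis=0)
feature_maxes = warmup.max(axis=0)

# These arrays would then be passed as the first two constructor
# arguments, e.g. RSHash(feature_mins, feature_maxes).
```

If later instances can fall outside the warm-up range, widening the bounds by a margin is one option; whether that is needed depends on the data.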

Methods

__init__(feature_mins, feature_maxes[, ...])

fit(X[, y])

Fits the model to all instances in order.

fit_partial(X[, y])

Fits the model to the next instance.

fit_score(X[, y])

This helper method applies fit_score_partial to all instances in order.

fit_score_partial(X[, y])

Applies fit_partial and then score_partial to the next instance.

score(X)

Scores all instances via score_partial iteratively.

score_partial(X)

Scores the anomalousness of the next instance.

fit(X, y=None)

Fits the model to all instances in order.

Parameters:
  • X (np.float64 array of shape (num_instances, num_features)) – The instances in order to fit.

  • y (np.int32 array of shape (num_instances,)) – The labels of the instances in order to fit (Optional for unsupervised models, default=None).

Returns:

Fitted model.

Return type:

object

fit_partial(X, y=None)[source]

Fits the model to the next instance.

Parameters:
  • X (np.float64 array of shape (num_features,)) – The instance to fit.

  • y (int) – Ignored since the model is unsupervised (Default=None).

Returns:

The fitted model (self).

Return type:

object

fit_score(X, y=None)

This helper method applies fit_score_partial to all instances in order.

Parameters:
  • X (np.float64 array of shape (num_instances, num_features)) – The instances in order to fit.

  • y (np.int32 array of shape (num_instances,)) – The labels of the instances in order to fit (Optional for unsupervised models, default=None).

Returns:

The anomalousness scores of the instances in order.

Return type:

np.float64 array of shape (num_instances,)
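The batch helper can be read as a plain loop over the per-instance method. The sketch below is not pysad's source: `fit_score_all` and `CountingModel` are hypothetical names, and the stand-in model exists only so the loop can run without the library.

```python
import numpy as np


class CountingModel:
    """Hypothetical stand-in exposing fit_score_partial; not an RSHash."""

    def __init__(self):
        self.seen = 0

    def fit_score_partial(self, x, y=None):
        # Toy score: number of instances fitted so far, including this one.
        self.seen += 1
        return float(self.seen)


def fit_score_all(model, X, y=None):
    """Fit and score each row of X in order, returning one score per row."""
    scores = np.empty(len(X), dtype=np.float64)
    for i, x in enumerate(X):
        label = None if y is None else y[i]
        scores[i] = model.fit_score_partial(x, label)
    return scores


X = np.zeros((3, 2))
scores = fit_score_all(CountingModel(), X)
```

Each score depends only on the instances that precede it in the stream, so the order of the rows in X matters.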

fit_score_partial(X, y=None)

Applies fit_partial and then score_partial to the next instance.

Parameters:
  • X (np.float64 array of shape (num_features,)) – The instance to fit and score.

  • y (int) – The label of the instance (Optional for unsupervised models, default=None).

Returns:

The anomalousness score of the input instance.

Return type:

float

score(X)

Scores all instances via score_partial iteratively.

Parameters:
  • X (np.float64 array of shape (num_instances, num_features)) – The instances in order to score.

Returns:

The anomalousness scores of the instances in order.

Return type:

np.float64 array of shape (num_instances,)

score_partial(X)[source]

Scores the anomalousness of the next instance. Returns the score computed for the last fitted instance, so this method must be called after fit_partial.

Parameters:
  • X (any) – Ignored.

Returns:

The anomalousness score of the last fitted instance.

Return type:

float
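The description suggests that the score is computed while fitting and only read back here. The toy class below mirrors that calling contract and nothing more: `LastScoreModel` and its running-mean distance are hypothetical and unrelated to the RSHash algorithm.

```python
class LastScoreModel:
    """Hypothetical stand-in that caches a score during fit_partial."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.last_score = None

    def fit_partial(self, x, y=None):
        # Score the value against the mean seen so far, then update the mean.
        self.last_score = abs(x - self.mean)
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self

    def score_partial(self, x=None):
        # The argument is ignored; the cached score is returned.
        return self.last_score


model = LastScoreModel()
before_fit = model.score_partial(1.0)   # None in this toy: nothing fitted yet
model.fit_partial(4.0)
after_fit = model.score_partial(123.0)  # score of 4.0, not of 123.0
```

With a contract like this, calling score_partial on an instance that was never fitted does not score that instance, which is why fit_score_partial is the safer entry point for streaming use.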