API Reference

This is the API documentation for PySAD.

Core

The pysad.core module contains the base classes of PySAD.

core.BaseMetric

Abstract base class for metrics.

core.BaseModel

Abstract base class for the models.
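To make the role of a streaming model base class concrete, the following is a minimal sketch of what such an interface could look like. The class names (`StreamingModelSketch`, `DistanceFromMeanModel`) and the method names (`fit_partial`, `score_partial`, `fit_score_partial`) are assumptions chosen for illustration; this listing does not state the actual signatures, so consult the class page before relying on them.

```python
from abc import ABC, abstractmethod


class StreamingModelSketch(ABC):
    """Hypothetical interface for a model that learns one instance at a time."""

    @abstractmethod
    def fit_partial(self, x):
        """Update the model state with a single instance and return self."""

    @abstractmethod
    def score_partial(self, x):
        """Return an anomaly score for a single instance (higher = more anomalous)."""

    def fit_score_partial(self, x):
        # Fit first, then score the same instance.
        return self.fit_partial(x).score_partial(x)


class DistanceFromMeanModel(StreamingModelSketch):
    """Toy model: scores a scalar by its absolute distance to the running mean."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def fit_partial(self, x):
        self.count += 1
        self.mean += (x - self.mean) / self.count  # incremental mean update
        return self

    def score_partial(self, x):
        return abs(x - self.mean)


model = DistanceFromMeanModel()
scores = [model.fit_score_partial(x) for x in [1.0, 1.0, 1.0, 10.0]]
```

Here the last instance receives a clearly larger score than the first three, which is the behaviour a concrete subclass is expected to provide.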

core.BasePostprocessor

Base class for postprocessing methods.

core.BaseStatistic

Abstract base class for the statistics.

core.BaseStreamer

Abstract base class for simulating streaming data.

core.BaseTransformer

Base class for transforming methods.

Individual Anomaly Models

The pysad.models module includes anomaly detection models to score the anomalousness of instances.

models.ExactStorm

The Exact-STORM method [BAF07].

models.HalfSpaceTrees

Half-Space Trees method [BTTL11].

models.IForestASD

An Anomaly Detection Approach Based on Isolation Forest Algorithm for Streaming Data using Sliding Window [BDF13].

models.KitNet

KitNET is a lightweight online anomaly detection algorithm based on an ensemble of autoencoders [BMDES18].

models.KNNCAD

Conformalized density- and distance-based anomaly detection in time-series data [BBI16]. The method combines a feature extraction step, a score that measures how much a new observation differs from previously observed data, and a probabilistic interpretation of this score based on the conformal paradigm.

models.LODA

The LODA model [BPevny16]. The implementation is adapted to the streaming framework from the PyOD framework.

models.LocalOutlierProbability

A streaming implementation of the Local Outlier Probabilities method [BKKrogerSZ09], built on the implementation in the PyNomaly library [BCon18].

models.MedianAbsoluteDeviation

Median Absolute Deviation method [BHVK17].

models.NullModel

A model that returns 0.5 for all instances, added for convenience in testing and pipelining.

models.PerfectModel

This model directly outputs the ground truth labels.

models.RandomModel

A random scorer that returns a score between 0 and 1, ignoring the input.

models.RelativeEntropy

Relative-entropy-based anomaly detection model for univariate streams [BALPA17].

models.RobustRandomCutForest

Robust Random Cut Forest model [BGMRS16].

models.RSHash

Subspace outlier detection in linear time with randomized hashing [BSA16].

models.StandardAbsoluteDeviation

A model that scores an instance by its absolute deviation from the mean (or median), divided by the standard deviation.
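The scoring rule described above can be sketched for a scalar stream as follows. This is an illustrative stand-in, not the library's implementation: the class name `StandardAbsoluteDeviationSketch`, the use of the mean rather than the median, the population standard deviation, and the zero score for degenerate cases are all assumptions.

```python
import math


class StandardAbsoluteDeviationSketch:
    """Illustrative scorer: |x - mean| / std over all values seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def fit_partial(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self

    def score_partial(self, x):
        if self.n < 2:
            return 0.0  # not enough data to estimate a spread
        std = math.sqrt(self.m2 / self.n)  # population standard deviation
        return abs(x - self.mean) / std if std > 0 else 0.0


scorer = StandardAbsoluteDeviationSketch()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    scorer.fit_partial(value)
outlier_score = scorer.score_partial(15.0)
```

The fitted values have mean 5 and standard deviation 2, so the value 15 lies five standard deviations from the mean.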

models.xStream

The xStream model for row-streaming data [BMLA18].

Integration Models

The pysad.models.integrations module contains models that integrate batch anomaly detection models into the streaming setting.

models.integrations.ReferenceWindowModel

This wrapper adapts batch anomaly detectors from PyOD for use on streaming data.
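One common way to use a batch detector on a stream is to keep a window of recent instances and refit the detector periodically. The sketch below illustrates that idea only; the class names, the `window_size` and `sliding_size` parameters, the refit schedule, and the toy detector are all hypothetical and are not taken from this reference.

```python
from collections import deque
from statistics import mean


class MeanDistanceBatchDetector:
    """Toy batch detector (hypothetical): scores by distance to the fitted mean."""

    def fit(self, values):
        self.center = mean(values)
        return self

    def score(self, value):
        return abs(value - self.center)


class ReferenceWindowSketch:
    """Refits a batch detector on the most recent `window_size` instances
    every `sliding_size` new instances."""

    def __init__(self, detector, window_size, sliding_size):
        self.detector = detector
        self.window = deque(maxlen=window_size)  # oldest instances drop out
        self.sliding_size = sliding_size
        self.since_refit = 0
        self.fitted = False

    def fit_partial(self, value):
        self.window.append(value)
        self.since_refit += 1
        if not self.fitted or self.since_refit >= self.sliding_size:
            self.detector.fit(list(self.window))
            self.since_refit = 0
            self.fitted = True
        return self

    def score_partial(self, value):
        return self.detector.score(value)


model = ReferenceWindowSketch(MeanDistanceBatchDetector(), window_size=4, sliding_size=2)
for v in [1.0, 1.0, 1.0, 1.0, 9.0, 9.0]:
    model.fit_partial(v)
```

A larger `sliding_size` trades freshness of the reference window for fewer, cheaper refits.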

models.integrations.OneFitModel

The wrapper model fits the model_cls to the initial instances.

Score Ensemblers

The pysad.transform.ensemble module consists of ensemblers that combine scores from multiple anomaly detectors.

transform.ensemble.MaximumScoreEnsembler

An ensembler that outputs the maximum of the previous scores.

transform.ensemble.MedianScoreEnsembler

An ensembler that outputs the median of the previous scores.

transform.ensemble.AverageScoreEnsembler

A wrapper class that outputs the weighted average of the anomaly scores from multiple anomaly detectors.

transform.ensemble.MaximumOfAverageScoreEnsembler

Maximum of average scores ensembler that outputs the maximum of the averages.

transform.ensemble.AverageOfMaximumScoreEnsembler

Average of maximum scores ensembler that outputs the average of the maximums.
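The combination rules named in this section can be sketched as plain functions over a list of scores, one score per detector. This is a hedged illustration rather than the library's code: the function names are hypothetical, and splitting the scores into contiguous buckets is an assumption (implementations may instead assign detectors to buckets at random). The bucketed variants assume there are at least as many scores as buckets.

```python
from statistics import mean, median


def maximum_score(scores):
    return max(scores)


def median_score(scores):
    return median(scores)


def weighted_average_score(scores, weights=None):
    # Uniform weights when none are given.
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def _buckets(scores, n_buckets):
    # Split scores into n_buckets contiguous groups of (nearly) equal size.
    size, extra = divmod(len(scores), n_buckets)
    groups, start = [], 0
    for i in range(n_buckets):
        end = start + size + (1 if i < extra else 0)
        groups.append(scores[start:end])
        start = end
    return groups


def maximum_of_average(scores, n_buckets):
    # Average within each bucket, then take the largest average.
    return max([mean(group) for group in _buckets(scores, n_buckets)])


def average_of_maximum(scores, n_buckets):
    # Take the maximum within each bucket, then average the maxima.
    return mean([max(group) for group in _buckets(scores, n_buckets)])


detector_scores = [0.1, 0.9, 0.4, 0.6]
moa = maximum_of_average(detector_scores, n_buckets=2)
aom = average_of_maximum(detector_scores, n_buckets=2)
```

With these four scores and two buckets, the two bucketed rules give different results, which shows that the order of averaging and maximizing matters.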

Probability Calibrators

The pysad.transform.probability_calibration module includes probability calibrators that convert model scores into true probabilities for decision-making on anomalousness.

transform.probability_calibration.ConformalProbabilityCalibrator

This class provides an interface to convert the scores into probabilities through conformal prediction.
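As a rough illustration of the conformal idea, a score can be mapped to the fraction of previously observed scores that are smaller than it. The sketch below assumes this particular ranking rule and uses hypothetical names (`ConformalCalibratorSketch`, `fit_partial`, `transform_partial`); the actual class may differ in windowing and tie handling.

```python
from bisect import bisect_left, insort


class ConformalCalibratorSketch:
    """Maps a score to the fraction of previously seen scores that are smaller."""

    def __init__(self):
        self.sorted_scores = []

    def fit_partial(self, score):
        insort(self.sorted_scores, score)  # keep the history sorted
        return self

    def transform_partial(self, score):
        if not self.sorted_scores:
            return 0.0  # no history yet, so no evidence of anomaly
        rank = bisect_left(self.sorted_scores, score)  # count of strictly smaller scores
        return rank / len(self.sorted_scores)


calibrator = ConformalCalibratorSketch()
for s in [0.2, 0.1, 0.4, 0.3]:
    calibrator.fit_partial(s)
probability = calibrator.transform_partial(0.35)
```

Because only ranks are used, the output does not depend on the scale or distribution of the raw scores.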

transform.probability_calibration.GaussianTailProbabilityCalibrator

Assuming that the scores follow a normal distribution, this class provides an interface to convert the scores into probabilities via the Q-function, i.e., the tail function of the Gaussian distribution [BALPA17].
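The Q-function mentioned above is the standard normal tail probability, Q(z) = P(Z > z), and can be computed with the complementary error function. The sketch below shows one plausible way to turn a score into a probability-like value, 1 - Q((score - mean) / std); that exact mapping, the function names, and the fallback for a zero standard deviation are assumptions, not the documented behaviour.

```python
import math


def q_function(z):
    """Tail probability P(Z > z) of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gaussian_tail_probability(score, mean, std):
    """Probability-like anomaly value: 1 - Q((score - mean) / std)."""
    if std <= 0:
        return 0.5  # degenerate spread: no information either way
    return 1.0 - q_function((score - mean) / std)


typical = gaussian_tail_probability(0.0, mean=0.0, std=1.0)
extreme = gaussian_tail_probability(3.0, mean=0.0, std=1.0)
```

In practice the mean and standard deviation would be running estimates maintained over the score stream.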

Projectors

The pysad.transform.projection module contains methods to project the input into a (possibly) lower-dimensional space to better discriminate anomalies.

transform.projection.StreamhashProjector

Streamhash projection method from Manzoor et al.

transform.projection.GaussianRandomProjector

Reduces dimensionality through Gaussian random projection.
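Gaussian random projection multiplies each instance by a fixed matrix whose entries are drawn from a zero-mean normal distribution. The following pure-Python sketch illustrates the technique; the function names, the variance of 1 / n_components, and the seeding scheme are assumptions for illustration.

```python
import math
import random


def make_gaussian_projection(n_features, n_components, seed=0):
    """Draw an n_components x n_features matrix with N(0, 1/n_components) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)  # standard deviation of each entry
    return [[rng.gauss(0.0, scale) for _ in range(n_features)]
            for _ in range(n_components)]


def project(matrix, x):
    """Multiply the projection matrix by one instance (a list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


matrix = make_gaussian_projection(n_features=5, n_components=2, seed=42)
low_dim = project(matrix, [1.0, 0.0, 2.0, 0.0, 3.0])
```

The matrix is drawn once and reused for every instance, so the projection is consistent across the stream.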

transform.projection.SparseRandomProjector

The wrapper method for Sklearn's SparseRandomProjection.

Preprocessors

The pysad.transform.preprocessing module includes preprocessing methods to transform inputs such as normalizers.

transform.preprocessing.IdentityScaler

A scaler that does not modify the input, which is added for convenience.

transform.preprocessing.InstanceStandardScaler

Standard deviation scaling per instance.

transform.preprocessing.InstanceUnitNormScaler

A scaler that makes the instance feature vector's norm equal to 1, i.e., the unit vector.
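Both instance-level scalers above operate on a single feature vector without any statistics from other instances. A minimal sketch, assuming that standard scaling also subtracts the instance mean and that the norm is Euclidean (neither is stated in this listing), with hypothetical function names:

```python
import math


def instance_standard_scale(x):
    """Subtract the instance's own mean and divide by its own standard deviation."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0 for _ in x]  # constant instance: nothing to scale
    return [(v - mean) / std for v in x]


def instance_unit_norm_scale(x):
    """Divide the instance by its Euclidean norm so the result has norm 1."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        return list(x)  # the zero vector cannot be normalized
    return [v / norm for v in x]


unit = instance_unit_norm_scale([3.0, 4.0])
standardized = instance_standard_scale([1.0, 3.0])
```

Since no state is carried between instances, these scalers need no fitting step on a stream.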

Postprocessors

The pysad.transform.postprocessing module includes postprocessors to transform model scores for streaming learning.

pysad.transform.postprocessing.AveragePostprocessor

A postprocessor that converts a score to the average of all previous scores.

pysad.transform.postprocessing.MaxPostprocessor

A postprocessor that converts a score to the maximum of all previous scores.

pysad.transform.postprocessing.MedianPostprocessor

A postprocessor that converts a score to the median of all previous scores.

pysad.transform.postprocessing.ZScorePostprocessor

A postprocessor that normalizes the score via Z-score normalization.

pysad.transform.postprocessing.RunningAveragePostprocessor

A postprocessor that converts a score to the average of all previous scores in the window.

pysad.transform.postprocessing.RunningMaxPostprocessor

A postprocessor that converts a score to the maximum of all previous scores in the window.

pysad.transform.postprocessing.RunningMedianPostprocessor

A postprocessor that converts a score to the median of all previous scores in the window.

pysad.transform.postprocessing.RunningZScorePostprocessor

A postprocessor that normalizes the score using Z-score normalization with the statistics of the window.
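The windowed postprocessors differ from their unwindowed counterparts only in which past scores they consider. A small sketch of windowed Z-score normalization, with hypothetical names (`RunningZScoreSketch`, `fit_partial`, `transform_partial`) and an assumed population standard deviation:

```python
from collections import deque
import math


class RunningZScoreSketch:
    """Z-score normalization using only the last `window_size` scores."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)

    def fit_partial(self, score):
        self.window.append(score)  # the oldest score is evicted when full
        return self

    def transform_partial(self, score):
        if not self.window:
            return 0.0
        n = len(self.window)
        mean = sum(self.window) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in self.window) / n)
        return (score - mean) / std if std > 0 else 0.0


post = RunningZScoreSketch(window_size=3)
for s in [100.0, 1.0, 2.0, 3.0]:
    post.fit_partial(s)
z = post.transform_partial(4.0)
```

The early score of 100 has already left the window, so it no longer influences the normalization.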

Statistics

The pysad.statistics module contains methods to keep track of statistics on streaming data.

statistics.AbsStatistic

The absolute value of the statistic that is tracked.

statistics.RunningStatistic

The running statistic that wraps any other statistics to track statistics with a fixed window size.
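Wrapping a statistic with a fixed window is straightforward when the inner statistic supports removing a value as well as adding one. The sketch below illustrates that pattern; the `update`, `remove`, and `get` method names and both class names are assumptions for illustration.

```python
from collections import deque


class SumSketch:
    """Toy statistic supporting both insertion and removal of values."""

    def __init__(self):
        self.total = 0.0

    def update(self, value):
        self.total += value

    def remove(self, value):
        self.total -= value

    def get(self):
        return self.total


class RunningStatisticSketch:
    """Wraps a statistic so it only reflects the last `window_size` values."""

    def __init__(self, statistic, window_size):
        self.statistic = statistic
        self.window_size = window_size
        self.values = deque()

    def update(self, value):
        if len(self.values) == self.window_size:
            self.statistic.remove(self.values.popleft())  # forget the oldest value
        self.values.append(value)
        self.statistic.update(value)

    def get(self):
        return self.statistic.get()


running_sum = RunningStatisticSketch(SumSketch(), window_size=3)
for v in [1.0, 2.0, 3.0, 4.0, 5.0]:
    running_sum.update(v)
```

Statistics such as the maximum or median cannot be updated by simple subtraction, so their `remove` step needs an auxiliary structure (for example a sorted container).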

statistics.AverageMeter

The average of the values.

statistics.CountMeter

A simple counter statistic.

statistics.MaxMeter

The statistic that keeps track of the maximum value.

statistics.MedianMeter

The statistic that keeps track of the median.

statistics.MinMeter

The statistic that keeps track of the minimum value.

statistics.SumMeter

The statistic that keeps track of the sum of values.

statistics.SumSquaresMeter

The statistic that keeps track of the sum of squares.

statistics.VarianceMeter

The statistic that keeps track of the variance of the values.

Evaluators

The pysad.evaluation module includes evaluation metrics for anomaly detection on streaming data.

evaluation.BaseSKLearnMetric

Abstract base class to wrap the sklearn metrics.

evaluation.PrecisionMetric

Precision wrapper class for sklearn.

evaluation.RecallMetric

Recall wrapper class for sklearn.

evaluation.AUROCMetric

Area under the ROC curve wrapper class for sklearn.

evaluation.AUPRMetric

Area under PR curve wrapper class for sklearn.

evaluation.WindowedMetric

A helper class to evaluate windowed metrics.

Utilities

The pysad.utils module includes utility functions used in the PySAD framework, which can also be useful for streaming learning.

utils.Window

A window that limits the number of instances kept in a list and holds the size fixed once it is full.

utils.Data

A helper class to load various data.

utils.ArrayStreamer

Simulator class to iterate array(s).
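Simulating a stream from in-memory arrays amounts to yielding one instance (and optionally its label) at a time. A minimal generator-based sketch follows; the function name `iterate_arrays` and its `shuffle` and `seed` parameters are hypothetical and do not describe the class's real signature.

```python
import random


def iterate_arrays(X, y=None, shuffle=False, seed=None):
    """Yield instances one at a time, paired with labels when y is given."""
    order = list(range(len(X)))
    if shuffle:
        random.Random(seed).shuffle(order)  # shuffle indices, not the data itself
    for i in order:
        yield (X[i], y[i]) if y is not None else X[i]


X = [[0.1, 0.2], [0.3, 0.4], [5.0, 6.0]]
y = [0, 0, 1]
pairs = list(iterate_arrays(X, y))
```

A model would typically consume such an iterator in a loop, scoring and then fitting on each instance in turn.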

utils.PandasStreamer

Simulator class to iterate dataframe(s).

utils._iterate

Iterates array of features and possibly labels.

utils.get_minmax_array

Utility method that returns the boundaries for each feature of the input array.
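Per-feature boundaries are the column-wise minimum and maximum of a two-dimensional input. A short sketch with a hypothetical function name and return format (two lists, minima first):

```python
def minmax_per_feature(X):
    """Return (mins, maxs), each with one entry per feature column of X."""
    columns = list(zip(*X))  # transpose rows into columns
    return [min(c) for c in columns], [max(c) for c in columns]


mins, maxs = minmax_per_feature([[1.0, 5.0], [3.0, 2.0], [2.0, 9.0]])
```

Such boundaries are useful for models that need the feature ranges before the stream starts.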

utils.get_minmax_scalar

Utility method that returns the boundaries of the input array.

utils.fix_seed

Utility method to fix the seed for randomness.