API Reference¶
This is the API documentation for PySAD.
Core¶
The pysad.core
module covers the base classes of PySAD.
Abstract base class for metrics. |
|
Abstract base class for the models. |
|
Base class for postprocessing methods. |
|
Abstract base class for the statistics. |
|
Abstract base class to simulate the streaming data. |
|
Base class for transforming methods. |
Individual Anomaly Models¶
The pysad.models
module includes anomaly detection models to score the anomalousness of instances.
The Exact-STORM method [BAF07]. |
|
Half-Space Trees method [BTTL11]. |
|
An Anomaly Detection Approach Based on Isolation Forest Algorithm for Streaming Data using Sliding Window [BDF13]. |
|
KitNET is a lightweight online anomaly detection algorithm based on an ensemble of autoencoders [BMDES18]. |
|
Conformalized density- and distance-based anomaly detection in time-series data [BBI16], which combines a feature extraction method, a score that assesses whether a new observation differs significantly from previously observed data, and a probabilistic interpretation of this score based on the conformal paradigm. |
|
The LODA model [BPevny16]. The implementation is adapted to the streaming framework from the PyOD framework. |
|
The implementation of streaming Local Outlier Probabilities method [BKKrogerSZ09], which uses the implementation of PyNomaly library [BCon18]. |
|
Median Absolute Deviation method [BHVK17]. |
|
The model that returns 0.5 for all instances, which is added for testing and pipelining convenience purposes. |
|
This model directly outputs the ground truth labels. |
|
Random scorer that chooses a score between 0 and 1 ignoring the input. |
|
Relative entropy based anomaly detection model on univariate stream [BALPA17]. |
|
Robust Random Cut Forest model [BGMRS16]. |
|
Subspace outlier detection in linear time with randomized hashing [BSA16]. |
|
The model that scores an instance by its deviation from the mean (or median), divided by the standard deviation. |
|
The xStream model for row-streaming data [BMLA18]. |
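The deviation-based scorer in the list above can be illustrated with a minimal pure-Python sketch. It does not use PySAD: the class name `DeviationScorer` and the method names `fit_partial` and `score_partial` are assumptions chosen for illustration, and the sketch handles only a univariate stream using the mean (not the median).

```python
import math


class DeviationScorer:
    """Hypothetical univariate scorer: |x - mean| / std over the values seen so far."""

    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def fit_partial(self, x):
        # Welford's online update of the mean and the squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self

    def score_partial(self, x):
        # Uses the population standard deviation; returns 0.0 while it is
        # undefined (fewer than two values) or zero.
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / self.n)
        return abs(x - self.mean) / std if std > 0 else 0.0


scorer = DeviationScorer()
for value in [10.0, 10.0, 12.0, 12.0]:
    scorer.fit_partial(value)

typical = scorer.score_partial(11.0)   # at the mean, so the score is near zero
outlier = scorer.score_partial(20.0)   # far from the mean, so the score is large
```

The score grows with the distance from the running mean, measured in units of the running standard deviation.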
Integration Models¶
The pysad.models.integrations
module contains models to integrate batch anomaly detection models into the streaming setting.
This PyOD model wrapper wraps the batch anomaly detectors. |
|
The wrapper model fits the model_cls to the initial instances. |
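The idea of fitting a batch model to the initial instances of a stream can be sketched as follows. This is a rough illustration, not the PySAD wrapper: `InitialFitWrapper`, `MeanDistanceBatchModel`, and the parameter `n_initial` are hypothetical names, and the batch model is a toy stand-in with `fit` and `decision_function` methods.

```python
class MeanDistanceBatchModel:
    """Hypothetical batch detector: score is the distance from the training mean."""

    def fit(self, X):
        self.center = sum(X) / len(X)
        return self

    def decision_function(self, X):
        return [abs(x - self.center) for x in X]


class InitialFitWrapper:
    """Buffers the first n_initial instances, then fits model_cls on them once."""

    def __init__(self, model_cls, n_initial):
        self.model_cls = model_cls
        self.n_initial = n_initial
        self.buffer = []
        self.model = None

    def fit_partial(self, x):
        # Only the initial instances are used for fitting; later ones are ignored.
        if self.model is None:
            self.buffer.append(x)
            if len(self.buffer) >= self.n_initial:
                self.model = self.model_cls().fit(self.buffer)
        return self

    def score_partial(self, x):
        # Before the batch model is fitted there is nothing to score with.
        if self.model is None:
            return 0.0
        return self.model.decision_function([x])[0]


wrapper = InitialFitWrapper(MeanDistanceBatchModel, n_initial=3)
score_before_fit = wrapper.score_partial(5.0)
for value in [1.0, 2.0, 3.0]:
    wrapper.fit_partial(value)
score_after_fit = wrapper.score_partial(5.0)
```

Returning a neutral score before the model is fitted is a design choice of this sketch; a wrapper could equally raise an error or refit periodically.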
Score Ensemblers¶
The pysad.transform.ensemble
module consists of ensemblers to combine scores from multiple anomaly detectors.
An ensembler that outputs the maximum of the previous scores. |
|
An ensembler that outputs the median of the previous scores. |
|
A wrapper class that outputs the weighted average of the anomaly scores from multiple anomaly detectors. |
|
Maximum of average scores ensembler that outputs the maximum of average. |
|
Maximum of average scores ensembler that outputs the maximum of average. |
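The combination rules listed above can be sketched as plain functions over a list of scores, one score per detector. The function names are hypothetical and none of this is PySAD code. In particular, `maximum_of_averages` reflects one common reading of that rule (split the scores into buckets, average each bucket, take the maximum), which is an assumption here.

```python
from statistics import median


def max_score(scores):
    """Largest score among the detectors."""
    return max(scores)


def median_score(scores):
    """Median score among the detectors."""
    return median(scores)


def weighted_average_score(scores, weights):
    """Average of the scores, each weighted by its detector's weight."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def maximum_of_averages(scores, bucket_size):
    """Average each contiguous bucket of scores, then take the largest average."""
    buckets = [scores[i:i + bucket_size] for i in range(0, len(scores), bucket_size)]
    return max(sum(bucket) / len(bucket) for bucket in buckets)


scores = [0.2, 0.9, 0.4, 0.5]   # one score per detector for a single instance

combined_max = max_score(scores)
combined_median = median_score(scores)
combined_weighted = weighted_average_score(scores, weights=[1, 2, 1, 0])
combined_max_of_avg = maximum_of_averages(scores, bucket_size=2)
```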
Probability Calibrators¶
The pysad.transform.probability_calibration
module includes probability calibrators to convert model scores into true probabilities for decision-making on anomalousness.
This class provides an interface to convert the scores into probabilities through conformal prediction. |
|
Assuming that the scores follow normal distribution, this class provides an interface to convert the scores into probabilities via Q-function, i.e., the tail function of Gaussian distribution [BALPA17]. |
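Both calibration ideas above reduce to a short formula. The sketch below is illustrative only and does not use PySAD: `gaussian_tail` evaluates the Q-function through the standard library's `math.erfc`, and `conformal_p_value` uses a simple empirical p-value, which is one basic form of conformal prediction and an assumption about the variant intended here.

```python
import math


def gaussian_tail(score, mean, std):
    """Q-function: probability that a normal variable exceeds `score`."""
    z = (score - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def conformal_p_value(score, previous_scores):
    """Fraction of previously seen scores that are at least as large as `score`."""
    if not previous_scores:
        return 1.0
    return sum(1 for s in previous_scores if s >= score) / len(previous_scores)


tail_at_mean = gaussian_tail(0.0, mean=0.0, std=1.0)   # half the mass lies above the mean
tail_far = gaussian_tail(3.0, mean=0.0, std=1.0)       # three standard deviations out
p_value = conformal_p_value(0.5, [0.1, 0.2, 0.3, 0.9])
```

A small tail probability or p-value indicates an unusually high score. One plausible convention, assumed here rather than taken from the library, is to report one minus that value as the probability of anomalousness.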
Projectors¶
The pysad.transform.projection
module contains methods to project the input into a (possibly) lower-dimensional space to better discriminate anomalies.
Streamhash projection method from Manzoor et al. |
|
Reduces dimensionality through Gaussian random projection. |
|
The wrapper method for Sklearn's SparseRandomProjection. |
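Gaussian random projection, mentioned above, multiplies each instance by a fixed matrix of normally distributed entries. The following is a minimal standard-library sketch under that definition. The function names, the `1 / sqrt(n_components)` scaling, and the seed handling are assumptions made for illustration, not the PySAD or scikit-learn implementation.

```python
import math
import random


def make_gaussian_projection(n_features, n_components, seed=0):
    """Build an n_components x n_features matrix with scaled N(0, 1) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
            for _ in range(n_components)]


def project(matrix, x):
    """Project one instance: a dot product with each row of the matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


matrix = make_gaussian_projection(n_features=5, n_components=2, seed=42)
x = [1.0, 0.0, 2.0, 0.0, 3.0]
z = project(matrix, x)   # a 2-dimensional representation of the 5-dimensional x
```

Because the matrix is drawn once and then reused, every instance in the stream is mapped by the same linear transformation.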
Preprocessors¶
The pysad.transform.preprocessing
module includes preprocessing methods to transform inputs such as normalizers.
A scaler that does not modify the input, which is added for convenience. |
|
Standard deviation scaling per instance. |
|
A scaler that makes the instance feature vector's norm equal to 1, i.e., the unit vector. |
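Unit-norm scaling, the last item above, divides a feature vector by its Euclidean norm. A minimal sketch follows. The function name `unit_norm` is hypothetical, and leaving an all-zero vector unchanged is an assumption of this sketch.

```python
import math


def unit_norm(x):
    """Scale a feature vector so that its Euclidean norm equals 1."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        return list(x)   # an all-zero vector has no direction; leave it unchanged
    return [v / norm for v in x]


scaled = unit_norm([3.0, 4.0])
scaled_norm = math.sqrt(sum(v * v for v in scaled))
```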
Postprocessors¶
The pysad.transform.postprocessing
module includes postprocessors to transform model scores for streaming learning.
A postprocessor that converts a score to the average of all previous scores. |
|
A postprocessor that converts a score to the maximum of all previous scores. |
|
A postprocessor that converts a score to the median of all previous scores. |
|
A postprocessor that normalizes the score via Z-score normalization. |
|
A postprocessor that converts a score to the average of all previous scores in the window. |
|
A postprocessor that converts a score to the maximum of all previous scores in the window. |
|
A postprocessor that converts a score to the median of all previous scores in the window. |
|
A postprocessor that normalizes the score using Z-score normalization with the statistics of the window. |
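The windowed Z-score postprocessing described in the last item can be sketched with a bounded `collections.deque`. The class name `WindowedZScore` and its method names are hypothetical, and the use of the population standard deviation is an assumption. This is not the PySAD implementation.

```python
from collections import deque
import math


class WindowedZScore:
    """Hypothetical postprocessor: Z-score of a score against the last window_size scores."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)   # old scores fall out automatically

    def fit_partial(self, score):
        self.window.append(score)
        return self

    def transform_partial(self, score):
        # With fewer than two scores, or zero spread, the Z-score is undefined.
        if len(self.window) < 2:
            return 0.0
        mean = sum(self.window) / len(self.window)
        variance = sum((s - mean) ** 2 for s in self.window) / len(self.window)
        std = math.sqrt(variance)
        return (score - mean) / std if std > 0 else 0.0


postprocessor = WindowedZScore(window_size=4)
for raw_score in [100.0, 1.0, 1.0, 3.0, 3.0]:
    postprocessor.fit_partial(raw_score)

# The first score (100.0) has left the window, so only 1, 1, 3, 3 are used.
normalized = postprocessor.transform_partial(5.0)
```

The non-windowed variants differ only in keeping statistics over all previous scores instead of the most recent ones.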
Statistics¶
The pysad.statistics
module contains methods to keep track of statistics on streaming data.
The absolute value of the statistic that is tracked. |
|
The running statistic that wraps any other statistics to track statistics with a fixed window size. |
|
The average of the values. |
|
A simple counter statistic. |
|
The statistic that keeps track of the maximum value. |
|
The statistic that keeps track of the median. |
|
The statistic that keeps track of the minimum value. |
|
The statistic that keeps track of the sum of values. |
|
The statistic that keeps track of the sum of squares. |
|
The statistic that keeps track of the variance of the values. |
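The relationship between a plain statistic and the fixed-window wrapper described above can be sketched as follows. `AverageTracker` and `WindowedTracker`, along with their `update`, `remove`, and `get` methods, are hypothetical names for this illustration and are not presented as the library's interface.

```python
from collections import deque


class AverageTracker:
    """Hypothetical statistic: the average of all values added and not yet removed."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1
        return self

    def remove(self, value):
        self.total -= value
        self.count -= 1
        return self

    def get(self):
        return self.total / self.count if self.count else 0.0


class WindowedTracker:
    """Wraps a tracker so that it reflects only the last window_size values."""

    def __init__(self, tracker, window_size):
        self.tracker = tracker
        self.window_size = window_size
        self.window = deque()

    def update(self, value):
        self.window.append(value)
        self.tracker.update(value)
        if len(self.window) > self.window_size:
            # Evict the oldest value from both the window and the wrapped statistic.
            self.tracker.remove(self.window.popleft())
        return self

    def get(self):
        return self.tracker.get()


running = WindowedTracker(AverageTracker(), window_size=3)
for value in [1.0, 2.0, 3.0, 10.0]:
    running.update(value)

windowed_average = running.get()   # average of the last three values: 2, 3, 10
```

The wrapper needs the wrapped statistic to support removal, which is straightforward for sums and averages but requires extra bookkeeping for statistics such as the maximum or the median.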
Evaluators¶
The pysad.evaluation
module includes evaluation metrics for anomaly detection on streaming data.
Abstract base class to wrap the sklearn metrics. |
|
Precision wrapper class for sklearn. |
|
Recall wrapper class for sklearn. |
|
Area under the ROC curve wrapper class for sklearn. |
|
Area under PR curve wrapper class for sklearn. |
|
A helper class to evaluate windowed metrics. |
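Metrics on a stream are typically updated one labeled instance at a time. The sketch below shows this for precision and recall in pure Python. It does not wrap sklearn, and the class name `StreamingPrecisionRecall`, the `update` method, and the `threshold` parameter are assumptions made for illustration.

```python
class StreamingPrecisionRecall:
    """Hypothetical metric that counts outcomes as (label, score) pairs arrive."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.tp = 0   # anomalies flagged as anomalies
        self.fp = 0   # normal instances flagged as anomalies
        self.fn = 0   # anomalies that were missed

    def update(self, y_true, score):
        predicted_anomaly = score >= self.threshold
        if predicted_anomaly and y_true == 1:
            self.tp += 1
        elif predicted_anomaly:
            self.fp += 1
        elif y_true == 1:
            self.fn += 1
        return self

    def precision(self):
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self):
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0


metric = StreamingPrecisionRecall(threshold=0.5)
stream = [(1, 0.9), (0, 0.8), (1, 0.2), (0, 0.1), (1, 0.7)]
for label, score in stream:
    metric.update(label, score)

precision = metric.precision()   # 2 true positives out of 3 flagged instances
recall = metric.recall()         # 2 true positives out of 3 actual anomalies
```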
Utilities¶
The pysad.utils
module includes utility functions used in the PySAD framework, which can also be useful for streaming learning.
Window to limit the instances in list and keep the size fixed when full. |
|
A helper class to load various data. |
|
Simulator class to iterate array(s). |
|
Simulator class to iterate dataframe(s). |
|
Iterates array of features and possibly labels. |
|
Utility method that returns the boundaries for each feature of the input array. |
|
Utility method that returns the boundaries of the input array. |
|
Utility method to fix the seed for randomness. |
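Several of the utilities above have close standard-library analogues, which the following sketch uses to show the ideas: a fixed-size window, iterating an array as a stream, per-feature boundaries, and seeding. The function names `iterate_stream` and `feature_bounds` are hypothetical, and the code does not call PySAD.

```python
from collections import deque
import random


def iterate_stream(X, y=None):
    """Yield instances one at a time, paired with labels when they are given."""
    if y is None:
        yield from X
    else:
        yield from zip(X, y)


def feature_bounds(X):
    """Return (min, max) for each feature column of a list of rows."""
    return [(min(column), max(column)) for column in zip(*X)]


random.seed(61)   # fix the seed so that any later randomness is reproducible

X = [[0.0, 5.0], [1.0, 4.0], [2.0, 3.0], [3.0, 2.0]]
y = [0, 0, 1, 0]

window = deque(maxlen=3)   # the oldest instance is dropped once the window is full
for x, label in iterate_stream(X, y):
    window.append(x)

bounds = feature_bounds(X)
```

After the loop, the window holds only the three most recent instances, while `bounds` describes the range of each feature over the whole array.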