Basics

Generate signal

This example covers the basic usage of the change point detection methods in the roerich library. Let’s generate a signal with different regimes, or, in other words, with different distributions of observations. The goal is to find all change points, that is, the positions where the distribution changes. We can also treat this task as signal segmentation, where each segment corresponds to one regime of the signal.

import roerich
import roerich.algorithms
import roerich.metrics
import numpy as np

# generate a time series with change points
X, cps_true = roerich.generate_dataset(period=200, N_tot=2000)
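
To make the idea of "regimes" concrete, here is a minimal standard-library sketch of a signal whose distribution changes every `period` samples. The function `generate_piecewise_signal` and its parameters are hypothetical and only illustrate the concept; this is not how `roerich.generate_dataset` is implemented, and its output format may differ.

```python
import random


def generate_piecewise_signal(period=200, n_total=2000, seed=0):
    """Toy 1-D signal whose mean and spread change every `period` samples.

    Hypothetical helper for illustration only (not part of roerich).
    """
    rng = random.Random(seed)
    signal = []
    change_points = []
    for start in range(0, n_total, period):
        if start > 0:
            change_points.append(start)  # a new regime begins at this index
        mean = rng.uniform(-3.0, 3.0)    # regime-specific mean
        std = rng.uniform(0.5, 2.0)      # regime-specific standard deviation
        length = min(period, n_total - start)
        signal.extend(rng.gauss(mean, std) for _ in range(length))
    return signal, change_points


X_toy, cps_toy = generate_piecewise_signal(period=200, n_total=2000)
print(len(X_toy), cps_toy)
```

Each segment between two consecutive change points is one regime, so detecting the change points is equivalent to segmenting the signal.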

ChangePointDetectionClassifier

The first method we use here is ChangePointDetectionClassifier, which is based on binary classifiers. A classifier takes two parts of the signal, each with window_size observations, and learns to separate them into two classes. The method then uses the classifier's predictions to estimate the probability density ratio for these observations and calculates the change point score between the two windows. The classifier is refit on each new pair of windows.
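
The link between classifier predictions and the density ratio can be sketched as follows. If the two windows have the same size and the classifier outputs the probability p(x) that an observation belongs to the second window, then p(x) / (1 - p(x)) estimates the density ratio between the windows. Averaging the log-ratio over each window gives an estimate of the symmetric KL divergence. The function `kl_sym_score` is a hypothetical illustration under these assumptions, not the library's own code.

```python
import math


def kl_sym_score(proba_ref, proba_test, eps=1e-6):
    """Symmetric KL estimate from classifier probabilities (illustrative).

    proba_ref:  predicted P(class = "test") for reference-window samples
    proba_test: predicted P(class = "test") for test-window samples
    Assumes both windows have the same number of observations.
    """
    def log_ratio(p):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        return math.log(p / (1.0 - p))    # log of the estimated density ratio

    # KL(test || ref): mean log-ratio over test-window samples
    kl_test_ref = sum(log_ratio(p) for p in proba_test) / len(proba_test)
    # KL(ref || test): mean negative log-ratio over reference-window samples
    kl_ref_test = -sum(log_ratio(p) for p in proba_ref) / len(proba_ref)
    return kl_test_ref + kl_ref_test


# Indistinguishable windows: the classifier is unsure, so the score is zero.
print(kl_sym_score([0.5, 0.5], [0.5, 0.5]))
# Well-separated windows: confident predictions give a large score.
print(kl_sym_score([0.1, 0.1], [0.9, 0.9]))
```

A score near zero means the classifier cannot tell the windows apart, while a large score suggests a change point between them.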

from roerich.algorithms import ChangePointDetectionClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# sklearn-like binary classifier
clf = QuadraticDiscriminantAnalysis()

# change points detection
cpd = ChangePointDetectionClassifier(base_classifier=clf, metric='KL_sym', periods=1,
                                     window_size=100, step=1, n_runs=1)
score, cps_pred = cpd.predict(X)

# visualization
roerich.display(X, cps_true, score, cps_pred)

Now, let’s measure the quality of change point detection using the precision, recall, and PR AUC metrics.

# metrics
precision, recall = roerich.metrics.precision_recall_scores(cps_true, cps_pred, window=20)
auc = roerich.metrics.pr_auc(cps_true, cps_pred, score[cps_pred], window=20)
print('Precision: ', precision)
print('Recall: ', recall)
print('PR AUC: ', auc)

Precision:  1.0
Recall:  1.0
PR AUC:  1.0
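
The role of the tolerance window in these metrics can be illustrated with a small sketch. It assumes that a predicted change point counts as a true positive if it lies within `window` observations of a true change point, and that each true change point can be matched at most once. The helper `precision_recall_with_tolerance` is hypothetical; the exact matching rule in `roerich.metrics.precision_recall_scores` may differ.

```python
def precision_recall_with_tolerance(cps_true, cps_pred, window=20):
    """Precision and recall with a tolerance window (illustrative sketch)."""
    unmatched = list(cps_pred)
    true_positives = 0
    for t in cps_true:
        # predictions that fall within the tolerance window of this true point
        candidates = [p for p in unmatched if abs(p - t) <= window]
        if candidates:
            best = min(candidates, key=lambda p: abs(p - t))
            unmatched.remove(best)       # each prediction is used at most once
            true_positives += 1
    precision = true_positives / len(cps_pred) if len(cps_pred) > 0 else 0.0
    recall = true_positives / len(cps_true) if len(cps_true) > 0 else 0.0
    return precision, recall


true_points = list(range(200, 2000, 200))
pred_points = [194, 399, 598, 803, 1001, 1199, 1399, 1598, 1800]
print(precision_recall_with_tolerance(true_points, pred_points, window=20))
```

Under this rule, a detection a few observations away from the true position still counts as correct, while a detection far from every true change point lowers precision.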

The ChangePointDetectionClassifier method can take any sklearn-like binary classifier. For example, you can use your favorite neural network or gradient boosting over decision trees algorithm. The following example uses a shallow neural network implemented in PyTorch as the base classifier for change point detection.

# pytorch NN classifier
clf = roerich.algorithms.NNClassifier(n_hidden=10, n_epochs=10, batch_size=64, lr=0.1, l2=0.0)

# detection
cpd = ChangePointDetectionClassifier(base_classifier=clf, metric='KL_sym', periods=1,
                                     window_size=100, step=1, n_runs=1)

score, cps_pred = cpd.predict(X)
cps_pred

array([ 194,  399,  598,  803, 1001, 1199, 1399, 1598, 1800])

ChangePointDetectionRuLSIF

Similarly, the ChangePointDetectionRuLSIF method estimates the probability density ratio between two windows of the signal. However, it does so using regression models trained with the RuLSIF loss function, which predict the ratios directly without learning the individual distributions. Roerich provides two RuLSIF models: NNRuLSIFRegressor, a shallow neural network implemented in PyTorch, and GBDTRuLSIFRegressor, gradient boosting over regression trees. The following examples show how to use them for change point detection.
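
For intuition, the RuLSIF objective can be written in a few lines. In its standard formulation, a model g(x) approximates the relative density ratio p(x) / (alpha * p(x) + (1 - alpha) * q(x)) by minimizing alpha/2 * E_p[g^2] + (1 - alpha)/2 * E_q[g^2] - E_p[g], and the Pearson (PE) divergence estimate equals the negative of that loss minus 1/2. The sketch below evaluates both on lists of model outputs. The function names are hypothetical, and which window plays the role of p or q inside roerich is an assumption not stated here.

```python
def rulsif_loss(g_p, g_q, alpha=0.05):
    """RuLSIF loss for model outputs g(x) (illustrative sketch).

    g_p: outputs g(x) on samples from the numerator distribution p
    g_q: outputs g(x) on samples from the other distribution q
    """
    mean_sq_p = sum(g * g for g in g_p) / len(g_p)
    mean_sq_q = sum(g * g for g in g_q) / len(g_q)
    mean_p = sum(g_p) / len(g_p)
    return 0.5 * alpha * mean_sq_p + 0.5 * (1.0 - alpha) * mean_sq_q - mean_p


def pe_divergence(g_p, g_q, alpha=0.05):
    """Pearson divergence estimate derived from the same quantities."""
    return -rulsif_loss(g_p, g_q, alpha) - 0.5


# If p and q are identical, the true ratio is 1 everywhere and PE is zero.
print(pe_divergence([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], alpha=0.05))
```

Because the model is trained on the ratio itself, no separate density estimates for the two windows are needed.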

from roerich.algorithms import NNRuLSIFRegressor, ChangePointDetectionRuLSIF

# pytorch NN regressor with RuLSIF loss function
reg = NNRuLSIFRegressor(n_hidden=10, n_epochs=10, batch_size=64, 
                        lr=0.1, l2=0.0, alpha=0.05)

# detection
cpd = ChangePointDetectionRuLSIF(reg, metric='PE', periods=1,
                                 window_size=100, step=1, n_runs=1)

score, cps_pred = cpd.predict(X)

# visualization
roerich.display(X, cps_true, score, cps_pred)

from roerich.algorithms import GBDTRuLSIFRegressor

# regressor with RuLSIF loss function
reg = GBDTRuLSIFRegressor(n_estimators=10, max_depth=2)

# detection
cpd = ChangePointDetectionRuLSIF(reg, metric='PE', periods=1,
                                 window_size=100, step=1, n_runs=1)

score, cps_pred = cpd.predict(X)
cps_pred

array([ 206,  400,  596,  798, 1001, 1201, 1401, 1600, 1799])

OnlineNNClassifier

OnlineNNClassifier is based on the same principles as the previous methods, but it uses a single neural network to detect change points in the whole time series. The network scans the signal point by point (window_size=1) and updates its weights online. It estimates the probability density ratio between distant observations of the signal (lag_size=100 points apart). As a result, the method is faster than the previous algorithms, because it keeps the network from previous iterations and updates it instead of refitting a model for every pair of windows.
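
The online scheme can be illustrated with a deliberately simplified sketch. A one-feature logistic model stands in for the neural network: at each step it takes one gradient step to separate the current observation from the one `lag_size` steps earlier, and the score is the difference of its logits on the two points. The function `online_lag_scores` and this scoring rule are assumptions for illustration and do not reproduce the OnlineNNClassifier implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def online_lag_scores(signal, lag_size=100, lr=0.1):
    """Toy online detector: logistic model updated once per observation."""
    w, b = 0.0, 0.0                      # model weights persist across steps
    scores = [0.0] * len(signal)
    for t in range(lag_size, len(signal)):
        x_ref, x_test = signal[t - lag_size], signal[t]
        # one SGD step on the logistic loss: label 0 = lagged, 1 = current
        err_ref = sigmoid(w * x_ref + b) - 0.0
        err_test = sigmoid(w * x_test + b) - 1.0
        w -= lr * (err_ref * x_ref + err_test * x_test)
        b -= lr * (err_ref + err_test)
        # logit difference = estimated log density ratio between the two points
        scores[t] = (w * x_test + b) - (w * x_ref + b)
    return scores


toy_signal = [0.0] * 300 + [3.0] * 300   # noise-free signal, change at t = 300
toy_scores = online_lag_scores(toy_signal, lag_size=100)
print(toy_scores[150], toy_scores[350] > 0, toy_scores[500])
```

The score is nonzero only while the current and the lagged observation come from different regimes, which is why peaks of the score indicate change points.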

from roerich.algorithms import OnlineNNClassifier

# detection
cpd = OnlineNNClassifier(net='default', scaler="default", metric="KL_sym",
                         periods=1, window_size=1, lag_size=100, step=1, 
                         n_epochs=10, lr=0.1, lam=0.0001, optimizer="Adam")

score, cps_pred = cpd.predict(X)

# visualization
roerich.display(X, cps_true, score, cps_pred)

OnlineNNRuLSIF

OnlineNNRuLSIF works the same way as OnlineNNClassifier, but uses a neural network regression model with the RuLSIF loss function, fitted online.

from roerich.algorithms import OnlineNNRuLSIF

cpd = OnlineNNRuLSIF(alpha=0.05, net='default', scaler="default",
                     periods=1, window_size=1, lag_size=100, step=1, n_epochs=10,
                     lr=0.1, lam=0.0001, optimizer="Adam")

score, cps_pred = cpd.predict(X)
cps_pred

array([ 198,  402,  601,  798, 1006, 1203, 1401, 1604, 1804])