Exponentially Weighted Moving Average (EWMA)

Main idea

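The Exponentially Weighted Moving Average (EWMA) is a method that assigns different weights to the observations, takes their weighted average, and uses that average as the basis for a forecast. Recent observations usually receive higher weights because they better reflect the current trend. The weights decay exponentially with age: the closer an observation is to the current time, the larger its weight.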

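It can be viewed as an idealized maximum likelihood estimate, in which the current estimate is determined jointly by the previous estimate and the current sample. It can also be viewed as a low-pass filter: a smoothing that removes short-term fluctuations and keeps the long-term trend.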

Advantages

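  1. No need to store all past values
  2. Much less computation: each update uses only the previous average and the new observation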
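Both advantages can be seen in a minimal streaming sketch. The class name `StreamingEwma`, the default `beta=0.9`, and the sample readings are illustrative assumptions, not part of the original text; the update itself is the recursive rule given in the next section.

```python
class StreamingEwma:
    """Hypothetical helper: keeps only the latest EWMA value, never the history."""

    def __init__(self, beta=0.9, initial=0.0):
        if not 0.0 <= beta < 1.0:
            raise ValueError("beta must be in [0, 1)")
        self.beta = beta
        self.value = initial  # the only state carried between updates

    def update(self, observation):
        # One multiply-add per observation:
        # v_t = beta * v_{t-1} + (1 - beta) * theta_t
        self.value = self.beta * self.value + (1.0 - self.beta) * observation
        return self.value


ewma = StreamingEwma(beta=0.5)
for reading in [10.0, 20.0, 30.0]:  # made-up sample data
    latest = ewma.update(reading)
```

Memory use stays constant no matter how many readings arrive, whereas a plain windowed mean would need to keep the whole window.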

Computation

\[ v_t = \beta v_{t-1}+(1-\beta)\theta_t\]

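Here \(\theta_t\) is the observed value at time \(t\) (in this example, the actual temperature); \(\beta\) controls how fast the weights decay, and the smaller it is, the faster they decay; \(v_t\) is the EWMA value at time \(t\).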
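Unrolling the recursion with \(v_0=0\) gives \(v_t=(1-\beta)\sum_{k=0}^{t-1}\beta^{k}\theta_{t-k}\), so the observation \(k\) steps in the past carries weight \((1-\beta)\beta^{k}\). The sketch below checks this numerically; the function names and the temperature values are made up for illustration.

```python
def ewma_recursive(observations, beta):
    """Apply v_t = beta * v_{t-1} + (1 - beta) * theta_t, starting from v_0 = 0."""
    v = 0.0
    for theta in observations:
        v = beta * v + (1.0 - beta) * theta
    return v


def ewma_explicit(observations, beta):
    """Same value written as an explicit weighted sum over past observations."""
    t = len(observations)
    return sum(
        (1.0 - beta) * beta ** k * observations[t - 1 - k]  # k = age of the sample
        for k in range(t)
    )


temperatures = [12.0, 15.0, 11.0, 18.0, 16.0]  # made-up sample data
beta = 0.9

# Weight given to the newest sample (k = 0), the one before it (k = 1), and so on.
weights = [(1.0 - beta) * beta ** k for k in range(len(temperatures))]

v_rec = ewma_recursive(temperatures, beta)
v_exp = ewma_explicit(temperatures, beta)
```

The two results agree up to floating-point rounding. The weights shrink geometrically with age and sum to \(1-\beta^t\) rather than 1, which is why the early values come out too small when \(v_0=0\).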

Bias correction

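If we initialize \(v_0=0\), the early values are all biased toward zero. This effect fades as \(t\) grows, but it can be removed by dividing the running value by \(1-\beta^t\): \[\hat{v}_t = \frac{v_t}{1-\beta^t} = \frac{\beta v_{t-1}+(1-\beta)\theta_t}{1-\beta^t}\] Here \(v_{t-1}\) is the uncorrected value from the previous step; the correction is applied only to the reported value \(\hat{v}_t\) and is not fed back into the recursion.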

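When \(t\) is small (early on), the denominator is well below 1 and scales the current value up; when \(t\) is large, the denominator approaches 1 and has almost no effect on the value.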
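A minimal sketch of the correction, assuming the standard form in which the uncorrected value is carried forward and only the reported value is divided by \(1-\beta^t\). The function name and the constant test series are illustrative assumptions.

```python
def ewma_bias_corrected(observations, beta):
    """Return (raw, corrected) EWMA sequences, starting from v_0 = 0.

    Assumes 0 <= beta < 1 so that 1 - beta**t is never zero.
    """
    raw, corrected = [], []
    v = 0.0
    for t, theta in enumerate(observations, start=1):
        v = beta * v + (1.0 - beta) * theta      # uncorrected recursion
        raw.append(v)
        corrected.append(v / (1.0 - beta ** t))  # correction on the output only
    return raw, corrected


# A constant series makes the bias easy to see: the true average is 20.
raw, corrected = ewma_bias_corrected([20.0] * 5, beta=0.9)
```

On this series the raw values start near 2 and climb slowly toward 20, while the corrected values equal 20 (up to rounding) from the first step.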

Code

import numpy as np


class Ewma(object):
    """
    In statistical quality control, the EWMA chart (or exponentially weighted moving average chart)
    is a type of control chart used to monitor either variables or attributes-type data using the monitored business
    or industrial process's entire history of output. While other control charts treat rational subgroups of samples
    individually, the EWMA chart tracks the exponentially-weighted moving average of all prior sample means.
    WIKIPEDIA: https://en.wikipedia.org/wiki/EWMA_chart
    """

    def __init__(self, alpha=0.3, coefficient=3):
        """
        :param alpha: Discount rate of ewma, usually in (0.2, 0.3). It weights the newest
            sample, so it corresponds to 1 - beta in the formula above.
        :param coefficient: Coefficient is the width of the control limits, usually in (2.7, 3.0).
        """
        self.alpha = alpha
        self.coefficient = coefficient

    def predict(self, X):
        """
        Predict whether the latest sample of a time series is an outlier.
        :param X: the time series to check
        :type X: pandas.Series
        :return: 1 denotes normal, 0 denotes abnormal
        """
        # Work on a plain array so positional indexing is safe for any Series index.
        x = np.asarray(X, dtype=float)
        # EWMA sequence, seeded with the first observation.
        s = [x[0]]
        for i in range(1, len(x)):
            s.append(self.alpha * x[i] + (1 - self.alpha) * s[-1])
        s_avg = np.mean(s)  # center line of the chart
        sigma = np.std(x)
        # Half-width of the control band around the center line.
        width = self.coefficient * sigma * np.sqrt(self.alpha / (2 - self.alpha))
        ucl = s_avg + width
        lcl = s_avg - width
        # Only the most recent EWMA value is tested against the limits.
        if s[-1] > ucl or s[-1] < lcl:
            return 0
        return 1