An Introduction to Evaluation Metrics for Time Series Forecasting

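This post briefly surveys the evaluation metrics commonly used in time series forecasting (and in other machine learning and deep learning applications), as a reference for later research and review.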

Related reading: A Collection of Common Formulas in Probability and Mathematical Statistics

1 Deterministic Forecast Metrics

1.1 Error-Based Metrics

Absolute / Relative / Cumulative / Scaled

1.1.1 MAE

Mean Absolute Error (MAE): measures the average magnitude of the errors in a set of predictions, without considering their direction.

\text{MAE}=\frac1n \sum\limits_{t=1}^n|y_t-\hat y_t|

1.1.2 MSE

Mean Squared Error (MSE): measures the average of the squared errors.

\text{MSE}=\frac1n \sum\limits_{t=1}^n(y_t-\hat y_t)^2

1.1.3 RMSE

Root Mean Squared Error (RMSE): the square root of the MSE. It is expressed in the same units as the data and, because errors are squared before averaging, gives higher weight to larger errors.

\text{RMSE}=\sqrt{\frac1n \sum\limits_{t=1}^n(y_t-\hat y_t)^2}
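
The three formulas above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the toy data are assumptions made here, not part of the original post.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_t - yhat_t|."""
    return sum(abs(y - f) for y, f in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average of (y_t - yhat_t)^2."""
    return sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Toy data (hypothetical): the errors are 1, 0, -2, 3.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 4.0]

print(mae(y_true, y_pred))   # 1.5
print(mse(y_true, y_pred))   # 3.5
print(rmse(y_true, y_pred))  # sqrt(3.5), larger than the MAE
```

On this toy data the RMSE exceeds the MAE because the single error of size 3 dominates the squared sum.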

1.1.4 MAPE

Mean Absolute Percentage Error (MAPE): measures the size of the error as a percentage of the actual value. It is undefined when any y_t equals zero.

\text{MAPE}=\frac1n\sum\limits_{t=1}^n\left|\frac{y_t-\hat y_t}{y_t}\right| \times 100

1.1.5 sMAPE

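Symmetric Mean Absolute Percentage Error (sMAPE): measures accuracy with a relative error in which each absolute error is divided by the mean of |y_t| and |\hat y_t|. With this definition the value lies between 0 and 200.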

\text{sMAPE}=\frac1n\sum\limits_{t=1}^n \frac{|y_t-\hat y_t|}{\frac{|y_t|+|\hat y_t|}2} \times 100
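
A minimal sketch of the two percentage errors follows, assuming no zero denominators occur (the functions raise `ZeroDivisionError` otherwise). The function names and sample values are hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error (in percent); requires y_t != 0."""
    n = len(y_true)
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(y_true, y_pred)) / n

def smape(y_true, y_pred):
    """Symmetric MAPE (in percent); denominator is the mean of |y_t| and |yhat_t|."""
    n = len(y_true)
    return 100.0 * sum(
        abs(y - f) / ((abs(y) + abs(f)) / 2.0) for y, f in zip(y_true, y_pred)
    ) / n

# Hypothetical data: one over-forecast and one under-forecast, each off by 10%.
y_true = [100.0, 200.0]
y_pred = [110.0, 180.0]

print(mape(y_true, y_pred))   # about 10
print(smape(y_true, y_pred))  # slightly different, since the denominators differ
```

Both errors are 10% of the actual value, so MAPE treats them identically, while sMAPE scores the under-forecast slightly higher because its denominator is smaller.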

1.1.6 MFE

Mean Forecast Error (MFE): the average of the signed forecast errors, indicating bias. With the sign convention used here, a positive value means the forecasts are too low on average.

\text{MFE}=\frac1n\sum\limits_{t=1}^n (y_t-\hat y_t)

1.1.7 CFE

Cumulative Forecast Error (CFE): the sum of all forecast errors, measuring the total bias over the forecast horizon.

\text{CFE}=\sum\limits_{t=1}^n (y_t-\hat y_t)
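
Because MFE and CFE keep the sign of each error, positive and negative errors can cancel. A small sketch, with hypothetical function names and data:

```python
def cfe(y_true, y_pred):
    """Cumulative forecast error: sum of signed errors y_t - yhat_t."""
    return sum(y - f for y, f in zip(y_true, y_pred))

def mfe(y_true, y_pred):
    """Mean forecast error: CFE divided by the number of samples."""
    return cfe(y_true, y_pred) / len(y_true)

# Hypothetical data: the signed errors are -1, +1, -1.
y_true = [10.0, 12.0, 14.0]
y_pred = [11.0, 11.0, 15.0]

print(cfe(y_true, y_pred))  # -1.0
print(mfe(y_true, y_pred))  # about -0.333: forecasts are slightly too high on average
```

Every individual error here has magnitude 1, yet the MFE is only about -0.33. These two metrics describe bias, not accuracy, and are usually read alongside MAE or RMSE.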

1.1.8 MASE

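Mean Absolute Scaled Error (MASE): the MAE scaled by the MAE of a naïve forecast (a baseline model, whose choice is not unique). The formula below uses the one-step naïve forecast, which predicts y_{t-1} for time t. A value below 1 means the model beats that baseline.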

\text{MASE}=\frac{\frac1n \sum\limits_{t=1}^n|y_t-\hat y_t|}{\frac1{n-1} \sum\limits_{t=2}^n|y_t-y_{t-1}|}
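
The following sketch implements MASE exactly as written above, with the naïve one-step baseline computed on the same series. Some references compute the baseline on the training set instead, which this sketch does not do. The function name and data are hypothetical.

```python
def mase(y_true, y_pred):
    """MAE of the forecast divided by the MAE of the one-step naive forecast."""
    n = len(y_true)
    model_mae = sum(abs(y - f) for y, f in zip(y_true, y_pred)) / n
    # Naive forecast for time t is y_{t-1}; there are n - 1 such errors.
    naive_mae = sum(abs(y_true[t] - y_true[t - 1]) for t in range(1, n)) / (n - 1)
    return model_mae / naive_mae

# Hypothetical data: model errors all have magnitude 1; naive errors are 2, 1, 2.
y_true = [10.0, 12.0, 11.0, 13.0]
y_pred = [11.0, 11.0, 12.0, 12.0]

print(mase(y_true, y_pred))  # about 0.6, i.e. better than the naive baseline
```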

1.2 Explained Variance Metrics

1.2.1 R²

Coefficient of Determination (R²): the proportion of variance explained by the model.

R^2= 1 - \frac{\sum\limits_{t=1}^n(y_t-\hat y_t)^2}{\sum\limits_{t=1}^n(y_t-\bar y)^2}

1.2.2 Adjusted R²

Adjusted Coefficient of Determination (Adjusted R²): R² adjusted for the number of predictors k.

\text{Adjusted}\ R^2=1-\frac{(1-R^2)(n-1)}{n-k-1}

1.2.3 EVS

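Explained Variance Score (EVS): measures the proportion of variance explained by the model. Unlike R², it uses the variance of the residuals rather than their mean square, so a constant bias in the forecasts does not lower it.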

\text{EVS}=1-\frac{\text{Var}(y_t-\hat y_t)}{\text{Var}(y_t)}
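
The difference between R² and EVS is easiest to see on forecasts with a constant offset. The sketch below uses hypothetical function names and data, and uses the population variance (dividing by n) for both variance terms.

```python
def _pvar(values):
    """Population variance (divides by n)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, k):
    """R^2 adjusted for k predictors and n samples."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def explained_variance(y_true, y_pred):
    """EVS: 1 - Var(residuals) / Var(y)."""
    residuals = [y - f for y, f in zip(y_true, y_pred)]
    return 1.0 - _pvar(residuals) / _pvar(y_true)

# Hypothetical data: every forecast is too high by exactly 1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]

r2 = r2_score(y_true, y_pred)
print(r2)                                  # about 0.2: the bias is penalized
print(adjusted_r2(r2, n=4, k=1))           # about -0.2
print(explained_variance(y_true, y_pred))  # 1.0: constant residuals have zero variance
```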

1.3 Model Selection Metrics

1.3.1 AIC

Akaike Information Criterion (AIC): a trade-off between goodness of fit and model complexity. Lower values are better.

\text{AIC}=2k-2\ln \hat L

1.3.2 BIC

Bayesian Information Criterion (BIC): similar to AIC, but with a stronger penalty for models with more parameters whenever \ln n > 2 (that is, n \ge 8).

\text{BIC}=k\ln n - 2 \ln \hat L

1.3.3 HQC

Hannan-Quinn Criterion (HQC): an alternative to AIC and BIC with a different penalty term.

\text{HQC}= 2k\ln(\ln n) - 2\ln \hat L

1.3.4 AICc

Corrected Akaike Information Criterion (AICc): AIC with a correction for small sample sizes.

\text{AICc}=\text{AIC}+\frac{2k(k+1)}{n-k-1}
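
All four criteria share the term -2\ln \hat L and differ only in the penalty, which the sketch below makes explicit. The function names and the example values (log-likelihood -50, k = 3, n = 30) are assumptions chosen purely for illustration.

```python
import math

def aic(log_l, k):
    """AIC = 2k - 2 ln L."""
    return 2 * k - 2 * log_l

def bic(log_l, k, n):
    """BIC = k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_l

def hqc(log_l, k, n):
    """HQC = 2k ln(ln n) - 2 ln L."""
    return 2 * k * math.log(math.log(n)) - 2 * log_l

def aicc(log_l, k, n):
    """AICc = AIC + 2k(k+1) / (n - k - 1); requires n > k + 1."""
    return aic(log_l, k) + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical fitted model: maximized log-likelihood -50 with 3 parameters on 30 points.
log_l, k, n = -50.0, 3, 30

print(aic(log_l, k))      # 106.0
print(aicc(log_l, k, n))  # AIC plus 24/26
print(hqc(log_l, k, n))   # penalty 2k ln(ln n)
print(bic(log_l, k, n))   # penalty k ln n, the largest of the three here
```

For n = 30 the per-parameter penalties are 2 (AIC), about 2.45 (HQC), and about 3.40 (BIC), so the values are ordered AIC < HQC < BIC for the same fit.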

2 Probabilistic Forecast Metrics

2.1 Error-Based Metrics

2.1.1 Log Score

Logarithmic Score (Log Score): evaluates the difference between predicted probabilities and actual outcomes using a logarithmic function. Here p_t is the probability that the forecast assigned to the outcome actually observed at time t; lower scores are better.

\text{LogScore}=-\frac1n \sum_{t=1}^n\log p_t
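
A minimal sketch of the log score, assuming the input is the list of probabilities p_t that the forecast assigned to the outcomes that actually occurred, all strictly positive. The function name and values are hypothetical.

```python
import math

def log_score(probs_of_observed):
    """Negative mean log probability assigned to the observed outcomes."""
    n = len(probs_of_observed)
    return -sum(math.log(p) for p in probs_of_observed) / n

# Hypothetical forecasts: a confident-and-right one scores lower (better).
print(log_score([0.9, 0.8, 0.95]))  # small
print(log_score([0.5, 0.5, 0.5]))   # ln 2, about 0.693
print(log_score([0.9, 0.8, 0.01]))  # one confident miss dominates the average
```

The logarithm grows without bound as p_t approaches zero, so a single confident miss can dominate the score.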

2.1.2 CRPS

Continuous Ranked Probability Score (CRPS): evaluates the difference between the predicted probability distribution and the observed value using the cumulative distribution function. The formula below gives the score for a single observation y_t; for a whole series it is averaged over t.

\text{CRPS}=\int_{-\infty}^{+\infty}(\hat F(z)- I_{z\ge y_t})^2\text{d}z
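
When the predictive distribution is represented by a finite set of samples (an ensemble), \hat F is a step function and the integral above has a closed form: the mean absolute difference between samples and observation, minus half the mean absolute difference between pairs of samples. The sketch below assumes that ensemble representation, which the post itself does not specify; the function name is hypothetical.

```python
def crps_ensemble(samples, y):
    """CRPS of the empirical CDF of `samples` against a single observation y.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X' are
    independent draws from the empirical distribution of the samples.
    """
    m = len(samples)
    term_obs = sum(abs(x - y) for x in samples) / m
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * m * m)
    return term_obs - term_spread

# Three-member ensemble centered on the observation.
print(crps_ensemble([1.0, 2.0, 3.0], 2.0))  # 2/9, about 0.222

# A one-member ensemble is a point forecast: CRPS reduces to the absolute error.
print(crps_ensemble([5.0], 3.0))  # 2.0
```

The second call shows why CRPS is often described as a generalization of MAE to probabilistic forecasts. The double loop is O(m²), which is fine for small ensembles.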

2.2 Interval Metrics

2.2.1 PICP

Prediction Interval Coverage Probability (PICP): measures the proportion of observed values that fall within the predicted intervals. Ideally it is close to the nominal coverage level of the intervals.

\text{PICP}=\frac1n\sum\limits_{t=1}^n I_{y_t \in [\hat y_{\text{lower},t},\hat y_{\text{upper},t}]}

2.2.2 PIW

Prediction Interval Width (PIW): evaluates precision by measuring the average width of the prediction intervals. Narrower intervals are more informative, provided coverage (PICP) remains adequate.

\text{PIW}=\frac1n\sum\limits_{t=1}^n (\hat y_{\text{upper},t} - \hat y_{\text{lower},t})
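
PICP and PIW are usually reported together, since either one can be improved trivially at the expense of the other. A small sketch with hypothetical function names and data:

```python
def picp(y_true, lower, upper):
    """Fraction of observations inside their [lower, upper] interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)

def piw(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

# Hypothetical intervals: the third observation (14) falls below its interval [15, 17].
y_true = [10.0, 12.0, 14.0, 16.0]
lower = [9.0, 11.0, 15.0, 14.0]
upper = [11.0, 13.0, 17.0, 18.0]

print(picp(y_true, lower, upper))  # 0.75
print(piw(lower, upper))           # 2.5
```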

2.3 Others

2.3.1 Quantile Loss

Quantile Loss (also known as Pinball Loss): penalizes over- and under-predictions asymmetrically according to the quantile \tau. Under-predictions (y_t \ge \hat y_t) are weighted by \tau and over-predictions by 1-\tau.

\text{QuantileLoss}=\frac1n \sum\limits_{t=1}^n \left(I_{y_t \ge \hat y_t}\tau(y_t-\hat y_t)+I_{y_t<\hat y_t}(1-\tau)(\hat y_t-y_t)\right)
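
The asymmetry of the quantile loss can be checked on one under-prediction and one over-prediction of equal size. This is a sketch with a hypothetical function name and data.

```python
def quantile_loss(y_true, y_pred, tau):
    """Pinball loss for quantile level tau, with 0 < tau < 1."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        if y >= f:
            total += tau * (y - f)          # under-prediction, weight tau
        else:
            total += (1.0 - tau) * (f - y)  # over-prediction, weight 1 - tau
    return total / len(y_true)

# Hypothetical data: one forecast is 2 too low, the other 2 too high.
y_true = [10.0, 10.0]
y_pred = [8.0, 12.0]

print(quantile_loss(y_true, y_pred, tau=0.9))  # about 1.0: 1.8 and 0.2 averaged
print(quantile_loss(y_true, y_pred, tau=0.5))  # 1.0: half of the MAE, which is 2
```

At \tau = 0.5 the loss is symmetric and equals half the MAE. At \tau = 0.9 the under-prediction contributes nine times as much as the over-prediction.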

2.3.2 Sharpness

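Sharpness: evaluates how concentrated the forecasts are, through the width of the intervals or the variance. The variance-based form below averages the variance of the predictive distribution over time; lower values indicate sharper forecasts.

\text{Sharpness}=\frac1n \sum\limits_{t=1}^n\text{Var}(\hat y_t)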
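
One way to compute the variance-based sharpness is sketched below. It assumes that each predictive distribution is represented by a list of samples and that the population variance is used; both choices, and the function name, are assumptions made here for illustration.

```python
import statistics

def sharpness(samples_per_step):
    """Average population variance of the predictive samples at each time step."""
    variances = [statistics.pvariance(samples) for samples in samples_per_step]
    return sum(variances) / len(variances)

# Hypothetical forecasts for two time steps: one spread out, one fully concentrated.
forecast_samples = [
    [1.0, 2.0, 3.0],  # variance 2/3
    [2.0, 2.0, 2.0],  # variance 0
]

print(sharpness(forecast_samples))  # about 0.333
```

Sharpness depends only on the forecasts, not on the observations, so it says nothing about calibration on its own.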

3 Notation

The symbols used in the formulas throughout this post have the following meanings:

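Symbol | Meaning
n | number of samples (total number of time points)
t | time index (t = 1, 2, \cdots, n)
y_t | actual observed value at time t
\hat y_t | predicted value at time t
\bar y | mean of the actual observed values
k | number of model parameters
\hat L | maximized value of the model's likelihood function
\hat F(z) | cumulative distribution function of the predictive distribution
I_{\text{condition}} | indicator function (1 if the condition holds, 0 otherwise)
p_t | predicted probability
\tau | quantile
\hat y_{\text{lower},t}, \hat y_{\text{upper},t} | lower and upper bounds of the prediction interval
\text{Var}(\cdot) | variance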
