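This post briefly surveys the evaluation metrics commonly used in time series forecasting (and in other machine learning and deep learning applications), as a reference for later research. The formulas are kept as simple as possible; a symbol table is given at the end of the post.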


1 Deterministic Forecast Metrics

1.1 Error-Based Metrics

Absolute / Relative / Cumulative / Scaled

1.1.1 MAE

Mean Absolute Error (MAE): measures the average magnitude of the errors in a set of predictions, without considering their direction.

 $\begin{aligned} \text{MAE}=\frac1n \sum\limits_{t=1}^n|y_t-\hat y_t| \end{aligned}$

1.1.2 MSE

Mean Squared Error (MSE): measures the average of the squared errors.

 $\begin{aligned} \text{MSE}=\frac1n \sum\limits_{t=1}^n(y_t-\hat y_t)^2 \end{aligned}$

1.1.3 RMSE

Root Mean Squared Error (RMSE): the square root of the MSE. It is expressed in the same units as $y_t$ and, like the MSE, gives higher weight to larger errors.

 $\begin{aligned} \text{RMSE}=\sqrt{\frac1n \sum\limits_{t=1}^n(y_t-\hat y_t)^2} \end{aligned}$
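The three formulas above translate almost directly into plain Python. The following is a minimal sketch, not a reference implementation; the function names `mae`, `mse`, `rmse` and the sample data are illustrative assumptions.

```python
import math

def mae(y, y_hat):
    """Mean absolute error of forecasts y_hat against actuals y."""
    return sum(abs(a - f) for a, f in zip(y, y_hat)) / len(y)

def mse(y, y_hat):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean squared error, in the same units as y."""
    return math.sqrt(mse(y, y_hat))

# Made-up data: the errors y - y_hat are 1, 0, -2, 1.
y = [3.0, 5.0, 2.0, 7.0]
y_hat = [2.0, 5.0, 4.0, 6.0]
print(mae(y, y_hat), mse(y, y_hat), rmse(y, y_hat))
```

On this data the MAE is 1.0 and the MSE is 1.5, so the RMSE (about 1.22) exceeds the MAE because squaring emphasizes the single error of size 2.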

1.1.4 MAPE

Mean Absolute Percentage Error (MAPE): measures the size of the error as a percentage of the actual value. It is undefined whenever some $y_t = 0$.

 $\begin{aligned} \text{MAPE}=\frac1n\sum\limits_{t=1}^n\left|\frac{y_t-\hat y_t}{y_t}\right| \times 100 \end{aligned}$
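A small sketch of MAPE in plain Python follows. The function name `mape`, the choice to raise an error on zero actuals, and the sample data are assumptions made for illustration; other conventions (such as skipping those points) also exist.

```python
def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    # Assumption: refuse to compute when any actual value is zero,
    # because the ratio |y - y_hat| / |y| is undefined there.
    if any(a == 0 for a in y):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(y, y_hat)) / len(y)

# Made-up data: relative errors are 10%, 10% and 0%.
y = [100.0, 200.0, 50.0]
y_hat = [110.0, 180.0, 50.0]
print(mape(y, y_hat))
```

The result here is about 6.67, the mean of 10, 10 and 0 percent.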

1.1.5 sMAPE

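Symmetric Mean Absolute Percentage Error (sMAPE): measures accuracy from the relative error, normalizing each absolute error by the mean of $|y_t|$ and $|\hat y_t|$. With this definition the value lies between 0 and 200.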

 $\begin{aligned} \text{sMAPE}=\frac1n\sum\limits_{t=1}^n \frac{|y_t-\hat y_t|}{\frac{|y_t|+|\hat y_t|}2} \times 100 \end{aligned}$
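The sMAPE formula can be sketched as below. The function name `smape` and the handling of the 0/0 case (counted as zero error) are assumptions for illustration; the formula itself leaves that case undefined.

```python
def smape(y, y_hat):
    """Symmetric MAPE in percent, following the formula above."""
    total = 0.0
    for a, f in zip(y, y_hat):
        denom = (abs(a) + abs(f)) / 2
        # Assumption: when actual and forecast are both zero,
        # count the term as zero error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(a - f) / denom
    return 100.0 * total / len(y)

print(smape([100.0, 0.0], [50.0, 0.0]))  # mean of 66.67% and 0%
print(smape([1.0], [0.0]))               # worst case: 200
```

The second call shows the upper bound: forecasting zero for a nonzero actual gives 200 regardless of scale.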

1.1.6 MFE

Mean Forecast Error (MFE): the average of the forecast errors, indicating bias. With the sign convention used here, a positive value means the forecasts are too low on average.

 $\begin{aligned} \text{MFE}=\frac1n\sum\limits_{t=1}^n (y_t-\hat y_t) \end{aligned}$

1.1.7 CFE

Cumulative Forecast Error (CFE): the sum of all forecast errors, measuring the total bias over the forecast horizon.

 $\begin{aligned} \text{CFE}=\sum\limits_{t=1}^n (y_t-\hat y_t) \end{aligned}$
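MFE and CFE keep the sign of each error, so positive and negative errors can cancel. The sketch below (function names and data are illustrative assumptions) shows one forecast whose errors cancel exactly and another with a consistent bias.

```python
def cfe(y, y_hat):
    """Cumulative forecast error: signed sum of y - y_hat."""
    return sum(a - f for a, f in zip(y, y_hat))

def mfe(y, y_hat):
    """Mean forecast error: CFE divided by the number of points."""
    return cfe(y, y_hat) / len(y)

y = [3.0, 5.0, 2.0, 7.0]
unbiased = [2.0, 5.0, 4.0, 6.0]   # errors 1, 0, -2, 1 cancel out
too_low = [2.0, 4.0, 1.0, 6.0]    # every forecast is 1 below the actual

print(cfe(y, unbiased), mfe(y, unbiased))
print(cfe(y, too_low), mfe(y, too_low))
```

The first forecast has zero CFE and MFE even though it is not perfect, which is why these bias measures are usually read alongside MAE or RMSE.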

1.1.8 MASE

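Mean Absolute Scaled Error (MASE): the MAE scaled by the MAE of a naïve forecast (the benchmark model, whose choice is not unique). The formula below uses the one-step naïve forecast $\hat y_t = y_{t-1}$ as the benchmark; values below 1 mean the model beats it.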

 $\begin{aligned} \text{MASE}=\frac{\frac1n \sum\limits_{t=1}^n|y_t-\hat y_t|}{\frac1{n-1} \sum\limits_{t=2}^n|y_t-y_{t-1}|} \end{aligned}$
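A minimal sketch of MASE exactly as written above, with the naïve benchmark computed on the same series as the forecasts. The function name `mase` and the error raised for a constant series are assumptions; many references compute the denominator on a separate training series instead.

```python
def mase(y, y_hat):
    """MAE of the model divided by the MAE of the one-step naive forecast."""
    n = len(y)
    mae_model = sum(abs(a - f) for a, f in zip(y, y_hat)) / n
    # Naive benchmark: predict y[t-1] for y[t], defined for t = 2..n.
    mae_naive = sum(abs(y[t] - y[t - 1]) for t in range(1, n)) / (n - 1)
    if mae_naive == 0:
        raise ValueError("naive MAE is zero (constant series); MASE is undefined")
    return mae_model / mae_naive

y = [3.0, 5.0, 2.0, 7.0]
y_hat = [2.0, 5.0, 4.0, 6.0]
print(mase(y, y_hat))
```

Here the model MAE is 1 and the naïve MAE is 10/3, giving a MASE of 0.3.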

1.2 Explained Variance Metrics

1.2.1 R²

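Coefficient of Determination (R²): the proportion of variance explained by the model. It equals the squared correlation coefficient between $y_t$ and $\hat y_t$ only in special cases such as least-squares linear regression with an intercept; for arbitrary forecasts it can be negative.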

 $\begin{aligned} R^2= 1 - \frac{\sum\limits_{t=1}^n(y_t-\hat y_t)^2}{\sum\limits_{t=1}^n(y_t-\bar y)^2} \end{aligned}$
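The sketch below computes R² from the formula above and shows that it can go negative for a poor forecast. The function name `r2` and the data are illustrative assumptions.

```python
def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_res / ss_tot

y = [3.0, 5.0, 2.0, 7.0]
good = [2.0, 5.0, 4.0, 6.0]   # SS_res = 6, SS_tot = 14.75
bad = [7.0, 2.0, 5.0, 3.0]    # SS_res = 50, worse than predicting the mean

print(r2(y, good))
print(r2(y, bad))
```

The first value is about 0.59; the second is negative, meaning the forecast is worse than always predicting $\bar y$.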

1.2.2 Adjusted R²

Adjusted Coefficient of Determination (Adjusted R²): R² adjusted for the number of predictors $k$ (not counting the intercept).

 $\begin{aligned} \text{Adjusted}\ R^2=1-\frac{(1-R^2)(n-1)}{n-k-1} \end{aligned}$
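As a quick numeric check of the adjustment, here is a minimal sketch. The function name `adjusted_r2` and the example values are assumptions chosen for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n samples and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for the adjustment to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical fit: R^2 = 0.9 with 21 samples and 2 predictors.
print(adjusted_r2(0.9, 21, 2))
# Same R^2 with 10 predictors is penalized more heavily.
print(adjusted_r2(0.9, 21, 10))
```

With 2 predictors the adjusted value is about 0.889; with 10 predictors it drops to 0.8, reflecting the larger penalty.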

1.2.3 EVS

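Explained Variance Score (EVS): measures the proportion of variance explained by the model. Unlike R², it is unaffected by a constant bias in the forecasts, since it uses the variance of the errors rather than their mean square.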

 $\begin{aligned} \text{EVS}=1-\frac{\text{Var}(y_t-\hat y_t)}{\text{Var}(y_t)} \end{aligned}$
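The sketch below computes EVS with the population variance from Python's `statistics` module. The function name `evs` and the data are illustrative assumptions; the example uses forecasts that are shifted by a constant to show that such a bias does not lower the score.

```python
import statistics

def evs(y, y_hat):
    """Explained variance score: 1 - Var(y - y_hat) / Var(y)."""
    residuals = [a - f for a, f in zip(y, y_hat)]
    return 1 - statistics.pvariance(residuals) / statistics.pvariance(y)

y = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]   # every forecast is too high by exactly 1

print(evs(y, shifted))
```

Because all residuals equal -1, their variance is zero and the EVS is 1.0, whereas R² for the same forecasts would be 1 - 4/5 = 0.2.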

1.3 Model Selection Metrics

1.3.1 AIC

Akaike Information Criterion (AIC): a trade-off between goodness of fit and model complexity. Lower values are better.

 $\begin{aligned} \text{AIC}=2k-2\ln \hat L \end{aligned}$

1.3.2 BIC

Bayesian Information Criterion (BIC): similar to AIC, but with a stronger penalty for models with more parameters (whenever $\ln n > 2$, i.e. $n \ge 8$).

 $\begin{aligned} \text{BIC}=k\ln n - 2 \ln \hat L \end{aligned}$

1.3.3 HQC

Hannan-Quinn Criterion (HQC): an alternative to AIC and BIC with a different penalty term.

 $\begin{aligned} \text{HQC}= 2k\ln(\ln n) - 2\ln \hat L \end{aligned}$

1.3.4 AICc

Corrected Akaike Information Criterion (AICc): AIC with a correction for small sample sizes.

 $\begin{aligned} \text{AICc}=\text{AIC}+\frac{2k(k+1)}{n-k-1} \end{aligned}$
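All four criteria in this section depend only on the maximized log-likelihood $\ln \hat L$, the parameter count $k$ and the sample size $n$, so they can be computed together. The following is a sketch under that reading; the function name `information_criteria` and the input values are hypothetical.

```python
import math

def information_criteria(log_lik, k, n):
    """Return AIC, BIC, HQC and AICc for a fitted model.

    log_lik is the maximized log-likelihood ln(L-hat).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    hqc = 2 * k * math.log(math.log(n)) - 2 * log_lik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "HQC": hqc, "AICc": aicc}

# Hypothetical model: log-likelihood -100, 3 parameters, 50 observations.
ic = information_criteria(-100.0, 3, 50)
for name, value in ic.items():
    print(name, round(value, 3))
```

For these inputs AIC is 206 and AICc is slightly larger (206 + 24/46); BIC is larger than AIC because $\ln 50 > 2$.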

2 Probabilistic Forecast Metrics

2.1 Error-Based Metrics

2.1.1 Log Score

Logarithmic Score (Log Score): evaluates the difference between predicted probabilities and actual outcomes using a logarithmic function, where $p_t$ is the probability (or density) that the forecast assigned to the outcome that actually occurred.

 $\begin{aligned} \text{LogScore}=-\frac1n \sum\limits_{t=1}^n\log p_t \end{aligned}$
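A minimal sketch of the log score, assuming each $p_t$ is the probability the forecast gave to the realized outcome and using the natural logarithm. The function name `log_score` and the numbers are illustrative assumptions.

```python
import math

def log_score(probs):
    """Negative mean log of the probabilities assigned to what happened."""
    if any(p <= 0 for p in probs):
        raise ValueError("each probability must be positive")
    return -sum(math.log(p) for p in probs) / len(probs)

# The forecast gave 0.5 and 0.25 to the outcomes that actually occurred.
print(log_score([0.5, 0.25]))
# A perfectly confident and correct forecast scores 0.
print(log_score([1.0, 1.0]))
```

Lower is better: the first call gives about 1.04 (that is, 1.5 ln 2), and assigning probability 1 to every realized outcome gives 0.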

2.1.2 CRPS

Continuous Ranked Probability Score (CRPS): evaluates the difference between the predicted probability distribution and the observed value using the cumulative distribution function. The formula below is for a single observation $y_t$; it is usually averaged over all $t$.

 $\begin{aligned} \text{CRPS}=\int_{-\infty}^{+\infty}(\hat F(z)- I_{z\ge y_t})^2\text{d}z \end{aligned}$
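When the predictive distribution is available only as a finite ensemble of samples, the integral above can be evaluated exactly for the ensemble's empirical CDF through the identity $\text{CRPS} = \mathbb{E}|X - y| - \tfrac12 \mathbb{E}|X - X'|$. The sketch below assumes that ensemble representation; the function name `crps_ensemble` and the data are hypothetical.

```python
def crps_ensemble(members, y_obs):
    """CRPS of an ensemble forecast for one observation.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'| with expectations taken
    over the empirical distribution of the ensemble members.
    """
    m = len(members)
    term_obs = sum(abs(x - y_obs) for x in members) / m
    term_spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return term_obs - term_spread

print(crps_ensemble([1.0, 3.0], 2.0))  # two-member ensemble around y = 2
print(crps_ensemble([5.0], 2.0))       # single member: plain absolute error
```

The first call returns 0.5, which matches integrating $(\hat F(z) - I_{z \ge 2})^2$ over the step-function CDF by hand; with a single member the score reduces to the absolute error, 3.0. The double loop is quadratic in the ensemble size, which is fine for a sketch but slow for large ensembles.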

2.2 Interval Metrics

2.2.1 PICP

Prediction Interval Coverage Probability (PICP): measures the proportion of observed values that fall within the predicted intervals.

 $\begin{aligned} \text{PICP}=\frac1n\sum\limits_{t=1}^n I_{y_t \in [\hat y_{\text{lower},t},\hat y_{\text{upper},t}]} \end{aligned}$

2.2.2 PIW

Prediction Interval Width (PIW): evaluates precision by measuring the average width of the prediction intervals.

 $\begin{aligned} \text{PIW}=\frac1n\sum\limits_{t=1}^n (\hat y_{\text{upper},t} - \hat y_{\text{lower},t}) \end{aligned}$
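PICP and PIW are best read together, since wide intervals trivially achieve high coverage. A minimal sketch of both follows; the function names `picp` and `piw` and the interval bounds are illustrative assumptions.

```python
def picp(y, lower, upper):
    """Fraction of observations inside their (closed) prediction interval."""
    hits = sum(1 for a, lo, hi in zip(y, lower, upper) if lo <= a <= hi)
    return hits / len(y)

def piw(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

y = [3.0, 5.0, 2.0, 7.0]
lower = [2.0, 4.0, 3.0, 5.0]
upper = [4.0, 6.0, 5.0, 8.0]   # the third interval [3, 5] misses y = 2

print(picp(y, lower, upper), piw(lower, upper))
```

Three of the four observations are covered (PICP 0.75) with an average width of 2.25.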

2.3 Others

2.3.1 Quantile Loss

Quantile Loss (also known as Pinball Loss): penalizes over- and under-predictions asymmetrically based on the quantile level $\tau$. Here $\hat y_t$ is the predicted $\tau$-quantile.

 $\begin{aligned} \text{QuantileLoss}=\frac1n \sum\limits_{t=1}^n (I_{y_t \ge \hat y_t}\tau(y_t-\hat y_t)+I_{y_t<\hat y_t}(1-\tau)(\hat y_t-y_t)) \end{aligned}$
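The asymmetry of the quantile loss is easiest to see numerically. The sketch below follows the formula above; the function name `quantile_loss` and the data are illustrative assumptions.

```python
def quantile_loss(y, y_hat, tau):
    """Mean pinball loss of quantile forecasts y_hat at level tau."""
    total = 0.0
    for a, f in zip(y, y_hat):
        diff = a - f
        if diff >= 0:
            total += tau * diff            # forecast too low
        else:
            total += (1 - tau) * (-diff)   # forecast too high
    return total / len(y)

y = [10.0, 10.0]
y_hat = [8.0, 11.0]   # one under-prediction by 2, one over-prediction by 1

print(quantile_loss(y, y_hat, 0.9))
print(quantile_loss(y, y_hat, 0.5))
```

At $\tau = 0.9$ the under-prediction dominates and the loss is about 0.95; at $\tau = 0.5$ the loss is 0.75, exactly half the MAE of 1.5.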

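2.3.2 Sharpness

Sharpness: evaluates how concentrated the predictive distributions are, through the width of the intervals or the variance. Here $\text{Var}(\hat y_t)$ denotes the variance of the predictive distribution at time $t$; smaller values mean sharper forecasts.

 $\begin{aligned} \text{Sharpness}=\frac1n \sum\limits_{t=1}^n\text{Var}(\hat y_t) \end{aligned}$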
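One way to evaluate this in practice, assuming the predictive distribution at each time point is represented by a set of ensemble samples, is to average the per-step sample variances. The function name `sharpness`, the ensemble representation and the use of the population variance are all assumptions of this sketch.

```python
import statistics

def sharpness(ensembles):
    """Mean predictive variance; ensembles[t] holds the samples for time t."""
    return sum(statistics.pvariance(members) for members in ensembles) / len(ensembles)

# Two time steps: a spread-out ensemble and a fully concentrated one.
ensembles = [[1.0, 3.0], [2.0, 2.0]]
print(sharpness(ensembles))
```

The variances are 1 and 0, so the sharpness is 0.5. Note that sharpness does not use the observations at all, so it says nothing about calibration on its own.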

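3 Symbol Table

The symbols used in the formulas throughout this post are:

Symbol	Meaning
$n$	Number of samples (total number of time points)
$t$	Time index ( $t = 1, 2, \cdots, n$ )
$y_t$	Actual observed value at time $t$
$\hat y_t$	Predicted value at time $t$
$\bar y$	Mean of the actual observed values
$k$	Number of model parameters (number of predictors in Adjusted R²)
$\hat L$	Maximized value of the model's likelihood function
$\hat F(z)$	Cumulative distribution function of the predictive distribution
$I_{\text{condition}}$ or $\mathbb{I}(\text{condition})$	Indicator function (1 if the condition holds, 0 otherwise)
$p_t$	Predicted probability assigned to the observed outcome at time $t$
$\tau$	Quantile level
$\hat y_{\text{lower},t}$ and $\hat y_{\text{upper},t}$	Lower and upper bounds of the prediction interval
$\text{Var}(\cdot)$	Variance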
