- 首先,創建一個包含時間序列數據的表。例如,創建一個名為 time_series_data 的表,其中包含時間戳(timestamp)和數值(value)兩個字段:
CREATE TABLE time_series_data (
timestamp TIMESTAMP NOT NULL,
value NUMERIC NOT NULL
);
- 使用窗口函數計算移動平均值和移動標準差。例如,可以使用以下查詢計算每個數據點的 5 個時間單位的移動平均值(moving_avg)和移動標準差(moving_stddev):
SELECT
timestamp,
value,
AVG(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_avg,
STDDEV(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_stddev
FROM
time_series_data;
- 計算每個數據點的 Z 分數(z_score),以確定其與移動平均值的距離。可以使用以下查詢計算 Z 分數:
WITH moving_stats AS (
SELECT
timestamp,
value,
AVG(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_avg,
STDDEV(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_stddev
FROM
time_series_data
)
SELECT
timestamp,
value,
(value - moving_avg) / moving_stddev AS z_score
FROM
moving_stats;
- 根據 Z 分數判斷數據點是否異常。例如,可以認為 Z 分數大於 3 或小於 - 3 的數據點是異常值。可以使用以下查詢篩選出異常數據點:
WITH moving_stats AS (
SELECT
timestamp,
value,
AVG(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_avg,
STDDEV(value) OVER (ORDER BY timestamp ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_stddev
FROM
time_series_data
),
z_scores AS (
SELECT
timestamp,
value,
(value - moving_avg) / moving_stddev AS z_score
FROM
moving_stats
)
SELECT
timestamp,
value,
z_score
FROM
z_scores
WHERE
z_score > 3 OR z_score < -3;
- 這個方法基於統計學原理,假設時間序列數據符合正態分佈。你可以根據實際情況調整移動窗口的大小(例如,改變 ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)以及異常判斷的閾值(例如,改變 z_score > 3 OR z_score < -3)。