PostgreSQL 11 新特性之窗口函數(shù)(window function)增強(qiáng)
作者: 不剪發(fā)的Tony老師
畢業(yè)于北京航空航天大學(xué),十多年數(shù)據(jù)庫(kù)管理與開(kāi)發(fā)經(jīng)驗(yàn),目前在一家全球性的金融公司從事數(shù)據(jù)庫(kù)架構(gòu)設(shè)計(jì)。CSDN學(xué)院簽約講師以及GitChat專(zhuān)欄作者。csdn上的博客收藏于以下地址:https://tonydong.blog.csdn.net
文章目錄
PostgreSQL 11 針對(duì)窗口函數(shù)(window function)進(jìn)行了增強(qiáng),添加了 SQL:2011 標(biāo)準(zhǔn)中的所有窗口范圍(window frame)子句選項(xiàng)。具體來(lái)說(shuō),新增了以下功能:
允許在RANGE窗口模式中使用off_set PRECEDING/FOLLOWING限定窗口范圍
支持GROUPS窗口模式,使用基于組的方式限定窗口范圍
支持窗口范圍子句的排除選項(xiàng),用于排除窗口內(nèi)的某些數(shù)據(jù)行
首先,對(duì)于RANGE窗口模式,增加了off_set PRECEDING/FOLLOWING子句,現(xiàn)在可以支持所有的選項(xiàng):
RANGE frame_start
RANGE BETWEEN frame_start AND frame_end
其中,frame_start 和 frame_end 可以是以下選項(xiàng)之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
創(chuàng)建一個(gè)測(cè)試表:
CREATE TABLE test
(
id SERIAL PRIMARY KEY,
val INT4,
logtime TIMESTAMP
);
INSERT INTO test(val, logtime)
VALUES
(1, '2019-01-15 08:08:17'),
(1, '2019-01-15 08:14:30'),
(3, '2019-01-15 08:36:00'),
(6, '2019-01-15 09:20:56'),
(6, '2019-01-15 10:15:41');
以下示例比較了ROWS模式和RANGE模式的區(qū)別:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_rows,
sum(val) OVER (ORDER BY val RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_range,
sum(val) OVER (ORDER BY logtime RANGE BETWEEN INTERVAL '10 minute' PRECEDING AND '10 minute' FOLLOWING) as sum_range_time
FROM test;
id | val | logtime | sum_rows | sum_range | sum_range_time
----+-----+---------------------+----------+-----------+----------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 2 | 2
2 | 1 | 2019-01-15 08:14:30 | 5 | 2 | 2
3 | 3 | 2019-01-15 08:36:00 | 10 | 3 | 3
4 | 6 | 2019-01-15 09:20:56 | 15 | 12 | 6
5 | 6 | 2019-01-15 10:15:41 | 12 | 12 | 6
(5 rows)
其中,
sum_rows 列計(jì)算按照字段 val 排序后,每一行以及前后各一行的合計(jì)值;
sum_range 列計(jì)算按照字段 val 排序后,每一行以及前后和它的值相差小于等于 1 的那些行的合計(jì)值;
sum_range_time 列計(jì)算按照字段 logtime 排序后,每一行以及前后和它的值相差小于等于 10 分鐘的那些行的合計(jì)值。
其次,PostgreSQL 11 增加了GROUPS窗口模式。
GROUPS frame_start
GROUPS BETWEEN frame_start AND frame_end
其中,frame_start 和 frame_end 可以是以下選項(xiàng)之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
對(duì)于GROUPS窗口模式,計(jì)算窗口范圍時(shí)使用組(由ORDER BY決定)為單位,而不是以行ROWS模式)或者值RANGE模式)為單位。
基于上面的示例,增加GROUPS模式:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_rows,
sum(val) OVER (ORDER BY val RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_range,
sum(val) OVER (ORDER BY val GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as sum_groups
FROM test;
id | val | logtime | sum_rows | sum_range | sum_groups
----+-----+---------------------+----------+-----------+------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 2 | 5
2 | 1 | 2019-01-15 08:14:30 | 5 | 2 | 5
3 | 3 | 2019-01-15 08:36:00 | 10 | 3 | 17
4 | 6 | 2019-01-15 09:20:56 | 15 | 12 | 15
5 | 6 | 2019-01-15 10:15:41 | 12 | 12 | 15
(5 rows)
其中,sum_groups 計(jì)算按照字段 val 排序后,每一行所在的組以及前后各一個(gè)組中所有行的合計(jì)值。
ORDER BY val決定了組,示例中存在 3 個(gè)組(1、3、6)。對(duì)于第一行數(shù)據(jù),它所在的組有兩個(gè) val 值(1 和 1),1 PRECEDING組不存在,1 FOLLOWING組有一個(gè)值(3),合計(jì)為 5(1+1+3)。
最后,PostgreSQL 11 支持窗口范圍子句的排除選項(xiàng) frame_exclusion,用于排除窗口內(nèi)的某些數(shù)據(jù):
{ RANGE | ROWS | GROUPS } frame_start [ frame_exclusion ]
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end [ frame_exclusion ]
其中,frame_start 和 frame_end 可以是以下選項(xiàng)之一:
UNBOUNDED PRECEDING
off_set PRECEDING
CURRENT ROW
off_set FOLLOWING
UNBOUNDED FOLLOWING
frame_exclusion 可以是以下選項(xiàng)之一:
EXCLUDE CURRENT ROW
EXCLUDE GROUP
EXCLUDE TIES
EXCLUDE NO OTHERS
它們的作用分別為:
EXCLUDE CURRENT ROW排除當(dāng)前行
EXCLUDE GROUP排除當(dāng)前行所在的組,也就是和當(dāng)前行排序相同的行
EXCLUDE TIES排除當(dāng)前行所在組中的其他行,但是不排除當(dāng)前行
EXCLUDE NO OTHERS不排除任何行,默認(rèn)值
以下示例演示了這些選項(xiàng)的效果:
SELECT id, val, logtime,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE NO OTHERS) as sum_rows,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE CURRENT ROW) as exclude_current_row,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE GROUP) as exclude_group,
sum(val) OVER (ORDER BY val ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE TIES) as exclude_ties
FROM test;
id | val | logtime | sum_rows | exclude_current_row | exclude_group | exclude_ties
----+-----+---------------------+----------+---------------------+---------------+--------------
1 | 1 | 2019-01-15 08:08:17 | 2 | 1 | | 1
2 | 1 | 2019-01-15 08:14:30 | 5 | 4 | 3 | 4
3 | 3 | 2019-01-15 08:36:00 | 10 | 7 | 7 | 10
4 | 6 | 2019-01-15 09:20:56 | 15 | 9 | 3 | 9
5 | 6 | 2019-01-15 10:15:41 | 12 | 6 | | 6
(5 rows)
至此,PostgreSQL 11 支持 SQL:2011 標(biāo)準(zhǔn)中定義的所有窗口范圍子句選項(xiàng)。
官方文檔:
https://www.postgresql.org/docs/11/tutorial-window.html
https://www.postgresql.org/docs/11/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
https://www.postgresql.org/docs/11/queries-table-expressions.html#QUERIES-WINDOW
https://www.postgresql.org/docs/11/functions-window.html
https://www.postgresql.org/docs/11/sql-select.html#SQL-WINDOW