MySQL 8.0 新增窗口函数,窗口函数又被称为开窗函数,与Oracle 窗口函数类似,属于MySQL的一大特点.
非聚合窗口函数是相对于聚函数来说的。聚合函数是对一组数据计算后返回单个值(即分组),非聚合函数一次只会处理一行数据。窗口聚合函数在行记录上计算某个字段的结果时,可将窗口范围内的数据输入到聚合函数中,并不改变行数。
# 创建表格 create table employee( dname varchar(20), -- 部门名 eid varchar(20), ename varchar(20), hiredate date, -- 入职日期 salary double -- 薪资 ); # 插入数据 insert into employee values('研发部','1001','刘备','2021-11-01',3000); insert into employee values('研发部','1002','关羽','2021-11-02',5000); insert into employee values('研发部','1003','张飞','2021-11-03',7000); insert into employee values('研发部','1004','赵云','2021-11-04',7000); insert into employee values('研发部','1005','马超','2021-11-05',4000); insert into employee values('研发部','1006','黄忠','2021-11-06',4000); insert into employee values('销售部','1007','曹操','2021-11-01',2000); insert into employee values('销售部','1008','许褚','2021-11-02',3000); insert into employee values('销售部','1009','典韦','2021-11-03',5000); insert into employee values('销售部','1010','张辽','2021-11-04',6000); insert into employee values('销售部','1011','徐晃','2021-11-05',9000); insert into employee values('销售部','1012','曹洪','2021-11-06',6000);
需求:对每个部门的员工按照薪资排序,并给出排名
代码如下(示例):
select dname,ename,salary, row_number() over (partition by dname order by salary desc ) as rk from employee; select dname,ename,salary, rank() over (partition by dname order by salary desc ) as rk from employee; select dname,ename,salary, dense_rank() over (partition by dname order by salary desc ) as rk from employee;
可以看到row_number()输出的结果,rank()和dense_rank()差不太多。大家可以实际试一下看看区别。
在窗口中每条记录动态地应用聚合函数(SUM()、AVG()、MAX()、MIN()、COUNT()),可以动态计算在指定的窗口内的各种聚合函数值。
需求:对每个部门的员工按照薪资排序,并加上前面一位薪资,得到新的一列c1
代码如下(示例):
select dname, ename, salary, sum(salary) over(partition by dname order by salary rows between 2 preceding and current row) as c1 from employee;
分组内小于、等于当前rank值的行数 / 分组内总行数
需求:查询小于等于当前薪资(salary)的比例
代码如下(示例):
select dname, ename, salary, rank() over(partition by dname order by salary desc ) as rn, CUME_DIST() over(partition by dname order by salary desc ) as rn2 from employee;
返回位于当前行的前n行(LAG(expr,n))或后n行(LEAD(expr,n))的expr的值
需求:查询前1名同学的成绩和当前同学成绩的差值
代码如下(示例):
select dname, ename, hiredate, salary, lag(hiredate,1,'2000-01-01') over(partition by dname order by hiredate) as last_1_time, lag(hiredate,2) over(partition by dname order by hiredate) as last_2_time from employee;
返回第一个(FIRST_VALUE(expr))或最后一个(LAST_VALUE(expr))expr的值
需求:截止到当前,按照日期排序查询第1个入职和最后1个入职员工的薪资
代码如下(示例):
select dname, ename, hiredate, salary, first_value(salary) over(partition by dname order by hiredate) as first from employee;
将分区中的有序数据分为n个等级,记录等级数
需求:将每个部门员工按照入职日期分成3组
代码如下(示例):
select dname, ename, hiredate, salary, ntile(3) over(partition by dname order by hiredate ) as rn from employee;
返回窗口中第n个expr的值。expr可以是表达式,也可以是列名
需求:截止到当前薪资,显示每个员工的薪资中排名第2或者第3的薪资
select dname, ename, hiredate, salary, nth_value(salary,2) over(partition by dname order by hiredate) as second_score, nth_value(salary,3) over(partition by dname order by hiredate) as third_score from employee
今天和大家分享了一下sql中的6种常用的窗口函数的用法,分别是:
1.序号函数- ROW_NUMBER、RANK、DENSE_RANK
2.开窗聚合函数- SUM,AVG,MIN,MAX
3.分布函数- CUME_DIST
4.前后函数-LAG和LEAD
5.头尾函数-FIRST_VALUE和LAST_VALUE
6.其他函数-NTH_VALUE(expr, n)、NTILE(n)
有时间大家可以多练习一下。大数据框架也常用sql语法。窗口函数非常高效。