MySQL Basics: Window Functions

2022-04-15 19:32:52  Source: Internet

Tags: insert cnt name 04 window-functions 2022 MySQL dt functions


MySQL syntax

Data preparation

create table emp (
    empno numeric(4) not null,
    ename varchar(10),
    job varchar(9),
    mgr numeric(4),
    hiredate datetime,
    sal numeric(7, 2),
    comm numeric(7, 2),
    deptno numeric(2)
);

insert into emp values (7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, null, 20);
insert into emp values (7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 30);
insert into emp values (7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 30);
insert into emp values (7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, null, 20);
insert into emp values (7654, 'MARTIN', 'SALESMAN', 7698, '1981-09-28', 1250, 1400, 30);
insert into emp values (7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, null, 30);
insert into emp values (7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, null, 10);
insert into emp values (7788, 'SCOTT', 'ANALYST', 7566, '1982-12-09', 3000, null, 20);
insert into emp values (7839, 'KING', 'PRESIDENT', null, '1981-11-17', 5000, null, 10);
insert into emp values (7844, 'TURNER', 'SALESMAN', 7698, '1981-09-08', 1500, 0, 30);
insert into emp values (7876, 'ADAMS', 'CLERK', 7788, '1983-01-12', 1100, null, 20);
insert into emp values (7900, 'JAMES', 'CLERK', 7698, '1981-12-03', 950, null, 30);
insert into emp values (7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, null, 20);
insert into emp values (7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, null, 10);

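1. Aggregate functions (group functions)

1. Aggregation logic

    Aggregation:
        group by => grouping: rows that share a key are collected together
            xianyu,<1,a,xc,asd>
            lxy,<as,zxf,zxf,qwr,ags>
        aggregate function => metric: one value is computed per group (here, a count)
            xianyu,4
            lxy,5

2. Using the functions

group by => grouping
aggregate functions => metrics: sum avg max min count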


Requirement:
    How many people are there in each department?

    What do we query?
        Dimension: department
        Metric: headcount

        select
        deptno,
        count(1) as cnt
        from emp
        group by deptno;

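Notes:
    count(1)
        1. The 1 is a constant placeholder evaluated once per row; it is never NULL, so every row is counted and count(1) returns the same result as count(*).
        2. It does not mean "count by the first column". To count only the non-NULL values of a column, write count(column).
    select
        select + function call (no from clause needed, e.g. select now();) => a quick way to check whether a function exists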
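
A minimal sketch of the count(1) point above. It is run from Python with the standard-library sqlite3 module standing in for MySQL (an assumption made only so the example is self-contained; the SQL itself is unchanged). The table is a cut-down copy of emp with just the columns the query needs.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, comm int, deptno int)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("SMITH", None, 20), ("ALLEN", 300, 30), ("WARD", 500, 30),
        ("JONES", None, 20), ("MARTIN", 1400, 30), ("BLAKE", None, 30),
        ("CLARK", None, 10), ("SCOTT", None, 20), ("KING", None, 10),
        ("TURNER", 0, 30), ("ADAMS", None, 20), ("JAMES", None, 30),
        ("FORD", None, 20), ("MILLER", None, 10),
    ],
)

# count(1) and count(*) count every row; count(comm) skips rows where comm is NULL.
result = conn.execute(
    """
    select deptno, count(1), count(*), count(comm)
    from emp
    group by deptno
    order by deptno
    """
).fetchall()

for row in result:
    print(row)
# (10, 3, 3, 0)
# (20, 5, 5, 0)
# (30, 6, 6, 4)
```

Only department 30 has non-NULL comm values (a commission of 0 still counts), which is why count(comm) differs from count(1) there.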

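2. Window functions

1. Syntax

Window function:
    window + function
    window: the range of rows the function computes over at run time
    function: the function that is evaluated over that window
        1. aggregate functions
            sum avg max min count
        2. built-in window functions

    Syntax:
        function over([partition by xxx,...] [order by xxx,...])
        over(): declares the window the function runs over [the whole table or result set if left empty]
        partition by: the columns to group rows by [like the columns in group by]
        order by: the columns to sort by within each partition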

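2. Aggregate functions: combine multiple rows into a single row according to some rule

    sum avg max...
    In general: rows after aggregation <= rows before aggregation [depends on the dimensions chosen, i.e. the columns in group by]

    Requirement:
        Show the pre-aggregation rows and the aggregated value side by side?
        (below, sal_all is the running total of sal per name, ordered by dt)

        id  name  sal   dt      sal_all
        1   zs    1000  2022-4  1000
        2   ls    2000  2022-4  2000
        3   wu    3000  2022-4  3000
        4   zs    1000  2022-5  2000
        5   ls    2000  2022-5  4000
        6   wu    3000  2022-5  6000

Data:
Number of starts per server per day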
linux01,2022-04-15,1
linux01,2022-04-16,5
linux01,2022-04-17,7
linux01,2022-04-18,2
linux01,2022-04-19,3
linux01,2022-04-20,10
linux01,2022-04-21,4

Running-total problem:
    Create the table
        create table window01(
            name varchar(50),
            dt varchar(20),
            cnt int
        );
    Insert the data
        insert into window01 values('linux01','2022-04-15',1);
        insert into window01 values('linux01','2022-04-16',5);
        insert into window01 values('linux01','2022-04-17',7);
        insert into window01 values('linux01','2022-04-18',2);
        insert into window01 values('linux01','2022-04-19',3);
        insert into window01 values('linux01','2022-04-20',10);
        insert into window01 values('linux01','2022-04-21',4);

        insert into window01 values('linux02','2022-04-18',20);
        insert into window01 values('linux02','2022-04-19',30);
        insert into window01 values('linux02','2022-04-20',10);
        insert into window01 values('linux02','2022-04-21',40);

    Requirement:
        Cumulative number of starts per server, day by day
        select
        name,
        dt,
        cnt,
        sum(cnt) over(partition by name order by dt) as cut_all
        from window01;

        +---------+------------+------+---------+
        | name    | dt         | cnt  | cut_all |
        +---------+------------+------+---------+
        | linux01 | 2022-04-15 |    1 |       1 |
        | linux01 | 2022-04-16 |    5 |       6 |
        | linux01 | 2022-04-17 |    7 |      13 |
        | linux01 | 2022-04-18 |    2 |      15 |
        | linux01 | 2022-04-19 |    3 |      18 |
        | linux01 | 2022-04-20 |   10 |      28 |
        | linux01 | 2022-04-21 |    4 |      32 |
        | linux02 | 2022-04-18 |   20 |      20 |
        | linux02 | 2022-04-19 |   30 |      50 |
        | linux02 | 2022-04-20 |   10 |      60 |
        | linux02 | 2022-04-21 |   40 |     100 |
        +---------+------------+------+---------+
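
The running total above can be reproduced with the sketch below. It assumes Python's standard-library sqlite3 module (SQLite 3.25 or later) as a stand-in for MySQL 8.0, purely so the example runs on its own; the query is the one from the text plus a final order by to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL 8.0 (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux01", "2022-04-15", 1), ("linux01", "2022-04-16", 5),
        ("linux01", "2022-04-17", 7), ("linux01", "2022-04-18", 2),
        ("linux01", "2022-04-19", 3), ("linux01", "2022-04-20", 10),
        ("linux01", "2022-04-21", 4),
        ("linux02", "2022-04-18", 20), ("linux02", "2022-04-19", 30),
        ("linux02", "2022-04-20", 10), ("linux02", "2022-04-21", 40),
    ],
)

# Every input row is kept; the window function adds the per-server running sum.
rows = conn.execute(
    """
    select name, dt, cnt,
           sum(cnt) over(partition by name order by dt) as cut_all
    from window01
    order by name, dt
    """
).fetchall()

# Collect the running totals per server for easy comparison with the table above.
running = {}
for name, dt, cnt, cut_all in rows:
    running.setdefault(name, []).append(cut_all)

print(len(rows))  # 11
print(running)
# {'linux01': [1, 6, 13, 15, 18, 28, 32], 'linux02': [20, 50, 60, 100]}
```

Unlike group by, the result still has all 11 detail rows, which is what the "show both the detail and the aggregate" requirement asks for.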

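        Note: dt is a varchar, so order by dt compares strings in lexicographic order. Zero-padded YYYY-MM-DD dates still sort chronologically, but numbers stored as strings do not:
        1 9 10 11 as strings [lexicographic order] sort as
        1 10 11 9

         * With only 1 through 9: 1,2,3,4,5,6,7,8,9
         * Once 10…19 are added: 1,10…19,2,3,4,5,6,7,8,9
         * Once 20…29 are added: 1,10…19,2,20…29,3,4,5,6,7,8,9
         * And so on: every two-digit number lands right after the single-digit number equal to its tens digit.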
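
A small sketch of the lexicographic-order pitfall, again using sqlite3 from the Python standard library as a stand-in for MySQL (an assumption for the sake of a runnable example). The table t and column v are hypothetical names; in MySQL the numeric cast would be written cast(v as signed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding numbers as strings.
conn.execute("create table t (v text)")
conn.executemany("insert into t values (?)", [("1",), ("9",), ("10",), ("11",)])

# Sorting the strings compares them character by character.
as_text = [r[0] for r in conn.execute("select v from t order by v")]

# Casting to a number restores numeric order (MySQL: cast(v as signed)).
as_number = [r[0] for r in conn.execute("select v from t order by cast(v as integer)")]

print(as_text)    # ['1', '10', '11', '9']
print(as_number)  # ['1', '9', '10', '11']
```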

3. Window size (frame)

Window size: xxx between xxx and xxx

Parameters
(ROWS | RANGE) BETWEEN (UNBOUNDED | [num]) PRECEDING AND ([num] PRECEDING | CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN CURRENT ROW AND (CURRENT ROW | (UNBOUNDED | [num]) FOLLOWING)
(ROWS | RANGE) BETWEEN [num] FOLLOWING AND (UNBOUNDED | [num]) FOLLOWING

select
name,
dt,
cnt,
sum(cnt) over(partition by name order by dt) as cut_all,
-- from the first row of the partition to the current row
sum(cnt) over(partition by name order by dt rows between unbounded preceding and current row) as cut_all2,
-- previous three rows + current row
sum(cnt) over(partition by name order by dt rows between 3 preceding and current row) as cut_all3,
-- previous three rows + current row + next row
sum(cnt) over(partition by name order by dt rows between 3 preceding and 1 following) as cut_all4,
-- unbounded above + unbounded below (the whole partition)
sum(cnt) over(partition by name order by dt rows between unbounded preceding and UNBOUNDED FOLLOWING) as cut_all5
from window01;
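
To make the frame clauses concrete, the sketch below evaluates them on the linux01 rows only. It assumes sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0; the column aliases are illustrative names, not taken from the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux01", "2022-04-15", 1), ("linux01", "2022-04-16", 5),
        ("linux01", "2022-04-17", 7), ("linux01", "2022-04-18", 2),
        ("linux01", "2022-04-19", 3), ("linux01", "2022-04-20", 10),
        ("linux01", "2022-04-21", 4),
    ],
)

rows = conn.execute(
    """
    select cnt,
           sum(cnt) over(partition by name order by dt
                         rows between unbounded preceding and current row) as to_current,
           sum(cnt) over(partition by name order by dt
                         rows between 3 preceding and current row) as prev3_cur,
           sum(cnt) over(partition by name order by dt
                         rows between 3 preceding and 1 following) as prev3_next1,
           sum(cnt) over(partition by name order by dt
                         rows between unbounded preceding and unbounded following) as whole
    from window01
    order by dt
    """
).fetchall()

# Transpose the result so each frame's column can be read as one list.
cnt, to_current, prev3_cur, prev3_next1, whole = (list(col) for col in zip(*rows))

print(cnt)          # [1, 5, 7, 2, 3, 10, 4]
print(to_current)   # [1, 6, 13, 15, 18, 28, 32]
print(prev3_cur)    # [1, 6, 13, 15, 17, 22, 19]
print(prev3_next1)  # [6, 13, 15, 18, 27, 26, 19]
print(whole)        # [32, 32, 32, 32, 32, 32, 32]
```

Near the edges of a partition the frame simply contains fewer rows, which is why the first prev3_cur values match the plain running total.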

select
name,
dt,
cnt,
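-- regular running sum: partitioned by name, ordered by dt
sum(cnt) over(partition by name order by dt) as cut_all,
-- whole table ordered by dt, running sum across all servers; rows sharing a dt (two rows per date from 04-18 on) are peers and get the same value
sum(cnt) over(order by dt) as cut_all2,
-- sum over the whole table
sum(cnt) over() as cut_all3,
-- sum per name, no ordering
sum(cnt) over(partition by name) as cut_all4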
from window01
order by dt;

1. Without partition by => the window covers the whole table
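
A sketch of what the different over() clauses return on the full window01 data, assuming sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0. The aliases by_dt, total and by_name are illustrative names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux01", "2022-04-15", 1), ("linux01", "2022-04-16", 5),
        ("linux01", "2022-04-17", 7), ("linux01", "2022-04-18", 2),
        ("linux01", "2022-04-19", 3), ("linux01", "2022-04-20", 10),
        ("linux01", "2022-04-21", 4),
        ("linux02", "2022-04-18", 20), ("linux02", "2022-04-19", 30),
        ("linux02", "2022-04-20", 10), ("linux02", "2022-04-21", 40),
    ],
)

rows = conn.execute(
    """
    select name, dt,
           sum(cnt) over(order by dt) as by_dt,          -- no partition: whole table
           sum(cnt) over() as total,                      -- no partition, no order
           sum(cnt) over(partition by name) as by_name    -- per server, no order
    from window01
    order by dt, name
    """
).fetchall()

by_dt = [r[2] for r in rows]
totals = {r[3] for r in rows}
by_name = {r[0]: r[4] for r in rows}

# Rows with the same dt are peers, so both servers get the same running value.
print(by_dt)    # [1, 6, 13, 35, 35, 68, 68, 88, 88, 132, 132]
print(totals)   # {132}
print(by_name)  # {'linux01': 32, 'linux02': 100}
```

The repeated values in by_dt come from the default frame used when only order by is given: it extends through all rows that tie with the current row on dt.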


Data warehouse processing order
    leave the ODS layer untouched
    union all + group by, select, ifnull, case when
    join
    group by
    grouping sets [combinations of dimensions]

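4. Built-in window functions

1. Value functions: neighboring rows

1. Neighboring rows
            LAG [value from the n-th row above the current row within the window; n defaults to 1]
                LAG(column [, N[, default]])
                column => column name
                n => how many rows to look back
                default => value returned when there is no such row
            LEAD [value from the n-th row below the current row within the window; n defaults to 1, same arguments as LAG]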

            select
            name,
            dt,
            cnt,
            sum(cnt) over(partition by name order by dt) as cut_all,
            lead(dt,1,'9999-99-99') over(partition by name order by dt) as lead_alias,
            lag(dt,1,'9999-99-99') over(partition by name order by dt) as lag_alias
            from window01;
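
A short sketch of LAG and LEAD on the linux02 rows, assuming sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0. The default value '9999-99-99' is the placeholder used in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux02", "2022-04-18", 20), ("linux02", "2022-04-19", 30),
        ("linux02", "2022-04-20", 10), ("linux02", "2022-04-21", 40),
    ],
)

# lag looks one row up, lead looks one row down; the third argument is
# returned when the partition has no such row.
rows = conn.execute(
    """
    select dt,
           lag(dt, 1, '9999-99-99') over(partition by name order by dt) as lag_alias,
           lead(dt, 1, '9999-99-99') over(partition by name order by dt) as lead_alias
    from window01
    order by dt
    """
).fetchall()

for row in rows:
    print(row)
# ('2022-04-18', '9999-99-99', '2022-04-19')
# ('2022-04-19', '2022-04-18', '2022-04-20')
# ('2022-04-20', '2022-04-19', '2022-04-21')
# ('2022-04-21', '2022-04-20', '9999-99-99')
```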
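2. First and last values
            FIRST_VALUE(): first value in the partition after ordering, within the frame that ends at the current row
            LAST_VALUE(): last value in the partition after ordering, within the frame that ends at the current row. With the default frame this is the current row's own value (or that of its last peer); to get the last value of the whole partition, add rows between unbounded preceding and unbounded following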

            select
            name,
            dt,
            cnt,
            first_value(cnt) over(partition by name order by dt) as f_value,
            last_value(cnt) over(partition by name order by dt) as l_value
            from window01;
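
The effect of the frame on LAST_VALUE is easy to miss, so here is a sketch on the linux02 rows. It assumes sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0; the alias l_value_full is an illustrative name for the variant with an explicit full-partition frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux02", "2022-04-18", 20), ("linux02", "2022-04-19", 30),
        ("linux02", "2022-04-20", 10), ("linux02", "2022-04-21", 40),
    ],
)

rows = conn.execute(
    """
    select cnt,
           first_value(cnt) over(partition by name order by dt) as f_value,
           last_value(cnt) over(partition by name order by dt) as l_value,
           last_value(cnt) over(partition by name order by dt
                                rows between unbounded preceding
                                         and unbounded following) as l_value_full
    from window01
    order by dt
    """
).fetchall()

cnt, f_value, l_value, l_value_full = (list(col) for col in zip(*rows))

print(f_value)       # [20, 20, 20, 20]
print(l_value)       # [20, 30, 10, 40]  (default frame stops at the current row)
print(l_value_full)  # [40, 40, 40, 40]  (frame covers the whole partition)
```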

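2. Ranking

Bucketing
            ntile
            Requirement:
                Sort the rows by some column and split them into n buckets: ntile(n)
                If the rows cannot be split evenly, the extra rows go to the lower-numbered buckets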
            select
            name,
            dt,
            cnt,
            sum(cnt) over(partition by name order by dt) as cut_all,
            -- split into n buckets; when the split is uneven, the extra rows go to the lowest-numbered buckets
            ntile(2) over(partition by name order by dt) as n2,
            ntile(3) over(partition by name order by dt) as n3
            from window01
            order by dt;
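
A sketch of how ntile distributes an uneven number of rows, using the seven linux01 rows. It assumes sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux01", "2022-04-15", 1), ("linux01", "2022-04-16", 5),
        ("linux01", "2022-04-17", 7), ("linux01", "2022-04-18", 2),
        ("linux01", "2022-04-19", 3), ("linux01", "2022-04-20", 10),
        ("linux01", "2022-04-21", 4),
    ],
)

rows = conn.execute(
    """
    select dt,
           ntile(2) over(partition by name order by dt) as n2,
           ntile(3) over(partition by name order by dt) as n3
    from window01
    order by dt
    """
).fetchall()

n2 = [r[1] for r in rows]
n3 = [r[2] for r in rows]

# 7 rows in 2 buckets -> sizes 4 and 3; 7 rows in 3 buckets -> sizes 3, 2 and 2.
print(n2)  # [1, 1, 1, 1, 2, 2, 2]
print(n3)  # [1, 1, 1, 2, 2, 3, 3]
```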
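Ranking
            rank: numbers rows within the partition from 1 in sort order; ties share a rank and leave gaps after them (1,1,3)
            row_number: numbers rows within the partition from 1 in sort order; every row gets a distinct number (1,2,3)
            dense_rank: numbers rows within the partition from 1 in sort order; ties share a rank and leave no gaps (1,1,2)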

            select
            name,
            dt,
            cnt,
            sum(cnt) over(partition by name order by dt) as cut_all,
            rank() over(partition by name order by cnt desc) as rk,
            row_number() over(partition by name order by cnt desc) as rw,
            dense_rank() over(partition by name order by cnt desc) as d_rk
            from window01;
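
In the query above no two rows of the same server share a cnt, so the three columns come out identical. The sketch below drops partition by name (a change made only for this illustration) so that the two rows with cnt = 10 tie and the difference becomes visible. It assumes sqlite3 from the Python standard library (SQLite 3.25 or later) as a stand-in for MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table window01 (name varchar(50), dt varchar(20), cnt int)")
conn.executemany(
    "insert into window01 values (?, ?, ?)",
    [
        ("linux01", "2022-04-15", 1), ("linux01", "2022-04-16", 5),
        ("linux01", "2022-04-17", 7), ("linux01", "2022-04-18", 2),
        ("linux01", "2022-04-19", 3), ("linux01", "2022-04-20", 10),
        ("linux01", "2022-04-21", 4),
        ("linux02", "2022-04-18", 20), ("linux02", "2022-04-19", 30),
        ("linux02", "2022-04-20", 10), ("linux02", "2022-04-21", 40),
    ],
)

# Rank the whole table by cnt, highest first.
rows = conn.execute(
    """
    select cnt,
           rank() over(order by cnt desc) as rk,
           row_number() over(order by cnt desc) as rw,
           dense_rank() over(order by cnt desc) as d_rk
    from window01
    order by rw
    """
).fetchall()

cnt, rk, rw, d_rk = (list(col) for col in zip(*rows))

print(cnt)   # [40, 30, 20, 10, 10, 7, 5, 4, 3, 2, 1]
print(rk)    # [1, 2, 3, 4, 4, 6, 7, 8, 9, 10, 11]
print(rw)    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(d_rk)  # [1, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10]
```

Which of the two tied rows receives row_number 4 and which receives 5 is not defined by the query; add a tiebreaker column to order by when that matters.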

Source: https://www.cnblogs.com/old-salted-fish/p/16150647.html
