有一个登录(login)记录表,简况如下:
1、查询每个日期新用户的次日留存率
2、结果保留小数点后面3位数(3位之后的四舍五入)
3、查询结果按照日期升序排序
上面的例子查询结果如下:
留存率 = 次日的新用户的数量 / 当日新用户的数量
由于涉及两个数量的计算,最好分别求出彼此的数量,然后再计算
1.0 * count( a ) / count( b )
首先要找到当日新用户所在的范围
select user_id, min(date) as date from login group by user_id
然后,找到次日从新用户所在的范围
(select user_id, min(date) as date from login group by user_id) as l1 #当日的新用户 left join login as l2 #次日的新用户 on l1.user_id = l2.user_id and l2.date = date_add(l1.date, interval 1 day) #次日的日期比当日的多一天
再根据条件,联立可得
select l1.date, round(1.0 * count(l2.date) / count(l1.date), 3) as p from (select user_id, min(date) as date from login group by user_id) as l1 left join login as l2 on l1.user_id = l2.user_id and l2.date = date_add(l1.date, interval 1 day) group by l1.date
当存在留存率为0的情况时,count(l1,date)为0,算式不成立,则没有找出这一行
所以需要手动添加
select date, 0.000 as p from login where date not in (select min(date) from login group by user_id) order by date;
最后的代码为
select l1.date, round(1.0 * count(l2.date) / count(l1.date), 3) as p from (select user_id, min(date) as date from login group by user_id) as l1 left join login as l2 on l1.user_id = l2.user_id and l2.date = date_add(l1.date, interval 1 day) group by l1.date union select date, 0.000 as p from login where date not in (select min(date) from login group by user_id) order by date;
这题我没做出来,有很多格式上的问题导致代码不通过,比如在括号内是不能加入命名函数作为条件来使用,还有order by是在代码的最后才能使用,在union前不能使用
还需要继续学习