背景:
在日常的工作中,我们会遇见一张表中存在重复的数据,这个重复的数据是指几个字段组成一个唯一索引的,但是因为在加工数据的过程中,会去掉这个唯一索引,但是在迁移到业务表中需要这个唯一索引,或者是业务要求等条件的,如下就演示了如何取其一:
如下就是业务,我们需要根据业务的需求取出重复数据中的其中一条数据
-- Create table create table sc_class ( id VARCHAR2(32) not null, class_id VARCHAR2(20) not null, xm VARCHAR2(50) not null, sfzjhm VARCHAR2(30) not null, sfzjlx VARCHAR2(5) not null, kc VARCHAR2(5), cj VARCHAR2(5), class_pm VARCHAR2(6), lrrq DATE, lrr_dm CHAR(11), yxbz CHAR(1) not null, ) tablespace TS_CLASS_DATA pctfree 10 initrans 1 maxtrans 255 storage ( initial 64K next 1M minextents 1 maxextents unlimited ); -- Add comments to the table comment on table sc_class is '学生信息'; -- Add comments to the columns comment on column sc_class.id is 'ID'; comment on column sc_class.class_id is '班级ID'; comment on column sc_class.xm is '姓名'; comment on column sc_class.sfzjhm is '身份证件号码'; comment on column sc_class.sfzjlx is '身份证件类型'; comment on column sc_class.kc is '课程'; comment on column sc_class.cj is '成绩'; comment on column sc_class.class_pm is '年级排名'; comment on column sc_class.lrrq is '修改日期'; comment on column sc_class.lrr_dm is '修改人代码'; comment on column sc_class.yxbz is '有效标志 Y:是 N:否'; -- Create/Recreate primary, unique and foreign key constraints alter table sc_class add constraint PK_CLASSID primary key (ID) using index tablespace TS_CLASS_IDX pctfree 10 initrans 2 maxtrans 255 storage ( initial 64K next 1M minextents 1 maxextents unlimited );
sql语句如下:
这里就是取出来的rowid中最大那个值的数据,
select * from sc_class t where id in (select a.id from sc_class a where rowid in (select max(rowid) from sc_class b where b.yxbz = 'Y' and (b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc) in (select d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc from sc_class d where where d.yxbz = 'Y' group by d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc having count(1) > 1) group by b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc));
核心的sql语句如下:
--方法一 select * from tb_supply where rowid=any(select max(rowid) from tb_supply group by phone_id) --方法二 select * from tb_supply where rowid in (select max(rowid) from tb_supply group by phone_id)
总结:这里就是将重复的数据按照一定的规则取出rowid最大的那个值,这里只是查询的,如果还需要将查询的数据删除或者是更新,可以按照如下的:
-----更新的SQL: update sc_class t set t.yxbz = 'N' where id in (select a.id from sc_class a where rowid in (select max(rowid) from sc_class b where b.yxbz = 'Y' and (b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc) in (select d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc from sc_class d where where d.yxbz = 'Y' group by d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc having count(1) > 1) group by b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc)); -----删除的sql: delete sc_class t where t.id in (select a.id from sc_class a where rowid in (select max(rowid) from sc_class b where b.yxbz = 'Y' and (b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc) in (select d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc from sc_class d where where d.yxbz = 'Y' group by d.class_id, d.xm, d.sfzjhm, d.sfzjlx, d.kc having count(1) > 1) group by b.class_id, b.xm, b.sfzjhm, b.sfzjlx, b.kc));
到此重复数据取其一,或者是更新/删除其一的,同学们可以根据需求进行修改。