SQL Deduplication Performance Comparison


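In table TB, among the rows where gid is 1 or 2, I want to find the rows that share the same souceid and the same type, and keep only one row from each such group.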

id  souceid  type  gid
1   s1       t1    1
2   s1       t1    2
3   s2       t2    1
4   s2       t2    2
5   s3       t3    1

The three approaches I know of use IN, JOIN, and row_number respectively:

1.select * from TB where id in (select Max(id) from TB where gid in(1,2) group by souceid,type)

2.select * from TB A join (select Max(id) id from TB where gid in(1,2) group by souceid,type) B on A.id=B.id

3.select * from (select id,souceid,type,gid,row_number() over (partition by souceid,type order by id) as rn) A where A.rn=1

Could the experts here tell me which approach is the most efficient, or whether there is a better way to improve on them?

哆啦的时光机

Views: 523 | Answers: 4

4 Answers

眼眸繁星

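Approach 3 is the most efficient, though it seems it should be written like this: select * from (select id,souceid,type,gid,row_number() over (partition by souceid,type order by id) as rn from TB) A where A.rn=1. It produces the result directly. Approaches 1 and 2 both run in two steps, and both end up involving an aggregation. That is my understanding; if anything is wrong, please point it out. Thanks!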
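To see what each of the three approaches actually returns on the sample rows, here is a minimal sketch. It makes several assumptions that are not in the thread: it uses an in-memory SQLite database through Python's sqlite3 module as a stand-in (the thread never names the DBMS), it needs SQLite 3.25 or later for row_number(), and it adds the gid in (1,2) filter to approach 3 so that all three queries look at the same rows. It says nothing about speed on a real system; it only compares results.

```python
import sqlite3

# In-memory SQLite is only a stand-in; the thread does not say which DBMS is used.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TB (id INTEGER PRIMARY KEY, souceid TEXT, type TEXT, gid INTEGER)"
)
conn.executemany(
    "INSERT INTO TB VALUES (?, ?, ?, ?)",
    [(1, "s1", "t1", 1), (2, "s1", "t1", 2), (3, "s2", "t2", 1),
     (4, "s2", "t2", 2), (5, "s3", "t3", 1)],
)

QUERIES = {
    # Approach 1: IN over the per-group maximum id.
    "in": """
        SELECT id FROM TB
        WHERE id IN (SELECT MAX(id) FROM TB WHERE gid IN (1, 2)
                     GROUP BY souceid, type)
        ORDER BY id""",
    # Approach 2: JOIN against the same per-group maximum id.
    "join": """
        SELECT A.id FROM TB A
        JOIN (SELECT MAX(id) AS id FROM TB WHERE gid IN (1, 2)
              GROUP BY souceid, type) B ON A.id = B.id
        ORDER BY A.id""",
    # Approach 3: row_number(), with the gid filter added (an assumption).
    "row_number": """
        SELECT id FROM (
            SELECT id,
                   row_number() OVER (PARTITION BY souceid, type ORDER BY id) AS rn
            FROM TB WHERE gid IN (1, 2)
        ) A
        WHERE A.rn = 1
        ORDER BY id""",
}

# Collect the ids each approach keeps, one list per approach.
results = {name: [row[0] for row in conn.execute(sql)] for name, sql in QUERIES.items()}
for name, ids in results.items():
    print(name, ids)
```

On this sample data, approaches 1 and 2 keep ids 2, 4, 5 (the largest id in each group), while approach 3 keeps ids 1, 3, 5, because order by id ranks the smallest id first. Changing it to order by id desc would make all three keep the same rows.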


撒科打诨

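Not necessarily. Analytic functions are actually fairly expensive..... On large data sets, looking at the whole picture, the most efficient is 2 or 3, and when the data volume is large it should be 2. Changing gid in(1,2) to gid = 1 or gid = 2 can be a little more efficient. If you want the highest efficiency, you should merge it into a single statement: quite simple. It is best to add an index on GID; if the values are always 1 and 2, add a bitmap index.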
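The index and IN-versus-OR suggestions can be checked rather than guessed at. The following is a rough sketch under stated assumptions: it again uses in-memory SQLite via Python's sqlite3 as a stand-in, the index name idx_tb_gid is hypothetical, and SQLite has no bitmap indexes (those are a feature of other systems such as Oracle), so only an ordinary index is shown. The plan text printed by EXPLAIN QUERY PLAN differs between engines and versions, so on the real database the equivalent plan tool should be used on realistic data volumes.

```python
import sqlite3

# In-memory SQLite is only a stand-in for the unnamed production DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TB (id INTEGER PRIMARY KEY, souceid TEXT, type TEXT, gid INTEGER)"
)
conn.executemany(
    "INSERT INTO TB VALUES (?, ?, ?, ?)",
    [(1, "s1", "t1", 1), (2, "s1", "t1", 2), (3, "s2", "t2", 1),
     (4, "s2", "t2", 2), (5, "s3", "t3", 1)],
)

# Ordinary index on gid; the name idx_tb_gid is made up for this example.
conn.execute("CREATE INDEX idx_tb_gid ON TB (gid)")

IN_FORM = "SELECT MAX(id) FROM TB WHERE gid IN (1, 2) GROUP BY souceid, type"
OR_FORM = "SELECT MAX(id) FROM TB WHERE gid = 1 OR gid = 2 GROUP BY souceid, type"

# Both spellings of the filter should select exactly the same ids.
rows_in = sorted(row[0] for row in conn.execute(IN_FORM))
rows_or = sorted(row[0] for row in conn.execute(OR_FORM))

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Print the plans so the two forms can be compared by eye.
for label, sql in (("IN", IN_FORM), ("OR", OR_FORM)):
    print(label, plan(sql))

# Names of the indexes that now exist on the database.
index_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
```

Both forms return the same rows here, so any difference between them is purely a matter of the plan the optimizer picks, which is what the printed output is meant to expose.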


UYOU

Then how would you merge it into a single statement?
