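For the common data structures (lists, dictionaries, and sets), the general-purpose way to filter data is to iterate over it.
For example: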
>>> data = [1, 5, -3, -2, 6, 8, 9]
>>> res = []
>>> for x in data:
...     if x >= 0:
...         res.append(x)
...
>>> print res
[1, 5, 6, 8, 9]
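To make the code more Pythonic, use a functional style instead.
1. Lists: filter out the negative numbers in a list
step1: generate a random list of 10 elements
step2: run filter? to view the function's documentation
step3: keep the non-negative numbers
Method 1: the filter function
Method 2: a list comprehension
step4: use timeit to compare the running time of the two methods (on a slow 1.4 GHz CPU)
Both methods are expected to beat the explicit loop, although the loop itself is not timed in the session below. Of the two, the list comprehension is faster (1.02 µs vs. 2.58 µs per loop), so prefer it.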
In [1]: from random import randint
In [2]: data = [randint(-10,10) for _ in xrange(10)]
In [3]: data
Out[3]: [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]
In [4]: filter?
Docstring:
filter(function or None, sequence) -> list, tuple, or string
Return those items of sequence for which function(item) is true. If
function is None, return the items that are true. If sequence is a tuple
or string, return the same type, else return a list.
Type: builtin_function_or_method
In [5]: filter(lambda x: x >= 0, data)
Out[5]: [9, 2, 6, 4, 7]
In [6]: [x for x in data if x >= 0]
Out[6]: [9, 2, 6, 4, 7]
In [7]: timeit filter(lambda x: x >= 0, data)
The slowest run took 6.55 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 2.58 µs per loop
In [8]: timeit [x for x in data if x >= 0]
The slowest run took 5.82 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 1.02 µs per loop
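The session above is Python 2: it uses xrange, and filter returns a list. As a minimal sketch, assuming Python 3, the same step might look like the following. There filter returns a lazy iterator, so it has to be wrapped in list() before the result can be inspected. The variable names are illustrative and do not come from the original session.

```python
from random import randint

# Assumption: Python 3, where range replaces Python 2's xrange.
data = [randint(-10, 10) for _ in range(10)]

# Method 1: filter returns a lazy iterator in Python 3, so materialize it.
non_negative_filter = list(filter(lambda x: x >= 0, data))

# Method 2: a list comprehension builds the list directly.
non_negative_comp = [x for x in data if x >= 0]

# Both methods keep the same items in the same order.
print(non_negative_filter == non_negative_comp)  # True
```

The comprehension keeps the condition inline, while filter calls the lambda once per element, which is one plausible reason for the gap seen in the timings.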
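2. Dictionaries: select the items whose value is greater than 90
step1: randomly generate student IDs and scores for a class of 20
step2: keep the key-value pairs k, v whose score is greater than 90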
In [9]: d = {x: randint(60, 100) for x in xrange(1,21)}
In [10]: d
Out[10]:
{1: 89,
2: 74,
3: 71,
4: 64,
5: 94,
6: 63,
7: 75,
8: 67,
9: 93,
10: 99,
11: 91,
12: 96,
13: 94,
14: 68,
15: 75,
16: 84,
17: 93,
18: 94,
19: 83,
20: 97}
In [11]: {k: v for k,v in d.iteritems() if v > 90}
Out[11]: {5: 94, 9: 93, 10: 99, 11: 91, 12: 96, 13: 94, 17: 93, 18: 94, 20: 97}
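iteritems() only exists in Python 2. A hedged sketch of the same selection, assuming Python 3, uses items() instead. The filter-based variant is included only for comparison, and the names scores, high_scores, and high_scores_filter are illustrative.

```python
from random import randint

# Assumption: Python 3, where dict.items() replaces iteritems().
scores = {student_id: randint(60, 100) for student_id in range(1, 21)}

# Dict comprehension: keep only the pairs whose score is above 90.
high_scores = {k: v for k, v in scores.items() if v > 90}

# The same selection with filter applied to the (key, value) tuples.
high_scores_filter = dict(filter(lambda kv: kv[1] > 90, scores.items()))

print(high_scores == high_scores_filter)  # True
```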
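3. Sets: select the elements that are divisible by 3
step1: convert data to a set
step2: use a set comprehension that keeps elements whose remainder modulo 3 is 0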
In [12]: data
Out[12]: [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]
In [13]: s = set(data)
In [14]: s
Out[14]: {-9, -8, -6, -2, -1, 2, 4, 6, 7, 9}
In [15]: {x for x in s if x % 3 == 0}
Out[15]: {-9, -6, 6, 9}
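As a small sketch using the same values as the session, the intermediate set is optional: a set comprehension can read straight from the list. Python's % returns 0 for negative multiples as well, which is why -9 and -6 pass the test. The names below are illustrative.

```python
# Same values as the `data` list in the session above.
data = [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]

# A set comprehension can filter and deduplicate in one pass over the list.
multiples_of_3 = {x for x in data if x % 3 == 0}

# Equivalent result with filter; set() consumes the filtered items.
multiples_of_3_filter = set(filter(lambda x: x % 3 == 0, data))

print(sorted(multiples_of_3))  # [-9, -6, 6, 9]
```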
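Summary: this post showed how to filter data by a condition in lists, dictionaries, and sets. In the timings above, the list comprehension was faster than filter (1.02 µs vs. 2.58 µs per loop). Both are expected to be faster than the explicit loop, but the loop was not timed here, so the actual gap depends on the data and the condition.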
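Because the explicit loop was never timed, the three approaches can be measured side by side. The following is a rough sketch, assuming Python 3 and the standard timeit module. The helper compare and the statement templates are hypothetical names introduced here, and the two conditions are only examples. Results will vary with the machine, the data, and the condition.

```python
import timeit

# Same ten values as the session above, so runs are repeatable.
SETUP = "data = [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]"

# Statement templates; {cond} is replaced by the filter condition.
LOOP = """
res = []
for x in data:
    if {cond}:
        res.append(x)
"""
FILTER = "list(filter(lambda x: {cond}, data))"
COMPREHENSION = "[x for x in data if {cond}]"


def compare(cond, number=20000):
    """Return {approach: best total seconds for `number` runs} (hypothetical helper)."""
    statements = {"loop": LOOP, "filter": FILTER, "comprehension": COMPREHENSION}
    return {
        name: min(timeit.repeat(stmt.format(cond=cond), setup=SETUP,
                                repeat=3, number=number))
        for name, stmt in statements.items()
    }


if __name__ == "__main__":
    for cond in ("x >= 0", "x % 2"):
        print(cond, compare(cond))
```

Taking the minimum of several repeats follows the same "best of 3" convention that the timeit magic reports in the session.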
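Top comments
Hi, I compared the speed of the three approaches and my results differ from your conclusion.
When the filter condition and the number of repetitions change, the speed results change too.
With data % 2 as the filter condition, repeated 1000000000 times, the filter function was the fastest;
With data > 0 as the filter condition, repeated 1000000000 times, the for loop was the fastest;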