How to filter data in lists, dictionaries, and sets by condition

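Lists, dictionaries, and sets can all be filtered the same general way: iterate over the elements and keep the ones that satisfy a condition.
For example: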

>>> data = [1, 5, -3, -2, 6, 8, 9]
>>> res = []
>>> for x in data:
...     if x >= 0:
...         res.append(x)
... 
>>> print res
[1, 5, 6, 8, 9]

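To make the code more Pythonic, use a functional style (filter) or a comprehension instead of the explicit loop.
1. Lists: filter out the negative numbers in a list
step1: randomly generate a list of 10 elements
step2: run filter? in IPython to view the function's help (the session is Python 2, where filter returns a list)
step3: select the non-negative numbers
Method 1: the filter function
Method 2: a list comprehension
step4: use timeit to time both methods (on a 1.4 GHz CPU, which is quite slow)
In this run the list comprehension took 1.02 µs per loop against 2.58 µs for filter, so it is the one to prefer. Both are reported to be faster than the explicit loop, though the loop itself is not timed in the session.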

In [1]: from random import randint

In [2]: data = [randint(-10,10) for _ in xrange(10)]

In [3]: data
Out[3]: [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]

In [4]: filter?
Docstring:
filter(function or None, sequence) -> list, tuple, or string

Return those items of sequence for which function(item) is true.  If
function is None, return the items that are true.  If sequence is a tuple
or string, return the same type, else return a list.
Type:      builtin_function_or_method

In [5]: filter(lambda x: x >= 0, data)
Out[5]: [9, 2, 6, 4, 7]

In [6]: [x for x in data if x >= 0]
Out[6]: [9, 2, 6, 4, 7]

In [7]: timeit filter(lambda x: x >= 0,data)
The slowest run took 6.55 times longer than the fastest. This could mean that an intermediate result is being cached 
100000 loops, best of 3: 2.58 µs per loop

In [8]: timeit [x for x in data if x >= 0]
The slowest run took 5.82 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 1.02 µs per loop
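
The session above times filter and the list comprehension but not the explicit loop. The following is a minimal sketch, assuming Python 3, that times all three with the standard timeit module. The function names with_loop, with_filter, and with_comprehension are made up for this illustration, and the measured numbers will vary with the machine and interpreter version, so no timings are quoted here.

```python
import timeit
from random import randint

# Same shape of input as the session above: 10 random integers in [-10, 10].
data = [randint(-10, 10) for _ in range(10)]


def with_loop(values):
    """Explicit loop with append, as in the first example."""
    res = []
    for x in values:
        if x >= 0:
            res.append(x)
    return res


def with_filter(values):
    """In Python 3, filter() returns a lazy iterator, so wrap it in list()."""
    return list(filter(lambda x: x >= 0, values))


def with_comprehension(values):
    """List comprehension."""
    return [x for x in values if x >= 0]


# Time each approach on the same input; compare the numbers on your own machine.
for func in (with_loop, with_filter, with_comprehension):
    seconds = timeit.timeit(lambda: func(data), number=100000)
    print("{:<20} {:.3f} s per 100000 calls".format(func.__name__, seconds))
```

All three functions return the same list, so the choice between them comes down to readability and the timings you measure yourself.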

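2. Dictionaries: select the items whose value is greater than 90
step1: randomly generate student IDs and scores for a class of 20
step2: select the key-value pairs k, v whose score is greater than 90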

In [9]: d = {x: randint(60, 100) for x in xrange(1,21)}

In [10]: d
Out[10]: 
{1: 89,
 2: 74,
 3: 71,
 4: 64,
 5: 94,
 6: 63,
 7: 75,
 8: 67,
 9: 93,
 10: 99,
 11: 91,
 12: 96,
 13: 94,
 14: 68,
 15: 75,
 16: 84,
 17: 93,
 18: 94,
 19: 83,
 20: 97}

In [11]: {k: v for k,v in d.iteritems() if v > 90}
Out[11]: {5: 94, 9: 93, 10: 99, 11: 91, 12: 96, 13: 94, 17: 93, 18: 94, 20: 97}
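
xrange and dict.iteritems exist only in Python 2. As a small sketch of the same filter under Python 3 (assuming 3.7 or later, where dictionaries keep insertion order), the version below uses items() instead. The scores dictionary holds a handful of the ID and score pairs from the output above, and the variable names are chosen only for this illustration.

```python
# A few of the (student ID, score) pairs from the output above.
scores = {1: 89, 5: 94, 9: 93, 10: 99, 16: 84, 20: 97}

# items() takes the place of the Python 2 iteritems() call.
high = {k: v for k, v in scores.items() if v > 90}
print(high)  # {5: 94, 9: 93, 10: 99, 20: 97}

# The same selection written with filter() over the (key, value) pairs.
high_via_filter = dict(filter(lambda kv: kv[1] > 90, scores.items()))
print(high_via_filter == high)  # True
```

The comprehension unpacks each pair into k and v directly, which tends to read more clearly than indexing kv[1] inside a lambda.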

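3. Sets: select the elements that are divisible by 3
step1: convert data to a set
step2: use a set comprehension that keeps the elements whose remainder modulo 3 is 0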

In [12]: data
Out[12]: [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]

In [13]: s = set(data)

In [14]: s
Out[14]: {-9, -8, -6, -2, -1, 2, 4, 6, 7, 9}

In [15]: {x for x in s if x % 3 == 0}
Out[15]: {-9, -6, 6, 9}
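
Negative numbers such as -9 and -6 pass the test because Python's % operator returns a result with the sign of the divisor, so a negative multiple of 3 still has remainder 0. The sketch below, written for Python 3 with illustrative variable names, repeats the set comprehension on the same data and shows the equivalent filter call. The results are sorted before printing because sets have no guaranteed display order.

```python
data = [9, -8, -2, 2, 6, 4, -9, -1, 7, -6]
s = set(data)

# Set comprehension: keep the elements whose remainder modulo 3 is 0.
multiples_of_3 = {x for x in s if x % 3 == 0}
print(sorted(multiples_of_3))  # [-9, -6, 6, 9]

# Equivalent filter() call; wrap the iterator in set() to get a set back.
via_filter = set(filter(lambda x: x % 3 == 0, s))
print(via_filter == multiples_of_3)  # True

# The remainder takes the sign of the divisor, so these are 0 and 1.
print(-9 % 3, -8 % 3)  # 0 1
```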

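Summary: this note showed how to filter lists, dictionaries, and sets by condition using filter and comprehensions. The author's conclusion is that both are faster than the explicit loop. Only filter and the list comprehension were timed, and there the list comprehension was about 2.5 times faster.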