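## I. Filtering data in lists, dicts, and sets by a condition

The data in the experiments below is generated randomly with the random module, so the results differ from run to run.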
### 1. Filter the negative numbers out of a list

    # -*- coding:utf-8 -*-
    from random import randint

    data = [randint(-10, 10) for _ in xrange(10)]

    # Option one: the filter function
    print filter(lambda x: x >= 0, data)

    # Option two: a list comprehension
    print [x for x in data if x >= 0]

Output:

    [8, 5, 0, 3]
    [8, 5, 0, 3]
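
The snippets in this article are written for Python 2. As a minimal sketch, assuming Python 3, the same filtering could look like the following. There `filter` returns a lazy iterator, so it is wrapped in `list`. The `seed` call and the variable names are additions for illustration only.

```python
from random import randint, seed

seed(0)  # assumption: fixed seed so repeated runs produce the same data
data = [randint(-10, 10) for _ in range(10)]

# Option one: filter() returns an iterator in Python 3, so materialise it
non_negative_filter = list(filter(lambda x: x >= 0, data))

# Option two: a list comprehension
non_negative_comp = [x for x in data if x >= 0]

print(non_negative_filter)
print(non_negative_comp)
```

Both options keep the original order of the elements and produce the same list.
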
### 2. Pick the dict items whose value is above 90

    # -*- coding:utf-8 -*-
    from random import randint

    d = {x: randint(60, 100) for x in xrange(1, 21)}
    print {k: v for k, v in d.iteritems() if v > 90}

Output:

    {10: 97}
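
For readers on Python 3, here is a hedged sketch of the same dict comprehension. The only real change is that `items()` takes the place of `iteritems()`; the names `scores` and `high` are illustrative and not from the original.

```python
from random import randint

# 20 hypothetical student ids mapped to random scores
scores = {x: randint(60, 100) for x in range(1, 21)}

# items() replaces Python 2's iteritems()
high = {k: v for k, v in scores.items() if v > 90}
print(high)
```
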
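### 3. Pick the elements of a set that are divisible by 3

    # -*- coding:utf-8 -*-
    from random import randint

    data = [randint(-10, 10) for _ in xrange(10)]
    s = set(data)
    print {x for x in s if x % 3 == 0}

Output:

    set([0, 9])

## II. Naming, counting, and dicts

### 1. How to name each element of a tuple to make a program more readable

Option one:

Define something like the enumeration types of other languages, that is, a series of numeric constants.

    # -*- coding:utf-8 -*-
    # Each constant is the index of one field in the tuple
    NAME = 0
    AGE = 1
    SEX = 2
    EMAIL = 3

    student = ('jim', 16, 'male', 'jim@gmail.com')

    if student[AGE] > 18:
        pass
    if student[SEX] == 'male':
        pass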
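
Python 3.4 and later ship an `enum` module that can play the role of these hand-written constants. The following is only a sketch of that alternative, not something the original article uses; the class name `Field` is hypothetical.

```python
from enum import IntEnum


class Field(IntEnum):  # hypothetical name, chosen for illustration
    NAME = 0
    AGE = 1
    SEX = 2
    EMAIL = 3


student = ('jim', 16, 'male', 'jim@gmail.com')

# IntEnum members behave like ints, so they can index a tuple directly
is_adult = student[Field.AGE] > 18
is_male = student[Field.SEX] == 'male'
print(is_adult, is_male)
```

Grouping the indices in one class keeps them out of the module namespace, at the cost of the longer `Field.AGE` spelling.
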
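Option two:

Use collections.namedtuple from the standard library in place of the built-in tuple.

namedtuple acts as a class factory. The resulting s is still a tuple, so its fields can be read either by index or by attribute name.

    # -*- coding:utf-8 -*-
    from collections import namedtuple

    student = namedtuple('Student', ['name', 'age', 'sex', 'email'])
    s = student('jim', 16, 'male', 'jim@gmail.com')
    print s
    print s.name

Output:

    Student(name='jim', age=16, sex='male', email='jim@gmail.com')
    jim

### 2. How to count how often each element occurs in a sequence

Example 1: in a random sequence, find the three elements that occur most often and how many times each occurs.

Method one: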
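
    # -*- coding:utf-8 -*-
    from random import randint

    data = [randint(0, 20) for _ in xrange(30)]
    c = dict.fromkeys(data, 0)  # every element starts with a count of 0
    print c
    for x in data:
        c[x] = c[x] + 1
    print c.items()
    # sort by count and keep the three largest
    print sorted(c.items(), key=lambda d: d[1])[-3:]

Output (the first print, the dict of zeros, is not shown):

    [(0, 1), (1, 2), (2, 3), (3, 2), (4, 1), (5, 2), (6, 1), (7, 2), (8, 3), (9, 3), (11, 2), (12, 2), (15, 4), (19, 1), (20, 1)]
    [(8, 3), (9, 3), (15, 4)]

Method two: use a collections.Counter object. Passing the sequence to the Counter constructor gives a Counter object, which is a dict mapping each element to its frequency. The Counter.most_common(n) method returns a list of the n most frequent elements.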
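
    # -*- coding:utf-8 -*-
    from random import randint
    from collections import Counter

    data = [randint(0, 20) for _ in xrange(30)]
    c2 = Counter(data)
    print c2.most_common(3)

Output:

    [(18, 4), (5, 3), (14, 3)]

Example 2: count the word frequencies in an English article and find the 10 most frequent words and their counts. (The snippet that follows prints only the top 3; pass 10 to most_common to get ten.)

> Split the file contents on runs of non-word characters.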
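
    # -*- coding:utf-8 -*-
    from collections import Counter
    import re

    txt = open('test.txt').read()
    # \W+ matches runs of characters that are not letters, digits, or underscores
    c3 = Counter(re.split(r'\W+', txt))
    print c3.most_common(3)

Output:

    [('openhpc', 26), ('resource', 17), ('queue', 16)]

### 3. Sort the items of a dict by value

Solutions: (1) use zip to turn the dict into (value, key) tuples and sort those; (2) pass a key argument to the sorted function.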
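
    # -*- coding:utf-8 -*-
    from random import randint

    d = {x: randint(60, 100) for x in 'xyzabc'}
    # Solution 1: sort (value, key) tuples, which compare by value first
    print sorted(zip(d.itervalues(), d.iterkeys()))

Output:

    [(80, 'y'), (89, 'x'), (91, 'b'), (94, 'a'), (94, 'z'), (99, 'c')]

Using the key argument instead:

    # -*- coding:utf-8 -*-
    from random import randint

    d = {x: randint(60, 100) for x in 'xyzabc'}
    # Solution 2: sort the (key, value) items by their value
    print sorted(d.items(), key=lambda x: x[1])

Output:

    [('x', 67), ('y', 71), ('c', 72), ('a', 75), ('z', 88), ('b', 89)]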
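
The two approaches differ in how they treat ties, which the following Python 3 sketch tries to make visible. The scores are fixed sample values borrowed from the first output above so that the result is reproducible; `operator.itemgetter(1)` is used here as an equivalent of the `lambda x: x[1]` key.

```python
from operator import itemgetter

# fixed sample scores (assumption: taken from the first sample output)
d = {'x': 89, 'y': 80, 'z': 94, 'a': 94, 'b': 91, 'c': 99}

# Approach 1: (value, key) tuples compare by value first, then by key
by_value_zip = sorted(zip(d.values(), d.keys()))

# Approach 2: sort the items with a key function that picks the value;
# sorted() is stable, so equal values keep their original dict order
by_value_key = sorted(d.items(), key=itemgetter(1))

print(by_value_zip)
print(by_value_key)
```

With the zip approach the tie between `'a'` and `'z'` is broken alphabetically, while the key approach leaves `'z'` before `'a'` because that is the order in which they were inserted into the dict.
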
## III. Common keys

### 1. How to quickly find the keys shared by several dicts

    # -*- coding:utf-8 -*-
    from random import randint, sample

    s1 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}
    s2 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}
    s3 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}

With only a few dicts, intersect their key views directly:

    print s1.viewkeys() & s2.viewkeys() & s3.viewkeys()

With many dicts, use the following approach. Step 1: call the dict's viewkeys() method to get the key set of one dict. Step 2: use the map function to get the key sets of all the dicts. Step 3: use the reduce function to take the intersection of all the key sets.

    print reduce(lambda a, b: a & b, map(dict.viewkeys, [s1, s2, s3]))

Output:

    set(['c', 'd'])
    set(['c', 'd'])

## IV. How to keep a dict ordered

### 1. Use collections.OrderedDict
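
    # -*- coding:utf-8 -*-
    from time import time
    from random import randint
    from collections import OrderedDict

    d = OrderedDict()
    players = list('ABCDEFGH')
    start = time()

    for i in xrange(8):
        raw_input()  # wait for Enter before the next player finishes
        p = players.pop(randint(0, 7 - i))
        end = time()
        print i + 1, p, end - start
        d[p] = (i + 1, end - start)

    print '*' * 20  # separator line
    for k in d:
        print k, d[k]

Output (separator line not shown; the final for loop visits the entries in the order in which they were inserted into the dict):

    1 C 0.934000015259
    2 D 1.40899991989
    3 F 1.67999982834
    4 A 1.95599985123
    5 E 2.16599988937
    6 H 2.37599992752
    7 B 2.60699987411
    8 G 2.99799990654
    C (1, 0.9340000152587891)
    D (2, 1.4089999198913574)
    F (3, 1.679999828338623)
    A (4, 1.9559998512268066)
    E (5, 2.1659998893737793)
    H (6, 2.375999927520752)
    B (7, 2.6069998741149902)
    G (8, 2.997999906539917)

## V. History records

### 1. Implement a user history feature (at most n entries)

Store the history in a queue with a capacity of n. Use deque from the standard library module collections, which is a double-ended queue; when it is created with a maximum length, appending to a full deque discards the oldest entry. Before the program exits, pickle can be used to save the queue object to a file, and the next run of the program can load it back.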
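
The bounded behaviour described above can be seen in isolation with a short sketch (Python 3 syntax; the capacity of 3 and the sample numbers are arbitrary choices for illustration):

```python
from collections import deque

history = deque(maxlen=3)  # keep at most the 3 most recent entries

for number in [10, 20, 30, 40]:
    history.append(number)  # once full, appending drops the oldest entry

print(list(history))
```

After the fourth append the first entry, 10, is gone, so only the last three numbers remain.
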

    # -*- coding:utf-8 -*-
    from random import randint
    from collections import deque

    N = randint(0, 100)
    history = deque([], 5)  # keeps only the 5 most recent guesses

    def guess(k):
        if k == N:
            print 'right'
            return True
        if k < N:
            print '%s is less than N' % k
        else:
            print '%s is greater than N' % k
        return False

    while True:
        line = raw_input("please input a number: ")
        if line.isdigit():
            k = int(line)
            history.append(k)
            if guess(k):
                break
        elif line == 'history' or line == 'h?':
            print list(history)

Saving the queue with pickle and loading it back, in an IPython session:

    In [1]: import pickle
    In [2]: from collections import deque
    In [3]: q = deque([],5)
    In [4]: q.append(1)
    In [5]: q.append(2)
    In [6]: q.append(3)
    In [7]: q.append(4)
    In [8]: q.append(5)
    In [9]: q.append(6)
    In [10]: q
    Out[10]: deque([2, 3, 4, 5, 6])
    In [11]: pickle.dump(q,open('history','w'))
    In [12]: pickle.load(open('history'))
    Out[12]: deque([2, 3, 4, 5, 6])
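
The session above relies on Python 2, where a pickle can be written to a file opened in text mode. A minimal sketch of the same round trip, assuming Python 3, has to open the file in binary mode. The temporary directory and the `path` and `restored` names are assumptions made for this example.

```python
import os
import pickle
import tempfile
from collections import deque

q = deque([2, 3, 4, 5, 6], maxlen=5)

# hypothetical file location inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), 'history')

with open(path, 'wb') as f:  # pickle needs binary mode in Python 3
    pickle.dump(q, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored, restored.maxlen)
```

The restored deque keeps its maximum length, so the history stays capped at five entries after the program is restarted.
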