Is it normal for tuple() to add this much run time?



I have a relatively long (20,000 rows) CSV file and a simple function to open it:

import csv
import timeit

def read_prices():
    with open('sp500.csv', 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            yield float(row['Adj Close'].strip())

When I time it as is, it takes 3e-05s:

print(timeit.timeit(lambda: read_prices(), number=100))

When I time the same function wrapped in tuple(...), it takes a whopping 27s:

print(timeit.timeit(lambda: tuple(read_prices()), number=100))

Is this normal for tuple()? Why does it happen? I'm a beginner, so ELI5 explanations are welcome :)

慕村225694


2 Answers

慕斯709654

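This happens because read_prices is not an ordinary function; it is a generator function. That is because of the yield keyword.

As explained in the Functional Programming HOWTO:

Any function containing a yield keyword is a generator function; this is detected by Python's bytecode compiler which compiles the function specially as a result. When you call a generator function, it doesn't return a single value; instead it returns a generator object that supports the iterator protocol.

So what happens when you first run read_prices() is only the creation of a generator object, which then waits to be told to yield elements. None of the function body runs yet, so the file is not even opened.

In the second version, tuple(read_prices()), you create the generator object as before, but tuple() actually exhausts it and makes it yield all of its elements in one go. The 27s is the cost of reading and converting all 20,000 rows, 100 times over.

A simple demonstration:

>>> def yielder():
...     yield from [1, 2, 3]
...
>>> y = yielder()
>>> y
<generator object yielder at 0x2b5604090de0>
>>> next(y)
1
>>> list(y)
[2, 3]
>>> tuple(yielder())
(1, 2, 3)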
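To see the same effect without the original file, here is a minimal sketch. It assumes a synthetic in-memory CSV (FAKE_CSV is a made-up stand-in for sp500.csv, not the asker's data) and otherwise mirrors the question's read_prices. The absolute timings will differ from the 3e-05s and 27s reported above; only the gap between the two is the point.

```python
import csv
import io
import timeit

# Hypothetical stand-in for sp500.csv: 20,000 synthetic rows built in memory.
FAKE_CSV = "Date,Adj Close\n" + "".join(
    f"day{i}, {100 + i * 0.01:.2f} \n" for i in range(20_000)
)


def read_prices():
    # Same shape as the question's function, but reading from a string
    # instead of a file so the sketch is self-contained.
    reader = csv.DictReader(io.StringIO(FAKE_CSV))
    for row in reader:
        yield float(row["Adj Close"].strip())


gen = read_prices()   # only builds a generator object; no row is parsed yet
prices = tuple(gen)   # drains the generator: every row is read and converted

# Time creating the generator object vs. creating and draining it.
create_time = timeit.timeit(lambda: read_prices(), number=10)
drain_time = timeit.timeit(lambda: tuple(read_prices()), number=10)
print(f"create only: {create_time:.6f}s, create and drain: {drain_time:.6f}s")
```

The first measurement never touches the data, so it says nothing about how long reading the CSV takes. The second one is the real cost of the work.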


达令说

This is because read_prices() is a generator, and when called like that it does almost nothing.

However, when you do tuple(read_prices()), it drives the generator and makes it provide the values.

A generator is an iterable, and it is driven by: a for loop; next(); unpacking with tuple (as you noted) or list; or any other operation that constructs a collection from it.

Here is a more concrete generator example:

def f():
    print("First value:")
    yield "first"
    print("Second value:")
    yield "second"

Here it is in action:

### Nothing prints when called (analogous to your first timeit without tuple)
In [2]: v = f()

### However, when I call `next` the first value is provided:
In [3]: next(v)
First value:
Out[3]: 'first'

## etc., until there are no more values and a `StopIteration` exception is raised:
In [4]: next(v)
Second value:
Out[4]: 'second'

In [5]: next(v)
------------------------------------
...
StopIteration:

## by unpacking using "tuple" the "StopIteration"
## exception is handled and all the values are provided at once
## (like your timeit using the tuple):
In [6]: tuple(f())
First value:
Second value:
Out[6]: ('first', 'second')
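The ways of driving a generator listed above can be sketched in a few lines. This reuses the f() from this answer; the variable names and the "no more values" default are illustrative choices, not part of the original answer. It also shows one consequence worth knowing: a generator object can only be consumed once.

```python
def f():
    print("First value:")
    yield "first"
    print("Second value:")
    yield "second"


# 1. A for loop calls next() behind the scenes and stops on StopIteration.
looped = []
for value in f():
    looped.append(value)

# 2. Calling next() by hand; passing a default avoids the StopIteration
#    exception once the generator has nothing left.
v = f()
first = next(v)
second = next(v)
done = next(v, "no more values")

# 3. tuple() drains the generator; a second pass over the same object
#    finds it already exhausted and yields nothing.
g = f()
once = tuple(g)
twice = tuple(g)

print(looped, first, second, done, once, twice)
```

If the values are needed more than once, storing them with tuple() or list() (as in the question) is the usual approach, at the cost of doing all the work up front.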


