Python 使用多进程加速合并计数器

调用combined = future.result()会阻塞，直到结果完成，因此您不会在前一个请求完成之前向池提交后续请求。换句话说，您永远不会运行多个子进程。至少您应该将代码更改为：with ProcessPoolExecutor(max_workers=10) as pool:    the_futures = []    for samples in tqdm(sample_list):        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())        the_futures.append(future) # save it    results = [f.result() for f in the_futures()] # all the results另一种方式：with ProcessPoolExecutor(max_workers=10) as pool:    the_futures = []    for samples in tqdm(sample_list):        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())        the_futures.append(future) # save it    # you need: from concurrent.futures import as_completed    for future in as_completed(the_futures): # not necessarily the order of submission        result = future.result() # do something with this此外，如果您未指定构造函数，则默认为您机器上的处理器数量max_workers。ProcessPoolExecutor指定一个大于您实际拥有的处理器数量的值不会有任何收获。更新如果您想在结果完成后立即处理结果并需要一种方法将结果与原始请求联系起来，您可以将期货作为键存储在字典中，其中相应的值表示请求的参数。在这种情况下：with ProcessPoolExecutor(max_workers=10) as pool:    the_futures = {}    for samples in tqdm(sample_list):        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())        the_futures[future] = samples # map future to request    # you need: from concurrent.futures import as_completed    for future in as_completed(the_futures): # not necessarily the order of submission        samples = the_futures[future] # the request        result = future.result() # the result

Python 使用多进程加速合并计数器

1回答