Python - Find the index of the unique substring contained in a list of strings without going through all the items

Methods explored

1. Generator method

    next(i for i, v in enumerate(test_strings) if 'other' in v)

2. List comprehension method

    [i for i, v in enumerate(test_strings) if 'other' in v]

3. Using index with a generator (suggested by @HeapOverflow)

    test_strings.index(next(v for v in test_strings if 'other' in v))

4. Regular expression with a generator

    re_pattern = re.compile('.*other.*')
    next(test_strings.index(x) for x in test_strings if re_pattern.search(x))

Conclusion

The index method (the one @HeapOverflow suggested in the comments) had the fastest times, from about 8 items onward in the results below. For the very shortest lists the list comprehension was marginally faster.

Test code

Uses Perfplot, which uses timeit.

    import random
    import string
    import re
    import perfplot

    def random_string(N):
        return ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

    def create_strings(length):
        M = length // 2
        random_strings = [random_string(5) for _ in range(length)]
        front = ['...other...'] + random_strings
        middle = random_strings[:M] + ['...other...'] + random_strings[M:]
        end_ = random_strings + ['...other...']
        return front, middle, end_

    def search_list_comprehension(test_strings):
        return [i for i, v in enumerate(test_strings) if 'other' in v][0]

    def search_generator(test_strings):
        return next(i for i, v in enumerate(test_strings) if 'other' in v)

    def search_index(test_strings):
        return test_strings.index(next(v for v in test_strings if 'other' in v))

    def search_regex(test_strings):
        re_pattern = re.compile('.*other.*')
        return next(test_strings.index(x) for x in test_strings if re_pattern.search(x))

    # Each benchmark is run with the '...other...' placed in the front,
    # middle and end of a random list of strings.
    out = perfplot.bench(
        setup=lambda n: create_strings(n),  # create front, middle, end strings of length n
        kernels=[
            lambda a: [search_list_comprehension(x) for x in a],
            lambda a: [search_generator(x) for x in a],
            lambda a: [search_index(x) for x in a],
            lambda a: [search_regex(x) for x in a],
        ],
        labels=["list_comp", "generator", "index", "regex"],
        n_range=[2 ** k for k in range(15)],
        xlabel="length list",
        # More optional arguments with their default values:
        # title=None,
        # logx="auto",  # set to True or False to force scaling
        # logy="auto",
        # equality_check=numpy.allclose,  # set to None to disable "correctness" assertion
        # automatic_order=True,
        # colors=None,
        # target_time_per_measurement=1.0,
        # time_unit="s",  # set to one of ("auto", "s", "ms", "us", or "ns") to force plot units
        # relative_to=1,  # plot the timings relative to one of the measurements
        # flops=lambda n: 3*n,  # FLOPS plots
    )
    out.show()
    print(out)

Results

    length list       regex   list_comp   generator       index
            1.0     10199.0      3699.0      4199.0      3899.0
            2.0     11399.0      3899.0      4300.0      4199.0
            4.0     13099.0      4300.0      4599.0      4300.0
            8.0     16300.0      5299.0      5099.0      4800.0
           16.0     22399.0      7199.0      5999.0      5699.0
           32.0     34900.0     10799.0      7799.0      7499.0
           64.0     59300.0     18599.0     11799.0     11200.0
          128.0    108599.0     33899.0     19299.0     18500.0
          256.0    205899.0     64699.0     34699.0     33099.0
          512.0    403000.0    138199.0     69099.0     62499.0
         1024.0    798900.0    285600.0    142599.0    120900.0
         2048.0   1599999.0    582999.0    288699.0    239299.0
         4096.0   3191899.0   1179200.0    583599.0    478899.0
         8192.0   6332699.0   2356400.0   1176399.0    953500.0
        16384.0  12779600.0   4731100.0   2339099.0   1897100.0
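One case the methods above do not handle is a list with no matching string: a bare next() on an exhausted generator raises StopIteration. The following is a minimal sketch of how the generator and index approaches might be wrapped to return None instead, using the two-argument form of next(). The helper names find_first_index and find_first_index_via_index are hypothetical and not part of the original answer.

```python
from typing import List, Optional


def find_first_index(strings: List[str], needle: str) -> Optional[int]:
    """Return the index of the first string containing `needle`, or None."""
    # The generator is lazy: next() pulls items only until the first match,
    # so strings after the match are never inspected.
    return next((i for i, s in enumerate(strings) if needle in s), None)


def find_first_index_via_index(strings: List[str], needle: str) -> Optional[int]:
    """Same result, using list.index on the first matching string."""
    match = next((s for s in strings if needle in s), None)
    return None if match is None else strings.index(match)


test_strings = ["ABCDE", "12345", "...other...", "ZZZZZ"]
print(find_first_index(test_strings, "other"))            # 2
print(find_first_index_via_index(test_strings, "other"))  # 2
print(find_first_index(test_strings, "missing"))          # None
```

Both helpers stop at the first match, so the early-exit behaviour measured in the benchmark is kept. The None default only changes what happens when nothing matches.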
