搜索字符串的非蛮力建议解决方案

我想知道您是否可以就用于迭代字符串、将其分成两半并检查两半是否存在于字典/哈希中的非蛮力解决方案的算法提出建议?


例如,字符串“peanutbutter”被拆分为“peanut”和“butter”(是的,那里还有其他词,但出于示例目的,我们可以使用这两个词)


这是我提出的蛮力解决方案以供参考:


def break_into_spaces(S):

    i = 1

    while i < len(S):

        left = S[i:]

        right = S[:i]

        if left in DICTIONARY and right in DICTIONARY:

            print("Found them!")

        print("{} {}".format(right, left))

        i += 1



break_into_spaces("peanutbutter")


沧海一幻觉
浏览 167回答 2
2回答

Smart猫小萌

我的选择:wordlist = ['air', 'pack', 'port', 'hard', 'back', 'bag', 'disk', 'ground', 'play']word = 'playground'lenw, minlen = len(word), min([len(w) for w in wordlist])pairs = [(word[:n], word[n:]) for n in range(1,lenw) if (n >= minlen and n < lenw-minlen+1) ]found = Falsefor w1, w2 in pairs:&nbsp; if w1 in wordlist and w2 in wordlist:&nbsp; &nbsp; print('Found ' + word + ' as: ' + w1 + ' + ' + w2)&nbsp; &nbsp; found = True&nbsp; &nbsp; breakif not found: print('No words found')#=> Found playground as: play + groundpairs是将词一分为二的映射,其中两个子词不小于词表中最小的词。这减少了查找次数。打印出来看看:print(pairs)#=> [('pla', 'yground'), ('play', 'ground'), ('playg', 'round'), ('playgr', 'ound'), ('playgro', 'und')]如果是大量单词,我建议按起始字母(作为词汇表)分组,然后仅查找单词字母和起始单词集之间的交集内的单词。这里不完整的代码:letters = set(word)print(letters) #=> {'r', 'a', 'u', 'g', 'l', 'n', 'd', 'o', 'y', 'p'}alphabet = {}for word in wordlist:&nbsp; &nbsp; alphabet.setdefault(word[0], set()).add(word)print(alphabet)#=> {'a': {'air'}, 'p': {'port', 'play', 'pack'}, 'h': {'hard'}, 'b': {'back', 'bag'}, 'd': {'disk'}, 'g': {'ground'}}所以交集是:{'g', 'p', 'd', 'a'} 然后建立查找列表:lookuplist = []for i in intersection:&nbsp; for word in alphabet[i]:&nbsp; &nbsp; lookuplist.append(word)lookuplist #=> ['air', 'disk', 'ground', 'port', 'pack', 'play']所以用lookuplist代替wordlist用一些方法在抽屉里订购def vocabulary(wordlist):&nbsp; res = {}&nbsp; for word in wordlist:&nbsp; &nbsp; res.setdefault(word[0], set()).add(word)&nbsp; return resdef lookuplist(vocabulary, word):&nbsp; vocabulary_alphabet = set(vocabulary.keys())&nbsp; word_letters = set(word)&nbsp; intersection = vocabulary_alphabet.intersection(word_letters)&nbsp; lookuplist = []&nbsp; for i in intersection:&nbsp; &nbsp; for word in vocabulary[i]:&nbsp; &nbsp; &nbsp; lookuplist.append(word)&nbsp; return lookuplistdef find_word(word, lookuplist):&nbsp; lenw, minlen = len(word), min([len(w) for w in lookuplist])&nbsp; pairs = [(word[:n], word[n:]) for n in range(1,lenw) if (n >= minlen and n < lenw-minlen+1) ]&nbsp; for w1, w2 in pairs:&nbsp; &nbsp; if w1 in lookuplist and w2 in lookuplist: return (word, w1, w2)&nbsp; return []您可以按如下方式使用:wordlist = ['air', 'pack', 'port', 'hard', 'back', 'bag', 'disk', 'ground', 'play']word = 'playground'vocabulary = vocabulary(wordlist) # run once then store the resultlookuplist = lookuplist(vocabulary, word)found_word = find_word(word, lookuplist)print(found_word)#=> ('playground', 'play', 'ground')

开心每一天1111

不是一个完整的解决方案,但一个好主意可能是将单词存储在字典中,例如,其中键是单词的长度,值是一组单词。然后创建一个长度列表来迭代它,而不是迭代输入词 ( s),例如:words = ['toothpaste',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'hard-to-find',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'economic',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'point',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'food',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'seal',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'outrageous',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'motionless',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'ice',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'tow',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'boot',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'cruel',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'peanut',&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;'butter']index = {}for word in words:&nbsp; &nbsp; index.setdefault(len(word), set()).add(word)lengths = sorted(index)def break_into_spaces(s):&nbsp; &nbsp; s_length = len(s)&nbsp; &nbsp; for length in lengths:&nbsp; &nbsp; &nbsp; &nbsp; if length < s_length:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; left = s[length:]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; right = s[:length]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; if left in index[length] and s_length - length in index and right in index[s_length - length]:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; print("{} {}".format(right, left))&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; print("Found them!")&nbsp; &nbsp; &nbsp; &nbsp; else:&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; breakbreak_into_spaces('peanutbutter')输出peanut butterFound them!这样做是否可以节省时间:它避免了遍历整个输入词,想象一下输入词比字典中的所有词都短的情况,这将立即打破循环并且不打印任何内容。通过将单词存储在相同长度的集合中,您只需要检查是否存在相同长度的匹配单词,而不是检查所有单词。请注意,这可能毫无意义,因为字典是哈希表,因此理论上检查包含情况是O(1).
打开App,查看更多内容
随时随地看视频慕课网APP

相关分类

Python