斯蒂芬大帝
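You can use re.split...

    from string import punctuation
    import re

    # Split on any punctuation character or whitespace.
    # Note: string.punctuation includes '#' and '@', so those are stripped too.
    puncrx = re.compile(r'[{}\s]'.format(re.escape(punctuation)))
    # filter(None, ...) drops the empty strings left between adjacent delimiters
    print(list(filter(None, puncrx.split(your_tweet))))

Or, just find words that are made up of certain contiguous characters:

    print(re.findall(r'[\w#@]+', your_tweet))

For example:

    print(re.findall(r'[\w@#]+', 'talking about #python with @someone is so much fun! Is there a 140 char limit? So not cool!'))
    # ['talking', 'about', '#python', 'with', '@someone', 'is', 'so', 'much', 'fun', 'Is', 'there', 'a', '140', 'char', 'limit', 'So', 'not', 'cool']

I did originally have a smiley in the example, but of course these end up getting filtered out by this method, so that's something to be wary of.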
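If you do want to keep smileys, one possible workaround is to try an emoticon alternative before the word pattern. This is only a rough sketch: the `EMOTICON` pattern and the `tokenize` helper are my own illustrative names, and the pattern covers just a handful of Western-style smileys, so it will miss plenty and can misfire on text like `note:Please`.

```python
import re

# Hypothetical, non-exhaustive emoticon pattern:
# eyes (: ; =), an optional nose (- o ^), then a mouth.
EMOTICON = r'[:;=][-o^]?[)(DPp|]'

# Emoticons are tried first, then the same word pattern as before.
TOKEN_RX = re.compile(EMOTICON + r'|[\w#@]+')

def tokenize(tweet):
    """Return words, #hashtags, @mentions and simple emoticons, in order."""
    return TOKEN_RX.findall(tweet)

print(tokenize('so much #fun with @someone :) not cool :-('))
# ['so', 'much', '#fun', 'with', '@someone', ':)', 'not', 'cool', ':-(']
```

Ordering matters here: because the emoticon branch comes first in the alternation, `:)` is consumed as one token instead of being skipped as punctuation.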