I'm having some trouble with the following data (from a pandas DataFrame):
Text
0 Selected moments from Fifa game t...
1 What I learned is that I am ...
3 Bill Gates kept telling us it was comi...
5 scenario created a month before the...
... ...
1899 Events for May 19 – October 7 - October CTOvision.com
1900 Office of Event Services and Campus Center Ope...
1901 How the CARES Act May Affect Gift Planning in ...
1902 City of Rohnert Park: Home
1903 iHeartMedia, Inc.
I need to extract the number of unique words in each row (after removing punctuation). So, for example:
Unique
0 6
1 6
3 8
5 6
... ...
1899 8
1900 8
1901 9
1902 5
1903 2
I tried to do it as follows:
df["Unique"]=df['Text'].str.lower()
df["Unique"]==Counter(word_tokenize('\n'.join( file["Unique"])))
But I don't get any counts, only a list of words (without their frequency in that row).
Can you tell me what is going wrong?
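A few things in the attempt above stand out: the second line uses `==` (a comparison) rather than `=` (an assignment), it reads from `file` rather than `df`, and `'\n'.join(...)` merges every row into one string, so any counting happens over the whole column instead of per row. Below is a minimal sketch of one per-row approach, not necessarily the only or best one. The helper name `count_unique_words` is hypothetical, and the sketch assumes that "punctuation" means the ASCII characters in `string.punctuation` and that words are separated by whitespace.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: non-ASCII punctuation (for example an en dash) is left in place.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def count_unique_words(text):
    """Return the number of distinct lowercase words in a single string."""
    # Lowercase first so "The" and "the" count as the same word.
    cleaned = text.lower().translate(_PUNCT_TABLE)
    # split() breaks on any whitespace; set() keeps one copy of each word.
    return len(set(cleaned.split()))


print(count_unique_words("City of Rohnert Park: Home"))  # 5
print(count_unique_words("iHeartMedia, Inc."))           # 2
```

With a DataFrame like the one in the question, this could presumably be applied row by row with `df["Unique"] = df["Text"].apply(count_unique_words)`, assuming every value in `Text` is a string. Note that stripping punctuation this way turns `CTOvision.com` into the single token `ctovisioncom`; a tokenizer such as `word_tokenize` may split it differently.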