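My idea is to take a line of WordNet text, assign each part of the line to its own variable, and then add those variables to an RDFlib graph as triples.
Here is a sample line from the text file: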
13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 005 @ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 + 01527311 v 0203 + 02361703 v 0101 | an overwhelming number or amount; "a flood of requests"; "a torrent of abuse"
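For orientation, the layout of such a line can be unpacked as in the sketch below. It is a Python 3 illustration, not the code from this question; `parse_data_line` and the dictionary keys are hypothetical names. It assumes the layout given in the WordNet `wndb` format description: offset, lexicographer file number, synset type, a hexadecimal word count, word/lex_id pairs, a decimal pointer count, pointers of four tokens each, and the gloss after `|`.

```python
SAMPLE = ('13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 005 '
          '@ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 '
          '+ 01527311 v 0203 + 02361703 v 0101 | an overwhelming number or '
          'amount; "a flood of requests"; "a torrent of abuse"')


def parse_data_line(line):
    """Split one WordNet data-file line into its main fields (sketch)."""
    fields_part, _, gloss = line.partition('|')
    tokens = fields_part.split()

    synset_offset, lex_filenum, ss_type = tokens[0:3]

    # Assumption: w_cnt is a two-digit hexadecimal count, so parse it base 16.
    w_cnt = int(tokens[3], 16)
    # Words and lex_ids alternate; take every second token to keep the words.
    words = tokens[4:4 + 2 * w_cnt:2]

    # The pointer count follows the word/lex_id pairs (decimal).
    p_start = 4 + 2 * w_cnt
    p_cnt = int(tokens[p_start])

    # Each pointer is four tokens: symbol, target offset, pos, source/target.
    pointers = []
    for i in range(p_cnt):
        base = p_start + 1 + 4 * i
        pointers.append(tuple(tokens[base:base + 4]))

    return {
        'synset_offset': synset_offset,
        'lex_filenum': lex_filenum,
        'ss_type': ss_type,
        'words': words,
        'pointers': pointers,
        'gloss': gloss.strip(),
    }


parsed = parse_data_line(SAMPLE)
print(parsed['words'])
print(parsed['pointers'][0])
```

Locating the pointer count by position, rather than by splitting on `@`, is a design choice in this sketch: it still works for lines whose first pointer symbol is not `@`.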
Here is my code.
from rdflib import URIRef, Graph
from StringIO import StringIO  # Python 2 module (matches the traceback below)

G = Graph()
F = open("new_2.txt", "r")
for line in F:
    L = line.split()
    L2 = line.strip().split('|')
    synset_offset = L[0]
    lex_filenum = L[1]
    ss_type = L[2]
    gloss = L2[1]
    before_at, after_at = line.split('@', 1)
    N = int(L[3])                        # number of words in the synset
    K = int(before_at.split()[-1])       # number of pointers
    word = L[4:4 + 2 * N:2]              # every second token, skipping each lex_id
    iw = iter(word)
    S = after_at.split()[0:0 + 4 * K:4]
    ip = iter(S)
    SS = after_at.split()[1:1 + 4 * K:4]
    iss = iter(SS)
    ST = after_at.split()[2:2 + 4 * K:4]
    ist = iter(ST)
    line1 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.w3.org/1999/02/22-rdf-syntax-ns#lex_filenum '''+lex_filenum+''''''
    line2 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#ss_type '''+ss_type+''''''
    line3 = ''''''
    #line4 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#gloss '''+gloss+''''''
    for item in word:
        line3 += '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#lexical_entry '''+iw.next()+'''\n'''
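Everything works perfectly up to line5. (line4 is commented out for other reasons; I don't need it yet.)
When line5, line6 and line7 are included, this is the error I get: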
    G.add(triple)
  File "/usr/lib/python2.7/site-packages/rdflib-4.1_dev-py2.7.egg/rdflib/graph.py", line 352, in add
    def add(self, (s, p, o)):
ValueError: need more than 0 values to unpack
I don't understand what difference between line3 and line5 could cause this error. line3 works perfectly!
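The message `need more than 0 values to unpack` is what Python 2 raises when an empty sequence is unpacked into `(s, p, o)`, so `G.add` appears to have received an empty value at some point. The code that builds line5 to line7 and the triples is not shown, so the cause can only be guessed at. One possibility is an empty line: text assembled from entries that each end in `\n` leaves an empty last piece when split on newlines, and a synset with no pointers would contribute nothing at all. Below is a minimal Python 3 sketch of a guard against that, with `text_to_triples` as a hypothetical helper name.

```python
def text_to_triples(text):
    """Turn 'subject predicate object' lines into 3-tuples, skipping blanks."""
    triples = []
    for raw in text.split('\n'):
        # Split into at most three pieces so the object may contain spaces.
        parts = raw.split(None, 2)
        if len(parts) != 3:
            # An empty line gives [], the kind of value that cannot be
            # unpacked into (s, p, o).
            continue
        triples.append(tuple(parts))
    return triples


base = 'http://www.example.org/lexicon#13797906'
pred = 'http://www.monnetproject.eu/lemon#lexical_entry'

# Built the same way as line3 in the question: one entry per word, each
# ending in a newline.
text = ''.join('%s %s %s\n' % (base, pred, w) for w in ['flood', 'deluge'])

# The trailing newline leaves an empty last piece, which splits to [].
print(text.split('\n')[-1].split())

triples = text_to_triples(text)
print(len(triples))
```

Each resulting tuple could then be added to the graph with something like `G.add((URIRef(s), URIRef(p), Literal(o)))`, where `Literal` is also imported from `rdflib`. Building the tuples directly, instead of concatenating strings and splitting them again later, would avoid the empty-piece problem altogether.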