Lucene keeps adding documents even though I use updateDocument

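My project is built around Lucene 6.6.0. It is a desktop search engine written in Java, in which the search part and the indexing part live in separate applications. From time to time I have to add new fields to the index to meet customer needs, without re-indexing everything (i.e. parsing the files and indexing them again).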

So, when the application starts, I take the IndexWriter and open an IndexReader associated with it:

IndexReader reader = DirectoryReader.open(writer, true, false);

Then, for each document that already exists in the index:

StoredField fieldVersion = new StoredField(
    FIELDNAME_VERSION,
    fixedValue // Same value for all documents; it may change (for all documents) when the version is upgraded.
);

for (int i = 0; i < idMax; i++) {
    Document currentDoc = reader.document(i);
    String storedVersion = currentDoc.get(FIELDNAME_VERSION);

    // Update when the field does not exist yet, or when its value differs from what it should be
    if (storedVersion == null || !storedVersion.contentEquals(fixedValue)) {
        // Remove any existing version field first, then add the new one
        // (also tried without removing first, with the same result)
        currentDoc.removeField(FIELDNAME_VERSION);
        currentDoc.add(fieldVersion);

        // Update the document in the index
        writer.updateDocument(
            new Term(FIELDNAME_PATH, currentDoc.get(FIELDNAME_PATH)),
            currentDoc);

        // Also tried, instead of updateDocument:
        // writer.deleteDocuments(new Term(FIELDNAME_PATH, currentDoc.get(FIELDNAME_PATH)));
        // writer.addDocument(currentDoc);
    }
}

// When all documents have been checked, write the index
writer.commit();

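When I run this the first time, the field is added as expected to every document that does not have it yet. The problem is that when fixedValue changes, new documents are added to the index. I expected currentDoc to get its fieldVersion updated, not another Document to be created with the same original values in all the other fields.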

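The IndexWriter is opened in APPEND mode (I also tried CREATE_OR_APPEND). Moreover, if I start by indexing a single file, I get 1 document in the index, then 2 documents after the index update, then 4, then 8, then 16, ... all referring to the same single file (only fieldVersion has different content).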
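The 1, 2, 4, 8, 16 growth is what one would see if the delete half of each update never matched anything, so that every pass only adds one copy per existing document. The following is a toy model of that arithmetic, not Lucene code: ToyIndex, update, and countsAfterPasses are hypothetical names, and the termMatches flag stands in for the assumption (not confirmed here) that the Term built from FIELDNAME_PATH fails to match the documents already in the index.

```java
import java.util.ArrayList;
import java.util.List;

public class DoublingSketch {

    /** Toy stand-in for an index: each entry is one "document", represented by its path. */
    static final class ToyIndex {
        final List<String> docs = new ArrayList<>();

        /** Mimics update-by-term: delete every document matching the key, then add one document. */
        void update(String key, String doc, boolean termMatches) {
            if (termMatches) {
                docs.removeIf(d -> d.equals(key));
            }
            docs.add(doc);
        }
    }

    /** Runs the given number of upgrade passes over an index that starts with one document. */
    static List<Integer> countsAfterPasses(int passes, boolean termMatches) {
        ToyIndex index = new ToyIndex();
        index.docs.add("/some/file.txt");

        List<Integer> counts = new ArrayList<>();
        counts.add(index.docs.size());

        for (int p = 0; p < passes; p++) {
            // Iterate over a snapshot, as a reader opened before the pass would.
            List<String> snapshot = new ArrayList<>(index.docs);
            for (String doc : snapshot) {
                index.update(doc, doc, termMatches);
            }
            counts.add(index.docs.size());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countsAfterPasses(4, false)); // delete never matches
        System.out.println(countsAfterPasses(4, true));  // delete always matches
    }
}
```

With termMatches set to false the counts are 1, 2, 4, 8, 16, the same sequence as above; with termMatches set to true the count stays at 1 after every pass. If the model applies, the thing to check is whether the path Term really matches the documents as they are currently indexed.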

This other question did not help me.

Why does Lucene add new documents when I ask it to update an existing one, and how can I fix this (i.e. simply replace the fieldVersion content in the existing document with the new value)?

慕尼黑8549860
