Converting malformed HTML to PDF with Flying Saucer PDF Rendering

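In a GitHub project, I am trying to convert an arbitrary HTML string into a PDF. By "convert" I mean parsing the HTML and rendering it into a PDF file.

To do this, I am using Flying Saucer PDF Rendering, as shown below:

Main program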

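// DocumentException lives in com.lowagie.text for the default flying-saucer-pdf
// artifact (assumed here); the iText 5 artifact uses com.itextpdf.text instead.
import com.lowagie.text.DocumentException;
import org.xhtmlrenderer.pdf.ITextRenderer;

import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class Main {

    public static void main(String[] args) {
        final String ok = "<valid html here>: see github rep for real html markup here";
        final String html = "<invalid html here>: see github rep for real html markup here";

        try {
            // final byte[] bytes = generatePDFFrom(ok); // works!
            final byte[] bytes = generatePDFFrom(html); // does NOT work :(

            // Write the rendered PDF into the project folder.
            try (FileOutputStream fos = new FileOutputStream("sample-file.pdf")) {
                fos.write(bytes);
            }
        } catch (IOException | DocumentException e) {
            e.printStackTrace();
        }
    }

    // Parses the HTML string, lays it out, and returns the rendered PDF as bytes.
    private static byte[] generatePDFFrom(String html) throws IOException, DocumentException {
        final ITextRenderer renderer = new ITextRenderer();
        renderer.setDocumentFromString(html);
        renderer.layout();

        try (ByteArrayOutputStream fos = new ByteArrayOutputStream(html.length())) {
            renderer.createPDF(fos);
            return fos.toByteArray();
        }
    }
}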

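In the code above, if I use the HTML string stored in the ok variable (which is "valid" HTML), the PDF is created correctly (if you run the GitHub project with the ok variable, it creates a sample-file.pdf file in the project folder containing some rendered HTML).

With the html variable, however, the conversion does not work. As far as I can tell, this is because of the "invalid" part of the HTML string.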
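
One way to check that suspicion: setDocumentFromString parses its input as XML, so markup that is not well-formed is rejected before any layout happens. The sketch below uses only the JDK's XML parser to test a string for well-formedness. The class WellFormedCheck and the helper isWellFormed are hypothetical names, not part of the project or of Flying Saucer, and the two sample strings are stand-ins for the real markup in the repository.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class WellFormedCheck {

    // Hypothetical helper: returns true if the markup parses as well-formed XML.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup, e.g. because of a mismatched tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in samples (assumptions), not the markup from the GitHub project.
        String ok = "<html><body><p>Hello</p></body></html>";
        String broken = "<html><body><p>Hello</body></html>"; // <p> is never closed

        System.out.println(isWellFormed(ok));     // true
        System.out.println(isWellFormed(broken)); // false
    }
}
```

If the check fails for the html string, the markup would presumably have to be cleaned into well-formed XHTML before it is passed to setDocumentFromString.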

波斯汪
