A difficult workflow for writing a LaTeX book full of Python code

Here is a small script I wrote. It splits a single *.ipynb file and converts it into multiple *.tex files.

Usage:

1. Copy the script below and save it as main.py.
2. Run python main.py init. It creates main.tex and style_ipython_custom.tplx.
3. In your Jupyter notebook, add an extra line #latex:tag_a, #latex:tag_b, ... to each cell you want to extract. Cells with the same tag are extracted into the same *.tex file.
4. Save the notebook as a *.ipynb file. Luckily, the current VSCode Python plugin supports exporting to *.ipynb from Python, or you can use jupytext to convert *.py to *.ipynb.
5. Run python main.py path/to/your.ipynb. It creates tag_a.tex and tag_b.tex.
6. Edit main.tex and add \input{tag_a.tex} or \input{tag_b.tex} wherever you want.
7. Run pdflatex main.tex. It produces main.pdf.

The idea behind this script:

Converting a Jupyter notebook to LaTeX with the default nbconvert.LatexExporter produces a complete LaTeX file that includes the macro definitions. Converting every cell that way may create large LaTeX files. To avoid that, the script first creates main.tex, which contains only the macro definitions, and then converts each cell to a LaTeX file without macro definitions. This is done with a custom template file derived from style_ipython.tplx.

Tagging or marking cells could probably be done with cell metadata, but I could not find how to set it in the VSCode Python plugin (issue). So instead the script scans the source of each cell with the regex pattern ^#latex:(.*) and removes the matching line before converting the cell to LaTeX.

Source:

    import sys
    import re
    import os
    from collections import defaultdict

    import nbformat
    from nbconvert import LatexExporter, exporters

    OUTPUT_FILES_DIR = './images'
    CUSTOM_TEMPLATE = 'style_ipython_custom.tplx'
    MAIN_TEX = 'main.tex'


    def create_main():
        # creates `main.tex`, which only has the macro definitions
        latex_exporter = LatexExporter()
        book = nbformat.v4.new_notebook()
        book.cells.append(
            nbformat.v4.new_raw_cell(r'\input{__your_input__here.tex}'))
        (body, _) = latex_exporter.from_notebook_node(book)
        # mode 'x' refuses to overwrite an existing main.tex
        with open(MAIN_TEX, 'x') as fout:
            fout.write(body)
        print("created:", MAIN_TEX)


    def init():
        create_main()
        latex_exporter = LatexExporter()
        # copy `style_ipython.tplx` from the `nbconvert.exporters` module to the
        # current directory, and modify it so that it has no macro definitions
        tmpl_path = os.path.join(
            os.path.dirname(exporters.__file__),
            latex_exporter.default_template_path)
        src = os.path.join(tmpl_path, 'style_ipython.tplx')
        target = CUSTOM_TEMPLATE
        with open(src) as fsrc:
            with open(target, 'w') as ftarget:
                for line in fsrc:
                    # replace the line so that it has no macro definitions
                    if line == "((*- extends 'base.tplx' -*))\n":
                        line = "((*- extends 'document_contents.tplx' -*))\n"
                    ftarget.write(line)
        print("created:", CUSTOM_TEMPLATE)


    def group_cells(note):
        # scan each cell source for a tag with the regexp `^#latex:(.*)`;
        # cells with the same tag are grouped into the same list
        pattern = re.compile(r'^#latex:(.*?)$(\n?)', re.M)
        group = defaultdict(list)
        for num, cell in enumerate(note.cells):
            m = pattern.search(cell.source)
            if m:
                tag = m.group(1).strip()
                # remove the line which contains the tag
                cell.source = cell.source[:m.start(0)] + cell.source[m.end(0):]
                group[tag].append(cell)
            else:
                print("tag not found in cell number {}. ignore".format(num + 1))
        return group


    def doit():
        with open(sys.argv[1]) as f:
            note = nbformat.read(f, as_version=4)
        group = group_cells(note)
        latex_exporter = LatexExporter()
        # use the template which has no LaTeX macro definitions
        latex_exporter.template_file = CUSTOM_TEMPLATE
        os.makedirs(OUTPUT_FILES_DIR, exist_ok=True)
        for (tag, g) in group.items():
            book = nbformat.v4.new_notebook()
            book.cells.extend(g)
            # unique_key will be the prefix of the image file names
            (body, resources) = latex_exporter.from_notebook_node(
                book,
                resources={
                    'output_files_dir': OUTPUT_FILES_DIR,
                    'unique_key': tag
                })
            ofile = tag + '.tex'
            with open(ofile, 'w') as fout:
                fout.write(body)
                print("created:", ofile)
            # image data embedded as base64 in the notebook is decoded and
            # returned in `resources`, so write it to files
            for filename, data in resources.get('outputs', {}).items():
                with open(filename, 'wb') as fres:
                    fres.write(data)
                    print("created:", filename)


    if len(sys.argv) <= 1:
        print("USAGE: this_script [init|yourfile.ipynb]")
    elif sys.argv[1] == "init":
        init()
    else:
        doit()
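
To see what the tag scan does in isolation, here is a minimal sketch that applies the same regular expression to a few made-up cell sources, without nbformat or nbconvert. The helper name split_tag and the sample cell strings are my own illustration and are not part of the script above; only the pattern itself is taken from it.

```python
import re
from collections import defaultdict

# Same pattern as in the script: a whole line of the form `#latex:<tag>`.
TAG_PATTERN = re.compile(r'^#latex:(.*?)$(\n?)', re.M)


def split_tag(source):
    """Return (tag, source without the tag line), or (None, source) if untagged.

    Hypothetical helper, written only for this illustration.
    """
    m = TAG_PATTERN.search(source)
    if m is None:
        return None, source
    stripped = source[:m.start(0)] + source[m.end(0):]
    return m.group(1).strip(), stripped


# Made-up cell sources standing in for `cell.source` of real notebook cells.
cells = [
    "#latex:tag_a\nimport math\nprint(math.pi)",
    "x = 1  # untagged, so this cell is skipped",
    "#latex:tag_b\nprint('hello')",
    "y = 2\n#latex:tag_a\nprint(y)",
]

# Group the stripped sources by tag; each group would become one <tag>.tex file.
groups = defaultdict(list)
for src in cells:
    tag, stripped = split_tag(src)
    if tag is not None:
        groups[tag].append(stripped)

for tag, sources in groups.items():
    print(tag + ".tex <-", sources)
```

Note that the tag line does not have to be the first line of the cell: because of re.M, the pattern matches at the start of any line, and only the first match in a cell is used.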
