
Snakemake rule to delete files

I have a large Snakefile that looks like this (heavily simplified).


rule a:
    input: '{path}.csv'
    output: '{path}.a.csv'
    shell: 'cp {input} {output}'

rule b:
    input: '{path}.csv'
    output: '{path}.b.csv'
    shell: 'cp {input} {output}'

rule c:
    input: '{path}.csv'
    output: '{path}.c.csv'
    shell: 'cp {input} {output}'

rule d:
    input: '{path}.csv'
    output: '{path}.d.csv'
    shell: 'cp {input} {output}'

rule all:
    input: 'raw1.a.b.c.a.d.csv',
           'raw2.a.b.c.d.a.csv'

(This setup lets me use rules like functions, by chaining their filename suffixes in the all rule.)
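To make the suffix chaining concrete, here is a minimal Python sketch of which files a single target name implies. The helper `chain` is hypothetical and not part of Snakemake. It only mimics the naming scheme, and it assumes the raw file's base name contains no dots.

```python
def chain(target: str) -> list[str]:
    """Return the raw input followed by every file built on the way to target."""
    stem, ext = target.rsplit(".", 1)   # e.g. "raw1.a.b.c.a.d" and "csv"
    base, *suffixes = stem.split(".")   # e.g. "raw1" and ["a", "b", "c", "a", "d"]
    files = [f"{base}.{ext}"]           # the raw input file comes first
    for suffix in suffixes:
        base = f"{base}.{suffix}"       # each suffix corresponds to one rule application
        files.append(f"{base}.{ext}")
    return files


for name in chain("raw1.a.b.c.a.d.csv"):
    print(name)
```

Each name after the first is the output of the rule named by its last suffix, applied to the name before it.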


Starting state:

$ ls -tr1
Snakefile
raw1.csv
raw2.csv

$ snakemake all
...

After:


$ ls -tr1
Snakefile
raw1.csv
raw2.csv
raw2.a.csv
raw2.a.b.csv
raw2.a.b.c.csv
raw2.a.b.c.d.csv
raw1.a.csv
raw1.a.b.csv
raw1.a.b.c.csv
raw1.a.b.c.a.csv
raw1.a.b.c.a.d.csv
raw2.a.b.c.d.a.csv

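Now I would like to add a rule that deletes specific intermediate files (e.g. raw1.a.csv and raw2.a.b.csv), because I don't need them and they take up a lot of disk space. I can't mark the outputs with temp() because of the {path} wildcard.


Any tips? Thanks.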


一只萌萌小番薯
2 answers

郎朗坤

temp() does work in this case.

rule all:
    input: 'raw1.a.b.c.a.d.csv',
           'raw2.a.b.c.d.a.csv'

rule a:
    input: '{path}.csv'
    output: temp('{path}.a.csv')
    shell: 'cp {input} {output}'

rule b:
    input: '{path}.csv'
    output: '{path}.b.csv'
    shell: 'cp {input} {output}'

rule c:
    input: '{path}.csv'
    output: temp('{path}.c.csv')
    shell: 'cp {input} {output}'

rule d:
    input: '{path}.csv'
    output: '{path}.d.csv'
    shell: 'cp {input} {output}'

Running this creates the files raw1.a.b.c.a.d.csv, raw1.a.b.csv, raw2.a.b.c.d.csv and raw2.a.b.csv, and automatically deletes the files raw1.a.csv, raw2.a.csv, raw1.a.b.c.csv, raw2.a.b.c.csv, raw1.a.b.c.a.csv and raw2.a.b.c.d.a.csv.
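The kept and deleted lists in this answer can be cross-checked with a small Python sketch. It only mirrors the naming logic (a file is temporary when the rule named by its last suffix wraps its output in temp()) and does not model how Snakemake schedules or deletes files. `TEMP_RULES` and `is_temp` are hypothetical names introduced for illustration.

```python
# Rules whose output is wrapped in temp() in the answer's Snakefile.
TEMP_RULES = {"a", "c"}


def is_temp(filename: str) -> bool:
    """True if the rule that produced this file marks its output as temp()."""
    parts = filename.split(".")
    # parts[-1] is the extension; parts[-2] names the producing rule.
    # Raw inputs such as "raw1.csv" have no rule suffix and are never temp.
    return len(parts) >= 3 and parts[-2] in TEMP_RULES


# Every file produced in the question's listing (raw inputs excluded).
produced = [
    "raw2.a.csv", "raw2.a.b.csv", "raw2.a.b.c.csv", "raw2.a.b.c.d.csv",
    "raw1.a.csv", "raw1.a.b.csv", "raw1.a.b.c.csv", "raw1.a.b.c.a.csv",
    "raw1.a.b.c.a.d.csv", "raw2.a.b.c.d.a.csv",
]

removed = [f for f in produced if is_temp(f)]
kept = [f for f in produced if not is_temp(f)]
print("removed:", removed)
print("kept:", kept)
```

Note that temp() applies to every output of a rule, so this approach cannot keep one .a.csv file while deleting another, and the target raw2.a.b.c.d.a.csv ends up in the deleted list as well.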

BIG阳

Edit: actually, this solution does not work. It causes a race condition...

OK, I figured it out...

rule a:
    input: '{path}.csv'
    output: '{path}.a.csv'
    shell: 'cp {input} {output}'

rule b:
    input: '{path}.csv'
    output: '{path}.b.csv'
    shell: 'cp {input} {output}'

rule c:
    input: '{path}.csv'
    output: '{path}.c.csv'
    shell: 'cp {input} {output}'

rule d:
    input: '{path}.csv'
    output: '{path}.d.csv'
    shell: 'cp {input} {output}'

rule remove:                          # <-- rule to delete a file
    input: '{path}'
    output: touch('{path}.removed')
    shell: 'rm {input}'

rule all:
    input: 'raw1.a.b.c.a.d.csv',
           'raw2.a.b.c.d.a.csv',
           'raw1.a.csv.removed',      # <-- specify which files to rm
           'raw2.a.b.c.csv.removed',  # <-- specify which files to rm

Here is the DAG:

$ snakemake --dag all | dot -Tpng > dag.png