Snakemake - How to get all files in a directory, grouped by subdirectory



I have the following subdirectories:

/test/cond1/

/test/cond2/

/test/cond3/

/test/cond4/

Each subdirectory contains a different set of files:

cond1 : a1.txt, a2.txt

cond2 : b1.txt, b2.txt, b3.txt

cond3 : c1.txt, c2.txt, c4.txt

cond4 : d1.txt, c2.txt, c4.txt, d2.txt

I am running a command with Snakemake, and for each cond I need to get all of its files as a single whitespace-separated string.

I tried this:

def get_motifs_tf(wildcards):
    file_list = sorted(glob.glob("tf_final/{wildcards.cond}/*.bed"))
    return " ".join(file_list)

Here is my Snakemake rule:

rule compute_combi_enrichment:
    """
    For a given input, compute the enrichment in n-wise TF combinations using OLOGRAM-MODL.
    """
    input:
        query = 'input/core_silencer/{cond}/core_silencer.bed',
        excl = "input/exclude_region_dhs.bed",
        genome = "input/mm9.chromsizes"
    params:
        trs = get_motifs_tf,
        minibatch_number = 16, minibatch_size = 10 # Modulate depending on available RAM
    threads: 8 # Do not use 16 threads to not vampirize all the cluster
    output: 'output/ologram_result/{cond}/00_ologram_stats.tsv',
    shell: """
    set +u; source /gpfs/tagc/home/Apps/anaconda3/bin/activate dev; set -u
    gtftk ologram -z -c {input.genome} -p {input.query} --more-bed {params.trs} \
    -o output/ologram_result/{wildcards.cond} --force-chrom-peak --force-chrom-more-bed \
    -V 3 -k {threads} -mn {params.minibatch_number} -ms {params.minibatch_size} \
    --more-bed-multiple-overlap --bed-excl {input.excl} --no-date \
    --multiple-overlap-max-number-of-combinations 80
    """

For --more-bed {params.trs}, I expect to get:

/test/cond1/a1.txt /test/cond1/a2.txt

then

/test/cond2/b1.txt /test/cond2/b2.txt /test/cond2/b3.txt

and so on.

守着一只汪


1 Answer

手掌心

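I solved it: inside the function, the wildcard has to be converted with str() and concatenated into the path, not written inside braces:

import glob

def get_motifs_tf(wildcards):
    file_list = sorted(glob.glob("tf_final/" + str(wildcards.cond) + "/*.bed"))
    return " ".join(file_list)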
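To illustrate why the attempt in the question finds nothing, here is a minimal sketch that can be run outside Snakemake. It assumes the cause is that a plain Python string is never formatted, so glob searches for a directory literally named "{wildcards.cond}". The base_dir parameter, the _broken suffix, and the SimpleNamespace stand-in for Snakemake's wildcards object are hypothetical and only make the example self-contained. An f-string is used here as an alternative to the str() concatenation in the answer above.

```python
import glob
import os
import tempfile
from types import SimpleNamespace


def get_motifs_tf_broken(wildcards, base_dir="tf_final"):
    # Plain string: the braces are never substituted, so glob looks for a
    # directory literally named "{wildcards.cond}" and matches nothing.
    pattern = os.path.join(base_dir, "{wildcards.cond}", "*.bed")
    return " ".join(sorted(glob.glob(pattern)))


def get_motifs_tf(wildcards, base_dir="tf_final"):
    # f-string: the wildcard value is interpolated before glob runs.
    # base_dir is a hypothetical parameter added for this demonstration.
    pattern = os.path.join(base_dir, f"{wildcards.cond}", "*.bed")
    return " ".join(sorted(glob.glob(pattern)))


# Build a throwaway directory tree to compare both versions.
with tempfile.TemporaryDirectory() as tmp:
    layout = {"cond1": ["a1.bed", "a2.bed"], "cond2": ["b1.bed"]}
    for cond, names in layout.items():
        os.makedirs(os.path.join(tmp, cond))
        for name in names:
            open(os.path.join(tmp, cond, name), "w").close()

    wc = SimpleNamespace(cond="cond1")  # stand-in for Snakemake wildcards
    print(repr(get_motifs_tf_broken(wc, tmp)))  # empty string, nothing matched
    print(get_motifs_tf(wc, tmp))  # the two cond1 paths, space-separated
```

Returning the sorted list itself instead of a joined string should also work in the rule, since Snakemake joins list-valued params with spaces when it formats the shell command.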

