How To Use Expand In Snakemake When Some Particular Combinations Of Wildcards Are Not Desired?
Let's suppose that I have the following files, on which I want to apply some processing automatically using snakemake: test_input_C_1.txt test_input_B_2.txt test_input_A_2.txt test
Solution 1:
The expand
function accepts a second optional non-keyword argument to use a different function from the default one to combine wildcard values.
One can create a filtered version of itertools.product
by wrapping it in a higher-order generator that checks that the yielded combination of wildcards is not among a pre-established blacklist:
from itertools import product
deffilter_combinator(combinator, blacklist):
deffiltered_combinator(*args, **kwargs):
for wc_comb in combinator(*args, **kwargs):
# Use frozenset instead of tuple# in order to accomodate# unpredictable wildcard orderiffrozenset(wc_comb) notin blacklist:
yield wc_comb
return filtered_combinator
# "B_1" and "C_2" are undesired
forbidden = {
frozenset({("text", "B"), ("num", 1)}),
frozenset({("text", "C"), ("num", 2)})}
filtered_product = filter_combinator(product, forbidden)
rule all:
input:
# Override default combination generator
expand("test_output_{text}_{num}.txt", filtered_product, text=["A", "B", "C"], num=[1, 2])
rule make_output:
input: "test_input_{text}_{num}.txt"
output: "test_output_{text}_{num}.txt"
shell:
"""
md5sum {input} > {output}
"""
The missing wildcards combinations can be read from the configfile.
Here is an example in json format:
{
"missing" :
[
{
"text" : "B",
"num" : 1
},
{
"text" : "C",
"num" : 2
}
]
}
The forbidden
set would be read as follows in the snakefile:
forbidden = {frozenset(wc_comb.items()) for wc_comb in config["missing"]}
Post a Comment for "How To Use Expand In Snakemake When Some Particular Combinations Of Wildcards Are Not Desired?"