String Manipulation And Adding Values Based On Row They Are
I have a delimited text file, and I am trying to build every pairwise combination of the items on each line, appending the line number to each item in the pair. Here is an example (you can download it he
Solution 1:
Try this
#!/usr/bin/python
from itertools import combinations

with open('data1.txt') as f:
    result = []
    for n, line in enumerate(f, start=1):
        items = line.strip().split(',')
        x = [['%s_%d' % (x, n) for x in item] for item in combinations(items, 2)]
        result.append(x)

for res in result:
    for elem in res:
        print(',\t'.join(elem))
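You need a list of lists of lists to represent the pairs: one list per input line, each holding that line's pairs, and each pair being a two-item list. You can build it with a nested list comprehension inside the loop.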
I wasn't sure what you wanted as your actual output format, but this prints your expected output.
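To see the shape of that nested structure without touching a file, here is a minimal sketch that applies the same comprehension to in-memory lines. The sample lines and the helper name tag_pairs are made up for illustration; they are not from the question.

```python
from itertools import combinations

def tag_pairs(line, n):
    """Return every 2-item combination of a comma-separated line,
    with each item suffixed by the line number n."""
    items = line.strip().split(',')
    return [['%s_%d' % (item, n) for item in pair]
            for pair in combinations(items, 2)]

# Hypothetical sample input standing in for the lines of data1.txt.
sample_lines = ['A,B,C\n', 'D,E\n']

# One list of pairs per line: a list of lists of lists.
result = [tag_pairs(line, n) for n, line in enumerate(sample_lines, start=1)]

for res in result:
    for elem in res:
        print(',\t'.join(elem))
```

With this sample, the first line yields three pairs and the second line yields one, each item carrying its line number.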
If there are double quotes in the input file, the simple fix is to replace the items line in the code above with
    items = line.replace("\"", "").strip().split(',')
This removes every double quote on the line, so it would also strip quotes that are part of the data. If you know there are none, it's fine.
Otherwise, create a small function to strip the quotes. This example also writes to a file.
#!/usr/bin/python
from itertools import combinations

def remquotes(s):
    # Strip one leading and one trailing double quote, if present.
    # startswith/endswith also handle an empty line safely.
    beg, end = 0, len(s)
    if s.startswith('"'):
        beg = 1
    if s.endswith('"'):
        end = -1
    return s[beg:end]

with open('data1.txt') as f:
    result = []
    for n, line in enumerate(f, start=1):
        items = remquotes(line.strip()).strip().split(',')
        x = [['%s_%d' % (x, n) for x in item] for item in combinations(items, 2)]
        result.append(x)

with open('out.txt', 'w') as fout:
    for res in result:
        for elem in res:
            linestr = ',\t'.join(elem)
            print(linestr)
            fout.write(linestr + '\n')
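The two quote-handling approaches only differ when a quote appears inside the data. The sketch below compares them on a made-up line; strip_outer_quotes is a hypothetical helper written for this comparison, and the input is an assumption rather than a line from the real file.

```python
def strip_outer_quotes(s):
    """Remove one leading and one trailing double quote, if present."""
    if s.startswith('"'):
        s = s[1:]
    if s.endswith('"'):
        s = s[:-1]
    return s

# Hypothetical line: wrapped in quotes, with a quote inside one item.
line = '"a,b"x,c"\n'

# Approach 1: drop every double quote on the line.
all_removed = line.replace('"', '').strip().split(',')

# Approach 2: drop only the quotes that wrap the whole line.
outer_only = strip_outer_quotes(line.strip()).split(',')

print(all_removed)  # the quote inside the second item is lost
print(outer_only)   # the quote inside the second item is kept
```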
Solution 2:
This is similar to the other answer. Based on the comments, it looks like you actually want to write to a tab-delimited text file rather than build a dictionary.
#!/usr/bin/python
import itertools

file_name = 'data.txt'
out_file = 'out.txt'

with open(file_name) as infile, open(out_file, "w") as out:
    for n, line in enumerate(infile):
        row = [i + "_" + str(n+1) for i in line.strip().split(",")]
        for i in itertools.combinations(row, 2):
            out.write('\t'.join(i) + '\n')
Solution 3:
The following seems to work with a minimal amount of code:
import itertools

input_filename = 'data.txt'
output_filename = 'split_data.txt'

with open(input_filename, 'rt') as inp, open(output_filename, 'wt') as outp:
    for n, line in enumerate(inp, 1):
        items = ('{}_{}'.format(x.strip(), n)
                 for x in line.replace('"', '').split(','))
        for combo in itertools.combinations(items, 2):
            outp.write('\t'.join(combo) + '\n')
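If you want to check the logic before running it on the real file, one option is to wrap the same loop in a function that accepts any file-like objects and feed it io.StringIO buffers. This is only a sketch: the function name write_pairs and the two-line sample input are assumptions, not part of the answer above.

```python
import io
import itertools

def write_pairs(inp, outp):
    """Write tab-separated, line-numbered pairs for each input line."""
    for n, line in enumerate(inp, 1):
        items = ['{}_{}'.format(x.strip(), n)
                 for x in line.replace('"', '').split(',')]
        for combo in itertools.combinations(items, 2):
            outp.write('\t'.join(combo) + '\n')

# Hypothetical two-line input; the first line is wrapped in quotes.
source = io.StringIO('"A, B, C"\nD,E\n')
target = io.StringIO()
write_pairs(source, target)
print(target.getvalue(), end='')
```

The same function works unchanged with real files opened via open(), since it only relies on iteration and write().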