String Manipulation And Adding Values Based On Row They Are

I have a delimited text file, and I am trying to make every pairwise combination of the items on each line, tagging each pair with the number of the line it came from. Here is an example (you can download it he

Solution 1:

Try this

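#!/usr/bin/python
from itertools import combinations

with open('data1.txt') as f:
    result = []
    # n is the 1-based line number appended to every item
    for n, line in enumerate(f, start=1):
        items = line.strip().split(',')

        # one [first, second] list per 2-combination of this line's items
        pairs = [['%s_%d' % (item, n) for item in pair] for pair in combinations(items, 2)]
        result.append(pairs)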

for res in result:
    for elem in res:
        print(',\t'.join(elem))

You need a list of lists of lists to represent the pairs: one list per input line, each holding one two-element list per pair. You can build each of them with a list comprehension inside the loop.

I wasn't sure what you wanted as your actual output format, but this prints your expected output.
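
To make the nesting concrete, here is a minimal sketch that applies the same comprehension to two in-memory lines instead of a file. The sample values and the helper name tagged_pairs are made up for illustration; they are not from the question's data.

```python
from itertools import combinations

def tagged_pairs(line, n):
    # Hypothetical helper: split one comma-separated line and tag each
    # field in every 2-combination with the line number n.
    fields = line.strip().split(',')
    return [['%s_%d' % (field, n) for field in pair]
            for pair in combinations(fields, 2)]

# Made-up sample input standing in for the lines of data1.txt.
sample_lines = ['a,b,c\n', 'd,e\n']
result = [tagged_pairs(line, n)
          for n, line in enumerate(sample_lines, start=1)]

# Outer list: one entry per line; middle list: pairs; inner list: two items.
for res in result:
    for elem in res:
        print(',\t'.join(elem))
```

Here result[0] holds the three pairs from the first line and result[1] holds the single pair from the second.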

If there are quotes in the input file, the simple fix is to change the items line in the code above to

items = line.replace("\"", "").strip().split(',')

This removes every double quote on the line, so it would break if the data itself contained double quotes. If you know it doesn't, this is fine.

Otherwise, create a small function to strip the quotes. This example also writes to a file.

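#!/usr/bin/python
from itertools import combinations

def remquotes(s):
    # Drop one leading and one trailing double quote, if present.
    # startswith/endswith also cope with an empty string (blank line).
    beg, end = 0, len(s)
    if s.startswith('"'):
        beg = 1
    if s.endswith('"'):
        end = -1
    return s[beg:end]

with open('data1.txt') as f:
    result = []
    for n, line in enumerate(f, start=1):
        items = remquotes(line.strip()).strip().split(',')

        # one [first, second] list per 2-combination of this line's items
        pairs = [['%s_%d' % (item, n) for item in pair] for pair in combinations(items, 2)]
        result.append(pairs)

# print each pair and also write it to out.txt
with open('out.txt', 'w') as fout:
    for res in result:
        for elem in res:
            linestr = ',\t'.join(elem)
            print(linestr)
            fout.write(linestr + '\n')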
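
If each field is quoted individually (for example "a","b"), stripping quotes by hand gets fragile. One alternative, which is not part of the answer above, is to let Python's standard csv module do the parsing. This is only a sketch: it assumes ordinary CSV quoting rules and uses made-up sample lines in place of data1.txt.

```python
import csv
import io
from itertools import combinations

# Made-up sample data; io.StringIO stands in for open('data1.txt').
sample = io.StringIO('"a","b","c"\n"d, with comma","e"\n')

lines_out = []
# csv.reader removes the surrounding quotes and keeps quoted commas intact.
for n, row in enumerate(csv.reader(sample), start=1):
    tagged = ['%s_%d' % (field.strip(), n) for field in row]
    for pair in combinations(tagged, 2):
        lines_out.append(',\t'.join(pair))

print('\n'.join(lines_out))
```

The trade-off is that a comma inside a quoted field now stays part of that field instead of splitting it, which may or may not be what your data needs.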

Solution 2:

This is similar to the other answer, except that, based on the comments, it looks like you actually want to write to a tab-delimited text file rather than build a dictionary.

#!/usr/bin/python
import itertools

file_name = 'data.txt'
out_file = 'out.txt'

with open(file_name) as infile, open(out_file, "w") as out:
  for n, line in enumerate(infile):
    row = [i + "_" + str(n+1) for i in line.strip().split(",")]
    for i in itertools.combinations(row,2):
      out.write('\t'.join(i) + '\n')

Solution 3:

The following seems to work with a minimal amount of code:

import itertools

input_filename = 'data.txt'
output_filename = 'split_data.txt'

with open(input_filename, 'rt') as inp, open(output_filename, 'wt') as outp:
    for n, line in enumerate(inp, 1):
        items = ('{}_{}'.format(x.strip(), n) 
                    for x in line.replace('"', '').split(','))
        for combo in itertools.combinations(items, 2):
            outp.write('\t'.join(combo) + '\n')
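
To try this without touching the filesystem, the same loop can be wrapped in a function that accepts any file-like objects. This is a sketch: the function name write_pairs and the sample input are assumptions for illustration, not part of the answer.

```python
import io
import itertools

def write_pairs(inp, outp):
    # Hypothetical wrapper around the loop above: reads comma-separated
    # lines from inp and writes tab-separated tagged pairs to outp.
    for n, line in enumerate(inp, 1):
        items = ('{}_{}'.format(x.strip(), n)
                 for x in line.replace('"', '').split(','))
        for combo in itertools.combinations(items, 2):
            outp.write('\t'.join(combo) + '\n')

# Made-up sample input; io.StringIO stands in for the real files.
source = io.StringIO('"a", "b", "c"\nd,e\n')
target = io.StringIO()
write_pairs(source, target)
print(target.getvalue(), end='')
```

Passing a generator expression to itertools.combinations works because combinations consumes any iterable, so no intermediate list is needed.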
