Replacing Words In A File With Re
Solution 1:
You don't need a loop for this, nor do you need to replace and write the file several times. A very efficient approach is:
- open & read the file
- use the regex replacement function (re.sub) with a lambda that looks up each word of the text in the dictionary and returns the word unchanged if it is not found
- open & write the file (or a new file)
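Like this:
import re

text = "input.txt"
operators = {'order': '"order"', 'matter': '"matter"'}
# read the whole file into memory
with open(text, 'r') as f:
    contents = f.read()
# replace each word found in the dictionary, keep the others unchanged
cleaned = re.sub(r"\b(\w+)\b", lambda m: operators.get(m.group(1), m.group(1)), contents)
# write the result to a new file
with open("new_" + text, 'w') as f:
    f.write(cleaned)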
This little-known feature of re.sub is very powerful: the replacement can be a function instead of a string. The function takes the match object as input and returns the string that must replace the match. Here it is an anonymous function (lambda):
lambda m : operators.get(m.group(1),m.group(1))
So if the matched word is in the dictionary, the function returns its value, which replaces the word; otherwise it returns the original word.
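To see the function-as-replacement behaviour without touching any file, here is a minimal sketch on an in-memory string. The helper name quote_known_words and the sample sentence are made up for illustration; the dictionary mirrors the operators mapping from the answer.

```python
import re

# Same mapping as in the answer; the sample text below is an assumption.
operators = {'order': '"order"', 'matter': '"matter"'}

def quote_known_words(text, mapping):
    """Replace every whole word found in `mapping`; leave other words unchanged."""
    def replace(match):
        word = match.group(1)
        # dict.get falls back to the word itself when it is not a key
        return mapping.get(word, word)
    return re.sub(r"\b(\w+)\b", replace, text)

sample = "the order does not matter, but ordering might"
result = quote_known_words(sample, operators)
print(result)  # the "order" does not "matter", but ordering might
```

Note that "ordering" is left alone: the pattern matches whole words, so only an exact dictionary key triggers a replacement.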
All of that happens without an explicit loop and with O(1) word lookup in the dictionary, so it stays fast even if you have a lot of items in your dictionary. By contrast, running one replacement per keyword is linear in the number of keywords, and building a single pattern from the list of keywords with "|".join() starts to crawl when you have 1000+ items to search/replace.
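For comparison, this is roughly what the "|".join() alternative looks like next to the dictionary lookup. It is only a sketch with a two-item dictionary and a made-up sample string, so it shows that both approaches give the same result, not how they differ in speed.

```python
import re

operators = {'order': '"order"', 'matter': '"matter"'}

# Alternation approach: one pattern that lists every keyword explicitly.
# re.escape protects keywords that contain regex metacharacters.
alternation = re.compile(r"\b(" + "|".join(map(re.escape, operators)) + r")\b")

# Lookup approach: one generic word pattern, the dictionary does the matching.
generic = re.compile(r"\b(\w+)\b")

sample = "order and matter, in that order"
via_alternation = alternation.sub(lambda m: operators[m.group(1)], sample)
via_lookup = generic.sub(lambda m: operators.get(m.group(1), m.group(1)), sample)

print(via_alternation == via_lookup)
```

With the alternation pattern the regex engine has to consider the keyword list at each candidate position, while the generic pattern stays the same size no matter how many keys the dictionary holds.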