Extracting Portion Of The String Text With Start And End Matches By Using Regular Expressions In Python
I am trying to extract one portion of a string using regular expressions in Python, bounded by a start match and an end match. To be specific, here is an example text: example = '''
Solution 1:
import re
example = """
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements
or to update the reasons why actual results could differ from those projected
in the forward-looking statements. PART 1. ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a community bank operating
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street.
In addition, the Bank owns the property from 200 to 208 Ditto Street,
Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""

def get_text_between(text, mark1, mark2):
    # Escape the markers so they match literally; capture lazily up to mark2
    regex = '({}.*?){}'.format(re.escape(mark1), re.escape(mark2))
    # Search the text passed in, not the global variable
    match = re.search(regex, text, re.DOTALL)
    if match:
        return match.group(1)
    return None

if __name__ == '__main__':
    text = get_text_between(example, 'ITEM 1', 'ITEM 2')
    if text:
        print(text)
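One caveat when the pattern is built from arbitrary markers: characters such as "." have a special meaning in a regex. The following is a minimal sketch, assuming a hypothetical helper named between and a made-up sample string (neither comes from the answer above), of how re.escape keeps the markers literal:

```python
import re

def between(text, start, end):
    """Hypothetical helper: return the text from `start` up to (not including) `end`."""
    pattern = '({}.*?){}'.format(re.escape(start), re.escape(end))
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None

# Made-up sample: "ITEM 10" appears before the real "ITEM 1." heading.
sample = "ITEM 10 SUMMARY ... ITEM 1. BUSINESS\nGeneral info.\nITEM 2. PROPERTIES"

# Unescaped, the '.' in 'ITEM 1.' matches any character, so 'ITEM 10' matches too.
unescaped = re.search('(ITEM 1..*?)ITEM 2.', sample, re.DOTALL).group(1)

# Escaped, only the literal 'ITEM 1.' is accepted as the start marker.
escaped = between(sample, 'ITEM 1.', 'ITEM 2.')

print(repr(unescaped))
print(repr(escaped))
```

With markers like 'ITEM 1' that contain no special characters, escaping changes nothing, so it is a cheap safeguard rather than a requirement.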
Solution 2:
This way you can capture the part of the string you want to extract in a group.
import re
example = """
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements
or to update the reasons why actual results could differ from those projected
in the forward-looking statements. PART 1. ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a community bank operating
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street.
In addition, the Bank owns the property from 200 to 208 Ditto Street,
Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""
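final_result = ""
# Raw string avoids invalid-escape warnings; [\s\S] matches any character, newlines included.
# The lazy *? stops at the first ITEM 2 rather than the last one.
search = re.search(r'(ITEM 1[\s\S]*?)ITEM 2', example)
if search:
    final_result = search.group(1)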
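Whether the quantifier is greedy (*) or lazy (*?) only matters when the end marker occurs more than once. A small sketch with a made-up sample string (an assumption for illustration, not the filing text above) shows the difference:

```python
import re

# Made-up sample in which the end marker "ITEM 2" appears twice.
sample = "ITEM 1. BUSINESS text ITEM 2. PROPERTIES more text, see ITEM 2 above."

# Greedy: [\s\S]* grabs as much as possible, so the capture runs to the last "ITEM 2".
greedy = re.search(r'(ITEM 1[\s\S]*)ITEM 2', sample).group(1)

# Lazy: [\s\S]*? grabs as little as possible, so the capture stops at the first "ITEM 2".
lazy = re.search(r'(ITEM 1[\s\S]*?)ITEM 2', sample).group(1)

print(repr(greedy))
print(repr(lazy))
```

For section extraction the lazy form is usually what is wanted, since later cross-references to the end marker would otherwise extend the match.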
Solution 3:
example = """
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements
or to update the reasons why actual results could differ from those projected
in the forward-looking statements. PART 1. ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a community bank operating
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street.
In addition, the Bank owns the property from 200 to 208 Ditto Street,
Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""

import re

# Flatten the text to a single line before searching
example2 = " ".join(example.split("\n"))
match = re.search("(ITEM 1.*?)ITEM 2", example2)
if match:
    print(match.group(1))
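This should work: once the newlines are replaced with spaces, "." can match across the whole text without the re.DOTALL flag. Note that the extracted text then contains spaces where the original had line breaks.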
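To see why the newlines matter, here is a short sketch on a made-up three-line sample (an assumption for illustration). It compares searching the raw text, the flattened text, and the raw text with re.DOTALL:

```python
import re

# Made-up sample with line breaks between the two markers.
sample = "PART 1. ITEM 1. BUSINESS\nGeneral information.\nITEM 2. PROPERTIES"
pattern = r"(ITEM 1.*?)ITEM 2"

# By default '.' does not match '\n', so the match cannot cross a line break.
no_flag = re.search(pattern, sample)

# Option A: replace newlines with spaces first (this alters the extracted text).
flat = " ".join(sample.split("\n"))
flat_text = re.search(pattern, flat).group(1)

# Option B: keep the text intact and let '.' match newlines via re.DOTALL.
dotall_text = re.search(pattern, sample, re.DOTALL).group(1)

print(no_flag)
print(repr(flat_text))
print(repr(dotall_text))
```

Flattening is convenient when the line breaks are only layout; re.DOTALL is the better fit when the original line structure should be preserved in the result.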