
Extracting Portion Of The String Text With Start And End Matches By Using Regular Expressions In Python

I am trying to extract just one portion of a string in Python using regular expressions, delimited by two specific matches: a start marker and an end marker. To be specific, here is an example text: example = '''

Solution 1:

import re

example = """
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements 
or to update the reasons why actual results could differ from those projected 
in the forward-looking statements. PART 1. ITEM 1. BUSINESS 
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company 
incorporated under the laws of Ohio in 1985 and elected to become a financial 
holding company under the Federal Reserve in 2014. Our primary subsidiary, 
The Farmers & Merchants State Bank (Bank) is a community bank operating 
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street. 
In addition, the Bank owns the property from 200 to 208 Ditto Street, 
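Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""


def get_text_between(text, mark1, mark2):
    # Escape the markers so characters such as '.' are matched literally.
    regex = '({}.*?){}'.format(re.escape(mark1), re.escape(mark2))
    # re.DOTALL lets '.' match newlines, so the match can span several lines.
    # Search the text that was passed in, not the global example.
    match = re.search(regex, text, re.DOTALL)
    if match:
        return match.group(1)
    return None


if __name__ == '__main__':
    text = get_text_between(example, 'ITEM 1', 'ITEM 2')
    if text:
        print(text)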
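
As a minimal sketch of the same idea on a shorter string: the helper name extract_between, the include_start flag, and the sample text below are illustrative assumptions, not part of the original answer. It shows that escaping the markers lets them contain regex metacharacters such as '.', and how the start marker could be left out of the result if only the text between the two markers is wanted.

```python
import re


def extract_between(text, start, end, include_start=True):
    """Return the text from `start` up to (not including) `end`, or None."""
    # re.escape makes the markers literal, so '.' in "ITEM 1." is not a wildcard.
    pattern = "({})(.*?){}".format(re.escape(start), re.escape(end))
    # re.DOTALL lets '.' cross line breaks; the lazy '.*?' stops at the first `end`.
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        return None
    if include_start:
        return match.group(1) + match.group(2)
    return match.group(2)


# Hypothetical sample text, much shorter than the report excerpt above.
sample = "PART 1. ITEM 1. BUSINESS\nGeneral text.ITEM 2. PROPERTIES"

with_marker = extract_between(sample, "ITEM 1.", "ITEM 2.")
without_marker = extract_between(sample, "ITEM 1.", "ITEM 2.", include_start=False)
missing = extract_between(sample, "ITEM 3.", "ITEM 4.")
```

Here with_marker keeps the "ITEM 1." prefix, without_marker starts right after it, and missing is None because the start marker never occurs.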

Solution 2:

This way you can capture the part of the string you want to extract in a group.

import re
example = """
    The forward-looking statements are made as of the date of this report,
    and the Company assumes no obligation to update the forward-looking statements 
    or to update the reasons why actual results could differ from those projected 
    in the forward-looking statements. PART 1. ITEM 1. BUSINESS 
    General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company 
    incorporated under the laws of Ohio in 1985 and elected to become a financial 
    holding company under the Federal Reserve in 2014. Our primary subsidiary, 
    The Farmers & Merchants State Bank (Bank) is a community bank operating 
    in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
    The Bank operates from the facilities at 307 North Defiance Street. 
    In addition, the Bank owns the property from 200 to 208 Ditto Street, 
    Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""
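final_result = ""
# [\s\S] matches any character, including newlines.
# The lazy *? stops at the first 'ITEM 2' instead of the last one.
search = re.search(r'(ITEM 1[\s\S]*?)ITEM 2', example)
if search:
    final_result = search.group(1)
    print(final_result)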
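
Whether the quantifier is greedy only matters when the end marker can occur more than once. The following is a small sketch of that difference; the sample string is made up for illustration and is not taken from the report.

```python
import re

# Hypothetical sample in which the end marker appears twice.
sample = "ITEM 1. BUSINESS\nbody text\nITEM 2. PROPERTIES\nsee ITEM 2 above"

# Greedy: [\s\S]* grabs as much as possible, so it runs up to the LAST 'ITEM 2'.
greedy = re.search(r'(ITEM 1[\s\S]*)ITEM 2', sample).group(1)

# Lazy: [\s\S]*? grabs as little as possible, so it stops at the FIRST 'ITEM 2'.
lazy = re.search(r'(ITEM 1[\s\S]*?)ITEM 2', sample).group(1)

print(repr(greedy))
print(repr(lazy))
```

With a single 'ITEM 2' in the text, as in the example above, both forms return the same section.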

Solution 3:

example = """
    The forward-looking statements are made as of the date of this report,
    and the Company assumes no obligation to update the forward-looking statements 
    or to update the reasons why actual results could differ from those projected 
    in the forward-looking statements. PART 1. ITEM 1. BUSINESS 
    General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company 
    incorporated under the laws of Ohio in 1985 and elected to become a financial 
    holding company under the Federal Reserve in 2014. Our primary subsidiary, 
    The Farmers & Merchants State Bank (Bank) is a community bank operating 
    in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
    The Bank operates from the facilities at 307 North Defiance Street. 
    In addition, the Bank owns the property from 200 to 208 Ditto Street, 
    Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
    """
import re

# Replace the newlines with spaces so the whole text sits on one line.
example2 = " ".join(example.split("\n"))
match = re.search("(ITEM 1.*?)ITEM 2", example2)
if match:
    print(match.group(1))

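This should work: replacing the newlines with spaces puts the text on one line, so '.' can match all the way from ITEM 1 to ITEM 2 without re.DOTALL.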
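
To see why the join matters, here is a brief sketch on a made-up multi-line string (the sample is an assumption for illustration). Without re.DOTALL, '.' does not match a newline, so the same pattern fails on the original text and succeeds once the lines are joined.

```python
import re

# Hypothetical sample where the two markers are on different lines.
sample = "intro ITEM 1. BUSINESS\nGeneral text\nmore text.ITEM 2. PROPERTIES"

# '.' stops at '\n' by default, so no match is found across lines.
no_match = re.search(r"(ITEM 1.*?)ITEM 2", sample)

# After joining the lines with spaces there is nothing to stop '.'.
joined = " ".join(sample.split("\n"))
match = re.search(r"(ITEM 1.*?)ITEM 2", joined)

print(no_match)
print(match.group(1))
```

Note that this approach changes the extracted text: the line breaks come back as spaces, which may or may not be acceptable depending on what is done with the section afterwards.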
