Using BeautifulSoup To Parse String Efficiently
I am trying to parse this HTML to get the item title (e.g. Big Boss Air Fryer - Healthy 1300-Watt Super Sized 16-Quart, Fryer 5 Colors -NEW).
Solution 1:
This is because of this caveat of the .string attribute:
"If a tag contains more than one thing, then it’s not clear what .string should refer to, so .string is defined to be None."
Since the header element contains multiple children, its .string cannot be determined and defaults to None.
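The caveat can be reproduced with a minimal sketch. The markup below is a hypothetical stand-in, not the page from the question; it only assumes that the header holds a nested span followed by a text node.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not the original page).
single = BeautifulSoup("<h1>Only one child</h1>", "html.parser")
multi = BeautifulSoup(
    "<h1><span>Details about</span> Item title</h1>", "html.parser"
)

# One child: .string returns that single text node.
single_string = single.h1.string

# Two children (a <span> and a text node): .string is None.
multi_string = multi.h1.string

print(single_string)
print(multi_string)
```

The second header has two children, so Beautiful Soup has no single string to return.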
To avoid having to cut off the "Details about" part afterwards, you can get the first text node in non-recursive mode:
soup.find('h1', {'class':'it-ttl'}).find(text=True, recursive=False)
Demo:
In [3]: soup = BeautifulSoup(data, "html.parser")
In [4]: print(soup.find('h1', {'class':'it-ttl'}).find(text=True, recursive=False))
Big Boss Air Fryer - Healthy 1300-Watt Super Sized 16-Quart, Fryer 5 Colors -NEW
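For a self-contained run, here is a sketch that uses assumed markup in place of the page, since the original HTML is not shown. The span content and the shortened title are assumptions made for illustration. It uses string=True, which is the newer spelling of text=True in Beautiful Soup 4.4.0 and later.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the page's header; the real markup is not shown above.
data = (
    '<h1 class="it-ttl">'
    "<span>Details about </span>"
    "Big Boss Air Fryer"
    "</h1>"
)

soup = BeautifulSoup(data, "html.parser")
header = soup.find("h1", {"class": "it-ttl"})

# get_text() walks the whole subtree, so the span's text is included.
full_text = header.get_text()

# recursive=False limits the search to direct children of <h1>,
# so the text inside the nested <span> is skipped.
title = header.find(string=True, recursive=False)

print(full_text)
print(title)
```

Comparing the two printed values shows why the non-recursive search is convenient here: the prefix never has to be stripped by hand.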