How Can I Get The Contents Of The "feedback" Box From Google Searches?
When you ask a question or request the definition of a word in a Google search, Google gives you a summary of the answer in the 'feedback' box. For example, when you search for def
Solution 1:
This is easy to do with requests and bs4: you just need to pull the text from the div with the class lr_dct_ent.
import requests
from bs4 import BeautifulSoup
# Send a browser-like User-Agent header with the request.
h = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
soup = BeautifulSoup(r, "lxml")
# Print the text of the dictionary entry, one ";"-separated chunk per line.
print("\n".join(soup.select_one("div.lr_dct_ent").text.split(";")))
The main text is in an ordered list, and the part of speech (here, noun) is in the div with the lr_dct_sf_h class:
In [11]: r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
In [12]: soup = BeautifulSoup(r,"lxml")
In [13]: div = soup.select_one("div.lr_dct_ent")
In [14]: n_v = div.select_one("div.lr_dct_sf_h").text
In [15]: expl = [li.text for li in div.select("ol.lr_dct_sf_sens li")]
In [16]: print(n_v)
noun
In [17]: print("\n".join(expl))
1. the round fruit of a tree of the rose family, which typically has thin green or red skin and crisp flesh.used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
2. the tree bearing apples, with hard pale timber that is used in carpentry and to smoke food.
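If Google changes its markup, select_one returns None and the snippets above fail with an AttributeError. Here is a minimal sketch of a guarded version, assuming the class names from this answer are still in use. The names extract_definition and SAMPLE_HTML are hypothetical, and the sample markup is made up for illustration rather than copied from a live page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only mimics the class names used in this answer;
# the live Google page may look different or change without notice.
SAMPLE_HTML = """
<div class="lr_dct_ent">
  <div class="lr_dct_sf_h">noun</div>
  <ol class="lr_dct_sf_sens">
    <li>the round fruit of a tree of the rose family.</li>
    <li>the tree bearing apples.</li>
  </ol>
</div>
"""


def extract_definition(page_html):
    """Return the part of speech and senses, or None if the box is absent."""
    soup = BeautifulSoup(page_html, "html.parser")
    entry = soup.select_one("div.lr_dct_ent")
    if entry is None:
        # The dictionary box was not found (no definition, or changed markup).
        return None
    heading = entry.select_one("div.lr_dct_sf_h")
    senses = [li.get_text(strip=True) for li in entry.select("ol.lr_dct_sf_sens li")]
    return {
        "part_of_speech": heading.get_text(strip=True) if heading else None,
        "senses": senses,
    }


result = extract_definition(SAMPLE_HTML)
no_box = extract_definition("<html><body>No dictionary box here</body></html>")
print(result)
print(no_box)
```

To use it on a live page, pass the text of the requests response instead of SAMPLE_HTML and check for None before printing.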
Solution 2:
The question is a nice idea.
The program below can be started with python3 defineterm.py apple
#! /usr/bin/env python3.5
# defineterm.py
import requests
from bs4 import BeautifulSoup
import sys
import html
import codecs
searchterm = ' '.join(sys.argv[1:])
url = 'https://www.google.com/search?q=define+' + searchterm
res = requests.get(url)
try:
    res.raise_for_status()
except Exception as exc:
    print('an error occurred while loading the page: ' + str(exc))
    # Stop here: there is no usable page to parse after a failed request.
    sys.exit(1)
text = html.unescape(res.text)
soup = BeautifulSoup(text, 'lxml')
prettytext = soup.prettify()
# The next lines save the raw page for analysis; you can comment them out.
frawpage = codecs.open('rawpage.txt', 'w', 'utf-8')
frawpage.write(prettytext)
frawpage.close()
firsttag = soup.find('h3', class_="r")
if firsttag is not None:
    print(firsttag.getText())
    print()
# The markup of the second tag may change, so check it if the result looks wrong. The same may apply to all the tags searched for here.
secondtag = soup.find('div', {'style': 'color:#666;padding:5px 0'})
if secondtag is not None:
    print(secondtag.getText())
    print()
termtags = soup.findAll("li", {"style": "list-style-type:decimal"})
count = 0
for tag in termtags:
    count += 1
    print(str(count) + '. ' + tag.getText())
print()
Make the script executable (chmod +x defineterm.py).
Then this line can be added to ~/.bashrc, with the path adjusted to wherever the script lives on your machine:
alias defterm="/data/Scrape/google/defineterm.py "
Reload the configuration by executing
source ~/.bashrc
After that, the program can be started with:
defterm apple (or any other term)
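The script builds the URL by plain string concatenation, so a term containing characters such as & or + can change the query. Below is a small sketch of escaping the term with the standard library's urllib.parse.urlencode. build_define_url is a hypothetical helper, not part of the original script.

```python
from urllib.parse import urlencode


def build_define_url(term, base="https://www.google.com/search"):
    """Build a 'define <term>' search URL with the query string escaped."""
    # urlencode turns spaces into '+' and percent-encodes reserved characters.
    return base + "?" + urlencode({"q": "define " + term})


url_simple = build_define_url("apple")
url_spaces = build_define_url("custard apple")
url_symbols = build_define_url("c++ & friends")
print(url_simple)
print(url_spaces)
print(url_symbols)
```

In the script, this could replace the url = ... line, for example url = build_define_url(searchterm).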
Solution 3:
The easiest way is to find the CSS selectors for this text with SelectorGadget.
from bs4 import BeautifulSoup
import requests, lxml
headers = {
    'User-agent':
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
html = requests.get('https://www.google.de/search?q=define apple', headers=headers)
soup = BeautifulSoup(html.text, 'lxml')
syllables = soup.select_one('.frCXef span').text
phonetic = soup.select_one('.g30o5d span span').text
noun = soup.select_one('.h3TRxf span').text
print(f'{syllables}\n{phonetic}\n{noun}')
# Output:
'''
ap·ple
ˈapəl
the round fruit of a tree of the rose family, which typically has thin red or green skin and crisp flesh. Many varieties have been developed as dessert or cooking fruit or for making cider.
'''
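Class names such as .frCXef look auto-generated and may change, in which case select_one returns None and the .text access raises an AttributeError. As a rough sketch, a helper can try several candidate selectors and return None when nothing matches. first_text, the .old-class selector and the sample markup are hypothetical and only for illustration.

```python
from bs4 import BeautifulSoup


def first_text(soup, selectors):
    """Return the text of the first selector that matches, else None."""
    for selector in selectors:
        node = soup.select_one(selector)
        if node is not None:
            return node.get_text(strip=True)
    return None


# Hypothetical markup reusing one class name from this answer.
SAMPLE_HTML = '<div class="frCXef"><span>ap·ple</span></div>'
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The first candidate does not match, so the helper falls back to the second.
syllables = first_text(soup, [".old-class span", ".frCXef span"])
# No candidate matches here, so the result is None instead of an exception.
phonetic = first_text(soup, [".g30o5d span span"])
print(syllables, phonetic)
```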
Alternatively, you can do the same thing using Google Direct Answer Box API from SerpApi. It's a paid API with a free trial of 5,000 searches.
Code to integrate:
from serpapi import GoogleSearch
params = {
    "api_key": "YOUR_API_KEY",
    "engine": "google",
    "q": "define apple",
    "google_domain": "google.com",
}
search = GoogleSearch(params)
results = search.get_dict()
syllables = results['answer_box']['syllables']
phonetic = results['answer_box']['phonetic']
noun = results['answer_box']['definitions'][0]  # array output
print(f'{syllables}\n{phonetic}\n{noun}')
# Output:
'''
ap·ple
ˈapəl
the round fruit of a tree of the rose family, which typically has thin red or green skin and crisp flesh. Many varieties have been developed as dessert or cooking fruit or for making cider.
'''
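Not every query produces an answer box, so indexing results['answer_box'] directly can raise a KeyError. Here is a minimal sketch of reading the same keys defensively, with no API call involved. summarize_answer_box and the sample dictionaries are hypothetical, shaped only after the keys used in this answer.

```python
def summarize_answer_box(results):
    """Pull syllables, phonetic and first definition, tolerating missing keys."""
    box = results.get("answer_box") or {}
    definitions = box.get("definitions") or []
    return {
        "syllables": box.get("syllables"),
        "phonetic": box.get("phonetic"),
        "definition": definitions[0] if definitions else None,
    }


# Hypothetical response shaped after the keys accessed in this answer.
sample_results = {
    "answer_box": {
        "syllables": "ap·ple",
        "phonetic": "ˈapəl",
        "definitions": ["the round fruit of a tree of the rose family"],
    }
}

with_box = summarize_answer_box(sample_results)
without_box = summarize_answer_box({"search_metadata": {}})
print(with_box)
print(without_box)
```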
Disclaimer: I work for SerpApi.