Requests: Explanation Of The .text Format
Solution 1:
requests.get() returns a Response object, and it is that object that has the .text attribute. The object is not the 'source code' of the URL; it lets you access the source code (the body) of the response, as well as other information such as the status code and headers. The Response.text attribute gives you the body of the response, decoded to Unicode.
See the Response Content section of the Quickstart documentation:
When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text.
Further information can be found in the API documentation; see the Response.text entry:
Content of the response, in unicode.
If Response.encoding is None, encoding will be guessed using chardet.
The encoding of the response content is determined based solely on HTTP headers, following RFC 2616 to the letter. If you can take advantage of non-HTTP knowledge to make a better guess at the encoding, you should set r.encoding appropriately before accessing this property.
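As a rough sketch of what setting r.encoding changes, the snippet below forces an encoding before reading .text, and then shows offline why a wrong guess matters. The names fetch_text and decode_as are hypothetical helpers, not part of Requests, and the sample bytes are made up for illustration.

```python
def fetch_text(url, encoding=None):
    """Return the decoded body of `url`, optionally forcing the encoding."""
    import requests  # third-party; imported here so the offline demo below runs without it

    response = requests.get(url, timeout=10)
    if encoding is not None:
        # Override the header-based guess before .text is read.
        response.encoding = encoding
    return response.text


def decode_as(raw, encoding):
    """Approximation of what .text does once an encoding has been chosen."""
    return raw.decode(encoding, errors="replace")


# Offline illustration with made-up bytes: the same body, two encodings.
raw_body = "café".encode("utf-8")
wrong = decode_as(raw_body, "latin-1")  # wrong guess garbles the non-ASCII character
right = decode_as(raw_body, "utf-8")    # correct encoding recovers the text
```

Calling fetch_text(url, encoding="utf-8") would be the networked equivalent, assuming you know from outside HTTP (for example, from the page's own markup) that the body really is UTF-8.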
You can also use Response.content to access the response body undecoded, as raw bytes.
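One case where the raw bytes are the better choice is saving a non-text body (an image, an archive) to disk, since no decoding should happen at all. The following is a minimal sketch; save_body is a hypothetical helper, and it assumes only that the object passed in has a .content attribute holding bytes, as a Response does.

```python
def save_body(response, path):
    """Write the undecoded body of a Response-like object to `path`.

    Returns the number of bytes written.
    """
    with open(path, "wb") as handle:  # binary mode, so no decoding step is involved
        return handle.write(response.content)


# Possible usage (requires the third-party requests package and network access):
#   import requests
#   response = requests.get(url, timeout=10)   # url is a placeholder
#   save_body(response, "download.bin")
```

For text such as HTML, .text is usually what you want; .content is the safer choice whenever the body is binary or you plan to decode it yourself.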
Solution 2:
In this line:
source_code = requests.get(url)
source_code holds a Response object, not the source code. It should be:
response = requests.get(url)
source_code = response.text