Work With Network Response

Response Object

The result of doing a network request via Grab is a Response object.

You get a Response object as a result of calling to g.go, g.request and g.submit methods. You can also access the response object of a recent network query via the g.response attribute:

>>> from grab import Grab
>>> g = Grab()
>>> g.request('http://google.com')
<grab.doc.Document object at 0x2cff9f0>
>>> g.doc
<grab.doc.Document object at 0x2cff9f0>

You can find a full list of response attributes in the Response API document. Here are the most important things you should know:

body

original body contents of HTTP response

code

HTTP status of response

headers

HTTP headers of response

encoding

encoding of the response

cookies

cookies in the response

url

the URL of the response document. In case of some automatically processed redirect, the url attribute contains the final URL.

download_size

size of received data

upload_size

size of uploaded data except the HTTP headers

Now, a real example:

>>> from grab import Grab
>>> g = Grab()
>>> g.request('http://wikipedia.org')
<grab.doc.Document object at 0x1ff99f0>
>>> g.doc.body[:100]
'<!DOCTYPE html>\n<html lang="mul" dir="ltr">\n<head>\n<!-- Sysops: Please do not edit the main template'
>>> g.doc.code
200
>>> g.doc.headers['Content-Type']
'text/html; charset=utf-8'
>>> g.doc.encoding
'utf-8'
>>> g.doc.cookies
<CookieJar[Cookie(...), Cookie(..)]>
>>> g.doc.url
'http://www.wikipedia.org/'
>>> g.doc.download_size
11100.0
>>> g.doc.upload_size
0.0

Now let’s see some useful methods available in the response object:

unicode_body()

this method returns the response body converted to unicode

copy()

returns a clone of the response object

save(path)

saves the response object to the given location

json

treats the response content as json-serialized data and de-serializes it into a python object. Actually, this is not a method, it is a property.

url_details()

return the result of calling urlparse.urlsplit with response.url as an argument.

query_param(name)

extracts the value of the key argument from the query string of response.url.