Cannot get r.html because session.get(url) returns a requests Response, not an HTMLResponse #586

@e-ave

I cannot figure out how to use the latest version of requests-html.
The README says to do:

from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://python.org/')
rendered_html = r.html.render()

But session.get returns a requests.models.Response from the regular requests library, which has no html attribute. session.get should return a requests_html.HTMLResponse, which is the class that has the html property.
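To confirm what session.get actually returns, a small check like the one below may help. It is only a sketch: describe_response is a hypothetical helper, not part of requests or requests-html, and it relies on nothing beyond built-in introspection.

```python
def describe_response(resp):
    """Report the concrete class of resp and whether it exposes an html attribute.

    Hypothetical diagnostic helper; not part of requests or requests-html.
    """
    cls = type(resp)
    return {
        # Fully qualified class name, e.g. "requests.models.Response"
        "type": f"{cls.__module__}.{cls.__qualname__}",
        # True when the object has an html attribute or property
        "has_html": hasattr(resp, "html"),
    }


# Example use (assumes a session and url as in the snippet above):
# print(describe_response(session.get(url)))
```

A type of requests.models.Response together with has_html set to False would match the behaviour described here.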

I tried the following. It raises no errors, but it does not give me the HTML of the webpage. Printing response.html just shows <HTML url='https://urlhere.com'>

from requests_html import HTMLSession

url = 'https://urlhere.com'  # placeholder URL

# First make our HTMLSession
session = HTMLSession()
# Then use it to get a regular requests Response
r = session.get(url)
# Then convert our regular Response into an HTMLResponse
response = session.response_hook(r)
# Now we can access response.html
print(response.html)
html_doc = response.html.render()
print(html_doc)

I even tried using plain requests to grab the HTML, which works fine, but as soon as I pass the response to my HTMLSession, it gets scrubbed and turned into <HTML url='https://urlhere.com'>.
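One thing that may be worth ruling out, offered as an assumption rather than a diagnosis: in requests-html, printing an HTML object shows only its short repr, <HTML url=...>, while the markup itself is exposed through that object's html attribute, and render() updates the object in place instead of returning the markup. The sketch below reads the markup that way. page_source is a hypothetical helper name, and it expects a response whose html attribute behaves like the requests-html HTML object.

```python
def page_source(response, render=True):
    """Return the page markup as text from a requests-html style response.

    Hypothetical helper; assumes response.html offers render() and an
    html attribute, as the requests-html HTML object does.
    """
    doc = response.html      # printing this object shows only <HTML url=...>
    if render:
        doc.render()         # updates doc in place; its return value is not the markup
    return doc.html          # the markup itself, as a string


# Example use (assumes response comes from an HTMLSession as above):
# print(page_source(response))
```

Passing render=False skips the browser-based rendering step and returns the markup as originally fetched, which can help separate a fetching problem from a rendering problem.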
