HeaderReader class to replace read_headers #25

Merged
merged 9 commits into from
Apr 29, 2017
107 changes: 65 additions & 42 deletions cheroot/server.py
@@ -131,53 +131,71 @@ def bton(b, encoding='ISO-8859-1'):
 logging.statistics = {}


-def read_headers(rfile, hdict=None):
-    """Read headers from the given stream into the given header dict.
+class HeaderReader(object):
Member

Please add a docstring here.


-    If hdict is None, a new header dict is created. Returns the populated
-    header dict.
+    def __call__(self, rfile, hdict=None):
+        """
+        Read headers from the given stream into the given header dict.

-    Headers which are repeated are folded together using a comma if their
-    specification so dictates.
+        If hdict is None, a new header dict is created. Returns the populated
+        header dict.

-    This function raises ValueError when the read bytes violate the HTTP spec.
-    You should probably return "400 Bad Request" if this happens.
-    """
-    if hdict is None:
-        hdict = {}
+        Headers which are repeated are folded together using a comma if their
+        specification so dictates.

-    while True:
-        line = rfile.readline()
-        if not line:
-            # No more data--illegal end of headers
-            raise ValueError('Illegal end of headers.')

-        if line == CRLF:
-            # Normal end of headers
-            break
-        if not line.endswith(CRLF):
-            raise ValueError('HTTP requires CRLF terminators')

-        if line[0] in (SPACE, TAB):
-            # It's a continuation line.
-            v = line.strip()
-        else:
-            try:
-                k, v = line.split(COLON, 1)
-            except ValueError:
-                raise ValueError('Illegal header line.')
-            # TODO: what about TE and WWW-Authenticate?
-            k = k.strip().title()
-            v = v.strip()
-            hname = k
+        This function raises ValueError when the read bytes violate the HTTP spec.
+        You should probably return "400 Bad Request" if this happens.
+        """
+        if hdict is None:
+            hdict = {}

+        while True:
+            line = rfile.readline()
+            if not line:
+                # No more data--illegal end of headers
+                raise ValueError('Illegal end of headers.')

+            if line == CRLF:
+                # Normal end of headers
+                break
+            if not line.endswith(CRLF):
+                raise ValueError('HTTP requires CRLF terminators')

+            if line[0] in (SPACE, TAB):
+                # It's a continuation line.
+                v = line.strip()
+            else:
+                try:
+                    k, v = line.split(COLON, 1)
+                except ValueError:
+                    raise ValueError('Illegal header line.')
+                v = v.strip()
+                k = self._transform_key(k)
+                hname = k
Member

Why not assign hname = self._transform_key(k) and use just that variable name? Do we need two variables for the same purpose?

Member Author

Maybe, although in loopy/branchy code like this, I've decided to limit my changes to the refactoring in order to avoid potential issues with other optimizations. Since the previous code set both k and hname, I decided to retain that. We can optimize that away later if appropriate.


+            if not self._allow_header(k):
Member

It looks like you've lost one level of indentation starting here and continuing through line 183.

Member

Also, what if one wants to run a check before transformation?

Member Author

If I add a level of indentation, that would put all of that code within the else block. That code wasn't in the else block before. Why should it be now?

Member Author

If one wants to run a check before the transformation, they will either need to override _transform_key to perform the check there, or they should describe why that feature would be useful and add an appropriate hook to enable it.

Member

@jaraco oh.. I assumed it was.

The k variable is created in the else block, meaning we'll get a NameError when the if branch runs, since it references a variable before assignment. Am I overlooking something?

Member Author

Yes, the code is complex. The other branch (the if block) handles a 'continuation line', which assumes k was set on a previous line (a previous iteration of the while loop). So yes, you would get a NameError if the first line started with a space or tab, but apparently that's not an issue in practice, as the original code used the same logic.

Member

got it, thanks :)
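The edge case discussed in this thread can be reproduced with a stripped-down loop. This is only a sketch, not cheroot's code: `read_first_header` is a hypothetical name, the comma-folding and hook calls are omitted, and it compares `line[:1]` rather than `line[0]` so the check works on Python 3 bytes. A first line that starts with a space takes the continuation branch before `k` has ever been assigned, which raises `UnboundLocalError`, a subclass of `NameError`.

```python
import io

CRLF = b'\r\n'


def read_first_header(rfile):
    """Simplified stand-in for the loop under review (hypothetical helper)."""
    hdict = {}
    while True:
        line = rfile.readline()
        if not line:
            raise ValueError('Illegal end of headers.')
        if line == CRLF:
            break
        if line[:1] in (b' ', b'\t'):
            # Continuation line: relies on k from a previous iteration.
            v = line.strip()
        else:
            k, v = line.split(b':', 1)
            k = k.strip().title()
            v = v.strip()
        # If the very first line was a continuation, k is still unbound here.
        hdict[k] = v
    return hdict


# A well-formed header block parses normally.
print(read_first_header(io.BytesIO(b'Host: example.com\r\n\r\n')))

# A leading continuation line hits the unassigned k.
try:
    read_first_header(io.BytesIO(b' oops\r\n\r\n'))
except NameError as exc:
    print(type(exc).__name__)
```

Because the caller in this PR only catches `ValueError`, an exception of this kind would not be turned into a "400 Bad Request" response; whether that matters in practice is the open question in the thread.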

+                continue

-        if k in comma_separated_headers:
-            existing = hdict.get(hname)
-            if existing:
-                v = b', '.join((existing, v))
-        hdict[hname] = v
+            if k in comma_separated_headers:
+                existing = hdict.get(hname)
+                if existing:
+                    v = b', '.join((existing, v))
+            hdict[hname] = v

-    return hdict
+        return hdict

+    def _allow_header(self, key_name):
+        return key_name
Member

return True?

Member Author

Hmm. Perhaps. Unless we want to filter out empty headers. I guess the previous implementation allowed for empty headers, so this refactor should as well.
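The difference between the two return values comes down to truthiness. A minimal sketch, using two hypothetical stand-in classes and a hypothetical `kept` helper rather than the real `HeaderReader`: returning `key_name` silently filters out an empty header name, while returning `True` keeps it, which matches the pre-refactor behaviour described above.

```python
class KeyReturningReader(object):
    """Stand-in for the version in this diff: returns the key itself."""

    def _allow_header(self, key_name):
        return key_name


class AlwaysAllowReader(object):
    """Stand-in for the suggested alternative: always returns True."""

    def _allow_header(self, key_name):
        return True


def kept(reader, key_name):
    # Mirrors the `if not self._allow_header(k): continue` check in the loop.
    return bool(reader._allow_header(key_name))


for reader in (KeyReturningReader(), AlwaysAllowReader()):
    print(type(reader).__name__, kept(reader, b'Host'), kept(reader, b''))
```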


+    def _transform_key(self, key_name):
+        # TODO: what about TE and WWW-Authenticate?
+        return key_name.strip().title()


+class DropUnderscoreHeaderReader(HeaderReader):
+    def _allow_header(self, key_name):
+        orig = super(DropUnderscoreHeaderReader, self)._allow_header(key_name)
+        return orig and '_' not in key_name


 class SizeCheckWrapper(object):
@@ -479,6 +497,11 @@ class HTTPRequest(object):

     This value is set automatically inside send_headers."""

+    header_reader = HeaderReader()
+    """
+    A HeaderReader instance or compatible reader.
+    """

     def __init__(self, server, conn):
         self.server = server
         self.conn = conn
@@ -637,7 +660,7 @@ def read_request_headers(self):
         """Read self.rfile into self.inheaders. Return success."""
         # then all the http headers
         try:
-            read_headers(self.rfile, self.inheaders)
+            self.header_reader(self.rfile, self.inheaders)
         except ValueError as ex:
             self.simple_response('400 Bad Request', ex.args[0])
             return False
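To see how the pieces of this change fit together, here is a self-contained sketch that condenses the reader from the diff and plugs a subclass into a request-like object. Several parts are assumptions made for illustration: `COMMA_SEPARATED` is a tiny hypothetical stand-in for cheroot's `comma_separated_headers`, `FakeRequest` is a hypothetical substitute for `HTTPRequest`, `_allow_header` returns `True` as suggested in review, and the sketch uses bytes literals (`b'_'`, `line[:1]`) so it runs on Python 3.

```python
import io

CRLF = b'\r\n'
# Hypothetical subset; cheroot defines its own comma_separated_headers.
COMMA_SEPARATED = {b'Accept'}


class HeaderReader(object):
    """Condensed stand-in for the class added in this diff."""

    def __call__(self, rfile, hdict=None):
        if hdict is None:
            hdict = {}
        while True:
            line = rfile.readline()
            if not line:
                raise ValueError('Illegal end of headers.')
            if line == CRLF:
                break
            if not line.endswith(CRLF):
                raise ValueError('HTTP requires CRLF terminators')
            if line[:1] in (b' ', b'\t'):
                v = line.strip()  # continuation of the previous header
            else:
                try:
                    k, v = line.split(b':', 1)
                except ValueError:
                    raise ValueError('Illegal header line.')
                v = v.strip()
                k = self._transform_key(k)
                hname = k
            if not self._allow_header(k):
                continue
            if k in COMMA_SEPARATED:
                existing = hdict.get(hname)
                if existing:
                    v = b', '.join((existing, v))
            hdict[hname] = v
        return hdict

    def _allow_header(self, key_name):
        return True

    def _transform_key(self, key_name):
        return key_name.strip().title()


class DropUnderscoreHeaderReader(HeaderReader):
    def _allow_header(self, key_name):
        orig = super(DropUnderscoreHeaderReader, self)._allow_header(key_name)
        return orig and b'_' not in key_name


class FakeRequest(object):
    """Hypothetical request object; only mimics the header_reader hook."""

    header_reader = DropUnderscoreHeaderReader()

    def __init__(self, raw):
        self.rfile = io.BytesIO(raw)
        self.inheaders = {}

    def read_request_headers(self):
        try:
            self.header_reader(self.rfile, self.inheaders)
        except ValueError:
            return False
        return True


req = FakeRequest(b'Host: a\r\nX_Secret: 1\r\nAccept: x\r\nAccept: y\r\n\r\n')
print(req.read_request_headers(), req.inheaders)
```

Storing the reader as a class attribute means a subclass, or a single instance, can swap in a different reader without touching `read_request_headers`, which appears to be the motivation for replacing the module-level function with a callable object.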