Remote images: download of URIs containing port number details on Windows fails. #12100

@jayaddison

Sourced from #12095 (comment):

The Windows test output for this change appears to have uncovered an unrelated problem: handling http://localhost:7777/sphinx.png raises a WinError exception somewhere on that platform while it does not on Linux.

WARNING: Could not fetch remote image: http://localhost:7777/sphinx.png [[WinError 267] The directory name is invalid: 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\...\\images\\_build\\doctrees\\images\\http/localhost:7777']

My best guess is that the problem is somewhere in this area of the code (it has quite a large try...except block):

try:
    basename = os.path.basename(node['uri'])
    if '?' in basename:
        basename = basename.split('?')[0]
    if basename == '' or len(basename) > MAX_FILENAME_LEN:
        filename, ext = os.path.splitext(node['uri'])
        basename = sha1(filename.encode(), usedforsecurity=False).hexdigest() + ext
    basename = CRITICAL_PATH_CHAR_RE.sub("_", basename)
    dirname = node['uri'].replace('://', '/').translate({ord("?"): "/",
                                                         ord("&"): "/"})
    if len(dirname) > MAX_FILENAME_LEN:
        dirname = sha1(dirname.encode(), usedforsecurity=False).hexdigest()
    ensuredir(os.path.join(self.imagedir, dirname))
    path = os.path.join(self.imagedir, dirname, basename)
    headers = {}
    if os.path.exists(path):
        timestamp: float = ceil(os.stat(path).st_mtime)
        headers['If-Modified-Since'] = epoch_to_rfc1123(timestamp)
    config = self.app.config
    r = requests.get(
        node['uri'], headers=headers,
        _user_agent=config.user_agent,
        _tls_info=(config.tls_verify, config.tls_cacerts),
    )
    if r.status_code >= 400:
        logger.warning(__('Could not fetch remote image: %s [%d]') %
                       (node['uri'], r.status_code))
    else:
        self.app.env.original_image_uri[path] = node['uri']
        if r.status_code == 200:
            with open(path, 'wb') as f:
                f.write(r.content)
        last_modified = r.headers.get('last-modified')
        if last_modified:
            timestamp = rfc1123_to_epoch(last_modified)
            os.utime(path, (timestamp, timestamp))
        mimetype = guess_mimetype(path, default='*')
        if mimetype != '*' and os.path.splitext(basename)[1] == '':
            # append a suffix if URI does not contain suffix
            ext = get_image_extension(mimetype)
            newpath = os.path.join(self.imagedir, dirname, basename + ext)
            os.replace(path, newpath)
            self.app.env.original_image_uri.pop(path)
            self.app.env.original_image_uri[newpath] = node['uri']
            path = newpath
        node['candidates'].pop('?')
        node['candidates'][mimetype] = path
        node['uri'] = path
        self.app.env.images.add_file(self.env.docname, path)
except Exception as exc:
    logger.warning(__('Could not fetch remote image: %s [%s]') % (node['uri'], exc))

These lines initially seem like a possibility:

ensuredir(os.path.join(self.imagedir, dirname))
path = os.path.join(self.imagedir, dirname, basename)
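
To illustrate why those lines look suspicious, here is a minimal sketch that repeats the dirname derivation from the excerpt for the failing URI. It shows that the port separator `:` survives into the directory name, and `:` is not permitted inside a file or directory name on Windows. The names `derive_dirname`, `sanitize_dirname` and `WINDOWS_RESERVED_CHARS_RE` are hypothetical and exist only for this illustration; they are not part of Sphinx, and the replacement shown is just one possible approach, not a confirmed fix.

```python
import re

# Hypothetical pattern: characters that Windows rejects inside a single
# file or directory name. Path separators are deliberately left out so
# that the directory structure is preserved.
WINDOWS_RESERVED_CHARS_RE = re.compile(r'[<>:"|?*]')


def derive_dirname(uri: str) -> str:
    """Repeat the dirname derivation from the excerpt (length check omitted)."""
    return uri.replace('://', '/').translate({ord("?"): "/", ord("&"): "/"})


def sanitize_dirname(dirname: str) -> str:
    """Hypothetical helper: replace reserved characters with an underscore."""
    return WINDOWS_RESERVED_CHARS_RE.sub("_", dirname)


uri = 'http://localhost:7777/sphinx.png'
dirname = derive_dirname(uri)
print(dirname)                    # http/localhost:7777/sphinx.png
print(sanitize_dirname(dirname))  # http/localhost_7777/sphinx.png
```

The first printed value contains the same `http/localhost:7777` component that appears in the WinError 267 message, which is consistent with the failure coming from the `ensuredir(...)` call rather than from the download itself.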

Originally posted by @jayaddison in #12095 (comment)
