Description
Describe the bug
I have a simple script that runs once a week to collect citation counts. It had always worked until last night, when it started failing with the errors detailed below. I have retried several times over several hours on multiple machines.
To Reproduce
I have two machines. The following code fails with different errors on the different machines.
from scholarly import scholarly
query = scholarly.search_author('james watson')
author = scholarly.fill(next(query), ['publications'])
Error on machine 1 (ubuntu, python 3.9, scholarly 1.7.5):
Traceback (most recent call last):
File "/home/g/gb/gboeing/apps/citations/app/citations.py", line 15, in <module>
author = scholarly.fill(next(query), ['publications'])
File "/home/g/gb/gboeing/apps/citations/lib/python3.9/site-packages/scholarly/_navigator.py", line 237, in search_authors
soup = self._get_soup(url)
File "/home/g/gb/gboeing/apps/citations/lib/python3.9/site-packages/scholarly/_navigator.py", line 226, in _get_soup
html = self._get_page('https://scholar.google.com{0}'.format(url))
File "/home/g/gb/gboeing/apps/citations/lib/python3.9/site-packages/scholarly/_navigator.py", line 175, in _get_page
return self._get_page(pagerequest, True)
File "/home/g/gb/gboeing/apps/citations/lib/python3.9/site-packages/scholarly/_navigator.py", line 177, in _get_page
raise MaxTriesExceededException("Cannot Fetch from Google Scholar.")
scholarly._proxy_generator.MaxTriesExceededException: Cannot Fetch from Google Scholar.
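Since `MaxTriesExceededException` is raised only after scholarly's own retries give up, a scheduled job might wrap the call in a slower outer retry to tell a transient block from a persistent one. The following is a minimal sketch under that assumption; `with_retries` and its parameters are hypothetical names for illustration and are not part of scholarly, and it would not fix the underlying failure.

```python
import time


def with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times with exponential backoff.

    Hypothetical helper, not part of scholarly. `sleep` is injectable
    so the delays can be observed or skipped in tests.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as exc:  # e.g. MaxTriesExceededException
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

In the script above, the call could then be written as `with_retries(lambda: scholarly.fill(next(query), ['publications']), base_delay=600)`, where the ten-minute base delay is an arbitrary choice.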
Error on machine 2 (ubuntu, python 3.11, scholarly 1.7.5):
Traceback (most recent call last):
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 139, in load
browsers_dict[browser_name] = get_browser_user_agents(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 123, in get_browser_user_agents
raise FakeUserAgentError(
fake_useragent.errors.FakeUserAgentError: No browser user-agent strings found for browser: chrome
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 1037, in _send_output
self.send(msg)
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 975, in send
self.connect()
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 1447, in connect
super().connect()
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/http/client.py", line 941, in connect
self.sock = self._create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/socket.py", line 850, in create_connection
raise exceptions[0]
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/socket.py", line 835, in create_connection
sock.connect(sa)
TimeoutError: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 64, in get
urlopen(
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 519, in open
response = self._open(req, data)
^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 496, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error timed out>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/geoff/Dropbox/Documents/School/Projects/Code/citations/citations.py", line 1, in <module>
from scholarly import scholarly
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/__init__.py", line 4, in <module>
scholarly = _Scholarly()
^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/_scholarly.py", line 34, in __init__
self.__nav = Navigator()
^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/_navigator.py", line 26, in __call__
cls._instances[cls] = super(Singleton, cls).__call__(*args,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/_navigator.py", line 42, in __init__
self.pm1 = ProxyGenerator()
^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/_proxy_generator.py", line 54, in __init__
self._new_session()
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/scholarly/_proxy_generator.py", line 454, in _new_session
'User-Agent': UserAgent().random,
^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/fake.py", line 64, in __init__
self.load()
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/fake.py", line 70, in load
self.data_browsers = load_cached(
^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 209, in load_cached
update(path, browsers, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 203, in update
path, load(browsers, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 154, in load
jsonLines = get(
^^^^
File "/home/geoff/mambaforge/envs/citations/lib/python3.11/site-packages/fake_useragent/utils.py", line 87, in get
raise FakeUserAgentError("Maximum amount of retries reached")
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
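Note that on machine 2 the failure happens at import time (line 1 of the script), before any search is issued: the traceback shows `UserAgent()` being constructed inside `_new_session` and timing out while loading its data. To check this in isolation, a small helper can attempt the import and summarize the exception chain. This is only a diagnostic sketch; `try_import` is a hypothetical name, and it uses nothing beyond the standard library.

```python
import importlib


def try_import(module_name):
    """Try to import a module by name.

    Returns (True, []) on success, or (False, chain) on failure, where
    chain lists exception class names from the outermost exception down
    through its chained causes/contexts.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        chain = []
        seen = set()
        current = exc
        # Follow explicit causes first, then implicit contexts.
        while current is not None and id(current) not in seen:
            seen.add(id(current))
            chain.append(type(current).__name__)
            current = current.__cause__ or current.__context__
        return False, chain
    return True, []
```

Calling `try_import("scholarly")` and `try_import("fake_useragent")` separately would show whether importing the dependency alone succeeds. Judging by the traceback above, the chain on machine 2 would presumably start with `FakeUserAgentError` and include `URLError` and `TimeoutError`.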
Expected behavior
I expected the code to succeed without error, like it used to.
Screenshots
n/a
Desktop (please complete the following information):
(see my platform and version details above in reproduction section)
Do you plan on contributing?
Your response below will clarify whether the maintainers can expect you to fix the bug you reported.
- Yes, I will create a Pull Request with the bugfix.
Additional context
n/a