fix: replaced taskcluster rest api calls with package #741
(force-pushed from ec75ed6 to 40c0435)
This is shaping up nicely, thanks!
I have some changes to request. Once you put up a new revision I'll pull this down locally and see if it causes any differences in the graph (both here and in Gecko).
(force-pushed from 40c0435 to 80451af)
This reverts commit cfb77a0.
Thanks! I'll take a closer look at this tomorrow. I'll try and test it against …
Thanks, I tested this patch both in Taskgraph and the Gecko repo and it looks like it's working as expected!

But `get_retry_post_session`, `_do_request` and `_handle_artifact` are all unused now in `taskcluster.py`. Can you remove these and associated tests please?

Also I think I changed my mind about leaving … We can do a major version bump for this, no big deal.
(force-pushed from 826551a to 2e7e853)
Thank you!

This is mostly cleanup, but there are two issues I neglected to catch in my first review:
- The `_handle_artifact` for yaml
- The `.capitalize()` issue
    response.raw.read = functools.partial(response.raw.read, decode_content=True)
    return response.raw


def get_artifact_url(task_id, path, use_proxy=False):
I think we should leave this for a follow-up issue, but you can actually get artifact urls and the like from the Taskcluster client as well, using something like:
`queue.buildUrl("getArtifact", task_id, path)`

If we used this in `get_artifact_url`, `get_index_url` and `get_task_url`, I think we could get rid of the `get_root_url` method as well as the dependency on `taskcluster_urls`.
But as I mentioned, this is close to landing, so let's get this landed and maybe consider this change later.
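A rough sketch of what that follow-up could look like, based on the `queue.buildUrl` call the reviewer mentions. The `FakeQueue` stub below is a hypothetical stand-in for `taskcluster.Queue` so the wiring is runnable here; the real client's route expansion will differ.

```python
def get_artifact_url(queue, task_id, path):
    """Build the URL for a task artifact via the queue client's buildUrl."""
    return queue.buildUrl("getArtifact", task_id, path)


class FakeQueue:
    """Minimal stand-in for taskcluster.Queue, for illustration only."""

    root_url = "https://tc.example.com"

    def buildUrl(self, method, *args):
        # The real client expands the API route template; this stub just
        # joins the arguments onto a plausible-looking path.
        return f"{self.root_url}/api/queue/v1/{method}/" + "/".join(args)


url = get_artifact_url(FakeQueue(), "abc123", "public/build/target.zip")
```

With the real client, the same `get_artifact_url(queue, ...)` call shape would let the URL helpers share one `queue` instance instead of rebuilding URLs from `get_root_url`.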
    return _handle_artifact(path, response)
    queue = get_taskcluster_client("queue")
    response = queue.getArtifact(task_id, path)
    return response
This could cause some problems: if I'm reading this right, we used to load yaml artifacts and now we don't. I think the taskcluster client will handle json for us, but not yaml.

So maybe we can add the `_handle_artifact` logic back in? Sorry, I said to remove it, but I was focused on the fact that it was unused and didn't notice the regression.
            break
    tasks = response.get("tasks", [])
    for task in tasks:
        results.append(task)
This loop doesn't seem to do anything
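If the accumulation is wanted at all, the element-by-element loop collapses to a single `list.extend` call. A minimal sketch with sample data shaped like the queue response:

```python
# Sample response shaped like the listTaskGroup payload above (assumption).
response = {"tasks": [{"taskId": "a"}, {"taskId": "b"}]}

# Original shape: copy each task into results one at a time.
results = []
for task in response.get("tasks", []):
    results.append(task)

# Equivalent in one call.
results2 = []
results2.extend(response.get("tasks", []))
```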
    if "taskId" in t:
        task_id = t.get("taskId")
        if task_id:
            task_ids.append(task_id)
There's some superfluous existence checks here. You could re-write this as:
for t in results:
if task_id := t.get("taskId"):
task_ids.append(task_id)
(look up the "walrus operator" if that confuses you)
But this can be further simplified using a list comprehension:
task_ids = [t["taskId"] for t in results if "taskId" in t]
    namespace = task.get("namespace")  # type: ignore
    task_id = task.get("taskId")  # type: ignore
    if namespace and task_id:
        task_ids[namespace] = task_id
This can be re-written using a dictionary comprehension:
task_ids = {
t["namespace"]: t["taskId"]
for t in tasks
if "namespace" in t and "taskId" in t
}
    else:
        options = taskcluster.optionsFromEnvironment()

    return getattr(taskcluster, service.capitalize())(options)
I realized the use of `capitalize` here is going to break the `purge_cache` function (and also `WorkerManager`, but we don't use that here). E.g.:

>>> "purgeCache".capitalize()
'Purgecache'

I know you copied this code from something I wrote, so my bad. Maybe instead of using `.capitalize()` we can do something like `service[0].upper() + service[1:]`.
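The difference is easy to see side by side: `str.capitalize` lowercases everything after the first character, which mangles camelCase service names, while upper-casing only the first character preserves the rest.

```python
service = "purgeCache"

# str.capitalize lowercases the tail, producing a name that does not
# match any taskcluster client class.
broken = service.capitalize()

# Upper-case only the first character, keeping the camelCase tail intact.
fixed = service[0].upper() + service[1:]
```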
    """Requests a cache purge from the purge-caches service."""
    if testing:
        logger.info(f"Would have purged {provisioner_id}/{worker_type}/{cache_name}.")
    else:
        logger.info(f"Purging {provisioner_id}/{worker_type}/{cache_name}.")
        purge_cache_url = get_purge_cache_url(provisioner_id, worker_type, use_proxy)
        _do_request(purge_cache_url, json={"cacheName": cache_name})
        purge_cache_client = get_taskcluster_client("purgeCache")
(see my note about this above)
    tasks = response.get("tasks", [])
    for task in tasks:
        if "status" in task:  # type: ignore
            status = task["status"]  # type: ignore
Good opportunity to try out the walrus operator ;)
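The walrus operator folds the membership test and the lookup into one step. A sketch with sample data shaped like the queue response in the diff above (the shape is an assumption for illustration):

```python
tasks = [{"status": {"state": "completed"}}, {"other": 1}]

states = []
for task in tasks:
    # := binds the looked-up value and tests its truthiness in one expression,
    # so tasks without a "status" key are skipped.
    if status := task.get("status"):
        states.append(status["state"])
```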
continue

raise e
continue
Are you sure the taskcluster client would return `None` on a 404? I would have thought it would still raise an exception, though it would probably be from aiohttp rather than requests.
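If the client does raise, the 404 case would need to be caught explicitly. A sketch of that pattern; `TaskclusterRestFailure` and its `status_code` attribute are hypothetical stand-ins here for whatever exception the taskcluster package actually raises, and `MissingQueue` is a stub so the sketch is runnable.

```python
class TaskclusterRestFailure(Exception):
    """Hypothetical stand-in for the taskcluster client's REST error."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def find_task(queue, index_path):
    try:
        return queue.findTask(index_path)
    except TaskclusterRestFailure as e:
        if e.status_code == 404:
            return None  # treat a missing index path as "not found"
        raise  # any other failure propagates


class MissingQueue:
    """Stub queue whose findTask always 404s, for illustration."""

    def findTask(self, index_path):
        raise TaskclusterRestFailure(404)


result = find_task(MissingQueue(), "some.index.path")
```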
@@ -57,29 +58,31 @@ def params():
        ),
    ),
)
def test_index_search(caplog, responses, params, state, expires, expected, logs):
This is fine, but FYI there is an aioresponses mock library designed to be used with aiohttp that we could make use of in a similar way.
Fixes #221. Created a bunch of helpers to use this package in `taskcluster.py`. I also didn't completely deprecate the URL creation, as that is still used; I bypassed it and replaced all uses where a GET request is made.