Skip to content

Conversation

sckott
Copy link
Collaborator

@sckott sckott commented Apr 10, 2025

#83
#139

  • Rework global submit job fixture to separate job submission per test. Removed the submit_wdls fixture - for those tests that used it, new utility fxn submit_wdl to submit jobs within each test. These tests now can use @pytest.mark.parametrize to parametrize over paths to WDL dirs.
  • There's a bunch of new yml cassette files from the above change - PLEASE IGNORE THESE!
  • Using python library pytest-xdist now to parallelize tests, especially important now with more distinct tests to run. We use it by having it in our pyproject.toml, and you can use it with pytest like -n auto
  • The downside of replacing a global submit_wdls fixture with each test submitting their own jobs is that an entire test suite run takes longer now - the last full run i tried on gh actions took 45 min then failed, so may need to seek out ways to speed this up. However, the other change here should help. That is:
  • Added a new GH Action api-tests-changed.yml which will look for changed WDL dirs and run its tests only across all tests leveraging the -k flag in pytest which look for all tests that use that WDL dir, even accounting for parametrized tests
  • The above approach can also be used locally so folks can tweak a WDL repo, run just its tests using the -k flag and not have to run the entire test suite

TODO:

  • After merging we'll need to check the new gh action with some changes to WDL repos

@sckott sckott added the infrastructure Infrastructure fix to execute WDL GitHub Actions label Apr 10, 2025
@sckott sckott requested review from seankross and tefirman April 10, 2025 21:28
@seankross
Copy link
Collaborator

Looking extremely reasonable so far.

@sckott
Copy link
Collaborator Author

sckott commented Apr 15, 2025

I'm trying this out by running the rewrite cassettes workflow with this branch

  • i'm also getting retry failures when running tests in parallel. I'm trying right now to see if it changes when we do not use parallel (it does change - we get back to our failures due to the globbing thing) thinking that perhaps we're running out of connections and either we need to use fewer simultaneous connections or just not use parallel. these errors are masking other errors all set now - just can't have too many workers
  • Now trying fewer workers than given by auto to see what the optimal number is
    • Last try was with 24 workers. and that took 45 min which was I think the fastest we're going to get. Tried 48 workers, and that gave errors again, so number of workers should be 24 or maybe a bit higher, let's just go with 24 for now.

Copy link
Collaborator

@tefirman tefirman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really like this approach of spelling out which WDL's failed during test-successes.py, test-failures.py, etc., think it'll help a ton when reading GHA check failures.

Also a huge fan of the individual cassette approach for each WDL, much more isolated changes when updating one of the unit tests. api-tests-changed.yml looks good, just one small note about -n auto, definitely want to give it a test run once it's merged.

My only other suggestion is to update the function docstrings accordingly, specifically the input argument descriptions. I'm pretty sure I understand what each of them do now, but just don't want to confuse ourselves in the future.

@sckott
Copy link
Collaborator Author

sckott commented Apr 16, 2025

For the failing test check, i'm not sure if it's all due to #160 and the globbing issue or not. 🤷🏽

@sckott
Copy link
Collaborator Author

sckott commented Apr 16, 2025

Thanks for looking @tefirman - Made changes. Yeah I think we'll benefit from seeing what actual WDLs are failing rather than having to manually go in and find which one is failing.

@tefirman
Copy link
Collaborator

For the failing test check, i'm not sure if it's all due to #160 and the globbing issue or not. 🤷🏽

I'm seeing a lot of "connection refused" errors for the validation tests (helloHostname is a good example), feels unrelated...

@seankross
Copy link
Collaborator

Given the changes on dev, what do we need to do here to make sure the tests pass appropriately?

@sckott
Copy link
Collaborator Author

sckott commented Apr 17, 2025

@seankross I think we'd just need to do a rewrite of the cassettes locally then push up on this branch because the test thats failing is doing a cached run, i'll do that

@seankross seankross added this to the v0.2 test infrastructure milestone Apr 17, 2025
@tefirman
Copy link
Collaborator

I think we'd just need to do a rewrite of the cassettes locally then push up on this branch because the test thats failing is doing a cached run, i'll do that

Ahhhh, that makes total sense. I was going to say, I ran things locally and things seemed to be working fine on my end. Glad to know it's just a re-run situation.

@sckott
Copy link
Collaborator Author

sckott commented Apr 17, 2025

@seankross @tefirman okay, rewrite locally done, and running now, hopefully we get green

@sckott
Copy link
Collaborator Author

sckott commented Apr 17, 2025

we're all set on the cached api tests now https://github.com/FredHutch/wdl-unit-tests/actions/runs/14521232867/job/40742413310?pr=161

but I see i need to fiddle with the api tests changed yml now just a sec

@sckott
Copy link
Collaborator Author

sckott commented Apr 17, 2025

@seankross @tefirman okay, we're all green on tests, good to merge?

@sckott
Copy link
Collaborator Author

sckott commented Apr 17, 2025

i thnk i need one more approval @tefirman

@seankross seankross requested review from tefirman and seankross April 17, 2025 17:36
Copy link
Collaborator

@tefirman tefirman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sckott !

@sckott sckott merged commit 9a7580e into main Apr 17, 2025
45 checks passed
@tefirman tefirman mentioned this pull request Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Infrastructure fix to execute WDL GitHub Actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants