-
Notifications
You must be signed in to change notification settings - Fork 23
Refactor e2e to generate GT #1327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea seems right overall, I would just keep in mind a few things.
-
try to avoid timeouts as much as possible to avoid slowing down the tests or having race conditions, its better to be more event driven in the tests.. playwright is smart enough to wait for the elements you're selecting to appear for example and stuff like that, see https://playwright.dev/docs/actionability
-
The test now seems too big, I think it can be split into two or three separate tests to be run sequentially or have some shared code in
beforeEachfor example (I believe playwright runs sequentially by default), so for example 1st test uploads dataset and checks that the row is rendered correctly, a second one to start ground truth generation and see that a row for the job has been added, and a third if we really want to wait for the job to run and see the result although its debatable if there's much benefit to that compared to the effort/time waiting (keep in mind if we run the test against 3 browsers, it runs 3 jobs which in my case was failing due to memory issues running the 3 jobs at the same time) -
We should be very mindful of the selectors we use in the tests as we should try to be as implementation agnostic as possible interacting with the page like a user/screen reader would, so for example things like specific class names or ids can change with the implementation, its often recommended to either use more generic elements like getting things by role/element and text, in other cases where the text itself for example can change, we can even set
data-test-idattributes on the html elements to be able to select them from the tests -
Do we need to worry about cleaning up? for example something to delete the datasets, jobs, experiments, workflows, etc.. after running the test suite?
some more details here https://playwright.dev/docs/best-practices
| await fileChooser.setFiles(sampleDatasetFilePath) | ||
| test('Launch a GT workflow with unique file and fail early on job failure', async ({ page }) => { | ||
| // Increase test timeout to 10 minutes. | ||
| test.setTimeout(600000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this really needed? having a test run for 10 minutes likely means that other tests can't be run until its finished (unless we explicitly tell playwright to run in parallel), and it would be too slow for a CI feedback, maybe its ok to rely on unit/integration tests only for the actual job stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because the default timeout is 3 mins (I think) and this test takes usually something like 7 minutes. When I wrote this I was thinking to run this in parallel to other jobs, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I think by default cypress runs the tests sequentially, although it can be configured to run in parallel, but idk even in parallel if it splits them across 4 cores or so, 7 mins per test still sounds a bit too slow, I'm not sure if its worth it.. especially considering jobs can fail for all kinds of reasons and there's alot of combination of inputs/models to test
|
|
||
| // Click the dataset row (using the unique submittedFileName). | ||
| const datasetRow = page.locator('tr').filter({ hasText: submittedFileName }).first(); | ||
| await expect(datasetRow).toBeVisible({ timeout: 5000 }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the timeout necessary here and the other toBeVisible calls? I think the default timeout/behavior should be fine
| const [jobResponse] = await Promise.all([ | ||
| page.waitForResponse( | ||
| (res) => | ||
| res.url().includes('/jobs') && | ||
| res.request().method() === 'POST' && | ||
| res.status() === 201 | ||
| ), | ||
| popupContainer.getByRole('button', { name: 'Start Generating' }).click(), | ||
| ]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't necessarily think we need to wait for both in parallel since one is a consequence of the other
| const [jobResponse] = await Promise.all([ | |
| page.waitForResponse( | |
| (res) => | |
| res.url().includes('/jobs') && | |
| res.request().method() === 'POST' && | |
| res.status() === 201 | |
| ), | |
| popupContainer.getByRole('button', { name: 'Start Generating' }).click(), | |
| ]); | |
| await popupContainer.getByRole('button', { name: 'Start Generating' }).click() | |
| const jobResponse = await page.waitForResponse( | |
| (res) => | |
| res.url().includes('/jobs') && | |
| res.request().method() === 'POST' && | |
| res.status() === 201 | |
| ) |
| const dynamicFileName = `dialogsum_mini_no_gt_${timestamp}.csv`; | ||
| let submittedFileName = dynamicFileName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no need for a different variable
What's changing
Updated the dataset upload playwright e2e test to perform a GT generation job on a non-gt dataset
Additional notes for reviewers
Anything you'd like to add to help the reviewer understand the changes you're proposing.
I already...
/docs)