Skip to content

Conversation

agpituk
Copy link
Member

@agpituk agpituk commented Apr 16, 2025

What's changing

Updated the dataset upload playwright e2e test to perform a GT generation job on a non-gt dataset

Additional notes for reviewers

Anything you'd like to add to help the reviewer understand the changes you're proposing.

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • [N/A] Added some tests for any new functionality
  • [N/A ] Updated the documentation (both comments in code and product documentation under /docs)
  • [ N/A] Checked if a (backend) DB migration step was required and included it if required

@agpituk agpituk requested a review from khaledosman April 16, 2025 15:59
Copy link
Contributor

@khaledosman khaledosman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea seems right overall, I would just keep in mind a few things.

  1. try to avoid timeouts as much as possible to avoid slowing down the tests or having race conditions, its better to be more event driven in the tests.. playwright is smart enough to wait for the elements you're selecting to appear for example and stuff like that, see https://playwright.dev/docs/actionability

  2. The test now seems too big, I think it can be split into two or three separate tests to be run sequentially or have some shared code in beforeEach for example (I believe playwright runs sequentially by default), so for example 1st test uploads dataset and checks that the row is rendered correctly, a second one to start ground truth generation and see that a row for the job has been added, and a third if we really want to wait for the job to run and see the result although its debatable if there's much benefit to that compared to the effort/time waiting (keep in mind if we run the test against 3 browsers, it runs 3 jobs which in my case was failing due to memory issues running the 3 jobs at the same time)

  3. We should be very mindful of the selectors we use in the tests as we should try to be as implementation agnostic as possible interacting with the page like a user/screen reader would, so for example things like specific class names or ids can change with the implementation, its often recommended to either use more generic elements like getting things by role/element and text, in other cases where the text itself for example can change, we can even set data-test-id attributes on the html elements to be able to select them from the tests

  4. Do we need to worry about cleaning up? for example something to delete the datasets, jobs, experiments, workflows, etc.. after running the test suite?

some more details here https://playwright.dev/docs/best-practices

await fileChooser.setFiles(sampleDatasetFilePath)
test('Launch a GT workflow with unique file and fail early on job failure', async ({ page }) => {
// Increase test timeout to 10 minutes.
test.setTimeout(600000);
Copy link
Contributor

@khaledosman khaledosman Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really needed? having a test run for 10 minutes likely means that other tests can't be run until its finished (unless we explicitly tell playwright to run in parallel), and it would be too slow for a CI feedback, maybe its ok to rely on unit/integration tests only for the actual job stuff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this because the default timeout is 3 mins (I think) and this test takes usually something like 7 minutes. When I wrote this I was thinking to run this in parallel to other jobs, what do you think?

Copy link
Contributor

@khaledosman khaledosman Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I think by default cypress runs the tests sequentially, although it can be configured to run in parallel, but idk even in parallel if it splits them across 4 cores or so, 7 mins per test still sounds a bit too slow, I'm not sure if its worth it.. especially considering jobs can fail for all kinds of reasons and there's alot of combination of inputs/models to test


// Click the dataset row (using the unique submittedFileName).
const datasetRow = page.locator('tr').filter({ hasText: submittedFileName }).first();
await expect(datasetRow).toBeVisible({ timeout: 5000 });
Copy link
Contributor

@khaledosman khaledosman Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the timeout necessary here and the other toBeVisible calls? I think the default timeout/behavior should be fine

Comment on lines +81 to +89
const [jobResponse] = await Promise.all([
page.waitForResponse(
(res) =>
res.url().includes('/jobs') &&
res.request().method() === 'POST' &&
res.status() === 201
),
popupContainer.getByRole('button', { name: 'Start Generating' }).click(),
]);
Copy link
Contributor

@khaledosman khaledosman Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't necessarily think we need to wait for both in parallel since one is a consequence of the other

Suggested change
const [jobResponse] = await Promise.all([
page.waitForResponse(
(res) =>
res.url().includes('/jobs') &&
res.request().method() === 'POST' &&
res.status() === 201
),
popupContainer.getByRole('button', { name: 'Start Generating' }).click(),
]);
await popupContainer.getByRole('button', { name: 'Start Generating' }).click()
const jobResponse = await page.waitForResponse(
(res) =>
res.url().includes('/jobs') &&
res.request().method() === 'POST' &&
res.status() === 201
)

Comment on lines +15 to +16
const dynamicFileName = `dialogsum_mini_no_gt_${timestamp}.csv`;
let submittedFileName = dynamicFileName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no need for a different variable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants