Refactor e2e to generate GT #1327

agpituk · 2025-04-16T15:59:32Z

What's changing

Updated the dataset upload Playwright e2e test to run a ground truth (GT) generation job on a dataset that has no ground truth.

Additional notes for reviewers

Anything you'd like to add to help the reviewer understand the changes you're proposing.

I already...

Tested the changes in a working environment to ensure they work as expected
[N/A] Added some tests for any new functionality
[N/A] Updated the documentation (both comments in code and product documentation under /docs)
[N/A] Checked if a (backend) DB migration step was required and included it if required

khaledosman

The idea seems right overall; I would just keep a few things in mind.

Try to avoid timeouts as much as possible, so the tests don't slow down or run into race conditions; it's better for the tests to be event driven. Playwright is smart enough to wait for the elements you're selecting to appear, for example; see https://playwright.dev/docs/actionability
The test now seems too big. I think it can be split into two or three separate tests that run sequentially, or that share some code in beforeEach (I believe Playwright runs sequentially by default). For example, a first test uploads the dataset and checks that the row is rendered correctly, a second starts ground truth generation and checks that a row for the job has been added, and a third, if we really want it, waits for the job to run and checks the result. It's debatable whether that last one brings much benefit compared to the effort and time spent waiting (keep in mind that if we run the test against 3 browsers, it runs 3 jobs, which in my case was failing due to memory issues from running the 3 jobs at the same time).
We should be very mindful of the selectors we use in the tests. We should try to be as implementation-agnostic as possible, interacting with the page like a user or a screen reader would. Specific class names or ids can change with the implementation, so it's often recommended to use more generic selectors, like getting things by role/element and text. In cases where the text itself can change, we can even set data-test-id attributes on the HTML elements so that we can select them from the tests.
Do we need to worry about cleaning up? For example, something to delete the datasets, jobs, experiments, workflows, etc. after running the test suite?

Some more details here: https://playwright.dev/docs/best-practices
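
On the cleanup question, one possible shape is a small registry that each test adds teardown steps to, which is then drained once at the end of the suite. This is only a sketch under assumptions: `CleanupRegistry` and its methods are hypothetical names, not part of Playwright or of this PR, and the actual delete calls would depend on whatever endpoints the backend exposes.

```typescript
// Hypothetical helper, not part of Playwright or of this PR.
type CleanupTask = () => Promise<void>;

class CleanupRegistry {
  private tasks: CleanupTask[] = [];

  // Call right after a test creates something (dataset, job, experiment...).
  register(task: CleanupTask): void {
    this.tasks.push(task);
  }

  // Runs tasks in reverse creation order, so a job is removed before the
  // dataset it was created from. A failing task does not stop the rest;
  // its message is collected and returned instead.
  async runAll(): Promise<string[]> {
    const errors: string[] = [];
    while (this.tasks.length > 0) {
      const task = this.tasks.pop();
      if (!task) break;
      try {
        await task();
      } catch (err) {
        errors.push(err instanceof Error ? err.message : String(err));
      }
    }
    return errors;
  }
}
```

In a spec file this could be wired up by calling `runAll()` from a `test.afterAll` hook and failing the run (or just logging) if the returned list is not empty.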

khaledosman · 2025-04-17T11:45:57Z

lumigator/frontend/e2e/datasets.spec.ts

-  await fileChooser.setFiles(sampleDatasetFilePath)
+test('Launch a GT workflow with unique file and fail early on job failure', async ({ page }) => {
+  // Increase test timeout to 10 minutes.
+  test.setTimeout(600000);


Is this really needed? Having a test run for 10 minutes likely means that other tests can't run until it's finished (unless we explicitly tell Playwright to run in parallel), and it would be too slow for CI feedback. Maybe it's OK to rely on unit/integration tests only for the actual job stuff?

agpituk · 2025-04-21

I added this because the default timeout is 3 minutes (I think) and this test usually takes something like 7 minutes. When I wrote this, I was thinking of running it in parallel with other jobs. What do you think?

khaledosman · 2025-04-22

Hmm, I think by default Playwright runs the tests sequentially, although it can be configured to run in parallel. But even in parallel, if it splits them across 4 cores or so, 7 minutes per test still sounds a bit too slow, and I'm not sure it's worth it, especially considering that jobs can fail for all kinds of reasons and there are a lot of combinations of inputs/models to test.
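
If the long-running check is kept at all, the "fail early on job failure" part of the test title could look roughly like the polling loop below: it returns as soon as the job succeeds, throws as soon as it fails, and only otherwise keeps waiting up to a deadline. This is a sketch, not the PR's code: `waitForJob`, the `fetchStatus` callback, and the status names are assumptions made for illustration. Playwright's built-in `expect.poll` covers similar ground inside a test. None of this shortens the job itself, so the concern about a 7 minute test still stands.

```typescript
// Status names are hypothetical; the real API may use different ones.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

interface PollOptions {
  intervalMs: number; // pause between two status checks
  timeoutMs: number;  // overall budget before giving up
}

// Resolves when the job succeeds, rejects immediately when it fails,
// and rejects with a timeout error if neither happens within the budget.
async function waitForJob(
  fetchStatus: () => Promise<JobStatus>,
  { intervalMs, timeoutMs }: PollOptions,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await fetchStatus();
    if (status === 'succeeded') return;
    if (status === 'failed') throw new Error('job failed');
    if (Date.now() >= deadline) {
      throw new Error(`job still ${status} after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```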

khaledosman · 2025-04-17T11:47:30Z

lumigator/frontend/e2e/datasets.spec.ts

+
+  // Click the dataset row (using the unique submittedFileName).
+  const datasetRow = page.locator('tr').filter({ hasText: submittedFileName }).first();
+  await expect(datasetRow).toBeVisible({ timeout: 5000 });


Is the timeout necessary here and in the other toBeVisible calls? I think the default timeout/behavior should be fine.

khaledosman · 2025-04-17T11:50:56Z

lumigator/frontend/e2e/datasets.spec.ts

+  const [jobResponse] = await Promise.all([
+    page.waitForResponse(
+      (res) =>
+        res.url().includes('/jobs') &&
+        res.request().method() === 'POST' &&
+        res.status() === 201
+    ),
+    popupContainer.getByRole('button', { name: 'Start Generating' }).click(),
+  ]);


nit: I don't necessarily think we need to wait for both in parallel since one is a consequence of the other

Suggested change

-  const [jobResponse] = await Promise.all([
-    page.waitForResponse(
-      (res) =>
-        res.url().includes('/jobs') &&
-        res.request().method() === 'POST' &&
-        res.status() === 201
-    ),
-    popupContainer.getByRole('button', { name: 'Start Generating' }).click(),
-  ]);
+  await popupContainer.getByRole('button', { name: 'Start Generating' }).click()
+  const jobResponse = await page.waitForResponse(
+    (res) =>
+      res.url().includes('/jobs') &&
+      res.request().method() === 'POST' &&
+      res.status() === 201
+  )
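
One thing to weigh with the sequential version: if the POST completes before `waitForResponse` starts listening, the wait can miss it and sit there until it times out. Playwright's documentation for `page.waitForResponse` starts the wait before the triggering action and awaits it afterwards. The toy model below only illustrates that ordering effect; `FakePage`, `withDeadline`, and both scenario functions are stand-ins written for this example, not Playwright's API, and the "response" is made instantaneous to force the worst case.

```typescript
// FakePage is a stand-in written for this example; it is NOT Playwright's Page.
type ResponseListener = (status: number) => void;

class FakePage {
  private listeners: ResponseListener[] = [];

  // Resolves with the status of the next response emitted after this call.
  waitForResponse(): Promise<number> {
    return new Promise<number>((resolve) => {
      this.listeners.push(resolve);
    });
  }

  // Simulates a click whose POST completes immediately with 201.
  click(): void {
    const waiting = this.listeners;
    this.listeners = [];
    waiting.forEach((listener) => listener(201));
  }
}

// Gives up after `ms`, so a missed response shows up as 'missed' instead of hanging.
function withDeadline(promise: Promise<number>, ms: number): Promise<number | 'missed'> {
  const deadline = new Promise<'missed'>((resolve) => {
    setTimeout(() => resolve('missed'), ms);
  });
  return Promise.race([promise, deadline]);
}

// Listener registered before the click: the response is observed.
// With Playwright the same ordering would read:
//   const responsePromise = page.waitForResponse(predicate);
//   await button.click();
//   const jobResponse = await responsePromise;
function listenThenClick(page: FakePage): Promise<number | 'missed'> {
  const responsePromise = page.waitForResponse();
  page.click();
  return withDeadline(responsePromise, 50);
}

// Click first: the response has already gone by when the listener is added.
function clickThenListen(page: FakePage): Promise<number | 'missed'> {
  page.click();
  return withDeadline(page.waitForResponse(), 50);
}
```

In a real run the request takes some time, so clicking first will often work; the point is only that it is not guaranteed to.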

khaledosman · 2025-04-17T11:55:44Z

lumigator/frontend/e2e/datasets.spec.ts

+const dynamicFileName = `dialogsum_mini_no_gt_${timestamp}.csv`;
+let submittedFileName = dynamicFileName;


nit: no need for a different variable
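
A minimal sketch of the single-variable version, assuming the timestamp is only there to make the name unique per run and the name is not reassigned later in the test. `uniqueDatasetName` is a hypothetical helper introduced for illustration, not something from the PR.

```typescript
// Hypothetical helper: builds a per-run file name from a base name.
function uniqueDatasetName(baseName: string, now: Date = new Date()): string {
  // Milliseconds since the epoch; runs started in different milliseconds get different names.
  const timestamp = now.getTime();
  return `${baseName}_${timestamp}.csv`;
}

// One const is enough as long as nothing reassigns the name afterwards.
const submittedFileName = uniqueDatasetName('dialogsum_mini_no_gt');
```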

agpituk added 2 commits April 14, 2025 15:56

Precommit fixes

7f90142

GT generation e2e job

b147fa9

agpituk requested a review from khaledosman April 16, 2025 15:59

Merge branch 'main' into agpituk/add-extra-e2e

27f6d9e

github-actions bot added the frontend label Apr 16, 2025

khaledosman reviewed Apr 17, 2025
