Labels
area/backend, enhancement, kind/customer-request
Description
Feature description
For me, one of the biggest problems with the current ForEachItem is that, by default, it splits the input file into multiple files in internal storage, one per batch.
Consider the following flow parameters:
- A batch job that must handle ~100,000 rows, stored in an ION file produced by a database query. The subflow must take in one row per execution.
- A ForEachItem over those 100,000 rows, with `{{ outputs.query.uri }}` as the value passed to its items input (see the flow sketch after this list).
- Result: ForEachItem splits the 100,000 rows into 100,000 files in internal storage, which takes ages because a single file is being split into 100,000 files, and disk IO becomes a major bottleneck.
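For concreteness, here is roughly the flow shape I mean. This is a sketch only: the query task, connection details, the process_one_row subflow, and fetchType are illustrative, and exact task type names may differ between Kestra versions.

```yaml
id: dispatch_rows
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.postgresql.Query    # illustrative source task
    url: jdbc:postgresql://db.example.com:5432/mydb
    sql: SELECT * FROM rows_to_process
    fetchType: STORE   # query result lands in internal storage as an ION file

  - id: for_each_item
    type: io.kestra.plugin.core.flow.ForEachItem
    items: "{{ outputs.query.uri }}"
    batch:
      rows: 1                      # one row per subflow execution
    namespace: company.team
    flowId: process_one_row        # hypothetical subflow handling a single row
    inputs:
      row: "{{ taskrun.items }}"   # today: the URI of the per-batch file created by the split
```

With batch.rows: 1, today's behaviour turns this into 100,000 split files in internal storage before any execution even starts.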
Instead of splitting the 100,000-row ION file into 100,000 separate files, I would like an option for ForEachItem to simply stream the original ION file in memory in batches (the amount held in memory depending on the batch property's rows attribute), creating executions as it processes the file in a "sliding window" fashion.
I believe this would make the ForEachItem task far more efficient in the way it passes inputs to subflows.
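What I am imagining, sketched in plain Java. This is not Kestra's actual storage or executor API; SlidingWindowBatcher, dispatchInBatches, and reading rows as newline-delimited text are simplifying assumptions. The point is that only one window of rows is held in memory at a time, and each window is handed directly to whatever creates the subflow execution, with no intermediate per-batch file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sliding-window batching: stream the source file row by row and
// dispatch each in-memory batch, instead of first writing every batch out as
// its own file in internal storage. Not Kestra's real internals.
public class SlidingWindowBatcher {

    public static void dispatchInBatches(Path source, int rowsPerBatch,
                                         Consumer<List<String>> dispatchExecution) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source)) {
            List<String> window = new ArrayList<>(rowsPerBatch);
            String row;
            while ((row = reader.readLine()) != null) {     // rows treated as newline-delimited records
                window.add(row);
                if (window.size() == rowsPerBatch) {
                    dispatchExecution.accept(List.copyOf(window)); // only one window in memory at a time
                    window.clear();
                }
            }
            if (!window.isEmpty()) {
                dispatchExecution.accept(List.copyOf(window));     // trailing partial batch
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Self-contained demo: a few fake rows, one subflow execution per row.
        Path demo = Files.createTempFile("rows", ".ion");
        Files.write(demo, List.of("{id:1}", "{id:2}", "{id:3}"));
        dispatchInBatches(demo, 1,
                batch -> System.out.println("would create execution for: " + batch));
    }
}
```

Applied to the real task, the 100,000-row file would be read once, sequentially, instead of being rewritten as 100,000 tiny files before any execution is created.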
Metadata
Status: Done