Is your feature request related to a problem?
Currently, whenever the transform job is executed, the search phase runs to completion before any compute or indexing work begins.
The job scheduler triggers the job at fixed intervals. Once an interval elapses, the job is started again. If the search phase takes longer than the interval itself, every restart of the transform job resumes searching, and the search keeps going until all checkpoints / buckets / documents have been visited.
With time-series data, if the transform job cannot keep up with the rate of indexing into the source index, it keeps searching without ever computing results or indexing them into the target index.
Because the queried data is held in memory, the node can hit circuit breaker exceptions, which fail the job. Without circuit breakers, the node can run out of memory (OOM) entirely.
What solution would you like?
This proposes a change to the way the transform job is executed. Instead of waiting for the search phase to complete, the job should compute the aggregations and index the results into the target index as it goes. This would let it release computed buckets from memory incrementally, freeing memory throughout the run.
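As a rough illustration of the proposed interleaving, here is a minimal, self-contained sketch. All names (`search_pages`, `compute_buckets`, `run_transform`) are hypothetical stand-ins for the plugin's paged search, aggregation, and bulk-indexing steps, not the actual transform plugin API; the point is only that each page's buckets are computed, written to the target, and released before the next page is fetched.

```python
# Hypothetical sketch of interleaved search -> compute -> index.
# These names are illustrative; they are not the real plugin API.

def search_pages(source, page_size):
    """Yield one page of source documents at a time (stand-in for paged search)."""
    for i in range(0, len(source), page_size):
        yield source[i:i + page_size]

def compute_buckets(page):
    """Stand-in aggregation: group documents by key and sum their values."""
    buckets = {}
    for key, value in page:
        buckets[key] = buckets.get(key, 0) + value
    return buckets

def run_transform(source, target, page_size=2):
    # Interleaved execution: each page is computed and indexed into the
    # target immediately, so its buckets can be dropped from memory
    # before the next search page is fetched, bounding peak memory.
    for page in search_pages(source, page_size):
        buckets = compute_buckets(page)
        for key, value in buckets.items():
            target[key] = target.get(key, 0) + value
        # `buckets` becomes garbage here -- memory is freed per page.

docs = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
target = {}
run_transform(docs, target)
print(target)  # {'a': 4, 'b': 2, 'c': 4}
```

In contrast, the current behavior described above corresponds to exhausting `search_pages` entirely before calling `compute_buckets`, which is what forces the whole result set to stay in memory at once.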