[AMORO-3545] Prevent empty tasks list during planning#3546
Conversation
| actualInputSize += evaluator.getCost(); | ||
| AbstractPartitionPlan actualPartitionPlan = (AbstractPartitionPlan) evaluator; | ||
| List<RewriteStageTask> splitTasks = | ||
| actualPartitionPlan.splitTasks((int) (actualInputSize / avgThreadCost)); |
There was a problem hiding this comment.
It is ok to change this currently, because the parameter for splitTasks is not taken into consideration, but maybe in the future we will change the logic in TaskSplitter#splitTasks
Can we change the actualInputSize / avgThreadCost here to some other things like evaluator.getCost() / avgThreadCostPre -- but currently there is no avgThreadCostPre
Maybe we can add a comment here to let others know we need to change this here? what do you think about this
There was a problem hiding this comment.
I see what you mean, will publish another review.
Regarding the comment I think it should be placed in the implementation classes.
|
@lrsb thanks for the update. The change looks fine to me. Please refine the code style to pass the style check in CI. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3546 +/- ##
============================================
- Coverage 21.76% 21.76% -0.01%
Complexity 2391 2391
============================================
Files 431 431
Lines 40498 40504 +6
Branches 5744 5746 +2
============================================
Hits 8816 8816
- Misses 30935 30941 +6
Partials 747 747
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
xxubai
left a comment
There was a problem hiding this comment.
Thanks for rasing this. I just have one small comment.
| } | ||
|
|
||
| List<PartitionEvaluator> evaluators = new ArrayList<>(needOptimizingPlanMap.values()); | ||
| LinkedList<PartitionEvaluator> evaluators = new LinkedList<>(needOptimizingPlanMap.values()); |
There was a problem hiding this comment.
nit: it's used as a queue container, declaring a Deque would be more appropriate
There was a problem hiding this comment.
changed to priority queue since makes more sense given we are sorting too
There was a problem hiding this comment.
I think using a Deque is sufficient here, as a priority queue is not more efficient than a list in this case, especially during polling. Therefore, I don’t think using a priority queue for sorting is necessarily appropriate.
There was a problem hiding this comment.
Deque doesn't expose sort
There was a problem hiding this comment.
Okay, I forgot that queues don't have a sort interface, so linkedlist is indeed reasonable. can you rollback it?
415f4b7 to
fb10af4
Compare
| evaluator != null; | ||
| evaluator = evaluators.poll()) { | ||
| actualPartitionPlans.add((AbstractPartitionPlan) evaluator); | ||
| actualInputSize += evaluator.getCost(); |
There was a problem hiding this comment.
Nice catch!
But I am wondering if we should calculate the input size depending on the real tasks from plannedTasks rather than evaluator#cost if the cost is not true.
(cherry picked from commit e69710a)
(cherry picked from commit e69710a)
Fix #3545
This change merges the logic in AbstractOptimizingPlanner to avoid having
actualPartitionPlanswith only unschedulable tasks.How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation