Skip to content

[AMORO-3545] Prevent empty tasks list during planning#3546

Merged
xxubai merged 1 commit into
apache:masterfrom
lrsb:master
May 30, 2025
Merged

[AMORO-3545] Prevent empty tasks list during planning#3546
xxubai merged 1 commit into
apache:masterfrom
lrsb:master

Conversation

@lrsb
Copy link
Copy Markdown
Contributor

@lrsb lrsb commented May 2, 2025

Fix #3545

This change merges the logic in AbstractOptimizingPlanner to avoid having actualPartitionPlans with only unschedulable tasks.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@lrsb lrsb changed the title [AMORO-3545] Prevent empty tasks list during planning [AMORO-3545][module:ams-server] Prevent empty tasks list during planning May 2, 2025
@lrsb lrsb changed the title [AMORO-3545][module:ams-server] Prevent empty tasks list during planning [AMORO-3545][ams-server] Prevent empty tasks list during planning May 2, 2025
@lrsb lrsb changed the title [AMORO-3545][ams-server] Prevent empty tasks list during planning [AMORO-3545] Prevent empty tasks list during planning May 2, 2025
Copy link
Copy Markdown
Member

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrsb thanks for the contribution, I left some inline comments, please take a look when you're free.

actualInputSize += evaluator.getCost();
AbstractPartitionPlan actualPartitionPlan = (AbstractPartitionPlan) evaluator;
List<RewriteStageTask> splitTasks =
actualPartitionPlan.splitTasks((int) (actualInputSize / avgThreadCost));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is ok to change this currently, because the parameter for splitTasks is not taken into consideration, but maybe in the future we will change the logic in TaskSplitter#splitTasks

Can we change the actualInputSize / avgThreadCost here to some other things like evaluator.getCost() / avgThreadCostPre -- but currently there is no avgThreadCostPre

Maybe we can add a comment here to let others know we need to change this here? what do you think about this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, will publish another review.

Regarding the comment I think it should be placed in the implementation classes.

@klion26
Copy link
Copy Markdown
Member

klion26 commented May 21, 2025

@lrsb thanks for the update. The change looks fine to me. Please refine the code style to pass the style check in CI.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.

Project coverage is 21.76%. Comparing base (d7d6534) to head (54e52c3).
Report is 15 commits behind head on master.

Files with missing lines Patch % Lines
...oro/optimizing/plan/AbstractOptimizingPlanner.java 0.00% 17 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3546      +/-   ##
============================================
- Coverage     21.76%   21.76%   -0.01%     
  Complexity     2391     2391              
============================================
  Files           431      431              
  Lines         40498    40504       +6     
  Branches       5744     5746       +2     
============================================
  Hits           8816     8816              
- Misses        30935    30941       +6     
  Partials        747      747              
Flag Coverage Δ
trino 21.76% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Member

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrsb thanks for the work, LGTM

Copy link
Copy Markdown
Contributor

@xxubai xxubai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rasing this. I just have one small comment.

}

List<PartitionEvaluator> evaluators = new ArrayList<>(needOptimizingPlanMap.values());
LinkedList<PartitionEvaluator> evaluators = new LinkedList<>(needOptimizingPlanMap.values());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's used as a queue container, declaring a Deque would be more appropriate

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to priority queue since makes more sense given we are sorting too

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using a Deque is sufficient here, as a priority queue is not more efficient than a list in this case, especially during polling. Therefore, I don’t think using a priority queue for sorting is necessarily appropriate.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deque doesn't expose sort

Copy link
Copy Markdown
Contributor

@xxubai xxubai May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I forgot that queues don't have a sort interface, so linkedlist is indeed reasonable. can you rollback it?

@lrsb lrsb force-pushed the master branch 2 times, most recently from 415f4b7 to fb10af4 Compare May 30, 2025 14:37
@xxubai xxubai merged commit e69710a into apache:master May 30, 2025
6 checks passed
evaluator != null;
evaluator = evaluators.poll()) {
actualPartitionPlans.add((AbstractPartitionPlan) evaluator);
actualInputSize += evaluator.getCost();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

But I am wondering if we should calculate the input size depending on the real tasks from plannedTasks rather than evaluator#cost if the cost is not true.

Jzjsnow pushed a commit to Jzjsnow/amoro that referenced this pull request Aug 20, 2025
zhoujinsong pushed a commit that referenced this pull request Aug 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: AMS stops optimizing when some partitions have high cost but cannot be scheduled

5 participants