feat(plan_v2): Improve planner extraction#4182
Conversation
c3986e7 to
873b75b
Compare
| /// `append_children` and `children_of` are the only intended way to write into | ||
| /// or read from `child_indices`; they own the half-open-window invariant. | ||
| struct ResolvedBuildContext { | ||
| std::vector<ResolvedEntry> build_order; |
There was a problem hiding this comment.
Is there useful methods we can add for interacting with build_order rather than manipulating it directly?
| } | ||
| } | ||
| return result; | ||
| auto entry = make_entry(key, self); |
There was a problem hiding this comment.
better name? this is where recursion happens, what contractually makes this DFS?
| /// trait is detectable. Swapping operands inverts the ordering so a < b ⇒ | ||
| /// "a dominates b" ⇒ `greater`. | ||
| struct LowerIsBetter { | ||
| static constexpr auto compare(auto const &a, auto const &b) { return b <=> a; } |
There was a problem hiding this comment.
b <=> a how does that correlate with LowerIsBetter?
| .required = {}, | ||
| .introduces = a.introduces, |
There was a problem hiding this comment.
Do we need a comment to explain these semantics, does the parent Output node clean up the .introduces + .required?
| /// the same expr), so the frontier still emits the valid combinations. This | ||
| /// also subsumes the post-inline-rewrite self-referential case where | ||
| /// `Identifier(sym)` is unioned into expr's e-class. | ||
| auto BindFlatMap(CostFrontier const &input, CostFrontier const &expr, planner::core::EClassId sym_eclass, |
There was a problem hiding this comment.
BindFlatMap, OutputCombine, UnwindFlatMap, SubqueryFlatMap, FunctionCombine. Do they deserve there own named functions? Is there a common generalisation that they can use, or do we inline into their symbol_cost_traits?
| // through unchanged. expr is evaluated once at bind-time. | ||
| emit({.cost = BindAliveCost(input_alt.cost, sym_cost, expr_alt.cost, input_alt.cardinality), | ||
| .cardinality = input_alt.cardinality, | ||
| .required = {}, |
There was a problem hiding this comment.
Does this need reconsidering? will operators on inner branches need to communicate their requirements?
| auto result = args.empty() ? LeafFrontier(0.0, enode_id) : Restamp(*args[0], 0.0, enode_id); | ||
| for (auto const *arg : args.empty() ? args : args.subspan(1)) { | ||
| result = BinaryCombine(result, *arg, 0.0, enode_id); | ||
| } | ||
| result.mutate_pruning_invariant_preserving([&](Alternative &alt) { | ||
| alt.cost += 1.0; | ||
| alt.cardinality = cardinality; | ||
| alt.enode_id = enode_id; | ||
| }); | ||
| return result; |
There was a problem hiding this comment.
This is not general, it special cases to handle the arity of the functions we have been testing (ie. range(_,_)). This requires code to be more general.
158307b to
75ca5d6
Compare
| #include "memory/db_arena_fwd.hpp" | ||
| #include "query/context.hpp" | ||
| #include "query/db_accessor.hpp" | ||
| #include "query/plan_v2/frontend/egraph_converter.hpp" |
There was a problem hiding this comment.
TODO: QueryPlannerContext coming from "frontend/egraph_converter.hpp" looks strange, change its header
Bring planner v2 extraction to the point where it carries full expression trees (arithmetic, comparison, boolean, function, unwind, subquery) and selects plans on cost. The extractor now searches the equivalent plans held in the e-graph and picks the cheapest, scoring candidates with a cost model and a pluggable cardinality estimator. Candidates can be better on different axes - estimated cost, output cardinality, and demand (whether later operators actually use a value) - so it keeps a Pareto frontier of non-dominated candidates instead of committing early. Propagating demand lets it drop work nothing reads: unreferenced alternatives are pruned, and an operator like Bind takes its cheaper variant that skips evaluating the bound value wherever nothing downstream consumes it. The per-symbol behaviour is organised into trait families spanning the make, cost, resolve, and build axes, with a single dispatcher across them. Adding a new operator is driven from one X-macro symbol list that flows into the enum, predicates, and round-trip; a missing trait specialisation is a compile error rather than a silent gap. plan_v2 is reorganised along these axes into egraph, cost, frontend, and resolve layers. Extraction state is threaded through the interpreter via a planner context, replacing the previous ad-hoc plumbing. Test and benchmark support is consolidated under shared test_support libraries, with new algebra, cardinality, and extractor suites covering the Pareto cases and full operator round-trips, plus new extraction and match benchmarks.
75ca5d6 to
6ad2b53
Compare
|



Bring planner v2 extraction to the point where it carries full expression trees (arithmetic, comparison, boolean, function, unwind, subquery) and selects plans on cost. The extractor now searches the equivalent plans held in the e-graph and picks the cheapest, scoring candidates with a cost model and a pluggable cardinality estimator. Candidates can be better on different axes - estimated cost, output cardinality, and demand (whether later operators actually use a value) - so it keeps a Pareto frontier of non-dominated candidates instead of committing early. Propagating demand lets it drop work nothing reads: unreferenced alternatives are pruned, and an operator like Bind takes its cheaper variant that skips evaluating the bound value wherever nothing downstream consumes it.
The per-symbol behaviour is organised into trait families spanning the make, cost, resolve, and build axes, with a single dispatcher across them. Adding a new operator is driven from one X-macro symbol list that flows into the enum, predicates, and round-trip; a missing trait specialisation is a compile error rather than a silent gap. plan_v2 is reorganised along these axes into egraph, cost, frontend, and resolve layers.
Extraction state is threaded through the interpreter via a planner context, replacing the previous ad-hoc plumbing. Test and benchmark support is consolidated under shared test_support libraries, with new algebra, cardinality, and extractor suites covering the Pareto cases and full operator round-trips, plus new extraction and match benchmarks.