Cleanup/Optimization: Remove remaining LINQ usage in the compiler#13543
Cleanup/Optimization: Remove remaining LINQ usage in the compiler#13543vexx32 wants to merge 6 commits intoPowerShell:masterfrom
Conversation
582a271 to
cc16044
Compare
iSazonov
left a comment
There was a problem hiding this comment.
If we expect perf wins it would be great to measure and compare.
|
@vexx32 Is there any noticeable difference in perf or parsing? Can you do a before / after comparison and add to the PR description. |
|
@adityapatwardhan let me know how to go about doing that and I'd be happy to. This isn't a particularly accessible area of the code to run benchmarks on (not that anything in PS is very easy to benchmark individually anyway, really). |
|
@vexx32 We have same options for perf measure:
Perhaps there are more simple options. |
|
One thing I never understood with BenchmarkDotNet is how to benchmark between commits properly without ripping out all the associated code into separate standalone classes. Does anyone have experience with that? Do you have to just run separate tests and calculate the ratios yourself (and is that even accurate) or does BenchmarkDotNet have support for that? Really hard to google for some reason. If someone can provide a template or some clear steps towards getting the before and after into the same test run, I'll work with @vexx32 to setup the actual benchmark. |
|
@AndreyAkinshin Could you please help PowerShell contributors to understand how compare performance different commits with BenchmarkDotNet? |
|
@vexx32 @SeeminglyScience I think we could use a tool mentioned in dotnet/runtime#41871 |
|
It looks like that would have to be a significant effort, largely from the PS team themselves; it doesn't look like something an individual contributor could feasibly set up. |
|
I mean that they use a simple tool for comparison with a base line. |
|
This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days. |
|
@vexx32 Can we split some changes to separate small PRs? |
|
Eh... I mean, sure, I can if you want. Didn't think it was that large, but I can take a look later on and do things a bit more piecemeal, I suppose? |
Yes, but it is very sensitive code and it would be better to keep the PR as a meta and split the work on small parts. |
|
That is fair, I'll give it a look, thanks! 🙂 |
Co-authored-by: Ilya <darpa@yandex.ru>
|
I added a benchmark to test the compiler performance (PR is out: #16083). I ran the benchmark with and without changes from this PR, and the result with changes from this PR is even slightly worse. Here are the details: PreparationThe BenchmarkingTo run benchmark without changes from this PR, from the
To run benchmark with changes from this PR, from the
Result DataResults without changes from this PR: Click to see the results from the 10 runsBenchmarkDotNet=v0.12.1.1521-nightly, OS=Windows 10.0.19043
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
[Host] : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
Job-XKBQFO : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
EnvironmentVariables=POWERSHELL_TELEMETRY_OPTOUT=1 PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable
IterationTime=250.0000 ms MaxIterationCount=20 MinIterationCount=15
WarmupCount=1
Results with changes from this PR: Click to see the results from the 10 runsBenchmarkDotNet=v0.12.1.1521-nightly, OS=Windows 10.0.19043
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
[Host] : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
Job-LKTXSI : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
EnvironmentVariables=POWERSHELL_TELEMETRY_OPTOUT=1 PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable
IterationTime=250.0000 ms MaxIterationCount=20 MinIterationCount=15
WarmupCount=1
Result Analysis
So, only the allocated memory is slightly better with the changes in this PR, but somehow the @vexx32 @iSazonov Please take a look at the benchmark PR #16083 and see if I missed anything in my benchmark. If not, please also try run the benchmark to see if you get similar results. |
|
Hmmm, when increase the size of the test script, the results with the changes from this PR seems a little better. I chose to use the Result DataResults without changes from this PR: Click to see the results from 6 runsBenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1165 (21H1/May2021Update)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
[Host] : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
Job-TLRODV : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
EnvironmentVariables=POWERSHELL_TELEMETRY_OPTOUT=1 PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable
IterationTime=250.0000 ms MaxIterationCount=20 MinIterationCount=15
WarmupCount=1
Results with changes from this PR: Click to see the results from 6 runsBenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1165 (21H1/May2021Update)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
[Host] : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
Job-KKVGUA : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
EnvironmentVariables=POWERSHELL_TELEMETRY_OPTOUT=1 PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable
IterationTime=250.0000 ms MaxIterationCount=20 MinIterationCount=15
WarmupCount=1
Data AnalysisThe average of Allocated memory is about the same -- 1mb. But it's weird that there is no GC collects shown for most of the runs. |
|
Interesting. I've not had time to run through myself but I'll give it a go this weekend. Would be curious whether there's a difference in compiling something class-heavy vs function-heavy vs general script as well. |
|
I made some experiments with the code before and found the code very optimized (by Jason Shirk I guess 👍 ). And I'd be very wonder if we would have seen a marked increase in performance in such a large synthetic test (milliseconds) while doing some specific micro-optimizations (nano-seconds). |
|
This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days. |
|
Closing and re-opening PR to get the new CIs working for this PR. |
|
This PR has Quantification details
Why proper sizing of changes matters
Optimal pull request sizes drive a better predictable PR flow as they strike a
What can I do to optimize my changes
How to interpret the change counts in git diff output
Was this comment helpful? 👍 :ok_hand: :thumbsdown: (Email) |
|
I added 'Review-Maintainer' label to discuss whether we should take this change or not as the demonstrated benefit is not very large, |
|
My conclusion was #13543 (comment) |
|
This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days. |
|
I don't think we have demonstrated a substantial improvement as compared to the current code. We would like to reduce the risk on regressions in a sensitive piece of code and not take this PR. |
PR Summary
using System.Linq;for Compiler.csIEnumerable<T>fields/variables/etc withIReadOnlyList<T>where appropriateIReadOnlyList<T>PR Context
LINQ ain't the greatest for performance sensitive applications.
Related: #13474
Semi-related: #12930 and #12412, I'll probably end up with merge conflicts on this or those depending on which get merged first. Might also need to make some edits either here or in those PRs to ensure I'm not reintroducing LINQ in some fashion or another, I'll go over them once one is merged.
/cc @rjmholt @SeeminglyScience
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.