C#: Improvements to buildless extraction#3136
Conversation
e7222c7 to
09616d1
Compare
09616d1 to
b94b4b7
Compare
hvitved
left a comment
There was a problem hiding this comment.
Awesome stuff, Calum. Excited to see how far we can get with buildless extraction.
| return new MemberAccess(info.SetKind(ExprKind.UNKNOWN), expression, symbol); | ||
| // Default to property access - this can still give useful results but | ||
| // the target of the expression should be checked in QL. | ||
| return new MemberAccess(info.SetKind(ExprKind.PROPERTY_ACCESS), expression, symbol); |
There was a problem hiding this comment.
This feels wrong to me... It could just as well be a field or something else.
There was a problem hiding this comment.
If anything, it feels more like a dynamic member access, but that is also a bit of a hack. Perhaps we need new classes of unknown expressions, i.e.
| 122 = @unknown_member_access_expr
| 123 = @unknown_binary_expr
...
and then have
@member_access_expr = ... | @unknown_member_access_expr;
@bin_op = ... | @unknown_binary_expr;
There was a problem hiding this comment.
The choice is between wrong some of the time or missing all of the time. So I would argue that the change I made is a step in the right direction that is unlikely to make things worse. I think your changes seem reasonable but I don't think this PR is the place for them, particularly since dataflow libraries and queries would need to be reworked.
hvitved
left a comment
There was a problem hiding this comment.
I think we should add a ConcurrentHashSet wrapper to Semmle.Util, and it also appears that you forgot to add the new files.
| private readonly HashSet<string> usedSources = new HashSet<string>(); | ||
| private readonly HashSet<string> missingSources = new HashSet<string>(); | ||
| private readonly Dictionary<string, string> unresolvedReferences = new Dictionary<string, string>(); | ||
| private readonly IDictionary<string, bool> usedReferences = new ConcurrentDictionary<string, bool>(); |
There was a problem hiding this comment.
Perhaps we should add this to Semmle.Util and use instead:
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
namespace Semmle.Util
{
public class ConcurrentHashSet<T> : IEnumerable<T>
{
private readonly ConcurrentDictionary<T, object> dict;
public ConcurrentHashSet()
{
dict = new ConcurrentDictionary<T, object>();
}
public void Add(T t) => dict[t] = null;
public bool Contains(T t) => dict.ContainsKey(t);
public IEnumerator<T> GetEnumerator() => dict.Keys.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => dict.Keys.GetEnumerator();
}
}
There was a problem hiding this comment.
My reservation with this is that this would be a half-baked implementation, that should also implement
[System.Serializable]
public class ConcurrentHashSet<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.ISet<T>, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
So we should really just reference the nuget package ConcurrentHashSet which I presume is totally correct and bug-free.....
Then it means updating resources/lib which is a bit of a pain.
There was a problem hiding this comment.
I think a half-baked implementation suffices, as we only need to add the things that we actually need. If we ever need, say, to use it as an ICollection<T>, then we can implement it then.
There was a problem hiding this comment.
There's literally one place where we use ConcurrentDictionary instead of ConcurrentHashSet so I'm really not convinced it's worth the effort. Particularly because we'd need a unit test suite for the new ConcurrentHashSet class.
| lock(mutex) | ||
| missingSources.Add(sourceFile.FullName); | ||
| } | ||
| sources[sourceFile.FullName] = sourceFile.Exists; |
dotnet restoreto a temp directory, then reads the downloaded DLLs from that directory..csprojfiles.NullReferenceExceptionsdotnet restorebeing slow.Fixes https://github.com/github/codeql-csharp-team/issues/14