Skip to content

Flow analysis #5

@dgrunwald

Description

@dgrunwald

Currently our inference re-uses Roslyn's flow-analysis.

However, this has some fundamental problems.

9:	Dictionary<T, Node> mapping = new Dictionary<T, Node>();

11:	Node? GetNode(T element)
	{
13:		Node? node;
14:		if (!mapping.TryGetValue(element, out node))
15:		{
16:			node = new Node();
17:			mapping.Add(element, node);
18:		}
19:		return node;
	}

GetNode

There's no edges created for line 16/17 because here Roslyn knows that node is non-null.
Line 14 creates two edges: one from <nullable> because TryGetValue will assign null when returning false; the other from mapping!1#2 (the mapping field's Node type argument) when TryGetValue returns true.

The return statement creates an edge from the variable's type, because Roslyn's flow analysis can't guarantee us that the variable is non-null -- our Roslyn code analysis runs under the pessimistic assumption that all types-to-be-inferred might end up nullable, so it considers mapping to be Dictionary<T, Node?>, which leaves open the possibility that GetNode returns null.

However, after our inference decides that mapping!1#2 is non-null, it would be correct to also indicate that the GetNode return value is non-null. After all, if no node exists yet, the function will create one.

The issue here is that Roslyn's flow analysis isn't aware of our types-to-be-inferred.
It would be better if, instead of using Roslyn's flow analysis, we had our own that keeps track of node's nullability.

The idea would be to create additional "helper" graph nodes for the different flow states of a local variable of reference type.
After TryGetValue initializes node, it's flow-state would be (true: mapping!1#2, false: <nullable>). Within the if body, the flow-state would initially be <nullable>, but after line 16 would change to <nonnull>.
After the if, the flow-state from both alternatives could be re-combined by creating a new node "node-after-line-18" and edges from the nodes from the two if-branches -- in this case <nonnull> from the then-branch and mapping!1#2 from the else branch.
Then the return statement would create an edge from this "node-after-line-18" instead of the node for the variable's declared type.
All flow-state nodes associated with a variable would have an edge to the variable's declared type node.
We'd end up with a graph somewhat like this:
GetNode-cfg

Thus in the end, node would be inferred as nullable, but the GetNode return type would only depend on mapping!1#2 and thus can be inferred depending on whether there's a mapping.Add(x, null) access somewhere else in the program.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions