Chakaravarthy’s algorithm

Monday July 28, 2025

Thankfully, spring quarter is over and summer is here!(1) This past quarter was pretty rough—I think I took one class too many, so even though all my classes were really cool, there were moments of real misery.(2) Many moments, actually. The bright side, of course, is that now I have fewer classes to take until I finish up the MS degree. ツ

Edit: I started writing this on June 15, which was right after school ended. But finishing the whole thing ended up taking a bit longer.

Points-to analysis

One of my classes this quarter was ECS240: Programming Languages, which was primarily focused on static analysis—specifically, basic dataflow analysis and abstract interpretation.(3) Something that I found quite cool was this algorithm for polynomial-time and precise flow-insensitive points-to analysis of typed programs, which was introduced in a 2003 POPL paper called “New Results on the Computability and Complexity of Points-to Analysis”.

There are lots of other goodies about some of the theoretical boundaries of various classes of points-to analysis (the title is very apt!), but this algorithm—sometimes called Chakaravarthy’s algorithm (after the paper’s sole author: Venkatesan Chakaravarthy)—is particularly interesting. There really just aren’t that many algorithms for non-trivial program analysis that are polynomial-time, i.e. very fast! It’s also very nice that types (if only a very limited notion of types) are what makes it all possible.

That said, I honestly don’t think the paper does a great job explaining the algorithm, particularly due to (1) some weird organizational quirks that are perhaps due to an effort to present formal complexity proofs alongside the algorithm structure and (2) some unfortunate typos.(4) So here’s an attempt at going over the algorithm in a way that feels clearer to me. The full details remain in the paper itself—we only aim to give a better sense of the structure of and general intuition behind the algorithm.

Before we get into the algorithm itself though, let’s pin down—roughly, at least—what we mean by precise flow-insensitive points-to analysis:

  1. points-to analysis: Points-to analysis is a classic problem in static code analysis; it asks the question, “Given a program and two variables $p$ and $q$, can $p$ point to $q$ in some execution of the program?” Knowing what variables can point to what other variables is important for various compiler optimizations, as well as for other static analyses like constant-propagation.

    Importantly, if we know that $p$ cannot point to $q$, then we gain a guarantee of “separation”—$p$ and $q$ must refer to disjoint locations in memory—that allows us to show the safety of certain compiler optimizations or improve the precision of other static analyses. As a simple example of a compiler optimization that having sound points-to knowledge unlocks, consider this scenario: if we know that no other variables in our program point to $q$ and $q$ is assigned some constant integer like $3$, we can pre-compute an arithmetic expression like $q + 2$ at compile time (the expression will always evaluate to $5$), thus speeding up execution at runtime.

  2. flow-insensitive: Flow-insensitive points-to analysis is a slightly relaxed version of the points-to problem, where we consider programs as sets of statements (i.e. the statements have no notion of ordering corresponding to a control-flow graph of the program). We are allowed to execute the statements in our set in any order and statements can be executed multiple times (regardless of how many times it actually appears in our program). Note that the dual of this analysis, flow-sensitive points-to analysis, does take into account the ordering of statements given by a control-flow graph of the program.

    Why do we take this relaxed and less precise view of the problem? The simple answer is that flow-sensitive points-to analysis is very computationally expensive to perform on programs of realistic size. So we have to compromise! And flow-insensitive points-to analysis turns out to still be useful enough in practice.

  3. precision: When we talk about an algorithm that gives us precise flow-insensitive points-to analysis, we mean an algorithm that gives us exactly the set of points-to relations for which there exists some sequence of program statements (under our relaxed, flow-insensitive formulation of the points-to problem) that realizes each relation.

    Unfortunately, in the general case, precise flow-insensitive analysis is still relatively expensive—in fact, it’s known to be $\mathcal{NP}$-hard! As a result, commonly used points-to algorithms—like those proposed by Andersen or Steensgaard—are actually just sound/safe approximations, i.e. they include all the points-to relations in the precise set, but may also contain some extra relations.

Chakaravarthy’s algorithm

We begin by giving some intuition on why the algorithm enjoys such good algorithmic complexity while still being precise, as well as a high-level view of its structure. We then fill in some of the details of this general outline, using the various definitions and pieces of intuition we’ve provided. We conclude with a brief sketch of the complexity argument.

General intuition

The key to the algorithm is the observation that, by restricting ourselves to well-typed pointers, our pointers form a layered graph with certain good properties around forbidden pairs.

First, what do we mean by well-typed pointers?
We mean that we are restricting ourselves to a model where all:

  1. variables have well-defined types, e.g. int, int*, or int** in C-style pointer syntax, and
  2. pointers are well-typed in the sense that a variable can only point to other variables of a compatible type, e.g. a variable of type int** can only point to a variable of type int*.

In this model (a model that seems quite reasonable, frankly), we can further simplify our notation by tracking only the number of dereferences * in a variable’s type. For example, a variable with type int** has the corresponding type $2$, and a variable with type int**** has the corresponding type $4$.

Intuitively, this simplification is fine since we can split any program into independent sets of statements that only involve one base type (e.g. int or string) and analyze each of those sets separately, since all pointers must be well-typed (and so an int pointer can never point to a string variable, and vice versa).
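As a tiny illustration of this simplification, here’s a sketch (in Python, with a hypothetical type_level helper, not anything from the paper) of collapsing C-style pointer types down to their numeric levels:

```python
def type_level(c_type: str) -> int:
    """Map a C-style pointer type to its numeric level by counting
    dereferences, e.g. 'int' -> 0, 'int**' -> 2, 'int****' -> 4."""
    return c_type.count("*")

# Variables with the same base type and the same level land in the same layer.
assert type_level("int**") == 2
assert type_level("int****") == 4
```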

Additionally (and importantly), this also restricts the kinds of program statements we need to consider. In fact, we only consider two kinds of statements:

  1. direct assignments of the form *(d)p = &z, and
  2. copying statements of the form *(d1)p1 = *(d2)p2,

where *(d)p means d dereferences * applied to p.

Second, why does this mean that our realizability graph is layered?
Define our layers as follows: a variable of type $1$ is in layer $1$, a variable of type $2$ is in layer $2$, and so on. In a realizability graph, realizable points-to relations are edges between variables (where an edge from variable $a$ to variable $b$ means that $a$ can point-to $b$).

Given our requirement that all pointers are well-typed, note that our graph then has the following properties:

  1. every variable of type $l$ lives in layer $l$, and
  2. every edge goes from some layer $l$ to the layer directly below it, $l-1$, since a pointer of type $l$ can only point to variables of type $l-1$.

Finally, what are forbidden pairs, why are they a problem, and how do our types help?
Forbidden pairs are also called dependencies in the paper, which is slightly confusing. Very briefly, forbidden pairs are basically two points-to relations (or edges, in the visual terminology of our realizability graph) that cannot be simultaneously realized.

Such forbidden pairs are the crux of the difficulty in getting precise analysis in the general case, i.e. in the untyped setting that analyses like Andersen’s algorithm operate in.(5) In an algorithm like Andersen’s, we have no easy way of tracking such forbidden pairings/dependencies. So a points-to relation computed by Andersen’s algorithm that relies on the two relations in a forbidden pair is allowed, even though that relation is actually unrealizable (because the two points-to relations in the forbidden pair are not simultaneously realizable, there is actually no sequence of statements that can realize the computed points-to relation that relies on them).

This is the source of the imprecision of such algorithms: we will compute certain points-to relations for which there is actually no sequence of statements demonstrating realizability. These relations end up being the “extra” relations mentioned earlier.

Getting the truly precise set of points-to relations involves an efficient method of computing and tracking forbidden pairs. Thankfully, our well-typed pointers and the resulting layered realizability graph give us such a method. More specifically, our layered realizability graph has the following nice property regarding forbidden pairs: the two edges of any forbidden pair must lie in the same layer of the graph.

This simplifies things greatly, since forbidden pairs of points-to relations can only occur when they are in the same layer of our realizability graph, i.e. edges can only be dependent when they are in the same layer. This gives us an efficient way of computing, tracking, and avoiding such forbidden pairs in the typed setting, helping us keep the algorithm in polynomial time.

Algorithm structure

The algorithm proceeds top-down, in one pass, through our layered graph structure, starting at the top layer $L$ (the layer containing the largest types) and moving down. We construct the edges in the graph and add to the graph’s metadata (i.e. our set of forbidden pairs) dynamically as we move down the layers. We will need the information from prior layers to perform the algorithm at the current layer.

At each layer $l$, the algorithm proceeds in 4 stages. The first two stages prepare the information necessary for the final two stages, which are the ones that actually compute the new edges and the set of forbidden pairs of edges in the current layer. Here are what each of the 4 stages should do, in brief:

  1. Direct assignments: Take statements of the form *(d)p = &z, where *(d)p means d dereferences * applied to p. Find the variables q in layer $l$ that can be made to point to z by this statement. q can be made to point to z iff there exists a path from p to q in our realizability graph so far. Note that, for this to work, z must be in layer $l-1$ and p must be in layer $l+d$.
  2. Copying statements: Take statements of the form *(d1)p1 = *(d2)p2. For any two pointers q1 and q2 in layer $l$, such a statement can copy the value of q2 to q1 if there is a path from p1 to q1 and a path from p2 to q2, such that the paths are (a) vertex disjoint and (b) contain no forbidden pairs between them. Find all pairs (q1,q2) in layer $l$ such that the value of q2 can be copied to q1.
  3. Edge computation: Compute the edges in the current layer using the pairs from the direct assignments and copying statements phases.
  4. Forbidden pair computation: Compute the forbidden pairs of edges in the current layer, again using the pairs we’ve already computed in the direct assignments and copying statements phases.
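Putting the four stages together, the overall pass might be sketched like this (hypothetical Python; the four stage callables are stand-ins for the sub-problem solvers discussed next, not the paper’s actual interfaces):

```python
def analyze(top_layer, direct_stage, copy_stage, edge_stage, forbidden_stage):
    """One top-down pass over the layered realizability graph.

    Each stage argument is a callable implementing one of the four stages;
    edges and forbidden accumulate the graph (and its metadata) built so far."""
    edges, forbidden = set(), set()
    for layer in range(top_layer, 0, -1):  # top layer L down to layer 1
        direct = direct_stage(layer, edges)                  # stage 1: reachability
        copies = copy_stage(layer, edges, forbidden)         # stage 2: disjoint paths
        edges |= edge_stage(layer, direct, copies)           # stage 3: 1-CCP
        forbidden |= forbidden_stage(layer, direct, copies)  # stage 4: 2-CCP
    return edges, forbidden
```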

Importantly, each of these stages can be modelled as a certain sub-problem with a polynomial time algorithmic solution. Here are each of the stages mapped onto their particular sub-problem—these sub-problems get us from what the stages should do to an idea of how to do them:

  1. Direct assignments: the graph reachability problem on the realizability graph at that point.
  2. Copying statements: the disjoint paths with forbidden pairs problem on the realizability graph at that point.
  3. Edge computation: the 1-CCP (Concurrent Copy Propagation) problem, using information produced by the direct assignments and copying statements phases.
  4. Forbidden pair computation: the 2-CCP problem, using information produced by the direct assignments and copying statements phases.

Graph reachability is relatively well-known (and can be implemented fairly simply, if naively, with some kind of breadth-first or depth-first search), so we elide deeper discussion of the problem. It’s also known to be solvable in linear time.(6) However, we will outline the disjoint paths with forbidden pairs and CCP problems below, along with informal arguments that they can be solved in polynomial time.
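For concreteness, here’s a minimal breadth-first reachability sketch (plain Python, adjacency lists as a dict; not from the paper):

```python
from collections import deque

def reachable(adj, src):
    """Return the set of vertices reachable from src via BFS; O(V + E)."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen
```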

Disjoint paths with forbidden pairs

While the usual disjoint paths problem (i.e. where the graph is not necessarily layered and there are no forbidden pairs) is $\mathcal{NP}$-complete, there exists an algorithm for our particular case—where we have a layered graph and each forbidden pair of edges is in the same layer—that is polynomial time. In particular, we are able to reduce the problem to reachability in directed graphs, using a polynomial time reduction.

Here are the broad strokes of the algorithmic solution:

  1. We take as input our layered graph $G=(V, E)$, with some layering function $l$, a pair of source vertices $(s1,s2)$, a pair of target vertices $(t1,t2)$, and a set of forbidden pairs $F$.
  2. We can assume (without loss of generality) that our source vertices are in the same layer and that the target vertices are also in the same layer. (The fact that this is safe is discussed in more depth in the paper.)
  3. We now construct a new graph $G’=(V’,E’)$ using $G$. Formally, we define:

    1. $V’=\{\langle u,v \rangle\ |\ u,v\in V, u \neq v, l(u)=l(v)\}$
    2. $E’=\{(\langle u_1,v_1 \rangle, \langle u_2,v_2 \rangle)\ |\ (\langle u_1,u_2 \rangle, \langle v_1,v_2 \rangle) \in (E \times E) - F\}$

    This is a bit of a mouthful, so take some time to digest these definitions. In plainer words, our new $V’$ is a subset of $V \times V$ and consists of all pairs of distinct vertices in the same layer of $G$. The new “vertices” in $V’$ are connected by an “edge” in $E’$ if the corresponding components of these vertices are connected by a pair of non-forbidden edges in $G$.

  4. Now, for the most important bit: a given instance has a solution iff there is a path from $\langle s_1,s_2 \rangle$ to $\langle t_1,t_2 \rangle$ in $G’$. As we foreshadowed earlier, this is just a graph reachability problem in a directed graph.

The reduction is polynomial time, the size of $G’$ is polynomial in the size of $G$, and the resulting reachability problem can then also be solved in polynomial time (e.g. with depth-first search). We leave more formal arguments for the correctness of this reduction to the full paper.
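Here’s a small Python sketch of the reduction, assuming edges are ordered pairs and forbidden pairs are pairs of edges (the names disjoint_paths_exist, layer, etc. are made up for illustration, not from the paper):

```python
from itertools import product

def disjoint_paths_exist(V, E, layer, F, s1, s2, t1, t2):
    """Does G have vertex-disjoint s1->t1 and s2->t2 paths avoiding
    forbidden pairs?  Sketch of the reduction to plain reachability."""
    # V': ordered pairs of distinct vertices in the same layer of G
    V2 = {(u, v) for u, v in product(V, V) if u != v and layer[u] == layer[v]}
    # E': an edge <u1,v1> -> <u2,v2> for each pair of G-edges
    # (u1,u2), (v1,v2) that is not forbidden together
    adj = {node: [] for node in V2}
    for (u1, u2), (v1, v2) in product(E, E):
        if ((u1, u2), (v1, v2)) in F or ((v1, v2), (u1, u2)) in F:
            continue
        if (u1, v1) in V2 and (u2, v2) in V2:
            adj[(u1, v1)].append((u2, v2))
    # the instance has a solution iff <t1,t2> is reachable from <s1,s2>
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        node = stack.pop()
        if node == (t1, t2):
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False
```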

Concurrent Copy Propagation (1- and 2-CCP)

The general $k$-CCP problem can be described as follows. We are given a set of variables $V$, a set of constants $C$, and a set of statements $S$. We consider two types of statements: (1) X := a for some X in $V$ and some a in $C$; (2) X := Y for some X in $V$ and some Y in $V$. Executing a statement of type (1) assigns the constant a to variable X; executing a statement of type (2) copies the current value of variable Y to X.

Now, consider some set of goals $G$ that consists of pairs of variables and constants, i.e. <X1,a1>,<X2,a2>,...<Xk,ak>. Note that the k here is the same as the $k$ of our $k$-CCP problem (and is just some positive integer).

$G$ can be realized if there is some finite sequence of statements in $S$ such that, at the end of the sequence, all of our goals in $G$ are realized, i.e. the value of X1 is a1, the value of X2 is a2, ...the value of Xk is ak. (Note the parallels here to the points-to analysis that we are interested in solving!)

The solution $\lambda$ to $k$-CCP is the set of all such realizable sets of $k$ goals.

What’s interesting about the $k$-CCP problem is that, when $k$ can be arbitrary (and part of the input to the problem), the decision version of the problem (i.e. can our set of goals $G$ be realized?) is PSPACE-complete, but when $k$ is constant, the problem is polynomial time. We are actually only interested in the cases where $k=1$ and $k=2$, so our problems have polynomial time solutions.

For the 1-CCP problem, we can reduce it to a problem of reachability in directed graphs. More specifically, we can construct a graph where statements like X := Y are represented by an edge from some variable node Y to some variable node X and statements like X := a are represented by an edge from some constant node a to some variable node X. To check which variables can be assigned what constants, we simply need to check which variable nodes are reachable from what constant nodes in our graph.
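A minimal Python sketch of this reduction (the statement encoding and names are my own, not the paper’s):

```python
from collections import deque

def one_ccp(statements):
    """Solve 1-CCP by reachability.  Statements are ('const', X, a)
    for X := a and ('copy', X, Y) for X := Y."""
    adj = {}  # node -> successors; an edge means "value can flow to"
    consts = set()
    for stmt in statements:
        if stmt[0] == "const":
            _, x, a = stmt
            adj.setdefault(("const", a), []).append(("var", x))
            consts.add(a)
        else:
            _, x, y = stmt
            adj.setdefault(("var", y), []).append(("var", x))
    goals = set()
    for a in consts:  # every variable reachable from a can be assigned a
        seen, queue = {("const", a)}, deque([("const", a)])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        goals |= {(n[1], a) for n in seen if n[0] == "var"}
    return goals
```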

For the 2-CCP problem, here’s a sketch of the algorithmic solution, which takes the form of iterative transitive closure based on a set of rules:

  1. We start with $\lambda = \{$ {<X,x>,<Y,y>}$\ |\ $ X != Y and X := x, Y := y are statements in $S\}$
  2. Then we simply iterate over our set of statements $S$, adding more elements to $\lambda$ until we reach a fixpoint. Our rules for adding elements are as follows. For each element {<X,x>,<Y,y>} in $\lambda$:

    1. if there is a copying statement Z := X in $S$ with Z != Y, add {<Z,x>,<Y,y>} to $\lambda$, and
    2. symmetrically, if there is a copying statement Z := Y in $S$ with Z != X, add {<X,x>,<Z,y>} to $\lambda$.

Note that the boundary condition here in which X is the only variable that gets a constant assigned to it directly must be handled separately. Again, we leave more formal details to the paper (although the paper also glosses over quite a bit of formal detail).
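Here’s a rough Python sketch of the fixpoint iteration using this kind of closure rule; note that it deliberately omits the boundary case just mentioned (where both goal values derive from a single direct constant assignment), which must be handled separately:

```python
def two_ccp(statements):
    """Fixpoint sketch for 2-CCP.  Statements: ('const', X, a) for X := a,
    ('copy', X, Y) for X := Y.  Returns realizable two-goal sets as
    frozensets of (variable, constant) pairs.  NOTE: omits the boundary
    case where both goals come from one direct constant assignment."""
    consts = [(s[1], s[2]) for s in statements if s[0] == "const"]
    copies = [(s[1], s[2]) for s in statements if s[0] == "copy"]
    # base case: two distinct variables each directly assigned a constant
    lam = {frozenset([(x, a), (y, b)])
           for x, a in consts for y, b in consts if x != y}
    changed = True
    while changed:
        changed = False
        for pair in list(lam):
            (x, a), (y, b) = tuple(pair)
            # closure: copy one component's value into another variable
            for (u, c), (v, d) in (((x, a), (y, b)), ((y, b), (x, a))):
                for z, src in copies:  # statement z := src
                    if src == u and z != v:
                        new = frozenset([(z, c), (v, d)])
                        if new not in lam:
                            lam.add(new)
                            changed = True
    return lam
```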

Filling in the details

Now, armed with an idea of the general structure of the algorithm, we are ready to fill in some of the details. More specifically, how exactly do we implement each stage as an algorithmic solution to the sub-problems we’ve described? As a reminder, here are the sub-problems for each stage that give us an idea of how to do what we’ve said they should do:

  1. Direct assignments: the graph reachability problem on the realizability graph at that point.
  2. Copying statements: the disjoint paths with forbidden pairs problem on the realizability graph at that point.
  3. Edge computation: the 1-CCP (Concurrent Copy Propagation) problem, using information produced by the direct assignments and copying statements phases.
  4. Forbidden pair computation: the 2-CCP problem, using information produced by the direct assignments and copying statements phases.

For direct assignments, remember that we care about statements of the form: *(d)p = &z. We will use an algorithmic solution to graph reachability in order to determine if some variable q in our current layer $l$ is reachable from p (in d steps). If so, then q can be made to point to z (and we can add the edge from q to z, although the algorithm described in the paper technically delays this until the edge computation phase).

For copying statements, remember that we only care about statements of the form: *(d1)p1 = *(d2)p2. We will use the algorithmic solution to disjoint paths with forbidden pairs to find all pairs (q1,q2) in layer $l$ such that there is a disjoint path, avoiding forbidden pairs, from p1 to q1 and p2 to q2 in our realizability graph so far.

For edge computation, note that the previous two stages have given us: (1) pairs <q,z> such that q is in layer $l$, z is in layer $l-1$, and q can be made to point to z; and (2) pairs (q1,q2) such that values from q2 can be copied to q1. We can then compute edges by solving the 1-CCP problem, where:

  1. each pointer q in layer $l$ becomes a variable Xq,
  2. each pointee z in layer $l-1$ becomes a constant az,
  3. each pair <q,z> from the direct assignments stage becomes a statement Xq := az, and
  4. each pair (q1,q2) from the copying statements stage becomes a statement Xq1 := Xq2.

Each pair <Xq,az> in the solution to the 1-CCP problem as defined here corresponds to an edge <q,z> in the graph.

For forbidden pair computation, the correspondence of pointers in the realizability graph and pairs identified by the direct assignments and copying statements stages to $V$, $C$, and $S$ in the CCP problem is the same. Each element {<X(q1), c(z1)>, <X(q2), c(z2)>} in the solution to the 2-CCP problem defined this way means that the pair of edges <q1,z1>,<q2,z2> can be realized simultaneously—every other pair of realizable edges in layer $l$ (as computed in edge computation) is thus a forbidden pair.

Missing formalism

In our narrative here, we’ve avoided talking at all about any proofs of computational complexity in an effort to focus on a conceptual understanding of how the algorithm works (rather than why the algorithm is fast).

The interested reader should refer to the paper itself for the proofs of the complexity of key components of the algorithm. That said, the paper itself also sweeps some truly rigorous formalism under the rug, particularly when tying all the pieces together at the end of Section 2. The enterprising student would perhaps be interested in working out the full inductive proofs! (I am not an enterprising student.)

  1. Having summer break is one of the great perks of being in school.
  2. In hindsight though, I still think it was the right decision. The classes were all very interesting, and (I think are) usually only offered in the spring. I'd rather be miserable this spring than next spring! Plus all the classes next quarter look boring, so I'm glad I knocked out more of my credits this school year.
  3. It's pretty funny how many different things a class titled "Programming Languages" can cover. I would usually expect (and probably prefer) a core/required class focused on programming languages to cover "foundational" programming language theory, i.e. things like lambda calculus, type systems and basic results there, semantics, Hoare logic, and so on, but then there's this whole body of material on what could maybe be called implementation theory—things related to compiler optimization, static analysis (particularly of programs in existing languages), garbage collection, etc. that are also really interesting and valid directions a course on programming languages can take.
  4. Also some math paper-y proofs that rely on some claim of obviousness, i.e. "It is clear that $G'$ is also layered" or "It is easy to see that the algorithm runs in polynomial time." Some of these may indeed be relatively obvious, but I still wish fewer papers would do this and at least sketch the ideas of the proof or give some brief intuition.
  5. For reference, take any inference rule style presentation of Andersen's algorithm, which should be accessible with a quick web search or LLM query 🙃
  6. Since BFS is technically linear time, for example, or more specifically $\mathcal{O}(V+E)$ for some graph $G = (V, E)$. Or an appeal to Wikipedia citations here.