Screams into the Void

Context-, Flow-, and Field-Sensitive Data-Flow Analysis using Synchronized Pushdown Systems

March 19, 2019

As part of the CMPUT 500 course, I was asked to put myself in the shoes of a reviewer and review Späth et al., POPL’19. What follows is what I came up with, although it was hard to review a paper as a non-expert.

Firstly, I would accept the paper.

Summary

This paper introduces a field- and flow-sensitive, but context-insensitive, data-flow analysis based on pushdown systems, and combines it in a novel way with a previously known pushdown-based context- and flow-sensitive, but field-insensitive, analysis to create a precise and performant context-, flow-, and field-sensitive analysis. The main contribution of the paper is that the authors show how to combine two pushdown systems into what they coin a synchronized pushdown system. Analyses based on synchronized pushdown systems benefit from the precision of both of the constituent analyses.
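My rough mental model of the synchronization, sketched below in Python (this is my reading of the construction, not the paper's actual API; `call_pds`, `field_pds`, and `reachable_from` are hypothetical names): each pushdown system over-approximates the reachable data-flow facts on its own, and the synchronized system keeps only the facts that both systems agree on.

```python
# Rough sketch of the synchronization idea (hypothetical types and helper
# names): each pushdown system over-approximates the set of reachable
# data-flow facts on its own, and the synchronized system keeps only the
# facts that both systems accept.

def spds_reachable(call_pds, field_pds, seed):
    """Facts reachable under both the call and the field pushdown system."""
    calls_ok = call_pds.reachable_from(seed)    # context-sensitive, field-insensitive
    fields_ok = field_pds.reachable_from(seed)  # field-sensitive, context-insensitive
    return calls_ok & fields_ok                 # intersect to keep the precision of both
```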

Comments:

Minor Comments:

I know LaTeX can be a pain to work with, but could Figure 1 and Table 1 be on the same page?

GPU-Accelerated Fixpoint Algorithms for Faster Compiler Analyses

March 18, 2019

As part of our monthly paper reading, we read the Blaß and Philippsen, CC’19 paper on offloading data-flow analysis to the GPU. The title seems to suggest that they found a way to make data-flow analysis data-parallel and remove data races, but in reality they let data races happen, then detect and fix them.

Data-flow analyses propagate predicates from one node of a control-flow graph (CFG) to another. This propagation is repetitive and is typically done with a worklist algorithm. The main idea of the paper is to act on many nodes at once on a GPU instead of propagating the predicates of one worklist node at a time.
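For reference, here is a minimal sequential worklist sketch in Python; `cfg`, `transfer`, and `join` are hypothetical stand-ins for the analysis-specific pieces, and the paper's contribution is to replace this one-node-at-a-time loop with GPU-parallel processing.

```python
from collections import deque

# Minimal sequential worklist sketch for a forward data-flow analysis.
# cfg, transfer, and join are hypothetical stand-ins for the analysis-specific
# pieces; the GPU version processes many nodes per step instead of popping
# them one at a time.

def worklist_analysis(cfg, entry, init, transfer, join):
    facts = {node: init for node in cfg.nodes}   # predicate at each CFG node
    work = deque([entry])
    while work:
        node = work.popleft()
        out_fact = transfer(node, facts[node])   # apply the node's transfer function
        for succ in cfg.successors(node):
            merged = join(facts[succ], out_fact)
            if merged != facts[succ]:            # predicate changed: propagate further
                facts[succ] = merged
                work.append(succ)
    return facts
```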

From the paper, it seems that computation on the GPU happens in warps, which are groups of threads that all execute the same instruction in lockstep. Because the threads of a warp execute in lockstep, they are never reading and writing at the same time, so there is never a data race within a warp; all data races happen between warps. Instead of adding synchronization mechanisms to avoid these races, the authors let them happen. They compute a checksum over the predicates to detect whether predicates changed during the computation, and rerun the computation if necessary.
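A toy sketch of how I picture the detect-and-rerun idea, with a hypothetical `parallel_sweep` callable that mutates the predicate sets in place; this is my reconstruction from the paper's description, not their implementation.

```python
import zlib

# Toy sketch of the detect-and-rerun idea: run one unsynchronized parallel
# sweep (the hypothetical parallel_sweep mutates the predicate sets in place,
# races and all), then compare a checksum of the predicates before and after.
# If anything changed, sweep again; a lost racy update gets recomputed on a
# later sweep.

def checksum(facts):
    return zlib.crc32(repr(sorted(facts.items(), key=repr)).encode())

def iterate_until_stable(facts, parallel_sweep):
    while True:
        before = checksum(facts)
        parallel_sweep(facts)              # racy, unsynchronized updates
        if checksum(facts) == before:      # nothing changed: fixpoint reached
            return facts
```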

For their analysis, they perform some optimizations to make things run even faster on the GPU.

  1. They prune (deslag) the call graph to remove nodes which do not affect predicates.
  2. They sort the instructions of the program and try to ensure that a warp processes nodes with the same kind of instruction. If the nodes share a program instruction, it is more likely that the threads of the warp execute the same GPU instructions, reducing divergence (see the sketch after this list).
  3. They do some load balancing to make sure that threads do not idle as much.
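A toy illustration of the sorting optimization in Python; `node.opcode` and the warp size handling are my own assumptions, not the paper's code.

```python
# Toy illustration of the sorting optimization (node.opcode is a hypothetical
# attribute): grouping nodes by instruction kind makes it more likely that all
# threads of a warp apply the same transfer function at the same time.

def group_into_warps(nodes, warp_size=32):
    ordered = sorted(nodes, key=lambda n: n.opcode)
    return [ordered[i:i + warp_size] for i in range(0, len(ordered), warp_size)]
```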

I did not fully understand the load balancing, but I found the other two optimizations somewhat cute. One thing they note is that pruning the call graph is usually not worth it in CPU-based algorithms, but on the GPU the pruning can itself be done in parallel, which makes it worthwhile. The instruction sorting optimization was also interesting.

Implementation of Functional Programming Languages Chapter 9

March 12, 2019

Finished reading “Chapter 9: A Type-Checker”. Here they present the type-checking algorithm. The algorithm is Algorithm W, although they don’t really name it in the text. It works by building up substitutions and by defining a unify function that extends substitutions in very careful and specific ways. I have always found this algorithm to be very clunky. I generally find functional algorithms to be much clearer than imperative ones that manipulate state, but this is one clear exception: if you implement this algorithm, it is very hard to convince yourself that you did not implement it incorrectly. For the implementation in this book, I found that the heavy reliance on a general way to type-check lists of expressions further complicated the picture.

On the plus side, I really liked how they took the time to describe substitutions and the unify function in a lot of detail. In particular, I liked how they explicitly mention the careful ways in which substitutions are extended. In other descriptions of Algorithm W, the properties of the extensions were not mentioned explicitly and I have had to derive the properties of the unify function by myself.
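To keep the careful bits straight in my head, here is a small Python sketch of substitutions and unify in the spirit of the chapter; the type representation (strings for type variables, `(constructor, arguments)` tuples for compound types) and the helper names are my own, not the book's Miranda code.

```python
# Types are either a type variable (a string) or a compound type written as a
# (constructor, arguments) tuple, e.g. ('->', ['a', ('Int', [])]).

def apply(subst, ty):
    """Apply a substitution (a dict from type variable to type) to a type."""
    if isinstance(ty, str):                        # type variable: chase its binding
        return apply(subst, subst[ty]) if ty in subst else ty
    ctor, args = ty
    return (ctor, [apply(subst, a) for a in args])

def occurs(var, ty, subst):
    """Does var occur in ty once the substitution is applied?"""
    ty = apply(subst, ty)
    if isinstance(ty, str):
        return ty == var
    return any(occurs(var, a, subst) for a in ty[1])

def extend(subst, var, ty):
    """Extend the substitution with var := ty; this is the careful part."""
    if ty == var:
        return subst                               # binding a variable to itself: no-op
    if occurs(var, ty, subst):
        raise TypeError(f"occurs check failed: {var} in {ty}")
    return {**subst, var: ty}

def unify(subst, t1, t2):
    """Return a substitution extending subst that makes t1 and t2 equal."""
    t1, t2 = apply(subst, t1), apply(subst, t2)
    if isinstance(t1, str):
        return extend(subst, t1, t2)
    if isinstance(t2, str):
        return extend(subst, t2, t1)
    (c1, a1), (c2, a2) = t1, t2
    if c1 != c2 or len(a1) != len(a2):
        raise TypeError(f"cannot unify {t1} with {t2}")
    for x, y in zip(a1, a2):
        subst = unify(subst, x, y)
    return subst
```

For example, `unify({}, ('->', ['a', 'a']), ('->', [('Int', []), 'b']))` ends with both `a` and `b` bound to `Int`.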

But this was the last chapter of Part I of the book. It’s a little sad that this part of the book ended on a low for me. I am, however, really looking forward to the rest of the chapters, particularly some of the early chapters of Part II. Graph reduction has been something I have wanted to learn more about for a while now, but have not had the chance!

Scaling Java Points-To Analysis using SPARK

March 7, 2019

Finally finished reading Lhoták and Hendren, 2003. It was about SPARK, the Soot Pointer Analysis Research Kit.

I did not read the paper with the same attention as I did the CHA, VTA, and RTA papers. Even so, the first thing that jumps out when I think back to the paper is its very detailed evaluation section, in which different configurations of SPARK are compared. The length of the section has a point: the authors wanted to provide specific recommendations for the configurations and data structures to use when doing points-to analysis. Other than that, the more technical parts of the paper provide algorithms for propagating points-to sets throughout a pointer assignment graph (PAG). But one of the things about SPARK that I found really interesting is the PAG simplification step, which merges the nodes in each strongly connected component of the PAG using a union-find algorithm.

I tried to look for algorithms that merge strongly connected components in the dragon book, but the algorithms there had too much surrounding context. The paper seems to suggest that the merging ends up being an instance of union-find. I should read the “Unification-based pointer analysis with directional assignments” (Das, 2000) paper for the full details.
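In the meantime, here is a small union-find sketch in Python of what I imagine the collapsing step looks like; the class and the `merge_scc` helper are my own reconstruction, not SPARK's actual code.

```python
# A small union-find with path halving; merge_scc collapses one strongly
# connected component of the PAG into a single representative node.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b

def merge_scc(uf, scc):
    """Merge every node of one strongly connected component into one representative."""
    nodes = list(scc)
    for other in nodes[1:]:
        uf.union(nodes[0], other)
```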

I should reread Section 3; then I should be able to look at the CMPUT 500 slides and deduce the context. I will probably read “Points-to analysis in almost linear time” (Steensgaard, 1996) next.

Practical Virtual Method Call Resolution for Java

February 27, 2019

Read the Sundaresan et al., 2000 paper on Variable Type Analysis. It took me a lot longer to read this compared to the previous papers I read for CMPUT 500.

Variable Type Analysis (VTA) essentially prunes the call graph generated by Class Hierarchy Analysis (CHA) by doing a fast flow-insensitive points-to analysis. The paper shows that the precision of this analysis is significantly better than that of Rapid Type Analysis (RTA) on highly object-oriented code.

RTA essentially prunes the CHA call graph by removing edges to classes whose constructors are not reachable in the program. VTA improves upon this by conservatively figuring out which types a variable can point to instead of assuming that a variable points to everything reachable in the program. The meaty part of the paper concerns itself with the implementation of the points-to analysis.

One of the techniques that keep showing up in my readings is this:

For VTA, a DAG of variables and their initial types is built, with edges between nodes representing assignments between variables. The types are then propagated down the DAG to build a “points-to type” set for each variable.
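A minimal Python sketch of how I picture the propagation step, assuming the graph has already been reduced to a DAG and is given in topological order; the data layout (`edges` as an adjacency dict, `initial_types` as per-variable type sets) is my own, not the paper's.

```python
# variables: the DAG's nodes in topological order (SCCs already collapsed)
# edges: adjacency dict where an edge src -> dst models the assignment dst = src
# initial_types: the types each variable is directly assigned (e.g. via "new")

def propagate_types(variables, edges, initial_types):
    points_to_types = {v: set(initial_types.get(v, ())) for v in variables}
    for src in variables:                       # topological order: one pass suffices
        for dst in edges.get(src, ()):
            points_to_types[dst] |= points_to_types[src]
    return points_to_types
```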

The other place this showed up was in the Dependency Analysis section of the Peyton Jones, 1987 book. I should revisit the actual algorithms involved; I am planning to read the relevant parts of the dragon book.

Next on my reading list for CMPUT 500 is a paper on SPARK (Lhoták and Hendren, 2003). Will finally get to learn what my Master’s advisor worked on before Scala and DOT.
