Architecture | Declarative Debugging for Java

DDJ uses a novel architecture in which declarative debugging is not done in two sequential phases, but in two concurrent phases; that is, while the ET is being generated, the debugger is able to produce questions. This new architecture solves scalability problems of declarative debugging. In particular, we use a database to store the whole ET, and only a part of it is loaded to main memory.

Moreover, in order to make the algorithms that traverse the ET independent of the database caching problems, we use a three-tier architecture where all the components have access to a virtual execution tree (VET). The VET is a data structure which is identical to the ET except that some nodes are missing (not generated yet) or incomplete (they only store a part of the routine execution). Hence, standard search strategies can traverse the VET because the structure of the ET is kept.

The VET is produced while running the program. For each routine execution, a new node is added to it with the routine parameters and the context before the call. The result and the context after the call are only added to the node when the routine execution finishes.

Let us explain the components of the architecture with the diagram in the figure below.

Observe that each tier contains a cache that can be seen as a view of the VET. Each cache is used for a different task:

Persistence cache: It is used to store the nodes of the VET in the database. Therefore, when the whole VET is in the database, the persistence cache is not used anymore. Basically, it specifies the maximum number of completed nodes that can be stored in the VET. This bound is called persistence bound and it ensures that main memory is never overflowed.
Logic cache: It defines a subset of the VET. This subset contains a limited number of nodes (in the following, logic bound), and these nodes are those with the highest probability of being asked, therefore, they should be retrieved from the database. This allows us to load in a single database transaction those nodes that are going to be probably asked and thus reducing the number of accesses to the database.
Presentation cache: It contains the part of the VET that is shown to the user in the GUI. The number of nodes in this cache should be limited to ensure that the GUI is not out of memory or it is too slow. The presentation cache defines a subtree inside the logic cache. Therefore, all the nodes in the presentation cache are also nodes of the logic cache. Here, the subtree is defined by selecting one root node and a depth (in the following, presentation bound).

The behavior of the debugger is controlled by four threads that run in parallel, one for the presentation tier (thread 3), two for the logic tier (threads 1 and 4) and one for the persistence tier (thread 2). Threads 1 and 2 control the generation of the VET and its storage in the database. They collaborate via synchronizations and message passing. Threads 3 and 4 communicate with the user and generate the questions. They also collaborate and are independent of threads 1 and 2. A description of the threads follows:

Thread 1 (Construction of the VET): This thread is in charge of constructing the VET. It is the only one that communicates with the JPDA and JVM. Therefore, we could easily construct a declarative debugger for another language (e.g., C++) by only replacing this thread. Basically, this thread executes the program and for every method invocation performed, it constructs a new node stored in the VET. When the number of complete nodes is close to the persistence bound, this thread sends to thread 2 the wake up signal. Then, thread 2 moves some nodes to the database. If the persistence bound is reached, thread 1 sleeps until enough nodes have been removed from the VET and it can continue generating new nodes.
Thread 2 (Controlling the size of the VET): This thread ensures that the VET always fits in main memory. It controls what nodes of the VET should be stored in main memory, and what nodes should be stored in the database. When the number of completed nodes in the VET is close to the persistence bound thread 1 wakes up thread 2 that removes some 2 nodes from the VET and copies them to the database. It uses the logic cache to decide what nodes to store in the database. Concretely, it tries to store in the database as many nodes as possible that are not in the logic cache, but always less than the persistence bound divided by two. When it finishes, it sends to thread 1 the wake up signal and sleeps.
Thread 3 (Interface communication): This thread is the only one that communicates with the user. It controls the information shown in the GUI with the presentation cache. According to the user’s answers, the strategy selected, and the presentation bound, this thread selects the root node of the presentation cache. This task is done question after question according to the programmer answers, ensuring that the question asked, its parent, and as many descendants as the presentation bound allows, are shown in the GUI.
Thread 4 (Selecting questions): This thread chooses the next node according to a given standard search strategy. If the node selected is not loaded in the logic cache the logic cache is updated. This is done using the node selected and the logic bound to compute the new logic cache (i.e, those nodes of the VET that should be loaded from the database). All the nodes that belong to the new logic cache and that do not belong to the previous logic cache are loaded from the database.