Datalog-based Scalable Semantic Diffing of Concurrent Programs
When an evolving program is modified to address issues related
to thread synchronization, there is a need to confirm the change
is correct, i.e., it does not introduce unexpected behavior. However,
manually comparing two programs to identify the semantic
difference is labor intensive and error prone, whereas techniques
based on model checking are computationally expensive. To fill the
gap, we develop a fast and approximate static analysis for computing
synchronization differences of two programs. The method is
fast because, instead of relying on heavy-weight model checking
techniques, it leverages a polynomial-time Datalog-based program
analysis framework to compute differentiating data-flow edges, i.e.,
edges allowed by one program but not the other. Although approximation
is used our method is sufficiently accurate due to careful
design of the Datalog inference rules and iterative increase of the
required data-flow edges for representing a difference. We have
implemented our method and evaluated it on a large number of
multithreaded C programs to confirm its ability to produce, often
within seconds, the same differences obtained by human; in contrast,
prior techniques based on model checking take minutes or
even hours and thus can be 10x to 1000x slower.
Project link: https://github.com/ChunghaSung/EC-Diff
CANAL: A Cache Timing Analysis Framework via LLVM Transformation
A unified modeling framework for non-functional properties of a
program is essential for research in software analysis and verification,
since it reduces burdens on individual researchers to implement
new approaches and compare existing approaches. We
present CANAL, a framework that models the cache behaviors of
a program by transforming its intermediate representation in the
LLVM compiler. CANAL inserts auxiliary variables and instructions
over these variables, to allow standard verification tools to handle
a new class of cache related properties, e.g., for computing the
worst-case execution time and detecting side-channel leaks. We
demonstrate the effectiveness of CANAL using three verification
tools: KLEE, SMACK and Crab-llvm. We confirm the accuracy of
our cache model by comparing with CPU cycle-accurate simulation
results of GEM5.
Project link: https://github.com/canalcache/canal
Modular Verification of Interrupt-driven software
Since there has been relatively lack of verification
methods for interrupt-driven programs even the behavior of interrupts is similar with that of threads, we introduced a more accurate verification
method for interrupt-driven programs by leveraging a thread-modular
analysis for concurrent programs. By modeling interrupt behavior such
as priority information and preemption information with Z3 SMT solver,
I combined this information with a thread-modular abstract interpretation
analysis. Therefore, I could remove infeasible interference for interrupt-driven programs and could achieve a more accurate analysis without losing
computation overhead.
I compared our analysis to state-of-the-art techniques one of which is a thread-modular analysis for concurrent programs, and another is a model checking based interrupt-driven analysis tool, with about 35 C-base interrupt-driven programs. Finally, my tool showed its higher accuracy than other two tools with getting a little overhead.
Project link: https://github.com/chunghasung/intabs
Extension of a Scope query optimization tool
During the internship at MSR, I extended a scope query optimization tool.
The tool consists of two parts: one part is parser and another part is template generation.
Basically, it parses a query file and then prints a C-style code template,
and the template is fed into a programming synthesis tool to optimize it.
Then, the optimized C-style program is translated into an optimized query.
However, the tool only supported a small set of queries because many queries
use many combinations of columns and uninterpreted functions. So, I have
changed both parser and template generation parts. To achieve this, I got used to F#, C# and Scope query.
Optimization of testing web applications
Even though there were many approaches to analyze web applications, it was hard to analyze them accurately due to its inherent characteristic of user-driven events using the DOM (Document Object Model). This project tries to model the DOMs to find out dependencies between two events which are attached to the DOMs. To achieve this, I made a JavaScript parser from the scratch using JavaScript and nodejs, and modified JS_WALA to build a CFG (Control Flow Graph). Based on the CFG, translated datalog facts are printed. My analysis is inter-procedural and flow-insensitive pointer analysis. Finally, from the facts I got and the rules I made, I can figure out the dependencies among all events over-approximately.
To show the usefulness of my work, I hooked up the automated testing tool named Artemis to shrink the searching space of it using POR (Partial Order Reduction) with the dependency information. I tested with about 20 real world JavaScript games which are highly user-driven events demanded programs, and the result shows that the overhead of this analysis is really negligible and we got about 16% improvement of code coverage in average.
Project link: https://github.com/chunghasung/jsdep
Interactive Visualization
Our team developed an interactive visualization graphic package as an open source in R project during the Google summer of Code 2013 period to help people easily deal with various aspects of data by inserting new data or deleting outliers while watching the graphs. To support these features easily our team used the browser to show the graph using HTML and JavaScript. The advantage of using the browser to support the graphic package is people can easily use this package on the various platforms such as mobile devices. Furthermore, we try to support server-side offload using the package named Shiny with the large number of data which are hard to compute at the client-side platform.
Project link: https://righthelp.github.io/tutorial/overview
Development of Boot loader sequence
When I was at AhnLab, I worked on prototyping new boot loader for an appliance that should be compatible with x86/MIPS architecture and Linux-based operating systems and simplified customer support and troubleshooting, ensuring that the boot loader would provide multi-level booting, securing the system to protect not only back side information, but customers’ information as well.