No.100 Analysis and Verification of Pointer Programs

Icon

NII Shonan Meeting Seminar 100

Schedule

Sunday, October 1st

  • from 15:00: Hotel check-in (early check-in from 12:00 is negotiable if informed in advance)
  • 19:00 – 21:00: Welcome banquet

Monday, October 2nd

  • 07:30 – 09:00: Breakfast
  • 09:00 – 10:15: Introduction
    • Opening by organisers
    • Self-introduction by participants (2 min p.p.)
  • 10:15 – 10:45: Break
  • 10:45 – 12:00: General Verification Approaches using Separation Logic
    • Makoto Tatsuta:
      Program Analysis and Verification by Separation Logic

      Our team is currently working on a separation-logic-based program analyzer/verifier. Our plan is to start at the point O’Hearn’s group reached and to achieve more precise and faster systems. Our concrete target is to verify OpenSSL. Our current topics are: an entailment checker for separation logic with arithmetic and arrays, a loop invariant generator for Hoare triples by abstract interpretations, and completeness of cyclic proofs in symbolic heaps with general inductive predicates.

    • Joxan Jaffar:
      Automatic Local Reasoning of Recursive Data Structures

      We consider the problem of verifying programs which manipulate recursive data structures. A main challenge here is how to perform local reasoning so that the verification of subprograms, which operate on different components or frames of the data structure, can be combined. Separation Logic (SL) was a significant advance in program verification of data structures. It used a “separating” conjoin operator in data structure specifications to construct heaps from disjoint subheaps, and a “frame rule” to very elegantly realize local reasoning. Subsequently, the methods of Dynamic Frames (DF) and Implicit Dynamic Frames (IDF) provided expressive ways to specify the frames of methods.

      Our method begins with a domain of discourse of explicit subheaps with recursive definitions. The resulting specification language can describe arbitrary data structures, and arbitrary sharing therein, thus enabling a very precise specification of frames. The main contribution then is a program verification method which combines strongest postcondition reasoning in the form of symbolic execution, and unfolding recursive definitions of the data structure in question. Conceptually, this makes our method relatively complete in the sense of Hoare Logic. Finally, we present an implementation of our verifier, which essentially reduces to an implementation of unfolding recursive definitions. We then demonstrate automation on a number of representative programs.

      Finally, to demonstrate that a new level of automatic verification has been achieved, we present the first automatic proof of a classic graph marking algorithm, paving the way for dealing with a class of programs which traverse complex data structures.

      (Joint Work with Duc-Hiep Chu)

    • Discussion (15 min)
  • 12:00 – 13:30: Lunch
  • 13:30 – 14:00: Photo shooting
  • 14:00 – 15:30: Separation Logic and Concurrency I
    • Lennart Beringer:
      Foundational Program Verification Using VST

      The Verified Software Toolchain is a verification framework for C, implemented in the Coq proof assistant and proven sound w.r.t. the operational semantics of CompCert’s Clight language. Its key components are a higher-order impredicative concurrent separation logic and a library of proof automation tactics. The talk will give a high-level overview and summarize some recent verification examples. Meeting participants are invited to download and install VST on their own machine (see http://vst.cs.princeton.edu) and experiment with the system in their spare time.

    • Mike Dodds:
      Starling: Lightweight Reasoning with Separation

      Starling is a lightweight, automated tool for verifying racy concurrent algorithms. Starling proofs are are written in an abstracted Hoare-logic style, and converted into terms discharged by a sequential solver (for example, Z3). Starling is built on the Views framework, an abstract form of separation logic. In this talk I’ll describe how we specialise the Views framework into a simple, generic verification tool, and how we can apply this approach to verify complex pointer programs.

    • Discussion (30 min)
  • 15:30 – 16:00: Break
  • 16:00 – 18:00: Runtime Verification and Input Analysis
    • Quang Loc Le:
      Enhancing Symbolic Execution of Heap-based Programs with Separation Logic for Test Input Generation

      Symbolic execution is a well-established method for test input generation. By taking inputs as symbolic values and solving constraints encoding path conditions, it helps achieve a better test coverage. Despite of having achieved tremendous success over numeric domains, existing symbolic execution techniques for heap-based programs (e.g., linked lists and trees) are limited due to the lack of a succinct and precise description for symbolic values over unbounded heaps.

      In this work, we present a new symbolic execution method for heap-based programs using separation logic. The essence of our proposal is the use of existential quantifiers to precisely represent symbolic heaps. In order to solve path-condition-constraints, we first present a satisfiability solver in a fragment of separation logic with inductive predicates and arithmetic. Furthermore, we identify conditions for a decidable subfragment. Next, we propose a context-sensitive lazy initialization, a novel approach for efficient test input generation. In particular, we describe a least fixed point analysis to compute a valid set of symbolic values initialized to reference fields during the symbolic execution. We have implemented our proposal into a prototype system, called Java StarFinder and S2SAT solver, and evaluated it on a set of programs with complex heap inputs. The results show that our approach significantly reduces the number of invalid test inputs and improves the test coverage.

    • Gerald Lüttgen:
      Automated Detection of Dynamic Data Structures in C Programs and Binary Code

      This talk presents the key results of the DFG-funded research project “Learning Data Structure Behaviour from Executions of Pointer Programs” (DSI), in which dynamic analysis techniques have been developed to identify dynamic data structures in C programs and x86 binary code. DSI’s analysis utilizes a novel memory abstraction that allows for a compact description of pointer-based data structures such as linked lists and binary trees, and their interconnections such as parent-child nesting. On top of this abstraction, an evidence-collecting approach calculates a natural language description of the observed data structures with the help of a systematic taxonomy. The inferred data structure information is not only helpful for program comprehension but can also be utilized in the contexts of software verification and visualization.

    • Caterina Urban:
      An Abstract Interpretation Framework for Input Data Usage

      Nowadays, data science software plays an increasingly important role in critical decision making in fields ranging from economy and finance to biology and medicine. As we rely more and more on data science for making decisions, we become increasingly vulnerable to programming errors. Errors that do not cause failures can have serious consequences, since they give no indication that something went wrong.
      In this talk, we focus on programming errors related to input data usage. Specifically, we propose an abstract interpretation framework to automatically detect unused input data. We systematically derive static analyses for data usage by abstraction of the program operational trace semantics. We propose a new abstract domain to detect single unused input data stored in scalar variables, and we lift this abstraction by building upon an existing domain for the analysis of compound data structures such as array and lists to detect unused chunks of the data.
      Finally, we show that existing static analyses for seemingly different problems can be cast into our framework. In particular, we show that a form of live variable analysis and secure information flow analyses can be used for input data usage, with varying degrees of precision.

    • Discussion (30 min)
  • 18:00 – 19:30: Dinner

Tuesday, October 3rd

  • 07:30 – 09:00: Breakfast
  • 09:00 – 10:15: Separation Logic and Concurrency II
    • Robbert Krebbers:
      Iris: A Framework for Higher-Order Concurrent Separation Logic in Coq

      Iris is a framework for higher-order concurrent separation logic, implemented in the Coq proof assistant. In collaboration with a growing network of collaborators, Iris has been deployed in a wide variety of verification projects such verification of fine-grained concurrent data structures, a safety proof of the Rust type system, logical relations for proving program refinements, and program logics for relaxed memory models.

      In this talk I will give an overview of Iris. Firstly, I will introduce the basic building blocks of Iris and show how these can be used to verify classic concurrent programs. Secondly, I will demonstrate the formalization of Iris in Coq.

    • Lars Birkedal:
      Relational Models and Program Logics: Logical Relations in Iris

      In this talk I present a formalization of a logical relations model of an ML-like type system for a call-by-value higher-order language with impredicative polymorphism, recursive types, general references, and concurrency. The logical relation interpretation is defined in Iris, a state-of-the-art higher-order concurrent separation logic, which in turn is formalized in Coq. The proof effort is made simpler by the use of the novel interactive proof mode for Iris Proof Mode.

    • Discussion (15 min)
  • 10:15 – 10:45: Break
  • 10:45 – 12:00: Automata-Based Approaches to Separation Logic
    • Christoph Matheja:
      Heap Automata for Pointer Programs and Separation Logic

      We introduce heap automata, a formalism for automatic reasoning about robustness properties of the symbolic heap fragment of separation logic with user-defined inductive predicates. Robustness properties, such as satisfiability, reachability, and acyclicity, are important for a wide range of reasoning tasks in automated program analysis and verification based on separation logic. Previously, such properties have appeared in many places in the separation logic literature, but have not been studied in a systematic manner.

      In this talk, we develop an algorithmic framework based on heap automata that allows us to derive asymptotically optimal decision procedures for a wide range of robustness properties in a uniform way. We implemented a prototype of our framework and obtained promising results for all of the aforementioned robustness properties. Further, we demonstrate the applicability of heap automata beyond robustness properties. We apply our algorithmic framework to the model checking and the entailment problem for symbolic-heap separation logic.

    • Adam Rogalewicz:
      Forest Automata for Verification of Heap Manipulation

      We consider verification of programs manipulating dynamic linked data structures such as various forms of singly and doubly-linked lists or trees. We consider important properties for this kind of systems like no null-pointer dereferences, absence of garbage, shape properties, etc. We develop a verification method based on a novel use of tree automata to represent heap configurations. A heap is split into several “separated” parts such that each of them can be represented by a tree automaton. The automata can refer to each other allowing the different parts of the heaps to mutually refer to their boundaries. Moreover, we allow for a hierarchical representation of heaps by allowing alphabets of the tree automata to contain other, nested tree automata. Program instructions can be easily encoded as operations on our representation structure. This allows verification of programs based on a symbolic state-space exploration together with refinable abstraction within the so-called abstract regular tree model checking. A motivation for the approach is to combine advantages of automata-based approaches (higher generality and flexibility of the abstraction) with some advantages of separation-logic-based approaches (efficiency). We have implemented our approach and tested it successfully on multiple non-trivial case studies.

    • Discussion (15 min)
  • 12:00 – 13:30: Lunch
  • 13:30 – 15:30: Graph Transformation
    • Arend Rensink:
      Abstract Graphs and their Transformation

      The data structures built up in pointer programs can for many purposes be viewed as graphs, where the nodes are records and the edges are pointers. The manipulation of that data by a program then corresponds to the transformation of such a graph.

      By regarding portions of a data graph that are “similar enough” (in a sense to be defined precisely but dependent on the application) as identical, and merely recording how many of each such portion there are rather than their individual interconnections, we can arrive at a finite model that captures the essential characteristics of pointer data. The transformations can then be lifted from concrete graphs to this abstract level, giving rise to an over- or under-approximation of the reachable states that allows a partial prediction of the behaviour of the original pointer program.

      In this presentation I give an overview of graph abstraction techniques that have been studied for this purpose, and identify the most promising approaches.

    • Thomas Noll:
      Graph-Based Abstract Interpretation of Pointer Programs

      In this talk we introduce an abstraction framework for analysing pointer programs featuring dynamic data structures, recursive procedures, and concurrent threads. It uses a graph-based symbolic representation of sets of heaps and employs so-called hyperedge replacement grammars to describe both abstraction and concretisation operations on symbolic heaps. Modular reasoning is supported in the form of contracts with graphical pre- and postconditions that capture the net effect of a procedure’s and thread’s execution. In the latter case, contracts are enriched by so-called permissions that represent access rights to (parts of) the heap, which allows to check for race conditions and other concurrency issues.

    • Christina Jansen:
      The Attestor Tool: Graph-Based Abstract Interpretation in Practice

      The automated analysis and verification of pointer-manipulating programs operating on a heap is a challenging task. It requires abstraction techniques for dealing with complex program behaviour and unbounded state spaces that arise from both dynamic data structures and recursive procedures.
      In this talk I am going to briefly present the theoretical underpinnings of a static analysis for pointer programs, which fits into the standard abstract interpretation framework. Its abstraction and (local) concretisation functions are defined by graph grammar application.
      We will have a close look into the prototypic analysis tool Attestor, which analyses Java Bytecode for garbage-freedom, null pointer dereferences, pointer reachability, shape preservance as well as complex functional properties.
      In detail, we consider case studies including singly- and doubly-linked list reversal, the Deutsch-Schorr-Waite tree traversal algorithm and diverse operations on AVL trees.

    • Discussion (30 min)
  • 15:30 – 16:00: Break
  • 16:00 – 18:00: Decision Problems in Separation Logic
    • Koji Nakazawa:
      Complete Cyclic-Proof System for Separation Logic with General Inductive Predicates

      TBA

    • Ruzica Piskac:
      Automated Reasoning about Separation Logic using SMT Solvers

      Separation logic (SL) has gained widespread popularity as a formal foundation of tools that analyze and verify heap-manipulating programs. Its great asset lies in its assertion language, which can succinctly express how data structures are laid out in memory, and its discipline of local reasoning, which mimics human intuition about how to prove heap programs correct.

      While the succinctness of separation logic makes it attractive for developers of program analysis tools, it also poses a challenge to automation: separation logic is a nonclassical logic that requires specialized theorem provers for discharging the generated proof obligations. SL-based tools therefore implement their own tailor-made theorem provers for this task. However, these theorem provers are not robust under extensions, e.g., involving reasoning about the data stored in heap structures.

      I will present an approach that enables complete combinations of decidable separation logic fragments with other theories in an elegant way. The approach works by reducing SL assertions to first-order logic. The target of this reduction is a decidable fragment of first-order logic that fits well into the SMT framework. That is, reasoning in separation logic is handled entirely by an SMT solver. We have implemented our approach in the GRASShopper tool and used it successfully to verify interesting data structures.

    • Adam Rogalewicz:
      The Tree Width of Separation Logic with Recursive Definitions

      Separation Logic is a widely used formalism for describing dynamically allocated linked data structures, such as lists, trees, etc. The decidability status of various fragments of the logic constitutes a long standing open problem. Current results report on techniques to decide satisfiability and validity of entailments for Separation Logic(s) over lists (possibly with data). In this paper we establish a more general decidability result. We prove that any Separation Logic formula using rather general recursively defined predicates is decidable for satisfiability, and moreover, entailments between such formulae are decidable for validity. These predicates are general enough to define (doubly-) linked lists, trees, and structures more general than trees, such as trees whose leaves are chained in a list. The decidability proofs are by reduction to decidability of Monadic Second Order Logic on graphs with bounded tree width.

    • Discussion (30 min)
  • 18:00 – 19:30: Dinner

Wednesday, October 4th

  • 07:30 – 09:00: Breakfast
  • 09:00 – 10:15: Separation Logic and Concurrency III
    • Alexander J. Summers:
      Hybrid Program Analyses for Pointwise Permission Inference

      Ownership and permissions are concepts commonly employed to aid reasoning about programs with mutable state and concurrency, e.g. in custom program logics such as separation logic. Permissions can be used to specify the potential side-effects of code, guaranteeing which facts can be preserved across changes to the program state. One way to generalise permissions to unbounded data is to support them under quantifiers; e.g. specifying access to a graph structures by ranging over its sets of nodes, or to array segments by ranging over integer indices.

      In this talk, I’ll describe ongoing work to automatically infer such quantified permission specifications for a variety of heap-based data structures. Permission-based program logics extend first-order logic with powerful additional connectives, but I’ll show that the constraints arising from our inference problem can nonetheless be summarised within first-order arithmetic. Using this idea, we define a precise analysis for straight-line code, which can e.g. summarise the permissions needed to execute a single loop iteration. We then generate loop invariants using several novel techniques for projecting such expressions out over all loop iterations, leveraging complementary static analyses and quantifier elimination algorithms.

      The talk will include demonstrations using the Viper verification infrastructure, which I’ll introduce along the way.

    • Thomas Wies:
      Flow Interfaces – Compositional Abstractions for Concurrent Data Structures

      Concurrent separation logics have helped to significantly simplify correctness proofs for concurrent data structures. However, a recurring problem in such proofs remains: data structure abstractions based on inductive predicates, which work well in the sequential setting, are much harder to reason about in a concurrent setting. To solve this problem, we propose a novel approach to abstracting regions in the heap by encoding the data structure invariant into a local property that can be checked on individual nodes. These properties may depend on a quantity of each node that is computed as a fixpoint over the entire heap graph. We refer to this quantity as a flow. Flows can encode both structural properties of the heap (e.g., the reachable nodes from the root form a tree) as well as data invariants that are relevant for proving functional correctness. We then introduce the notion of a flow interface, which expresses the relies and guarantees that a heap region imposes on its context to maintain the global flow invariant.

      Our main technical result is that flow interfaces provide a new semantic model for separation logic assertions that admits general implementation-agnostic proof rules for reasoning about concurrent data structures. These include rules that allow a heap region to be split into arbitrary chunks which can be modified and recomposed to form a new region, while maintaining the global data structure invariant. We have used our new approach to obtain simple correctness proofs for non-trivial concurrent dictionary implementations based on B+ trees and non-blocking lists.

    • Discussion (15 min)
  • 10:15 – 10:45: Break
  • 10:45 – 12:00: Extensions of Separation Logic I
    • Chin Wei Ngan:
      Multi-Party Session Logic

      TBA

    • Xiaokang Qiu:
      A Decidable Logic for Tree Data-Structures with Measurements

      We present Dryad_dec, a decidable logic that allows reasoning about tree data-structures with measurements. This logic supports user-defined recursive measure functions based on height or size, and measure-related recursive predicates such as AVL trees or red-black trees. We prove the logic’s satisfiability is decidable. The crux of the decidability proof is a small model property which allows us to reduce the satisfiability of Dryad_dec to quantifier-free linear arithmetic theory which can be solved efficiently using SMT solvers. We also show that Dryad_dec can encode natural proof verification conditions for functional correctness of recursive tree-manipulating programs, as well as synthesis conditions for conditional linear-integer arithmetic functions. We developed the decision procedure and successfully solved 100+ Dryad_dec formulas raised from various scenarios, including verifying full correctness of programs manipulating AVL trees, red-black trees and treaps, checking size lower bounds for AVL trees and red black trees, and synthesizing candidate solutions from a specification and a set of counterexamples. To our knowledge, this is the first decidable logic that can express these measure-related properties for trees.

    • Discussion (15 min)
  • 12:00 – 13:30: Lunch
  • 13:30 – 18:30: Excursion
  • 18:30 – 21:00: Main banquet

Thursday, October 5th

  • 07:30 – 09:00: Breakfast
  • 09:00 – 10:15: Extensions of Separation Logic II
    • Mahmudul Faisal Al Ameen:
      A Logical System for Modular Information Flow Verification

      Information Flow Control (IFC) is important to ensure secure program where secret data does not influence any public data. The pervasive standard that IFC aims is non-interference. Current IFC systems are separated into dynamic IFC, static IFC, and the hybrid between static and dynamic. Dynamic IFC suffers from high overhead and limited ability to prevent implicit flows due to the paths not taken, we propose a novel modular static IFC systems. To the best of our knowledge, this is the first modular static IFC systems. Unlike type-based static IFC systems, ours is separation logic-based. The limitation of type-based IFC systems is in the inviolability of static security label declarations for fields. As such, they suffer from transient leaks on fields. Our proposed system verifies each function independently with the help of separation logic. Furthermore, we provide the proof of correctness for our novel IFC systems with respect to termination and timing insensitive non-interference.

    • Daisuke Kimura:
      Decision Procedure for Entailment of Symbolic Heaps with Arrays

      This talk gives a decision procedure for the validity of entailment of symbolic heaps in separation logic with Presburger arithmetic and arrays. The correctness of the decision procedure is proved under the condition that sizes of arrays in the succedent are not existentially bound. This condition is independent of the condition proposed by the CADE-2017 paper by Brotherston et al, namely, one of them does not imply the other. For improving efficiency of the decision procedure, some techniques are also presented. The main idea of the decision procedure is a novel translation of an entailment of symbolic heaps into a formula in Presburger arithmetic, and to combine it with an external SMT solver.

    • Discussion (15 min)
  • 10:15 – 10:45: Break
  • 10:45 – 12:00: Discussion & Closing
  • 12:00 – 13:30: Lunch

Discussion Topics

  • Abstraction (proposed by Arend Rensink)
  • Mechanized proofs and proof assistants (proposed by Robbert Krebbers)
  • Proof automation in separation logic (proposed by Robbert Krebbers)
  • Checking Entailment Relations in Separation Logic (proposed by Christoph Matheja)
  • Potential for combining different reasoning methodologies and tools (proposed by Alex Summers)

Category: Uncategorized

Tagged:

Comments are closed.