No.065 Low level code analysis and applications to computer security

Icon

NII Shonan Meeting Seminar 065

Organizers

  • Jean-Yves Marion, Lorraine University (France)
  • Mizuhito Ogawa, Japan Advanced Institute of Science and Technology (JAIST) (Japan)
  • Akira Mori, National Institute of Advanced Industrial Science and Technology (AIST) (Japan)

Participants

  • Tachio Terauchi, JAIST (Japan)
  • Abdullah Mohad Zin, Universiti Kebangsaan Malaysia (Malaysia)
  • Sébastien Bardin, CEA LIST (France)
  • Frédéric Besson, Inria (France)
  • Sandrine Blazy, University Rennes – Inria (France)
  • Guillaume Bonfante, LORIA – Université de Lorraine (France)
  • Thomas Dullien, Google (Switzerland)
  • Roberto Giacobazzi, University of Verona (Italy)
  • Nguyen Minh Hai, Industrial University of Ho Chi Minh City (Vietnam)
  • Tomonori Izumida, AIST (Japan)
  • Tim Kornau-von Bock und Polach, Google (Switzerland)
  • Julian Kranz, Technische Universität München (Germany)
  • Arun Lakhotia, University of Louisiana at Lafayette (USA)
  • Jeffrey T. McDonald, University of South Alabama (USA)
  • Quan Thanh Tho, Ho Chi Minh University of Technology (Vietnam)

Schedule

The seminar takes place from the 2nd through the 5th of March 2015. An excursion to Kamakura (an old town famous for the Buddha statue) is planned in the afternoon of the 4th. Presentations will be 45 minutes including interactive questions and comments.

  • March 1st (Sun): Check-in. Welcome banquet.
  • March 2nd (Mon): Opening. Morning session and afternoon session.
  • March 3rd (Tue): Morning session and afternoon session.
  • March 4th (Wed): Morning session. Excursion.
  • March 5th (Thu): Personal discussions. Check-out.

Monday, March 2nd

Time Author(s) Title
9:45 –
10:00
Organizers Opening
10:00 –
10:45
Arun Lakhotia Malware analysis in-the-large
10:45 –
11:30
Guillaume Bonfante On malware chat on the web
Lunch
13:30 –
14:15
Jean-Yves Marion  Disassembly of Self-modifying binaries with overlapping instructions 
14:15 –
15:00
Sébastien Bardin Binary-level analysis: from safety to security
15:00 –
16:15
Jeffrey T. McDonald Developing an Extensible Deobfuscation Framework

Tuesday, March 3rd

9:30 –
10:15
Akira Mori and
Tomonori Izumida
Methods for the Whole Program Analysis of Binary Code
10:15 –
11:00
Quan Thanh Tho    BE-PUM: A Tool of Binary Emulation for PUshdown Model Generation  
11:00 –
11:45
Nguyen Minh Hai Pushdown Model Generation for Malware Deobfuscation
Lunch
13:30 –
14:15
Sandrine Blazy Towards an obfuscating compiler
14:15 –
15:00
Julian Kranz IR Preprocessing using GDSL
Break
15:30 –
16:15
Frédéric Besson Semantics models for binary code
16:15 –
17:00
Tachio Terauchi Information flow analysis and its applications to computer security

Wednesday, March 4th

9:00 –?
9:45
Thomas Dullien
and Tim Kornau
     Analysis of binaries at Google – The unreasonable success of     
nonsemantic methods
9:45 –
10:30
Roberto Giacobazzi Abstract Symbolic Automata – Mixed syntactic/semantic
similarity analysis of executables
10:30 –
11:15
Abdullah Mohad Zin Common Modeling Language for Model Checkers
11:15 –
12:00
Mizuhito Ogawa Constant propagation for binary CFG rebuilt
Lunch
Excursion to Kamakura

Overview

Software security is one of the major issues in modern societies since insecure systems can affect people (phishing, unauthorized payment, etc.), organizations (critical information leakage) and even states (cyber attacks). In regard to software security, a crucial problem is to have methods and to devise tools to analyze low-level code for at least three reasons. First, low level codes are usually the one that is effectively run on a computer system. Low-level code is usually obtained by a compilation chain where some parts of the chain maybe unsecure. As a result it is necessary to guarantee low-level code correction ensuring that programs run by the system cannot be attacked. Second, a lot of applications are available on internet as binary codes. Those binary codes are quite obfuscated either to protect the intellectual property or because there are malicious software (malware). Therefore, it is necessary to have tools to analyze binary codes in order to detect malware and prevent an attack.

It may look surprising that such important issues are so poorly addressed by current state-of-the-technology tools, while formal methods in general and automatic program verification in particular have made tremendous progress in the past decade, resulting in many impressive industrial applications, and demonstrating their strength both in bug finding and safety validation. There are two key problems here.

First, security needs are more difficult to characterize (in terms amenable to automatic analysis) than safety needs. Security requirements change depending on the social context, and malicious behavior is not easy to define. For instance, malware often uses windows system calls for malicious behavior, but standard programs also use such capability. To clarify, not only theoretical observation form academia, but also empirical study in industry will be required.

Second, security analyses must often be performed at the level of binary executable. Malicious behavior is often based on very fine low-level details. invisible on the source code (e.g. memory layout, exception handling details, ambiguity in program semantics, bug in compilation), and malware are only available as executable files. This is a key difficulty, as most formal methods have been designed for high-level models or languages, and they rely on hypotheses that no longer hold at the executable code level. For instance, even (static) disassembly is not possible for recent advanced polymorphic virus, like self-decryption. Thus, model generation of binary executables is already a challenging.

The goal of the meeting is to explore all aspects from theory to practice of low-level code analysis, including communication between industry and academia. The main outcome of this meeting will be to trigger new interactions and enrich various approaches including the followings.

  • Security threat in practice. e.g., vulnerability, overflow attacks on x86 / Android.
  • Malware and botnet.
  • Static analysis of low level codes.
  • Obfuscated programs, e.g., self-modifying code, packers, and obfuscation techniques.
  • Low level code semantics.
  • Model generation. e.g., control flow graph reconstruction, disassembly.
  • CEGAR (Counter example guided abstraction refinement) and loop invariant generation.
  • Testing and virtual binary emulation, dynamic analysis.
  • Backend reasoning tools. e.g., SMT (SAT modulo theory) solvers and model checkers.

By interacting industry and academia, and among different areas in academia, we will start to tackle the two key problems.

Analyzing security needs. Learn concrete examples from industrial experiences, and manual pattern analysis of malware behavior in malware database. They would lead clear formal definitions of target properties.

Binary executable analysis. Exchanging ideas of state-of-the-art research on binary executable analysis, and look for collaboration opportunity.