Seminars

NO.126 Computation over Compressed Structured Data

Shonan Village Center

October 9 - 12, 2017 (Check-in: October 8, 2017 )

Organizers

  • Kunihiko Sadakane
    • The University of Tokyo, Japan
  • Gonzalo Navarro
    • University of Chile, Chile

Overview

Description of the meeting

Data compression is used to store data in devices with small storage or to exchange data over small band-width networks. To process transferred or stored data efficiently, it is better to process them without decompression. Typical processes on compressed data are searching in compressed data, inserting into compressed data, or deleting from compressed data. Another merit of processing data without decompression is that we can reduce secondary memory access. Furthermore, we can store a large percentage of data into main memory, resulting fast processing. There is a possibility that the performance gain outweighs the computational overhead resulting from working on compressed data. We consider processing structured data such as strings, trees, or graphs.

The aim of the meeting is to bring together researchers from various research directions of compression of structured data. The meeting will inspire a discussion on theoretical results and practical implementations of compression algorithms. Especially, we consider the following topics:

  • Encoding Data Structures.
    It is sometimes enough to construct data structures which support some queries on data but which do not recover the actual data. The size of such data structures can be much less space than the entropy of the data. These data structures are called encodings. A typical example of an encoding is the 2n-bit structure that answers range minimum queries on a permutation of [1; n], whose entropy is n lg n bits. This direction of research is theoretically challenging and practically relevant.
  • Computation-Friendly Compression.
    It is difficult in general to do computation over compressed data compressed by algorithms which focus on only compression effectiveness. Recent trends are designing computation-friendly compression schemes, that achieve both strong compression and allow for efficient computation.
  • Graph Compression.
    Graph compression is important due to the growth of complex networks such as social networks. There exist many results on graph compression, but there are still important open problems such as suitable definitions of graph entropies, graph algorithms on compressed graphs, designing benchmark data, etc.
  • Machine Learning
    The importance of efficient algorithms for machine learning rapidly grew recently. It will be possible to accelerate learning if data are compressed.

Report

No-126.pdf