Dec 2, 2014 Comments Off on Organizers
Organizers
Laurent Amsaleg, CNRS-INRIA (Rennes, France)
Michael E. Houle, NII (Tokyo, Japan)
Vincent Oria, NJIT (Newark, NJ, USA)
Arthur Zimek, LMU (Munich, Germany)
Participants
Laurent Amsaleg, CNRS-IRISA Rennes, France
James Bailey, University of Melbourne, Australia
Nozha Boujemaa, INRIA Saclay Ile-de-France, France
Oussama Chelly, National Institute of Informatics (NII), Japan
Michel Crucianu, Conservatoire National des Arts et Metiers (CNAM), France
Vladimir Estivill-Castro, Griffith University, Australia
Michael E. Houle, National Institute of Informatics (NII), Japan
Alfred Inselberg, Tel Aviv University, Israel
Ata Kaban, University of Birmingham, UK
Ken-ichi Kawarabayashi, National Institute of Informatics (NII), Japan
Peer Kroeger, Ludwig-Maximilians-Universitaet (LMU) Muenchen, Germany
Pei-Ling Lai, Southern Taiwan University of Science and Technology, Taiwan
Chong-Wah Ngo, City University of Hong Kong, China
Vincent Oria, New Jersey Institute of Technology (NJIT), NJ, USA
Srinivasan Parthasarathy, Ohio State University, USA
Milos Radovanovic, University of Novi Sad, Serbia
Shin’ichi Satoh, National Institute of Informatics (NII), Japan
Ansgar Scherp, Kiel University, Germany
Erich Schubert, Ludwig-Maximilians-Universitaet (LMU) Muenchen, Germany
Mahito Sugiyama, ISIR, Osaka University, Japan
Yasuo Tabei, Tokyo Institute of Technology, Japan
Kai-Ming Ting, Federation University, Australia
Nenad Tomasev, Google, USA
Takeaki Uno, National Institute of Informatics (NII), Japan
Takashi Washio, ISIR, Osaka University, Japan
Kaoru Yoshida, Sony Computer Science Laboratories, Japan
Pavel Zezula, Masaryk University, Brno, Czech Republic
Arthur Zimek, Ludwig-Maximilians-Universitaet (LMU) Muenchen, Germany
Schedule
The seminar runs June 28 – July 2, 2015
See this pdf for a detailed time table:
Overview
Description of the Meeting
Background
For many fundamental operations in the areas of search and retrieval, data mining, machine learning, multimedia, recommendation systems, and bioinformatics, the efficiency and effectiveness of implementations depend crucially on the interplay between measures of data similarity and the features by which data objects are represented.
When the number of features (the data dimensionality) is high, similarity values tend to concentrate strongly about their means, a phenomenon commonly referred to as the curse of dimensionality. As the dimensionality increases, the discriminative ability of similarity measures diminishes to the point where methods that depend on them lose their effectiveness. The effects of the curse of dimensionality on search and clustering methods are well known and well documented. Domain transformation strategies such as dimensionality reduction and feature selection can improve performance to some extent, but the fundamental difficulties associated with high dimensionality nevertheless persist.
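The concentration effect described above can be observed numerically. The following is a minimal sketch and not part of the original meeting description: it assumes points drawn uniformly from the unit cube and the Euclidean distance, and the function name `relative_contrast` is introduced here purely for illustration. It measures how far apart the nearest and farthest points are, relative to the nearest distance, as the dimensionality grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to n_points random points, all uniform in [0, 1]^dim.

    Assumption for illustration: uniform data and Euclidean distance.
    Real data sets may concentrate more slowly or more quickly.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The contrast is expected to shrink as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the relative contrast shrinks as the dimension grows, which is one way of seeing why nearest-neighbor queries lose discriminative power in high-dimensional spaces.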
Over the past decade or so, new characterizations of data sets have been proposed for assessing the performance of particular methods. Such characterizations include estimations of distribution, estimation of local subspace dimension, and measures of intrinsic dimensionality of data. Although the applications affected by the curse of dimensionality vary widely across research disciplines, the characterizations and models of data that can be applied to analyze the performance of solutions are very general. Across the different disciplines, the data models and data characterizations that have been proposed are quite similar. Unfortunately, researchers from one domain are typically unaware of what researchers from other domains have developed.
NII Shonan Meeting on Dimensionality and Scalability (May 2013)
In May 2013, an NII Shonan Meeting was held to bring together researchers and students active in the areas of databases, data mining, pattern recognition, machine learning, statistics, multimedia, bioinformatics, visualization, and algorithmics who are currently searching for effective and scalable solutions to problems affected by the curse of dimensionality. The main objectives of this workshop were to survey the existing approaches used in dealing with the curse of dimensionality in these various disciplines, to identify their commonalities, strengths, and limitations, and to clarify the potential impact of such approaches on core tasks such as search, classification, and clustering.
During the four days of the Meeting, 19 participants took part in brainstorming sessions to identify future directions for research on dimensionality and scalability. Ten survey presentations were given on the impact of dimensionality in the disciplines of databases, data mining, multimedia, and machine learning. Small working groups eventually focused on the interplay between intrinsic dimensionality and topics in such areas as clustering and outlier detection, multimedia, graphs and networks, and feature selection. The discussions quickly focused on the need to develop and exploit practical estimators of intrinsic dimensionality.
See the report on this meeting at http://www.nii.ac.jp/shonan/wp-content/uploads/2011/09/No.2013-4.pdf.
One-Day Seminar on Dimensionality and Scalability (March 2014)
In the months after the Shonan Meeting, participants continued to investigate the theory and applications of intrinsic dimensionality. Several key outcomes were presented at a special one-day seminar organized in March 2014 at NII, which was attended by 8 of the participants from the first Shonan Meeting.
On the theoretical side, the use of Extreme Value Theory enabled the development of several practical estimators of a newly proposed model of the intrinsic dimensionality of distance distributions. In parallel, several contributions allowed for a better understanding of the impact of the intrinsic dimensionality of data sets on the quality of similarity search and supporting indexing techniques. Scoring functions based on intrinsic dimensionality were also presented for the detection of outliers. In addition, a method was presented by which k-nearest-neighbor graph construction could be combined with simultaneous local dimensional reduction (data sparsification). The normalization of scores and distances for ensemble methods was proposed as a means of compensating for the adverse effects of high dimensionality.
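To make the idea of a practical estimator concrete, here is a minimal sketch of one well-known maximum-likelihood (Hill-type) estimator of local intrinsic dimensionality, computed from the distances between a query and its k nearest neighbors. The text above does not say which estimators were presented at the seminar, so this should be read as an illustrative stand-in rather than the participants' method. The names `lid_mle` and `demo_estimate`, the synthetic data, and the parameter values are all assumptions made for this example.

```python
import math
import random


def lid_mle(distances):
    """Hill-type maximum-likelihood estimate of local intrinsic dimensionality.

    `distances` holds the distances from a query to its k nearest neighbors.
    The estimate is -1 / mean(ln(r_i / r_k)), where r_k is the largest of
    those distances. Zero distances are ignored.
    """
    r = sorted(d for d in distances if d > 0)
    if len(r) < 2:
        raise ValueError("need at least two positive distances")
    r_max = r[-1]
    mean_log = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log == 0:
        raise ValueError("all distances are equal; estimate is undefined")
    return -1.0 / mean_log


def demo_estimate(n_points=20000, k=500, seed=1):
    """Estimate the dimension of a flat 2-D sheet embedded in 5-D space.

    Assumption for illustration: points are uniform on a square in the
    first two coordinates, with the other three coordinates fixed at 0.
    """
    rng = random.Random(seed)
    points = [
        [rng.uniform(-1, 1), rng.uniform(-1, 1), 0.0, 0.0, 0.0]
        for _ in range(n_points)
    ]
    query = [0.0] * 5
    knn = sorted(math.dist(query, p) for p in points)[:k]
    return lid_mle(knn)


if __name__ == "__main__":
    # The data have 5 coordinates, but the estimate should be close to 2.
    print(round(demo_estimate(), 2))
```

In this synthetic setting the estimate reflects the two degrees of freedom in the data rather than the five coordinates used to represent it, which is the distinction between intrinsic and representational dimension that the estimators are meant to capture.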
NII Shonan Meeting on Dimensionality and Scalability II: Hands-On Intrinsic Dimensionality (June/July 2015)
We now prepare a second NII Shonan Meeting on Dimensionality and Scalability, in order to
The detailed objectives of the meeting are:
We now have estimators that can compute local values of intrinsic dimensionality, and procedures for making use of these estimators to prune search spaces, filter noisy points, and enhance indexing and clustering. It is crucial to disseminate this knowledge so that progress can continue to be made on as many fronts as possible, and to find other use-cases where such estimators can help in alleviating the effects of the curse of dimensionality.
One goal of the second edition of the Shonan workshop is both to consolidate these existing collaborations and to initiate new ones.