New Challenges in Neural Computation (NC2)

Workshop of the GI-Fachgruppe Neuronale Netze and the German Neural Networks Society in connection to DAGM 2012, Graz



Location and Venue: Tuesday, 28.8.2012, 12.00-13.00 in room i12 at the Inffeld Campus of TU-Graz.

Information theoretic feature selection for high-dimensional data analysis,
Prof. Dr. Michel Verleysen

(Université catholique de Louvain, Engineering Faculty - Electricity Department)

Machine learning methods are used to build models for classification and regression tasks, among others. Models are built on the basis of the information contained in a set of samples, with little or no prior knowledge about the underlying process.

The more information there is in the set of samples, the better the model should be. However, this natural assumption does not always hold, since most machine learning paradigms suffer from the 'curse of dimensionality': strange phenomena appear when data are represented in a high-dimensional space. These phenomena are most often counter-intuitive; the conventional geometrical intuition from data analysis in 2- or 3-dimensional spaces does not extend to much higher dimensions.

Among the problems related to the curse of dimensionality, feature redundancy and the concentration of the norm are probably those with the largest impact on data analysis tools. Feature redundancy means that models lose identifiability (for example, they may oscillate between equivalent solutions) and become difficult to interpret; although redundancy is an advantage from the point of view of the information content of the data, it makes learning the model more difficult. The concentration of the norm is a more specific unfortunate property of high-dimensional vectors: as the dimension of the space increases, norms and distances concentrate around their mean value, making the discrimination between data points more difficult. Most data analysis tools are not robust to these phenomena, and their performance collapses when the dimension of the data space increases, in particular when the number of samples available for learning is limited.
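
As a rough illustration of the concentration effect (not part of the tutorial material), the following Python/numpy sketch draws points uniformly in the unit hypercube and reports the relative contrast of their norms; the contrast shrinks quickly as the dimension grows:

import numpy as np

rng = np.random.default_rng(0)
n_points = 1000

# Relative contrast (max norm - min norm) / min norm for points drawn
# uniformly in [0, 1]^d: it shrinks as the dimension d increases.
for d in [2, 10, 100, 1000]:
    X = rng.uniform(size=(n_points, d))
    norms = np.linalg.norm(X, axis=1)
    contrast = (norms.max() - norms.min()) / norms.min()
    print(f"d = {d:4d}   relative contrast = {contrast:.3f}")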

This tutorial will start with a presentation of the phenomena related to the curse of dimensionality. Feature selection will then be discussed as a possible remedy to this curse. Feature selection consists in choosing a subset of the variables/features available in the dataset according to a relevance criterion. The goal is twofold: to avoid redundancy between features and to discard irrelevant ones. State-of-the-art feature selection methods based on information-theoretic criteria will be presented, together with the respective advantages of filter, wrapper and embedded methods.
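
For concreteness, the sketch below shows one possible filter-type approach in Python with scikit-learn: a greedy forward selection that scores each candidate feature by its mutual information with the target, penalised by its average mutual information with the features already selected (in the spirit of mRMR). This is only an illustrative assumption about how such a criterion can be implemented, not the specific methods presented in the tutorial; the function name greedy_mi_selection is made up for this example.

import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def greedy_mi_selection(X, y, n_features):
    """Greedy forward filter: relevance (MI with the target) minus average
    redundancy (MI with already selected features), mRMR-style."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_features:
        scores = []
        for j in remaining:
            redundancy = 0.0
            if selected:
                redundancy = np.mean([
                    mutual_info_regression(X[:, [j]], X[:, k], random_state=0)[0]
                    for k in selected
                ])
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: feature 3 is a redundant copy of feature 0, features 4-9 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
X[:, 3] = X[:, 0] + 0.05 * rng.normal(size=300)
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)
print(greedy_mi_selection(X, y, n_features=3))   # e.g. [0, 1, 2] (or 3 in place of 0)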

The tutorial will conclude by opening new research questions about feature selection with information-theoretic criteria.


The workshop is sponsored by the European Neural Networks Society (ENNS).
Barbara Hammer