Tutorial at IJCNN 2015
Learning in indefinite proximity spaces:
Mathematical foundations, representations and models
Tutorial abstract
Efficient learning in data analysis tasks strongly depends on the data representation.
Most methods rely on (symmetric) similarity or dissimilarity representations by means
of metric inner products or distances, providing easy access to powerful mathematical
formalisms such as kernel or branch-and-bound approaches. Similarities and dissimilarities
are, however, often naturally obtained by non-metric proximity measures, which cannot
easily be handled by classical learning algorithms. In recent years, major efforts have been undertaken
to provide approaches that either can be applied directly to such data or make standard methods
available for this type of data.
The tutorial provides a comprehensive overview of the field of learning with
non-metric proximities. First, we introduce the formalism used in non-metric spaces and motivate
specific treatments of non-metric proximity data. Second, we provide a systematization of the
various approaches. For each category of approaches, a comparative discussion addresses
complexity issues and generalization properties.
We also address the problem of large-scale proximity learning, which is often overlooked in this
context but of major importance for making the methods relevant in practice.
The discussed algorithms and concepts are, in general, applicable to proximity-based clustering, one-class classification,
classification, regression, and embedding tasks.
Various applications show the relevance of the discussed approaches, which provide a generic
framework for multiple input formats. The goal of the tutorial is to give an overview of recent
developments in this domain, covering in particular principled approaches to
learning in indefinite spaces, their mathematical foundations, and
extensions to large-scale problems.
Covered material
The tutorial covers the following topics:
- Indefinite kernels and pseudo-Euclidean spaces: basic definitions;
algorithms for correcting indefinite kernels and dissimilarities (eigenspectrum
and proxy approaches); see the embedding and correction sketch after this list
- Models for supervised learning in pseudo-Euclidean spaces and with general
similarity functions; see the toy classifier sketch after this list
- Application examples from the biomedical domain and image processing
- Approximation strategies that improve runtime and memory complexity and provide out-of-sample extensions; see the Nystroem sketch after this list
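
The following Matlab sketch illustrates the first item: it embeds a symmetric dissimilarity matrix into a pseudo-Euclidean space via double centering and an eigendecomposition, and applies the standard clip, flip and shift eigenspectrum corrections. It is a minimal sketch under stated assumptions (the variable D and the treatment of the dissimilarities as squared distances are assumptions of this example), not code taken from the tutorial packages.

    % Minimal sketch, assuming D is a symmetric n x n dissimilarity matrix that is
    % treated like squared distances in the double-centering step (an assumption of
    % this example, not a prescription of the tutorial material).
    n = size(D, 1);
    J = eye(n) - ones(n) / n;             % centering matrix
    G = -0.5 * J * D * J;                 % Gram-like matrix; indefinite if D is non-metric
    G = (G + G') / 2;                     % symmetrize against numerical noise
    [V, L] = eig(G);
    lam = diag(L);
    p = sum(lam > 0);                     % positive part of the signature (p, q)
    q = sum(lam < 0);                     % negative part of the signature
    s = sign(lam);                        % +1 / -1 / 0 per embedding coordinate
    X = V * diag(sqrt(abs(lam)));         % pseudo-Euclidean coordinates, one row per object
    % Eigenspectrum corrections that turn the indefinite G into a PSD kernel:
    G_clip  = V * diag(max(lam, 0)) * V';      % clip: set negative eigenvalues to zero
    G_flip  = V * diag(abs(lam)) * V';         % flip: take absolute eigenvalues
    G_shift = G + max(0, -min(lam)) * eye(n);  % shift: lift the spectrum to be non-negative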
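As a toy illustration of the second item (models operating directly in the pseudo-Euclidean space), the sketch below classifies objects by their indefinite "squared distance" to class means, reusing X and s from the previous sketch; the two-class label vector y is an assumption of this example, and this is only one simple instance of the model families covered in the tutorial.

    % Toy nearest-mean classifier in the pseudo-Euclidean coordinates X from the
    % previous sketch (assumptions: y is an n x 1 vector of labels 1 and 2, and
    % s holds the coordinate signs computed above).
    M   = diag(s);                       % signature matrix of the space
    mu1 = mean(X(y == 1, :), 1);         % class means in the embedded space
    mu2 = mean(X(y == 2, :), 1);
    D1  = bsxfun(@minus, X, mu1);        % differences to each class mean
    D2  = bsxfun(@minus, X, mu2);
    d1  = sum((D1 * M) .* D1, 2);        % indefinite "squared distance" to mean 1
    d2  = sum((D2 * M) .* D2, 2);        % ... and to mean 2 (values may be negative)
    yhat = 1 + double(d2 < d1);          % predict the class with the smaller value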
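For the last item, one common approximation strategy is the Nystroem method, sketched below for a generic symmetric proximity matrix; the variable names, the number of landmarks and the uniform landmark sampling are illustrative assumptions of this example.

    % Nystroem sketch for a large symmetric proximity matrix S (assumptions: the
    % matrix name S, the number of landmarks m, and uniform landmark sampling are
    % illustrative choices).
    n   = size(S, 1);
    m   = 100;                           % number of landmarks
    idx = randperm(n, m);                % landmark indices
    C   = S(:, idx);                     % n x m slice; only these entries need computing
    W   = S(idx, idx);                   % m x m landmark block
    S_ny = C * pinv(W) * C';             % low-rank reconstruction of S
    % In practice one keeps C and pinv(W) as factors instead of forming S_ny
    % explicitly, reducing memory from O(n^2) to O(n*m).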
URLs
Tutorial material is available as PDF slides. The survey paper mentioned in the tutorial
(accepted by Neural Computation and soon to be available as open access) can be obtained as a preprint.
Sources
All packages listed below are implemented in Matlab but may also contain C/C++ bindings.
(If any link is dead, please let me know; I can most likely point you to an alternative source.)
Organizers
Peter Tino,
School of Computer Science, University of Birmingham, UK,
Email: P.Tino@cs.bham.ac.uk
http://www.cs.bham.ac.uk/~pxt
Frank-Michael Schleif,
School of Computer Science, University of Birmingham, UK,
Email: schleify@cs.bham.ac.uk
http://www.cs.bham.ac.uk/~schleify
Peter Tino
is Professor of Complex and Adaptive Systems at the School of
Computer Science, University of Birmingham, UK. He held a Fulbright
Fellowship (at NEC Research Institute, Princeton, USA) and a UK-Hong
Kong Fellowship for Excellence. Peter is a recipient of three IEEE
Computational Intelligence Society Outstanding Paper of the Year awards
in IEEE Transactions on Neural Networks (1998, 2011) and IEEE
Transactions on Evolutionary Computation (2010). He serves on Editorial
Boards of several journals (IEEE Transactions on Neural Networks and
Learning Systems (IEEE CIS, since 2013), Scientific Reports (Nature
Publishing, since 2011) and Neural Processing Letters (Springer, since
2007)). He has (co-)chaired the programme committees of four international
conferences and has served as a programme committee member
for more than 90 international conferences.
Peter's scientific interests include machine learning, dynamical
systems, evolutionary computation, complex systems, probabilistic
modelling and statistical pattern recognition.
Frank-Michael Schleif
received his Ph.D. in Computer Science from the University of Clausthal, Germany, in 2006
and his venia legendi in Applied Computer Science in 2013 from the University of Bielefeld, Germany.
From 2004 to 2006 he worked in the R&D department of Bruker Biosciences. From 2006 to 2009 he was a research assistant
in the computational intelligence research group at the University of Leipzig,
working on multiple bioinformatics projects. In 2010 he joined the Chair of Theoretical Computer Science at the
University of Bielefeld, where he did research in multiple projects in machine learning and bioinformatics.
Since 2014 he has been a member of the University of Birmingham, UK, as a Marie Curie Fellow and PI of the project
"Probabilistic Models in Pseudo-Euclidean Spaces". His areas of expertise include machine learning, signal
processing, data analysis and bioinformatics. Several long-term research stays
have taken him to the UK, the USA, the Netherlands and Japan. He is co-editor of the
Machine Learning Reports and reviewer for multiple
journals and conferences in the field of machine learning
and computational intelligence. He is a
founding member of the Institute of Computational Intelligence
and Intelligent Data Analysis (CIID) e.V. (Mittweida, Germany), a
member of the IEEE-CIS, the IEEE-SPS, the GI, and the DAGM, and secretary of the German chapter of the ENNS (GNNS).
He is co-author of more than 100 papers in international journals
and conferences on different aspects of Computational Intelligence, most of
which can be retrieved from http://www.cs.bham.ac.uk/~schleify