February 22, 2017
Paul Otlet room - Réaumur Building, a.327
In this talk, I will discuss matrix-factorisation-based methods for pattern set mining in rank data.
First, I will present a general framework called Semiring Rank Matrix Factorisation. The framework employs semiring theory rather than traditional linear algebra for matrix factorisation, which results in a more elegant way of aggregating rankings. Subsequently, I will introduce two instantiations of the framework: Sparse RMF and ranked tiling. Sparse RMF mines a set of sparse rank vectors that succinctly summarise the given rank matrices and reveal the main categories of rankings. Ranked tiling discovers a set of regions in a rank matrix that have high ranks. Such regions are interesting because they can reveal local associations between subsets of the rows and subsets of the columns of the given matrices.
Finally, I will discuss how to use ranked tiling to formally define the concept of driver pathways, from which we can find cancer subtypes, i.e., groups of tumour samples having the same molecular mechanism driving tumorigenesis.
Thanh obtained his master's degree at the Asian Institute of Technology (AIT) in 2007 and his PhD at KU Leuven in December 2016, under the supervision of Luc De Raedt (KU Leuven), Kathleen Marchal (Universiteit Gent) and Siegfried Nijssen (currently UC Louvain). He is interested in declarative methods for data mining using Constraint Programming and Integer Programming, and in matrix factorisation for pattern set mining in rank data and its applications in bioinformatics.