Extracting Insights from Complex Data using (Coupled) Tensor Factorizations

by Evrim Acar Ataman

There is an emerging need to jointly analyze data sets collected from multiple sources in order to extract insights about complex systems such as the human brain or the human metabolome. For instance, joint analysis of omics data (e.g., metabolomics, microbiome, genomics) holds the promise of improving our understanding of human metabolism and facilitating precision health. Such data sets are heterogeneous: they comprise both static and dynamic data. Dynamic data can often be arranged as a higher-order tensor (e.g., subjects by metabolites by time), while static data can be represented as a matrix (e.g., subjects by genes). Tensor factorizations have been used successfully to reveal the underlying patterns in higher-order tensors, and have been extended to the joint analysis of multimodal data through coupled matrix and tensor factorizations (CMTF). However, jointly analyzing heterogeneous data sets still poses many challenges, especially when the goal is to capture the underlying (time-evolving) patterns. In this talk, we discuss CMTF models for temporal and multimodal data mining. We focus on a flexible, accurate and computationally efficient modelling and algorithmic framework that facilitates the use of a variety of constraints, loss functions and couplings with linear transformations when fitting CMTF models. Through various applications, we discuss the advantages and limitations of available CMTF methods.
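
To make the coupling idea concrete, the following is a minimal sketch, not the framework discussed in the talk. It assumes a third-order tensor X (subjects by metabolites by time) and a matrix Y (subjects by genes) that share the subject-mode factor A, a squared-error loss, no constraints, and plain alternating least squares in NumPy. The function names (khatri_rao, cmtf_als), the sizes and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def khatri_rao(P, Q):
    """Column-wise Kronecker product; row p * Q.shape[0] + q equals P[p] * Q[q]."""
    rank = P.shape[1]
    return (P[:, None, :] * Q[None, :, :]).reshape(-1, rank)


def cmtf_als(X, Y, rank, n_iter=500, seed=0):
    """Fit X ~ [[A, B, C]] and Y ~ A V^T with a shared factor A (illustrative ALS)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A, B, C, V = (rng.standard_normal((n, rank)) for n in (I, J, K, M))

    # Mode-n unfoldings, laid out to match the row ordering of khatri_rao above.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)

    for _ in range(n_iter):
        # The shared factor A is fitted to the tensor and the matrix at once.
        gram_a = (B.T @ B) * (C.T @ C) + V.T @ V
        A = (X1 @ khatri_rao(B, C) + Y @ V) @ np.linalg.pinv(gram_a)
        # The remaining factors each depend on only one data set.
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        V = Y.T @ A @ np.linalg.pinv(A.T @ A)
    return A, B, C, V


# Synthetic, noise-free example: 10 subjects, 8 metabolites, 6 time points, 7 genes.
rng = np.random.default_rng(1)
A0, B0, C0, V0 = (rng.standard_normal((n, 2)) for n in (10, 8, 6, 7))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
Y = A0 @ V0.T

A, B, C, V = cmtf_als(X, Y, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err_X = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
rel_err_Y = np.linalg.norm(Y - A @ V.T) / np.linalg.norm(Y)
print(rel_err_X, rel_err_Y)
```

In this sketch the coupling appears only in the update of A, which pools information from both data sets. Constraints, other loss functions, or couplings through linear transformations, as mentioned in the abstract, would require a different algorithmic approach than this unconstrained least-squares update.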