
YRD / Young Researchers Day

isba
Louvain-la-Neuve

Program:
 

  • 9h00 : Lara WAUTIER 
    Dynamic graphical models for high-dimensional financial time series

    Graphical models have been studied extensively. They are useful for describing conditional independence relations by identifying disjoint sets of nodes. Large, high-dimensional graphs with many nodes often satisfy a sparsity assumption: many edges are absent and only a few nodes are connected. To exploit this sparsity, an l1-penalty, governed by an external regularization parameter, is applied to shrink small entries of the inverse covariance matrix to 0.

    Our aim is to develop a method for estimating graphs in high-dimensional multivariate volatility models. The dynamic structure is particularly relevant for financial applications such as portfolio allocation and risk management. Specifically, we want to estimate the conditional covariance matrix, which changes over time according to the BEKK model (Engle & Kroner, 1995). To achieve this goal, we use an ADMM algorithm (Boyd et al., 2011), which handles the penalization required in high-dimensional settings by dividing the optimization problem into simpler sub-problems. Constraints on the parameter matrices of our model prevent us from deriving an analytical formula. We therefore use a second algorithm, the BHHH algorithm (Berndt et al., 1974), which approximates the Hessian matrix and contributes to good numerical stability.
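
As a rough illustration of how ADMM splits an l1-penalized precision-matrix problem into simpler sub-problems, the sketch below follows the static graphical-lasso scheme described in Boyd et al. (2011). It is not the dynamic BEKK-based method of the talk: the function names, the toy covariance matrix `S`, and the values of `lam` and `rho` are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Elementwise soft-thresholding: shrinks entries toward 0 by kappa."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def graphical_lasso_admm(S, lam, rho=1.0, n_iter=200):
    """Static graphical lasso via ADMM (illustrative sketch only).

    Minimizes -log det(Theta) + tr(S Theta) + lam * ||Z||_1
    subject to Theta = Z, alternating three simple updates.
    """
    p = S.shape[0]
    Z = np.eye(p)          # sparse copy of the precision matrix
    U = np.zeros((p, p))   # scaled dual variable
    for _ in range(n_iter):
        # Sub-problem 1: smooth likelihood term, solved in closed form
        # through an eigendecomposition of rho * (Z - U) - S.
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigvals + np.sqrt(eigvals ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Sub-problem 2: l1 penalty, solved by soft-thresholding.
        Z = soft_threshold(Theta + U, lam / rho)
        # Dual update tying the two copies together.
        U = U + Theta - Z
    return Z

# Toy covariance (assumed): variables 0 and 1 strongly correlated,
# variable 2 almost uncorrelated with both.
S = np.array([[1.00, 0.80, 0.02],
              [0.80, 1.00, 0.02],
              [0.02, 0.02, 1.00]])
Z = graphical_lasso_admm(S, lam=0.1)
```

Zero off-diagonal entries of `Z` correspond to absent edges in the estimated graph; a larger `lam` removes more edges.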
     

  • 9h25 : Madeline VAST + Laura SYMUL (Team Presentation) 
    Unsupervised integration of longitudinal multi-omics data
    In biomedical research, the last decades have seen the emergence of a plethora of technologies for generating “-omics” data, such as transcriptomics, metagenomics, metabolomics, etc. The suffix “-omics” indicates that virtually all quantifiable corresponding biological entities (transcripts, genomes, metabolites, . . .) have been measured. This leads to datasets that often have more features (p, the biological entities) than samples (n), such that traditional statistical approaches are often not suitable. In addition, these data often present many other statistical challenges, including heteroskedasticity and over-dispersion, unknown scaling factors, and, in some cases, the compositionality of the data. While many methods and analysis workflows have been successfully proposed and implemented for the independent analysis of specific types of omics datasets, we are still in the early stages of method development for integrating and jointly analysing several -omics modalities quantified on the same samples. Furthermore, many of these methods are not yet well suited for analysing longitudinal multi-omics datasets. After an introduction to omics data and the challenges of joint analysis of longitudinal multi-omics data, we will present two promising methods for this purpose, (Di)STATIS (Abdi et al., 2005; L’Hermier des Plantes, 1976) and MEFISTO (Velten et al., 2022).
    References
    Abdi, H., O’Toole, A.J., Valentin, D., Edelman, B., 2005. DISTATIS: The Analysis of Multiple Distance Matrices, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) - Workshops. IEEE, San Diego, CA, USA, pp. 42–42. https://doi.org/10.1109/CVPR.2005.445
    L’Hermier des Plantes, H., 1976. Structuration des tableaux à trois indices de la statistique. Université des Sciences et Techniques du Languedoc.
    Velten, B., Braunger, J.M., Argelaguet, R., Arnol, D., Wirbel, J., Bredikhin, D., Zeller, G., Stegle, O., 2022. Identifying temporal and spatial patterns of variation from multimodal data using MEFISTO. Nat. Methods 19, 179–186. https://doi.org/10.1038/s41592-021-01343-9
     
  • 10h15 : Coffee break 
     
  • 10h40 : Mathilde FOULON 
    Multimodal mixture regression on censored data with a cure fraction
    There is an abundant literature in statistics, biostatistics and econometrics on the modelling, estimation and inference of regression models for survival data subject to censoring. However, only a few of them consider a potential multimodality of the time-to-event. To the best of our knowledge, there is no model that takes into account both multimodality and the possible presence of a cure fraction, i.e. the presence of a fraction of subjects who do not experience the event of interest. Our aim is to develop a modelling approach that takes both these aspects into account. This is particularly useful in contexts such as modelling cancer recurrence, where recurrences may occur in several waves, but with a proportion of patients never relapsing.
    To achieve this goal, we developed an accelerated failure time model in which the error term is assumed to follow a mixture of Sinh-Cauchy distributions. This approach offers greater robustness by combining the flexibility of mixture models with that of the Sinh-Cauchy distribution. We studied the properties of this distribution and implemented an estimation method using the EM algorithm. A simulation study was carried out to illustrate the performance of the proposed approach. Further investigations are ongoing on the selection of the number of components in the mixture, but preliminary results indicate that considerable flexibility is already achieved with a limited number of mixture components. As a next step, we intend to apply our methodology to real data.
    Keywords: Survival, Multimodality, Cure, Mixture, EM.
     
  • 11h05 : Hugo BRUNET + Eugen PIRCALABELU (Team Presentation) 
    Functional additive regression on imperfectly observed data with error in covariates

    Functional data analysis (FDA) provides a framework for modeling random functions and offers statistical tools for both descriptive and inferential analysis.
    Modeling non-linear relationships between functional responses and covariates is crucial for complex phenomena such as weather forecasting, electricity consumption and many other relevant real-life problems. Non-parametric regression is particularly suitable for this purpose. However, non-parametric models face the curse of dimensionality, which is exacerbated in the functional context due to the infinite-dimensional nature of the data. The additive model assumption offers a balance by imposing some restrictions on the non-linear relationships that can be captured while maintaining flexibility, interpretability, and achieving convergence rates comparable to one-dimensional non-parametric regression.
    Unfortunately, real-world functional data are often imperfectly observed due to signal noise or the discrete nature of sampling procedures, leading to bias in model estimation. It is essential to study these biases to determine whether observation errors hinder the estimation of the population model. If so, it is crucial to identify the conditions under which these errors may cause estimation failure.
    This presentation addresses challenges of additive regression with errors in covariates within the functional data framework. It will also present current theoretical and simulation results. An introduction to functional data and the primary theoretical tools used in FDA will precede the main discussion.
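
To give a feel for why errors in covariates bias estimation, the simulation below uses the simplest possible setting: a scalar linear regression rather than the functional additive model of the talk. All names and parameter values (`beta`, the noise levels, the sample size) are assumptions chosen for illustration. In this classical setting the least-squares slope is attenuated by the factor var(X) / (var(X) + var(error)).

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope of y on x (model with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(0)
n = 100_000
beta = 2.0                                     # true slope (assumed)

x_true = rng.normal(0.0, 1.0, n)               # covariate, variance 1
y = beta * x_true + rng.normal(0.0, 0.5, n)    # response with noise
x_obs = x_true + rng.normal(0.0, 1.0, n)       # covariate observed with error, variance 1

slope_clean = ols_slope(x_true, y)             # close to beta
slope_noisy = ols_slope(x_obs, y)              # shrunk toward 0

# Theoretical attenuation: beta * var(x) / (var(x) + var(error))
expected_attenuated = beta * 1.0 / (1.0 + 1.0)
```

Comparing `slope_noisy` with `expected_attenuated` shows the bias persisting even with a large sample, which is why the observation error has to be modeled rather than averaged away.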

  • Friday, 07 February 2025, 09h00
    Friday, 07 February 2025, 12h15
  • ISBA

    ISBA - C115 (1st Floor)