INTRODUCTION & BACKGROUND
Could you introduce yourself in a few words?
I’m a researcher with a background in applied statistics and data science, working at the intersection of machine learning and healthcare. I’m interested in building models that can learn from real-world clinical data, which are often messy, multimodal, and incomplete, while staying rigorous and useful for medical applications.
What led you to become interested in artificial intelligence?
It was a gradual process. I first explored what statistics can do: how far careful modeling, inference, and classical machine learning can take you (and in many ways, classical ML is really an extension of statistics). As I worked with more complex and high-dimensional data, I became increasingly interested in AI methods that can learn richer representations and capture non-linear patterns, while keeping the statistical mindset of rigor, robustness, and generalization.
How would you describe your PhD research topic to someone outside your field?
Hospitals collect a lot of information about patients over time, such as diagnoses, medications, lab tests, imaging, and notes, but this information is scattered, incomplete, and irregular. My PhD focuses on building AI that can combine these different sources, handle missing information, and learn dynamically which patients are similar to each other, in order to better predict outcomes like deterioration, readmission, or treatment response.
CORE OF THE PROJECT
What is the main objective of your research?
The main objective is to develop a time-aware graph learning framework for healthcare data that can jointly: a) learn an evolving patient similarity graph (instead of using a fixed, predefined one), b) impute missing modalities/features directly within the model, and c) learn robust patient representations optimized end-to-end for downstream clinical prediction tasks (e.g., hospitalization risk, disease progression, treatment response).
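To make the second objective concrete, here is a deliberately simple, hypothetical baseline (not the model developed in this thesis, and all numbers are toy data): build a nearest-neighbour similarity graph using only the features two patients have in common, then fill each missing value with the mean over a patient's graph neighbours.

```python
import numpy as np

# Toy patient-by-feature matrix with missing entries (NaN).
X = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 1.9, 3.0],
    [5.0, 5.2, 6.1],
    [4.9, np.nan, 6.0],
])

def knn_impute(X, k=2):
    """Graph-based imputation baseline: kNN graph + neighbour averaging."""
    n = X.shape[0]
    mask = ~np.isnan(X)
    # Pairwise distances computed only over jointly observed features.
    D = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = mask[i] & mask[j]
            if shared.any():
                D[i, j] = np.mean((X[i, shared] - X[j, shared]) ** 2)
    X_hat = X.copy()
    for i in range(n):
        neighbors = np.argsort(D[i])[:k]  # k most similar patients
        for f in range(X.shape[1]):
            if not mask[i, f]:
                obs = [X[j, f] for j in neighbors if mask[j, f]]
                if obs:
                    X_hat[i, f] = np.mean(obs)
    return X_hat

X_imp = knn_impute(X)
```

The point of the research is precisely to go beyond this kind of fixed, predefined graph: here the neighbourhoods are frozen before imputation, whereas a learned graph can adapt both the edges and the imputations jointly during training.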
Which methods or approaches do you use?
My work relies on graph representation learning and graph neural networks (GNNs), with a focus on:
○ Graph structure learning (learning edges and sparsity patterns during training rather than relying on kNN graphs)
○ Graph-based multimodal imputation (e.g., graph autoencoders, diffusion-based generative models)
○ Temporal and dynamic graph modeling to represent irregular clinical events and capture recency and ordering effects
○ Careful evaluation in real clinical settings, with attention to robustness, missingness, and generalization
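As a rough illustration of the first bullet, here is a minimal sketch (a toy example of the general technique, not the actual implementation): instead of a fixed kNN graph, edge weights are derived from patient embeddings through a row-wise softmax over similarity scores, so the graph itself can change as the embeddings are trained end-to-end with the downstream task.

```python
import numpy as np

rng = np.random.default_rng(0)

n_patients, d_embed, d_feat = 5, 4, 3
Z = rng.normal(size=(n_patients, d_embed))   # learnable patient embeddings
X = rng.normal(size=(n_patients, d_feat))    # patient features
W = rng.normal(size=(d_feat, d_feat))        # message-passing weights

def soft_adjacency(Z, temperature=1.0):
    """Dense, differentiable adjacency from embedding similarities."""
    logits = Z @ Z.T / temperature
    np.fill_diagonal(logits, -np.inf)            # no self-loops
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    A = np.exp(logits)
    return A / A.sum(axis=1, keepdims=True)      # each row sums to 1

A = soft_adjacency(Z)
H = np.tanh(A @ X @ W)   # one GNN-style propagation step over the learned graph
```

Because every edge weight is a differentiable function of Z, gradients from a prediction loss on H can flow back into the graph structure itself; in practice one would also add sparsification so patients attend only to a few relevant neighbours.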
Why do you consider this research topic important for the future of AI?
Healthcare exposes many of AI's hardest and most important challenges: missing data, distribution shift across hospitals, irregularly sampled time series, multimodality, and the need for reliability. If AI can learn structured representations in this setting, it pushes the field toward models that are not only accurate but also robust and adaptable. More broadly, learning dynamic relational structure from data is a key frontier for AI beyond pattern recognition: it's about learning how data points relate, not just learning from them independently.
COLLABORATION & ECOSYSTEM
Which institutions, laboratories, or partners do you collaborate with?
I collaborate with teams across DATAIA (CentraleSupélec / Inria OPIS / Université Paris-Saclay) and PRAIRIE (Université Paris Cité), including the Institut Imagine and its Clinical Bioinformatics group, under the supervision of Fragkiskos Malliaros, Nora Ouzir, and Anita Burgun. The project is designed to bridge methodological research in graph ML with clinically grounded questions and datasets.
How do you view collaborations between different clusters and research domains?
They’re essential, especially in healthcare, because the hardest problems are rarely purely “AI problems.” Progress depends on aligning real clinical needs, data realities, and methodological innovation. Interdisciplinary collaboration also improves research quality: it forces clearer assumptions, better evaluation protocols, and a stronger focus on what will actually work outside the lab.
IMPACT & SOCIETY
In your view, what are the ethical or societal implications of your research?
Learning from EHR data raises major questions around privacy, fairness, bias, and accountability. Models trained on historical clinical data can inherit systemic biases, and missingness is often not random (it can reflect access to care). My work aims to explicitly model missingness and uncertainty rather than ignore them, and to emphasize evaluation practices that are meaningful for clinical deployment, such as robustness across subgroups, sensitivity to data shifts, and transparent reporting of limitations.
How do you see the role of AI in addressing major contemporary challenges?
AI can be a powerful accelerator when used responsibly: it can help allocate resources, personalize interventions, and extract insights from complex data. In health specifically, its potential is huge for early risk detection and decision support, but only if we prioritize reliability, validation, and governance. More broadly, AI can also strengthen scientific culture by enabling reproducible pipelines, open-source tools, and collaborative research that lowers the barrier to entry.
VISION & INSPIRATION
What advice would you give to future PhD students considering a career in research?
Pick a problem you genuinely care about, because motivation has to last for years, not weeks.
If you had to summarise your approach to research in one sentence, what would it be?
I aim to develop methods that are both theoretically grounded and practically useful, by designing models that respect the structure and imperfections of real clinical data.